    New secret math benchmark stumps AI models and PhDs alike

    By Editor Times Featured, November 13, 2024

    Epoch AI allowed Fields Medal winners Terence Tao and Timothy Gowers to review portions of the benchmark. “These are extremely challenging,” Tao said in feedback provided to Epoch. “I think that in the near term basically the only way to solve them, short of having a real domain expert in the area, is by a combination of a semi-expert like a graduate student in a related field, maybe paired with some combination of a modern AI and lots of other algebra packages.”

    A chart showing AI models’ limited success on the FrontierMath problems, taken from Epoch AI’s research paper.

    Credit: Epoch AI

    To aid in the verification of correct answers during testing, the FrontierMath problems must have answers that can be automatically checked through computation, either as exact integers or mathematical objects. The designers made problems “guessproof” by requiring large numerical answers or complex mathematical solutions, with less than a 1 percent chance of correct random guesses.
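As a rough illustration of what "automatically checked" and "guessproof" can mean in practice, here is a minimal Python sketch. It is not Epoch AI's grading code: the reference answer `EXPECTED`, the function names, and the idea of modeling a guess as a uniform pick from a set of plausible answers are all assumptions made for this example.

```python
from fractions import Fraction

# Hypothetical reference answer for an illustrative problem.
# This is NOT a real FrontierMath answer; it is just a large exact integer.
EXPECTED = 2**61 - 1


def is_correct(submitted: int, expected: int = EXPECTED) -> bool:
    """Exact-match check with no tolerance and no partial credit.

    Python integers have arbitrary precision, so large answers
    are compared exactly rather than rounded.
    """
    return submitted == expected


def random_guess_chance(answer_space_size: int) -> Fraction:
    """Chance that one uniform random guess hits the right answer,
    assuming `answer_space_size` equally plausible candidates."""
    return Fraction(1, answer_space_size)


if __name__ == "__main__":
    print(is_correct(2305843009213693951))
    print(random_guess_chance(10**6) < Fraction(1, 100))
```

Under this simplified model, any problem with more than 100 equally plausible answers already falls below the 1 percent guessing threshold; large numerical answers make that margin much wider.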

    Mathematician Evan Chen, writing on his blog, explained how he thinks that FrontierMath differs from traditional math competitions like the International Mathematical Olympiad (IMO). Problems in that competition typically require creative insight while avoiding complex implementation and specialized knowledge, he says. But for FrontierMath, “they keep the first requirement, but outright invert the second and third requirement,” Chen wrote.

    While IMO problems avoid specialized knowledge and complex calculations, FrontierMath embraces them. “Because an AI system has vastly greater computational power, it’s actually possible to design problems with easily verifiable solutions using the same idea that IOI or Project Euler does: basically, ‘write a proof’ is replaced by ‘implement an algorithm in code,’” Chen explained.
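To make the Project Euler comparison concrete, the sketch below solves that site's well-known first problem (the sum of all multiples of 3 or 5 below 1000). It is far easier than anything in FrontierMath and is shown only to illustrate the format Chen describes: the solver writes code, and the grader compares a single integer. The function name is an assumption chosen for this example.

```python
def sum_of_multiples_below(limit: int) -> int:
    """Sum every non-negative integer below `limit` divisible by 3 or 5."""
    return sum(n for n in range(limit) if n % 3 == 0 or n % 5 == 0)


if __name__ == "__main__":
    # The whole submission is one integer, so checking it is a single comparison.
    answer = sum_of_multiples_below(1000)
    print(answer)  # 233168
```

No proof is submitted or read; correctness is judged entirely by whether the computed number matches the reference value.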

    The organization plans regular evaluations of AI models against the benchmark while expanding its problem set. They say they will release additional sample problems in the coming months to help the research community test their systems.


