    Times Featured
    News

    With the launch of o3-pro, let’s talk about what AI “reasoning” actually does

    By Editor Times Featured, June 11, 2025


    Why use o3-pro?

    Unlike general-purpose models like GPT-4o that prioritize speed, broad knowledge, and making users feel good about themselves, o3-pro uses a chain-of-thought simulated reasoning process to devote more output tokens toward working through complex problems, making it generally better for technical challenges that require deeper analysis. But it’s still not perfect.

    An OpenAI o3-pro benchmark chart. Credit: OpenAI


    Measuring so-called “reasoning” capability is tricky since benchmarks can be easy to game through cherry-picking or training data contamination, but OpenAI reports that o3-pro is popular among testers, at least. “In expert evaluations, reviewers consistently prefer o3-pro over o3 in every tested category and especially in key domains like science, education, programming, business, and writing help,” writes OpenAI in its release notes. “Reviewers also rated o3-pro consistently higher for clarity, comprehensiveness, instruction-following, and accuracy.”

    An OpenAI o3-pro benchmark chart. Credit: OpenAI


    OpenAI shared benchmark results showing o3-pro’s reported performance improvements. On the AIME 2024 mathematics competition, o3-pro achieved 93 percent pass@1 accuracy, compared to 90 percent for o3 (medium) and 86 percent for o1-pro. The model reached 84 percent on PhD-level science questions from GPQA Diamond, up from 81 percent for o3 (medium) and 79 percent for o1-pro. For programming tasks measured by Codeforces, o3-pro achieved an Elo rating of 2748, surpassing o3 (medium) at 2517 and o1-pro at 1707.
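To give a rough sense of what the reported Codeforces rating gaps mean, the sketch below applies the standard Elo expected-score formula to the three ratings quoted above. This is an illustration only: it assumes the textbook Elo model (a logistic curve with a 400-point scale), which may differ from how Codeforces or OpenAI actually computed these ratings, and the function name `elo_expected_score` is hypothetical.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the textbook Elo model.

    Assumption: the standard 400-point logistic scale; the rating
    method behind the reported figures may differ.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Ratings as reported in the article.
ratings = {"o3-pro": 2748, "o3 (medium)": 2517, "o1-pro": 1707}

for name, rating in ratings.items():
    if name == "o3-pro":
        continue
    gap = ratings["o3-pro"] - rating
    expected = elo_expected_score(ratings["o3-pro"], rating)
    print(f"o3-pro vs {name}: gap {gap}, expected score {expected:.2f}")
```

Under that assumption, the 231-point gap over o3 (medium) corresponds to an expected score of roughly 0.79 in a head-to-head comparison, while the gap over o1-pro is large enough that the expected score approaches 1.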

    When reasoning is simulated

    Structure made of cubes in the shape of a thinking or contemplating person that evolves from simple to complex, 3D render. Credit: Floriana via Getty Images


    It’s easy for laypeople to be thrown off by the anthropomorphic claims of “reasoning” in AI models. In this case, as with the borrowed anthropomorphic term “hallucinations,” “reasoning” has become a term of art in the AI industry that basically means “devoting more compute time to solving a problem.” It does not necessarily mean the AI models systematically apply logic or possess the ability to construct solutions to truly novel problems. This is why Ars Technica continues to use the term “simulated reasoning” (SR) to describe these models. They are simulating a human-style reasoning process that does not necessarily produce the same results as human reasoning when faced with novel challenges.


