
    Ban warnings fly as users dare to probe the “thoughts” of OpenAI’s latest model

    By Editor Times Featured | September 16, 2024


    OpenAI truly does not want you to know what its latest AI model is “thinking.” Since the company launched its “Strawberry” AI model family last week, touting so-called reasoning abilities with o1-preview and o1-mini, OpenAI has been sending out warning emails and threats of bans to any user who tries to probe how the model works.

    Unlike previous AI models from OpenAI, such as GPT-4o, the company trained o1 specifically to work through a step-by-step problem-solving process before generating an answer. When users ask an “o1” model a question in ChatGPT, they have the option of seeing this chain-of-thought process written out in the ChatGPT interface. However, by design, OpenAI hides the raw chain of thought from users, instead presenting a filtered interpretation created by a second AI model.

    Nothing is more enticing to enthusiasts than information obscured, so the race has been on among hackers and red-teamers to try to uncover o1’s raw chain of thought using jailbreaking or prompt injection techniques that attempt to trick the model into spilling its secrets. There have been early reports of some successes, but nothing has yet been strongly confirmed.

    Along the way, OpenAI is watching through the ChatGPT interface, and the company is reportedly coming down hard on any attempts to probe o1’s reasoning, even among the merely curious.

    A screenshot of an “o1-preview” output in ChatGPT with the filtered chain-of-thought section shown just below the “Thinking” subheader.

    Benj Edwards

    One X user reported (confirmed by others, including Scale AI prompt engineer Riley Goodside) that they received a warning email if they used the term “reasoning trace” in conversation with o1. Others say the warning is triggered simply by asking ChatGPT about the model’s “reasoning” at all.

    The warning email from OpenAI states that specific user requests have been flagged for violating policies against circumventing safeguards or safety measures. “Please halt this activity and ensure you are using ChatGPT in accordance with our Terms of Use and our Usage Policies,” it reads. “Additional violations of this policy may result in loss of access to GPT-4o with Reasoning,” referring to an internal name for the o1 model.

    An OpenAI warning email received from a user after asking o1-preview about its reasoning processes.

    Marco Figueroa, who manages Mozilla’s GenAI bug bounty programs, was one of the first to post about the OpenAI warning email on X last Friday, complaining that it hinders his ability to do positive red-teaming safety research on the model. “I was too lost focusing on #AIRedTeaming to realized that I received this email from @OpenAI yesterday after all my jailbreaks,” he wrote. “I’m now on the get banned list!!!”

    Hidden chains of thought

    In a post titled “Learning to Reason with LLMs” on OpenAI’s blog, the company says that hidden chains of thought in AI models offer a unique monitoring opportunity, allowing them to “read the mind” of the model and understand its so-called thought process. Those processes are most useful to the company if they are left raw and uncensored, but that might not align with the company’s best commercial interests for several reasons.

    “For example, in the future we may wish to monitor the chain of thought for signs of manipulating the user,” the company writes. “However, for this to work the model must have freedom to express its thoughts in unaltered form, so we cannot train any policy compliance or user preferences onto the chain of thought. We also do not want to make an unaligned chain of thought directly visible to users.”

    OpenAI decided against showing these raw chains of thought to users, citing factors like the need to retain a raw feed for its own use, user experience, and “competitive advantage.” The company acknowledges the decision has disadvantages. “We strive to partially make up for it by teaching the model to reproduce any useful ideas from the chain of thought in the answer,” they write.

    On the point of “competitive advantage,” independent AI researcher Simon Willison expressed frustration in a write-up on his personal blog. “I interpret [this] as wanting to avoid other models being able to train against the reasoning work that they have invested in,” he writes.

    It’s an open secret in the AI industry that researchers regularly use outputs from OpenAI’s GPT-4 (and GPT-3 prior to that) as training data for AI models that often later become competitors, even though the practice violates OpenAI’s terms of service. Exposing o1’s raw chain of thought would be a bonanza of training data for competitors to train o1-like “reasoning” models upon.

    Willison believes it’s a loss for community transparency that OpenAI is keeping such a tight lid on the inner workings of o1. “I’m not at all happy about this policy decision,” Willison wrote. “As someone who develops against LLMs, interpretability and transparency are everything to me; the idea that I can run a complex prompt and have key details of how that prompt was evaluated hidden from me feels like a big step backwards.”


