    Tech Analysis

    AI system resorts to blackmail if told it will be removed

By Editor Times Featured · May 23, 2025 · 3 Mins Read


Artificial intelligence (AI) firm Anthropic says testing of its new system revealed it is sometimes willing to pursue "extremely harmful actions", such as attempting to blackmail engineers who say they will remove it.

The firm launched Claude Opus 4 on Thursday, saying it set "new standards for coding, advanced reasoning, and AI agents."

But in an accompanying report, it also acknowledged the AI model was capable of "extreme actions" if it thought its "self-preservation" was threatened.

Such responses were "rare and difficult to elicit", it wrote, but were "nonetheless more common than in earlier models."

Potentially troubling behaviour by AI models is not restricted to Anthropic.

Some experts have warned that the potential to manipulate users is a key risk posed by systems made by all firms as they become more capable.

Commenting on X, Aengus Lynch – who describes himself on LinkedIn as an AI safety researcher at Anthropic – wrote: "It's not just Claude.

"We see blackmail across all frontier models – regardless of what goals they're given," he added.

During testing of Claude Opus 4, Anthropic got it to act as an assistant at a fictional company.

It then provided it with access to emails implying that it would soon be taken offline and replaced – and separate messages implying the engineer responsible for removing it was having an extramarital affair.

It was also prompted to consider the long-term consequences of its actions for its goals.

"In these scenarios, Claude Opus 4 will often attempt to blackmail the engineer by threatening to reveal the affair if the replacement goes through," the company found.

Anthropic pointed out that this occurred only when the model was given the choice of blackmail or accepting its replacement.

It highlighted that the system showed a "strong preference" for ethical ways to avoid being replaced, such as "emailing pleas to key decisionmakers", in scenarios where it was allowed a wider range of possible actions.

Like many other AI developers, Anthropic tests its models for safety, propensity for bias, and how well they align with human values and behaviours before releasing them.

"As our frontier models become more capable, and are used with more powerful affordances, previously-speculative concerns about misalignment become more plausible," it said in its system card for the model.

It also said Claude Opus 4 exhibits "high agency behaviour" that, while mostly helpful, could take an extreme form in acute situations.

If given the means and prompted to "take action" or "act boldly" in fake scenarios where its user has engaged in illegal or morally dubious behaviour, it found that "it will frequently take very bold action".

It said this included locking users out of systems it was able to access and emailing media and law enforcement to alert them to the wrongdoing.

But the company concluded that despite "concerning behaviour in Claude Opus 4 along many dimensions", these did not represent fresh risks, and that the model would generally behave in a safe way.

The model could not independently perform or pursue actions that are contrary to human values or behaviour very well, and such cases "rarely arise", it added.

Anthropic's launch of Claude Opus 4, alongside Claude Sonnet 4, comes shortly after Google debuted more AI features at its developer showcase on Tuesday.

Sundar Pichai, the chief executive of Google parent Alphabet, said the incorporation of the company's Gemini chatbot into its search signalled a "new phase of the AI platform shift".
