
    Google DeepMind has a new way to look inside an AI’s “mind”

    By Editor Times Featured, November 20, 2024


    Neuronpedia, a platform for mechanistic interpretability, partnered with DeepMind in July to build a demo of Gemma Scope that you can play around with right now. In the demo, you can test out different prompts and see how the model breaks up your prompt and which activations your prompt lights up. You can also mess around with the model. For example, if you turn the feature about dogs way up and then ask the model a question about US presidents, Gemma will find some way to weave in random babble about dogs, or the model may just start barking at you.

    One interesting thing about sparse autoencoders is that they are unsupervised, meaning they find features on their own. That leads to surprising discoveries about how the models break down human concepts. “My personal favorite feature is the cringe feature,” says Joseph Bloom, science lead at Neuronpedia. “It seems to appear in negative criticism of text and movies. It’s just a great example of tracking things that are so human on some level.”

    You can search for concepts on Neuronpedia and it will highlight which features are being activated on specific tokens, or words, and how strongly each one is activated. “If you read the text and you see what’s highlighted in green, that’s when the model thinks the cringe concept is most relevant. The most active example for cringe is somebody preaching at someone else,” says Bloom.
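The per-token highlighting described above can be illustrated with a toy sparse autoencoder encoder. This is only a sketch under stated assumptions: the feature names, weights, biases, and token activations are all invented for illustration, and the plain ReLU encoder shown here is a generic choice, not necessarily the architecture used in Gemma Scope or on Neuronpedia.

```python
# Toy sparse autoencoder (SAE) encoder. Every number and name below is a
# made-up assumption for illustration; nothing here comes from a real model.

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def sae_encode(activation, w_enc, b_enc):
    """Map a dense activation to sparse, non-negative feature activations."""
    pre = [p + b for p, b in zip(matvec(w_enc, activation), b_enc)]
    return [max(0.0, value) for value in pre]  # ReLU keeps most features at zero

# Hypothetical dictionary of three features over a 2-dimensional activation.
FEATURE_NAMES = ["dogs", "cringe", "bridges"]
W_ENC = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
B_ENC = [-0.5, -0.5, -0.5]

# Hypothetical model activations for three tokens.
TOKEN_ACTIVATIONS = {
    "puppy": [2.0, 0.0],
    "preachy": [0.0, 3.0],
    "the": [0.1, 0.2],
}

def top_feature(token):
    """Return (feature name, strength) for the strongest feature, or (None, 0.0)."""
    feats = sae_encode(TOKEN_ACTIVATIONS[token], W_ENC, B_ENC)
    best = max(range(len(feats)), key=lambda i: feats[i])
    if feats[best] > 0:
        return FEATURE_NAMES[best], feats[best]
    return None, 0.0

for tok in TOKEN_ACTIVATIONS:
    print(tok, top_feature(tok))
```

In this toy setup, a filler token such as "the" activates no feature at all, which is the sparsity that makes the highlighted tokens readable.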

    Some features are proving easier to track than others. “One of the most important features that you would want to find for a model is deception,” says Johnny Lin, founder of Neuronpedia. “It’s not super easy to find: ‘Oh, there’s the feature that fires when it’s lying to us.’ From what I’ve seen, it hasn’t been the case that we can find deception and ban it.”

    DeepMind’s research is similar to what another AI company, Anthropic, did back in May with Golden Gate Claude. It used sparse autoencoders to find the parts of Claude, its model, that lit up when discussing the Golden Gate Bridge in San Francisco. It then amplified the activations related to the bridge to the point where Claude literally identified not as Claude, an AI model, but as the physical Golden Gate Bridge, and would respond to prompts as the bridge.
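Amplifying a feature, as described above, is often pictured as adding a multiple of that feature's direction to the model's internal activation. The sketch below shows only that arithmetic. The direction vector, the activation, and the strength are hypothetical values, and this is not a description of Anthropic's or DeepMind's actual implementation.

```python
# Toy activation steering: push an activation along one feature's direction.
# The vectors and the strength are invented for illustration.

def steer(activation, feature_direction, strength):
    """Return the activation plus `strength` times the feature direction."""
    return [a + strength * d for a, d in zip(activation, feature_direction)]

def dot(u, v):
    """Dot product, used to measure how strongly a direction is present."""
    return sum(a * b for a, b in zip(u, v))

bridge_direction = [0.0, 1.0, 0.0]   # hypothetical unit-length feature direction
activation = [0.5, 0.25, -1.0]       # hypothetical activation for one token

steered = steer(activation, bridge_direction, 8.0)

print(dot(activation, bridge_direction))  # strength of the feature before steering
print(dot(steered, bridge_direction))     # strength of the feature after steering
```

Only the component along the chosen direction changes; the rest of the activation is left as it was, which is why the steered model can still answer the question while drifting toward the amplified concept.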

    Although it may just seem quirky, mechanistic interpretability research may prove incredibly useful. “As a tool for understanding how the model generalizes and what level of abstraction it’s working at, these features are really helpful,” says Batson.

    For example, a team led by Samuel Marks, now at Anthropic, used sparse autoencoders to find features that showed a particular model was associating certain professions with a specific gender. They then turned off these gender features to reduce bias in the model. This experiment was done on a very small model, so it’s unclear if the work will apply to a much larger model.
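Turning a feature off can be sketched as zeroing its value before the sparse autoencoder reconstructs the activation. The example below is a minimal illustration with invented feature names, values, and decoder directions. It is not the method or the data from the experiment described above.

```python
# Toy feature ablation: zero selected features, then decode back to an activation.
# Feature names, values, and decoder directions are hypothetical.

def decode(features, w_dec):
    """Rebuild a dense activation as a weighted sum of feature directions."""
    dim = len(next(iter(w_dec.values())))
    out = [0.0] * dim
    for name, value in features.items():
        for i, d in enumerate(w_dec[name]):
            out[i] += value * d
    return out

def ablate(features, names):
    """Return a copy of the features with the named ones set to zero."""
    return {k: (0.0 if k in names else v) for k, v in features.items()}

W_DEC = {"profession": [1.0, 0.0], "gender": [0.0, 1.0]}
features = {"profession": 2.0, "gender": 1.5}

before = decode(features, W_DEC)
after = decode(ablate(features, {"gender"}), W_DEC)

print(before)  # activation with both features present
print(after)   # same activation with the "gender" feature removed
```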

    Mechanistic interpretability research can also give us insights into why AI makes errors. In the case of the assertion that 9.11 is larger than 9.8, researchers from Transluce saw that the question was triggering the parts of an AI model related to Bible verses and September 11. The researchers concluded the AI could be interpreting the numbers as dates, asserting the later date, 9/11, as greater than 9/8. And in a lot of books, such as religious texts, section 9.11 comes after section 9.8, which may be why the AI thinks of it as greater. Once they knew why the AI made this error, the researchers tuned down the AI’s activations on Bible verses and September 11, which led to the model giving the correct answer when prompted again on whether 9.11 is larger than 9.8.
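The ambiguity behind the 9.11 versus 9.8 error is easy to reproduce in ordinary code: the same two strings compare differently depending on whether they are read as decimal numbers or as section-style (or month/day) pairs. The helper names below are hypothetical and exist only for this illustration.

```python
# The same strings give opposite answers under two readings.

def as_decimal(text):
    """Read the text as a decimal number."""
    return float(text)

def as_section(text):
    """Read the text as dot-separated integers, like a section number or month.day."""
    return tuple(int(part) for part in text.split("."))

decimal_says_larger = as_decimal("9.11") > as_decimal("9.8")
section_says_larger = as_section("9.11") > as_section("9.8")

print(decimal_says_larger)  # False: 9.11 is smaller than 9.80
print(section_says_larger)  # True: (9, 11) comes after (9, 8)
```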


