
    DeepMind Table Tennis Robots Train Each Other

    By Editor Times Featured | July 22, 2025 | 7 Mins Read


    Hardly a day goes by without impressive new robotic platforms emerging from academic labs and commercial startups worldwide. Humanoid robots in particular look increasingly capable of assisting us in factories and eventually in homes and hospitals. Yet, for these machines to be truly useful, they need sophisticated “brains” to control their robotic bodies. Traditionally, programming robots involves experts spending countless hours meticulously scripting complex behaviors and exhaustively tuning parameters, such as controller gains or motion-planning weights, to achieve desired performance. While machine learning (ML) techniques have promise, robots that need to learn new complex behaviors still require substantial human oversight and reengineering. At Google DeepMind, we asked ourselves: How do we enable robots to learn and adapt more holistically and continuously, reducing the bottleneck of expert intervention for every significant improvement or new skill?

    This question has been a driving force behind our robotics research. We are exploring paradigms where two robotic agents playing against each other can achieve a greater degree of autonomous self-improvement, moving beyond systems that are merely preprogrammed with fixed or narrowly adaptive ML models toward agents that can learn a broad range of skills on the job. Building on our previous work in ML with systems like AlphaGo and AlphaFold, we turned our attention to the demanding sport of table tennis as a testbed.

    We chose table tennis precisely because it encapsulates many of the hardest challenges in robotics within a constrained, yet highly dynamic, environment. Table tennis requires a robot to master a confluence of difficult skills: Beyond just perception, it demands exceptionally precise control to intercept the ball at the correct angle and velocity and involves strategic decision-making to outmaneuver an opponent. These elements make it an ideal domain for developing and evaluating robust learning algorithms that can handle real-time interaction, complex physics, high-level reasoning, and the need for adaptive strategies. Those capabilities are directly transferable to applications like manufacturing and even potentially unstructured home settings.

    The Self-Improvement Challenge

    Standard machine learning approaches often fall short when it comes to enabling continuous, autonomous learning. Imitation learning, where a robot learns by mimicking an expert, typically requires us to provide vast numbers of human demonstrations for every skill or variation; this reliance on expert data collection becomes a significant bottleneck if we want the robot to continually learn new tasks or refine its performance over time. Similarly, reinforcement learning, which trains agents through trial and error guided by rewards or punishments, often requires human designers to meticulously engineer complex mathematical reward functions that precisely capture desired behaviors for multifaceted tasks, and then adapt them as the robot needs to improve or learn new skills, limiting scalability. In essence, both of these well-established methods traditionally involve substantial human involvement, especially if the goal is for the robot to continually self-improve beyond its initial programming. Therefore, we posed a direct challenge to our team: Can robots learn and enhance their skills with minimal or no human intervention during the learning-and-improvement loop?
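
    To make the reward-engineering burden concrete, here is a minimal sketch of a hand-shaped reward for a single table-tennis return. It is purely illustrative and not the reward used in this work: the `ReturnOutcome` fields, the weights, and the 1 m target radius are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class ReturnOutcome:
    """Outcome of one attempted return (all fields are hypothetical)."""
    ball_hit: bool
    landed_on_opponent_side: bool
    landing_distance_to_target_m: float
    paddle_speed_mps: float


def shaped_reward(outcome, w_hit=0.2, w_land=1.0, w_target=0.5, w_effort=0.01):
    """Hand-tuned reward: every weight is a design decision a human must make."""
    reward = 0.0
    if outcome.ball_hit:
        reward += w_hit
    if outcome.landed_on_opponent_side:
        reward += w_land
        # Accuracy bonus shrinks linearly to zero at 1 m from the target.
        reward += w_target * max(0.0, 1.0 - outcome.landing_distance_to_target_m)
    # Penalize aggressive swings (assumed proxy for wear and safety).
    reward -= w_effort * outcome.paddle_speed_mps ** 2
    return reward


good = ReturnOutcome(True, True, 0.1, 3.0)
miss = ReturnOutcome(False, False, 0.0, 3.0)
print(shaped_reward(good), shaped_reward(miss))
```

    Each new skill (a faster return, more spin, a different target) would mean revisiting these terms and weights by hand, which is the scalability limit described above.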

    Learning Through Competition: Robot vs. Robot

    One innovative approach we explored mirrors the strategy used for AlphaGo: Have agents learn by competing against themselves. We experimented with having two robot arms play table tennis against each other, an idea that is simple yet powerful. As one robot discovers a better strategy, its opponent is forced to adapt and improve, creating a cycle of escalating skill levels.
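
    The co-adaptation idea can be sketched with a toy stand-in. The example below uses fictitious play, a classical game-theory procedure, on a hypothetical two-choice game: an attacker aims left or right and a defender covers left or right. This is not the reinforcement-learning setup used on the real robots; the payoff matrix and function name are assumptions for illustration only.

```python
# Attacker's payoff: rows = attacker aims (left, right),
# columns = defender covers (left, right). The attacker scores (+1)
# when the defender covers the wrong side, and loses (-1) otherwise.
ATTACKER_PAYOFF = [[-1, 1],
                   [1, -1]]


def fictitious_play(payoff, rounds=2000):
    """Each player repeatedly best-responds to the other's observed history."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_counts = [0] * n_rows
    col_counts = [0] * n_cols
    row_counts[0] = 1  # arbitrary first moves
    col_counts[0] = 1
    for _ in range(rounds):
        # Attacker maximizes payoff against the defender's past choices.
        row_values = [sum(payoff[i][j] * col_counts[j] for j in range(n_cols))
                      for i in range(n_rows)]
        # Defender minimizes the attacker's payoff against the attacker's past choices.
        col_values = [sum(payoff[i][j] * row_counts[i] for i in range(n_rows))
                      for j in range(n_cols)]
        best_row = max(range(n_rows), key=row_values.__getitem__)
        best_col = min(range(n_cols), key=col_values.__getitem__)
        row_counts[best_row] += 1
        col_counts[best_col] += 1
    row_total, col_total = sum(row_counts), sum(col_counts)
    return ([c / row_total for c in row_counts],
            [c / col_total for c in col_counts])


attacker_mix, defender_mix = fictitious_play(ATTACKER_PAYOFF)
print(attacker_mix, defender_mix)
```

    Whenever one player develops an exploitable habit, the other's best response punishes it, so both drift toward less predictable play. The real robots face the same pressure, but over continuous shots and with physical hardware in the loop.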


    To enable the extensive training needed for these paradigms, we engineered a fully autonomous table-tennis environment. This setup allowed for continuous operation, featuring automated ball collection as well as remote monitoring and control, allowing us to run experiments for extended periods without direct involvement. As a first step, we successfully trained a robot agent (replicated on both robots independently) using reinforcement learning in simulation to play cooperative rallies. We fine-tuned the agent for a few hours in the real-world robot-versus-robot setup, resulting in a policy capable of holding long rallies. We then switched to tackling competitive robot-versus-robot play.

    Out of the box, the cooperative agent didn’t work well in competitive play. This was expected, because in cooperative play, rallies would settle into a narrow zone, limiting the distribution of balls the agent can hit back. Our hypothesis was that if we continued training with competitive play, this distribution would slowly expand as we rewarded each robot for beating its opponent. While promising, training systems through competitive self-play in the real world presented significant hurdles. The increase in distribution turned out to be rather drastic given the constraints of the limited model size. Essentially, it was hard for the model to learn to deal with the new shots effectively without forgetting old shots, and we quickly hit a local minimum in the training where, after a short rally, one robot would hit an easy winner and the second robot was not able to return it.

    While robot-on-robot competitive play has remained a tough nut to crack, our team also investigated how the robot could play against humans competitively. In the early stages of training, humans did a better job of keeping the ball in play, thus increasing the distribution of shots that the robot could learn from. We still had to develop a policy architecture consisting of low-level controllers with their detailed skill descriptors and a high-level controller that chooses the low-level skills, along with techniques for enabling a zero-shot sim-to-real approach to allow our system to adapt to unseen opponents in real time. In a user study, while the robot lost all of its matches against the most advanced players, it won all of its matches against beginners and about half of its matches against intermediate players, demonstrating solidly amateur human-level performance. Equipped with these innovations, plus a better starting point than cooperative play, we are in a great position to return to robot-versus-robot competitive training and continue scaling rapidly.
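
    The two-level architecture described above (low-level controllers with skill descriptors, plus a high-level controller that picks among them) might be sketched as follows. This is a simplified illustration under stated assumptions, not the actual system: the descriptor fields, the feasibility rule, and the running-average update are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SkillDescriptor:
    """What the high-level controller knows about a skill (hypothetical fields)."""
    name: str
    side: str                  # "forehand" or "backhand"
    max_ball_speed_mps: float  # fastest incoming ball the skill can handle
    est_return_rate: float     # running estimate of success against this opponent


@dataclass
class Skill:
    descriptor: SkillDescriptor
    controller: Callable[[Dict], str]  # low-level controller (stubbed here)


def choose_skill(skills: List[Skill], ball: Dict) -> Skill:
    """High-level controller: pick the feasible skill with the best estimate."""
    feasible = [s for s in skills
                if s.descriptor.side == ball["side"]
                and ball["speed_mps"] <= s.descriptor.max_ball_speed_mps]
    candidates = feasible or skills  # fall back to any skill if none is feasible
    return max(candidates, key=lambda s: s.descriptor.est_return_rate)


def update_descriptor(skill: Skill, returned: bool, rate: float = 0.1) -> None:
    """Adapt online: move the estimate toward the latest observed outcome."""
    d = skill.descriptor
    d.est_return_rate += rate * (float(returned) - d.est_return_rate)


skills = [
    Skill(SkillDescriptor("forehand_drive", "forehand", 8.0, 0.7), lambda b: "drive"),
    Skill(SkillDescriptor("forehand_block", "forehand", 12.0, 0.5), lambda b: "block"),
    Skill(SkillDescriptor("backhand_push", "backhand", 8.0, 0.6), lambda b: "push"),
]
ball = {"side": "forehand", "speed_mps": 6.0}
chosen = choose_skill(skills, ball)
print(chosen.descriptor.name, chosen.controller(ball))
```

    Keeping per-skill statistics in the descriptors lets the high-level choice shift during a match: if an opponent keeps defeating one skill, its estimate drops and another skill is selected, without retraining the low-level controllers.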


    The AI Coach: VLMs Enter the Game

    A second intriguing idea we investigated leverages the power of vision language models (VLMs), like Gemini. Could a VLM act as a coach, observing a robot player and providing guidance for improvement?


    An important insight of this project is that VLMs can be leveraged for explainable robot policy search. Based on this insight, we developed the SAS Prompt (summarize, analyze, synthesize), a single prompt that enables iterative learning and adaptation of robot behavior by leveraging the VLM’s ability to retrieve, reason, and optimize to synthesize new behavior. Our approach can be regarded as an early example of a new family of explainable policy-search methods that are entirely implemented within an LLM. Also, there is no reward function: the VLM infers the reward directly from the observations given in the task description. The VLM can thus become a coach that constantly analyzes the performance of the student and provides suggestions for how to get better.
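
    The iterative loop behind a summarize-analyze-synthesize prompt might look roughly like the sketch below. The template wording, the `sas_policy_search` function, and the stand-in `fake_vlm` and `fake_rollout` helpers are all hypothetical; the actual SAS Prompt and its model interface are not reproduced here. Note that no numeric reward is computed anywhere: the model only sees the task description and the observations.

```python
import json
from typing import Callable, Dict, List, Tuple

# Hypothetical template; the real SAS Prompt is not shown in this article.
SAS_TEMPLATE = """Task: {task}
<history>
{history}
</history>
Step 1 (Summarize): describe what happened in each attempt.
Step 2 (Analyze): explain how the parameters influenced the outcome.
Step 3 (Synthesize): propose new parameters that should do better.
Reply with JSON only: {{"params": {{"paddle_angle_deg": <number>}}}}"""


def sas_policy_search(task: str,
                      initial_params: Dict[str, float],
                      rollout: Callable[[Dict[str, float]], Dict],
                      vlm: Callable[[str], str],
                      iterations: int = 5) -> Tuple[Dict[str, float], List[Dict]]:
    """Run the robot, show the history to the model, adopt its new parameters."""
    history: List[Dict] = []
    params = dict(initial_params)
    for _ in range(iterations):
        observation = rollout(params)
        history.append({"params": params, "observation": observation})
        prompt = SAS_TEMPLATE.format(task=task, history=json.dumps(history, indent=2))
        params = json.loads(vlm(prompt))["params"]
    return params, history


def fake_rollout(params: Dict[str, float]) -> Dict:
    """Toy environment: landing error grows with distance from an ideal 30 degrees."""
    return {"landing_error_m": 0.05 * (params["paddle_angle_deg"] - 30.0)}


def fake_vlm(prompt: str) -> str:
    """Stand-in for a real model call: nudges the angle against the last error."""
    history = json.loads(prompt.split("<history>")[1].split("</history>")[0])
    last = history[-1]
    angle = last["params"]["paddle_angle_deg"]
    error = last["observation"]["landing_error_m"]
    return json.dumps({"params": {"paddle_angle_deg": angle - 10.0 * error}})


final_params, trace = sas_policy_search(
    "Land the ball on the far-left corner of the table.",
    {"paddle_angle_deg": 10.0}, fake_rollout, fake_vlm)
print(final_params)
```

    In a real setting, `vlm` would wrap a call to a vision language model and the observations would be images or textual descriptions rather than a single number. The model's written summary and analysis are what make the search explainable.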

    [Image: AI robot practicing ping pong with specific ball placements on a blue table. Credit: DeepMind]

    Toward Truly Learned Robotics: An Optimistic Outlook

    Moving beyond the limitations of traditional programming and ML techniques is essential for the future of robotics. Methods enabling autonomous self-improvement, like those we are developing, reduce the reliance on painstaking human effort. Our table-tennis projects explore pathways toward robots that can acquire and refine complex skills more autonomously. While significant challenges persist (stabilizing robot-versus-robot learning and scaling VLM-based coaching are formidable tasks), these approaches offer a unique opportunity. We are optimistic that continued research in this direction will lead to more capable, adaptable machines that can learn the diverse skills needed to operate effectively and safely in our unstructured world. The journey is complex, but the potential payoff of truly intelligent and helpful robotic partners makes it worth pursuing.

    The authors express their deepest appreciation to the Google DeepMind Robotics team and in particular David B. D’Ambrosio, Saminda Abeyruwan, Laura Graesser, Atil Iscen, Alex Bewley, and Krista Reymann for their invaluable contributions to the development and refinement of this work.
