    Times Featured

    Farewell Photoshop? Google’s new AI lets you edit images by asking.

    By Editor Times Featured, March 21, 2025

    Multimodal output opens up new prospects

    Having true multimodal output opens up interesting new possibilities in chatbots. For example, Gemini 2.0 Flash can play interactive graphical games or generate stories with consistent illustrations, maintaining character and setting continuity across multiple images. It is far from perfect, but character consistency is a new capability in AI assistants. We tried it out and it was pretty wild, especially when it generated a view of a photo we provided from another angle.

    Creating a multi-image story with Gemini 2.0 Flash, part 1.

    Google / Benj Edwards

    Creating a multi-image story with Gemini 2.0 Flash, part 2. Notice the alternative angle of the original photo.

    Google / Benj Edwards

    Creating a multi-image story with Gemini 2.0 Flash, part 3.

    Google / Benj Edwards

    Text rendering represents another potential strength of the model. Google claims that internal benchmarks show Gemini 2.0 Flash performs better than “leading competitive models” when generating images containing text, making it potentially suitable for creating content with integrated text. From our experience, the results weren’t that exciting, but they were legible.

    An example of in-image text rendering generated with Gemini 2.0 Flash.

    Credit: Google / Ars Technica

    Despite Gemini 2.0 Flash’s shortcomings so far, the emergence of true multimodal image output feels like a notable moment in AI history because of what it suggests if the technology continues to improve. If you imagine a future, say 10 years from now, where a sufficiently complex AI model could generate any type of media in real time (text, images, audio, video, 3D graphics, 3D-printed physical objects, and interactive experiences), you basically have a holodeck, but without the matter replication.

    Coming back to reality, it’s still “early days” for multimodal image output, and Google acknowledges that. Recall that Flash 2.0 is intended to be a smaller AI model that is faster and cheaper to run, so it hasn’t absorbed the entire breadth of the Internet. All that information takes a lot of space in terms of parameter count, and more parameters mean more compute. Instead, Google trained Gemini 2.0 Flash by feeding it a curated dataset that also likely included targeted synthetic data. As a result, the model does not “know” everything visual about the world, and Google itself says the training data is “broad and general, not absolute or complete.”

    That’s just a fancy way of saying that the image output quality isn’t perfect yet. But there is plenty of room for improvement in the future to incorporate more visual “knowledge” as training techniques advance and compute drops in cost. If the process becomes anything like what we’ve seen with diffusion-based AI image generators such as Stable Diffusion, Midjourney, and Flux, multimodal image output quality may improve rapidly over a short period of time. Get ready for a completely fluid media reality.


