
    Farewell Photoshop? Google’s new AI lets you edit images by asking.

    By Editor Times Featured, March 21, 2025


    Multimodal output opens up new prospects

    Having true multimodal output opens up interesting new possibilities in chatbots. For example, Gemini 2.0 Flash can play interactive graphical games or generate stories with consistent illustrations, maintaining character and setting continuity across multiple images. It is far from perfect, but character consistency is a new capability in AI assistants. We tried it out and it was pretty wild, especially when it generated a view of a photo we provided from another angle.

    Creating a multi-image story with Gemini 2.0 Flash, part 1. Credit: Google / Benj Edwards

    Creating a multi-image story with Gemini 2.0 Flash, part 2. Notice the alternative angle of the original photo. Credit: Google / Benj Edwards

    Creating a multi-image story with Gemini 2.0 Flash, part 3. Credit: Google / Benj Edwards

    Text rendering represents another potential strength of the model. Google claims that internal benchmarks show Gemini 2.0 Flash performs better than “leading competitive models” when generating images containing text, making it potentially suitable for creating content with integrated text. In our experience, the results weren’t that exciting, but they were legible.

    An example of in-image text rendering generated with Gemini 2.0 Flash. Credit: Google / Ars Technica

    Despite Gemini 2.0 Flash’s shortcomings so far, the emergence of true multimodal image output feels like a notable moment in AI history because of what it suggests if the technology continues to improve. If you imagine a future, say 10 years from now, where a sufficiently complex AI model could generate any type of media in real time (text, images, audio, video, 3D graphics, 3D-printed physical objects, and interactive experiences), you basically have a holodeck, but without the matter replication.

    Coming back to reality, it is still “early days” for multimodal image output, and Google acknowledges that. Recall that Gemini 2.0 Flash is meant to be a smaller AI model that is faster and cheaper to run, so it hasn’t absorbed the entire breadth of the Internet. All that information takes up a lot of space in terms of parameter count, and more parameters means more compute. Instead, Google trained Gemini 2.0 Flash by feeding it a curated dataset that also likely included targeted synthetic data. As a result, the model does not “know” everything visual about the world, and Google itself says the training data is “broad and general, not absolute or complete.”
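
    To put rough numbers on the link between parameter count and compute, here is a back-of-the-envelope sketch. It assumes a dense transformer, 16-bit weights, and the common rule of thumb of about two floating-point operations per parameter per generated token. The two model sizes are hypothetical and chosen only for illustration; Google has not published parameter counts for Gemini 2.0 Flash.

```python
# Rule-of-thumb estimates for a dense transformer.
# ASSUMPTION: the model sizes below are hypothetical illustrations,
# not published figures for Gemini 2.0 Flash or any other model.

BYTES_PER_PARAM_FP16 = 2        # 16-bit weights take 2 bytes each
FLOPS_PER_PARAM_PER_TOKEN = 2   # roughly one multiply and one add per weight


def weight_memory_gb(num_params: int) -> float:
    """Approximate memory needed just to hold the weights, in gigabytes."""
    return num_params * BYTES_PER_PARAM_FP16 / 1e9


def forward_flops_per_token(num_params: int) -> int:
    """Approximate forward-pass operations to produce one token."""
    return FLOPS_PER_PARAM_PER_TOKEN * num_params


if __name__ == "__main__":
    hypothetical_models = {
        "smaller model": 8_000_000_000,
        "larger model": 70_000_000_000,
    }
    for name, params in hypothetical_models.items():
        print(
            f"{name}: {weight_memory_gb(params):.0f} GB of weights, "
            f"{forward_flops_per_token(params):.2e} FLOPs per token"
        )
```

    Under these assumptions both memory and per-token compute scale linearly with parameter count, which is why a smaller model is cheaper to serve but has less room to store visual knowledge.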

    That is just a fancy way of saying that the image output quality isn’t perfect yet. But there is plenty of room for improvement in the future to incorporate more visual “knowledge” as training techniques advance and compute drops in cost. If the process becomes anything like what we have seen with diffusion-based AI image generators such as Stable Diffusion, Midjourney, and Flux, multimodal image output quality may improve rapidly over a short period of time. Get ready for a completely fluid media reality.


