Google Announces Veo 2, Imagen 3 Update, and Whisk


Google has released updated versions of its video and image generation models, Veo 2 and Imagen 3, alongside a new tool called Whisk.

“Earlier this year, we introduced our video generation model, Veo, and our latest image generation model, Imagen 3,” Google’s Aäron van den Oord and Elias Roman write in the announcement post. “Today we’re introducing a new video model, Veo 2, and the latest version of Imagen 3, both of which achieve state-of-the-art results. These models are now available in VideoFX, ImageFX and our newest Labs experiment, Whisk.”

Well, that’s a lot to process.

Still image from a Veo 2-created video

Google did announce Veo alongside Imagen 3 back in May, but it didn’t release this video generation model until the beginning of December, and then only in private preview. Veo creates high-definition videos using an image as a prompt, and it provides a wide range of visual styles in videos of up to 60 seconds long. Veo 2 is an updated version of this model that offers improved understanding of real-world physics and more realism overall. You can specify genre, lens, and cinematic effects, and Veo 2 will generate videos up to 4K quality and up to several minutes in length, Google says. It’s still in preview, but Google is expanding the availability to more users and making it available via its Google Labs video generation tool, VideoFX. You can sign up for the waitlist on the Google Labs site.

Image created with Imagen 3

Imagen 3 is what Google calls its highest quality text-to-image model. It can produce photorealistic, lifelike images, it says, with far fewer distracting visual artifacts than previous models. It’s been in select preview in ImageFX, a Google Labs tool, since May, and more recently in the Vertex AI developer tool, with a waitlist. With this week’s update, Imagen 3 now generates brighter, better-composed images, Google claims, with more diverse art styles and greater accuracy. Better still, it’s now far more broadly available, in over 100 countries.

And then there’s Whisk. This is a new Google Labs tool that lets you create or input multiple images and then remix them into something unique. For example, you might input a photo of a person and an image of a stuffed animal to create a fun, cartoon-like version of that person. Under the covers, Whisk uses Gemini’s image-to-text capabilities to create a prompt, which it then feeds to Imagen 3, allowing you to “easily remix your subjects, scenes, and styles in fun, new ways.”
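
For the curious, that two-step flow (describe each image in text, then generate from the combined description) can be sketched in a few lines of Python. This is only an illustration of the idea, not Google’s implementation: the `caption` and `generate` callables, the `RemixInputs` class, and the prompt layout are all hypothetical stand-ins, and the stub functions below do not call Gemini or Imagen 3.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-ins: these are NOT real Gemini or Imagen 3 APIs.
Captioner = Callable[[str], str]  # image path -> text description
Generator = Callable[[str], str]  # text prompt -> generated image (a string here)


@dataclass
class RemixInputs:
    subject: str  # path to the subject image
    scene: str    # path to the scene image
    style: str    # path to the style image


def build_prompt(inputs: RemixInputs, caption: Captioner) -> str:
    """Describe each input image in text, then merge the descriptions into one prompt."""
    return (
        f"Subject: {caption(inputs.subject)}. "
        f"Scene: {caption(inputs.scene)}. "
        f"Style: {caption(inputs.style)}."
    )


def remix(inputs: RemixInputs, caption: Captioner, generate: Generator) -> str:
    """Caption the inputs, then hand the combined prompt to the image generator."""
    return generate(build_prompt(inputs, caption))


# Toy stubs so the sketch runs without any model access.
def fake_caption(path: str) -> str:
    return f"description of {path}"


def fake_generate(prompt: str) -> str:
    return f"<image for: {prompt}>"


if __name__ == "__main__":
    inputs = RemixInputs("person.jpg", "park.jpg", "plush.jpg")
    print(remix(inputs, fake_caption, fake_generate))
```

The point of the sketch is that the image model never sees the original pictures, only the text that describes them, which is presumably why the results are remixes rather than exact copies.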

Where Veo 2 and Imagen 3 seem technical and aimed at professionals, Whisk looks and feels more like a consumer offering to me. It’s available only in the U.S., starting today. You can find out more on The Keyword blog.

Remember when we thought Google was “behind” on AI? Yeah, me neither.

Thurrott