The world of AI imaging is moving too fast. It is hard even to keep up with everything that is coming out. Last week, Meta showed the world that it, too, is in this business: Zuckerberg's company presented a series of short videos generated from a simple text input. Barely a week has passed, and Google has already surpassed Facebook's results with Imagen Video, an AI model that seems to have a lot of potential.
Keeping up with advances in AI is exhausting

We are in an unprecedented historical moment. Some AI applications are developing so fast that there is barely time to process a new technology before the next one arrives to outperform it. A little over a month ago, Stable Diffusion was released as a free and open-source AI. A real revolution.
Last week, DreamBooth changed the way we use Stable Diffusion, since the system lets us train the AI on our own face or on any concept that comes to mind. DreamBooth originally required professional Nvidia hardware, but within hours the community had produced so many forks that it became possible to run the program on a home computer. It was not the only important news of the week, either. Meta also showed the world its advances in this sector with a series of short AI-generated videos. As we said, Google has not been slow to overtake its competition.
Google takes a step forward with 'text to video'
A few weeks ago, the science communicator Carlos Santana (dotcsv) asked on YouTube whether it was possible to make a movie with an AI. In his presentation, the artificial intelligence expert concluded that doing so was still complicated, but not impossible.
As we say, this field advances at a frenetic pace. Just yesterday, Google showed the world Imagen Video, an artificial intelligence capable of generating short videos from a natural-language text prompt. The project was introduced on Twitter by Jonathan Ho. The researcher showed a short five-second video of leaves falling into a lake and forming the words 'Imagen Video'. On the surface it may not seem spectacular, but the truth is that, to date, practically none of the AI models we know of can generate legible text within images.
Jonathan Ho (@hojonathanho), October 05, 2022, 19:29:
"Excited to announce Imagen Video, our new text-conditioned video diffusion model that generates 1280×768 24fps HD videos! #ImagenVideo
https://t.co/JWj3L7MpBU
Work w/ @wchan212 @Chitwan_Saharia @jaywhang_ @RuiqiGao @agritsenko @dpkingma @poolio @mo_norouzi @fleet_dj @TimSalimans https://t.co/eN81LqZW7I"
The link in the post shows a bit more about this technology. It builds on Imagen, the text-to-image model from Google Research, which works very similarly to DALL-E 2. Imagen Video can create clips in HD resolution (1280 by 768 pixels) at 24 frames per second. The difference from what Meta showed last week is notable, as Mark Zuckerberg's company simply showed some renders of vector objects revolving around a camera. That result is nowhere near as striking or useful as the technology Mountain View has just presented.
The goal of this post is clearly to show the world that Google is ahead of Meta in this arena. However, it is still too early to know what plans Google has for this program.