On Wednesday, Google DeepMind unveiled the successor to Genie, the artificial intelligence (AI) model that could generate endless 2D game worlds. Dubbed Genie 2, the new AI model is capable of generating unique, action-controllable, playable 3D environments based on a single image prompt. Calling Genie 2 an AI “world model”, the company stated that it can generate environments lasting up to a minute with consistent objects. The company said these generated worlds can be played by humans or used to train AI agents.
Google DeepMind Unveils Genie 2 AI Model
In a blog post, the company detailed the new AI model and its capabilities. While its predecessor could only generate game worlds for 2D platformer games, the Genie 2 AI model can generate 3D worlds complete with consistent, interactive models. This means humans or AI agents can walk, run, swim, climb, and perform other actions in these environments.
Genie 2’s generative capabilities allow it to create routes, buildings, and objects that cannot be seen in the input image. These elements are designed and rendered by the model from scratch. The foundation model is also capable of maintaining consistency in these environments. This means that even when a player moves away from an area and later returns, the environment remains the same.
Apart from this, Genie 2 is capable of generating different perspectives such as first-person, isometric, or third-person views. Users can also interact with the objects in the generated worlds and perform actions such as opening a door, bursting a balloon, or climbing a ladder. The model can also be prompted to generate physics-related effects such as water ripples, smoke, gravity, directional lighting, reflections, and more.
On the technical side, DeepMind explained that Genie 2 is an autoregressive latent diffusion model trained on a large video dataset. The transformer-based architecture also includes an autoencoder, which enables frame-by-frame generation of these worlds.
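To give a rough sense of what "autoregressive" and "frame-by-frame" mean here, the toy Python sketch below mimics the overall loop: encode the prompt image into a latent, then repeatedly start from noise, denoise toward a prediction conditioned on past latents and the player's action, and decode the result into the next frame. This is not DeepMind's implementation. Every function name, the single-number "latent", and the simple arithmetic standing in for the neural networks are assumptions made purely for illustration.

```python
import random

def encode(frame):
    """Toy stand-in for the autoencoder's encoder: squash a frame to one number."""
    return sum(frame) / len(frame)

def decode(latent, size=4):
    """Toy stand-in for the decoder: expand a latent back into a frame."""
    return [latent] * size

def denoise_step(noisy, past_latents, action, strength=0.5):
    """Hypothetical denoiser: nudge the noisy latent toward a target that
    depends on the most recent latent and the player's action."""
    target = past_latents[-1] + action
    return noisy + strength * (target - noisy)

def rollout(first_frame, actions, denoise_steps=8, seed=0):
    """Generate one frame per action, feeding each new latent back in as context."""
    rng = random.Random(seed)
    latents = [encode(first_frame)]      # the image prompt becomes the first latent
    frames = []
    for action in actions:
        z = rng.gauss(0.0, 1.0)          # each new frame starts as pure noise
        for _ in range(denoise_steps):   # iterative refinement (the diffusion part)
            z = denoise_step(z, latents, action)
        latents.append(z)                # the autoregressive part: output becomes context
        frames.append(decode(z))
    return frames

if __name__ == "__main__":
    for frame in rollout([0.0, 0.0, 0.0, 0.0], actions=[1.0, 1.0, 1.0]):
        print(frame)
```

In this sketch each action shifts the scene by a fixed amount, so three identical actions move the toy "world" steadily in one direction. A real world model would replace the arithmetic with large learned networks operating on high-dimensional latents.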
Notably, DeepMind also released an AI model dubbed Scalable Instructable Multiworld Agent, or SIMA, earlier this year, which is essentially capable of agentic AI functions in 3D worlds. The company says Genie 2 can provide unique environments to similar AI agents and train them for various real-life scenarios.
Since the world model can generate unique environments, Google says this will eliminate the risk of data contamination and will allow developers to correctly assess an AI agent’s capabilities.