Google's New 'World Model' Genie 3 Trains AI Robots in Virtual Warehouses
Google DeepMind has unveiled Genie 3, a new “world model” capable of generating realistic virtual environments for training artificial intelligence systems. This development, according to the tech giant, represents a significant stride towards achieving Artificial General Intelligence (AGI), a hypothetical state where AI can perform a wide range of tasks at a human level, rather than being limited to specialized functions.
The Genie 3 model allows AI systems to interact within convincing simulations of the real world. Google suggests it could be instrumental in training robots and autonomous vehicles, for instance by letting them navigate and learn within highly realistic virtual warehouses. DeepMind, Google’s AI division, emphasizes that such world models are a crucial component in the development of AI agents, systems designed to carry out tasks autonomously. The company expects the technology to play a critical role as AI agents become more prevalent and as it advances towards AGI.
Genie 3 creates these simulated scenarios instantly from text prompts. Users can also quickly modify the virtual environment with further text commands, for example by introducing a herd of deer onto a ski slope. Beyond training AI, Google notes that Genie 3 could also let humans experience simulations for training or exploration, such as virtual skiing or a walk around a mountain lake.
While Google demonstrated virtual skiing and warehouse scenarios to journalists, the company stated that Genie 3 is not yet ready for full public release and did not provide a launch date, citing a range of limitations. The quality of these simulations is reportedly comparable to that of Google’s latest video generation model, Veo 3, but Genie 3’s simulations can last for minutes, significantly longer than Veo 3’s eight-second clips. The announcement comes amid escalating competition in the AI sector, following recent hints from Sam Altman, the CEO of OpenAI, about its upcoming GPT-5 model.
While discussions around AGI often focus on its potential impact on white-collar jobs as autonomous systems take on various roles, Google primarily views world models as a foundational technology for advancing robotics and autonomous vehicles. For instance, a simulated warehouse, complete with realistic physics and human interactions, could be used to train a robot, allowing it to learn and refine its actions in a safe, controlled environment. Google has also developed Sima, a virtual agent capable of performing tasks within video game settings, though, like Genie 3, it is not publicly available.
Experts in the field underscore the importance of such models. Professor Subramanian Ramamoorthy, chair of robot learning and autonomy at the University of Edinburgh, described world models as “extremely important” for robot development. He explained, “To achieve flexible decision-making, robots need to anticipate the consequences of different actions to choose the best one to execute in the physical world.”
Andrew Rogoyski of the Institute for People-Centred AI at the University of Surrey added that world models could also benefit large language models (LLMs), the technology underpinning chatbots like ChatGPT. He believes that giving a “disembodied AI the ability to be embodied, albeit virtually,” allows it to “explore the world, or a world – and grow in capabilities as a result.” This kind of virtual embodiment, he suggests, would add a vital dimension to creating more powerful and intelligent AIs, complementing their existing training on vast quantities of internet data. Google researchers previously noted that while LLMs excel at planning, they often lack the ability to take action on a human’s behalf, a gap that world models could help bridge.