
Google DeepMind has unveiled two AI models in its Gemini Robotics series, designed to extend the capabilities of general-purpose robots. The models, Gemini Robotics-ER 1.5 and Gemini Robotics 1.5, work in tandem to improve reasoning, vision, and action in real-world scenarios. Gemini Robotics-ER 1.5 acts as the planner or orchestrator, while Gemini Robotics 1.5 executes tasks based on natural language commands. The two-model approach aims to overcome a limitation of previous AI systems, which often combined planning and execution in a single model, leading to potential errors and delays.
Gemini Robotics-ER 1.5 is a vision-language model (VLM) built for advanced reasoning and tool integration. It can generate multi-step plans for tasks and has shown strong performance on spatial understanding benchmarks. Notably, it can call external tools, including Google Search, to inform decisions in physical environments. Once a plan is established, Gemini Robotics 1.5, a vision-language-action (VLA) model, translates instructions and visual inputs into motor commands so the robot can carry out the task. This model evaluates the most efficient route to complete an action and explains its decision-making process in natural language.
The system is designed to let robots handle complex, multi-step commands. For instance, a robot could sort items into compost, recycling, and trash bins by first consulting local recycling guidelines online, then analyzing the items, planning the sorting process, and executing the actions. DeepMind says the models can adapt to robots of various shapes and sizes thanks to their spatial awareness and flexible design.
Currently, the orchestrator model, Gemini Robotics-ER 1.5, is available to developers through the Gemini API in Google AI Studio, while the VLA model is accessible to select partners. This advancement represents a significant step towards integrating generative AI into robotics, transitioning from traditional interfaces to natural language-driven control, while also separating planning from execution to minimize errors.
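
To make the planner/executor split concrete, here is a minimal Python sketch of the waste-sorting example. It is an illustration only and does not call the Gemini API or reflect DeepMind's implementation: the names `Step`, `plan_sorting`, `execute_step`, `lookup_bin`, and the `LOCAL_RULES` table are all hypothetical stand-ins, with a dictionary lookup playing the role of the online guideline search and a string result playing the role of motor commands.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One natural-language instruction handed from planner to executor."""
    instruction: str


# Hypothetical local recycling rules; stands in for a web search result.
LOCAL_RULES = {"banana peel": "compost", "soda can": "recycling"}


def lookup_bin(item: str) -> str:
    """Tool stand-in: return the bin for an item, defaulting to trash."""
    return LOCAL_RULES.get(item, "trash")


def plan_sorting(items: List[str], tool: Callable[[str], str]) -> List[Step]:
    """Planner stand-in: consult the tool per item and emit one step each."""
    return [Step(f"put {item} in {tool(item)} bin") for item in items]


def execute_step(step: Step) -> str:
    """Executor stand-in: a real VLA model would produce motor commands here."""
    return f"done: {step.instruction}"


def run(items: List[str]) -> List[str]:
    """Plan first, then execute each step in order."""
    plan = plan_sorting(items, lookup_bin)
    return [execute_step(step) for step in plan]


if __name__ == "__main__":
    for line in run(["banana peel", "soda can", "chip bag"]):
        print(line)
```

Because the plan is produced in full before any step runs, the planner can be inspected or replaced without touching the executor, which is the separation the two-model design relies on.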