Ai2’s Molmo Act model ‘thinks in 3D’ to challenge Nvidia and Google in robotics AI

Physical AI, which pairs robotics with foundation models, is evolving quickly. Nvidia and Google have led much of this work, and the Allen Institute for AI (Ai2) is now entering the field with Molmo Act 7B, a new open-source model that lets robots 'think' in three-dimensional space.

Molmo Act builds on Ai2’s earlier Molmo model and adds the ability to reason about actions within a physical environment. The model is released under an Apache 2.0 license, and its training datasets are openly available under CC BY 4.0.

According to Ai2, Molmo Act excels at understanding spatial dynamics, which helps robots navigate their surroundings more effectively. Unlike traditional vision-language-action (VLA) models, which often lack spatial reasoning, Molmo Act is designed to interpret and interact with the physical world. Ai2's representatives said this capability sets Molmo Act apart and makes it more efficient and adaptable across contexts. The company sees a wide range of applications, with a particular focus on homes, where irregular layouts and constant change pose significant challenges for robots.

To achieve its spatial comprehension, Molmo Act uses 'spatially grounded perception tokens.' These tokens are produced by a vector-quantized variational autoencoder (VQ-VAE), which converts inputs such as video into discrete tokens. The tokens capture geometric structure and the distances between objects, both of which are crucial for navigation and interaction. With this spatial awareness, Molmo Act can generate a sequence of waypoints that guides a robot through precise movements, such as adjusting an arm or performing intricate tasks.
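The quantization step behind perception tokens of this kind can be illustrated with a minimal sketch. This is not Ai2's implementation: the codebook, the feature vectors, and the `quantize` function below are hypothetical, and a real VQ-VAE learns its codebook and encoder during training. The sketch only shows the core idea of replacing each continuous feature vector with the index of its nearest codebook entry.

```python
from math import dist

def quantize(vectors, codebook):
    """Map each feature vector to the index of its nearest codebook entry."""
    tokens = []
    for v in vectors:
        # Pick the codebook index with the smallest Euclidean distance to v.
        nearest = min(range(len(codebook)), key=lambda i: dist(v, codebook[i]))
        tokens.append(nearest)
    return tokens

# Hypothetical 2-D codebook with four entries (assumed for illustration only).
CODEBOOK = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

# Hypothetical encoder outputs, one vector per image patch.
features = [(0.1, 0.2), (0.9, 0.8), (0.2, 0.9)]

tokens = quantize(features, CODEBOOK)
print(tokens)
```

Each printed integer is a discrete token that a downstream model could consume in place of the original continuous vector.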
Ai2's own assessments put the task success rate of Molmo Act 7B at 72.1%, ahead of competing models from Google, Microsoft, and Nvidia.

Robotics experts are taking note. Alan Fern, a professor at Oregon State University, said Ai2’s work is a vital step toward refining VLMs for robotics and physical reasoning. He acknowledged that more progress is needed, but called the shift toward 3D scene understanding a significant leap forward.

The openness of the model's data also drew attention. Daniel Maturana, co-founder of Gather AI, expressed enthusiasm about what it means for research and development in the field, particularly given the high cost of building and training such models.

Developers have long wanted robots that are not only intelligent but also spatially aware. Historically, every robotic action had to be programmed by hand, which was tedious and limited flexibility. With the rise of LLM-based approaches, robots can now determine potential actions on their own, based on their interactions with the environment.

As physical AI is increasingly recognized as the next frontier, Ai2's Molmo Act offers a promising foundation for future work on robotic intelligence. As the technology matures, general physical intelligence, in which robots perform complex tasks without detailed programming, becomes more attainable.

Source: VentureBeat

Published On : Aug 15, 2025, 24:45
