Sakana AI’s TreeQuest: Deploy multi-model teams that outperform individual LLMs by 30%

Sakana AI, a Japanese AI research lab, has introduced a technique that lets multiple large language models (LLMs) cooperate on a single complex task. The technique, called Multi-LLM AB-MCTS, turns several LLMs into a team that can outperform any of its members working alone. The models perform trial and error together and combine their distinct strengths to solve problems that are too hard for any single model.

For businesses, this means the most capable model can be selected dynamically for each part of a task, which leads to more effective and versatile AI systems. Instead of being limited to a single AI provider, companies can draw on several frontier models, which are evolving rapidly but each have their own strengths and weaknesses.

The researchers at Sakana AI stress that the diversity among models is an asset rather than a limitation. They argue that just as human achievements are often the work of diverse teams, AI systems can achieve more by working together. "Pooling their intelligence allows AI systems to tackle challenges that no single model could overcome," they noted in their official blog.

The new algorithm is an inference-time scaling technique, an area that has gained traction in recent months. While much of the focus in AI has been on larger models and larger training datasets, inference-time scaling improves performance after training by allocating additional compute when the model is used. Sakana AI's strategy combines ideas from reinforcement learning and repeated sampling, refining existing solutions while also exploring new ones.

At the core of Multi-LLM AB-MCTS is the Adaptive Branching Monte Carlo Tree Search (AB-MCTS) algorithm. It balances two search strategies: searching deeper, by refining a promising existing solution, and searching wider, by generating a new solution from scratch. Using probability models, AB-MCTS decides at each step which of the two is more likely to pay off. The multi-LLM version additionally decides which model to use for that step.

The team tested the system on the ARC-AGI-2 benchmark, which is designed to assess human-like visual reasoning. By combining several frontier models, the system found correct solutions for over 30% of the test problems, significantly outperforming any individual model. It learned to assign the most suitable model to each problem and solved problems that no single model had solved. In one instance, after one model produced an incorrect solution, the system passed it to other models, which analyzed and corrected the error.

The researchers noted that this ensemble approach could also mitigate common issues such as hallucination, which is particularly critical in business applications.

To support adoption, Sakana AI has released the underlying algorithm as open source under the Apache 2.0 license. The framework, named TreeQuest, offers a flexible API that lets developers implement Multi-LLM AB-MCTS for their own tasks. The team sees potential in several domains, including algorithmic coding and improving the accuracy of machine learning models. The open-source release could be a significant step toward more robust and reliable enterprise AI solutions.
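The trade-off between refining an existing solution and generating a new one, together with the choice of model, can be illustrated with a small toy in Python. This is only a sketch under loud assumptions: it is not the TreeQuest API and not Sakana AI's AB-MCTS implementation. It uses Thompson sampling over simple Beta posteriors as one plausible kind of "probability model", makes the wider-or-deeper decision globally rather than per tree node, and replaces real LLM calls with two hypothetical scoring functions (`strong_drafter` and `strong_refiner`) that return a score between 0 and 1.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    """One candidate answer in the search tree, scored in [0, 1]."""
    score: float
    model: str
    depth: int
    children: list = field(default_factory=list)


class BetaArm:
    """Beta posterior over how often an option produces a good answer."""

    def __init__(self):
        self.wins = 1.0    # Beta(1, 1) prior, i.e. uniform
        self.losses = 1.0

    def sample(self, rng):
        return rng.betavariate(self.wins, self.losses)

    def update(self, reward):
        # A reward in [0, 1] counts partly as a win and partly as a loss.
        self.wins += reward
        self.losses += 1.0 - reward


def thompson_pick(arms, rng):
    """Draw once from every posterior and return the key with the top draw."""
    return max(arms, key=lambda key: arms[key].sample(rng))


def search(models, budget, seed=0):
    """Toy adaptive search: choose wider/deeper and a model at every step."""
    rng = random.Random(seed)
    root = Node(score=0.0, model="root", depth=0)
    nodes = [root]
    action_arms = {"wider": BetaArm(), "deeper": BetaArm()}
    model_arms = {name: BetaArm() for name in models}

    for _ in range(budget):
        action = thompson_pick(action_arms, rng)
        if len(nodes) == 1:
            action = "wider"  # nothing to refine yet
        name = thompson_pick(model_arms, rng)

        # "wider" starts a fresh attempt; "deeper" refines the best answer so far.
        parent = root if action == "wider" else max(nodes[1:], key=lambda n: n.score)
        score = models[name](parent.score, rng)

        child = Node(score=score, model=name, depth=parent.depth + 1)
        parent.children.append(child)
        nodes.append(child)

        # Feed the observed score back into both posteriors.
        action_arms[action].update(score)
        model_arms[name].update(score)

    best = max(nodes[1:], key=lambda n: n.score)
    return best, model_arms


def strong_drafter(parent_score, rng):
    """Hypothetical model: decent fresh attempts, ignores the parent answer."""
    return min(1.0, max(0.0, rng.gauss(0.6, 0.1)))


def strong_refiner(parent_score, rng):
    """Hypothetical model: weak from scratch, but improves on its parent."""
    return min(1.0, max(0.0, parent_score + rng.gauss(0.1, 0.05)))


if __name__ == "__main__":
    toy_models = {"drafter": strong_drafter, "refiner": strong_refiner}
    best, arms = search(toy_models, budget=40, seed=1)
    print(f"best score {best.score:.2f} from {best.model} at depth {best.depth}")
    for name, arm in arms.items():
        print(f"{name}: used {arm.wins + arm.losses - 2:.0f} times")
```

In a real setting the scoring functions would call an LLM and evaluate its answer, for example by running generated code against test cases. The design point the toy is meant to show is that no fixed schedule is needed: options that have produced good scores are sampled more often, while weaker ones still get occasional tries.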

Source: VentureBeat

Published on: Jul 08, 2025, 05:38

