AI agents failed at real-world consulting tasks — but Mercor's CEO says they're still on track to replace consultants

Recent assessments show that AI agents still struggle to match human consultants on real-world tasks. The research, conducted by Mercor, a prominent player in AI training, evaluated leading AI models on consulting, banking, and legal scenarios. Despite notable advances, the agents completed fewer than 25% of the assigned tasks on their first attempt, and their overall success rate reached only 40% after multiple tries. Brendan Foody, Mercor's CEO, emphasizes that these preliminary results are only part of the broader picture.

The benchmark, known as APEX-Agents, was designed to mirror actual management consulting tasks, drawing on input from experts at major firms such as McKinsey and Deloitte. OpenAI's GPT 5.2 led the pack by completing nearly 23% of tasks on its first attempt, while Anthropic's newly released Opus 4.6 improved to nearly 33%. The models have progressed quickly: GPT 3, for instance, had a success rate of just 3% on similar tasks only months ago. Foody anticipates that with continued improvements, success rates could approach 50% by year-end. He notes that these models are making strides on complex tasks that consulting firms typically value at millions of dollars.

AI is already reshaping the consulting landscape. McKinsey, for example, employs 25,000 AI agents among its 60,000 workforce, which allows it to grow without increasing headcount. However, the research indicates that while AI agents excel at research and data analysis, they falter when tasks become more complex or time-consuming. They struggle to navigate file systems and manage multi-step processes, which leads to inaccuracies in their outputs. Foody likens the current performance of AI agents to that of interns, suggesting they achieve a 50% pass rate but still require substantial oversight.

Frank Jones, a former consultant now consulting for Mercor, highlights the need for precise prompts if AI is to meet consulting standards, since the models often miss nuanced expectations. Looking ahead, Foody believes that improving AI models hinges not on groundbreaking innovations but on better training methodologies. Mercor, which has attracted significant investment, is positioned as a major player in the AI landscape and aims to refine these agents further. The next iteration of the benchmark will assess the entire ecosystem of professional services, potentially revealing even more alarming implications for traditional consulting roles. Foody predicts that in the near future, AI chatbots could rival the capabilities of leading consulting firms.

Source: Business Insider

Published on: Feb 09, 2026, 10:15
