Google researchers find the best AI model is 69% right

Google DeepMind has introduced the FACTS Benchmark Suite, which evaluates how effectively AI models deliver factually correct responses. The suite assesses models across four areas: answering straightforward questions, using web search, grounding responses in lengthy documents, and analyzing images. The top performer in this evaluation, Google's Gemini 3 Pro, achieved an accuracy rate of 69%. Many other leading models fell significantly short of that.

To put this into perspective, if any journalist under my supervision submitted articles with a 69% accuracy rate, their position would be in jeopardy. The figure matters especially for businesses that rely on AI technology. While these models are impressive in speed and language fluency, their factual accuracy remains a concern, especially when it comes to complex reasoning or niche knowledge. In critical sectors such as finance, healthcare, and law, even minor inaccuracies can have serious repercussions.

A recent analysis by my colleague Melia Russell highlighted the challenges law firms face in treating AI as a credible source of legal information. In one notable case, a firm dismissed an employee who submitted a document filled with fabricated legal cases after using ChatGPT for assistance.

The FACTS Benchmark serves as both a warning and a guide for future improvements, as Google aims to identify and address the shortcomings of these AI models. The primary takeaway is clear: while AI is making strides, it still falls short of human-level accuracy, with even the best-performing model incorrect nearly a third of the time.

Source: Business Insider

Published on: Dec 12, 2025, 21:30
