Science journalists find ChatGPT is bad at summarizing scientific papers

Distilling complex research findings into accessible summaries is a core skill in science journalism, and it is often seen as an ideal application for large language models, including ChatGPT. However, an informal year-long study conducted by the American Association for the Advancement of Science (AAAS) has raised questions about how well ChatGPT performs this role.

The AAAS set out to explore whether ChatGPT could generate concise "news brief" summaries like those its SciPak team produces for the journal Science and for platforms like EurekAlert. These briefs are designed to present the essential information about a study, including its objectives, methodologies, and broader context, to help journalists write their own articles.

In a blog post and accompanying white paper, the AAAS journalists concluded that while ChatGPT could somewhat mimic the format of SciPak-style briefs, its output often sacrificed accuracy for simplicity. As a result, the summaries frequently required extensive fact-checking by experienced SciPak writers. Abigail Eisenstadt, a writer at AAAS, remarked that while these AI tools show promise as aids for science journalists, they are not yet ready for regular use within the SciPak team.

From December 2023 to December 2024, AAAS researchers tasked ChatGPT with summarizing up to two scientific papers each week, using prompts that varied in specificity. They focused on papers with challenging elements such as technical jargon, controversial findings, groundbreaking insights, human subjects, or unconventional formats. The summaries were generated using the latest GPT models available during the study period, primarily GPT-4 and GPT-4o.

In total, 64 papers were summarized, and the results were assessed both quantitatively and qualitatively by the same SciPak writers who had originally briefed those papers. The researchers acknowledged a limitation of this design: it could not account for human biases, which might be particularly pronounced among journalists scrutinizing a tool that threatens to encroach upon their professional territory.

Source: Ars Technica

Published On: Sep 19, 2025, 17:20
