In the rapidly evolving landscape of artificial intelligence, a significant challenge has emerged: the availability of training data. Neema Raphael, chief data officer at Goldman Sachs, raised concerns about a looming data shortage on a recent episode of the bank's 'Exchanges' podcast, suggesting that this scarcity is already shaping how new AI systems are built. He pointed to China's DeepSeek as a case study, suggesting that its low development costs may stem from training on the outputs of existing models rather than sourcing entirely new data. "The most intriguing aspect is how prior models will influence the next generation of AI technologies," he stated.

With traditional internet sources becoming exhausted, developers are increasingly turning to synthetic data: machine-generated text, images, and code. While this approach offers a seemingly unlimited supply, it risks flooding AI models with low-quality information. Raphael nevertheless remains optimistic, arguing that the absence of fresh data won't severely hinder progress, largely because many companies hold untapped data reserves. "From a consumer standpoint, the surge in synthetic data is fascinating. However, in the enterprise realm, there's still significant potential to be unlocked," he remarked.

This suggests that the future of AI may hinge less on freely available internet information and more on proprietary datasets held by corporations. Firms like Goldman Sachs, which accumulate vast amounts of data from trading activity and client interactions, could significantly enhance AI tools if that data is leveraged correctly. Raphael's insights come amid industry discussion of having reached 'peak data' in the three years since ChatGPT emerged.
At a recent conference, OpenAI co-founder Ilya Sutskever cautioned that all valuable online data has already been utilized in training models, hinting that the rapid advancement of AI might soon plateau. Moreover, Raphael emphasized that the challenge lies not only in sourcing more data but also in ensuring that it is applicable. "The key issues are understanding the data, its business context, and normalizing it for effective use within the business," he explained. He also raised thought-provoking questions regarding the reliance on synthetic data, pondering whether this could lead to a 'creative plateau' in AI. "If the data is predominantly machine-generated, how much human-derived information can still be integrated?" he questioned, indicating that this is a crucial aspect to monitor from both a technological and philosophical viewpoint.