Perplexity accused of scraping websites that explicitly blocked AI scraping

Perplexity accused of scraping websites that explicitly blocked AI scraping

The AI startup Perplexity has come under fire for allegedly scraping content from websites that have explicitly blocked such activities. Cloudflare, a leading internet infrastructure provider, released a report on Monday detailing how Perplexity appeared to ignore these restrictions and concealed its scraping operations. Cloudflare's researchers revealed that the startup was attempting to bypass website preferences by obscuring its identity during the crawling process. This practice has raised concerns as AI companies, including Perplexity, often rely on vast amounts of online data to develop their products. While many websites have employed the Robots.txt standard to indicate which pages are off-limits, the effectiveness of such measures has been inconsistent. Evidence suggests that Perplexity has deliberately circumvented these restrictions. According to Cloudflare, the startup altered its bots' user agents—information that helps identify website visitors by their device type—and changed its Autonomous System Networks (ASN) to evade detection. The report noted that this behavior was observed across tens of thousands of domains, generating millions of requests daily. In response to these allegations, Perplexity spokesperson Jesse Dwyer dismissed Cloudflare's claims as a mere promotional tactic, asserting that the evidence presented did not show unauthorized access to any content. Dwyer also claimed that the bot cited in Cloudflare's report does not belong to their company. Cloudflare's concerns were initially raised after clients reported that Perplexity was crawling their sites despite having implemented measures to block its known bots. Following these complaints, Cloudflare conducted tests confirming that the startup was indeed bypassing these safeguards. The company further noted that Perplexity not only utilized its declared user-agent but also employed a generic browser designed to impersonate Google Chrome when access was denied. Consequently, Cloudflare has removed Perplexity's bots from its verified list and implemented new strategies to restrict their activity. This incident is not the first time Perplexity has been embroiled in controversy over scraping practices. Last year, various media outlets accused the startup of plagiarizing their content. During an interview at the Disrupt 2024 conference, Perplexity's CEO, Aravind Srinivas, struggled to articulate the company's stance on plagiarism in response to inquiries from TechCrunch. Cloudflare has taken a strong position against AI scrapers, recently launching a marketplace enabling website owners to charge fees to scrapers accessing their content. The company's CEO, Matthew Prince, highlighted the growing threat that AI poses to the traditional business models of online publishers.

Sources : TechCrunch

Published On : Aug 04, 2025, 16:10

AI
Inside the Fast-Paced World of Meta's Superintelligence Labs: Insights from an AI Researcher

Navigating the landscape of cutting-edge AI research can be exhilarating yet demanding. Prakhar Agarwal, an applied rese...

Business Insider | Mar 11, 2026, 04:25
Inside the Fast-Paced World of Meta's Superintelligence Labs: Insights from an AI Researcher
Computing
Revolutionizing Power: Europe's First Microgrid-Enabled Data Center Launches Near Dublin

A groundbreaking development has occurred just outside Dublin, Ireland, where a new data center has become the first fac...

CNBC | Mar 11, 2026, 06:20
Revolutionizing Power: Europe's First Microgrid-Enabled Data Center Launches Near Dublin
Science
FDA's Unexpected Decision Leaves Autism Treatment Hopes Unfulfilled

In a surprising turn of events, the FDA has chosen not to approve the use of the generic drug leucovorin for treating au...

Ars Technica | Mar 10, 2026, 22:15
FDA's Unexpected Decision Leaves Autism Treatment Hopes Unfulfilled
AI
Anthropic Faces Pressure from Government Amid DOD Blacklisting Controversy

In a dramatic turn of events, Anthropic's legal representative claims the U.S. government is actively encouraging the st...

Business Insider | Mar 11, 2026, 02:35
Anthropic Faces Pressure from Government Amid DOD Blacklisting Controversy
AI
Google Unveils Innovative AI Agent Builder for Military Applications

In a strategic move amidst ongoing legal disputes involving Anthropic and the U.S. Department of Defense, Google is expa...

Business Today | Mar 11, 2026, 06:25
Google Unveils Innovative AI Agent Builder for Military Applications
View All News