Perplexity accused of scraping websites that explicitly blocked AI scraping

The AI startup Perplexity has come under fire for allegedly scraping content from websites that explicitly blocked such activity. Cloudflare, a major internet infrastructure provider, released a report on Monday detailing how Perplexity appeared to ignore these restrictions and conceal its crawling. Cloudflare's researchers said the startup obscured its identity while crawling in order to get around website preferences.

The findings matter because AI companies, including Perplexity, rely on vast amounts of online data to develop their products. Many websites use the robots.txt standard to indicate which pages are off-limits to crawlers, but compliance is voluntary and the measure's effectiveness has been inconsistent. The evidence suggests that Perplexity deliberately circumvented these restrictions. According to Cloudflare, the startup altered its bots' user agents (the strings that identify a website visitor by browser and device type) and changed its autonomous system numbers (ASNs) to evade detection. The report noted that this behavior was observed across tens of thousands of domains and generated millions of requests per day.

In response, Perplexity spokesperson Jesse Dwyer dismissed Cloudflare's claims as a promotional tactic, asserting that the evidence presented did not show unauthorized access to any content. Dwyer also said that the bot cited in Cloudflare's report does not belong to Perplexity.

Cloudflare said it began looking into the matter after customers reported that Perplexity was crawling their sites even though they had blocked its known bots. Following these complaints, Cloudflare ran its own tests, which it said confirmed that the startup was bypassing these blocks. The company further noted that when access was denied, Perplexity used not only its declared user agent but also a generic browser user agent designed to impersonate Google Chrome.

Consequently, Cloudflare has removed Perplexity's bots from its verified list and deployed new techniques to block them.

This is not the first time Perplexity has been embroiled in controversy over its scraping practices. Last year, several media outlets accused the startup of plagiarizing their content. During an interview at the Disrupt 2024 conference, Perplexity's CEO, Aravind Srinivas, struggled to articulate the company's stance on plagiarism when questioned by TechCrunch.

Cloudflare has taken a strong position against AI scrapers, recently launching a marketplace that lets website owners charge scrapers for access to their content. The company's CEO, Matthew Prince, has highlighted the growing threat that AI poses to the traditional business models of online publishers.
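
For readers unfamiliar with the mechanism at issue, the following is a minimal sketch of how a robots.txt rule keyed to a crawler's name is evaluated, using Python's standard urllib.robotparser module. The robots.txt content, the example.com URL, and the Chrome-style user agent string are all hypothetical and chosen for illustration; none of them is taken from Cloudflare's report or from any real site. The sketch only shows why a rule that names a declared bot does not match a request presenting a generic browser identity.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only: it blocks one named
# crawler and allows everyone else.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""

# Hypothetical generic browser identity (not a string from the report).
GENERIC_CHROME_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


url = "https://example.com/article"  # placeholder URL

# The declared crawler name matches the Disallow rule.
declared_allowed = is_allowed("PerplexityBot", url)

# A browser-style user agent matches only the wildcard rule.
generic_allowed = is_allowed(GENERIC_CHROME_UA, url)

print("declared bot allowed:", declared_allowed)
print("generic browser UA allowed:", generic_allowed)
```

Because robots.txt is a request rather than an access control, the check above runs only if the crawler chooses to perform it, which is why site operators who want enforcement turn to network-level blocking of the kind Cloudflare describes.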

Sources : TechCrunch

Published On : Aug 04, 2025, 16:10
