
The AI startup Perplexity has come under fire for allegedly scraping content from websites that have explicitly blocked such activity. Cloudflare, a leading internet infrastructure provider, released a report on Monday detailing how Perplexity appeared to ignore these restrictions and conceal its scraping operations. According to Cloudflare's researchers, the startup attempted to bypass website preferences by obscuring its identity while crawling.
The practice has raised concerns because AI companies, including Perplexity, rely on vast amounts of online data to develop their products. Many websites use the robots.txt standard to indicate which pages are off-limits to crawlers, but compliance is voluntary and the measure has been inconsistently effective. The evidence in the report suggests that Perplexity deliberately circumvented these restrictions. According to Cloudflare, the startup altered its bots' user agents (the string a client sends to identify its software and device) and changed its autonomous system numbers (ASNs), which identify the networks its traffic originates from, to evade detection. The report said this behavior was observed across tens of thousands of domains and generated millions of requests per day.
In response to these allegations, Perplexity spokesperson Jesse Dwyer dismissed Cloudflare's claims as a mere promotional tactic, asserting that the evidence presented did not show unauthorized access to any content. Dwyer also claimed that the bot cited in Cloudflare's report does not belong to the company.
Cloudflare's concerns were first raised after customers reported that Perplexity was crawling their sites despite measures they had put in place to block its known bots. Following these complaints, Cloudflare ran tests that it says confirmed the startup was bypassing these safeguards. The company further noted that Perplexity used not only its declared user agent but also, when access was denied, a generic browser user agent designed to impersonate Google Chrome.
Consequently, Cloudflare has removed Perplexity's bots from its verified list and implemented new strategies to restrict their activity.
This incident is not the first time Perplexity has been embroiled in controversy over scraping practices. Last year, various media outlets accused the startup of plagiarizing their content. During an interview at the Disrupt 2024 conference, Perplexity's CEO, Aravind Srinivas, struggled to articulate the company's stance on plagiarism in response to inquiries from TechCrunch.
Cloudflare has taken a strong position against AI scrapers, recently launching a marketplace enabling website owners to charge fees to scrapers accessing their content. The company's CEO, Matthew Prince, highlighted the growing threat that AI poses to the traditional business models of online publishers.