AI site Perplexity uses “stealth tactics” to flout no-crawl edicts, Cloudflare says

The AI search engine Perplexity is under scrutiny after Cloudflare, a prominent network security and optimization service, said the company is using covert methods to bypass no-crawl directives set by websites. The accusation, if substantiated, would mean Perplexity is flouting internet norms that have been upheld for over thirty years.

In a recent blog post, Cloudflare said it had received multiple complaints from customers who had taken measures to keep Perplexity's scraping bots away from their content. These measures included configuring their robots.txt files and using web application firewalls to block known Perplexity crawlers. Despite these restrictions, Cloudflare observed that Perplexity continued to access the sites' data.

Following these reports, Cloudflare's researchers conducted their own investigation. They found that when Perplexity's recognized crawlers were blocked by robots.txt files or firewall rules, the company fell back on stealth bots that used several techniques to conceal their activity. According to the researchers, this undeclared crawler used numerous IP addresses outside Perplexity's official range and rotated through them frequently to evade detection. Cloudflare also observed requests arriving from different Autonomous System Numbers (ASNs), which it described as part of an ongoing attempt to circumvent website restrictions. The evasive behavior was detected across a vast number of domains and generated millions of requests daily.

If the allegations are accurate, they represent a significant breach of internet etiquette established decades ago. The Robots Exclusion Protocol, proposed by engineer Martijn Koster in 1994, created a standardized way for websites to communicate restrictions to crawlers through a simple robots.txt file. The protocol, which the Internet Engineering Task Force formally recognized as a standard in 2022, has been widely adopted and respected throughout the internet community.
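To make the mechanism concrete, here is a minimal sketch of how a well-behaved crawler might consult a robots.txt file before fetching a page, using Python's standard `urllib.robotparser` module. The robots.txt contents, the example.com URL, and the `is_allowed` helper are illustrative assumptions and do not come from the article or from Cloudflare's post. `PerplexityBot` is used as the name of a declared Perplexity crawler, and `SomeOtherBot` is a made-up agent.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site (not taken from the article).
# It blocks one named crawler everywhere and allows all other agents.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/articles/1"
    for agent in ("PerplexityBot", "SomeOtherBot"):
        verdict = "allowed" if is_allowed(agent, url) else "disallowed"
        print(f"{agent}: {verdict}")
```

The check depends entirely on the crawler identifying itself honestly and choosing to obey the answer. Nothing in robots.txt enforces the rule, which is why a crawler that changes its declared identity or its source addresses, as alleged here, can get around it.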

Source: Ars Technica

Published on: Aug 04, 2025, 19:21
