Perplexity AI Is Reportedly Evading Website No-Crawl Directives
August 13, 2025 | by Admin

The internet is an incredible resource built on a foundation of unspoken trust. For decades, a simple, clear convention has guided the behavior of automated web crawlers: a site's robots.txt file is a set of instructions that a bot is expected to follow. It's a digital handshake, a way for website owners to say, "Welcome, but please don't look here." When a company chooses to disregard those instructions, it's not just a technical issue; it's a breach of that fundamental trust. This is what Perplexity AI is reportedly doing by taking active steps to evade no-crawl directives.
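For readers unfamiliar with the mechanism, a robots.txt file is just a plain-text list of per-crawler rules served at a site's root. A minimal sketch might look like this (the bot names and paths are illustrative, not taken from any real site):

```text
# Allow most crawlers, but keep them out of /drafts/
User-agent: *
Disallow: /drafts/

# Tell one specific crawler (name illustrative) to stay out entirely
User-agent: ExampleBot
Disallow: /
```

Nothing technically prevents a bot from ignoring these lines; compliance is voluntary, which is exactly why the trust described above matters.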
Cloudflare claims that Perplexity AI actively evades website no-crawl directives
Recent findings by Cloudflare have called into question the behavior of one such AI-powered "answer engine," Perplexity. According to an in-depth analysis, Perplexity is allegedly engaging in "stealth crawling": actively working around a website's rules by using undeclared crawlers and rotating through different IP addresses to bypass block pages and continue scraping content, reportedly even after a website owner has explicitly told it not to.
If true, this behavior is unacceptable. It directly violates a website owner's wishes and deliberately reaches content that is not meant for public scraping. A bot that actively works to obscure its identity and bypass security measures is not a good-faith actor. We believe that products and services should respect the choices of content creators and publishers, who invest time, money, and effort into creating the content that these AI services rely on.
In contrast to other “well-behaved” AI platforms
Well-meaning bot operators follow a clear code of conduct. They are transparent, identifying themselves with a unique user agent and providing contact information. They are "well-behaved netizens," respecting rate limits and not flooding sites with excessive traffic. Most importantly, they follow the rules by honoring robots.txt and other website directives. A good example is OpenAI, which clearly documents its crawlers, explains their purpose, and respects a website's wishes. Even so, the company has not been spared lawsuits from publishers. In a controlled test, however, OpenAI's crawlers stopped immediately when instructed not to crawl.
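Honoring these directives is not hard; Python's standard library ships a parser for them. The sketch below shows how a well-behaved crawler could check a site's rules before fetching a page (the robots.txt content, bot names, and URLs here are made up for illustration):

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt: block one named crawler entirely,
# and keep everyone else out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""

def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)

# A compliant crawler calls this check before every request it makes.
print(is_allowed("ExampleBot", "https://example.com/article"))    # False
print(is_allowed("SomeOtherBot", "https://example.com/article"))  # True
print(is_allowed("SomeOtherBot", "https://example.com/private/x")) # False
```

The stealth tactics described in the Cloudflare report, such as rotating user agents and IP addresses, would defeat exactly this kind of check from the site's side, which is why they are seen as bad-faith behavior rather than a technical oversight.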
The rise of AI is changing the internet, but the core principles of respect and transparency should remain constant. Website owners deserve full control over how their content is used. They should not have to fight a battle against bots that actively circumvent their rules. We stand with the idea that content is an asset, and its creators should be empowered to decide who gets to access it and for what purpose. It's about ensuring a fair and respectful digital ecosystem for everyone.