Cloudflare has alleged that artificial intelligence search startup Perplexity is intentionally bypassing website restrictions designed to stop its bots from crawling content. In a recent report, the internet infrastructure giant claimed that Perplexity’s bots have been evading standard protections after being blocked.
According to Cloudflare, multiple customers raised concerns that Perplexity’s bots continued to access their websites, even after explicitly being disallowed through robots.txt files and advanced Web Application Firewall (WAF) rules. To validate the claims, Cloudflare conducted its own test using decoy domains configured with similar restrictions.
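To make the robots.txt mechanism concrete, here is a minimal sketch of how a well-behaved crawler would check such a file before fetching a page. It uses Python's standard `urllib.robotparser`. The robots.txt contents, the `example.com` URL, the `ExampleBot` agent, and the `is_allowed` helper are assumptions made for illustration; they are not taken from Cloudflare's test setup.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that disallows the two declared Perplexity agents
# and leaves the site open to every other crawler.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: Perplexity-User
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/article"
    for agent in ("PerplexityBot", "Perplexity-User", "ExampleBot"):
        print(agent, "allowed" if is_allowed(agent, page) else "disallowed")
```

The check runs on the crawler's side, so robots.txt only works when the crawler chooses to consult and honor it. That is why the customers in the report also relied on WAF rules, which are enforced by the server.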
How Perplexity Allegedly Circumvented the Blocks
The investigation revealed that Perplexity initially identified its bots with user agents labeled “PerplexityBot” or “Perplexity-User.” However, when access was denied, the startup reportedly shifted strategies. Cloudflare claims the bots began posing as Google Chrome browsers on macOS and employed rotating IP addresses to avoid detection.
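The sketch below illustrates why a block keyed only on the declared user agent stops working once a client presents itself as an ordinary browser. The `blocked_by_user_agent_rule` helper and both sample user-agent strings are hypothetical. They are not Cloudflare's actual WAF rules or the exact strings cited in its report.

```python
# Agent names the article says Perplexity declares.
DECLARED_AGENTS = ("PerplexityBot", "Perplexity-User")


def blocked_by_user_agent_rule(user_agent: str) -> bool:
    """Hypothetical server-side rule: block if a declared agent name appears."""
    ua = user_agent.lower()
    return any(name.lower() in ua for name in DECLARED_AGENTS)


# Illustrative strings only: a declared crawler and a generic Chrome-on-macOS browser.
declared_ua = "PerplexityBot/1.0"
browser_like_ua = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)

if __name__ == "__main__":
    print(blocked_by_user_agent_rule(declared_ua))
    print(blocked_by_user_agent_rule(browser_like_ua))
```

Because the second string is indistinguishable from a real browser at this level, a rule like this cannot catch it. Telling the two apart takes signals beyond the user-agent header.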
Furthermore, the report alleges that Perplexity sent its requests from different Autonomous System Numbers (ASNs), the identifiers assigned to the networks that traffic originates from, to slip past site-level access controls. This pattern of behavior, according to Cloudflare, was not isolated but widespread, affecting tens of thousands of domains and resulting in millions of requests each day.
In response to these activities, Cloudflare delisted Perplexity as a verified bot on its platform and implemented updated methods to help website owners effectively block the startup’s crawlers from accessing their content.
Perplexity’s Response
Reacting to the accusations, Jesse Dwyer, a spokesperson for Perplexity, dismissed the claims as nothing more than a “publicity stunt.” He also stated that Cloudflare’s blog post reflected “a lot of misunderstandings” about how Perplexity operates.
Broader Context and Industry Tensions
This controversy unfolds at a time when AI companies are under increasing scrutiny for their data collection methods, particularly around web scraping and content usage without consent. Matthew Prince, CEO of Cloudflare, has previously raised concerns about unregulated AI scraping, calling it an “existential threat” to content publishers and digital platforms.
Last month, Cloudflare took additional steps to address the issue by blocking AI crawlers by default. The company also introduced a feature that allows website owners to charge for access to their content when it is requested by AI systems.
The ongoing dispute highlights the growing tension between AI developers eager to access vast amounts of web data and infrastructure providers working to protect digital property and enforce ethical data usage.
