Imperva Bad Bot Report 2026: bots account for 50.2% of all web traffic, a record. Breakdown: 32% bad bots (scrapers, credential stuffing, DDoS) and 18.2% good bots (Googlebot, monitoring, feeds). Biggest growth in 2024-2026: AI scrapers such as GPTBot, ClaudeBot, PerplexityBot, and Bytespider, up from 13% to 28% of bot traffic. Niche trend: residential proxies combined with headless Chrome to bypass WAFs. Mitigations: Content-Signal directives in robots.txt (proposed by Cloudflare, not yet an IETF standard), Cloudflare Bot Fight Mode, and JA3/JA4 TLS fingerprinting.
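The shares quoted above use two different bases: all web traffic and bot traffic only. The short Python sketch below converts between them. It uses only the figures quoted above; the variable names are illustrative, and it assumes the 28% AI-scraper figure is a share of bot traffic, as stated.

```python
# Figures quoted in the summary, in percent of ALL web traffic.
BOT_SHARE = 50.2
BAD_BOTS = 32.0
GOOD_BOTS = 18.2

# AI scrapers are quoted as a share of BOT traffic, not of all traffic.
AI_SHARE_OF_BOTS = 28.0


def share_within(part: float, whole: float) -> float:
    """Return `part` as a percentage of `whole`, rounded to one decimal."""
    return round(part / whole * 100, 1)


# The two categories should add up to the headline bot share.
breakdown_is_consistent = abs((BAD_BOTS + GOOD_BOTS) - BOT_SHARE) < 1e-9

# Bad bots as a share of bot traffic only.
bad_within_bots = share_within(BAD_BOTS, BOT_SHARE)

# AI scrapers rebased from "share of bot traffic" to "share of all traffic".
ai_of_all_traffic = round(AI_SHARE_OF_BOTS / 100 * BOT_SHARE, 1)

print(breakdown_is_consistent)  # True
print(bad_within_bots)          # 63.7
print(ai_of_all_traffic)        # 14.1
```

Read this way, AI scrapers would account for roughly 14% of all web traffic, and close to two thirds of bot traffic would be malicious.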
# robots.txt with Content-Signal (2026)
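# Content-Signal keys are search, ai-input and ai-train.
# Allow: / closes the wildcard group so that the Disallow rules below
# apply only to the named AI crawlers.
User-agent: *
Content-Signal: search=yes, ai-train=no, ai-input=yes
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /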
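A robots.txt like this can be checked before deployment. The sketch below uses Python's standard-library `urllib.robotparser`, which skips directives it does not know (such as Content-Signal). It also shows why the wildcard group needs a rule line of its own: in this parser, consecutive `User-agent` lines with no rule between them merge into one group, so a following `Disallow: /` would apply to every crawler. The helper name `can_fetch`, the sample path `/article`, and the two sample files are illustrative; other crawlers' parsers may group lines differently.

```python
from urllib.robotparser import RobotFileParser


def can_fetch(robots_txt: str, agent: str, url: str = "/article") -> bool:
    """Parse a robots.txt string and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Wildcard group has its own rule line, so it stays separate from GPTBot.
WITH_RULE = """\
User-agent: *
Content-Signal: search=yes, ai-train=no
Allow: /
User-agent: GPTBot
Disallow: /
"""

# No rule line after Content-Signal: "*" and GPTBot end up in one group.
WITHOUT_RULE = """\
User-agent: *
Content-Signal: search=yes, ai-train=no
User-agent: GPTBot
Disallow: /
"""

print(can_fetch(WITH_RULE, "Googlebot"))     # True
print(can_fetch(WITH_RULE, "GPTBot"))        # False
print(can_fetch(WITHOUT_RULE, "Googlebot"))  # False: blocked by accident
```

robots.txt is advisory: this only checks what a compliant crawler would do, not what a scraper that ignores the file will do.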
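# nginx: rate-limit AI scrapers (map and limit_req_zone go in the http block)
# The key is the client address for AI bots and empty for everyone else;
# nginx does not count requests whose key is empty, so other clients are unaffected.
map $http_user_agent $ai_bot_key {
    default "";
    ~*(gpt|claude|perplexity|anthropic|cohere)bot $binary_remote_addr;
}
limit_req_zone $ai_bot_key zone=aibots:10m rate=30r/m;
# limit_req has no "if" parameter; the empty key above does the filtering.
limit_req zone=aibots burst=10 nodelay;

Not all bots are a problem. Googlebot, monitoring services, and RSS feed readers are legitimate; the problem is the 32% malicious share.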
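Detecting bots takes a combination of User-Agent checks, reverse DNS verification, TLS fingerprinting (JA3/JA4), and behavioral analysis. The User-Agent alone is not enough because it is trivially spoofed.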
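The reverse DNS step can be sketched as a forward-confirmed lookup: resolve the client IP to a hostname, check that the hostname belongs to the crawler's operator, then resolve the hostname back and confirm it maps to the same IP. The Python sketch below is a minimal version under stated assumptions: the function name `is_verified_crawler` and the injectable lookup parameters are illustrative, and the suffix list covers only the `googlebot.com` and `google.com` domains that Google documents for its crawlers. Other operators publish their own domains or IP ranges.

```python
import socket
from typing import Callable, Iterable

# Assumption: hostname suffixes for Google's crawlers; extend per operator.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def _reverse_lookup(ip: str) -> str:
    """PTR lookup: IP address to primary hostname."""
    return socket.gethostbyaddr(ip)[0]


def _forward_lookup(host: str) -> set:
    """A/AAAA lookup: hostname to the set of addresses it resolves to."""
    return {info[4][0] for info in socket.getaddrinfo(host, None)}


def is_verified_crawler(
    ip: str,
    suffixes: Iterable[str] = GOOGLE_SUFFIXES,
    reverse_lookup: Callable[[str], str] = _reverse_lookup,
    forward_lookup: Callable[[str], Iterable[str]] = _forward_lookup,
) -> bool:
    """Return True only if `ip` passes forward-confirmed reverse DNS."""
    try:
        host = reverse_lookup(ip).rstrip(".").lower()
        # Step 1: the PTR hostname must sit under an expected domain.
        if not host.endswith(tuple(suffixes)):
            return False
        # Step 2: the hostname must resolve back to the same address,
        # otherwise anyone controlling their own PTR record could pass.
        return ip in forward_lookup(host)
    except OSError:
        # socket.herror and socket.gaierror are OSError subclasses:
        # treat any lookup failure as "not verified".
        return False
```

A request claiming to be Googlebot would be passed through `is_verified_crawler(client_ip)` and treated as an ordinary (possibly spoofed) client if it returns False. DNS lookups are slow, so results are normally cached per IP.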
Whether to block AI scrapers depends on the site. Publishers tend to block them to protect monetization; SaaS documentation sites tend to allow them because AI answers send traffic back via links.