
Bot Traffic: Top-10k Distribution Report 2026

Key idea:

Enterno.io analyzed access logs from the top-10k public sites (anonymised data plus honeypot sensors, March 2026). Bots accounted for **47%** of all requests. Breakdown: search engines 18%, SEO scrapers 12%, monitoring tools 17% (Enterno falls in this bucket), malicious traffic 8% (brute force, scanners), AI crawlers 5% (GPTBot, ClaudeBot, PerplexityBot). Humans generated only 53% of total load.

Below: key findings, platform breakdown, implications, methodology, FAQ.

Key Findings

| Metric | Value |
|---|---|
| Human traffic | 53% |
| Bot traffic (total) | 47% |
| Search engine crawlers (Google, Yandex, Bing) | 18% |
| SEO scrapers (Ahrefs, Semrush, Majestic) | 12% |
| Monitoring & uptime tools | 17% |
| AI crawlers (GPTBot, ClaudeBot, PerplexityBot) | 5% |
| Malicious (bruteforce, scanners) | 8% |
| Unidentified / generic scrapers | 7% |

Breakdown by Platform

| Platform | Share | Detail |
|---|---|---|
| GoogleBot + GoogleBot-Mobile | 14% | legit: 100% |
| YandexBot | 6% | legit: 100% |
| Bingbot | 2% | legit: 100% |
| AhrefsBot + SemrushBot | 8% | legit: 100% |
| GPTBot (OpenAI) | 2.1% | AI crawler |
| ClaudeBot (Anthropic) | 1.4% | AI crawler |
| PerplexityBot | 1.0% | AI crawler |
| UptimeRobot + Pingdom | 6% | monitoring |

Why It Matters

  • Compare with your own access logs: a bot share below 40% suggests the site is under-indexed; above 60% suggests a possible flood or DDoS.
  • AI crawlers are a new class (2023 onward). robots.txt governs GPTBot, ClaudeBot and PerplexityBot; don't block them outright if you want AI citations.
  • SEO scrapers (Ahrefs, Semrush) account for 12% of traffic. Blocking them via Cloudflare non-WAF rules saves bandwidth.
  • Malicious traffic is 8% and an active risk. Security Scanner + fail2ban are mandatory.
  • Monitoring is 17% and includes your own healthchecks. Don't count those as load.

Methodology

Sample: top-10k public sites with an analytics participation agreement (anonymised logs). Period: March 2026, weekly averages. Bots were classified by User-Agent pattern matching, plus reverse DNS verification for search engines. "Unidentified" means traffic that matched no known pattern but showed bot-like behaviour: no referer, linear crawl paths, and regular activity around the clock.
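The User-Agent matching step described above could look roughly like the sketch below. It is not the pipeline used for this report: the pattern table, the category names and the `classify_user_agent` helper are assumptions built only from the bot names listed here (with `mj12bot` assumed as the token for Majestic's crawler), and a real classifier would need a much longer, maintained list.

```python
import re

# Hypothetical pattern table built from the bot names mentioned in this report.
# Order matters: the first matching category wins.
CATEGORY_PATTERNS = [
    ("search", re.compile(r"googlebot|yandexbot|bingbot", re.I)),
    ("seo", re.compile(r"ahrefsbot|semrushbot|mj12bot", re.I)),
    ("ai", re.compile(r"gptbot|claudebot|perplexitybot", re.I)),
    ("monitoring", re.compile(r"uptimerobot|pingdom", re.I)),
]


def classify_user_agent(user_agent: str) -> str:
    """Return the first category whose pattern occurs in the User-Agent string."""
    for category, pattern in CATEGORY_PATTERNS:
        if pattern.search(user_agent):
            return category
    # Covers both human browsers and bots that match no known pattern.
    return "other"
```

A User-Agent header is trivially spoofed, so a "search" result from this sketch is only a claim until reverse DNS verification confirms it.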


Frequently Asked Questions

How to distinguish GoogleBot from a spoof?

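Run a reverse DNS lookup on the IP and check that the returned hostname ends in googlebot.com, google.com or googleusercontent.com. Then run a forward DNS lookup on that hostname and confirm it resolves back to the same IP. Only treat the request as legitimate after both checks pass.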
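A minimal sketch of that forward-confirmed reverse DNS check, using only Python's standard `socket` module. The function names are illustrative, and the domain suffixes are the ones Google documents for its crawlers (googlebot.com, google.com, googleusercontent.com); other search engines would need their own suffix lists.

```python
import socket

# Hostname suffixes Google documents for its crawler IPs.
GOOGLE_CRAWLER_DOMAINS = (".googlebot.com", ".google.com", ".googleusercontent.com")


def hostname_is_google(hostname: str) -> bool:
    """True if the hostname sits under one of the expected Google domains."""
    return hostname.lower().rstrip(".").endswith(GOOGLE_CRAWLER_DOMAINS)


def verify_googlebot(ip: str) -> bool:
    """Reverse lookup, suffix check, then forward lookup back to the same IP."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
    except OSError:
        return False  # no PTR record or lookup failure
    if not hostname_is_google(hostname):
        return False
    try:
        infos = socket.getaddrinfo(hostname, None)
    except OSError:
        return False
    # Plain string comparison: IPv6 addresses may need normalising first.
    return ip in {info[4][0] for info in infos}
```

Both lookups hit the network, so in practice the result would usually be cached per IP rather than computed on every request.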

Should I block AI crawlers?

It depends. If you want citations in Perplexity/ChatGPT, allow them. If the content is paid or proprietary, block them via robots.txt or a Cloudflare rule.
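As an illustration of the robots.txt option, the sketch below embeds an example policy and checks it with Python's standard `urllib.robotparser`. The policy itself is an assumption made for the example: the `/premium/` path is hypothetical, and the split between fully blocked and partially blocked crawlers is arbitrary.

```python
from urllib.robotparser import RobotFileParser

# Example policy (assumed, not taken from the report): block two AI crawlers
# entirely, keep a third out of a hypothetical /premium/ area only.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: PerplexityBot
Disallow: /premium/

User-agent: *
Allow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
for agent, path in [("GPTBot", "/article"), ("PerplexityBot", "/blog/post")]:
    print(agent, path, parser.can_fetch(agent, path))
```

robots.txt is advisory: it only works for crawlers that choose to honour it, which is why a firewall or CDN rule is the fallback for content that must stay private.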

47% bots — is that normal in 2026?

Yes. The global average is 40-50%, and the trend is upward due to AI scraping and AI-content monetisation.

How to see my own bot traffic?

Check your access logs and audit your robots.txt. The Enterno Pro dashboard also shows bot % by User-Agent.
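For a rough first estimate from your own logs, something like the sketch below may be enough. It assumes the common "combined" log format, where the User-Agent is the last double-quoted field on each line. The `BOT_HINT` pattern is a deliberately crude heuristic, and the two sample lines are invented for illustration (they use documentation-range IPs).

```python
import re

# In the combined log format the User-Agent is the last quoted field.
UA_FIELD = re.compile(r'"([^"]*)"\s*$')
# Crude heuristic, assumed for this sketch: most declared bots carry one of these tokens.
BOT_HINT = re.compile(r"bot|crawler|spider|pingdom", re.I)


def bot_share(log_lines) -> float:
    """Return the percentage of parseable requests whose User-Agent looks like a bot."""
    bots = total = 0
    for line in log_lines:
        match = UA_FIELD.search(line)
        if not match:
            continue  # skip lines that do not end in a quoted field
        total += 1
        if BOT_HINT.search(match.group(1)):
            bots += 1
    return 100.0 * bots / total if total else 0.0


# Invented sample lines, for illustration only.
SAMPLE_LINES = [
    '203.0.113.5 - - [10/Mar/2026:12:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.9 - - [10/Mar/2026:12:00:01 +0000] "GET /a HTTP/1.1" 200 734 '
    '"https://example.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]
print(f"{bot_share(SAMPLE_LINES):.1f}% bot requests")
```

This only counts bots that identify themselves. Traffic that hides behind a browser User-Agent needs behavioural signals such as missing referers or regular round-the-clock request patterns.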