Robots.txt & Sitemap Validator
Check robots.txt syntax, crawl rules, and sitemap accessibility for any domain
The robots.txt checker parses your file, shows which paths are allowed or denied for each User-agent, and highlights common errors: bad syntax, blocked important paths, and CSS/JS files in Disallow rules. It also validates Sitemap references and Yandex's Clean-param directives.
Robots.txt Checker
The tool analyzes your site's robots.txt file, which controls search engine crawler access to pages. It checks the rules for every user-agent group: Allow/Disallow directives, Crawl-delay, and Sitemap links. An incorrect robots.txt can keep important pages out of search results or leave internal sections open to crawling.
Common robots.txt mistakes include blocking CSS/JS files (which breaks rendering for Google), omitting the Sitemap directive, writing Allow/Disallow paths without a leading slash, and setting conflicting rules for the same path. Our validator catches these issues and shows which URLs are blocked for each user agent.
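As a rough illustration of how such checks can work, here is a minimal Python sketch that flags three of the mistakes above: a path without a leading slash, a Disallow rule that targets CSS/JS files, and a missing Sitemap directive. The function name `lint_robots` and the warning messages are hypothetical and are not the validator's actual implementation; conflicting-rule detection is left out.

```python
def lint_robots(text):
    """Return (line_number, message) warnings; line 0 means a file-level issue."""
    warnings = []
    has_sitemap = False
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # strip comments and whitespace
        if not line:
            continue
        field, sep, value = line.partition(":")
        if not sep:
            warnings.append((number, "missing ':' separator"))
            continue
        field, value = field.strip().lower(), value.strip()
        if field == "sitemap":
            has_sitemap = True
        elif field in ("allow", "disallow"):
            # An empty value is valid ("Disallow:" allows everything).
            if value and not value.startswith(("/", "*")):
                warnings.append((number, f"{field} path {value!r} has no leading slash"))
            if field == "disallow" and value.lower().rstrip("$").endswith((".css", ".js")):
                warnings.append((number, "Disallow rule blocks CSS/JS files"))
    if not has_sitemap:
        warnings.append((0, "no Sitemap directive found"))
    return warnings


sample = "User-agent: *\nDisallow: admin/\nDisallow: /*.css$\nAllow: /public/\n"
for number, message in lint_robots(sample):
    print(number, message)
# 2 disallow path 'admin/' has no leading slash
# 3 Disallow rule blocks CSS/JS files
# 0 no Sitemap directive found
```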
Always test robots.txt changes before deploying to production: a single typo, such as a stray Disallow: /, can block crawling of your entire site. After validation, check broken links to ensure blocked pages aren't linked from active content. robots.txt is publicly readable and does not restrict access, so protect sensitive paths with authentication and review your security headers.
How it works
Enter site URL
Parse robots.txt
Check crawl rules
Why check robots.txt?
robots.txt controls which pages search bots may crawl. Incorrect directives can accidentally block the entire site from being crawled or leave administrative sections open to crawling.
Full Parsing
Parses robots.txt per RFC 9309 (User-agent, Allow, Disallow) and also reads the widely used Crawl-delay and Sitemap lines, which RFC 9309 itself does not define.
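The grouping step of such a parser can be sketched in Python as follows. This is an illustrative sketch, not the tool's code: the function name `parse_robots` and the dictionary layout are assumptions. It follows the RFC 9309 convention that consecutive User-agent lines share one group, and it collects Sitemap lines separately because they do not belong to any group.

```python
def parse_robots(text):
    """Split robots.txt text into user-agent groups plus a list of sitemap URLs."""
    groups, sitemaps = [], []
    current = None
    last_was_agent = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        field, sep, value = line.partition(":")
        if not sep:
            continue  # blank or malformed line
        field, value = field.strip().lower(), value.strip()
        if field == "sitemap":
            sitemaps.append(value)  # independent of any group
            continue
        if field == "user-agent":
            if not last_was_agent:  # a new run of User-agent lines starts a group
                current = {"agents": [], "rules": [], "crawl_delay": None}
                groups.append(current)
            current["agents"].append(value.lower())
            last_was_agent = True
            continue
        last_was_agent = False
        if current is None:
            continue  # rule before any User-agent line is ignored
        if field in ("allow", "disallow"):
            current["rules"].append((field, value))
        elif field == "crawl-delay":
            current["crawl_delay"] = value
    return groups, sitemaps


sample = """User-agent: GPTBot
User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
Crawl-delay: 5
Sitemap: https://example.com/sitemap.xml
"""
groups, sitemaps = parse_robots(sample)
print(groups[0]["agents"], groups[0]["rules"])  # ['gptbot', 'claudebot'] [('disallow', '/')]
print(sitemaps)                                 # ['https://example.com/sitemap.xml']
```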
URL Tester
Enter a specific URL and User-agent to find out whether that bot is allowed to crawl it.
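A URL test of this kind can be sketched in Python using the RFC 9309 precedence rule: among the rules that match the path, the longest pattern wins, and Allow wins a tie. The helper names (`pattern_to_regex`, `is_allowed`, `check`) and the `groups` dictionary shape are hypothetical, and user-agent selection is simplified to an exact token match with a fallback to `*`.

```python
import re
from urllib.parse import urlsplit


def pattern_to_regex(pattern):
    """Translate a robots.txt pattern ('*' wildcard, trailing '$' anchor) to a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(rules, path):
    """rules: list of ("allow" | "disallow", pattern). Longest matching pattern wins."""
    best_kind, best_len = "allow", -1  # no matching rule means the path is allowed
    for kind, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_kind, best_len = kind, length
    return best_kind == "allow"


def check(groups, url, user_agent):
    """groups: dict mapping a lowercase agent token to its rules; falls back to '*'."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    rules = groups.get(user_agent.lower(), groups.get("*", []))
    return is_allowed(rules, path)


groups = {
    "*": [("disallow", "/api/"), ("allow", "/api/docs"), ("disallow", "/*.pdf$")],
    "gptbot": [("disallow", "/")],
}
print(check(groups, "https://example.com/api/users", "Googlebot"))       # False
print(check(groups, "https://example.com/api/docs/intro", "Googlebot"))  # True
print(check(groups, "https://example.com/api/docs/intro", "GPTBot"))     # False
```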
AI Crawlers
Automatically shows the crawl status for GPTBot, ClaudeBot, PerplexityBot, and Googlebot.
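A per-bot status table like this can be approximated with Python's standard `urllib.robotparser` module. The sketch below is an assumption about how one might do it, not the tool's implementation; the function name `crawler_status` and the sample file are made up for illustration. Note that `urllib.robotparser` applies rules in file order rather than by longest match, so results can differ from RFC 9309 when Allow and Disallow patterns overlap.

```python
from urllib.robotparser import RobotFileParser

BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Googlebot"]


def crawler_status(robots_text, url, bots=BOTS):
    """Return {bot_name: True/False} for whether each bot may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())  # parse from text, no network request
    return {bot: parser.can_fetch(bot, url) for bot in bots}


sample = """User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /private/

User-agent: *
Allow: /
"""
print(crawler_status(sample, "https://example.com/"))
# {'GPTBot': False, 'ClaudeBot': True, 'PerplexityBot': True, 'Googlebot': True}
```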
Sitemap List
All Sitemap: directives in one place with quick links for verification.
Who uses this
SEO: crawl directive audit
Developers: post-deploy check
Marketers: indexation control
Site owners: block unwanted crawlers
Best Practices
User-agent: * applies to every bot that has no more specific group of its own, including AI crawlers.
Sitemap: https://example.com/sitemap.xml helps bots find all pages.
Frequently Asked Questions
What is robots.txt?
robots.txt is a text file at the root of a site that tells search bots which pages they may or may not crawl. It is a request, not an enforced block: malicious bots may ignore it.
What is the difference between robots.txt and meta robots?
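robots.txt blocks crawling (the bot will not visit the page). Meta robots (noindex) blocks indexing (the bot visits the page but does not add it to the index). The two do not combine: if robots.txt blocks a page, the bot never sees its meta noindex, and the URL can still end up in the index through links from other pages. To keep a page out of the index, leave it crawlable and use noindex.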
How to correctly specify Sitemap in robots.txt?
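Add the line Sitemap: https://example.com/sitemap.xml to the file. It may appear anywhere because it does not belong to a user-agent group, but the end of the file is the usual place. The URL must be absolute. Multiple sitemaps can be listed, one per line. This helps bots find the sitemap faster.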
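To illustrate the absolute-URL requirement, here is a small Python sketch that extracts every Sitemap line and marks whether its URL is absolute. The function name `sitemap_urls` is hypothetical, and "absolute" is assumed here to mean an http or https scheme plus a host.

```python
from urllib.parse import urlsplit


def sitemap_urls(robots_text):
    """Return (url, is_absolute) for every Sitemap line in the file."""
    found = []
    for raw in robots_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap":
            url = value.strip()
            parts = urlsplit(url)
            is_absolute = parts.scheme in ("http", "https") and bool(parts.netloc)
            found.append((url, is_absolute))
    return found


sample = """User-agent: *
Disallow:
Sitemap: https://example.com/sitemap.xml
sitemap: /sitemap-news.xml
"""
print(sitemap_urls(sample))
# [('https://example.com/sitemap.xml', True), ('/sitemap-news.xml', False)]
```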
What is Crawl-delay?
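Crawl-delay is a robots.txt directive that sets a pause between bot requests, in seconds. Support varies by crawler: Bing honors it, Yandex announced in 2018 that it no longer does and offers a crawl rate setting in Yandex Webmaster instead, and Google ignores it. Google adjusts its crawl rate automatically; the crawl rate setting in Search Console was retired in 2024.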
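For reference, Python's standard `urllib.robotparser` can read the Crawl-delay value per user-agent. The snippet below is a minimal sketch with a made-up sample file; it only shows what the file declares and says nothing about whether a given crawler honors the value.

```python
from urllib.robotparser import RobotFileParser

sample = """User-agent: bingbot
Crawl-delay: 10
Disallow: /search

User-agent: *
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(sample.splitlines())  # parse from text, no network request

print(parser.crawl_delay("bingbot"))    # 10
print(parser.crawl_delay("Googlebot"))  # None (no Crawl-delay in the '*' group)
```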
What are common robots.txt mistakes?
Common mistakes: blocking CSS/JS files (prevents rendering), Disallow: / (blocks the entire site), a missing file (bots treat everything as allowed), blocking /api/ without an Allow for /api/docs, and wrong capitalization in paths (path matching is case-sensitive, while User-agent names are matched case-insensitively under RFC 9309).
How to check robots.txt?
Our tool analyzes syntax, checks file accessibility, finds conflicting rules, and warns about potential issues. You can also use the URL Inspection tool in Google Search Console to see whether a specific URL is blocked by robots.txt.