Crawler blocked guide

What to do when your site blocks our crawler

The free sitemap and llms.txt generators need to fetch your public pages and root files. If your host returns a bot challenge or an HTTP 403, 429, or 503 response, work through this checklist before rerunning the tool.


Allow the audit user agent

If your WAF blocks unknown bots, allow the audit user agent for public HTML, robots.txt, sitemap.xml, and llms.txt.

User-Agent: layzr.ai-agentic-audit/1.0
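To check how your site responds to this user agent, you can send a request that carries the same header and inspect the status code. The following is a minimal sketch using only the Python standard library. The function names `build_audit_request` and `fetch_status` are hypothetical, and the real crawler may send additional headers that this sketch does not reproduce.

```python
import urllib.error
import urllib.request

# User agent string as published in this guide.
AUDIT_USER_AGENT = "layzr.ai-agentic-audit/1.0"


def build_audit_request(url: str) -> urllib.request.Request:
    """Build a plain GET request that identifies itself as the audit user agent."""
    return urllib.request.Request(
        url,
        headers={"User-Agent": AUDIT_USER_AGENT},
        method="GET",
    )


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code the site sends back for the audit user agent."""
    try:
        with urllib.request.urlopen(build_audit_request(url), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the code is still the answer we want.
        return err.code
```

For example, `fetch_status("https://example.com/robots.txt")` returning 403, 429, or 503 suggests that the request was stopped before it reached your app.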

Bypass challenges for public files

Security challenges are useful for forms and private routes, but they often block crawler-facing files that should stay public.

/robots.txt
/sitemap.xml
/llms.txt
/llms-full.txt
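The exemption itself is configured in your WAF or CDN, but the rule it needs to express is small. As a rough sketch of that logic in Python (the name `should_skip_challenge` is hypothetical and does not correspond to any provider's API), a request qualifies only when it is a GET for one of the four paths listed above:

```python
from urllib.parse import urlsplit

# The crawler-facing files listed in this guide.
PUBLIC_DISCOVERY_PATHS = frozenset(
    {"/robots.txt", "/sitemap.xml", "/llms.txt", "/llms-full.txt"}
)


def should_skip_challenge(method: str, url_or_path: str) -> bool:
    """Return True only for GET requests to a public discovery file.

    Query strings are ignored; every other path and method keeps
    its normal security checks.
    """
    path = urlsplit(url_or_path).path
    return method.upper() == "GET" and path in PUBLIC_DISCOVERY_PATHS
```

Matching on exact paths rather than prefixes keeps forms and private routes behind the challenge.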

Check CDN bot settings

Cloudflare, Vercel, and other edge providers can challenge automated requests before your app receives them.

In your provider's dashboard, look for bot fight mode, WAF rules, rate limits, and security checkpoints that apply to your public pages and root files.
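The status code you receive can hint at which setting to inspect first. The mapping below is a heuristic sketch, not a rule any provider documents: the status meanings are standard HTTP, but the suggested places to look are assumptions, and the name `explain_status` is hypothetical.

```python
# Heuristic hints only: the same status can come from different settings
# depending on the provider and its configuration.
BLOCK_HINTS = {
    403: "Forbidden: check WAF rules and bot protection settings.",
    429: "Too Many Requests: check rate limits for automated traffic.",
    503: "Service Unavailable: check for a security checkpoint or challenge page.",
}


def explain_status(status: int) -> str:
    """Map an HTTP status code to a first place to look in edge settings."""
    if 200 <= status < 300:
        return "OK: the file was served without a block."
    return BLOCK_HINTS.get(
        status, f"Status {status}: not a typical block, check your host's logs."
    )
```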

Publish static files at the root

When crawlers cannot run your app's JavaScript, static files served from the site root are the most reliable discovery path.

https://example.com/robots.txt
https://example.com/sitemap.xml
https://example.com/llms.txt

What happened

Your public page returned a bot challenge

The generators stop when they cannot safely read the site. We do not try to bypass bot protection, solve challenges, or store blocked responses.

Practical path

Confirm /robots.txt, /sitemap.xml, and /llms.txt are reachable in a private browser window.
Check your CDN or host security logs for requests using layzr.ai-agentic-audit/1.0.
Allow public GET requests to root discovery files, then rerun the sitemap or llms.txt generator.
If you cannot change WAF rules, hand-write the sitemap.xml and llms.txt files from the public URLs you already know.
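If you take the hand-written route in the last step, a short script can turn a list of known URLs into both files. This is a minimal sketch, assuming the sitemaps.org 0.9 namespace for sitemap.xml and a simple markdown link index for llms.txt. The function names and the single "Pages" section are illustrative choices, not the output format of the generators.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls: list[str]) -> str:
    """Return a minimal sitemap.xml document with one <url><loc> per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    # ElementTree escapes characters such as "&" inside the URLs.
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body + "\n"


def build_llms_txt(site_name: str, pages: list[tuple[str, str]]) -> str:
    """Return a markdown index: a title, then one link per (title, url) pair."""
    lines = [f"# {site_name}", "", "## Pages", ""]
    lines += [f"- [{title}]({url})" for title, url in pages]
    return "\n".join(lines) + "\n"
```

Write the two strings to sitemap.xml and llms.txt and publish them at the site root alongside robots.txt.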

