Robots.txt Validator - Crawl Rules Analyzer
Analyze robots.txt files for any domain. Extract crawl rules, sitemaps, blocked paths, and crawl-delay settings. Validate configuration and identify SEO issues in bulk.
Developer: Ava Torres
robots.txt Validator & Analyzer
Fetch, parse, and analyze robots.txt files for any domain in bulk. Built for SEO professionals, developers, and crawler operators who need to audit site access rules at scale.
What It Does
For each domain you supply, the actor:
- Fetches `/robots.txt` from the domain root over HTTPS (falls back gracefully on 404 or network errors)
- Parses all `User-agent`, `Allow`, `Disallow`, `Crawl-delay`, and `Sitemap` directives
- Reports structured rules grouped by user-agent
- Optionally checks whether specific paths are allowed or blocked for your chosen user-agent
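The parse-and-check steps above can be sketched with Python's standard-library `urllib.robotparser`. This is an illustrative simplification, not the actor's published implementation; the `analyze_robots` helper and its field names are assumptions chosen to mirror the output schema:

```python
# Sketch of the parse/check flow using only the standard library.
# analyze_robots is a hypothetical helper; field names mirror the
# actor's output schema, but the actor's internals may differ.
import urllib.robotparser

def analyze_robots(robots_txt, domain, user_agent="*", check_paths=None):
    """Parse raw robots.txt text and evaluate paths for one user-agent."""
    rp = urllib.robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return {
        "domain": domain,
        "sitemapUrls": rp.site_maps() or [],          # all Sitemap: lines
        "crawlDelay": rp.crawl_delay(user_agent),     # None if not set
        "analyzedPaths": [
            {"path": p, "allowed": rp.can_fetch(user_agent, p)}
            for p in (check_paths or [])
        ],
    }

sample = """User-agent: *
Disallow: /admin
Crawl-delay: 5
Sitemap: https://example.com/sitemap.xml
"""
result = analyze_robots(sample, "example.com", "*", ["/admin", "/blog"])
```

Here `/admin` comes back blocked and `/blog` allowed, with a crawl delay of 5 seconds and one declared sitemap.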
Input
| Field | Type | Required | Description |
|---|---|---|---|
| `urls` | string[] | Yes | Domains or full URLs (e.g. `google.com`, `https://openai.com/blog`) |
| `userAgent` | string | No | User-agent to evaluate rules for. Defaults to `*` |
| `checkPaths` | string[] | No | Specific paths to test for allow/disallow (e.g. `/admin`, `/api/`) |
| `maxResults` | integer | No | Cap on the number of domains to process. Defaults to 100 |
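A complete input object, assembled from the example values in the table above:

```json
{
  "urls": ["google.com", "https://openai.com/blog"],
  "userAgent": "*",
  "checkPaths": ["/admin", "/api/"],
  "maxResults": 100
}
```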
Output
One record per domain:
| Field | Description |
|---|---|
| `domain` | Domain name |
| `robotsTxtUrl` | Full URL of the fetched robots.txt |
| `robotsTxtFound` | `true` if HTTP 200 was returned |
| `robotsTxtContent` | Raw robots.txt text |
| `userAgentRules` | Parsed rule blocks, each with a `userAgent` and a `rules` array of `{directive, path}` objects |
| `sitemapUrls` | All `Sitemap` URLs declared in the file |
| `crawlDelay` | `Crawl-delay` in seconds for the requested user-agent (`null` if not set) |
| `analyzedPaths` | Per-path results: `{path, allowed}` for each path in `checkPaths` |
| `fetchError` | Error message if the file could not be fetched |
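An illustrative record, with hypothetical values for a domain that disallows `/admin` and declares one sitemap:

```json
{
  "domain": "example.com",
  "robotsTxtUrl": "https://example.com/robots.txt",
  "robotsTxtFound": true,
  "robotsTxtContent": "User-agent: *\nDisallow: /admin\nSitemap: https://example.com/sitemap.xml\n",
  "userAgentRules": [
    {
      "userAgent": "*",
      "rules": [{ "directive": "Disallow", "path": "/admin" }]
    }
  ],
  "sitemapUrls": ["https://example.com/sitemap.xml"],
  "crawlDelay": null,
  "analyzedPaths": [{ "path": "/admin", "allowed": false }],
  "fetchError": null
}
```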
Example Use Cases
- SEO audit: Check which bots can access which parts of your site
- Crawler compliance: Verify your spider respects `Disallow` rules before running at scale
- Competitive research: Understand what paths competitors block from indexing
- Security review: Identify paths hidden from crawlers (admin panels, staging URLs)
- Sitemap discovery: Extract all declared sitemap URLs without manual inspection
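For the crawler-compliance case specifically, the same rules can gate requests before they are made. A minimal sketch using the standard library (the `mybot` user-agent and `may_crawl` helper are illustrative; a real crawler would also cache parsed rules per host):

```python
# Pre-flight compliance gate: consult parsed robots.txt rules
# before issuing each request. Illustrative only.
import urllib.robotparser

rp = urllib.robotparser.RobotFileParser()
rp.parse("""User-agent: mybot
Disallow: /private/
""".splitlines())

def may_crawl(path, agent="mybot"):
    """Return True if robots.txt permits this agent to fetch the path."""
    return rp.can_fetch(agent, path)
```

Calling `may_crawl("/private/page")` returns `False`, while paths outside the disallowed prefix return `True`.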
Pricing
$0.10 per 1,000 domains checked, so a typical run of 100 domains costs about $0.01.