Robots.txt Validator - Crawl Rules Analyzer

Analyze robots.txt files for any domain. Extract crawl rules, sitemaps, blocked paths, and crawl-delay settings. Validate configuration and identify SEO issues in bulk.

Pricing

Pay per usage

Developer

Ava Torres

Maintained by Community

Actor stats

0 bookmarked · 2 total users · 1 monthly active user · last modified 20 hours ago

robots.txt Validator & Analyzer

Fetch, parse, and analyze robots.txt files for any domain in bulk. Built for SEO professionals, developers, and crawler operators who need to audit site access rules at scale.

What It Does

For each domain you supply, the actor:

  1. Fetches /robots.txt from the domain root over HTTPS (falls back gracefully on 404 or network errors)
  2. Parses all User-agent, Allow, Disallow, Crawl-delay, and Sitemap directives
  3. Reports structured rules grouped by user-agent
  4. Optionally checks whether specific paths are allowed or blocked for your chosen user-agent
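The four steps above can be sketched with Python's standard library alone. The function names and the simplified group-parsing logic below are illustrative assumptions, not the actor's actual code:

```python
import urllib.error
import urllib.request


def fetch_robots_txt(domain: str, timeout: float = 10.0):
    """Step 1: fetch /robots.txt over HTTPS, degrading gracefully on errors."""
    url = f"https://{domain}/robots.txt"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return url, resp.read().decode("utf-8", errors="replace"), None
    except (urllib.error.URLError, OSError) as exc:
        return url, None, str(exc)  # 404s and network failures become fetchError


def parse_robots_txt(content: str):
    """Steps 2-3: collect directives and group rules by user-agent."""
    groups, sitemaps = {}, []
    current_agents, in_agent_header = [], False
    for raw in content.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            if not in_agent_header:  # consecutive User-agent lines share one group
                current_agents = []
            current_agents.append(value)
            groups.setdefault(value, [])
            in_agent_header = True
        elif field == "sitemap":
            sitemaps.append(value)  # Sitemap is global, not tied to a user-agent
        elif field in ("allow", "disallow", "crawl-delay"):
            in_agent_header = False
            for agent in current_agents:
                # crawl-delay values land in the same {directive, path} shape here,
                # purely to keep the sketch short
                groups[agent].append({"directive": field, "path": value})
    return groups, sitemaps
```

Note that `partition(":")` splits at the first colon only, so `Sitemap:` values containing `https://` URLs survive intact.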

Input

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| urls | string[] | Yes | Domains or full URLs (e.g. google.com, https://openai.com/blog) |
| userAgent | string | No | User-agent to evaluate rules for. Defaults to * |
| checkPaths | string[] | No | Specific paths to test for allow/disallow (e.g. /admin, /api/) |
| maxResults | integer | No | Cap on domains to process. Defaults to 100 |
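A hypothetical run input matching the fields above (the `userAgent` value is illustrative; the other values come from the table's own examples):

```json
{
  "urls": ["google.com", "https://openai.com/blog"],
  "userAgent": "Googlebot",
  "checkPaths": ["/admin", "/api/"],
  "maxResults": 100
}
```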

Output

One record per domain:

| Field | Description |
| --- | --- |
| domain | Domain name |
| robotsTxtUrl | Full URL of the fetched robots.txt |
| robotsTxtFound | true if HTTP 200 was returned |
| robotsTxtContent | Raw robots.txt text |
| userAgentRules | Parsed rule blocks, each with userAgent and a rules array of {directive, path} |
| sitemapUrls | All Sitemap URLs declared in the file |
| crawlDelay | Crawl-delay in seconds for the requested user-agent (null if not set) |
| analyzedPaths | Per-path results: {path, allowed} for each path in checkPaths |
| fetchError | Error message if the file could not be fetched |
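The analyzedPaths allow/disallow check can be approximated locally with Python's standard-library robotparser. This is a sketch only, and with one caveat: CPython's parser applies rules in file order (first match wins) rather than Google's longest-match rule, so the sample content below is ordered accordingly:

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt content; in practice, feed in the robotsTxtContent field
robots_txt = """\
User-agent: *
Allow: /admin/help
Disallow: /admin
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Mirrors the analyzedPaths output shape: {path, allowed} per requested path
analyzed = [
    {"path": p, "allowed": parser.can_fetch("*", f"https://example.com{p}")}
    for p in ["/admin", "/admin/help", "/blog"]
]
```

Paths matched by no rule (here, /blog) default to allowed.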

Example Use Cases

  • SEO audit: Check which bots can access which parts of your site
  • Crawler compliance: Verify your spider respects Disallow rules before running at scale
  • Competitive research: Understand what paths competitors block from indexing
  • Security review: Identify paths hidden from crawlers (admin panels, staging URLs)
  • Sitemap discovery: Extract all declared sitemap URLs without manual inspection

Pricing

$0.10 per 1,000 domains checked. Typical run of 100 domains costs less than $0.02.