Robots.txt Analyzer

Fetch and analyze robots.txt files in bulk for SEO auditing and crawl compliance. This actor extracts allowed and disallowed paths, sitemaps, crawl delays, and user-agent-specific rules from any website's robots.txt file.

Features

  • Bulk robots.txt analysis for multiple domains
  • Extract all user-agent rules, allowed paths, and disallowed paths
  • Find sitemaps referenced in robots.txt
  • Check crawl delay directives for specific user-agents
  • Filter rules by target user-agent (Googlebot, Bingbot, etc.)
  • Raw robots.txt content included in output

Use Cases

SEO professionals use robots.txt analysis to ensure search engine crawlers can access important pages. Technical SEO audits require checking that no critical pages are accidentally blocked. Web scraping teams verify compliance with robots.txt before building crawlers. Competitive analysis involves comparing robots.txt configurations across competitors to understand their crawl budget strategies.

Input Configuration

| Parameter | Type | Description |
|-----------|------|-------------|
| urls | Array | List of website URLs to analyze robots.txt from |
| userAgent | String | Specific user-agent to check rules for (default: Googlebot) |
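
For reference, a minimal input might look like the sketch below. The field names follow the table above; the example domains are placeholders, not required values.

```typescript
// Hypothetical input for one run of this actor; "urls" and "userAgent"
// are the parameters documented in the table above.
const input = {
  urls: [
    "https://www.example.com",
    "https://www.example.org",
  ],
  userAgent: "Googlebot", // rules are evaluated for this user-agent
};
```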

Output Format

Each result contains:

  • url - The robots.txt URL analyzed
  • domain - The website domain
  • hasRobotsTxt - Whether a robots.txt file exists
  • statusCode - HTTP status code of the robots.txt request
  • userAgentCount - Number of user-agent sections found
  • sitemaps - Array of sitemap URLs listed in robots.txt
  • allowedPaths - Paths allowed for the checked user-agent
  • disallowedPaths - Paths disallowed for the checked user-agent
  • crawlDelay - Crawl delay value if specified
  • allRules - Complete parsed rules for all user-agents
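
As a rough sketch, a single dataset item could be typed as follows. The field names are taken from the list above; the exact types and the structure of allRules are assumptions, not guaranteed by the actor.

```typescript
// Assumed shape of one output record, based on the fields listed above.
interface RobotsTxtResult {
  url: string;               // the robots.txt URL analyzed
  domain: string;            // the website domain
  hasRobotsTxt: boolean;     // whether a robots.txt file exists
  statusCode: number;        // HTTP status code of the robots.txt request
  userAgentCount: number;    // number of user-agent sections found
  sitemaps: string[];        // sitemap URLs listed in robots.txt
  allowedPaths: string[];    // Allow rules for the checked user-agent
  disallowedPaths: string[]; // Disallow rules for the checked user-agent
  crawlDelay?: number;       // crawl delay value, if specified
  // Parsed rules per user-agent; the nested structure here is assumed.
  allRules: Record<string, { allow: string[]; disallow: string[] }>;
}
```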

SEO Applications

Understanding robots.txt is fundamental to technical SEO. Use this tool to audit your own site's crawl directives, verify competitor sitemap locations, and ensure your crawl infrastructure respects website policies. The Apify API makes it easy to schedule regular audits.
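
For example, a scheduled audit could call the actor through the apify-client package roughly as shown below. The actor ID is a placeholder; use the ID shown on this actor's store page.

```typescript
import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

async function runAudit(): Promise<void> {
  // "<ACTOR_ID>" is a placeholder; replace it with this actor's ID from the Apify Store.
  const run = await client.actor("<ACTOR_ID>").call({
    urls: ["https://www.example.com"],
    userAgent: "Googlebot",
  });

  // Read the analysis results from the run's default dataset.
  const { items } = await client.dataset(run.defaultDatasetId).listItems();
  for (const item of items) {
    console.log(item.domain, item.disallowedPaths);
  }
}

runAudit().catch(console.error);
```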

Limitations

The actor parses standard robots.txt directives. Non-standard directives or malformed files may not be fully parsed. Some websites serve different robots.txt content based on the requesting IP or user-agent. The actor uses a generic user-agent for fetching but checks rules against your specified target user-agent.
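
Because path matching depends on the rules returned for your target user-agent, a simple client-side check against the output might look like the sketch below. It uses plain prefix matching and ignores wildcard syntax (* and $), so treat it as an approximation rather than a full robots.txt matcher.

```typescript
// Approximate allow/disallow check using the actor's output fields.
// Longest matching rule wins; wildcards are not handled here.
function isPathAllowed(
  path: string,
  allowedPaths: string[],
  disallowedPaths: string[],
): boolean {
  const longestMatch = (rules: string[]) =>
    rules
      .filter((rule) => path.startsWith(rule))
      .reduce((max, rule) => Math.max(max, rule.length), 0);

  const allow = longestMatch(allowedPaths);
  const disallow = longestMatch(disallowedPaths);

  // Allowed if no Disallow rule matches, or the most specific match is an Allow rule.
  return disallow === 0 || allow >= disallow;
}

// e.g. isPathAllowed("/blog/post", result.allowedPaths, result.disallowedPaths)
```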