Robots.txt Analyzer

Fetch and analyze robots.txt files in bulk for SEO auditing and crawl compliance. This actor extracts allowed and disallowed paths, sitemaps, crawl delays, and user-agent-specific rules from any website's robots.txt file.

Features

  • Bulk robots.txt analysis for multiple domains
  • Extract all user-agent rules, allowed paths, and disallowed paths
  • Find sitemaps referenced in robots.txt
  • Check crawl delay directives for specific user-agents
  • Filter rules by target user-agent (Googlebot, Bingbot, etc.)
  • Raw robots.txt content included in output

Use Cases

SEO professionals use robots.txt analysis to ensure search engine crawlers can access important pages. Technical SEO audits require checking that no critical pages are accidentally blocked. Web scraping teams verify compliance with robots.txt before building crawlers. Competitive analysis involves comparing robots.txt configurations across competitors to understand their crawl budget strategies.

Input Configuration

| Parameter | Type | Description |
|-----------|------|-------------|
| urls | Array | List of website URLs to analyze robots.txt from |
| userAgent | String | Specific user-agent to check rules for (default: Googlebot) |
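
For reference, a minimal input might look like the sketch below. The field names follow the table above; the example domains are placeholders, not required values.

```typescript
// Hypothetical input for one run of this actor; "urls" and "userAgent"
// are the parameters documented in the table above.
const input = {
  urls: [
    "https://www.example.com",
    "https://www.example.org",
  ],
  userAgent: "Googlebot", // rules are evaluated for this user-agent
};
```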

Output Format

Each result contains:

  • url - The robots.txt URL analyzed
  • domain - The website domain
  • hasRobotsTxt - Whether a robots.txt file exists
  • statusCode - HTTP status code of the robots.txt request
  • userAgentCount - Number of user-agent sections found
  • sitemaps - Array of sitemap URLs listed in robots.txt
  • allowedPaths - Paths allowed for the checked user-agent
  • disallowedPaths - Paths disallowed for the checked user-agent
  • crawlDelay - Crawl delay value if specified
  • allRules - Complete parsed rules for all user-agents
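
As a rough sketch, a single dataset item could be typed as follows. The field names are taken from the list above; the exact types and the structure of allRules are assumptions, not guaranteed by the actor.

```typescript
// Assumed shape of one output record, based on the fields listed above.
interface RobotsTxtResult {
  url: string;               // the robots.txt URL analyzed
  domain: string;            // the website domain
  hasRobotsTxt: boolean;     // whether a robots.txt file exists
  statusCode: number;        // HTTP status code of the robots.txt request
  userAgentCount: number;    // number of user-agent sections found
  sitemaps: string[];        // sitemap URLs listed in robots.txt
  allowedPaths: string[];    // Allow rules for the checked user-agent
  disallowedPaths: string[]; // Disallow rules for the checked user-agent
  crawlDelay?: number;       // crawl delay value, if specified
  // Parsed rules per user-agent; the nested structure here is assumed.
  allRules: Record<string, { allow: string[]; disallow: string[] }>;
}
```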

SEO Applications

Understanding robots.txt is fundamental to technical SEO. Use this tool to audit your own site's crawl directives, verify competitor sitemap locations, and ensure your crawl infrastructure respects website policies. The Apify API makes it easy to schedule regular audits.
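
For example, a scheduled audit could call the actor through the apify-client package roughly as shown below. The actor ID is a placeholder; use the ID shown on this actor's store page.

```typescript
import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

async function runAudit(): Promise<void> {
  // "<ACTOR_ID>" is a placeholder; replace it with this actor's ID from the Apify Store.
  const run = await client.actor("<ACTOR_ID>").call({
    urls: ["https://www.example.com"],
    userAgent: "Googlebot",
  });

  // Read the analysis results from the run's default dataset.
  const { items } = await client.dataset(run.defaultDatasetId).listItems();
  for (const item of items) {
    console.log(item.domain, item.disallowedPaths);
  }
}

runAudit().catch(console.error);
```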

Limitations

The actor parses standard robots.txt directives. Non-standard directives or malformed files may not be fully parsed. Some websites serve different robots.txt content based on the requesting IP or user-agent. The actor uses a generic user-agent for fetching but checks rules against your specified target user-agent.
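
Because path matching depends on the rules returned for your target user-agent, a simple client-side check against the output might look like the sketch below. It uses plain prefix matching and ignores wildcard syntax (* and $), so treat it as an approximation rather than a full robots.txt matcher.

```typescript
// Approximate allow/disallow check using the actor's output fields.
// Longest matching rule wins; wildcards are not handled here.
function isPathAllowed(
  path: string,
  allowedPaths: string[],
  disallowedPaths: string[],
): boolean {
  const longestMatch = (rules: string[]) =>
    rules
      .filter((rule) => path.startsWith(rule))
      .reduce((max, rule) => Math.max(max, rule.length), 0);

  const allow = longestMatch(allowedPaths);
  const disallow = longestMatch(disallowedPaths);

  // Allowed if no Disallow rule matches, or the most specific match is an Allow rule.
  return disallow === 0 || allow >= disallow;
}

// e.g. isPathAllowed("/blog/post", result.allowedPaths, result.disallowedPaths)
```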