# Robots.txt Checker - CMS-Aware Analysis with AI Recommendations

Pricing: from $0.01 / 1,000 results
Validate robots.txt syntax, detect CMS patterns, and get AI-powered optimization advice by John Rippy | johnrippy.link
## What This Actor Does
The Robots.txt Checker provides comprehensive analysis of your robots.txt file:
- Syntax Validation - Detect parsing errors and malformed directives
- CMS Detection - Identify WordPress, Shopify, Drupal, and 6+ other CMS platforms
- Best Practice Checks - Verify sitemap declarations, crawl delays, blocked paths
- Companion File Checks - Validate sitemap.xml, llms.txt, security.txt
- AI Recommendations - CMS-specific optimization suggestions
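To make "syntax validation" concrete, here is a minimal sketch of directive-level checking. The actor's actual parser and rule set are not published, so `KNOWN_DIRECTIVES` and the error wording below are illustrative assumptions, not its real behavior:

```python
# Minimal sketch of robots.txt syntax validation (illustrative only --
# the actor's actual parser and rule set are not published).
KNOWN_DIRECTIVES = {"user-agent", "disallow", "allow", "sitemap", "crawl-delay", "host"}

def find_syntax_errors(robots_txt: str) -> list[str]:
    errors = []
    for lineno, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if ":" not in line:
            errors.append(f"line {lineno}: missing ':' separator")
            continue
        directive = line.split(":", 1)[0].strip().lower()
        if directive not in KNOWN_DIRECTIVES:
            errors.append(f"line {lineno}: unknown directive '{directive}'")
    return errors

sample = "User-agent: *\nDisallow: /admin/\nBadline\nFoo: bar\n"
print(find_syntax_errors(sample))
```

Real-world parsers are more lenient (unknown directives are usually warnings, not hard errors), which is why the actor reports `syntaxErrors` and `warnings` separately.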
## Why Use This Actor?

### The Problem with Manual Checking
Most developers paste robots.txt into a validator, which catches syntax errors but misses:
- CMS-specific paths that should be blocked
- Missing sitemap declarations
- Accidental blocking of important content
- Security and AI crawler considerations
### CMS-Aware Intelligence
This actor detects your CMS and provides targeted recommendations:
| CMS Detected | Smart Recommendations |
|---|---|
| WordPress | Block /wp-admin/, /wp-json/, /?s= search pages |
| Shopify | Block /cart/, /checkout/, /admin/, /search? |
| Drupal | Block /node/, /user/, /admin/, filter paths |
| Magento | Block /checkout/, /customer/, /catalogsearch/ |
| Wix | Block /_api/, /_partials/, internal paths |
## Use Cases

### 1. SEO Audits
Verify clients' robots.txt files don't accidentally block important content.

### 2. Pre-Launch Checks
Ensure robots.txt is properly configured before launching a new site.

### 3. Competitor Analysis
Compare robots.txt configurations across competitor sites.

### 4. Security Compliance
Check for security.txt and ensure proper crawler access controls.
## Quick Start Examples

### Example 1: Single URL Analysis
```json
{
  "url": "https://example.com",
  "includeAIRecommendations": true
}
```

### Example 2: Batch Analysis with All Checks
```json
{
  "urls": [
    "https://yoursite.com",
    "https://competitor1.com",
    "https://competitor2.com"
  ],
  "includeSitemapCheck": true,
  "includeLlmsTxtCheck": true,
  "includeSecurityTxtCheck": true
}
```

### Example 3: Demo Mode (Free Testing)
```json
{
  "demoMode": true
}
```

### Example 4: With AI Enhancement (BYOK)
```json
{
  "url": "https://example.com",
  "includeAIRecommendations": true,
  "anthropicApiKey": "sk-ant-..."
}
```
## Input Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| demoMode | boolean | No | false | Run with sample data (free, no URL fetching) |
| url | string | No* | - | Single URL to analyze |
| urls | array | No* | - | Array of URLs to analyze |
| includeSitemapCheck | boolean | No | true | Verify sitemap.xml exists |
| includeLlmsTxtCheck | boolean | No | false | Check for llms.txt |
| includeSecurityTxtCheck | boolean | No | false | Check for security.txt |
| includeAIRecommendations | boolean | No | true | Generate AI recommendations |
| anthropicApiKey | string | No | - | BYOK for enhanced AI recommendations |
| webhookUrl | string | No | - | Webhook URL for integrations |

\*Either `url` or `urls` is required unless `demoMode` is enabled.
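The footnote's requirement can be expressed as a small validation helper. This is a sketch of the input contract only; `validate_input` is a hypothetical name, not part of the actor's published code:

```python
# Sketch of the input contract described in the parameters table:
# either 'url' or 'urls' must be present unless demoMode is enabled.
def validate_input(run_input: dict) -> None:
    if run_input.get("demoMode"):
        return  # demo mode runs on sample data, no URL needed
    if not run_input.get("url") and not run_input.get("urls"):
        raise ValueError("Either 'url' or 'urls' is required unless demoMode is true")

validate_input({"demoMode": True})                 # ok
validate_input({"urls": ["https://example.com"]})  # ok
```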
## Output Format

```json
{
  "url": "https://example.com",
  "robotsTxtUrl": "https://example.com/robots.txt",
  "timestamp": "2024-12-25T12:00:00.000Z",
  "status": "found",
  "score": 85,
  "rules": [
    {
      "userAgent": "*",
      "disallow": ["/admin/", "/private/"],
      "allow": ["/admin/login"]
    }
  ],
  "sitemaps": ["https://example.com/sitemap.xml"],
  "hasWildcardUserAgent": true,
  "syntaxErrors": [],
  "warnings": [],
  "bestPractices": {
    "hasSitemapDeclaration": true,
    "hasReasonableCrawlDelay": true,
    "blocksImportantPaths": [],
    "allowsSearchEngines": true
  },
  "detectedCms": "WordPress",
  "cmsRecommendations": [
    "Consider adding Disallow: /wp-json/ to prevent REST API indexing"
  ],
  "sitemapXml": {
    "exists": true,
    "url": "https://example.com/sitemap.xml",
    "urlCount": 245
  },
  "llmsTxt": {
    "exists": false,
    "url": "https://example.com/llms.txt"
  },
  "securityTxt": {
    "exists": true,
    "url": "https://example.com/.well-known/security.txt",
    "hasContact": true,
    "hasExpires": true
  },
  "recommendations": [
    {
      "priority": 1,
      "category": "cms_specific",
      "issue": "WordPress optimization opportunity",
      "recommendation": "Block /wp-json/ to prevent REST API indexing",
      "impact": "medium"
    }
  ]
}
```
## Scoring System
The actor calculates a 0-100 score based on:
| Factor | Impact |
|---|---|
| Syntax errors | -10 each (max -30) |
| Missing sitemap declaration | -15 |
| Unreasonable crawl delay (>10s) | -10 |
| Blocks important paths | -5 each |
| Blocks search engines | -20 |
| Has sitemap.xml | +5 (bonus) |
| Has llms.txt | +2 (bonus) |
| Has security.txt | +3 (bonus) |
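Under those weights, scoring reduces to simple arithmetic. Here is a minimal sketch, assuming penalties and bonuses are summed from a 100-point baseline and the result is clamped to 0-100 (the actor's exact clamping behavior is not documented):

```python
# Sketch of the scoring table above; baseline and clamping are assumptions.
def compute_score(syntax_errors=0, missing_sitemap_decl=False, slow_crawl_delay=False,
                  blocked_important_paths=0, blocks_search_engines=False,
                  has_sitemap_xml=False, has_llms_txt=False, has_security_txt=False) -> int:
    score = 100
    score -= min(10 * syntax_errors, 30)   # -10 each, capped at -30
    if missing_sitemap_decl:
        score -= 15
    if slow_crawl_delay:                   # crawl delay > 10s
        score -= 10
    score -= 5 * blocked_important_paths   # -5 per blocked important path
    if blocks_search_engines:
        score -= 20
    # companion-file bonuses
    score += 5 * has_sitemap_xml + 2 * has_llms_txt + 3 * has_security_txt
    return max(0, min(100, score))

print(compute_score(syntax_errors=1, has_sitemap_xml=True))  # 95
```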
## AI Recommendations

### Without an Anthropic API Key
Falls back to rule-based recommendations derived from:
- Detected CMS patterns
- Common SEO best practices
- Security standards
### With an Anthropic API Key (BYOK)
Enhanced analysis using Claude to:
- Identify subtle configuration issues
- Provide context-aware suggestions
- Prioritize recommendations by impact
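As an illustration of the rule-based path, here is a hypothetical lookup table mapping a detected CMS to the blocking suggestions listed earlier. The names and structure are assumptions for illustration, not the actor's code:

```python
# Hypothetical rule table derived from the CMS recommendations table above.
CMS_RULES = {
    "WordPress": ["/wp-admin/", "/wp-json/", "/?s="],
    "Shopify":   ["/cart/", "/checkout/", "/admin/", "/search?"],
    "Drupal":    ["/node/", "/user/", "/admin/"],
}

def rule_based_recommendations(cms: str, disallowed: list[str]) -> list[dict]:
    # Suggest blocking any recommended path not already disallowed,
    # mirroring the shape of the 'recommendations' field in the output.
    recs = []
    for priority, path in enumerate(CMS_RULES.get(cms, []), start=1):
        if path not in disallowed:
            recs.append({
                "priority": priority,
                "category": "cms_specific",
                "recommendation": f"Block {path}",
                "impact": "medium",
            })
    return recs

print(rule_based_recommendations("WordPress", ["/wp-admin/"]))
```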
## CMS Detection
Detects these platforms by analyzing robots.txt patterns:
- WordPress - /wp-admin/, /wp-content/, /wp-includes/
- Shopify - /admin/, /cart/, /checkout/, /collections/
- Drupal - /node/, /user/, /sites/
- Joomla - /administrator/, /components/, /modules/
- Magento - /admin/, /checkout/, /customer/, /catalog/
- Wix - /_api/, /_files/, /_partials/
- Squarespace - /config/, /api/, /static/
## Webhook Integration

### Webhook Payload
```json
{
  "event": "robots_txt_analysis_complete",
  "timestamp": "2024-12-25T12:00:00.000Z",
  "actor": "robots-txt-checker",
  "status": "success",
  "urlsAnalyzed": 3,
  "avgScore": 82,
  "results": [...]
}
```
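On the receiving side, a minimal sketch of parsing and sanity-checking this payload. Field names are taken from the example above; `summarize_webhook` is a hypothetical helper, so verify the shape against real deliveries:

```python
import json

# Sketch of a receiver-side check of the webhook payload
# (field names assumed from the example payload above).
def summarize_webhook(body: str) -> str:
    payload = json.loads(body)
    if payload.get("event") != "robots_txt_analysis_complete":
        raise ValueError("unexpected event type")
    return f"{payload['urlsAnalyzed']} URL(s) analyzed, average score {payload['avgScore']}"

example = '{"event": "robots_txt_analysis_complete", "urlsAnalyzed": 3, "avgScore": 82, "results": []}'
print(summarize_webhook(example))  # 3 URL(s) analyzed, average score 82
```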
## Perfect For

### SEO Agencies
- Client onboarding audits
- Competitor analysis
- Pre-launch checklists
### Web Developers
- CI/CD integration for robots.txt validation
- CMS migration checks
- Security compliance
### Marketing Teams
- Ensure content is indexable
- Verify proper crawler access
## Pricing
- Demo Mode: Free (sample data)
- Standard Usage: Apify compute units only
- AI Recommendations: rule-based analysis is free; bring your own Anthropic API key (BYOK) for enhanced Claude analysis
## Related Actors
- Technical SEO Auditor - Full on-page SEO analysis
- Sitemap Generator - Create valid sitemaps
- PageSpeed Intelligence - Performance + Tech Stack analysis
Built by John Rippy | johnrippy.link
## Keywords
robots.txt checker, robots.txt analyzer, robots.txt validator, wordpress robots.txt, shopify robots.txt, seo audit, sitemap validation, llms.txt, security.txt, crawl directives, search engine crawler, googlebot, cms detection, technical seo, ai recommendations