Robots.txt Checker - CMS-Aware Analysis with AI Recommendations

Pricing

from $0.01 / 1,000 results

The Robots.txt Checker analyzes your robots.txt file and its companion files:

  • Syntax validation
  • CMS detection: identifies WordPress, Shopify, Drupal, and 6+ other CMS platforms
  • Best-practice checks
  • Companion file checks: sitemap.xml, llms.txt, security.txt
  • AI recommendations: CMS-specific suggestions

Developer

John Rippy

Maintained by Community

Robots.txt Checker with AI Recommendations

Analyze robots.txt files for syntax errors, best practices, and CMS-specific optimizations. Includes AI-powered recommendations, sitemap validation, and companion file checks (llms.txt, security.txt).

Features

  • Automated data collection
  • Structured output format
  • Error handling
  • Pay-per-event billing

Quick Start

{
  "url": "https://example.com"
}
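
One way to submit this input programmatically is through the Apify API client for Python. The sketch below is a minimal example under assumptions, not the author's documented workflow: the actor ID and the API token are placeholders to fill in from the Apify Console, and `run_checker` is a hypothetical helper name.

```python
from typing import Any


def run_checker(client: Any, actor_id: str, run_input: dict) -> list[dict]:
    """Start the actor, wait for it to finish, and return its dataset items.

    `client` is expected to behave like apify_client.ApifyClient.
    """
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return run details")
    # Each dataset item is one result object in the actor's output format.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Usage (requires `pip install apify-client`; both values are placeholders):
#
#   from apify_client import ApifyClient
#   client = ApifyClient("<YOUR_APIFY_TOKEN>")
#   results = run_checker(client, "<ACTOR_ID>", {"url": "https://example.com"})
#   print(results[0]["score"])
```

Passing the client in as an argument keeps the helper easy to exercise with a stub before spending any credits on a real run.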

Demo Mode

Set demoMode: true to test with sample data (no charges). When you're ready for real results, set demoMode: false or omit it.

{
"demoMode": true,
...
}

Input Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| demoMode | boolean | No | false | Run with sample data (free, no URL fetching) |
| url | string | No* | - | Single URL to analyze |
| urls | array | No* | - | Array of URLs to analyze |
| includeSitemapCheck | boolean | No | true | Verify sitemap.xml exists |
| includeLlmsTxtCheck | boolean | No | false | Check for llms.txt |
| includeSecurityTxtCheck | boolean | No | false | Check for security.txt |
| includeAIRecommendations | boolean | No | true | Generate AI recommendations |
| anthropicApiKey | string | No | - | BYOK for enhanced AI recommendations |
| webhookUrl | string | No | - | Webhook URL for integrations |

*Either url or urls is required unless demoMode is true.
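
The requirement above can be checked on the client side before a run is started. The following is a minimal sketch, assuming only the parameter names listed in the table; `build_input` and `ALLOWED_OPTIONS` are hypothetical names, and the actor itself may validate input differently.

```python
# Optional parameters taken from the Input Parameters table.
ALLOWED_OPTIONS = {
    "includeSitemapCheck",
    "includeLlmsTxtCheck",
    "includeSecurityTxtCheck",
    "includeAIRecommendations",
    "anthropicApiKey",
    "webhookUrl",
}


def build_input(url=None, urls=None, demo_mode=False, **options) -> dict:
    """Build a run input, enforcing 'url or urls unless demoMode'."""
    unknown = set(options) - ALLOWED_OPTIONS
    if unknown:
        raise ValueError(f"Unknown parameters: {sorted(unknown)}")
    if not demo_mode and not url and not urls:
        raise ValueError("Provide url or urls, or set demo_mode=True")

    run_input = dict(options)
    if demo_mode:
        run_input["demoMode"] = True
    if url:
        run_input["url"] = url
    if urls:
        run_input["urls"] = list(urls)
    return run_input


if __name__ == "__main__":
    print(build_input(url="https://example.com", includeLlmsTxtCheck=True))
```

Rejecting unknown keys early catches typos such as `includeLLMsTxtCheck`, which would otherwise presumably be ignored without any error.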


Output Format

{
  "url": "https://example.com",
  "robotsTxtUrl": "https://example.com/robots.txt",
  "timestamp": "2024-12-25T12:00:00.000Z",
  "status": "found",
  "score": 85,
  "rules": [
    {
      "userAgent": "*",
      "disallow": ["/admin/", "/private/"],
      "allow": ["/admin/login"]
    }
  ],
  "sitemaps": ["https://example.com/sitemap.xml"],
  "hasWildcardUserAgent": true,
  "syntaxErrors": [],
  "warnings": [],
  "bestPractices": {
    "hasSitemapDeclaration": true,
    "hasReasonableCrawlDelay": true,
    "blocksImportantPaths": [],
    "allowsSearchEngines": true
  },
  "detectedCms": "WordPress",
  "cmsRecommendations": [
    "Consider adding Disallow: /wp-json/ to prevent REST API indexing"
  ],
  "sitemapXml": {
    "exists": true,
    "url": "https://example.com/sitemap.xml",
    "urlCount": 245
  },
  "llmsTxt": {
    "exists": false,
    "url": "https://example.com/llms.txt"
  },
  "securityTxt": {
    "exists": true,
    "url": "https://example.com/.well-known/security.txt",
    "hasContact": true,
    "hasExpires": true
  },
  "recommendations": [
    {
      "priority": 1,
      "category": "cms_specific",
      "issue": "WordPress optimization opportunity",
      "recommendation": "Block /wp-json/ to prevent REST API indexing",
      "impact": "medium"
    }
  ]
}
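
A result object like this can be condensed into a short audit summary. The sketch below is illustrative only: `summarize_result` is a hypothetical helper, the pass threshold of 80 is an arbitrary assumption, and it assumes that a lower `priority` number means a more urgent recommendation, which the source does not state.

```python
def summarize_result(result: dict, min_score: int = 80) -> dict:
    """Condense one result object into a few audit-friendly fields."""
    # Assumption: priority 1 is the most urgent recommendation.
    recs = sorted(
        result.get("recommendations", []),
        key=lambda rec: rec.get("priority", 0),
    )
    # Companion files that were checked but not found.
    missing = [
        name
        for name in ("sitemapXml", "llmsTxt", "securityTxt")
        if name in result and not result[name].get("exists")
    ]
    passes = (
        result.get("status") == "found"
        and result.get("score", 0) >= min_score  # threshold is an assumption
        and not result.get("syntaxErrors")
    )
    return {
        "url": result.get("url"),
        "passes": passes,
        "missing_files": missing,
        "top_recommendation": recs[0]["recommendation"] if recs else None,
    }


if __name__ == "__main__":
    sample = {
        "url": "https://example.com",
        "status": "found",
        "score": 85,
        "syntaxErrors": [],
        "llmsTxt": {"exists": False, "url": "https://example.com/llms.txt"},
        "recommendations": [],
    }
    print(summarize_result(sample))
```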

Pricing

This actor uses pay-per-event billing:

  • Demo Mode: Free (sample data)
  • Standard Usage: Apify compute units only
  • AI Recommendations: Rule-based recommendations are free; enhanced recommendations use Claude with your own Anthropic API key (BYOK)

Use Cases

1. SEO Audits

Verify clients' robots.txt files don't accidentally block important content.
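
For a quick local cross-check of the same question, Python's standard-library robots.txt parser can report which important paths a crawler would be denied. This is a rough sketch, not how the actor works internally; `blocked_paths` is a hypothetical helper, and Python's parser may resolve overlapping Allow and Disallow rules differently from a given search engine, so treat the output as a first pass.

```python
from urllib.robotparser import RobotFileParser


def blocked_paths(
    robots_txt: str,
    site: str,
    important_paths: list[str],
    user_agent: str = "Googlebot",
) -> list[str]:
    """Return the important paths that `user_agent` may not fetch."""
    parser = RobotFileParser()
    # parse() takes the file content as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    base = site.rstrip("/")
    return [
        path
        for path in important_paths
        if not parser.can_fetch(user_agent, base + path)
    ]


if __name__ == "__main__":
    robots = "User-agent: *\nDisallow: /private/\nDisallow: /products/\n"
    paths = ["/", "/products/widget", "/blog/"]
    print(blocked_paths(robots, "https://example.com", paths))
```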

2. Pre-Launch Checks

Ensure robots.txt is properly configured before launching a new site.

3. Competitor Analysis

Compare robots.txt configurations across competitor sites.

4. Security Compliance

Check for security.txt and ensure proper crawler access controls.
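
The `hasContact` and `hasExpires` flags in the output correspond to the two required fields of a security.txt file. As a rough illustration of what such a check involves (an assumption about the general approach, not the actor's actual code), the hypothetical helper below scans the file text for those field names.

```python
def security_txt_fields(text: str) -> dict:
    """Report whether a security.txt body has Contact and Expires fields."""
    names = set()
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without a "name: value" shape.
        if not line or line.startswith("#") or ":" not in line:
            continue
        # Field names are compared case-insensitively.
        names.add(line.split(":", 1)[0].strip().lower())
    return {
        "hasContact": "contact" in names,
        "hasExpires": "expires" in names,
    }


if __name__ == "__main__":
    sample = (
        "# Security policy\n"
        "Contact: mailto:security@example.com\n"
        "Expires: 2030-01-01T00:00:00.000Z\n"
    )
    print(security_txt_fields(sample))
```

The sketch only detects that the fields are present; it does not verify that the Expires date is still in the future or that the contact address is valid.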



Common Problems & Solutions

"Invalid API key" error

Cause: Your API key is wrong, expired, or doesn't have the right permissions.

Fix: Double-check your API key. Make sure you copied it exactly, without extra spaces.

"Rate limit exceeded" error

Cause: You've hit the API's rate limits.

Fix: Wait a few minutes, then try again. Consider reducing the number of concurrent requests.
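
The wait-and-retry advice can be automated with exponential backoff. The sketch below is generic and makes several assumptions: `RateLimitError` is a hypothetical stand-in for however your client surfaces a rate-limit failure, and the 30-second base delay is an arbitrary starting point.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a 'Rate limit exceeded' failure."""


def retry_with_backoff(operation, attempts=4, base_delay=30.0, sleep=time.sleep):
    """Call `operation`, doubling the wait after each rate-limit failure."""
    for attempt in range(attempts):
        try:
            return operation()
        except RateLimitError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Waits base_delay, then 2x, then 4x, and so on.
            sleep(base_delay * 2 ** attempt)
```

Wrap the call that starts a run in a zero-argument function and pass it as `operation`. Running URLs one at a time instead of in parallel also reduces the chance of hitting the limit in the first place.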

Empty or incomplete results

Cause: The target may have anti-scraping protection, or the data doesn't exist.

Fix:

  • Check if the URL/search query is correct
  • Try with different parameters
  • Some sites may block automated access

Demo data showing instead of real results

Cause: demoMode is still set to true.

Fix: Set demoMode: false (or omit it) and provide url or urls. Add anthropicApiKey only if you want the enhanced AI recommendations.


Built by John Rippy | Actor Arsenal