
robots.txt AI Policy Monitor | GPTBot ClaudeBot

Pricing

from $11.00 / 1,000 results


Detect GPTBot, ClaudeBot, Google-Extended, and other AI crawler policies in robots.txt, then monitor policy shifts over time.


Rating: 0.0 (0 reviews)

Developer: 太郎 山田 (Maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 6 days ago


Detect AI crawler block policies in robots.txt and monitor policy changes over time.

Store Quickstart

  • Start with store-input.example.json. It uses demoMode=true, so the first Store run is safe, cheap, and easy to understand.
  • If the compact output is useful, switch to store-input.templates.json and pick one of:
      • Demo Quickstart — a trial run
      • Production Monitor — recurring dataset snapshots
      • Webhook Alert — policy-change notifications

Output First: What You Get

Each checked domain returns a policy summary plus per-crawler status.

```json
{
  "domain": "nytimes.com",
  "status": "ok",
  "summary": {
    "totalCrawlers": 16,
    "blocked": 8,
    "partialBlock": 2,
    "allowed": 6
  },
  "aiPolicies": [
    {
      "crawler": "GPTBot",
      "company": "OpenAI",
      "blocked": true,
      "partialBlock": false,
      "allowed": false
    }
  ],
  "changes": [
    {
      "crawler": "PerplexityBot",
      "type": "allowed_to_blocked",
      "from": "allowed",
      "to": "blocked"
    }
  ]
}
```

A fuller, ready-to-share payload is available in sample-output.example.json for use in the Store listing and README.
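The `changes` entries above can be produced by diffing the previous snapshot against the current one. A minimal Node sketch, assuming the field names from the sample output (the actor's actual internals may differ):

```javascript
// Reduce a policy record to a single state label.
function policyState(p) {
  if (p.blocked) return 'blocked';
  if (p.partialBlock) return 'partial';
  return 'allowed';
}

// Compare two snapshots and emit change records in the
// { crawler, type, from, to } shape shown in the sample output.
function diffPolicies(previous, current) {
  const before = new Map(previous.map((p) => [p.crawler, policyState(p)]));
  const changes = [];
  for (const p of current) {
    const from = before.get(p.crawler);
    const to = policyState(p);
    if (from && from !== to) {
      changes.push({ crawler: p.crawler, type: `${from}_to_${to}`, from, to });
    }
  }
  return changes;
}
```

Crawlers that appear only in the current snapshot are skipped here; a production diff would likely also report newly seen and disappeared crawlers.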

Input Examples

Demo run (safe trial):

```json
{
  "domains": ["openai.com", "google.com"],
  "demoMode": true
}
```

Production run:

```json
{
  "domains": ["nytimes.com", "github.com", "openai.com"],
  "delivery": "webhook",
  "webhookUrl": "https://example.com/apify/webhook",
  "concurrency": 5,
  "demoMode": false
}
```

Demo Mode (Conversion-Friendly, Non-Abusive)

When demoMode=true:

  • Domain count is capped at 1
  • Output is compact (limited policy fields)
  • Webhook delivery is disabled (dataset only)
  • Snapshot writes are disabled

An upgrade hint is included in meta.upgradeHint so users know how to unlock bulk monitoring.
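The demo-mode restrictions above amount to clamping the input before the run starts. A sketch of that normalization, using the input fields from the examples (`normalizeInput` and `writeSnapshots` are hypothetical names, not the actor's actual code):

```javascript
// Clamp a raw input object when demoMode is on:
// one domain, dataset-only delivery, no snapshot writes.
function normalizeInput(input) {
  if (!input.demoMode) return input;
  return {
    ...input,
    domains: (input.domains || []).slice(0, 1), // cap domain count at 1
    delivery: 'dataset',                        // webhook delivery disabled
    webhookUrl: undefined,
    writeSnapshots: false,                      // snapshot writes disabled
  };
}
```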

AI Crawlers Covered

Includes GPTBot, ChatGPT-User, OAI-SearchBot, Google-Extended, ClaudeBot, Claude-Web, CCBot, Bytespider, PerplexityBot, and more.
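Classifying a crawler as blocked, partially blocked, or allowed comes down to reading its `User-agent` group's `Disallow`/`Allow` rules. A simplified Node sketch of that idea — it merges `*` and crawler-specific groups instead of picking only the most specific one, as the Robots Exclusion Protocol requires, so treat it as illustrative, not the actor's parser:

```javascript
// Classify one crawler's access from raw robots.txt text.
// Returns 'blocked', 'partialBlock', or 'allowed'.
function classifyCrawler(robotsTxt, userAgent) {
  const ua = userAgent.toLowerCase();
  let applies = false; // current group covers this crawler
  let inUaRun = false; // previous line was also a User-agent line
  const disallows = [];
  const allows = [];
  for (const raw of robotsTxt.split(/\r?\n/)) {
    const line = raw.split('#')[0].trim(); // strip comments
    const idx = line.indexOf(':');
    if (idx < 0) continue;
    const key = line.slice(0, idx).trim().toLowerCase();
    const value = line.slice(idx + 1).trim();
    if (key === 'user-agent') {
      if (!inUaRun) applies = false; // a new group starts
      if (value === '*' || value.toLowerCase() === ua) applies = true;
      inUaRun = true;
      continue;
    }
    inUaRun = false;
    if (!applies) continue;
    if (key === 'disallow' && value) disallows.push(value);
    if (key === 'allow' && value) allows.push(value);
  }
  if (disallows.includes('/')) return allows.length > 0 ? 'partialBlock' : 'blocked';
  return disallows.length > 0 ? 'partialBlock' : 'allowed';
}
```

For example, `User-agent: GPTBot` followed by `Disallow: /` classifies as blocked, while `Disallow: /private/` classifies as a partial block.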

Common Use Cases

  • SEO teams: AI crawler policy audits
  • Publishers: policy governance and change tracking
  • AI companies: crawler access monitoring
  • Researchers: AI opt-out trend measurement

Cost Notes

  • Uses public robots.txt only
  • No external API or proxy dependency required

Commercial Ops

Set up .env:

```bash
cp -n .env.example .env
```

Cloud task/schedule setup:

```bash
npm run apify:cloud:setup
```

Live checks:

```bash
npm run canary:check
npm run contract:test:live
```

OpenClaw cron examples: see openclaw-cron-commands.md.

Related Actors

  • ai-visibility-monitor-actor — track whether AI visibility shifts after crawl-policy changes.
  • sitemap-analyzer — inspect sitemap and crawl structure alongside robots policy.
  • structured-data-validator — validate on-page structured data after reviewing AI crawler access.