Public AI Crawler Policy Signal Agent
Pricing
Pay per event
Analyze public robots.txt and llms.txt files for AI crawler allow/block policy evidence, LLM guidance files, stable hashes, and useful-result pricing.
This agent analyzes public site-level policy files for AI crawler and LLM-agent guidance.
It reports explicit AI crawler rules in robots.txt and public llms.txt / llms-full.txt
files, along with stable policy hashes and diagnostics, and it bills only for useful
results that are new or changed.
What It Reads
- One public site origin, `robots.txt`, `llms.txt`, or `llms-full.txt` URL per input.
- `robots.txt` user-agent blocks for known AI crawler tokens such as `GPTBot`, `ChatGPT-User`, `OAI-SearchBot`, `ClaudeBot`, `PerplexityBot`, `Google-Extended`, `Applebot-Extended`, `CCBot`, and similar public bot names.
- Public `llms.txt` and optionally `llms-full.txt` files for headings, links, topics, preview text, and evidence URLs.
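As an illustration of the second item, here is a minimal sketch of how user-agent groups for known AI crawler tokens could be pulled out of a robots.txt body. The function name `find_ai_crawler_rules` and the token list are assumptions made for this example (the list only repeats the tokens named above); the actor's actual parser and output shape are not documented here.

```python
# Tokens named in the list above; the actor's real list may be longer.
KNOWN_AI_TOKENS = [
    "GPTBot", "ChatGPT-User", "OAI-SearchBot", "ClaudeBot", "PerplexityBot",
    "Google-Extended", "Applebot-Extended", "CCBot",
]


def find_ai_crawler_rules(robots_txt: str) -> dict:
    """Map each known AI crawler token to the (directive, path) rules in its group.

    Hypothetical helper: a simplified reading of robots.txt groups, not the
    actor's implementation.
    """
    canonical = {token.lower(): token for token in KNOWN_AI_TOKENS}
    rules = {}
    current_agents = []  # user-agents of the group currently being read
    in_rules = False     # True once the current group has seen a rule line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:
                # A user-agent line that follows rule lines starts a new group.
                current_agents = []
                in_rules = False
            current_agents.append(value.lower())
        elif field in ("allow", "disallow"):
            in_rules = True
            for agent in current_agents:
                if agent in canonical:
                    rules.setdefault(canonical[agent], []).append((field, value))
    return rules
```

A file that contains only a `User-agent: *` group yields an empty result in this sketch, which corresponds to the "generic robots.txt" case described under Pricing Events.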
What It Does Not Do
- It does not crawl pages, parse sitemap URLs, audit SEO metadata, run a browser, execute JavaScript, log in, fetch private pages, or inspect account areas.
- It rejects private-network hosts, query strings, fragments, credentials, path parameters, sensitive account paths, and token-like path segments.
- It does not decide legal permission. It only returns public policy evidence that a human or downstream agent can review.
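The input rejection rules above can be sketched roughly as follows. This is an assumption-laden illustration: the function name `rejection_reason`, the `SENSITIVE_SEGMENTS` set, and the 32-character threshold for "token-like" segments are all hypothetical, and the sketch does not resolve DNS names, so it only catches literal private IP addresses and a few obvious local host names.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical examples of sensitive path segments; the real list is not documented.
SENSITIVE_SEGMENTS = {"account", "login", "admin", "settings"}


def rejection_reason(url: str) -> Optional[str]:
    """Return why a URL would be rejected, or None if it passes these checks."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return "not an http(s) URL with a host"
    if parts.username or parts.password:
        return "credentials in URL"
    if parts.query:
        return "query string"
    if parts.fragment:
        return "fragment"
    if ";" in parts.path:
        return "path parameters"
    host = parts.hostname
    if host == "localhost" or host.endswith((".local", ".internal")):
        return "private-network host"
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        ip = None  # a DNS name; this sketch does not resolve it
    if ip is not None and not ip.is_global:
        return "private-network host"
    for segment in filter(None, parts.path.split("/")):
        if segment.lower() in SENSITIVE_SEGMENTS:
            return "sensitive account path"
        # Assumed heuristic: long alphanumeric segments look like tokens.
        if len(segment) >= 32 and segment.isalnum():
            return "token-like path segment"
    return None
```

For example, `rejection_reason("https://example.com/llms.txt?x=1")` returns `"query string"`, while `rejection_reason("https://openai.com/")` returns `None`.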
Pricing Events
- `apify-actor-start`: one tiny run-start event when configured in Apify.
- `useful-ai-crawler-policy-result`: charged only for useful, new or changed AI crawler policy evidence.
Generic robots.txt files without AI-specific user-agent blocks and without
llms.txt guidance are written as partial records and do not trigger the useful
event. Unchanged hashes, invalid inputs, failed fetches, and missing policy
evidence are also uncharged.
`apify-default-dataset-item` is intentionally not used.
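To make the charging rule concrete, here is a minimal sketch of a stable hash combined with a change-aware billing decision. It assumes SHA-256 over canonical JSON, a stored hash from a previous run, and two hypothetical evidence fields (`aiRules`, `llmsTxtFound`); none of these details are confirmed by this README, and only the event name is taken from it.

```python
import hashlib
import json
from typing import Optional


def policy_hash(evidence: dict) -> str:
    """Hash a canonical JSON form so that key order does not change the digest."""
    canonical = json.dumps(evidence, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def billable_event(evidence: dict, previous_hash: Optional[str]) -> Optional[str]:
    """Return the event name to charge, or None when nothing should be charged."""
    has_ai_rules = bool(evidence.get("aiRules"))       # hypothetical field
    has_llms_txt = bool(evidence.get("llmsTxtFound"))  # hypothetical field
    if not (has_ai_rules or has_llms_txt):
        return None  # generic robots.txt only: partial record, no charge
    if policy_hash(evidence) == previous_hash:
        return None  # unchanged since the last run: no charge
    return "useful-ai-crawler-policy-result"
```

Sorting keys before hashing is what makes the digest stable across runs in this sketch: two records with the same content but different key order produce the same hash, so they count as unchanged.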
Example Input
{
  "siteUrls": ["https://openai.com/"],
  "includeLlmsFullTxt": true,
  "requestTimeoutSecs": 15
}
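For a site-origin input like the one above, the policy files to check could plausibly be derived as in the sketch below. The helper name `policy_file_urls` is hypothetical, and placing all three files at the origin root is an assumption based on common convention, not a documented behavior of the actor.

```python
from urllib.parse import urlsplit, urlunsplit


def policy_file_urls(site_url: str, include_llms_full_txt: bool = False) -> list:
    """Build candidate policy file URLs at the root of the given site's origin."""
    parts = urlsplit(site_url)
    # Keep only scheme and host; drop any path, query, or fragment.
    origin = urlunsplit((parts.scheme, parts.netloc, "", "", ""))
    names = ["robots.txt", "llms.txt"]
    if include_llms_full_txt:
        names.append("llms-full.txt")
    return [f"{origin}/{name}" for name in names]
```

With the example input, `policy_file_urls("https://openai.com/", True)` yields the `robots.txt`, `llms.txt`, and `llms-full.txt` URLs under `https://openai.com`.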
Output Highlights
- `policyType`
- `aiCrawlerPolicySignals`
- `knownAiProviders`
- `knownAiBotUserAgents`
- `wildcardRobotsPolicy`
- `llmsTxtSignals`
- `riskLabels`
- `diagnostics`
- `aiCrawlerPolicyHash`
- `changeStatus`
- `billableEventName`