Pricing

Pay per event

Public AI Crawler Policy Signal Agent

Analyzes public robots.txt and llms.txt files for evidence of AI crawler allow/block policies and LLM guidance, returns stable hashes for change detection, and charges only for useful results.

Developer

jack su

Last modified: 25 days ago

What It Reads

One URL per input: a public site origin, or a direct robots.txt, llms.txt, or llms-full.txt URL.
robots.txt user-agent blocks for known AI crawler tokens such as GPTBot, ChatGPT-User, OAI-SearchBot, ClaudeBot, PerplexityBot, Google-Extended, Applebot-Extended, CCBot, and similar public bot names.
Public llms.txt and optionally llms-full.txt files for headings, links, topics, preview text, and evidence URLs.
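
As a rough illustration of the robots.txt step above, the sketch below groups rules by user-agent and keeps only those that apply to known AI crawler tokens. The function name `ai_crawler_rules` and the `AI_BOT_TOKENS` set are hypothetical, and the set only contains the names listed above. This is a simplified reading of the robots.txt format, not the actor's actual parser.

```python
# Hypothetical token list, limited to the bot names mentioned in this README.
AI_BOT_TOKENS = {
    "gptbot", "chatgpt-user", "oai-searchbot", "claudebot", "perplexitybot",
    "google-extended", "applebot-extended", "ccbot",
}


def ai_crawler_rules(robots_txt):
    """Map each known AI bot token to the (directive, path) rules in its group."""
    rules = {}
    agents = []       # user-agents of the group currently being read
    in_rules = False  # True once the current group has seen a rule line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a user-agent line after rules starts a new group
                agents, in_rules = [], False
            agents.append(value.lower())
        elif field in ("allow", "disallow"):
            in_rules = True
            for agent in agents:
                if agent in AI_BOT_TOKENS:
                    rules.setdefault(agent, []).append((field, value))
    return rules
```

A file that only has a `User-agent: *` group yields an empty result here, which matches the idea of a generic robots.txt with no AI-specific blocks.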

What It Does Not Do

It does not crawl pages, parse sitemap URLs, audit SEO metadata, run a browser, execute JavaScript, log in, fetch private pages, or inspect account areas.
It rejects private-network hosts, query strings, fragments, credentials, path parameters, sensitive account paths, and token-like path segments.
It does not decide legal permission. It only returns public policy evidence that a human or downstream agent can review.
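
The input rejections listed above could be approximated as follows. This is a minimal sketch using only the Python standard library. The function name `rejection_reason` and the reason strings are hypothetical, DNS names are not resolved, and the checks for sensitive account paths and token-like path segments are left out because their exact rules are not described here.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlsplit


def rejection_reason(url: str) -> Optional[str]:
    """Return why a URL would be rejected, or None if it passes these checks."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return "not an http(s) URL"
    if parts.username or parts.password:
        return "credentials"
    if parts.query:
        return "query string"
    if parts.fragment:
        return "fragment"
    if ";" in parts.path:  # urlsplit leaves path parameters inside the path
        return "path parameters"
    host = parts.hostname
    if host == "localhost" or host.endswith(".local"):
        return "private-network host"
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return None  # a DNS name; resolving it is out of scope for this sketch
    if ip.is_private or ip.is_loopback or ip.is_link_local:
        return "private-network host"
    return None
```

A production check would also need to resolve host names and re-validate after redirects, since a public name can point at a private address.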

Pricing Events

apify-actor-start: one small run-start event, charged only when configured in Apify.
useful-ai-crawler-policy-result: charged only when a run returns useful AI crawler policy evidence that is new or has changed.

Generic robots.txt files without AI-specific user-agent blocks and without llms.txt guidance are written as partial records and do not trigger the useful event. Unchanged hashes, invalid inputs, failed fetches, and missing policy evidence are also uncharged.

apify-default-dataset-item is intentionally not used.
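
The charging rule described above can be sketched as a stable hash plus a small decision function. Only the event name `useful-ai-crawler-policy-result` comes from this README. The evidence keys `aiBotRules` and `llmsTxt`, the use of SHA-256 over canonical JSON, and the function names are all assumptions made for illustration.

```python
import hashlib
import json
from typing import Optional


def policy_hash(evidence: dict) -> str:
    """Hash the evidence with sorted keys so equal content gives an equal hash."""
    canonical = json.dumps(evidence, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def billable_event(evidence: dict, previous_hash: Optional[str]) -> Optional[str]:
    """Return the event name to charge, or None when the result is uncharged."""
    has_ai_rules = bool(evidence.get("aiBotRules"))  # hypothetical key
    has_llms_txt = bool(evidence.get("llmsTxt"))     # hypothetical key
    if not (has_ai_rules or has_llms_txt):
        return None  # generic robots.txt only: partial record, no charge
    if policy_hash(evidence) == previous_hash:
        return None  # unchanged since the last run: no charge
    return "useful-ai-crawler-policy-result"
```

Serializing with sorted keys before hashing keeps the hash independent of key order, so a re-run over identical evidence compares equal to the stored hash.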

Example Input

{
  "siteUrls": ["https://openai.com/"],
  "includeLlmsFullTxt": true,
  "requestTimeoutSecs": 15
}
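
To show how an input like this might translate into requests, the sketch below expands each entry in `siteUrls` into the public files to fetch. The helper name `fetch_targets` is hypothetical, and the sketch assumes every input is reduced to its origin, which may differ from how the actor treats a direct robots.txt or llms.txt URL.

```python
from urllib.parse import urlsplit


def fetch_targets(actor_input: dict) -> list:
    """List the public policy files to request for each input site."""
    names = ["robots.txt", "llms.txt"]
    if actor_input.get("includeLlmsFullTxt"):
        names.append("llms-full.txt")  # fetched only when requested
    targets = []
    for site_url in actor_input.get("siteUrls", []):
        parts = urlsplit(site_url)
        origin = f"{parts.scheme}://{parts.netloc}"
        targets.extend(f"{origin}/{name}" for name in names)
    return targets
```

With the example input above, this yields the robots.txt, llms.txt, and llms-full.txt URLs under https://openai.com.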

Output Highlights

policyType
aiCrawlerPolicySignals
knownAiProviders
knownAiBotUserAgents
wildcardRobotsPolicy
llmsTxtSignals
riskLabels
diagnostics
aiCrawlerPolicyHash
changeStatus
billableEventName
