Apify Discoverability Audit
Pricing
from $0.0425 / audit completed
Audits an Apify Actor for AI-agent discoverability against an evidence-grounded rubric. Scores 8 dimensions (agentic-pay eligibility, description quality, schema completeness, disambiguation, tool name, README, store hygiene, canonical signals) and returns a punch list with copy-paste-ready fixes.
Rating
0.0
(0)
Developer
Scott Helvick
Maintained by Community
Actor stats
- Bookmarked: 0
- Total users: 1
- Monthly active users: 0
- Last modified: 4 minutes ago
Audits an Apify Actor for AI-agent discoverability against an evidence-grounded rubric. Returns a punch list of failed checks across 8 dimensions, plus copy-paste-ready improvement text for everything that didn't pass. Useful before publishing a new Actor or before requesting agentic-payment whitelisting.
What this does
Submit one Actor identifier (username/name or 17-char ID). The Actor fetches the target's Store metadata, its latest build (full README + parsed input schema + dataset schema), and its public Store row, then runs a deterministic 8-dimension rubric grounded in 2026 peer-reviewed research on agent tool selection.
What you get back, per run:
- Overall verdict: `PASS`/`WARN`/`FAIL`, with the agentic-payment hard-gate flag broken out separately
- Per-dimension category summary: one status per dimension (agentic-payment eligibility, description quality, schema completeness, disambiguation, tool name, README structure, store hygiene, off-platform canonical signals)
- Full punch list: every individual check with its status, a short evidence line, and the dimension it belongs to
- Suggestions: one copy-paste-ready improvement text per failed or warned check, written to the rubric concepts (six-component description framework, 500-char MCP truncation, 13-section README template, disambiguation block, PAY_PER_EVENT pricing)
- Warnings: any data-collection issues (private Actor, build endpoint outage, unreachable canonical page), surfaced separately from the rubric verdicts
Two outputs per successful run:
- One dataset record with the full structured shape (the schema is on the Console's Dataset tab)
- A Markdown punch list at `OUTPUT.md` in the run's Key-Value Store: the same data, rendered as a readable report
Common workflows this enables:
- Self-audit your Actor before pushing it public for the first time
- Audit before requesting agentic-payment whitelisting (the rubric's hard gate)
- Catch description-quality regressions across edits
- Spot Apify build-validator gotchas (e.g. `items.enum`) before the push fails
- Survey a portfolio of Actors and rank them by discoverability debt
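
The portfolio-survey workflow can be sketched as a small ranking step over saved dataset records. This is a minimal sketch, not part of the Actor: `discoverability_debt` and `rank_by_debt` are hypothetical helpers, and "debt" is assumed here to be a weighted count of non-passing dimensions in `category_summary`, with FAIL weighted twice as heavily as WARN (an arbitrary choice).

```python
# Hypothetical helpers; the weights are an illustrative assumption, not part of the rubric.
WEIGHTS = {"PASS": 0, "WARN": 1, "FAIL": 2}


def discoverability_debt(record: dict) -> int:
    """Sum a weight for every dimension status in category_summary."""
    summary = record.get("category_summary", {})
    return sum(WEIGHTS.get(str(status).upper(), 0) for status in summary.values())


def rank_by_debt(records: list) -> list:
    """Return (actor, debt) pairs, worst first; ties keep their input order."""
    scored = [(r["actor"], discoverability_debt(r)) for r in records]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


records = [
    {"actor": "shelvick/smart-page-fetcher",
     "category_summary": {"description_quality": "PASS", "tool_name": "PASS"}},
    {"actor": "example/needs-work",  # made-up Actor name for illustration
     "category_summary": {"description_quality": "FAIL", "tool_name": "WARN"}},
]
ranking = rank_by_debt(records)
print(ranking)
```

Feeding it one record per audited Actor gives a worst-first list to work through.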
Why agent-discoverability matters
AI agents pick tools by reading their descriptions and schemas, not by clicking through to a documentation site. Wang et al. 2026 measured a 260% lift in selection from standards-compliant tool descriptions; the "Tool Preferences in Agentic LLMs" paper found >10x usage swings from description edits alone, across multiple model families.
Apify's MCP server truncates input-schema property descriptions at 500 chars before appending enum values and examples. Tools whose first 500 chars don't pre-load purpose, parameters, and constraints fail at the routing layer. Tools that miss explicit disambiguation guidance ("use [other tool] instead for [other need]") lose to similar-sounding alternatives by default. Tools that aren't on the PAY_PER_EVENT + agentic-payment whitelist are invisible to agents using x402 or Skyfire payment rails.
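To see what survives the 500-character cut, a property description can be checked locally before publishing. This is a minimal sketch: `MCP_DESCRIPTION_LIMIT` and `visible_to_agent` are hypothetical names, and plain slicing is an assumption about how the truncation behaves (the server may cut differently, for example at a word boundary).

```python
MCP_DESCRIPTION_LIMIT = 500  # limit stated above; the exact truncation behavior is assumed


def visible_to_agent(description: str, limit: int = MCP_DESCRIPTION_LIMIT):
    """Return the text kept after truncation and the number of characters lost."""
    kept = description[:limit]
    return kept, len(description) - len(kept)


# A description that front-loads its purpose, then runs past the limit.
description = (
    "Apify Actor to audit. Accepts username/name or a 17-char Actor ID. "
    + "x" * 460
)
kept, lost = visible_to_agent(description)
print(len(kept), lost)
```

Anything counted in `lost` is text an agent would never see under this assumption, so purpose, parameters, and constraints belong in the kept part.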
The rubric this Actor scores is the executable form of that evidence. Pass it and your Actor competes; fail it and the agent never reaches you.
How it compares
| Approach | Description-quality rubric | Schema-completeness checks | Build-validator gotchas (items.enum, etc.) | MCP-truncation awareness | Disambiguation check | Suggestion text |
|---|---|---|---|---|---|---|
| Eyeballing your own Actor against the docs | ✗ | partial | ✗ | ✗ | ✗ | ✗ |
| Running each schema field through a generic JSON Schema linter | ✗ | partial | ✗ | ✗ | ✗ | ✗ |
| Asking an LLM "is my Actor discoverable" with the README pasted in | partial | ✗ | ✗ | ✗ | partial | partial |
| Apify Discoverability Audit | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
The audit reads what an agent's MCP client would read — the Store row, the latest build's input schema and README — and grades it against the same rubric the writer (the apify-actor-copy skill) uses. The deterministic checks are the value; the LLM only writes the suggestion prose.
Input
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `actor` | string | ✓ | (none) | Apify Actor to audit. Accepts `username/name` (e.g. `shelvick/smart-page-fetcher`, `apify/website-content-crawler`) or a 17-char Actor ID. |
| `canonical_url` | string | | (auto-derived for `shelvick/*`) | Optional canonical-page URL for the target. When provided, the audit fetches it and checks for a Store back-link. Auto-derived to `https://www.scotthelvick.com/tools/<name>/` for the `shelvick/` namespace. |
| `include_suggestions` | boolean | | `true` | When true, the LLM writes copy-paste-ready suggestion text for failed and warned checks. Set `false` to skip the LLM call entirely; the deterministic punch list still ships. |
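
The `canonical_url` default described in the table can be sketched as follows. `derive_canonical_url` is a hypothetical helper written for illustration, not the Actor's own code; in particular, returning `None` for 17-char IDs and other namespaces is an assumption about what "no default" means.

```python
from typing import Optional


def derive_canonical_url(actor: str, canonical_url: Optional[str] = None) -> Optional[str]:
    """Explicit URL wins; shelvick/<name> is auto-derived; otherwise nothing to check."""
    if canonical_url:
        return canonical_url
    username, _, name = actor.partition("/")
    if username == "shelvick" and name:
        return f"https://www.scotthelvick.com/tools/{name}/"
    return None  # assumed behavior for IDs and other namespaces


print(derive_canonical_url("shelvick/smart-page-fetcher"))
print(derive_canonical_url("apify/website-content-crawler"))
```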
Output
One dataset record per run, plus a Markdown punch list at OUTPUT.md in the run's Key-Value Store.
Abbreviated success record:
```json
{
  "actor": "shelvick/smart-page-fetcher",
  "audited_at": "2026-05-26T14:31:09Z",
  "overall_status": "PASS",
  "agentic_payment_eligible": true,
  "category_summary": {
    "agentic_payment": "PASS",
    "description_quality": "PASS",
    "schema_completeness": "PASS",
    "disambiguation": "PASS",
    "tool_name": "PASS",
    "readme_structure": "PASS",
    "store_hygiene": "PASS",
    "canonical_signals": "PASS"
  },
  "checks": [
    {
      "check_id": "description_quality.actor_description_length",
      "dimension": "description_quality",
      "status": "pass",
      "title": "Actor description within 100-2000 char band",
      "evidence": "description is 432 chars"
    }
  ],
  "suggestions": [],
  "canonical_url_checked": "https://www.scotthelvick.com/tools/smart-page-fetcher/",
  "warnings": []
}
```
The full schema is documented on the Console's Dataset tab. The Markdown report at OUTPUT.md renders the same data as a prose punch list with the verdict, per-dimension table, full check list grouped by dimension, and suggestions section.
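A record like the one above can be reduced to a per-dimension list of failing checks. This is a sketch under assumptions: `failing_checks_by_dimension` is a hypothetical helper, the second check in the sample is invented, and the lowercase `warn`/`fail` status values are inferred from the lowercase `pass` in the abbreviated record.

```python
from collections import defaultdict


def failing_checks_by_dimension(record: dict) -> dict:
    """Group the check_id of every non-passing check under its dimension."""
    grouped = defaultdict(list)
    for check in record.get("checks", []):
        # Compare case-insensitively: check statuses appear lowercase in the sample.
        if str(check.get("status", "")).lower() != "pass":
            grouped[check["dimension"]].append(check["check_id"])
    return dict(grouped)


record = {
    "actor": "shelvick/smart-page-fetcher",
    "checks": [
        {"check_id": "description_quality.actor_description_length",
         "dimension": "description_quality", "status": "pass"},
        # Hypothetical failing check, invented for this example.
        {"check_id": "readme_structure.example_check",
         "dimension": "readme_structure", "status": "fail"},
    ],
}
failing = failing_checks_by_dimension(record)
print(failing)
```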
Example
```json
{
  "actor": "shelvick/smart-page-fetcher",
  "include_suggestions": true
}
```
Via the API:
```bash
curl -X POST "https://api.apify.com/v2/acts/shelvick~apify-discoverability-audit/run-sync-get-dataset-items?token=YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"actor": "shelvick/smart-page-fetcher"}'
```
A typical audit returns in 10-30 seconds — fast enough for the synchronous endpoint.
Calling from an AI agent
The Actor is designed for agent discovery and invocation.
MCP (mcp.apify.com): surfaces as a callable tool. The input schema is self-documenting — one required field (actor), structured output, no follow-up questions. An LLM can construct correct calls from the tool description without external context. Pay per call via the Actor's pay-per-event model — works with x402 and Skyfire agentic-payment rails.
Apify SDK (Python):
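```python
from apify_client import ApifyClient

API_TOKEN = "YOUR_TOKEN"  # placeholder: replace with your Apify API token
client = ApifyClient(token=API_TOKEN)

# Start the audit and block until the run finishes.
run = client.actor("shelvick/apify-discoverability-audit").call(
    run_input={"actor": "shelvick/smart-page-fetcher"}
)

# Each successful run writes one dataset record.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item["overall_status"], item["category_summary"])
```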
REST: /run-sync-get-dataset-items for the typical 10-30 second run, the async /runs endpoint for very large targets (uncommon — audit work doesn't scale with target size).
Pricing
Pay-per-event. Flat rate: a single audit-completed charge per successful run. The platform-startup charge fires once regardless of outcome but is effectively zero. Failed runs (invalid input, target Actor unreachable) skip the audit charge entirely.
Setting include_suggestions=false skips the LLM call but the audit charge is the same — the deterministic punch list is the costed work.
See the Pricing tab on this Store page for the current per-event rate and any active subscriber discounts.
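As a rough budgeting aid, the flat-rate model reduces to one multiplication. The sketch below assumes the "from $0.0425" figure shown on the listing; the live rate on the Pricing tab may differ, and the platform-startup charge is ignored because it is described as effectively zero. `estimated_audit_cost` is a hypothetical helper.

```python
from decimal import Decimal

# "From" price on the listing; treat as an assumption, the live rate may differ.
AUDIT_RATE_USD = Decimal("0.0425")


def estimated_audit_cost(successful_runs: int, rate: Decimal = AUDIT_RATE_USD) -> Decimal:
    """Flat per-audit charge; failed runs are excluded because they skip the audit charge."""
    return successful_runs * rate


# Example: 40 runs attempted, 2 failed on target fetch, so only 38 are charged.
cost = estimated_audit_cost(40 - 2)
print(cost)
```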
Behavior
Failure modes
The run is marked FAILED only when input validation or the initial target fetch fails:
- `actor` is empty or below 3 chars
- `actor` does not match `username/name` or a 17-char ID
- The target Actor's `/v2/acts/<slug>` endpoint returns 404 / 403 / 500
When the target fetch fails, no audit charge fires.
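The input-validation rules can be approximated client-side to avoid a wasted run. This is a sketch, not the Actor's validator: `actor_input_error` is a hypothetical helper, and both regular expressions are assumptions, since the documentation only says "`username/name` or a 17-char ID" without defining the allowed characters.

```python
import re
from typing import Optional

# Assumed patterns; the allowed character sets are not specified in this README.
SLUG_RE = re.compile(r"^[A-Za-z0-9._-]+/[A-Za-z0-9._-]+$")
ID_RE = re.compile(r"^[A-Za-z0-9]{17}$")


def actor_input_error(actor: str) -> Optional[str]:
    """Return a reason the input would likely be rejected, or None if it looks valid."""
    if len(actor) < 3:
        return "actor is empty or below 3 chars"
    if not (SLUG_RE.match(actor) or ID_RE.match(actor)):
        return "actor does not match username/name or a 17-char ID"
    return None


for candidate in ["shelvick/smart-page-fetcher", "ab", "not a slug"]:
    print(repr(candidate), "->", actor_input_error(candidate))
```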
Per-source data-collection issues don't fail the run; the affected dimensions degrade gracefully and an entry is added to the `warnings` array. Common warnings:
- `pricing-data-unavailable`: `pricingInfos` was absent from the target's Actor object (we don't own the Actor, or it's unpriced) AND no public Store row matched. The agentic-payment dimension warns rather than fails when this happens (eligibility can't be confirmed from outside), and `agentic_payment_eligible` is reported as `null`.
- `build-fetch-failed: <detail>`: the latest build endpoint returned an error. README, input schema, and dataset schema fall back to empty; the dimensions that depend on them report failures.
- `canonical-non-200: <url> → <code>`: the canonical page fetch returned 4xx or 5xx. The canonical-signals dimension warns rather than fails.
- `inputSchema-parse-failed: <detail>`: the build endpoint returned an unparseable `inputSchema` string. Schema-completeness checks fall back as if the schema were absent.
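
Because each warning is a code with an optional detail after a colon, downstream tooling can split them mechanically. A minimal sketch, assuming the `code: detail` shape shown in the warnings listed above holds for every entry; `parse_warning` is a hypothetical helper and the sample detail strings are made up.

```python
def parse_warning(warning: str):
    """Split a warning into (code, detail); detail is empty when there is none."""
    # Split on the first ": " so colons inside a URL detail are left intact.
    code, _, detail = warning.partition(": ")
    return code, detail


warnings = [
    "pricing-data-unavailable",
    "build-fetch-failed: HTTP 502",  # made-up detail text
    "canonical-non-200: https://www.scotthelvick.com/tools/smart-page-fetcher/ → 404",
]
for w in warnings:
    print(parse_warning(w))
```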
Performance expectations
10-30 seconds typical wall-clock. Three sequential Apify API calls (Actor object, default build, Store search) plus zero or one canonical-page fetch and zero or one LLM call. No browser tier, no proxy, no recursive fetching.
The default run-sync-get-dataset-items endpoint (180-second sync cap) handles every audit comfortably.
FAQ
What's the rubric grounded in?
A 2026-05-26 forge:researcher brief synthesizing peer-reviewed agent-tool-selection research: Wang et al. 2026 on description-quality lift, "Tool Preferences in Agentic LLMs" on usage swings from description edits, BiasBusters on semantic alignment as the dominant selection driver, and Anthropic engineering guidance on description quality plus disambiguation as the highest-impact surfaces. JSON-LD and llms.txt are NOT load-bearing per Search Atlas's 2025 300K-domain LLM-citation study and Limy's 2025 90-day bot-traffic log study; both showed null or negligible citation effect. Those signals are scored as hygiene only.
Can I audit a private Actor before publishing?
Yes for Actors you own. The Apify token injected by the platform gives the audit access to your own private Actors, including pricingInfos and agentic-payment whitelist status (the Actor-object endpoint exposes these for owned Actors regardless of publish state). For other users' public Actors, the audit falls back to the Store search; if that doesn't find them (newly-published Actors sometimes don't index), the agentic-payment dimension reports WARN and agentic_payment_eligible is null — unconfirmed, not disqualifying.
Why does the LLM only write suggestion text?
Every verdict (pass/warn/fail) comes from deterministic checks against the fetched metadata. LLM-as-scorer on subjective categories produces unauditable results. The LLM here only synthesizes copy-paste-ready prose for checks that have already failed or warned; it can't invent dimensions or change verdicts.
What if I disagree with a check's verdict?
The check_ids are stable strings (e.g. description_quality.actor_description_length). Inspect the source — every check is a pure function in the public Actor repo. Open an issue or PR if the rubric is wrong; the standards live in actorlib.discoverability_standards for portability.
Does the audit charge run on failure?
No. The audit-completed charge fires only on a successful audit. Input-validation failures and target-fetch failures skip it entirely. The platform-startup charge fires regardless but is effectively zero.
What this doesn't do
- No batch input. One Actor per run. Run several and diff the dataset records offline.
- No automatic fixing. The audit reports what to change and writes suggestion prose; it doesn't push edits back to your Actor.
- No live MCP probing. The rubric grades the metadata an MCP agent would read, but the audit itself doesn't simulate an agent's selection flow end-to-end.
- No subjective taste. Brand voice, marketing positioning, persona fit — none of that is graded. The rubric is mechanical.
- No private data inspection. Only the public Apify API and (if you opt in) the canonical page URL you provide.
For one-shot improvement of an existing Actor's customer-facing copy, use the apify-actor-copy writing skill instead — it writes against the same standards this Actor audits against. For ongoing tracking of a portfolio's discoverability over time, run this audit on a cron and diff the dataset records. For competitive landscape analysis against other published Actors, use a dedicated Store-search workflow.
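The cron-and-diff approach for tracking a portfolio can be sketched as a comparison of two saved records for the same Actor. `diff_category_summary` is a hypothetical helper, the sample records are abbreviated and invented, and `"MISSING"` is a placeholder chosen here for dimensions absent from one side.

```python
def diff_category_summary(before: dict, after: dict) -> dict:
    """Map each dimension whose status changed to its (old, new) pair."""
    old = before.get("category_summary", {})
    new = after.get("category_summary", {})
    return {
        dim: (old.get(dim, "MISSING"), new.get(dim, "MISSING"))
        for dim in sorted(set(old) | set(new))
        if old.get(dim) != new.get(dim)
    }


# Two abbreviated, made-up records from consecutive scheduled runs.
last_week = {"actor": "shelvick/smart-page-fetcher",
             "category_summary": {"description_quality": "PASS", "tool_name": "PASS"}}
this_week = {"actor": "shelvick/smart-page-fetcher",
             "category_summary": {"description_quality": "WARN", "tool_name": "PASS"}}
changes = diff_category_summary(last_week, this_week)
print(changes)
```

An empty result means no dimension changed status between the two runs.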
Design notes: scotthelvick.com/tools/apify-discoverability-audit