Apify Actor Discoverability & SEO Audit

Pricing

from $0.0425 / audit completed


Audit an Apify Actor for discoverability by LLM agents and Apify Store ranking. Scores description quality, schema, disambiguation, agentic-pay, and live agent-search retrieval across 9 dimensions, returning copy-paste-ready fixes so AI agents can find and call it as a tool.



Developer

Scott Helvick

Maintained by Community


Audit an Apify Actor for AI-agent discoverability, Store ranking, and SEO against an evidence-grounded rubric — and make it easier for AI agents to find. Returns a punch list of failed checks across 9 dimensions (description quality, schema, disambiguation, README, store hygiene, a live probe of whether the Actor surfaces in agent-facing Actor search, and more), plus copy-paste-ready improvement text for everything that didn't pass. Useful before publishing a new Actor or before requesting agentic-payment whitelisting.

What this does

Submit one Actor identifier (username/name or 17-char ID). The Actor fetches the target's Store metadata, its latest build (full README + parsed input schema + dataset schema), and its public Store row, then runs a 9-dimension rubric: 8 deterministic checks grounded in 2026 peer-reviewed research on agent tool selection, plus a live probe of whether the Actor actually surfaces in the agent-facing Actor search.

What you get back, per run:

  • Overall verdict: PASS / WARN / FAIL, with the agentic-payment hard-gate flag broken out separately
  • Per-dimension category summary — one status per dimension (agentic-payment eligibility, description quality, schema completeness, disambiguation, tool name, README structure, store hygiene, off-platform canonical signals, live agent retrieval)
  • Full punch list — every individual check with its status, a short evidence line, and the dimension it belongs to
  • Suggestions — one copy-paste-ready improvement text per failed/warned check, written to the rubric concepts (six-component description framework, 500-char MCP truncation, 13-section README template, disambiguation block, PAY_PER_EVENT pricing)
  • Live retrieval summary — where the Actor ranked in the agent-facing Actor search for its own name and for agent-intent keywords, plus external MCP-registry presence (the agent_retrieval dimension). Diagnostic: its info-level signals are shown but don't demote the verdict; only a name that doesn't resolve, or a fixable under-naming gap, does.
  • Warnings — any data-collection issues (private Actor, build endpoint outage, unreachable canonical page) surfaced separately from the rubric verdicts

Two outputs per successful run:

  • One dataset record with the full structured shape — the schema is on the Console's Dataset tab
  • A Markdown punch list at OUTPUT.md in the run's Key-Value Store — same data, rendered as a readable report

Common workflows this enables:

  • Self-audit your Actor before pushing it public for the first time
  • Audit before requesting agentic-payment whitelisting (the rubric's hard gate)
  • Catch description-quality regressions across edits
  • Spot Apify build-validator gotchas (e.g. items.enum) before the push fails
  • Survey a portfolio of Actors and rank them by discoverability debt
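As a rough illustration of the portfolio survey, the sketch below ranks a set of audit records by a simple debt score. The weights (2 per failed check, 1 per warned check) and the helper names `debt_score` and `rank_by_debt` are assumptions made for this example, not part of the Actor. It also assumes per-check statuses use the lowercase `pass` / `warn` / `fail` strings seen in the sample record in the Output section.

```python
from typing import Any

# Hypothetical weights: a failed check counts double a warned one.
WEIGHTS = {"fail": 2, "warn": 1}


def debt_score(record: dict[str, Any]) -> int:
    """Sum the weights of every non-passing check in one audit record."""
    return sum(
        WEIGHTS.get(str(check.get("status", "")).lower(), 0)
        for check in record.get("checks", [])
    )


def rank_by_debt(records: list[dict[str, Any]]) -> list[tuple[str, int]]:
    """Return (actor, score) pairs, highest debt first."""
    scored = [(record["actor"], debt_score(record)) for record in records]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Placeholder records; real ones come from each run's dataset.
    portfolio = [
        {"actor": "example/actor-a", "checks": [{"status": "warn"}]},
        {"actor": "example/actor-b", "checks": [{"status": "fail"}, {"status": "warn"}]},
    ]
    for actor, score in rank_by_debt(portfolio):
        print(f"{score:>3}  {actor}")
```

Any monotonic weighting would work here; the point is only that the `checks` array is enough to order a portfolio without re-reading each report.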

Why agent-discoverability matters

AI agents pick tools by reading their descriptions and schemas, not by clicking through to a documentation site. Wang et al. 2026 measured a 260% lift in selection from standards-compliant tool descriptions; the "Tool Preferences in Agentic LLMs" paper found >10x usage swings from description edits alone, across multiple model families.

Apify's MCP server truncates input-schema property descriptions at 500 chars before appending enum values and examples. Tools whose first 500 chars don't pre-load purpose, parameters, and constraints fail at the routing layer. Tools that miss explicit disambiguation guidance ("use [other tool] instead for [other need]") lose to similar-sounding alternatives by default. Tools that aren't on the PAY_PER_EVENT + agentic-payment whitelist are invisible to agents using x402 or Skyfire payment rails.
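To make the 500-char truncation concrete, here is a minimal sketch of a local pre-check over an input schema. It assumes the schema is a dict with a top-level `properties` object whose entries carry a `description` string. The function name `find_truncated_properties` is hypothetical, and this is not the Actor's own check.

```python
MCP_DESCRIPTION_LIMIT = 500  # truncation point described in the paragraph above


def find_truncated_properties(
    input_schema: dict, limit: int = MCP_DESCRIPTION_LIMIT
) -> dict[str, int]:
    """Map property name -> description length for descriptions over the limit."""
    properties = input_schema.get("properties", {})
    return {
        name: len(spec.get("description", ""))
        for name, spec in properties.items()
        if len(spec.get("description", "")) > limit
    }


if __name__ == "__main__":
    schema = {
        "properties": {
            "url": {"type": "string", "description": "Page to fetch."},
            "mode": {"type": "string", "description": "x" * 620},
        }
    }
    for name, length in find_truncated_properties(schema).items():
        print(f"{name}: {length} chars, text past {MCP_DESCRIPTION_LIMIT} may be cut")
```

Properties flagged here are the ones whose purpose, parameters, and constraints should be moved into the first 500 characters.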

The rubric this Actor scores is the executable form of that evidence. Pass it and your Actor competes; fail it and the agent never reaches you.

How it compares

Approach | Description-quality rubric | Schema-completeness checks | Build-validator gotchas (items.enum, etc.) | MCP-truncation awareness | Disambiguation check | Suggestion text
Eyeballing your own Actor against the docs: partial
Running each schema field through a generic JSON Schema linter: partial
Asking an LLM "is my Actor discoverable" with the README pasted in: partial, partial, partial
Apify Discoverability Audit

The audit reads what an agent's MCP client would read — the Store row, the latest build's input schema and README — and grades it against the same rubric the writer (the apify-actor-copy skill) uses, then probes the live agent-facing Actor search to see whether the Actor actually surfaces. Most checks are deterministic; the LLM writes the suggestion prose and generates the retrieval probe's intent keywords (it never sets a verdict).

Input

Field | Type | Required | Default | Description
actor | string | yes | | Apify Actor to audit. Accepts username/name (e.g. shelvick/smart-page-fetcher, apify/website-content-crawler) or a 17-char Actor ID.
canonical_url | string | no | (auto-derived for shelvick/*) | Optional canonical-page URL for the target. When provided, the audit fetches it and checks for a Store back-link. Auto-derived to https://www.scotthelvick.com/tools/<name>/ for the shelvick/ namespace.
include_suggestions | boolean | no | true | When true, the LLM writes copy-paste-ready suggestion text for failed and warned checks. Set false to skip the suggestion LLM call; the deterministic punch list still ships.
include_retrieval_probe | boolean | no | true | When true, runs the live agent_retrieval dimension: queries the agent-facing Actor search for the Actor's name and intent keywords and checks external MCP-registry presence. Generates its probe keywords with an LLM by default (deterministic fallback when no model is configured). Set false to skip it and its keyword call.
retrieval_keywords | array of strings | no | (auto) | Optional explicit intent phrases to probe (e.g. "url to markdown", "bypass cloudflare"). Overrides keyword generation entirely. Ignored when include_retrieval_probe is false.

Output

One dataset record per run, plus a Markdown punch list at OUTPUT.md in the run's Key-Value Store.

Abbreviated success record:

{
  "actor": "shelvick/smart-page-fetcher",
  "audited_at": "2026-05-26T14:31:09Z",
  "overall_status": "PASS",
  "agentic_payment_eligible": true,
  "category_summary": {
    "agentic_payment": "PASS",
    "description_quality": "PASS",
    "schema_completeness": "PASS",
    "disambiguation": "PASS",
    "tool_name": "PASS",
    "readme_structure": "PASS",
    "store_hygiene": "PASS",
    "canonical_signals": "PASS",
    "agent_retrieval": "PASS"
  },
  "checks": [
    {
      "check_id": "description_quality.actor_description_length",
      "dimension": "description_quality",
      "status": "pass",
      "title": "Actor description within 100-2000 char band",
      "evidence": "description is 432 chars"
    }
  ],
  "suggestions": [],
  "canonical_url_checked": "https://www.scotthelvick.com/tools/smart-page-fetcher/",
  "retrieval_summary": {
    "name_resolution": {"keyword": "Smart Page Fetcher", "rank": 1, "top_competitors": []},
    "intent_coverage": [
      {"keyword": "url to markdown", "rank": null, "top_competitors": ["maged120/url-to-markdown-pro"]}
    ],
    "external_registry": {"registry": "smithery", "found": false},
    "keyword_source": "llm",
    "available": true
  },
  "external_registry_presence": {"registry": "smithery", "found": false},
  "warnings": []
}

The full schema is documented on the Console's Dataset tab. The Markdown report at OUTPUT.md renders the same data as a prose punch list with the verdict, per-dimension table, full check list grouped by dimension, and suggestions section.
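A short sketch of how a consumer might read the record above: group non-passing checks by dimension, and list the intent keywords where no rank was reported. The helper names are hypothetical, and reading `"rank": null` as "the Actor did not surface for that keyword" is an assumption based on the sample record rather than on a documented contract.

```python
from collections import defaultdict
from typing import Any


def failing_checks_by_dimension(record: dict[str, Any]) -> dict[str, list[str]]:
    """Group the check_ids of every non-passing check under its dimension."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for check in record.get("checks", []):
        if str(check.get("status", "")).lower() != "pass":
            grouped[check["dimension"]].append(check["check_id"])
    return dict(grouped)


def unranked_intents(record: dict[str, Any]) -> list[str]:
    """Intent keywords whose rank is null (assumed to mean 'did not surface')."""
    summary = record.get("retrieval_summary") or {}
    return [
        entry["keyword"]
        for entry in summary.get("intent_coverage", [])
        if entry.get("rank") is None
    ]


if __name__ == "__main__":
    # Trimmed copy of the sample record shown above.
    record = {
        "checks": [
            {
                "check_id": "description_quality.actor_description_length",
                "dimension": "description_quality",
                "status": "pass",
            }
        ],
        "retrieval_summary": {
            "intent_coverage": [{"keyword": "url to markdown", "rank": None}]
        },
    }
    print(failing_checks_by_dimension(record))
    print(unranked_intents(record))
```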

Example

{
  "actor": "shelvick/smart-page-fetcher",
  "include_suggestions": true
}

Via the API:

curl -X POST "https://api.apify.com/v2/acts/shelvick~apify-discoverability-audit/run-sync-get-dataset-items?token=YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"actor": "shelvick/smart-page-fetcher"}'

A typical audit returns in 10-40 seconds — fast enough for the synchronous endpoint.

Calling from an AI agent

The Actor is designed for agent discovery and invocation.

MCP (mcp.apify.com): surfaces as a callable tool. The input schema is self-documenting — one required field (actor), structured output, no follow-up questions. An LLM can construct correct calls from the tool description without external context. Pay per call via the Actor's pay-per-event model — works with x402 and Skyfire agentic-payment rails.

Apify SDK (Python):

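import os

from apify_client import ApifyClient

# Assumes your Apify API token is exported as APIFY_TOKEN.
API_TOKEN = os.environ["APIFY_TOKEN"]

client = ApifyClient(token=API_TOKEN)
# call() starts the run and waits for it to finish.
run = client.actor("shelvick/apify-discoverability-audit").call(
    run_input={"actor": "shelvick/smart-page-fetcher"}
)
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item["overall_status"], item["category_summary"])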

REST: use /run-sync-get-dataset-items for the typical 10-40 second run, and the async /runs endpoint for very large targets (uncommon, since audit work doesn't scale with target size).

Pricing

Pay-per-event. Flat rate: a single audit-completed charge per successful run. The platform-startup charge fires once regardless of outcome but is effectively zero. Failed runs (invalid input, target Actor unreachable) skip the audit charge entirely.

Setting include_suggestions=false or include_retrieval_probe=false skips the respective LLM call, but the audit charge is the same — the rubric work is the costed unit.
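Because the charge is flat per successful run, budgeting is plain multiplication. The sketch below assumes the "from $0.0425 / audit completed" rate listed on this page; your actual rate may differ, and the near-zero platform-startup charge is ignored. The function name is hypothetical.

```python
from decimal import Decimal

# Assumption: the "from" rate listed on the Store page. Check the Pricing tab.
AUDIT_RATE_USD = Decimal("0.0425")


def estimate_audit_cost(successful_runs: int, rate: Decimal = AUDIT_RATE_USD) -> Decimal:
    """Flat-rate estimate: only successful audits incur the audit charge."""
    if successful_runs < 0:
        raise ValueError("successful_runs must be non-negative")
    return rate * successful_runs


if __name__ == "__main__":
    # Example: auditing a 40-Actor portfolio once.
    print(f"${estimate_audit_cost(40):.2f}")
```

At the listed rate, a single pass over 40 Actors comes to $1.70, and failed runs add nothing to that figure.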

See the Pricing tab on this Store page for the current per-event rate and any active subscriber discounts.

Behavior

Failure modes

The run is marked FAILED only when input validation fails or the target Actor can't be fetched:

  • actor empty or below 3 chars
  • actor not matching username/name or a 17-char ID
  • The target Actor's /v2/acts/<slug> endpoint returns 404 / 403 / 500

When the target fetch fails, no audit charge fires.
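The two input rules above can be pre-checked client-side before spending a run. This is a sketch only: both regular expressions are assumptions about what counts as a `username/name` slug and a 17-char ID (the Actor's real validator may accept a different character set), and `validate_actor` is a hypothetical helper.

```python
import re
from typing import Optional

# Assumed shapes; the Actor's own validation rules are not published here.
ACTOR_SLUG = re.compile(r"[A-Za-z0-9._-]+/[A-Za-z0-9._-]+")
ACTOR_ID = re.compile(r"[A-Za-z0-9]{17}")


def validate_actor(value: str) -> Optional[str]:
    """Return an error message, or None when the identifier looks valid."""
    value = value.strip()
    if len(value) < 3:
        return "actor is empty or shorter than 3 characters"
    if ACTOR_SLUG.fullmatch(value) or ACTOR_ID.fullmatch(value):
        return None
    return "actor must be username/name or a 17-char Actor ID"


if __name__ == "__main__":
    for candidate in ["shelvick/smart-page-fetcher", "ab", "not a slug"]:
        print(repr(candidate), "->", validate_actor(candidate) or "ok")
```

A check like this only catches malformed identifiers; a well-formed identifier for an Actor that doesn't exist still fails at the target fetch.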

Per-source data-collection issues don't fail the run; the affected checks degrade gracefully and each issue adds an entry to the warnings array. Common warnings:

  • pricing-data-unavailable — pricingInfos was absent from the target's Actor object (we don't own the Actor, or it's unpriced) AND no public Store row matched. The agentic-payment dimension warns rather than fails when this happens (eligibility can't be confirmed from outside), and agentic_payment_eligible is reported as null.
  • build-fetch-failed: <detail> — the latest build endpoint returned an error. README, input schema, and dataset schema fall back to empty; the dimensions that depend on them report failures.
  • canonical-non-200: <url> → <code> — the canonical page fetch returned 4xx or 5xx. The canonical-signals dimension warns rather than fails.
  • inputSchema-parse-failed: <detail> — the build endpoint returned an unparseable inputSchema string. Schema-completeness checks fall back as if the schema were absent.
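A consumer can split these warning strings into a code and a detail to route them. The sketch assumes every warning follows the `code: detail` shape shown in the list above, with the detail optional. The helper names and the `AFFECTED_DIMENSION` mapping are illustrative, and the mapping covers only the dimensions the list names explicitly.

```python
from collections import defaultdict

# build-fetch-failed is omitted: the list above says only that "the
# dimensions that depend on" the build are affected, without naming them.
AFFECTED_DIMENSION = {
    "pricing-data-unavailable": "agentic_payment",
    "canonical-non-200": "canonical_signals",
    "inputSchema-parse-failed": "schema_completeness",
}


def split_warning(warning: str) -> tuple[str, str]:
    """Split 'code: detail' on the first colon-space; detail may be empty."""
    code, _, detail = warning.partition(": ")
    return code.strip(), detail.strip()


def group_warnings(warnings: list[str]) -> dict[str, list[str]]:
    """Collect the details of each warning under its code."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for warning in warnings:
        code, detail = split_warning(warning)
        grouped[code].append(detail)
    return dict(grouped)


if __name__ == "__main__":
    sample = ["pricing-data-unavailable", "build-fetch-failed: timeout"]
    for code, details in group_warnings(sample).items():
        dimension = AFFECTED_DIMENSION.get(code, "see warning text")
        print(code, "->", dimension, details)
```

Splitting on colon-space rather than a bare colon keeps URLs in the detail intact.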

Performance expectations

10-40 seconds typical wall-clock. Three sequential Apify API calls (Actor object, default build, Store search), zero or one canonical-page fetch, the live retrieval probe (a handful of agent-search queries plus an external-registry check, bounded by a 120-second budget), and up to two small LLM calls (intent-keyword generation and suggestion synthesis). No browser tier, no proxy, no recursive fetching.

The default run-sync-get-dataset-items endpoint (180-second sync cap) handles every audit comfortably.

FAQ

What's the rubric grounded in? A 2026-05-26 forge:researcher brief synthesizing peer-reviewed agent-tool-selection research: Wang et al. 2026 on description-quality lift, "Tool Preferences in Agentic LLMs" on usage swings from description edits, BiasBusters on semantic alignment as the dominant selection driver, and Anthropic engineering guidance on description quality plus disambiguation as the highest-impact surfaces. JSON-LD and llms.txt are NOT load-bearing: Search Atlas's 2025 300K-domain LLM-citation study and Limy's 2025 90-day bot-traffic log study both showed a null or negligible citation effect, so those signals are scored as hygiene only.

Can I audit a private Actor before publishing? Yes for Actors you own. The Apify token injected by the platform gives the audit access to your own private Actors, including pricingInfos and agentic-payment whitelist status (the Actor-object endpoint exposes these for owned Actors regardless of publish state). For other users' public Actors, the audit falls back to the Store search; if that doesn't find them (newly-published Actors sometimes don't index), the agentic-payment dimension reports WARN and agentic_payment_eligible is null — unconfirmed, not disqualifying.

Why does the LLM only write suggestion text? Every verdict (pass/warn/fail) comes from deterministic checks against the fetched metadata. LLM-as-scorer on subjective categories produces unauditable results. The LLM here synthesizes copy-paste-ready prose for already-failed checks and generates the retrieval probe's search keywords — but it can't invent dimensions or change verdicts. The agent_retrieval ranking and its verdict are computed deterministically from the live search results; the LLM only chooses what to query.

What if I disagree with a check's verdict? The check_ids are stable strings (e.g. description_quality.actor_description_length). Inspect the source — every check is a pure function in the public Actor repo. Open an issue or PR if the rubric is wrong; the standards live in actorlib.discoverability_standards for portability.

Does the audit charge run on failure? No. The audit-completed charge fires only on a successful audit. Input-validation failures and target-fetch failures skip it entirely. The platform-startup charge fires regardless but is effectively zero.

What this doesn't do

  • No batch input. One Actor per run. Run several and diff the dataset records offline.
  • No automatic fixing. The audit reports what to change and writes suggestion prose; it doesn't push edits back to your Actor.
  • No end-to-end agent simulation. The agent_retrieval dimension probes the live agent-facing Actor search to see whether and where the Actor surfaces, but the audit doesn't run a full agent selection-and-execution loop or measure real agent conversions. Live rankings shift over time, so treat that dimension as a point-in-time diagnostic.
  • No subjective taste. Brand voice, marketing positioning, persona fit — none of that is graded. The rubric is mechanical.
  • No private data inspection. Only the public Apify API and (if you opt in) the canonical page URL you provide.

For one-shot improvement of an existing Actor's customer-facing copy, use the apify-actor-copy writing skill instead — it writes against the same standards this Actor audits against. For ongoing tracking of a portfolio's discoverability over time, run this audit on a cron and diff the dataset records. For competitive landscape analysis against other published Actors, use a dedicated Store-search workflow.
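For the cron-and-diff workflow, one possible approach is to compare two dataset records for the same Actor by check_id. This is a sketch under the assumption that both records carry the `checks` array shown in the Output section; `diff_check_statuses` is a hypothetical helper, not something the Actor ships.

```python
from typing import Any, Optional


def diff_check_statuses(
    before: dict[str, Any], after: dict[str, Any]
) -> dict[str, tuple[Optional[str], Optional[str]]]:
    """Map check_id -> (old status, new status) for every check that changed.

    A status of None means the check was absent from that record.
    """
    old = {c["check_id"]: c["status"] for c in before.get("checks", [])}
    new = {c["check_id"]: c["status"] for c in after.get("checks", [])}
    return {
        check_id: (old.get(check_id), new.get(check_id))
        for check_id in sorted(old.keys() | new.keys())
        if old.get(check_id) != new.get(check_id)
    }


if __name__ == "__main__":
    check_id = "description_quality.actor_description_length"
    last_week = {"checks": [{"check_id": check_id, "status": "pass"}]}
    today = {"checks": [{"check_id": check_id, "status": "fail"}]}
    for changed_id, (was, now) in diff_check_statuses(last_week, today).items():
        print(f"{changed_id}: {was} -> {now}")
```

Because check_ids are stable strings, a diff keyed on them stays meaningful across runs even when evidence lines change.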


Design notes: scotthelvick.com/tools/apify-discoverability-audit