Structured Data Validator + Rich Results & AEO Checker

Pricing: Pay per usage

Detect ALL schema.org structured data (JSON-LD + microdata) on any public page, validate each type against Google Rich Results requirements, and get a 0-100 validity score, per-type eligibility, AEO readiness signals, and concrete prop-level fixes.

Developer: Tommy G (maintained by Community)

Structured Data Validator + Rich Results & AEO Checker (Apify Actor)

Give it any public page URL and it detects every schema.org structured-data type on the page (both JSON-LD and microdata), validates each type against Google Rich Results requirements, and tells you exactly what's missing to win a rich result — plus an AEO (Answer-Engine Optimization) readiness read for AI search. HTML-only (no headless browser), fast, cheap, deterministic.

Built for SEO audits at scale, rich-result eligibility monitoring, schema QA in CI, competitor structured-data teardown, and AEO readiness checks.
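As a rough illustration of the JSON-LD half of detection, the sketch below pulls `application/ld+json` blocks out of raw HTML and lists the distinct `@type` values it finds. It is not the Actor's code: the function names `extractJsonLd` and `collectTypes` are hypothetical, the regex-based extraction is an assumption, and microdata is left out entirely.

```javascript
// Hypothetical sketch: not the Actor's real implementation.
// Pulls JSON-LD blocks out of raw HTML and lists distinct @type values.

function extractJsonLd(html) {
  const re = /<script[^>]*type\s*=\s*["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const blocks = [];
  let match;
  while ((match = re.exec(html)) !== null) {
    try {
      blocks.push(JSON.parse(match[1]));
    } catch {
      // Malformed JSON-LD is skipped here; a real validator would report it.
    }
  }
  return blocks;
}

function collectTypes(node, seen = new Set()) {
  if (Array.isArray(node)) {
    for (const item of node) collectTypes(item, seen);
  } else if (node !== null && typeof node === "object") {
    const types = Array.isArray(node["@type"]) ? node["@type"] : [node["@type"]];
    for (const t of types) {
      if (typeof t === "string" && t.trim() !== "") seen.add(t.trim());
    }
    // Recurse into every property, which covers @graph and nested entities.
    for (const value of Object.values(node)) collectTypes(value, seen);
  }
  return [...seen]; // A Set keeps first-seen insertion order and dedupes.
}

const html = `
<html><head>
<script type="application/ld+json">
{"@context":"https://schema.org","@graph":[
  {"@type":"Product","name":"Widget",
   "offers":{"@type":"Offer","price":"19.99","priceCurrency":"USD"}},
  {"@type":"BreadcrumbList","itemListElement":[]}
]}
</script>
</head></html>`;

console.log(collectTypes(extractJsonLd(html)));
// [ 'Product', 'Offer', 'BreadcrumbList' ]
```

Because this works on the HTML string alone, it only sees markup present in the initial response, which is the trade-off of an HTML-only approach.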

What it validates

Per-type Google REQUIRED + RECOMMENDED property coverage for:

  • Article / NewsArticle / BlogPosting, Product (merchant listing and product snippet), Offer / AggregateOffer, AggregateRating / Review, Recipe, Event, JobPosting, LocalBusiness (+ subtypes like Restaurant/Store), FAQPage, HowTo, BreadcrumbList, VideoObject, Organization, Course.

It encodes the real-world gotchas:

  • HowTo is deprecated (detected for AEO but never a rich result).
  • FAQPage rich results are deprecated (validated structurally only).
  • Article & Organization have no Google-required props.
  • Product splits into two experiences.
  • Conditional-required logic (offers ⇒ price).
  • Nested vs standalone ratings/reviews.
  • BreadcrumbList needs ≥2 items.
  • ratingValue must be in range.
  • Microdata counts exactly like JSON-LD.
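Three of these gotchas (breadcrumb length, rating range, and the offer-implies-price condition) can be sketched as checks over a single parsed node. This is only an illustration: `checkHardErrors` and its messages are hypothetical, and the fallback rating scale of 1 to 5 is an assumption based on the schema.org defaults for `worstRating` and `bestRating`.

```javascript
// Hypothetical sketch of a few hard-error checks; names and messages are
// illustrative, not the Actor's real rule set.

function checkHardErrors(node) {
  const errors = [];
  const type = node["@type"];

  if (type === "BreadcrumbList") {
    const items = Array.isArray(node.itemListElement) ? node.itemListElement : [];
    if (items.length < 2) errors.push("BreadcrumbList needs at least 2 items");
  }

  if (type === "AggregateRating" || type === "Rating") {
    // Assumption: fall back to the schema.org default scale of 1 to 5.
    const worst = Number(node.worstRating ?? 1);
    const best = Number(node.bestRating ?? 5);
    const value = Number(node.ratingValue);
    if (!Number.isFinite(value) || value < worst || value > best) {
      errors.push(`ratingValue must be within ${worst}-${best}`);
    }
  }

  if (type === "Offer") {
    // Conditional-required: once an offer is present, it must carry a numeric price.
    const missing = node.price == null || node.price === "";
    if (missing || !Number.isFinite(Number(node.price))) {
      errors.push("Offer.price missing or non-numeric");
    }
  }

  return errors;
}

console.log(checkHardErrors({ "@type": "AggregateRating", ratingValue: "7" }));
// [ 'ratingValue must be within 1-5' ]
console.log(checkHardErrors({ "@type": "Offer", price: "19.99", priceCurrency: "USD" }));
// []
```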

Input

{ "startUrls": [{ "url": "https://www.example.com/product/123" }], "maxConcurrency": 5, "maxPages": 100 }

maxPages is capped at 200 and maxConcurrency at 20; larger values are reduced to these limits.
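One way to picture the caps is a small input normalizer, sketched below. The limits (200 and 20) come from this README; the function name `normalizeInput` and the fallback defaults (taken from the example input above) are assumptions, not documented behavior.

```javascript
// Hypothetical input normalizer. The caps (200 pages, 20 concurrent requests)
// come from the README; the fallback defaults are assumptions.

const LIMITS = { maxPages: 200, maxConcurrency: 20 };
const ASSUMED_DEFAULTS = { maxPages: 100, maxConcurrency: 5 };

function clamp(value, fallback, max) {
  // Anything that is not a positive integer falls back to the assumed default.
  const n = Number.isInteger(value) && value > 0 ? value : fallback;
  return Math.min(n, max);
}

function normalizeInput(input = {}) {
  const startUrls = (input.startUrls ?? [])
    .map((entry) => entry && entry.url)
    .filter((url) => typeof url === "string" && url.length > 0);
  return {
    startUrls,
    maxPages: clamp(input.maxPages, ASSUMED_DEFAULTS.maxPages, LIMITS.maxPages),
    maxConcurrency: clamp(
      input.maxConcurrency,
      ASSUMED_DEFAULTS.maxConcurrency,
      LIMITS.maxConcurrency
    ),
  };
}

console.log(normalizeInput({
  startUrls: [{ url: "https://www.example.com/product/123" }],
  maxConcurrency: 50,
  maxPages: 1000,
}));
// maxPages comes back as 200 and maxConcurrency as 20.
```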

Output — one STABLE record per URL (ok and error rows share the shape)

{
  "status": "ok",
  "requested_url": "...",
  "final_url": "...",
  "http_status": 200,
  "found": true,
  "complete": true,
  "page_type": "rich-eligible",
  "source": "json-ld",
  "detected_types": ["Product", "Offer", "AggregateRating", "BreadcrumbList"],
  "entities": [
    {
      "type": "Product",
      "source": "json-ld",
      "required_missing": [],
      "recommended_missing": ["offers.shippingDetails", "offers.hasMerchantReturnPolicy"],
      "rich_result_eligible": true,
      "errors": []
    }
  ],
  "rich_result_eligible_types": ["Product", "BreadcrumbList"],
  "validity_score": 88,
  "aeo_signals": {
    "has_faq": false,
    "has_howto": false,
    "has_breadcrumb": true,
    "has_article_meta": false,
    "answer_upfront": false
  },
  "fixes": [
    "Product.offers.priceCurrency missing — add ISO 4217 code for merchant listing eligibility."
  ],
  "extracted_at": "2026-06-04T..."
}
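The "ok and error rows share the shape" guarantee can be sketched by deriving the error row from the same template as the ok row. The helper names `emptyRecord` and `errorRecord` are hypothetical, and the placeholder values used for an error row (null, empty arrays, a score of 0) are assumptions about what the Actor emits.

```javascript
// Hypothetical helpers: build ok and error rows from one template so both
// always expose the same keys. The empty values used here are assumptions.

function emptyRecord(requestedUrl) {
  return {
    status: "ok",
    requested_url: requestedUrl,
    final_url: null,
    http_status: null,
    found: false,
    complete: false,
    page_type: null,
    source: null,
    detected_types: [],
    entities: [],
    rich_result_eligible_types: [],
    validity_score: 0,
    aeo_signals: {
      has_faq: false,
      has_howto: false,
      has_breadcrumb: false,
      has_article_meta: false,
      answer_upfront: false,
    },
    fixes: [],
    extracted_at: new Date().toISOString(),
  };
}

function errorRecord(requestedUrl, httpStatus) {
  // Spreading the template keeps every key, and the key order, intact.
  return { ...emptyRecord(requestedUrl), status: "error", http_status: httpStatus };
}

const okKeys = Object.keys(emptyRecord("https://www.example.com/a"));
const errKeys = Object.keys(errorRecord("https://www.example.com/b", 503));
console.log(JSON.stringify(okKeys) === JSON.stringify(errKeys)); // true
```

A fixed shape means downstream consumers (spreadsheets, dataset exports, CI checks) can read the same columns for every URL without branching on status.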

Field guide

  • detected_types — every distinct schema.org @type found (JSON-LD + microdata, @graph and nested types flattened in, deduped, first-seen order).
  • entities[] — one per recognized typed node: its required_missing, recommended_missing, rich_result_eligible, and hard errors (malformed date, rating out of range, non-numeric price, breadcrumb < 2 items, conditional-required miss).
  • rich_result_eligible_types — types where at least one entity fully passes Google's REQUIRED props for a live rich result.
  • validity_score — integer 0-100. REQUIRED-coverage dominates (80%), RECOMMENDED is bonus (20%), hard errors subtract. Capped at 79 until at least one type is rich-result eligible (so a page can't score "green" without a real, eligible type). 0 when nothing is found.
  • aeo_signals — has_faq, has_howto, has_breadcrumb, has_article_meta, answer_upfront (answer-first content ≤300 chars) for Answer-Engine Optimization.
  • fixes[] — concrete, prop-level remediation naming Type.property + why it matters.
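The validity_score rules above can be approximated as follows. The 80/20 weighting, the cap at 79, and the zero score for empty pages come from the field guide; everything else is an assumption made for illustration: the `requiredTotal` and `recommendedTotal` fields, averaging across entities, and a flat 5-point penalty per hard error.

```javascript
// Hypothetical scoring sketch. Weighting, cap and zero-score rule follow the
// README; the penalty size, the *Total fields and the averaging are assumed.

const ERROR_PENALTY = 5; // assumed value, not documented

function coverage(present, total) {
  // A type with no props in a tier (e.g. no required props) counts as covered.
  return total === 0 ? 1 : present / total;
}

function validityScore(entities) {
  if (entities.length === 0) return 0; // nothing found

  let sum = 0;
  let errorCount = 0;
  for (const e of entities) {
    const required = coverage(e.requiredTotal - e.required_missing.length, e.requiredTotal);
    const recommended = coverage(
      e.recommendedTotal - e.recommended_missing.length,
      e.recommendedTotal
    );
    sum += 80 * required + 20 * recommended;
    errorCount += e.errors.length;
  }

  let score = Math.round(sum / entities.length) - ERROR_PENALTY * errorCount;
  const anyEligible = entities.some((e) => e.rich_result_eligible);
  if (!anyEligible) score = Math.min(score, 79); // no "green" without an eligible type
  return Math.max(0, Math.min(100, score));
}

const product = {
  requiredTotal: 2, required_missing: [],
  recommendedTotal: 5, recommended_missing: ["offers.shippingDetails", "offers.hasMerchantReturnPolicy"],
  errors: [], rich_result_eligible: true,
};
const howTo = {
  requiredTotal: 0, required_missing: [],
  recommendedTotal: 0, recommended_missing: [],
  errors: [], rich_result_eligible: false,
};

console.log(validityScore([product])); // 92
console.log(validityScore([howTo]));   // 79 (capped: no eligible type)
console.log(validityScore([]));        // 0
```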

found / complete semantics

  • found = true iff ≥1 entity exists with a recognized non-empty schema.org @type (JSON-LD or microdata). A page with only <title>/OpenGraph, or only empty {} / @type-less nodes ⇒ found = false (never overbills).
  • complete = true iff at least one type is rich_result_eligible. HowTo (deprecated) and bare FAQPage never make a page complete on their own.
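Expressed as predicates over the entities[] array, the two flags might look like the sketch below. The function names are hypothetical, and treating "recognized" as membership in the type list from "What it validates" is an assumption.

```javascript
// Hypothetical sketch of the found / complete flags.
// Assumption: "recognized" means the type appears in the validated-type list.

const RECOGNIZED_TYPES = new Set([
  "Article", "NewsArticle", "BlogPosting", "Product", "Offer", "AggregateOffer",
  "AggregateRating", "Review", "Recipe", "Event", "JobPosting", "LocalBusiness",
  "Restaurant", "Store", "FAQPage", "HowTo", "BreadcrumbList", "VideoObject",
  "Organization", "Course",
]);

function isFound(entities) {
  // True when at least one entity has a recognized, non-empty type.
  return entities.some(
    (e) => typeof e.type === "string" && RECOGNIZED_TYPES.has(e.type.trim())
  );
}

function isComplete(entities) {
  // True only when at least one entity is eligible for a live rich result.
  return entities.some((e) => e.rich_result_eligible === true);
}

const howToOnly = [{ type: "HowTo", rich_result_eligible: false }];
console.log(isFound(howToOnly), isComplete(howToOnly)); // true false

const typeless = [{ type: "", rich_result_eligible: false }];
console.log(isFound(typeless), isComplete(typeless)); // false false
```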

Run locally / test

npm install
npm test # 60 unit tests on the pure validator (node:test) — rich-result rules + edge cases

Publish to Apify (account-holder's step)

npm install -g apify-cli && apify login && apify push

Notes / safety

  • SSRF-guarded, robots-respecting, rate-limited, cost-capped (shared src/lib/actor_runner.js).
  • Stores only derived validation results — no raw page bodies.
  • HTML-only: client-rendered pages that inject JSON-LD via JavaScript return found:false with render_required:true.
  • Core logic lives in src/validate.js (pure, deterministic, unit-tested).
  • Validation rules web-verified against developers.google.com/search/docs structured-data docs (June 2026).