Structured Data Validator (JSON-LD / OG)

Extract and validate structured data from any URL: JSON-LD, Open Graph, Twitter Cards, microdata, RDFa, meta tags. Local schema.org validation. Flags Google rich-result eligibility and AI-discovery readiness. Pure HTTP. Built for SEO audits and structured-data debugging at scale.

Developer: BowTiedRaccoon (Maintained by Community)

Structured Data Validator Pro (JSON-LD, Open Graph, Schema.org)

Extract and validate structured data from any URL — JSON-LD, Open Graph, Twitter Cards, microdata, RDFa, and meta tags — in one pass. Local schema.org validation, Google rich-result eligibility check, and an AI-discovery readiness score. Pure HTTP, no browser.


Structured Data Validator Features

  • Extracts six structured-data formats per URL: JSON-LD, Open Graph, Twitter Cards, microdata, RDFa, and meta tags.
  • Validates JSON-LD blocks against a bundled schema.org rule set with required-field gates per type (Article, Recipe, Product, Event, FAQPage, HowTo, VideoObject).
  • Flags Google rich-result eligibility — true when any block satisfies the relevant rich-result requirement set.
  • Scores AI-discovery readiness on a 0-100 scale, weighted toward the signals LLM crawlers actually use.
  • Detects and lists every schema.org @type found across all formats.
  • Optional raw-HTML dump to KVS for offline debugging.
  • Pure HTTP fetch via CheerioCrawler — no browser, no proxy by default. The cheap default.
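
As a rough illustration of what single-pass extraction from static HTML involves, the sketch below pulls JSON-LD blocks, og:* tags, and twitter:* tags out of a page using only Python's standard library. It is not the actor's code (the actor runs on CheerioCrawler); the StructuredDataParser class, the extract helper, and the sample HTML are assumptions made up for this example, and microdata, RDFa, and generic meta tags are left out for brevity.

```python
import json
from html.parser import HTMLParser


class StructuredDataParser(HTMLParser):
    """Hypothetical helper: collects JSON-LD, og:* and twitter:* data."""

    def __init__(self):
        super().__init__()
        self.json_ld = []       # parsed JSON-LD objects
        self.open_graph = {}    # og:* property -> content
        self.twitter_card = {}  # twitter:* name -> content
        self._in_json_ld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        script_type = (attr.get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []
        elif tag == "meta":
            key = attr.get("property") or attr.get("name") or ""
            content = attr.get("content") or ""
            if key.startswith("og:"):
                self.open_graph[key] = content
            elif key.startswith("twitter:"):
                self.twitter_card[key] = content

    def handle_data(self, data):
        # Script bodies can arrive in several chunks, so buffer them.
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                self.json_ld.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # malformed block; a real validator would record an issue


def extract(html):
    """Return a partial row shaped like the actor's output fields."""
    parser = StructuredDataParser()
    parser.feed(html)
    parser.close()
    return {
        # Blocks are stringified to mirror the documented jsonLd field.
        "jsonLd": [json.dumps(block) for block in parser.json_ld],
        "openGraph": parser.open_graph,
        "twitterCard": parser.twitter_card,
    }


SAMPLE_HTML = """
<html><head>
<meta property="og:title" content="Example">
<meta name="twitter:card" content="summary_large_image">
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization", "name": "Example"}
</script>
</head><body></body></html>
"""

result = extract(SAMPLE_HTML)
```

Because the parser only sees the HTML it is handed, markup injected by client-side JavaScript never reaches it, which is the same trade-off the pure-HTTP design accepts.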

Who Uses Structured Data Audits?

  • SEO teams — audit rich-result eligibility across a sitemap before chasing rank changes that turn out to be markup bugs.
  • Content engineering — verify JSON-LD blocks ship with every article, product, or recipe page.
  • AI / LLM-discovery auditors — score how well a site speaks to AI crawlers, since LLMs lean heavily on structured data.
  • Migration QA — diff structured-data coverage before and after a CMS swap or template refactor.
  • Competitive research — see exactly which schema.org types competitors mark up, and which ones they miss.

How Structured Data Validator Works

  1. Pass in a list of URLs. maxItems defaults to 5 and is hard-capped at 15 per run, which keeps a run inside the Apify tester's 5-minute timeout.
  2. CheerioCrawler fetches the static HTML over plain HTTP. The handler runs all six extractors in parallel.
  3. JSON-LD blocks are validated against the bundled schema.org rule set. Each issue is recorded with severity, path, type, and message.
  4. The actor flags Google rich-result eligibility and computes the AI-discovery readiness score, then emits one row per URL.
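
Steps 3 and 4 can be pictured with a small required-field check. This is a minimal sketch under stated assumptions, not the bundled rule set: the REQUIRED_FIELDS table, the path format, and all function names are hypothetical, and the eligibility rule here (a gated type with no error-level issues) is one plausible reading of "any block satisfies the relevant requirement set".

```python
# Hypothetical gates for illustration only; the actor's bundled rules may differ.
REQUIRED_FIELDS = {
    "Article": ["headline", "author", "datePublished"],
    "Product": ["name", "offers"],
    "FAQPage": ["mainEntity"],
}


def block_types(block):
    """Normalize @type, which may be a string, a list, or missing."""
    raw_type = block.get("@type")
    if raw_type is None:
        return []
    return raw_type if isinstance(raw_type, list) else [raw_type]


def validate_block(block, index):
    """Return issue dicts (severity, path, type, message) for one block."""
    issues = []
    types = block_types(block)
    if not types:
        issues.append({
            "severity": "warn",
            "path": f"jsonLd[{index}]",
            "type": "unknown",
            "message": "missing @type",
        })
    for schema_type in types:
        # Types without gates parse fine but skip required-field checks.
        for field in REQUIRED_FIELDS.get(schema_type, []):
            if block.get(field) in (None, "", [], {}):
                issues.append({
                    "severity": "error",
                    "path": f"jsonLd[{index}].{field}",
                    "type": schema_type,
                    "message": f"missing required field '{field}'",
                })
    return issues


def format_issue(issue):
    """Render an issue as '<severity> [<path>] (<type>) <message>'."""
    return (f"{issue['severity']} [{issue['path']}] "
            f"({issue['type']}) {issue['message']}")


def rich_result_eligible(blocks):
    """True when any gated block has no error-level issues (assumed rule)."""
    for index, block in enumerate(blocks):
        gated = any(t in REQUIRED_FIELDS for t in block_types(block))
        has_error = any(issue["severity"] == "error"
                        for issue in validate_block(block, index))
        if gated and not has_error:
            return True
    return False


blocks = [
    {"@type": "Article", "headline": "Hello"},
    {"@type": "Product", "name": "Widget", "offers": {"price": "9.99"}},
]
article_issues = [format_issue(i) for i in validate_block(blocks[0], 0)]
```

In this sketch the Article block fails two gates while the Product block passes, so the page as a whole would still count as eligible.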

Input

{
  "urls": [
    "https://schema.org/Article",
    "https://www.apify.com"
  ],
  "maxItems": 5,
  "extractWhich": ["json-ld", "open-graph", "twitter-cards", "microdata", "rdfa", "meta-tags"],
  "validateAgainst": "schema.org",
  "includeRawHtml": false
}

| Field | Type | Default | Description |
|---|---|---|---|
| urls | array | required | URLs to extract and validate structured data from. |
| maxItems | integer | 5 | Hard cap on URLs per run. Range 1-15. |
| extractWhich | array | all six | Formats to extract: json-ld, open-graph, twitter-cards, microdata, rdfa, meta-tags. |
| validateAgainst | enum | schema.org | Validation rule set. schema.org runs the bundled gates; none skips validation. |
| includeRawHtml | boolean | false | Save the fetched HTML to KVS and link via rawHtmlKvsKey on each row. |
| proxyConfiguration | object | none | Optional. Default is no proxy. |
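
When inputs are generated by a script, it can help to check them against the documented constraints before starting a run. The helper below is a sketch: build_input and its parameter names are hypothetical, and only the field names, defaults, and ranges come from the table above.

```python
ALL_FORMATS = ["json-ld", "open-graph", "twitter-cards",
               "microdata", "rdfa", "meta-tags"]
RULE_SETS = ("schema.org", "none")


def build_input(urls, max_items=5, extract_which=None,
                validate_against="schema.org", include_raw_html=False):
    """Build an input payload, rejecting values outside the documented ranges."""
    if not urls:
        raise ValueError("urls is required and must not be empty")
    if not 1 <= max_items <= 15:
        raise ValueError("maxItems must be between 1 and 15")
    # Default to all six formats when none are specified.
    formats = list(ALL_FORMATS) if extract_which is None else list(extract_which)
    unknown = [name for name in formats if name not in ALL_FORMATS]
    if unknown:
        raise ValueError(f"unknown formats: {unknown}")
    if validate_against not in RULE_SETS:
        raise ValueError(f"validateAgainst must be one of {RULE_SETS}")
    return {
        "urls": list(urls),
        "maxItems": max_items,
        "extractWhich": formats,
        "validateAgainst": validate_against,
        "includeRawHtml": include_raw_html,
    }


payload = build_input(["https://schema.org/Article", "https://www.apify.com"])
```

The returned dict serializes directly with json.dumps and has the same shape as the example input above.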

Structured Data Validator Output Fields

{
  "url": "https://www.apify.com",
  "finalUrl": "https://www.apify.com/",
  "jsonLd": [
    "{\"@context\":\"https://schema.org\",\"@type\":\"Organization\",\"name\":\"Apify\"}"
  ],
  "openGraph": {
    "og:title": "Apify - The Web Scraping Platform",
    "og:type": "website",
    "og:url": "https://apify.com/",
    "og:image": "https://apify.com/img/social.png"
  },
  "twitterCard": { "twitter:card": "summary_large_image" },
  "microdata": [],
  "rdfa": [],
  "metaTags": { "viewport": "width=device-width, initial-scale=1", "robots": "index, follow" },
  "validationErrors": [],
  "schemaTypes": ["Organization"],
  "googleRichResultEligible": false,
  "aiDiscoveryReadiness": {
    "hasJsonLd": true,
    "hasArticleSchema": false,
    "hasFAQ": false,
    "hasHowTo": false,
    "hasOpenGraph": true,
    "score": 60
  },
  "rawHtmlKvsKey": "",
  "status": "success",
  "errorMsg": "",
  "extractedAt": "2026-04-30T12:00:00Z"
}

| Field | Type | Description |
|---|---|---|
| url | string | Audited URL. |
| finalUrl | string | URL after redirects. |
| jsonLd | array | Parsed JSON-LD blocks as JSON-stringified objects (CSV/Excel safe). |
| openGraph | object | All og:* meta tags flattened into a single object. |
| twitterCard | object | All twitter:* meta tags flattened into a single object. |
| microdata | array | itemscope/itemtype blocks as JSON-stringified objects. |
| rdfa | array | property/typeof/resource blocks as JSON-stringified objects. |
| metaTags | object | All <meta name> and <meta http-equiv> tags as a flat object. |
| validationErrors | array | Issues formatted as <severity> [<path>] (<type>) <message>. |
| schemaTypes | array | Detected schema.org types (e.g. Article, Recipe, Product). |
| googleRichResultEligible | boolean | True when any block satisfies a Google rich-result requirement set. |
| aiDiscoveryReadiness | object | {hasJsonLd, hasArticleSchema, hasFAQ, hasHowTo, hasOpenGraph, score 0-100}. |
| rawHtmlKvsKey | string | KVS key for raw HTML when includeRawHtml=true (else empty). |
| status | string | success, not_found, or error. |
| errorMsg | string | Error message on failure (empty on success). |
| extractedAt | string | ISO timestamp. |
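
Two fields need a decoding step before downstream use: jsonLd entries are JSON strings, and validationErrors entries follow the <severity> [<path>] (<type>) <message> template. The sketch below shows one way to unpack both. The regex, the function names, and the sample issue strings (including their path syntax) are assumptions for illustration; the actor's real paths and messages may look different.

```python
import json
import re

# Matches '<severity> [<path>] (<type>) <message>' as described in the table.
ISSUE_PATTERN = re.compile(
    r"^(?P<severity>\w+) \[(?P<path>.*?)\] \((?P<type>[^)]*)\) (?P<message>.*)$"
)


def parse_issue(line):
    """Split one validationErrors string into its parts, or return None."""
    match = ISSUE_PATTERN.match(line)
    return match.groupdict() if match else None


def summarize_row(row):
    """Decode stringified JSON-LD and keep only error-level issues."""
    blocks = [json.loads(text) for text in row.get("jsonLd", [])]
    parsed = [parse_issue(line) for line in row.get("validationErrors", [])]
    issues = [issue for issue in parsed if issue is not None]
    return {
        "url": row["url"],
        "types": [b.get("@type") for b in blocks if isinstance(b, dict)],
        "errors": [i for i in issues if i["severity"] == "error"],
    }


# Hypothetical row: the issue strings below are invented for the example.
sample_row = {
    "url": "https://www.apify.com",
    "jsonLd": [
        '{"@context":"https://schema.org","@type":"Organization","name":"Apify"}'
    ],
    "validationErrors": [
        "error [$.author] (Article) missing required field author",
        "warn [$.image] (Article) recommended field image is absent",
    ],
}
summary = summarize_row(sample_row)
```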

Pricing

The charge is a token amount, functionally free. Apify rejects pay-per-event (PPE) events priced at exactly $0, so the per-record price sits at the smallest practical floor.

| Event | Price |
|---|---|
| Actor start | $0.10 |
| Per record | $0.0001 |

| Volume | Cost |
|---|---|
| 100 records | $0.11 |
| 1,000 records | $0.20 |
| 10,000 records | $1.10 |
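
The volume figures follow from one actor-start event plus the per-record price. A small calculator makes the arithmetic explicit; run_cost and its runs parameter are hypothetical names, and the assumption that every additional run incurs its own start event is mine, not something the listing states.

```python
from decimal import Decimal

ACTOR_START = Decimal("0.10")   # price per actor-start event
PER_RECORD = Decimal("0.0001")  # price per record


def run_cost(records, runs=1):
    """Total cost in USD: one start event per run plus a charge per record."""
    if records < 0 or runs < 1:
        raise ValueError("records must be >= 0 and runs must be >= 1")
    return ACTOR_START * runs + PER_RECORD * records


costs = {volume: run_cost(volume) for volume in (100, 1_000, 10_000)}
```

Decimal is used instead of float so the results compare exactly against the listed prices. If a large volume has to be split across several runs, passing runs shows how the start events would add up under the assumption above.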

This actor is the cheap discovery primitive that pairs with paid downstream actors. Audit liberally.


Limits

  • maxItems defaults to 5 and caps at 15 per run, sized for the Apify tester's 5-minute timeout.
  • The schema.org validator covers the common Google-rich-result types (Article, Recipe, Product, Event, FAQPage, HowTo, VideoObject). Other types parse but skip required-field validation.
  • The actor uses HTTP fetch only. Sites that require JS rendering for structured data won't surface anything — pair with a render crawler upstream.
  • includeRawHtml=true writes one KVS entry per URL. KVS quotas apply.
  • Validation severity is internal — validationErrors strings start with error, warn, or info for downstream filtering.
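
Given the 15-URL ceiling per run, a larger URL list has to be split before it is submitted. The helper below is a minimal sketch of that batching step; batch_urls is a hypothetical name, and how each batch is then submitted as a run is left out.

```python
def batch_urls(urls, batch_size=15):
    """Split URLs into de-duplicated batches that respect the per-run cap."""
    if not 1 <= batch_size <= 15:
        raise ValueError("batch_size must be between 1 and 15")
    # dict.fromkeys drops duplicates while preserving first-seen order.
    unique = list(dict.fromkeys(urls))
    return [unique[start:start + batch_size]
            for start in range(0, len(unique), batch_size)]


# Example list with one duplicate; example.com URLs are placeholders.
sitemap_urls = [f"https://example.com/page/{n}" for n in range(32)]
sitemap_urls.append("https://example.com/page/0")
batches = batch_urls(sitemap_urls)
```

Each batch can then become the urls value of a separate run, with maxItems set to the batch length.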

Related Actors

  • Sitemap Walker Pro: feed discovered URLs straight into this validator for site-wide structured-data audits.
  • SSL & Security Headers Checker: pair for full SEO + security audits per URL.
  • Angular SSR State Extractor: for sites where the structured data lives inside Angular's TransferState payload.

Need More Features?

Need additional schema.org types, custom validation rules, or a render-crawler variant? File an issue or get in touch.

Why Use Structured Data Validator Pro?

  • Functionally free — $0.0001 per record. Audit your whole sitemap and barely move the needle.
  • Six formats, one pass — JSON-LD, Open Graph, Twitter Cards, microdata, RDFa, and meta tags in a single dataset row. Most tools cover one, maybe two.
  • AI-discovery score baked in — rich-result eligibility plus an LLM-readiness score, so you know how the site reads to both Google and Claude.

Built by OrbTop.