Structured Data Validator (JSON-LD / OG)
Pricing
Pay per event
Extract and validate structured data from any URL: JSON-LD, Open Graph, Twitter Cards, microdata, RDFa, meta tags. Local schema.org validation. Flags Google rich-result eligibility and AI-discovery readiness. Pure HTTP. Built for SEO audits and structured-data debugging at scale.
Developer
BowTiedRaccoon
3 days ago
Last modified
Structured Data Validator Pro (JSON-LD, Open Graph, Schema.org)
Extract and validate structured data from any URL — JSON-LD, Open Graph, Twitter Cards, microdata, RDFa, and meta tags — in one pass. Local schema.org validation, Google rich-result eligibility check, and an AI-discovery readiness score. Pure HTTP, no browser.
Structured Data Validator Features
- Extracts six structured-data formats per URL: JSON-LD, Open Graph, Twitter Cards, microdata, RDFa, and meta tags.
- Validates JSON-LD blocks against a bundled schema.org rule set with required-field gates per type (Article, Recipe, Product, Event, FAQPage, HowTo, VideoObject).
- Flags Google rich-result eligibility — true when any block satisfies the relevant rich-result requirement set.
- Scores AI-discovery readiness on a 0-100 scale, weighted toward the signals LLM crawlers actually use.
- Detects and lists every schema.org @type found across all formats.
- Optional raw-HTML dump to KVS for offline debugging.
- Pure HTTP fetch via CheerioCrawler — no browser, no proxy by default. The cheap default.
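To make the extraction features above concrete, here is a minimal sketch of pulling JSON-LD blocks, Open Graph tags, and Twitter Card tags out of static HTML. It uses only the Python standard library rather than the actor's own crawler, and the class and function names (`StructuredDataParser`, `collect_types`) are hypothetical, not part of the actor.

```python
import json
from html.parser import HTMLParser


class StructuredDataParser(HTMLParser):
    """Collects JSON-LD blocks plus og:* and twitter:* meta tags (illustrative only)."""

    def __init__(self) -> None:
        super().__init__()
        self.json_ld = []
        self.open_graph = {}
        self.twitter_card = {}
        self._in_json_ld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "script" and (attr.get("type") or "").lower() == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []
        elif tag == "meta":
            # Open Graph normally uses property="og:*"; Twitter Cards use name="twitter:*".
            key = attr.get("property") or attr.get("name") or ""
            content = attr.get("content") or ""
            if key.startswith("og:"):
                self.open_graph[key] = content
            elif key.startswith("twitter:"):
                self.twitter_card[key] = content

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                self.json_ld.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Malformed block skipped here; a real validator would report it.


def collect_types(blocks):
    """Return the sorted set of @type values found in parsed JSON-LD blocks."""
    found = set()
    for block in blocks:
        for item in block if isinstance(block, list) else [block]:
            if isinstance(item, dict):
                types = item.get("@type", [])
                found.update([types] if isinstance(types, str) else types)
    return sorted(found)


SAMPLE_HTML = """<html><head>
<meta property="og:title" content="Example page">
<meta name="twitter:card" content="summary_large_image">
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization", "name": "Example"}
</script>
</head><body></body></html>"""

parser = StructuredDataParser()
parser.feed(SAMPLE_HTML)
parser.close()
print(parser.open_graph, parser.twitter_card, collect_types(parser.json_ld))
```

Microdata and RDFa need attribute-level traversal of the document tree, so they are left out of this sketch.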
Who Uses Structured Data Audits?
- SEO teams — audit rich-result eligibility across a sitemap before chasing rank changes that turn out to be markup bugs.
- Content engineering — verify JSON-LD blocks ship with every article, product, or recipe page.
- AI / LLM-discovery auditors — score how well a site speaks to AI crawlers, since LLMs lean heavily on structured data.
- Migration QA — diff structured-data coverage before and after a CMS swap or template refactor.
- Competitive research — see exactly which schema.org types competitors mark up, and which ones they miss.
How Structured Data Validator Works
- Pass in a list of URLs. The actor processes up to maxItems of them per run (default 5, maximum 15) to stay inside the Apify tester's 5-minute timeout.
- CheerioCrawler fetches the static HTML over plain HTTP. The handler runs all six extractors in parallel.
- JSON-LD blocks are validated against the bundled schema.org rule set. Each issue is recorded with severity, path, type, and message.
- The actor flags Google rich-result eligibility and computes the AI-discovery readiness score, then emits one row per URL.
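The validation step can be sketched roughly as follows. The required-field lists below are illustrative assumptions, not the actor's bundled rule set (which this README does not publish), and the `jsonLd[<index>].<field>` path layout is likewise a guess. Only the four issue attributes and the string layout `<severity> [<path>] (<type>) <message>` come from this document.

```python
from dataclasses import dataclass

# Hypothetical required-field gates, for illustration only.
REQUIRED_FIELDS = {
    "Recipe": ["name", "image"],
    "Event": ["name", "startDate", "location"],
    "FAQPage": ["mainEntity"],
}


@dataclass
class Issue:
    severity: str
    path: str
    type: str
    message: str

    def format(self) -> str:
        # Layout documented for the validationErrors output field.
        return f"{self.severity} [{self.path}] ({self.type}) {self.message}"


def validate_block(block: dict, index: int = 0) -> list:
    """Check one parsed JSON-LD block against the (assumed) required-field gates."""
    schema_type = block.get("@type")
    if not isinstance(schema_type, str):
        return [Issue("error", f"jsonLd[{index}]", "unknown", "missing or non-string @type")]
    issues = []
    # Types without an entry parse fine but skip required-field checks.
    for field in REQUIRED_FIELDS.get(schema_type, []):
        if block.get(field) in (None, "", [], {}):
            issues.append(
                Issue("error", f"jsonLd[{index}].{field}", schema_type,
                      f"required field '{field}' is missing")
            )
    return issues


issues = validate_block({"@type": "Event", "name": "Launch party"})
for issue in issues:
    print(issue.format())
```

Under these assumptions, a block would count toward rich-result eligibility only when its type has a gate and the gate returns no errors.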
Input
    {
      "urls": [
        "https://schema.org/Article",
        "https://www.apify.com"
      ],
      "maxItems": 5,
      "extractWhich": ["json-ld", "open-graph", "twitter-cards", "microdata", "rdfa", "meta-tags"],
      "validateAgainst": "schema.org",
      "includeRawHtml": false
    }
| Field | Type | Default | Description |
|---|---|---|---|
| urls | array | required | URLs to extract and validate structured data from. |
| maxItems | integer | 5 | Hard cap on URLs per run. Range 1-15. |
| extractWhich | array | all six | Formats to extract: json-ld, open-graph, twitter-cards, microdata, rdfa, meta-tags. |
| validateAgainst | enum | schema.org | Validation rule set. schema.org runs the bundled gates; none skips validation. |
| includeRawHtml | boolean | false | Save the fetched HTML to KVS and link via rawHtmlKvsKey on each row. |
| proxyConfiguration | object | none | Optional. Default is no proxy. |
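As a sketch of how a caller might assemble this input before starting a run, the helper below applies the defaults and ranges from the table. The function name `build_input` and its client-side checks are assumptions for illustration; the actor presumably enforces its own input schema.

```python
import json

ALL_FORMATS = ["json-ld", "open-graph", "twitter-cards", "microdata", "rdfa", "meta-tags"]


def build_input(urls, max_items=5, extract_which=None,
                validate_against="schema.org", include_raw_html=False):
    """Build an input payload using the defaults and ranges from the input table."""
    if not urls:
        raise ValueError("urls is required and must not be empty")
    if not 1 <= max_items <= 15:
        raise ValueError("maxItems must be between 1 and 15")
    formats = list(extract_which) if extract_which is not None else list(ALL_FORMATS)
    unknown = [name for name in formats if name not in ALL_FORMATS]
    if unknown:
        raise ValueError(f"unknown formats: {unknown}")
    if validate_against not in ("schema.org", "none"):
        raise ValueError("validateAgainst must be 'schema.org' or 'none'")
    return {
        "urls": list(urls),
        "maxItems": max_items,
        "extractWhich": formats,
        "validateAgainst": validate_against,
        "includeRawHtml": include_raw_html,
    }


payload = build_input(["https://www.apify.com"], extract_which=["json-ld", "open-graph"])
print(json.dumps(payload, indent=2))
```

proxyConfiguration is omitted here because the default is no proxy.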
Structured Data Validator Output Fields
    {
      "url": "https://www.apify.com",
      "finalUrl": "https://www.apify.com/",
      "jsonLd": ["{\"@context\":\"https://schema.org\",\"@type\":\"Organization\",\"name\":\"Apify\"}"],
      "openGraph": {
        "og:title": "Apify - The Web Scraping Platform",
        "og:type": "website",
        "og:url": "https://apify.com/",
        "og:image": "https://apify.com/img/social.png"
      },
      "twitterCard": { "twitter:card": "summary_large_image" },
      "microdata": [],
      "rdfa": [],
      "metaTags": { "viewport": "width=device-width, initial-scale=1", "robots": "index, follow" },
      "validationErrors": [],
      "schemaTypes": ["Organization"],
      "googleRichResultEligible": false,
      "aiDiscoveryReadiness": {
        "hasJsonLd": true,
        "hasArticleSchema": false,
        "hasFAQ": false,
        "hasHowTo": false,
        "hasOpenGraph": true,
        "score": 60
      },
      "rawHtmlKvsKey": "",
      "status": "success",
      "errorMsg": "",
      "extractedAt": "2026-04-30T12:00:00Z"
    }
| Field | Type | Description |
|---|---|---|
| url | string | Audited URL. |
| finalUrl | string | URL after redirects. |
| jsonLd | array | Parsed JSON-LD blocks as JSON-stringified objects (CSV/Excel safe). |
| openGraph | object | All og:* meta tags flattened into a single object. |
| twitterCard | object | All twitter:* meta tags flattened into a single object. |
| microdata | array | itemscope/itemtype blocks as JSON-stringified objects. |
| rdfa | array | property/typeof/resource blocks as JSON-stringified objects. |
| metaTags | object | All <meta name> and <meta http-equiv> tags as a flat object. |
| validationErrors | array | Issues formatted as <severity> [<path>] (<type>) <message>. |
| schemaTypes | array | Detected schema.org types (e.g. Article, Recipe, Product). |
| googleRichResultEligible | boolean | True when any block satisfies a Google rich-result requirement set. |
| aiDiscoveryReadiness | object | {hasJsonLd, hasArticleSchema, hasFAQ, hasHowTo, hasOpenGraph, score 0-100}. |
| rawHtmlKvsKey | string | KVS key for raw HTML when includeRawHtml=true (else empty). |
| status | string | success, not_found, or error. |
| errorMsg | string | Error message on failure (empty on success). |
| extractedAt | string | ISO timestamp. |
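Because jsonLd entries are JSON-stringified and validationErrors are formatted strings, downstream code usually has to decode both. The sketch below shows one way to do that. The regular expression is an assumption derived from the documented `<severity> [<path>] (<type>) <message>` layout, and the sample error string is hypothetical (the README's sample row has no errors).

```python
import json
import re

# Assumed pattern for "<severity> [<path>] (<type>) <message>".
ERROR_PATTERN = re.compile(
    r"^(?P<severity>\w+) \[(?P<path>.*?)\] \((?P<type>[^)]*)\) (?P<message>.*)$"
)


def decode_row(row: dict) -> dict:
    """Turn stringified JSON-LD and formatted issue strings back into objects."""
    blocks = [json.loads(raw) for raw in row.get("jsonLd", [])]
    issues = []
    for line in row.get("validationErrors", []):
        match = ERROR_PATTERN.match(line)
        if match:
            issues.append(match.groupdict())
    return {"blocks": blocks, "issues": issues}


row = {
    "url": "https://www.apify.com",
    "jsonLd": ['{"@context":"https://schema.org","@type":"Organization","name":"Apify"}'],
    # Hypothetical issue string, shaped like the documented format.
    "validationErrors": ["warn [jsonLd[0].logo] (Organization) recommended field is missing"],
}

decoded = decode_row(row)
print(decoded["blocks"][0]["@type"], decoded["issues"])
```

Strings that do not match the assumed pattern are silently dropped here; a stricter consumer might want to log them instead.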
Pricing
Token charge — functionally free. Apify rejects truly $0 PPE events, so the per-record price is the smallest practical floor.
| Event | Price |
|---|---|
| Actor start | $0.10 |
| Per record | $0.0001 |
| Volume | Cost |
|---|---|
| 100 records | $0.11 |
| 1,000 records | $0.20 |
| 10,000 records | $1.10 |
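The volume figures follow from one actor-start charge plus the per-record price. A small sketch of that arithmetic, using `Decimal` to avoid floating-point rounding; the function name `estimate_cost` and the `runs` parameter are assumptions added for illustration:

```python
from decimal import Decimal

ACTOR_START = Decimal("0.10")    # price per actor start, from the event table
PER_RECORD = Decimal("0.0001")   # price per record, from the event table


def estimate_cost(records: int, runs: int = 1) -> Decimal:
    """Estimated charge in USD: one start fee per run plus the per-record fee."""
    return ACTOR_START * runs + PER_RECORD * records


for volume in (100, 1_000, 10_000):
    print(f"{volume:>6} records: ${estimate_cost(volume):.2f}")
```

The table corresponds to `runs=1`. If a job is split across several runs, each run presumably incurs its own start charge, which the `runs` parameter models.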
This actor is the cheap discovery primitive that pairs with paid downstream actors. Audit liberally.
Limits
- maxItems is capped at 15 per run (default 5), sized for the Apify tester's 5-minute timeout.
- The schema.org validator covers the common Google-rich-result types (Article, Recipe, Product, Event, FAQPage, HowTo, VideoObject). Other types parse but skip required-field validation.
- The actor uses HTTP fetch only. Sites that require JS rendering for structured data won't surface anything; pair with a render crawler upstream.
- includeRawHtml=true writes one KVS entry per URL. KVS quotas apply.
- Validation severity is internal: validationErrors strings start with error, warn, or info for downstream filtering.
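Two of these limits lend themselves to small client-side helpers: splitting a long URL list into runs of at most 15, and filtering validationErrors by their severity prefix. This is a sketch under those stated limits; the helper names are hypothetical and nothing here calls the Apify API.

```python
MAX_URLS_PER_RUN = 15  # upper bound of maxItems, per the limits above


def batch_urls(urls, size=MAX_URLS_PER_RUN):
    """Yield consecutive slices of at most `size` URLs, one slice per run."""
    if not 1 <= size <= MAX_URLS_PER_RUN:
        raise ValueError("size must be between 1 and 15")
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def filter_by_severity(validation_errors, severity):
    """Keep issue strings whose leading token is the given severity."""
    return [line for line in validation_errors if line.startswith(severity + " ")]


urls = [f"https://example.com/page-{i}" for i in range(40)]
batches = list(batch_urls(urls))
print([len(batch) for batch in batches])

# Hypothetical issue strings in the documented layout.
sample_errors = [
    "error [jsonLd[0].startDate] (Event) required field is missing",
    "warn [jsonLd[0].image] (Event) recommended field is missing",
]
print(filter_by_severity(sample_errors, "error"))
```

Each batch would be sent as its own run with maxItems set to the batch length.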
Related Actors
- Sitemap Walker Pro — feed discovered URLs straight into this validator for site-wide structured-data audits.
- SSL & Security Headers Checker — pair for full SEO + security audits per URL.
- Angular SSR State Extractor — for sites where the structured data lives inside Angular's TransferState payload.
Need More Features?
Need additional schema.org types, custom validation rules, or a render-crawler variant? File an issue or get in touch.
Why Use Structured Data Validator Pro?
- Functionally free: $0.0001 per record on top of the $0.10 actor start. Audit your whole sitemap and barely move the needle.
- Six formats, one pass — JSON-LD, Open Graph, Twitter Cards, microdata, RDFa, and meta tags in a single dataset row. Most tools cover one, maybe two.
- AI-discovery score baked in — rich-result eligibility plus an LLM-readiness score, so you know how the site reads to both Google and Claude.
Built by OrbTop.