Pricing

$0.05 / 1,000 validated pages

Schema Markup Validator

Validate schema markup on public pages. Extract JSON-LD, Microdata, RDFa, Open Graph, Twitter Cards, meta tags, schema.org types, issue counts, and rich-result readiness signals.

Developer

Maxime Dupré

Last modified: 23 days ago

🔎 Schema markup validator for structured data

Schema Markup Validator checks public web pages for structured data and returns one clean page audit per successful URL. Add page URLs such as https://schema.org/Article and choose whether to audit only the submitted URLs or to follow same-site links. The dataset then contains JSON-LD, Microdata, RDFa, schema.org types, Open Graph, Twitter Cards, meta tags, validation issues, and rich-result readiness signals.

Use this structured data validator when you need to debug rich-result markup, compare pages during an SEO audit, check JSON-LD syntax, or collect schema evidence before a release. The Actor runs on public pages, so you do not need to supply cookies, website credentials, API keys, or a separate account for the sites you audit.

✅ What this Actor does

Accepts public page URLs in a batch.
Can check only the submitted URLs or follow same-site links for a small site audit.
Extracts JSON-LD blocks and checks JSON syntax, schema.org context, and schema types.
Extracts Microdata and RDFa items when the audit focus includes schema.org markup.
Extracts Open Graph, Twitter Card, canonical, and core meta tag data in the full audit.
Reports detected schema.org types, structured-data counts, issue counts, and issue details.
Adds transparent rich-result readiness signals for common types such as Article, Product, Recipe, FAQPage, HowTo, Event, Organization, and LocalBusiness.
Saves one dataset row per successfully fetched page audit.

This Actor is focused on schema markup validation and structured-data extraction. It is not a Lighthouse audit, page-speed checker, broken-link crawler, sitemap indexability audit, or full technical SEO scanner.
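The JSON-LD step in the list above can be illustrated with a small standalone sketch. It is not the Actor's implementation: the `JsonLdCollector` class, the `audit_json_ld` function, and the issue codes (`invalid-json`, `non-schema-org-context`, `missing-type`) are hypothetical names chosen for illustration, and only top-level JSON objects are inspected.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").strip().lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self.blocks.append("".join(self._buffer))
            self._in_json_ld = False


def audit_json_ld(html):
    """Return one summary dict per JSON-LD block (hypothetical shape)."""
    collector = JsonLdCollector()
    collector.feed(html)
    results = []
    for index, raw in enumerate(collector.blocks, start=1):
        entry = {"index": index, "valid": True, "context": None, "types": [], "issues": []}
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            # Syntax check: record the parser message and skip further checks.
            entry["valid"] = False
            entry["issues"].append(f"invalid-json: {exc.msg}")
            results.append(entry)
            continue
        # Assumption: only a top-level object is inspected; arrays and @graph are ignored.
        if isinstance(data, dict):
            entry["context"] = data.get("@context")
            declared = data.get("@type", [])
            entry["types"] = [declared] if isinstance(declared, str) else list(declared)
            if "schema.org" not in str(entry["context"]):
                entry["issues"].append("non-schema-org-context")
            if not entry["types"]:
                entry["issues"].append("missing-type")
        results.append(entry)
    return results
```

Passing a page's HTML string to `audit_json_ld` returns a list with one entry per block, which roughly mirrors the `jsonLd` field described under "Data you get".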

📊 Data you get

Each dataset row is one successful page audit. Rows can include:

url, title, canonicalUrl, statusCode, contentType, and crawlDepth
schemaTypes found across JSON-LD, Microdata, and RDFa
markupSummary with counts for JSON-LD blocks, Microdata items, RDFa items, Open Graph properties, Twitter Card properties, and meta tags
validationStatus and issueCounts for quick filtering
issues with severity, code, message, format, schema type, property, and evidence
jsonLd parsed blocks with validity, context, detected types, source data, and block-level issues
microdata and rdfa extracted items with types, properties, and item issues
metadata with Open Graph, Twitter Card, and meta tag values
richResults with readiness level, reasons, candidate types, missing fields, and issue codes

You can export the dataset as JSON, CSV, Excel, XML, RSS, or HTML, or read the same rows through the Apify API, schedules, webhooks, and integrations.
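Once rows are exported or read through the API, the issueCounts field can drive a quick triage. The following is a minimal sketch, assuming the rows are already loaded as Python dicts in the shape listed above; `rows_needing_attention` is a hypothetical helper name and not part of the Actor.

```python
def rows_needing_attention(rows):
    """Return (url, errors, warnings) for rows with errors or warnings, worst first."""
    flagged = []
    for row in rows:
        counts = row.get("issueCounts") or {}
        errors = counts.get("errors", 0)
        warnings = counts.get("warnings", 0)
        if errors or warnings:
            flagged.append((row.get("url"), errors, warnings))
    # Sort by error count, then by warning count, both descending.
    return sorted(flagged, key=lambda item: (-item[1], -item[2]))
```

Rows that only carry info-level issues are left out on purpose, so the result stays short enough to review by hand.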

🚀 How to run it

Add one or more public page URLs in Page URLs.
Keep Audit focus on Full structured-data audit for the broadest output.
Choose Submitted URLs only for exact page checks.
Choose Follow same-site links when you want a bounded same-site schema audit.
Set Maximum pages to control output size and cost.
Run the Actor and open the dataset.

For a quick first run, keep the prefilled schema.org URLs. They are public pages with structured data, so you can inspect the output shape quickly before adding your own website.

⚙️ Input example

{
	"startUrls": [
		{ "url": "https://schema.org/Article" },
		{ "url": "https://schema.org/Product" }
	],
	"auditFocus": "full",
	"crawlScope": "submittedUrls",
	"maxPages": 25
}

🎯 Audit focus

Use full when you want JSON-LD, Microdata, RDFa, Open Graph, Twitter Cards, canonical, and meta tags. Use schemaOrg when you only want schema.org markup surfaces. Use jsonLd for a focused JSON-LD validator run.

🧭 Crawl scope

Use submittedUrls when your input list already contains every page you want to check. Use sameSite when one submitted page should discover more pages on the same website. maxPages caps the total successful page audits saved by the run.
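If you generate the input programmatically, a small builder can catch typos in the two option fields before a run starts. This is a sketch only: `build_input` is a hypothetical helper, the allowed values are the ones named in the two sections above, and the positive-integer check on maxPages is an assumption rather than a documented rule.

```python
AUDIT_FOCUS_VALUES = {"full", "schemaOrg", "jsonLd"}
CRAWL_SCOPE_VALUES = {"submittedUrls", "sameSite"}


def build_input(urls, audit_focus="full", crawl_scope="submittedUrls", max_pages=25):
    """Build an input dict in the shape of the input example."""
    if audit_focus not in AUDIT_FOCUS_VALUES:
        raise ValueError(f"unknown auditFocus: {audit_focus!r}")
    if crawl_scope not in CRAWL_SCOPE_VALUES:
        raise ValueError(f"unknown crawlScope: {crawl_scope!r}")
    # Assumption: maxPages must be a positive integer.
    if not isinstance(max_pages, int) or max_pages < 1:
        raise ValueError("maxPages must be a positive integer")
    return {
        "startUrls": [{"url": url} for url in urls],
        "auditFocus": audit_focus,
        "crawlScope": crawl_scope,
        "maxPages": max_pages,
    }
```

For example, `build_input(["https://schema.org/Article"], audit_focus="jsonLd")` produces a focused JSON-LD run over a single submitted URL.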

📦 Output example

{
	"url": "https://schema.org/Article",
	"title": "Article - Schema.org Type",
	"canonicalUrl": "https://schema.org/Article",
	"statusCode": 200,
	"contentType": "text/html",
	"crawlDepth": 0,
	"schemaTypes": ["Article"],
	"markupSummary": {
		"hasStructuredData": true,
		"jsonLdBlocks": 1,
		"microdataItems": 0,
		"rdfaItems": 0,
		"openGraphProperties": 0,
		"twitterCardProperties": 0,
		"metaTags": 2
	},
	"validationStatus": "warning",
	"issueCounts": {
		"errors": 0,
		"warnings": 1,
		"info": 0
	},
	"issues": [
		{
			"severity": "warning",
			"code": "missing-recommended-property",
			"message": "Article is missing the recommended image property.",
			"format": null,
			"schemaType": "Article",
			"property": "image",
			"evidence": null
		}
	],
	"jsonLd": [
		{
			"index": 1,
			"valid": true,
			"context": "https://schema.org",
			"types": ["Article"],
			"data": {
				"@context": "https://schema.org",
				"@type": "Article",
				"headline": "Schema.org Article"
			},
			"issues": []
		}
	],
	"microdata": [],
	"rdfa": [],
	"metadata": {
		"openGraph": {},
		"twitterCard": {},
		"metaTags": {
			"description": "Schema.org page description"
		}
	},
	"richResults": {
		"readiness": {
			"level": "needsFixes",
			"reasons": ["Article is missing recommended image."]
		},
		"candidates": [
			{
				"type": "Article",
				"eligible": true,
				"requiredMissing": [],
				"recommendedMissing": ["image"],
				"issueCodes": ["missing-recommended-property"]
			}
		]
	}
}
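To turn a row like the one above into a fix list, you can read the richResults candidates directly. This is a minimal sketch, assuming the row has the shape shown in the output example; `missing_fields_by_type` is a hypothetical helper name.

```python
def missing_fields_by_type(row):
    """Map each rich-result candidate type to its missing required and recommended fields."""
    summary = {}
    candidates = (row.get("richResults") or {}).get("candidates", [])
    for candidate in candidates:
        summary[candidate["type"]] = {
            "required": list(candidate.get("requiredMissing", [])),
            "recommended": list(candidate.get("recommendedMissing", [])),
        }
    return summary
```

Applied to the example row, the Article candidate has no missing required fields and one missing recommended field, image.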

💳 Pricing

This Actor uses pay-per-event pricing. Each successful page audit saved to the dataset triggers one page-validated event, and that event is what you are charged for. Pages that cannot be fetched or audited are logged as handled non-results and are not saved as dataset rows.
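As a rough worked example of the arithmetic, the sketch below assumes the listed price of $0.05 per 1,000 validated pages and that only saved page audits are billed. `estimate_cost` is a hypothetical helper, and the constant should be updated if the listed price changes.

```python
from decimal import Decimal

# Assumption: the listed price of $0.05 per 1,000 validated pages.
PRICE_PER_1000_PAGES = Decimal("0.05")


def estimate_cost(validated_pages):
    """Estimate the page-validated charge in USD for a number of saved page audits."""
    if validated_pages < 0:
        raise ValueError("validated_pages must not be negative")
    return PRICE_PER_1000_PAGES * validated_pages / 1000
```

Because maxPages caps the number of saved audits, `estimate_cost(max_pages)` gives an upper bound for this event on a single run. With the default of 25 pages that bound is $0.00125.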

⚠️ Limits and caveats

Pages must be public and reachable over http or https.
The Actor checks markup present in the fetched HTML. Markup that appears only behind a private login or in unsupported client-side states may not be visible.
Rich-result readiness is a deterministic markup check, not a Google Search Console verdict and not an AI-generated recommendation.
Same-site crawling is bounded by maxPages and follows same-origin links from submitted pages.
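To predict which links a same-site run could follow, it helps to see what "same-origin" usually means: the same scheme, host, and port. The sketch below applies that standard definition with the Python standard library. It is an assumption about the rule, not the Actor's actual link filter, and `origin` and `same_origin_links` are hypothetical names.

```python
from urllib.parse import urljoin, urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that defines a URL's origin."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port or DEFAULT_PORTS.get(parts.scheme))


def same_origin_links(page_url, hrefs):
    """Resolve hrefs against page_url and keep only http(s) links on the same origin."""
    base = origin(page_url)
    kept = []
    for href in hrefs:
        absolute = urljoin(page_url, href)
        if urlsplit(absolute).scheme in DEFAULT_PORTS and origin(absolute) == base:
            kept.append(absolute)
    return kept
```

Under this definition a subdomain such as blog.example.com and the http version of an https page both count as different origins, so curated URL lists are the safer choice when a site spans several hosts.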

❓ FAQ

🧪 Can this replace Google's Rich Results Test?

Not fully. Use this Actor for batch audits, exports, API workflows, and structured-data evidence, and treat Google's tools as the final authority for Google-specific display eligibility.

🧩 Does it validate only JSON-LD?

No. The full audit extracts JSON-LD, Microdata, RDFa, Open Graph, Twitter Cards, canonical, and meta tags. Choose jsonLd when you want a focused JSON-LD validator run.

🌐 Can I audit a whole website?

You can start from one or more pages and choose same-site crawling with a page limit. For very large websites, use smaller batches or curated URL lists so each run stays easy to review.
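Splitting a curated URL list into smaller batches takes only a few lines. This is a minimal sketch: `batches` is a hypothetical helper, and the batch size is your own choice rather than an Actor setting.

```python
def batches(urls, size):
    """Yield consecutive slices of urls, each with at most size items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(urls), size):
        yield urls[start:start + size]
```

Each batch can then be submitted as its own run with crawlScope set to submittedUrls, which keeps every dataset small enough to review.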

📝 Changelog

0.1: Initial release.

🆘 Support

For issues, questions, or feature requests, file a ticket and I'll fix or implement it in less than 24h 🫡

🔗 Other actors

Sitemap Sniffer - Find sitemap files and sitemap URL inventories before SEO audits.
Website URL Crawler - Build URL inventories from public websites, rendered links, and sitemaps.
Font Detector - Audit website fonts, font files, and typography evidence from public pages.
Ahrefs Free Website Stats Scraper - Collect public Ahrefs website metrics for domain research.
SEMrush Free Website Stats Scraper - Collect public SEMrush overview metrics for SEO research.

Made with ❤️ by Maxime Dupré
