Schema Markup Validator
Validate JSON-LD, Microdata, RDFa, Open Graph, and Twitter Cards across public pages and sitemaps for bulk structured-data SEO QA.
Developer: Stas Persiianenko (maintained by Community)
Bulk validate structured data, schema.org markup, JSON-LD, Microdata, RDFa, Open Graph, and Twitter Cards from public web pages.
Use this actor when you need repeatable SEO QA at scale: crawl a list of URLs, expand an XML sitemap, detect schema types, parse markup, and export page-level warnings before a release, migration, or content audit.
What does Schema Markup Validator do?
Schema Markup Validator fetches public HTML pages and inspects the markup that search engines and social platforms use.
It extracts and validates:
- ✅ JSON-LD blocks from `application/ld+json` scripts
- ✅ Microdata entities from `itemscope` and `itemprop`
- ✅ RDFa entities from `typeof` and `property`
- ✅ schema.org type names across all detected formats
- ✅ Open Graph meta tags such as `og:title` and `og:description`
- ✅ Twitter Card tags such as `twitter:card`
- ✅ JSON parse errors and missing context/type warnings
- ✅ local rich-result readiness hints for common schema types
The output is one dataset row per URL, which makes it easy to export to CSV, JSON, Google Sheets, BI tools, or automated QA pipelines.
Who is it for?
This actor is built for teams that manage SEO-critical websites and need consistent structured-data checks.
- 🔍 Technical SEO agencies auditing client templates
- 📰 Publishers validating Article and NewsArticle pages
- 🛒 Ecommerce teams checking Product schema before launches
- 🏢 Local SEO teams checking LocalBusiness and Organization markup
- 🧑‍💻 Developers adding schema.org markup to templates
- 📈 Growth teams monitoring regression after CMS changes
- 🧪 QA teams adding SEO checks to release workflows
Why use it?
Manual validators are useful for one page, but they are slow for dozens or thousands of pages.
Schema Markup Validator is designed for repeatable bulk audits:
- Run the same validation before every release
- Compare schema coverage across page templates
- Spot invalid JSON-LD in large URL lists
- Export issues to spreadsheets or ticket systems
- Monitor important pages after CMS or theme changes
- Validate social-card metadata alongside schema markup
Data you can extract
| Field | Description |
|---|---|
| `url` | Input page URL |
| `finalUrl` | Final URL after redirects |
| `statusCode` | HTTP response status |
| `pageTitle` | HTML title text |
| `canonicalUrl` | Canonical link URL when present |
| `schemaTypes` | Detected schema.org types |
| `jsonLdCount` | Number of JSON-LD blocks |
| `microdataCount` | Number of Microdata entities |
| `rdfaCount` | Number of RDFa entities |
| `openGraphCount` | Number of Open Graph tags |
| `twitterCardCount` | Number of Twitter Card tags |
| `errors` | Blocking validation errors |
| `warnings` | Non-blocking quality warnings |
| `richResultHints` | Local required/recommended field hints |
| `jsonLd` | Parsed JSON-LD blocks |
| `microdata` | Extracted Microdata entities |
| `rdfa` | Extracted RDFa entities |
| `openGraph` | Open Graph metadata |
| `twitterCard` | Twitter Card metadata |
| `rawMarkup` | Optional raw snippets for debugging |
| `fetchedAt` | Validation timestamp |
How much does it cost to validate schema markup?
This actor uses pay-per-event pricing.
You pay a small run-start fee and then a per-page validation fee for each dataset row produced.
The exact live prices are shown on the Apify Store pricing tab. The actor is designed as an HTTP-first tool, so it avoids browser automation by default and keeps validation runs inexpensive.
Cost-control tips:
- Start with 10-25 representative URLs
- Use `maxPages` when testing sitemaps
- Disable raw markup output for smaller exports
- Crawl links only when you need discovery
- Use sitemaps for controlled bulk validation
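The pricing model above (one run-start fee plus one validation fee per dataset row) can be turned into a rough budget check before a large run. The sketch below is only an illustration: the function name is hypothetical and the prices passed in are placeholders, not the actor's real prices, which are shown on the Apify Store pricing tab.

```python
from decimal import Decimal


def estimate_run_cost(pages: int, run_start_fee: Decimal, per_page_fee: Decimal) -> Decimal:
    """Estimate cost as one run-start event plus one validation event per row."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return run_start_fee + per_page_fee * pages


# Placeholder prices for illustration only; they are NOT the actor's real prices.
EXAMPLE_START_FEE = Decimal("0.01")
EXAMPLE_PAGE_FEE = Decimal("0.002")

print(estimate_run_cost(25, EXAMPLE_START_FEE, EXAMPLE_PAGE_FEE))
```

Using `Decimal` rather than `float` keeps small per-event prices from picking up rounding noise when they are multiplied by thousands of pages.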
How to use Schema Markup Validator
- Add page URLs to `startUrls`.
- Optionally add XML sitemap URLs to `sitemapUrls`.
- Set `maxPages` to the number of pages you want to validate.
- Keep `crawlLinks` disabled unless you want link discovery.
- Run the actor.
- Open the dataset table.
- Filter rows with errors or warnings.
- Export the dataset to CSV, JSON, XLSX, or your integration target.
Input example
    {
      "startUrls": [
        { "url": "https://schema.org/Article" },
        { "url": "https://schema.org/Product" }
      ],
      "maxPages": 25,
      "includeRawMarkup": false,
      "validateRichResultHints": true
    }
Sitemap input example
    {
      "startUrls": [{ "url": "https://example.com/" }],
      "sitemapUrls": [{ "url": "https://example.com/sitemap.xml" }],
      "maxPages": 100,
      "crawlLinks": false
    }
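To make the interaction between `sitemapUrls` and `maxPages` concrete, here is a minimal sketch of how a sitemap could be expanded into a capped URL list. It is not the actor's implementation: `expand_sitemap` is a hypothetical helper, it works on XML text that has already been downloaded, and it assumes a plain `<urlset>` document rather than a sitemap index.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def expand_sitemap(xml_text: str, max_pages: int) -> list[str]:
    """Return at most max_pages page URLs from a <urlset> sitemap document."""
    root = ET.fromstring(xml_text)
    urls: list[str] = []
    # Only <url><loc> entries are read; <sitemap><loc> index entries are ignored.
    for loc in root.findall(f"{SITEMAP_NS}url/{SITEMAP_NS}loc"):
        if len(urls) >= max_pages:
            break
        if loc.text and loc.text.strip():
            urls.append(loc.text.strip())
    return urls


SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/a</loc></url>
  <url><loc>https://example.com/b</loc></url>
</urlset>"""

print(expand_sitemap(SAMPLE_SITEMAP, max_pages=2))
```

Capping while iterating, instead of slicing afterwards, keeps memory flat on very large sitemaps.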
Output example
    {
      "url": "https://schema.org/Article",
      "finalUrl": "https://schema.org/Article",
      "statusCode": 200,
      "pageTitle": "Article - Schema.org Type",
      "canonicalUrl": "https://schema.org/Article",
      "schemaTypes": ["Article"],
      "jsonLdCount": 1,
      "microdataCount": 0,
      "rdfaCount": 0,
      "openGraphCount": 3,
      "twitterCardCount": 2,
      "errors": [],
      "warnings": [],
      "richResultHints": [
        {
          "type": "Article",
          "eligible": false,
          "missingRequired": ["headline", "image"],
          "missingRecommended": ["publisher"]
        }
      ]
    }
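Because every row carries `errors` and `warnings` arrays, filtering a downloaded dataset for pages that need attention takes only a few lines. The sketch below assumes rows shaped like the output example; the helper names and the sample rows are made up for illustration, and the entries inside `errors` are only counted because their exact format is not documented here.

```python
import csv
import io


def rows_needing_attention(items):
    """Keep dataset rows that report at least one error or warning."""
    return [row for row in items if row.get("errors") or row.get("warnings")]


def to_csv(rows):
    """Serialize a compact per-page summary that is easy to paste into a ticket."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["url", "statusCode", "errorCount", "warningCount", "schemaTypes"])
    for row in rows:
        writer.writerow([
            row.get("url", ""),
            row.get("statusCode", ""),
            len(row.get("errors") or []),
            len(row.get("warnings") or []),
            ";".join(row.get("schemaTypes") or []),
        ])
    return buffer.getvalue()


# Made-up rows that follow the shape of the output example.
SAMPLE_ITEMS = [
    {"url": "https://example.com/ok", "statusCode": 200,
     "schemaTypes": ["Article"], "errors": [], "warnings": []},
    {"url": "https://example.com/bad", "statusCode": 200,
     "schemaTypes": [], "errors": ["placeholder error"], "warnings": []},
]

print(to_csv(rows_needing_attention(SAMPLE_ITEMS)))
```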
JSON-LD validation
The actor parses every application/ld+json script block independently.
It reports:
- invalid JSON syntax
- missing `@context`
- missing `@type` or `@graph`
- detected schema.org types
- optional raw block text
This helps teams find broken template snippets without waiting for a search-engine recrawl.
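The checks listed above can be approximated locally with the Python standard library. This is a minimal sketch of the idea rather than the actor's code: the class and function names are hypothetical, and the message strings are illustrative, not the actor's actual error texts.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._in_block = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").strip().lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_block = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_block:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_block:
            self.blocks.append("".join(self._buffer))
            self._in_block = False


def check_block(raw):
    """Return a list of issues for one JSON-LD block (empty list means clean)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON syntax: {exc.msg}"]
    issues = []
    for node in data if isinstance(data, list) else [data]:
        if not isinstance(node, dict):
            issues.append("block is not a JSON object")
            continue
        if "@context" not in node:
            issues.append("missing @context")
        if "@type" not in node and "@graph" not in node:
            issues.append("missing @type or @graph")
    return issues


def validate_json_ld(html):
    """Check every JSON-LD block independently, in document order."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    return [check_block(raw) for raw in collector.blocks]


SAMPLE_PAGE = """
<script type="application/ld+json">{"@context": "https://schema.org", "@type": "Article"}</script>
<script type="application/ld+json">{"@type": "Product"}</script>
<script type="application/ld+json">{"@context": "https://schema.org",}</script>
"""

for issues in validate_json_ld(SAMPLE_PAGE):
    print(issues)
```

Checking each block on its own means one broken snippet (the trailing comma in the third block) does not hide problems in the others.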
Microdata validation
The actor extracts Microdata from elements with itemscope, itemtype, itemid, and itemprop.
Each entity includes the item type, optional ID, and detected properties.
This is useful for older templates, ecommerce themes, and CMS plugins that still generate Microdata instead of JSON-LD.
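As a rough illustration of what extracting Microdata entities involves, the sketch below walks the HTML with the standard-library parser and records each `itemscope` with its type, optional ID, and property names. It is an assumption-laden simplification, not the actor's extractor: the class name is hypothetical, property values are not captured, and it expects well-formed markup with explicit closing tags.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class MicrodataCollector(HTMLParser):
    """Record each itemscope as {"type", "id", "properties"} in document order."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._open = []   # stack of (depth at which the scope opened, item)
        self._depth = 0

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        is_void = tag in VOID_TAGS
        if not is_void:
            self._depth += 1
        # An itemprop belongs to the nearest enclosing scope, even when the
        # same element also opens a nested scope of its own.
        if "itemprop" in attr and self._open:
            names = (attr["itemprop"] or "").split()
            self._open[-1][1]["properties"].extend(names)
        if "itemscope" in attr:
            item = {"type": attr.get("itemtype"), "id": attr.get("itemid"),
                    "properties": []}
            self.items.append(item)
            if not is_void:
                self._open.append((self._depth, item))

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._open and self._open[-1][0] == self._depth:
            self._open.pop()
        self._depth = max(0, self._depth - 1)


def extract_microdata(html):
    collector = MicrodataCollector()
    collector.feed(html)
    collector.close()
    return collector.items


SAMPLE_PAGE = """
<div itemscope itemtype="https://schema.org/Product" itemid="#sku-1">
  <span itemprop="name">Widget</span>
  <meta itemprop="sku" content="W-1">
  <div itemprop="offers" itemscope itemtype="https://schema.org/Offer">
    <span itemprop="price">9.99</span>
  </div>
</div>
"""

for item in extract_microdata(SAMPLE_PAGE):
    print(item)
```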
RDFa validation
The actor extracts RDFa-like entities from elements with typeof, property, resource, and about attributes.
RDFa is less common than JSON-LD, but many older sites and semantic templates still use it.
Open Graph checks
Open Graph tags control link previews on platforms such as Facebook, LinkedIn, Slack, and many messaging apps.
The actor extracts all og:* properties and warns when common core fields are missing.
Common tags include:
- `og:title`
- `og:description`
- `og:image`
- `og:url`
- `og:type`
Twitter Card checks
Twitter Card metadata controls previews on X/Twitter and other tools that read twitter:* tags.
The actor extracts all Twitter Card tags and warns when twitter:card is missing.
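The Open Graph and Twitter Card checks described above boil down to collecting `<meta>` tags and comparing them against a short list of expected keys. The sketch below shows one way to do that. It assumes the "common core fields" are the five `og:*` tags listed in the Open Graph section, which may differ from the actor's actual list, and all names in it are hypothetical.

```python
from html.parser import HTMLParser

# Assumption: treat the five commonly used Open Graph tags as the core set.
CORE_OG_TAGS = ("og:title", "og:description", "og:image", "og:url", "og:type")


class SocialMetaCollector(HTMLParser):
    """Collect og:* and twitter:* meta tags, keeping the first value per key."""

    def __init__(self):
        super().__init__()
        self.open_graph = {}
        self.twitter = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Open Graph normally uses property=, Twitter Cards normally use name=.
        key = (attr.get("property") or attr.get("name") or "").strip().lower()
        content = attr.get("content") or ""
        if key.startswith("og:"):
            self.open_graph.setdefault(key, content)
        elif key.startswith("twitter:"):
            self.twitter.setdefault(key, content)


def social_warnings(html):
    """Return warnings for missing core Open Graph tags and twitter:card."""
    collector = SocialMetaCollector()
    collector.feed(html)
    collector.close()
    warnings = [f"missing {tag}" for tag in CORE_OG_TAGS
                if tag not in collector.open_graph]
    if "twitter:card" not in collector.twitter:
        warnings.append("missing twitter:card")
    return warnings


SAMPLE_PAGE = """
<meta property="og:title" content="Example">
<meta property="og:type" content="article">
<meta name="twitter:title" content="Example">
"""

print(social_warnings(SAMPLE_PAGE))
```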
Rich-result hints
The actor includes deterministic local hints for common schema types.
Supported hint families include:
- Article
- NewsArticle
- BlogPosting
- Product
- LocalBusiness
- Organization
- FAQPage
- HowTo
- Recipe
- Event
- JobPosting
- BreadcrumbList
These hints are not a replacement for Google's official tools. They are fast local checks for common required and recommended fields.
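A deterministic local hint is essentially a lookup table of required and recommended fields per type. The sketch below illustrates that shape. The rule table is hypothetical and deliberately tiny: its `Article` entry was chosen only so the result mirrors the output example, and it does not reflect the actor's real rules or Google's requirements.

```python
# Hypothetical rule table for illustration; NOT the actor's or Google's real rules.
HINT_RULES = {
    "Article": {"required": ["headline", "image"], "recommended": ["publisher"]},
}


def rich_result_hint(entity):
    """Return a hint dict for the first @type that has a rule, else None."""
    declared = entity.get("@type")
    types = declared if isinstance(declared, list) else [declared]
    for schema_type in types:
        rule = HINT_RULES.get(schema_type) if isinstance(schema_type, str) else None
        if rule is None:
            continue
        missing_required = [f for f in rule["required"] if not entity.get(f)]
        missing_recommended = [f for f in rule["recommended"] if not entity.get(f)]
        return {
            "type": schema_type,
            "eligible": not missing_required,
            "missingRequired": missing_required,
            "missingRecommended": missing_recommended,
        }
    return None


print(rich_result_hint({"@context": "https://schema.org", "@type": "Article"}))
```

Because the check is a pure function of the parsed entity, the same input always yields the same hint, which is what makes results comparable between runs.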
Integrations
You can connect Schema Markup Validator to many workflows:
- Send dataset rows to Google Sheets for SEO review
- Trigger Slack alerts when errors appear
- Store historical validation exports in S3
- Compare staging and production templates
- Add structured-data checks to release QA
- Feed warnings into Jira, Linear, or GitHub issues
- Monitor high-value product or article pages weekly
API usage with Node.js
    import { ApifyClient } from 'apify-client';

    const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

    const run = await client.actor('automation-lab/schema-markup-validator').call({
        startUrls: [{ url: 'https://schema.org/Article' }],
        maxPages: 10,
    });

    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    console.log(items);
API usage with Python
    from apify_client import ApifyClient

    client = ApifyClient('YOUR_APIFY_TOKEN')

    run = client.actor('automation-lab/schema-markup-validator').call(run_input={
        'startUrls': [{'url': 'https://schema.org/Article'}],
        'maxPages': 10,
    })

    items = client.dataset(run['defaultDatasetId']).list_items().items
    print(items)
API usage with cURL
    curl -X POST "https://api.apify.com/v2/acts/automation-lab~schema-markup-validator/runs?token=$APIFY_TOKEN" \
      -H 'Content-Type: application/json' \
      -d '{"startUrls":[{"url":"https://schema.org/Article"}],"maxPages":10}'
MCP usage
Use the actor from Claude Desktop, Claude Code, or other MCP-compatible clients through Apify MCP Server.
MCP endpoint:
https://mcp.apify.com/?tools=automation-lab/schema-markup-validator
Claude Code setup:
    claude mcp add apify-schema-validator --transport http "https://mcp.apify.com/?tools=automation-lab/schema-markup-validator"
Claude Desktop JSON config:
    {
      "mcpServers": {
        "apify-schema-validator": {
          "url": "https://mcp.apify.com/?tools=automation-lab/schema-markup-validator"
        }
      }
    }
Example prompts:
- "Validate schema markup for these 20 product URLs and summarize missing fields."
- "Check whether our article pages have valid JSON-LD and Open Graph tags."
- "Audit this sitemap and give me a CSV of pages missing Twitter Cards."
Tips for best results
- Validate representative template URLs first.
- Use sitemaps for controlled bulk audits.
- Keep `includeRawMarkup` off unless you need debugging snippets.
- Use `crawlLinks` only for small site discovery runs.
- Treat rich-result hints as local guidance, not official Google eligibility.
- Export results and track error counts over time.
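For tracking results over time, two exports of the same URL list can be compared to surface regressions. The sketch below is one possible approach with a hypothetical helper name and made-up sample rows: it flags URLs that lost a schema type or gained errors between a previous and a current export.

```python
def find_regressions(previous, current):
    """Compare two exports by URL and report lost schema types or new errors."""
    before = {row["url"]: row for row in previous}
    regressions = []
    for row in current:
        old = before.get(row["url"])
        if old is None:
            continue  # URL was not in the previous export, nothing to compare
        lost_types = sorted(set(old.get("schemaTypes") or [])
                            - set(row.get("schemaTypes") or []))
        added_errors = len(row.get("errors") or []) - len(old.get("errors") or [])
        if lost_types or added_errors > 0:
            regressions.append({
                "url": row["url"],
                "lostTypes": lost_types,
                "newErrors": max(added_errors, 0),
            })
    return regressions


# Made-up rows that follow the shape of the output example.
PREVIOUS_EXPORT = [
    {"url": "https://example.com/p", "schemaTypes": ["Product", "BreadcrumbList"], "errors": []},
    {"url": "https://example.com/a", "schemaTypes": ["Article"], "errors": []},
]
CURRENT_EXPORT = [
    {"url": "https://example.com/p", "schemaTypes": ["BreadcrumbList"], "errors": ["placeholder error"]},
    {"url": "https://example.com/a", "schemaTypes": ["Article"], "errors": []},
]

print(find_regressions(PREVIOUS_EXPORT, CURRENT_EXPORT))
```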
Troubleshooting
Why do I see no structured data?
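The page may not include schema markup in its server-rendered HTML, or it may generate markup only in the browser after JavaScript runs. Because this actor is HTTP-first for cost and reliability and avoids browser automation by default, markup that is injected client-side does not appear in the results.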
Why does a page show rich-result warnings even with schema present?
The actor checks common required and recommended fields for popular schema types. A warning means the detected entity may be missing fields commonly expected for that rich-result family.
Why did a URL return status code 0?
Status code 0 means the request failed before a normal HTTP response was available. Check whether the site blocks automated requests, redirects unusually, or requires login.
Legality and ethical use
This actor validates public page markup supplied by the user. Use it only on websites you are allowed to audit and follow the target site's terms, robots policies, and applicable laws.
Do not use it to overload websites. Keep maxPages reasonable and run recurring audits at responsible intervals.
Related scrapers and SEO tools
Explore other automation-lab actors on Apify:
- https://apify.com/automation-lab/website-contact-finder
- https://apify.com/automation-lab/google-maps-lead-finder
- https://apify.com/automation-lab/bulk-url-status-checker
Changelog
- Initial version: HTTP-first schema.org, JSON-LD, Microdata, RDFa, Open Graph, and Twitter Card validation.
Support
If you need a validation field that is not included yet, open an issue on the Apify actor page with an example URL and the expected output.