
Structured Data Scraper (Schema.org)
Pricing
Pay per event

Fast, lightweight scraper that extracts structured data (JSON-LD & microdata) from HTML pages. Ideal for e-commerce and sites that embed schema.org markup without heavy client-side rendering.
Last modified
8 days ago
Fast scraper optimized for sites that embed schema.org structured data without relying on heavy client-side rendering. It is a good fit for e-commerce sites.
Speed first: the scraper is lightweight because it parses static HTML instead of launching a browser. Pages that require client-side rendering may need a headless browser (for example Playwright or Puppeteer).
What you get
- Schema.org payloads collected from JSON-LD <script> tags and microdata attributes.
- Final URL, status code, and page title for quick validation.
- Dataset output suitable for feeding into validation tools or downstream pipelines.
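As a rough illustration of what collecting JSON-LD from static HTML involves, the sketch below uses only the Python standard library. It is not the Actor's own code: the class name JsonLdExtractor and the helper extract_json_ld are hypothetical, and silently skipping malformed blocks is an assumption about reasonable behaviour.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects parsed payloads from <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            script_type = (dict(attrs).get("type") or "").strip().lower()
            self._in_jsonld = script_type == "application/ld+json"
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Assumption: malformed blocks are skipped, not fatal.


def extract_json_ld(html):
    """Return every JSON-LD payload found in the given HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks
```

Because no browser is launched, this only sees markup present in the initial HTML response; JSON-LD injected later by client-side scripts would be missed.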
Input
Provide at least one URL via url (string, array, or Apify request object) or urls (array). Optional settings:
- maxRequestsPerCrawl: stop the crawl after N requests (defaults to the number of provided URLs).
- proxyConfiguration: standard Apify proxy configuration block.
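The accepted input shapes can be pictured with a small normalisation helper. This is a minimal sketch under assumptions, not the Actor's implementation: the function name normalize_input is hypothetical, and request objects are assumed to carry the address under a "url" key, as in the sample input.

```python
def normalize_input(actor_input):
    """Flatten `url` / `urls` into a list of URL strings plus a request cap."""
    raw = []
    for key in ("url", "urls"):
        value = actor_input.get(key)
        if value is None:
            continue
        # A single string or request object is treated as a one-element list.
        raw.extend(value if isinstance(value, list) else [value])

    urls = []
    for entry in raw:
        # Assumption: request objects hold the address under a "url" key.
        target = entry.get("url") if isinstance(entry, dict) else entry
        if isinstance(target, str) and target.strip():
            urls.append(target.strip())

    if not urls:
        raise ValueError("Provide at least one URL via 'url' or 'urls'.")

    # Default the cap to the number of provided URLs, as described above.
    max_requests = actor_input.get("maxRequestsPerCrawl") or len(urls)
    return urls, max_requests
```

For example, passing {"url": "https://example.com"} yields a single-URL list with a cap of 1.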
Output
Each dataset item contains:
- inputUrl, loadedUrl, statusCode, title, retrievedAt
- schema.jsonLd: parsed JSON-LD blocks
- schema.microdata: microdata trees normalised into nested objects
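To make "microdata trees normalised into nested objects" more concrete, here is one possible way to nest itemscope / itemprop markup using the standard library. The class MicrodataParser, the helper extract_microdata, and the {"type", "properties"} object shape are all assumptions for illustration; the Actor's actual output shape may differ.

```python
from html.parser import HTMLParser

VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class MicrodataParser(HTMLParser):
    """Builds nested dicts from itemscope / itemprop attributes."""

    def __init__(self):
        super().__init__()
        self.items = []    # top-level items
        self._open = []    # frames for currently open elements
        self._scopes = []  # items currently being filled

    def _attach(self, prop, value):
        names = (prop or "").split()
        if names and self._scopes:
            for name in names:
                self._scopes[-1]["properties"].setdefault(name, []).append(value)
        elif isinstance(value, dict):
            self.items.append(value)  # item with no enclosing scope

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("itemprop")
        frame = {"tag": tag, "scope": False, "prop": None, "text": []}
        if "itemscope" in attrs and tag not in VOID_TAGS:
            item = {"type": attrs.get("itemtype"), "properties": {}}
            self._attach(prop, item)  # nests under the parent item if any
            self._scopes.append(item)
            frame["scope"] = True
        elif prop:
            value = attrs.get("content") or attrs.get("href") or attrs.get("src")
            if value is not None:
                self._attach(prop, value)
            elif tag not in VOID_TAGS:
                frame["prop"] = prop  # value is the element's text content
        if tag not in VOID_TAGS:
            self._open.append(frame)

    def handle_data(self, data):
        for frame in self._open:
            if frame["prop"]:
                frame["text"].append(data)

    def handle_endtag(self, tag):
        if not any(frame["tag"] == tag for frame in self._open):
            return  # stray end tag
        while self._open:
            frame = self._open.pop()
            if frame["scope"]:
                self._scopes.pop()
            elif frame["prop"]:
                self._attach(frame["prop"], "".join(frame["text"]).strip())
            if frame["tag"] == tag:
                break


def extract_microdata(html):
    parser = MicrodataParser()
    parser.feed(html)
    parser.close()
    return parser.items
```

Property values are kept as lists because microdata allows the same itemprop to repeat within one item. Features such as itemref and itemid are left out of this sketch.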
Sample INPUT.json
{
  "url": [
    {"url": "https://schema.dev/blog/schema-markup-builder-video-walkthroughs/"},
    {"url": "https://schema.dev/blog/schema-seo-boost-your-websites-visibility-with-structured-data/"},
    {"url": "https://schema.dev/blog/schema-tests-unleashing-the-full-potential-of-your-seo-strategy/"},
    {"url": "https://schema.dev/blog/understanding-product-schema-a-key-to-better-product-visibility-online/"},
    {"url": "https://schema.dev/blog/5-types-of-schema-markup-every-legal-service-should-use-for-seo/"}
  ]
}