JSON-LD Schema & Meta Tag Extractor

Pricing

from $0.50 / 1,000 results

Extract JSON-LD/Schema.org structured data, Meta tags, OpenGraph and Twitter Cards from any URL. Get page title + meta description with a clean JSON output for SEO audits, validation, competitor research and AI datasets. Proxy-ready for large crawls.

Rating

0.0

(0)

Developer

Logiover Data

Maintained by Community

Actor stats

0

Bookmarked

3

Total users

0

Monthly active users

21 hours ago

Last modified

🧩 SEO Schema Extractor: JSON-LD (Schema.org) + Meta Tags + OpenGraph + Twitter Cards

Extract structured data and SEO metadata from any URL in seconds.
This Actor scrapes JSON-LD / Schema.org markup, standard meta tags, OpenGraph (OG) tags, and Twitter Cards and returns a clean, structured dataset ready for audits, validation, competitor research, and AI datasets.

If you are looking for a JSON-LD extractor, schema scraper, meta tag checker, OpenGraph scraper, or Twitter card validator, this Actor is built for high-signal output and automation.


✅ What this Actor does

Given a list of URLs, the Actor:

  • Fetches each page and extracts:
    • Page title (<title>)
    • Meta description
    • JSON-LD blocks (<script type="application/ld+json">)
    • OpenGraph tags (og:*)
    • Twitter card tags (twitter:*)
  • Normalizes results into a single JSON dataset per URL
  • Adds a scrape timestamp for tracking and diffing
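
To make the extraction steps above concrete, here is a minimal Python sketch built on the standard library only. It is an illustration, not the Actor's own code: the names `SeoMetadataParser` and `extract_metadata` are hypothetical, and skipping malformed JSON-LD blocks is an assumption of this sketch.

```python
import json
from html.parser import HTMLParser


class SeoMetadataParser(HTMLParser):
    """Collects title, meta description, JSON-LD, OpenGraph and Twitter tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.result = {"title": None, "description": None,
                       "jsonLd": [], "openGraph": {}, "twitter": {}}
        self._capture = None  # "title" or "jsonld" while inside those elements
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._capture, self._buffer = "title", []
        elif tag == "script" and (attrs.get("type") or "").lower() == "application/ld+json":
            self._capture, self._buffer = "jsonld", []
        elif tag == "meta":
            # OpenGraph tags normally use property=, the others use name=.
            key = (attrs.get("property") or attrs.get("name") or "").lower()
            content = attrs.get("content")
            if content is None:
                return
            if key.startswith("og:"):
                self.result["openGraph"][key] = content
            elif key.startswith("twitter:"):
                self.result["twitter"][key] = content
            elif key == "description":
                self.result["description"] = content

    def handle_data(self, data):
        if self._capture:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "title" and self._capture == "title":
            self.result["title"] = "".join(self._buffer).strip()
            self._capture = None
        elif tag == "script" and self._capture == "jsonld":
            try:
                parsed = json.loads("".join(self._buffer))
            except json.JSONDecodeError:
                parsed = None  # assumption: malformed blocks are skipped
            if isinstance(parsed, list):
                self.result["jsonLd"].extend(parsed)
            elif parsed is not None:
                self.result["jsonLd"].append(parsed)
            self._capture = None


def extract_metadata(html):
    """Parse an HTML string and return the collected metadata as a dict."""
    parser = SeoMetadataParser()
    parser.feed(html)
    parser.close()
    return parser.result
```

Calling `extract_metadata(html_text)` on a downloaded page returns a dict shaped like one dataset item, minus `url` and `scrapeDate`. Because it only reads the served HTML, schema injected by client-side JavaScript would not appear in the result.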

🎯 Best use cases

  • Technical SEO audits
    • Validate Schema.org coverage and consistency across pages
    • Catch missing/incorrect OpenGraph and Twitter metadata
  • Schema validation / QA
    • Find malformed JSON-LD and missing required properties
  • Competitor analysis
    • Reverse-engineer what schema types competitors use (Product, FAQ, Recipe, Article, Organization)
  • Content automation & AI training data
    • Build structured datasets that combine page metadata + schema
  • Large-scale site checks
    • Run across thousands of URLs (proxy-ready)

✨ Key features

  • JSON-LD / Schema.org extraction
  • Meta tags (title + description)
  • OpenGraph extraction (image, type, url, site_name, etc.)
  • Twitter Cards extraction (card type, image, title, description)
  • Clean output schema (easy to export and analyze)
  • Proxy support for large crawls and rate limiting resilience

🛠 How to Use

  1. Add your target pages under Target URLs
  2. Enable Proxy Configuration (recommended)
  3. Run the Actor
  4. Export results as JSON/CSV or connect downstream to your reporting system

⚙️ Input Configuration

startUrls (required)

List of URLs you want to audit.

proxyConfiguration (required)

Use proxies to avoid blocking, especially for large crawls.


✅ Example Input (JSON)

{
  "startUrls": [
    { "url": "https://www.imdb.com/title/tt0111161/" },
    { "url": "https://www.allrecipes.com/recipe/158968/spinach-and-feta-turkey-burgers/" }
  ],
  "proxyConfiguration": { "useApifyProxy": true }
}
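
For longer URL lists it can be easier to generate the input than to write it by hand. The sketch below builds an input object in the shape shown above from a plain list of URLs. The helper name `build_actor_input` is hypothetical, and dropping blank or duplicate URLs is a convenience of this sketch, not documented Actor behavior.

```python
import json


def build_actor_input(urls, use_apify_proxy=True):
    """Build an input dict with the startUrls / proxyConfiguration fields."""
    seen = set()
    start_urls = []
    for raw in urls:
        url = raw.strip()
        if not url or url in seen:
            continue  # skip blanks and duplicates (assumption of this sketch)
        seen.add(url)
        start_urls.append({"url": url})
    return {
        "startUrls": start_urls,
        "proxyConfiguration": {"useApifyProxy": use_apify_proxy},
    }


def to_json(actor_input):
    """Serialize the input so it can be pasted into the Actor's JSON input."""
    return json.dumps(actor_input, indent=2)
```

`to_json(build_actor_input(urls))` gives a string that can be pasted into the Actor's JSON input editor.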

📦 Output Dataset (Schema Report)

Each dataset item includes:

  • url: scraped URL
  • title: HTML page title
  • description: meta description
  • jsonLd: array of extracted JSON-LD objects
  • openGraph: OpenGraph tag object (og:*)
  • twitter: Twitter Card tag object (twitter:*)
  • scrapeDate: scrape timestamp

Example Output

{
  "url": "https://example.com/product/abc",
  "title": "ABC Product - Example",
  "description": "Buy ABC Product with fast shipping.",
  "jsonLd": [
    {
      "@context": "https://schema.org",
      "@type": "Product",
      "name": "ABC Product",
      "offers": { "@type": "Offer", "price": "49.99", "priceCurrency": "USD" }
    }
  ],
  "openGraph": {
    "og:title": "ABC Product - Example",
    "og:type": "product",
    "og:image": "https://example.com/images/abc.jpg",
    "og:url": "https://example.com/product/abc"
  },
  "twitter": {
    "twitter:card": "summary_large_image",
    "twitter:title": "ABC Product - Example"
  },
  "scrapeDate": "2026-01-13T12:00:00.000Z"
}

📊 Dataset View (Structured Data Overview)

The built-in view focuses on:

  • URL
  • Title
  • JSON-LD objects
  • OpenGraph tags

This helps you quickly spot:

  • missing schema
  • wrong schema types
  • a missing OG image or an incorrect OG type

🔥 Pro Tips (maximize SEO audit value)

  1. Discover schema types at scale

Export JSON and aggregate @type to see your coverage:

  • Organization
  • WebSite
  • Article / BlogPosting
  • Product
  • FAQPage
  • BreadcrumbList
  • Recipe
  • LocalBusiness
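
One way to do this aggregation on an exported JSON dataset is sketched below. It assumes each item has the `jsonLd` array described in the output section. The function names are hypothetical, and the sketch counts nested nodes too (for example an `Offer` inside a `Product`, or nodes under `@graph`), which may or may not match how you want to measure coverage.

```python
from collections import Counter


def iter_nodes(value):
    """Yield every dict nested anywhere inside a JSON-LD value."""
    if isinstance(value, dict):
        yield value
        for child in value.values():
            yield from iter_nodes(child)
    elif isinstance(value, list):
        for child in value:
            yield from iter_nodes(child)


def count_schema_types(items):
    """Count @type values across all dataset items, including nested nodes."""
    counts = Counter()
    for item in items:
        for node in iter_nodes(item.get("jsonLd") or []):
            types = node.get("@type")
            if isinstance(types, str):
                types = [types]  # @type may be a single string or a list
            if not isinstance(types, list):
                continue
            counts.update(t for t in types if isinstance(t, str))
    return counts
```

With the exported dataset loaded as a list of dicts (for example via `json.load`), `count_schema_types(items).most_common()` lists the types from most to least frequent.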

  2. Compare across templates

Run on:

  • homepage
  • category pages
  • product pages
  • blog posts

…and detect template-level issues.

  3. Catch social preview problems

Missing og:image or wrong twitter:card is a common share-preview bug. This Actor helps you detect it quickly.
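
A check along these lines can be run over each dataset item. This is a sketch under assumptions: the function name is hypothetical, and "wrong" is interpreted narrowly as a `twitter:card` value outside the four card types Twitter documents (`summary`, `summary_large_image`, `app`, `player`).

```python
KNOWN_TWITTER_CARDS = {"summary", "summary_large_image", "app", "player"}


def find_social_preview_issues(item):
    """Return a list of human-readable share-preview problems for one item."""
    issues = []
    open_graph = item.get("openGraph") or {}
    twitter = item.get("twitter") or {}

    if not open_graph.get("og:image"):
        issues.append("missing og:image")

    card = twitter.get("twitter:card")
    if not card:
        issues.append("missing twitter:card")
    elif card not in KNOWN_TWITTER_CARDS:
        issues.append(f"unexpected twitter:card value: {card}")

    return issues
```

An empty list means the item passed both checks; anything else can go straight into an audit report next to the item's `url`.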

  4. Use scrapeDate for diffs

Run daily/weekly and diff outputs to detect regressions after deployments.
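
A simple diff between two exported runs might look like the sketch below. It assumes both runs are lists of dataset items keyed by `url`, and it ignores `scrapeDate` when comparing, since that field changes on every run. The function name `diff_runs` and the report shape are hypothetical.

```python
def diff_runs(previous, current, ignore=("scrapeDate",)):
    """Compare two runs and report added, removed and changed URLs."""
    prev_by_url = {item["url"]: item for item in previous}
    curr_by_url = {item["url"]: item for item in current}

    report = {
        "added": sorted(curr_by_url.keys() - prev_by_url.keys()),
        "removed": sorted(prev_by_url.keys() - curr_by_url.keys()),
        "changed": {},  # url -> sorted list of field names that differ
    }
    for url in sorted(prev_by_url.keys() & curr_by_url.keys()):
        old, new = prev_by_url[url], curr_by_url[url]
        fields = sorted(
            key for key in (old.keys() | new.keys())
            if key not in ignore and old.get(key) != new.get(key)
        )
        if fields:
            report["changed"][url] = fields
    return report
```

A non-empty `changed` entry after a deployment points to the pages and fields worth inspecting first.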

🧯 Troubleshooting

JSON-LD is empty

Possible reasons:

  • page does not implement JSON-LD
  • schema is injected dynamically (client-side)
  • request is blocked / rate-limited

Try:

  • enable proxy
  • test a single URL run first

OpenGraph/Twitter tags missing

Some sites do not implement them. For social-ready pages, they should exist.

Blocked responses

Enable proxy and reduce crawl scale per run if needed.

🔍 SEO Keywords (what this Actor targets)

  • json-ld extractor
  • schema.org scraper
  • structured data extractor
  • meta tag checker
  • open graph scraper
  • twitter card validator
  • technical seo audit
  • rich results schema audit
  • competitor schema analysis

🗺 Roadmap

Planned enhancements:

  • validation mode (detect malformed JSON-LD + missing required fields)
  • schema type summary report per run
  • robots/meta directives extraction (noindex, canonical, hreflang)
  • multi-page crawling mode (follow internal links)
  • Google Rich Results style checks (optional)

Support & Feedback

Open an issue with:

  • sample URLs
  • which schema types you expect (Product, FAQ, Recipe, Article)
  • any fields you want added (canonical, hreflang, robots, etc.)