Pricing

Pay per event

URL Metadata Scraper - OG, Twitter, JSON-LD

Extract complete metadata from any URL: Open Graph tags, Twitter Card metadata, JSON-LD structured data, favicons, hreflang alternates, canonical URLs and HTML meta. Perfect for link previews, SEO audits, social media tools, bookmark managers and content aggregators.

Developer

Mohieldin Mohamed

Last modified

2 months ago

URL Metadata Scraper - OG, Twitter Card, JSON-LD

Extract every piece of metadata from any URL in one API call. URL Metadata Scraper pulls Open Graph tags, Twitter Card metadata, JSON-LD structured data, favicons, hreflang alternates, canonical URLs, and basic HTML meta - everything you need for link previews, SEO audits, social media tools, content aggregators, and bookmark managers.

Pass in a list of URLs, get back a normalized JSON payload with title, description, image, site name, author, language, structured data types, and much more. No custom selectors. No broken previews.

What does URL Metadata Scraper do?

This actor fetches each URL and extracts a complete metadata profile:

Normalized top-level fields - title, description, image, site name, type, author, language (picks the best value across OG, Twitter, and HTML meta)
Open Graph - All og:* properties (title, description, image, type, site_name, locale, video, audio, article, book, profile, ...)
Twitter Card - All twitter:* properties (card, title, description, image, creator, site, player, ...)
JSON-LD structured data - Every <script type="application/ld+json"> block parsed and returned as-is, plus a normalized list of schema types for quick filtering
Favicons - All declared icons including apple-touch-icon, mask-icon, and manifest
hreflang alternates - Language variants of the page
Canonical URL - The authoritative URL after redirects and <link rel="canonical">
Basic meta - keywords, author, viewport, theme-color, description
HTTP info - final URL, status code, content-type, content-language
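
The "best value" precedence described above (Open Graph first, then Twitter Card, then plain HTML) can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the actor's actual implementation, which is built on Cheerio. The names `MetaCollector`, `first_non_empty`, and `extract_metadata` are hypothetical, and the sketch covers only title, description, and image.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collects og:*, twitter:* and plain <meta> values plus the <title> text."""

    def __init__(self):
        super().__init__()
        self.open_graph = {}
        self.twitter = {}
        self.meta = {}
        self.title = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # OG tags normally use property=..., Twitter and HTML meta use name=...
            key = attrs.get("property") or attrs.get("name") or ""
            content = attrs.get("content")
            if content is None:
                return
            if key.startswith("og:"):
                self.open_graph.setdefault(key[3:], content)
            elif key.startswith("twitter:"):
                self.twitter.setdefault(key[8:], content)
            elif key:
                self.meta.setdefault(key.lower(), content)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title = (self.title or "") + data


def first_non_empty(*values):
    """Return the first value that is a non-blank string, else None."""
    for value in values:
        if value and value.strip():
            return value.strip()
    return None


def extract_metadata(html):
    """Assumed precedence: Open Graph > Twitter Card > plain HTML."""
    parser = MetaCollector()
    parser.feed(html)
    parser.close()
    og, tw, meta = parser.open_graph, parser.twitter, parser.meta
    return {
        "title": first_non_empty(og.get("title"), tw.get("title"), parser.title),
        "description": first_non_empty(
            og.get("description"), tw.get("description"), meta.get("description")
        ),
        "image": first_non_empty(og.get("image"), tw.get("image")),
        "openGraph": og,
        "twitterCard": tw,
    }
```

Keeping the raw `openGraph` and `twitterCard` dictionaries next to the normalized fields lets a caller see which source a value came from.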

Why use URL Metadata Scraper?

Link preview cards - Build Discord, Slack, or iMessage-style link previews without maintaining your own parser
Social media tools - Preview how a URL will appear when shared on Facebook, Twitter, or LinkedIn before you hit publish
SEO audits - Bulk-check Open Graph and structured data compliance across thousands of URLs in minutes
Content aggregators - Power "read later" apps, RSS readers, or news aggregators with rich metadata
Bookmark managers - Show users clean titles and thumbnails when they save URLs
Knowledge bases - Enrich internal documents, Slack links, or Notion pages with source metadata
AI agents - Give LLM-based agents structured context about any URL they encounter

Built on Apify, so you get scheduling, API access, proxy rotation, webhooks, and monitoring out of the box.

How to use URL Metadata Scraper

Click Try for free and sign in to Apify
Paste your URLs into the URLs field
(Optional) Turn off JSON-LD or favicons to slim down the output
Click Start - results appear in seconds
Download as JSON, CSV, or Excel, or query via the Apify API

Input

{
    "startUrls": [
        { "url": "https://apify.com" },
        { "url": "https://github.com" }
    ],
    "includeJsonLd": true,
    "includeFavicons": true,
    "maxRequestsPerCrawl": 100
}

Field	Type	Description
`startUrls`	array	List of URLs to scrape. Each entry is `{ "url": "..." }`. Required.
`includeJsonLd`	boolean	Include JSON-LD structured data in the output. Default: true.
`includeFavicons`	boolean	Include favicons and apple-touch-icons. Default: true.
`maxRequestsPerCrawl`	integer	Safety cap on requests. Default: 100, max: 5000.
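
If you generate the input programmatically, a small helper can assemble it and check it against the limits in the table above. The sketch below is one possible approach; `build_run_input` is a hypothetical helper, and the lower bound of 1 for `maxRequestsPerCrawl` is an assumption (the table only states the default and the maximum).

```python
MAX_REQUESTS_LIMIT = 5000  # documented maximum for maxRequestsPerCrawl


def build_run_input(urls, include_json_ld=True, include_favicons=True, max_requests=100):
    """Build the actor input dict from a plain list of URL strings."""
    if not urls:
        raise ValueError("startUrls is required and must not be empty")
    # Lower bound of 1 is an assumption; only the maximum is documented.
    if not 1 <= max_requests <= MAX_REQUESTS_LIMIT:
        raise ValueError(
            f"maxRequestsPerCrawl must be between 1 and {MAX_REQUESTS_LIMIT}"
        )
    return {
        "startUrls": [{"url": url} for url in urls],
        "includeJsonLd": include_json_ld,
        "includeFavicons": include_favicons,
        "maxRequestsPerCrawl": max_requests,
    }
```

Calling `build_run_input(["https://apify.com", "https://github.com"])` produces the same structure as the JSON example above.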

Output

{
    "url": "https://apify.com/",
    "statusCode": 200,
    "title": "Apify: Full-stack web scraping and data extraction platform",
    "description": "Build, deploy, and scale web scraping and automation Actors...",
    "image": "https://apify.com/og-image.png",
    "siteName": "Apify",
    "type": "website",
    "author": null,
    "language": "en",
    "canonicalUrl": "https://apify.com/",
    "keywords": ["web scraping", "data extraction", "automation"],
    "themeColor": "#ffffff",
    "viewport": "width=device-width, initial-scale=1",
    "openGraph": {
        "title": "Apify...",
        "description": "Build, deploy, and scale...",
        "image": "https://apify.com/og-image.png",
        "type": "website",
        "site_name": "Apify",
        "locale": "en_US"
    },
    "twitterCard": {
        "card": "summary_large_image",
        "site": "@apify",
        "title": "Apify...",
        "description": "...",
        "image": "https://apify.com/og-image.png"
    },
    "favicons": [
        { "rel": "icon", "href": "https://apify.com/favicon.ico", "sizes": null, "type": "image/x-icon" },
        { "rel": "apple-touch-icon", "href": "https://apify.com/apple-touch-icon.png", "sizes": "180x180", "type": null }
    ],
    "hreflangs": [],
    "jsonLd": [ ... ],
    "structuredDataTypes": ["Organization", "WebSite"],
    "contentType": "text/html; charset=utf-8",
    "contentLanguage": "en",
    "scrapedAt": "2026-04-13T19:42:17.301Z"
}

You can download the dataset in various formats such as JSON, HTML, CSV, or Excel.
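
For link previews, a record like the one above can be reduced to the handful of fields a preview card needs. The following is a rough sketch under the assumption that records follow the example output; `build_preview` and the shape of the returned card are hypothetical, not something the actor provides.

```python
def build_preview(record):
    """Reduce one output record to the fields a link preview card needs."""
    favicons = record.get("favicons") or []
    # Use the first declared icon that has an href, if any.
    icon = next((f["href"] for f in favicons if f.get("href")), None)
    return {
        "url": record.get("canonicalUrl") or record["url"],
        # Fall back to the URL itself when the page declares no title.
        "title": record.get("title") or record["url"],
        "description": record.get("description") or "",
        "image": record.get("image"),
        "siteName": record.get("siteName"),
        "icon": icon,
    }
```

Falling back to the URL for a missing title keeps the card usable for pages that declare no metadata at all.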

Output fields

Field	Type	Description
`title`	string	Best title found (OG > Twitter > `<title>`)
`description`	string	Best description found
`image`	string	Absolute URL of the primary image
`siteName`	string	Site name from OG
`type`	string	OG type (website, article, product, ...)
`canonicalUrl`	string	Canonical URL after redirect resolution
`openGraph`	object	All Open Graph properties
`twitterCard`	object	All Twitter Card properties
`favicons`	array	All favicon / apple-touch-icon links
`hreflangs`	array	Language alternates
`jsonLd`	array	Every parsed JSON-LD block
`structuredDataTypes`	array	Normalized list of schema.org types on the page
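
To illustrate how a list like `structuredDataTypes` relates to the raw `jsonLd` blocks, here is a minimal sketch that collects `@type` values from parsed JSON-LD. It assumes types appear on top-level nodes or inside `@graph` arrays and does not descend into nested properties; the actor's own normalization may differ, and `collect_schema_types` is a hypothetical name.

```python
def collect_schema_types(json_ld_blocks):
    """Return unique @type values in first-seen order."""
    types = []

    def visit(node):
        if isinstance(node, list):
            for item in node:
                visit(item)
        elif isinstance(node, dict):
            declared = node.get("@type")
            # @type may be a single string or a list of strings.
            if isinstance(declared, str):
                declared = [declared]
            elif not isinstance(declared, list):
                declared = []
            for name in declared:
                if isinstance(name, str) and name not in types:
                    types.append(name)
            # Many sites wrap several entities in a single @graph array.
            visit(node.get("@graph", []))

    visit(json_ld_blocks)
    return types
```

A list like this makes it cheap to filter a large dataset, for example keeping only records whose types include `Product` or `Article`.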

How much does it cost to extract URL metadata?

The actor uses a lightweight Cheerio crawler (plain HTTP requests, no headless browser) with 10 concurrent requests, so compute usage per URL is low. Extracting 1,000 URLs typically costs pennies on the Apify free tier.

Tips and advanced options

Batch thousands of URLs - Paste a CSV or file of URLs; the actor processes them in parallel
Disable JSON-LD for lean output - If you only need OG tags for link previews, turn off includeJsonLd to shrink your dataset size by 50-80%
Schedule recurring scans - Use Apify Schedules to detect when a page's metadata changes (e.g., for monitoring competitor landing pages)
Integrate with your link preview service - Call the actor's API from your app every time a user pastes a URL and cache the result
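
For the scheduled-scan tip, one way to detect metadata changes between two runs is to fingerprint the fields you care about and compare the fingerprints per URL. This is a sketch under assumptions: `WATCHED_FIELDS`, `metadata_fingerprint`, and `changed_urls` are hypothetical names, and the choice of watched fields is illustrative. `scrapedAt` is deliberately excluded because it changes on every run.

```python
import hashlib
import json

# Fields compared between runs; scrapedAt is left out on purpose.
WATCHED_FIELDS = ("title", "description", "image", "canonicalUrl", "openGraph", "twitterCard")


def metadata_fingerprint(record, fields=WATCHED_FIELDS):
    """Hash the watched fields of one output record into a stable hex digest."""
    subset = {name: record.get(name) for name in fields}
    # sort_keys makes the serialization independent of key order.
    canonical = json.dumps(subset, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_urls(previous, current):
    """Return URLs in `current` whose watched metadata differs from `previous`.

    URLs that did not appear in the previous run are reported as changed too.
    """
    before = {record["url"]: metadata_fingerprint(record) for record in previous}
    return [
        record["url"]
        for record in current
        if before.get(record["url"]) != metadata_fingerprint(record)
    ]
```

Storing only the fingerprints from the previous run is enough for the comparison, so the full earlier dataset does not need to be kept.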

FAQ

Which sites are supported? Any public URL that returns HTML. The extractor is site-agnostic and handles millions of pages per month reliably.

Is this legal? The actor reads only public HTML and the metadata the site voluntarily declares - exactly what your browser does when it renders a page. No terms of service are bypassed.

What if the site blocks scrapers? Add an Apify proxy configuration in the run settings and the actor will route through datacenter or residential IPs.

Can it handle JavaScript-rendered pages? Most sites put their metadata in the static HTML, which Cheerio parses directly. Cheerio does not execute JavaScript, so pages that inject OG tags client-side need a browser-based (PlaywrightCrawler) variant instead.

Support

Hit a site where metadata extraction fails or returns the wrong title? Open an issue with the URL and we will investigate.
