Webpage Metadata Extractor
Pricing
Pay per usage
Extract title, meta tags, OpenGraph, Twitter cards, JSON-LD structured data, favicon, canonical, feeds and page stats from any URL.
Developer
Anthony Snider
Maintained by Community
Get the full machine-readable metadata of any web page in one call — title, meta tags, OpenGraph, Twitter cards, JSON-LD structured data, favicon, canonical, feeds, hreflang, and page stats. No API key, no proxy setup, pay per page.
▶ Live on the Apify Store — run it instantly, or call it as an agent tool via Apify MCP.
Why
Building link previews, knowledge graphs, SEO tools, or feeding clean page metadata to an LLM/agent? This returns everything a page declares about itself as structured JSON — including the schema.org JSON-LD that powers rich results.
What it extracts
- Core: title, description, keywords, author, canonical, language, charset, viewport, robots, generator, theme-color, favicon
- OpenGraph (og:*) and Twitter Card (twitter:*): every property, as objects
- JSON-LD structured data: fully parsed, plus the list of schema.org @types found
- Feeds (RSS/Atom) and hreflang alternates
- Stats: H1 list, H2 count, image count, internal vs external link counts, word count
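To make the list above concrete, here is a minimal sketch of how this kind of extraction can be done with Python's standard library. It is not the actor's own code: the `MetadataParser` class and `extract_metadata` function are hypothetical names, the field names simply mirror the output sample in this README, and only a small subset of the fields (title, description, canonical, OpenGraph, Twitter Card, JSON-LD) is covered.

```python
import json
from html.parser import HTMLParser


class MetadataParser(HTMLParser):
    """Hypothetical helper: collects a small subset of page metadata."""

    def __init__(self):
        super().__init__()
        self.result = {
            "title": None, "description": None, "canonical": None,
            "openGraph": {}, "twitterCard": {}, "jsonLd": [], "schemaTypes": [],
        }
        self._capture = None  # "title" or "jsonld" while inside that element
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        a = {k: (v or "") for k, v in attrs}
        if tag == "title":
            self._capture, self._buffer = "title", []
        elif tag == "script" and a.get("type", "").lower() == "application/ld+json":
            self._capture, self._buffer = "jsonld", []
        elif tag == "meta":
            # OpenGraph usually uses property=, Twitter Cards usually use name=.
            key = a.get("property") or a.get("name") or ""
            content = a.get("content", "")
            if key.startswith("og:"):
                self.result["openGraph"][key] = content
            elif key.startswith("twitter:"):
                self.result["twitterCard"][key] = content
            elif key.lower() == "description":
                self.result["description"] = content
        elif tag == "link" and "canonical" in a.get("rel", "").lower().split():
            self.result["canonical"] = a.get("href")

    def handle_data(self, data):
        if self._capture:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "title" and self._capture == "title":
            self.result["title"] = "".join(self._buffer).strip()
            self._capture = None
        elif tag == "script" and self._capture == "jsonld":
            self._capture = None
            try:
                block = json.loads("".join(self._buffer))
            except json.JSONDecodeError:
                return  # skip malformed JSON-LD instead of failing the page
            for item in block if isinstance(block, list) else [block]:
                self.result["jsonLd"].append(item)
                types = item.get("@type") if isinstance(item, dict) else None
                for name in types if isinstance(types, list) else [types]:
                    if name and name not in self.result["schemaTypes"]:
                        self.result["schemaTypes"].append(name)


def extract_metadata(html):
    """Parse an HTML string and return the collected metadata dict."""
    parser = MetadataParser()
    parser.feed(html)
    parser.close()
    return parser.result


SAMPLE = """<html><head><title>Example Domain</title>
<meta name="description" content="An example page">
<meta property="og:title" content="Example">
<meta name="twitter:card" content="summary_large_image">
<link rel="canonical" href="https://example.com/">
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization"}
</script>
</head><body><h1>Hi</h1></body></html>"""

meta = extract_metadata(SAMPLE)
print(meta["title"], meta["schemaTypes"])
```

The sketch works on an HTML string only, so it does no fetching, redirect handling, or page-stat counting.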
Input
{ "url": "https://example.com" }
or bulk:
{ "urls": ["https://a.com", "https://b.com"], "maxUrls": 50 }
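If you build the input programmatically, a small helper can pick between the two shapes shown above. This is only a sketch: `build_input` is a hypothetical name, the default of 50 for `maxUrls` is copied from the example rather than from any documented default, and the de-duplication step is an assumption of convenience, not something the actor requires.

```python
def build_input(urls, max_urls=50):
    """Return a run-input dict in the single-URL or bulk shape.

    Hypothetical helper; max_urls is passed through as "maxUrls" unchanged.
    """
    # Drop blanks and duplicates while keeping first-seen order.
    cleaned = []
    for url in urls:
        url = url.strip()
        if url and url not in cleaned:
            cleaned.append(url)
    if not cleaned:
        raise ValueError("at least one URL is required")
    if len(cleaned) == 1:
        return {"url": cleaned[0]}
    return {"urls": cleaned, "maxUrls": max_urls}


single = build_input(["https://example.com"])
bulk = build_input(["https://a.com", "https://b.com", "https://a.com"])
print(single)
print(bulk)
```

The returned dict is what you would pass as the run input when starting the actor.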
Output (per page)
{
  "url": "https://example.com",
  "title": "Example Domain",
  "description": "...",
  "canonical": "https://example.com/",
  "openGraph": { "og:title": "...", "og:image": "..." },
  "twitterCard": { "twitter:card": "summary_large_image" },
  "schemaTypes": ["Organization", "WebSite"],
  "jsonLd": [ { "@context": "https://schema.org", "@type": "Organization" } ],
  "counts": { "h1": 1, "images": 12, "internalLinks": 30, "externalLinks": 5, "words": 820 }
}
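As an illustration of consuming a record like the one above, the following sketch builds a link preview from it. The `link_preview` function and its fallback order (OpenGraph first, then Twitter Card, then the core fields) are assumptions made for this example. The `og:description`, `twitter:title`, `twitter:description`, and `twitter:image` keys are standard property names that a page may or may not declare, so every lookup is optional.

```python
def link_preview(record):
    """Build a small preview dict from one output record (hypothetical helper)."""
    og = record.get("openGraph") or {}
    tw = record.get("twitterCard") or {}
    return {
        # Prefer the canonical URL when the page declares one.
        "url": record.get("canonical") or record.get("url"),
        "title": og.get("og:title") or tw.get("twitter:title") or record.get("title"),
        "description": (
            og.get("og:description")
            or tw.get("twitter:description")
            or record.get("description")
        ),
        "image": og.get("og:image") or tw.get("twitter:image"),
    }


record = {
    "url": "https://example.com",
    "title": "Example Domain",
    "description": "Plain description",
    "canonical": "https://example.com/",
    "openGraph": {"og:image": "https://example.com/cover.png"},
    "twitterCard": {"twitter:card": "summary_large_image"},
}
preview = link_preview(record)
print(preview)
```

Because every field falls back gracefully, pages that declare no OpenGraph or Twitter tags still produce a usable preview, just without an image.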
Notes
Reads only the public HTML of the URL you provide. Each page takes a single fetch (redirects are followed), which keeps it fast and reliable.