Pricing

$0.04 / 1,000 URL reads

Link Preview & OpenGraph Metadata Extractor

Reads a page's own public head tags (OpenGraph, Twitter card, title, description, canonical, favicon, and language) for clean link previews and RAG ingestion. Respects robots.txt by default. Billed only per URL successfully read.

Developer

Pono Data

Last modified: a month ago

URL Metadata & OpenGraph Extractor

Give it a list of page URLs and get back the metadata each page publishes for previews: OpenGraph (og:*), Twitter card (twitter:*), <title>, meta description, canonical link, declared favicon, and page language. Clean, flat rows, built for link previews and for feeding RAG pipelines with consistent per-link metadata.

Input

URLs: one per line.
Respect robots.txt: when on (default), the host's robots.txt is checked and disallowed URLs are skipped.
Max delivered URLs: cap on billed rows (0 = no cap).
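
The robots.txt check described above can be approximated with Python's standard library. This is a minimal sketch, not the actor's own code: the function name `is_allowed`, the user-agent string `example-preview-bot`, and the sample robots.txt body are all assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt body, for illustration only.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for url in ("https://example.com/blog/post", "https://example.com/private/page"):
        verdict = "read" if is_allowed(SAMPLE_ROBOTS, "example-preview-bot", url) else "skip"
        print(url, "->", verdict)
```

With the option on, a URL for which a check like this returns False would be skipped rather than fetched.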

Output

One row per URL: url, finalUrl, httpStatus, title, description, canonical, the og* fields, the twitter* fields, favicon, lang, plus provenance (sourceUrl, retrievedAt, confidence, dataSource).
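
Because the rows are flat, a consumer can group fields by name. The sketch below is one possible way to do that, assuming the og* and twitter* fields share the `og` and `twitter` name prefixes as the field list suggests. The exact field name `ogType` and the values in the sample row are hypothetical, not confirmed output.

```python
from typing import Any

# Provenance field names as listed in the Output section.
PROVENANCE_FIELDS = ("sourceUrl", "retrievedAt", "confidence", "dataSource")


def group_row(row: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Split one flat output row into core, og, twitter, and provenance groups."""
    groups: dict[str, dict[str, Any]] = {"core": {}, "og": {}, "twitter": {}, "provenance": {}}
    for name, value in row.items():
        if name in PROVENANCE_FIELDS:
            groups["provenance"][name] = value
        elif name.startswith("og"):  # assumed prefix for the og* fields
            groups["og"][name] = value
        elif name.startswith("twitter"):  # assumed prefix for the twitter* fields
            groups["twitter"][name] = value
        else:
            groups["core"][name] = value
    return groups


# Illustrative row: "ogType" and all values here are assumptions.
SAMPLE_ROW = {
    "url": "https://www.python.org",
    "finalUrl": "https://www.python.org/",
    "title": "Welcome to Python.org",
    "ogType": "website",
    "sourceUrl": "https://www.python.org",
}

if __name__ == "__main__":
    print(group_row(SAMPLE_ROW))
```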

How it works

Sites publish these head tags specifically so other tools can render previews. The actor fetches each page politely with a declared User-Agent, reads only the head, and copies the tags verbatim. Relative og:image, canonical, and favicon URLs are resolved to absolute against the page URL; nothing else is transformed, and a tag the page does not declare is null, never invented. A URL that robots.txt disallows, or that fails to fetch, is written to the free rejected dataset and is not billed. A site owner can ask us to skip their domain at https://ponodata.com/opt-out; opted-out hosts are skipped and never charged.
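
To make the head-only reading and URL resolution concrete, here is a minimal sketch using only Python's standard library. It is not the actor's implementation: the class name `HeadTagParser`, the function `extract_head_tags`, and the choice to key og:* and twitter:* tags by their raw property names are assumptions for illustration.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin


class HeadTagParser(HTMLParser):
    """Collects preview tags from <head>; undeclared tags stay None."""

    def __init__(self, page_url: str) -> None:
        super().__init__()
        self.page_url = page_url
        self.tags: dict[str, Optional[str]] = {
            "title": None, "description": None, "canonical": None,
            "favicon": None, "lang": None,
        }
        self._in_title = False
        self._head_done = False

    def handle_starttag(self, tag, attrs):
        if self._head_done:
            return  # nothing is read from the body
        a = dict(attrs)
        if tag == "html":
            self.tags["lang"] = a.get("lang")
        elif tag == "title":
            self._in_title = True
        elif tag == "meta":
            key = a.get("property") or a.get("name") or ""
            content = a.get("content")
            if key.startswith(("og:", "twitter:")):
                if key == "og:image" and content:
                    content = urljoin(self.page_url, content)  # relative -> absolute
                self.tags[key] = content
            elif key == "description":
                self.tags["description"] = content
        elif tag == "link":
            rels = (a.get("rel") or "").lower().split()
            href = a.get("href")
            if href and "canonical" in rels:
                self.tags["canonical"] = urljoin(self.page_url, href)
            elif href and "icon" in rels:
                self.tags["favicon"] = urljoin(self.page_url, href)

    def handle_data(self, data):
        if self._in_title:
            self.tags["title"] = (self.tags["title"] or "") + data

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "head":
            self._head_done = True


def extract_head_tags(html: str, page_url: str) -> dict[str, Optional[str]]:
    parser = HeadTagParser(page_url)
    parser.feed(html)
    parser.close()
    return parser.tags
```

Called as `extract_head_tags(html, "https://example.com/docs/page")`, a declared `og:image` of `/img/card.png` would come back as `https://example.com/img/card.png`, while a page with no meta description would yield `None` for `description`.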

Billing

Pay per URL successfully read. Robots-disallowed and failed URLs cost nothing.
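
As a rough cost estimate under the listed price of $0.04 per 1,000 URL reads, the arithmetic can be sketched as follows. The function names and the example counts are hypothetical; actual charges are whatever the platform reports.

```python
from decimal import Decimal

# Listed price: $0.04 per 1,000 URL reads.
PRICE_PER_1000_READS = Decimal("0.04")


def billed_urls(submitted: int, robots_disallowed: int, failed: int) -> int:
    """URLs that count toward billing: disallowed and failed URLs are free."""
    return submitted - robots_disallowed - failed


def estimated_cost(billed: int) -> Decimal:
    """Estimated charge in US dollars for the given number of billed URLs."""
    return PRICE_PER_1000_READS * billed / 1000


if __name__ == "__main__":
    # Hypothetical run: 10,000 URLs submitted, 1,000 disallowed, 500 failed.
    billed = billed_urls(10_000, 1_000, 500)
    print(billed, estimated_cost(billed))  # 8500 billed URLs, $0.34
```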

Sample output

A real run reading each page's own public head tags (one row per URL):

URL	title	description	OG type
https://www.cloudflare.com	Cloudflare: Build for the…	Welcome to Cloudflare - Powering …	website
https://stripe.com	Stripe / Financial Infras…	Stripe is a financial services pl…	website
https://www.python.org	Welcome to Python.org	The official home of the Python P…	website
https://kubernetes.io	Kubernetes	Kubernetes, also known as K8s, is…	website

Every row carries a sourceUrl (the page read), for example https://www.cloudflare.com. Pages that return no metadata are routed to the free rejected dataset.

Why this one

Reads only the page's own public head tags and copies them verbatim, with a sourceUrl to the page. Nothing is taken from the body or guessed.
Follows redirects safely: it records the final URL and screens every redirect hop, so a public URL cannot be turned into a request to an internal host.
You pay only for a URL that returns metadata. Robots-disallowed and failed URLs are free.
Typed, flat rows with a stable schema, so an agent or a pipeline can consume them by field name.
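
The redirect screening mentioned above is commonly done by resolving each hop's host and refusing any non-public address. The following is a sketch of that general technique, not the actor's actual check: `resolve_host`, `is_public_ip`, and `hop_is_safe` are hypothetical names, and the resolver is injectable so the logic can be exercised without DNS.

```python
import ipaddress
import socket
from typing import Callable, Iterable
from urllib.parse import urlsplit


def resolve_host(host: str) -> list[str]:
    """Resolve a hostname to its IP addresses using the system resolver."""
    infos = socket.getaddrinfo(host, None)
    return sorted({info[4][0] for info in infos})


def is_public_ip(address: str) -> bool:
    """True only for addresses outside private, loopback, and similar ranges."""
    ip = ipaddress.ip_address(address)
    return not (
        ip.is_private or ip.is_loopback or ip.is_link_local
        or ip.is_multicast or ip.is_reserved or ip.is_unspecified
    )


def hop_is_safe(url: str, resolve: Callable[[str], Iterable[str]] = resolve_host) -> bool:
    """Screen one redirect hop before connecting to it."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    addresses = list(resolve(parts.hostname))
    # Refuse the hop if any resolved address is internal.
    return bool(addresses) and all(is_public_ip(a) for a in addresses)
```

A fetcher would run a check like this on the initial URL and on every `Location` target before following it. A production version would also connect to the address it validated, so that a second DNS lookup cannot return a different answer.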

Use cases

Render link previews at scale: title, description, and og:image for a list of URLs, the same tags a chat app or social card reads.
Feed a RAG pipeline consistent per-link metadata: one typed row per URL, so a retriever indexes a clean title and description instead of raw HTML.
Audit a site's social and SEO tags: find pages missing a description, a canonical link, or an OpenGraph image.
Normalize a mixed link list: each row carries the finalUrl after redirects and the page language.
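
For the audit use case, a short sketch of how the output rows might be checked for missing tags. The field name `ogImage` is an assumption about how the og:image value is named in a row, and the sample rows are made up for illustration.

```python
from typing import Any, Iterable

# "ogImage" is an assumed field name; adjust to the actual output schema.
REQUIRED_FIELDS = ("description", "canonical", "ogImage")


def missing_tags(row: dict[str, Any], required: Iterable[str] = REQUIRED_FIELDS) -> list[str]:
    """Return the required fields that are null or empty in one output row."""
    return [name for name in required if not row.get(name)]


def audit(rows: Iterable[dict[str, Any]]) -> dict[str, list[str]]:
    """Map each URL with at least one missing tag to the list of missing fields."""
    report = {}
    for row in rows:
        gaps = missing_tags(row)
        if gaps:
            report[row["url"]] = gaps
    return report


if __name__ == "__main__":
    # Hypothetical rows, not real output.
    rows = [
        {"url": "https://example.com/a", "description": "A page", "canonical": "https://example.com/a", "ogImage": None},
        {"url": "https://example.com/b", "description": "B page", "canonical": "https://example.com/b", "ogImage": "https://example.com/b.png"},
    ]
    print(audit(rows))
```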

FAQ

Does it read the whole page? No. It reads the head tags only and copies them verbatim; a tag the page does not declare is null, never invented.
Does it respect robots.txt? Yes, by default. A disallowed URL is skipped, written to the free rejected dataset, and not billed.
Is it safe against redirect tricks? Yes. It screens every redirect hop, so a public URL that points at an internal address is refused before it connects.
How am I billed? Per URL that returns metadata. Robots-disallowed and failed URLs cost nothing.

See also

URL Metadata — OpenGraph, Twitter Card & Favicon

URL Metadata & OpenGraph Extractor

URL Meta Card Generator — OpenGraph & Twitter Cards

Link Preview API — Bulk OpenGraph & Metadata Unfurl

Website SEO & Metadata Checker — Meta, OpenGraph, JSON-LD

OpenGraph & Social Card Inspector — Meta Tag Extractor API

Webpage Metadata Extractor - OpenGraph, Twitter Cards, JSON-LD

Link Preview & URL Metadata API: Open Graph Extractor

Open Graph & Link Preview Extractor

SEO Audit Tool — Meta Tags, OpenGraph & Schema Extractor
