Meta Tag & OpenGraph Scraper

Crawl any website to extract schema.org JSON-LD, Open Graph tags, and robots directives for comprehensive technical SEO audits.

Pricing: Pay per event

Rating: 0.0 (0)

Developer: naoki anzai (maintained by Community)

Actor stats: 0 bookmarked, 3 total users, 1 monthly active user

Last modified: 11 days ago

🏷️ Meta Tag Analyzer

Extracting structured data and meta tags from arbitrary websites is essential for technical SEO audits, site migrations, and competitor analysis. This tool fetches web pages and extracts schema.org JSON-LD, Open Graph tags, Twitter Cards, viewport settings, and robots directives from up to 500 URLs per run. Built for SEO professionals, data engineers, and developers, it replaces manual element inspection by programmatically reading the metadata that search engines and social media platforms find in your pages' HTML.

Schedule weekly runs to monitor your site's health. Catch truncated page titles, missing canonical links, missing social preview images, or accidental noindex directives before they affect your search results. Whether you need to scrape product details embedded via schema markup, validate organization data, or extract contact details from competitor websites, the tool returns one structured record per URL.

The output reflects the HTML that the server returns, before any client-side JavaScript runs. Extracted fields include HTML head metadata, canonical URLs, charset declarations, JSON-LD payloads, meta descriptions, and custom tags. Use the results to feed dashboards, integrate with internal reporting tools, or run bulk technical validation across large domain portfolios.

Store Quickstart

Start with the Quickstart template (3 demo URLs). For full SEO audits, use the SEO Audit template with 50+ URLs. For ongoing social sharing validation, use the Open Graph Monitor template.

Key Features

  • 🏷️ All meta tags extracted: title, description, keywords, canonical, robots, viewport, charset
  • 📱 Open Graph parsed: og:title, og:image, og:description, og:type, og:url and more
  • 🐦 Twitter Card detected: twitter:card, twitter:image, twitter:creator, twitter:site
  • 🌍 Hreflang inspected: multi-language alternate links for international SEO
  • 🧩 JSON-LD structured data: Schema.org types detected (Article, Product, Organization, etc.)
  • ⚠️ Issues flagged: missing canonical, truncated titles, missing OG image, hreflang errors

Use Cases

| Who | Why |
| --- | --- |
| SEO specialists | Audit title/meta tags for length, missing canonical, duplicate content |
| Social media managers | Verify Open Graph images render correctly on Facebook/LinkedIn |
| International teams | Validate hreflang tags on multilingual sites |
| Content marketers | Ensure Twitter Cards display properly when content is shared |
| Schema.org auditors | Detect missing or malformed structured data on product pages |

Input

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| urls | string[] | (required) | URLs to analyze (max 500) |
| concurrency | integer | 10 | Parallel requests (1-10) |
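
The limits in the input table can be checked on the client side before a run is started. The following is a minimal sketch under that assumption; `build_input`, `MAX_URLS`, and `MAX_CONCURRENCY` are hypothetical names used for illustration and are not part of the actor or the Apify client.

```python
from typing import Iterable

MAX_URLS = 500         # "max 500" from the input table
MAX_CONCURRENCY = 10   # upper bound of "1-10" from the input table


def build_input(urls: Iterable[str], concurrency: int = 10) -> dict:
    """Return a run-input dict, raising ValueError if a documented limit is broken."""
    # Drop empty strings and surrounding whitespace before counting.
    url_list = [u.strip() for u in urls if u and u.strip()]
    if not url_list:
        raise ValueError("urls is required and must contain at least one URL")
    if len(url_list) > MAX_URLS:
        raise ValueError(f"at most {MAX_URLS} URLs per run, got {len(url_list)}")
    if not 1 <= concurrency <= MAX_CONCURRENCY:
        raise ValueError(f"concurrency must be between 1 and {MAX_CONCURRENCY}")
    return {"urls": url_list, "concurrency": concurrency}
```

The returned dict has the same shape as the run input used in the examples on this page, so it could be passed as `run_input` to the Python client.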

Input Example

{
  "urls": ["https://example.com/product/1", "https://example.com/blog/post-1"],
  "concurrency": 10
}

Input Examples

Example: Single URL audit

{
  "urls": [
    "https://example.com"
  ],
  "includeOpenGraph": true
}

Example: Bulk audit

{
  "urls": [
    "https://a.com",
    "https://b.com"
  ],
  "emitSeoWarnings": true
}

Example: Duplicate-title detection

{
  "urls": [
    "https://example.com/",
    "https://example.com/page1"
  ],
  "detectDuplicateTitles": true
}

Output

| Field | Type | Description |
| --- | --- | --- |
| url | string | Page URL analyzed |
| title | string | Contents of the <title> element |
| description | string | Meta description content |
| canonical | string | Canonical URL if specified |
| ogTitle | string | Open Graph title |
| ogDescription | string | Open Graph description |
| ogImage | string | Open Graph image URL |
| twitterCard | string | Twitter card type |
| twitterTitle | string | Twitter card title |
| robots | string | Robots meta directive |
| lang | string | HTML lang attribute |

Output Example

{
  "url": "https://example.com/product/1",
  "title": "Premium Widget - Example Store",
  "description": "Buy the best widget...",
  "canonical": "https://example.com/product/1",
  "og": {"title": "Premium Widget", "image": "https://example.com/widget.jpg", "type": "product"},
  "twitter": {"card": "summary_large_image"},
  "hreflang": [{"lang": "en-us", "href": "..."}],
  "jsonLd": [{"@type": "Product", "name": "Premium Widget"}],
  "issues": []
}

API Usage

Run this actor programmatically using the Apify API. Replace YOUR_API_TOKEN with your token from Apify Console → Settings → Integrations.

cURL

curl -X POST "https://api.apify.com/v2/acts/taroyamada~meta-tag-analyzer/run-sync-get-dataset-items?token=YOUR_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{ "urls": ["https://example.com/product/1", "https://example.com/blog/post-1"], "concurrency": 10 }'

Python

from apify_client import ApifyClient

# Authenticate with your Apify API token
client = ApifyClient("YOUR_API_TOKEN")

# Start the actor and wait for the run to finish
run = client.actor("taroyamada/meta-tag-analyzer").call(run_input={
    "urls": ["https://example.com/product/1", "https://example.com/blog/post-1"],
    "concurrency": 10,
})

# Print every item from the run's default dataset
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

JavaScript / Node.js

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });
const run = await client.actor('taroyamada/meta-tag-analyzer').call({
    "urls": ["https://example.com/product/1", "https://example.com/blog/post-1"],
    "concurrency": 10
});
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);

Tips & Limitations

  • Use this to audit social sharing previews before launching a campaign.
  • Check canonical to detect duplicate content issues across your site.
  • Run monthly to catch accidental robots noindex tags after deploys.
  • Pair with Article Content Extractor for full content + metadata analysis.
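
The canonical tip above can be applied to downloaded results with a few lines of post-processing. This is a rough sketch that assumes each item carries the `url` and `canonical` fields shown in the output example; `group_by_canonical` and `shared_canonicals` are hypothetical helper names, and stripping the trailing slash is a simplifying assumption, not a full URL normalization.

```python
from collections import defaultdict


def group_by_canonical(items):
    """Map each declared canonical URL to the page URLs that point at it."""
    groups = defaultdict(list)
    for item in items:
        canonical = item.get("canonical")
        if canonical:
            # Assumption: treat ".../page" and ".../page/" as the same target.
            groups[canonical.rstrip("/")].append(item["url"])
    return dict(groups)


def shared_canonicals(items):
    """Keep only canonicals that more than one page declares."""
    return {c: urls for c, urls in group_by_canonical(items).items() if len(urls) > 1}
```

Pages that share a canonical are candidates for a duplicate-content review. Pages with no canonical at all are skipped here and are better handled by a separate missing-canonical check.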

FAQ

Which meta tags are most important for SEO?

title, description, canonical, and robots are critical. og:image matters for social shares. hreflang matters if you have multilingual content.

Can it follow redirects?

Yes. Meta tags are extracted from the final URL after redirects.

Does it render JavaScript?

No. Tags are extracted from initial HTML only. If your site renders tags client-side, this actor won't see them.
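
To illustrate what "initial HTML only" means in practice, here is a small sketch that uses Python's standard `html.parser` module. It is not the actor's implementation; `HeadTagParser`, `parse_head`, and `SAMPLE_HTML` are hypothetical names. A tag that a script would inject at runtime stays inside the script text and is never seen as a real element.

```python
from html.parser import HTMLParser


class HeadTagParser(HTMLParser):
    """Collect <title>, <meta> and canonical <link> values from static HTML."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self.canonical = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Standard tags use name=, Open Graph tags use property=.
            key = attrs.get("name") or attrs.get("property")
            if key and attrs.get("content") is not None:
                self.meta[key.lower()] = attrs["content"]
        elif tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.canonical = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def parse_head(html):
    """Parse raw HTML text and return the collected head metadata."""
    parser = HeadTagParser()
    parser.feed(html)
    return {"title": parser.title.strip(), "meta": parser.meta, "canonical": parser.canonical}


SAMPLE_HTML = """<html><head><title>Premium Widget</title>
<meta name="description" content="Buy the best widget...">
<meta property="og:image" content="https://example.com/widget.jpg">
<link rel="canonical" href="https://example.com/product/1">
<script>document.head.insertAdjacentHTML("beforeend", '<meta name="robots" content="noindex">');</script>
</head><body></body></html>"""

result = parse_head(SAMPLE_HTML)
print(result["meta"])  # the script-injected robots tag is absent
```

If a site only adds its tags through client-side JavaScript, a raw-HTML parser like this one returns empty fields for them, which matches the limitation described above.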

What issues are flagged?

Truncated titles (>60 chars), missing description, missing canonical, missing og:image, duplicate meta description.
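
The checks listed above can be approximated on your own copy of the results. The sketch below assumes the nested item shape from the output example (`og.image`) and the 60-character title threshold quoted in the answer; the function names and issue labels are hypothetical and may differ from what the actor writes into `issues`.

```python
TITLE_LIMIT = 60  # threshold quoted in the FAQ answer


def flag_issues(item):
    """Return a list of issue labels for one output item."""
    issues = []
    title = item.get("title") or ""
    if len(title) > TITLE_LIMIT:
        issues.append("title_too_long")
    if not item.get("description"):
        issues.append("missing_description")
    if not item.get("canonical"):
        issues.append("missing_canonical")
    if not (item.get("og") or {}).get("image"):
        issues.append("missing_og_image")
    return issues


def duplicate_descriptions(items):
    """Map each meta description used by more than one page to those page URLs."""
    seen = {}
    for item in items:
        desc = (item.get("description") or "").strip()
        if desc:
            seen.setdefault(desc, []).append(item["url"])
    return {d: urls for d, urls in seen.items() if len(urls) > 1}
```

Per-page checks and the cross-page duplicate check are kept separate because the second one needs the whole result set, not a single item.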

Can I extract structured data (JSON-LD)?

Basic Open Graph and Twitter Card metadata only. JSON-LD parsing is on the roadmap.

Complete Your Website Health Audit

Website Health Suite. Build a comprehensive compliance and trust monitoring workflow:

1. Link & URL Health

2. SEO & Metadata Quality (you are here)

3. Security & Email Deliverability

4. Historical Data & Recovery

Recommended workflow: Weekly meta tag audit → Fix truncated titles and missing OG images → Validate structured data with Schema Validator → Monitor canonical tags → Track social sharing performance.

Cost

Pay Per Event:

  • actor-start: $0.01 (flat fee per run)
  • dataset-item: $0.003 per output item

Example: 1,000 items = $0.01 + (1,000 × $0.003) = $3.01
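
The same arithmetic can be scripted for budgeting. This is a small sketch that uses only the two prices listed above; `estimate_cost` is a hypothetical helper, and `Decimal` is used so the cents do not pick up floating-point rounding noise.

```python
from decimal import Decimal

ACTOR_START_FEE = Decimal("0.01")   # flat fee per run, from the price list
PRICE_PER_ITEM = Decimal("0.003")   # per output item, from the price list


def estimate_cost(items: int, runs: int = 1) -> Decimal:
    """Estimate the pay-per-event cost in USD for a number of output items."""
    if items < 0 or runs < 1:
        raise ValueError("items must be >= 0 and runs must be >= 1")
    return runs * ACTOR_START_FEE + items * PRICE_PER_ITEM


print(estimate_cost(1000))  # one run producing 1,000 items
```

Splitting the same 1,000 items across several runs adds one start fee per extra run, so batching URLs into fewer runs is slightly cheaper.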

No subscription required. You only pay for what you use.

⭐ Was this helpful?

If this actor saved you time, please leave a ★ rating on Apify Store. It takes 10 seconds, helps other developers discover it, and keeps updates free.

Bug report or feature request? Open an issue on the Issues tab of this actor.