Pricing

Pay per event

Meta Tag & OpenGraph Scraper

Crawl any website to extract schema.org JSON-LD, Open Graph tags, and robots directives for comprehensive technical SEO audits.

Developer

naoki anzai

Last modified

11 days ago

🏷️ Meta Tag Analyzer

Extracting structured data and meta tags from arbitrary websites is essential for technical SEO audits, site migrations, and competitor analysis. This actor fetches each URL you supply and extracts schema.org JSON-LD, Open Graph tags, Twitter Cards, viewport settings, and robots directives, handling up to 500 URLs per run with up to 10 parallel requests. Built for SEO professionals, data engineers, and developers, it replaces manual element inspection by programmatically reading the same head metadata that search engines and social media platforms read.

Schedule weekly runs to monitor your site's health. Catch truncated page titles, missing canonical links, missing social preview images, or accidental noindex directives before they affect your search results. Whether you need to read product details embedded via schema markup, validate organization data, or review the metadata competitors publish, the actor returns one structured record per URL.

The output reflects the initial HTML the server returns, before any client-side JavaScript runs. Extracted fields include HTML head metadata, canonical URLs, charset, JSON-LD payloads, meta descriptions, and custom tags. Use the results to feed dashboards, integrate with internal reporting tools, or run bulk technical validation across large domain portfolios.

Store Quickstart

Start with the Quickstart template (3 demo URLs). For full SEO audits, use the SEO Audit template with 50+ URLs. For ongoing social sharing validation, use Open Graph Monitor.

Key Features

🏷️ All meta tags extracted — Title, description, keywords, canonical, robots, viewport, charset
📱 Open Graph parsed — og:title, og:image, og:description, og:type, og:url and more
🐦 Twitter Card detected — twitter:card, twitter:image, twitter:creator, twitter:site
🌍 Hreflang inspected — Multi-language alternate links for international SEO
🧩 JSON-LD structured data — Schema.org types detected (Article, Product, Organization, etc.)
⚠️ Issues flagged — Missing canonical, truncated titles, missing OG image, hreflang errors

Use Cases

Who	Why
SEO specialists	Audit title/meta tags for length, missing canonical, duplicate content
Social media managers	Verify Open Graph images render correctly on Facebook/LinkedIn
International teams	Validate hreflang tags on multilingual sites
Content marketers	Ensure Twitter Cards display properly when content is shared
Schema.org auditors	Detect missing or malformed structured data on product pages

Input

Field	Type	Default	Description
urls	string[]	(required)	URLs to analyze (max 500)
concurrency	integer	10	Parallel requests (1-10)

Input Example

{
  "urls": ["https://example.com/product/1", "https://example.com/blog/post-1"],
  "concurrency": 10
}
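
The limits in the Input table (at most 500 URLs, concurrency between 1 and 10) can be checked client-side before a run is started. A minimal sketch: `build_inputs` is a hypothetical helper, not part of the actor or the Apify client, and splitting longer lists into several runs is only one possible way to stay under the 500-URL cap.

```python
from typing import Iterable

MAX_URLS_PER_RUN = 500                    # "max 500" from the Input table
MIN_CONCURRENCY, MAX_CONCURRENCY = 1, 10  # "1-10" from the Input table


def build_inputs(urls: Iterable[str], concurrency: int = 10) -> list[dict]:
    """Return one input payload per batch of at most 500 unique URLs."""
    if not MIN_CONCURRENCY <= concurrency <= MAX_CONCURRENCY:
        raise ValueError("concurrency must be between 1 and 10")

    # Trim whitespace, drop empty entries, and de-duplicate while keeping order.
    unique = list(dict.fromkeys(u.strip() for u in urls if u.strip()))
    if not unique:
        raise ValueError("urls is required and must not be empty")

    return [
        {"urls": unique[i:i + MAX_URLS_PER_RUN], "concurrency": concurrency}
        for i in range(0, len(unique), MAX_URLS_PER_RUN)
    ]
```

Each returned dictionary can be passed as the run input in the API examples further down.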

Input Examples

Example: Single URL audit

{
  "urls": [
    "https://example.com"
  ],
  "includeOpenGraph": true
}

Example: Bulk audit

{
  "urls": [
    "https://a.com",
    "https://b.com"
  ],
  "emitSeoWarnings": true
}

Example: Duplicate-title detection

{
  "urls": [
    "https://example.com/",
    "https://example.com/page1"
  ],
  "detectDuplicateTitles": true
}

Output

Field	Type	Description
`url`	string	Page URL analyzed
`title`	string	Contents of the `<title>` tag
`description`	string	Meta description content
`canonical`	string	Canonical URL if specified
`ogTitle`	string	Open Graph title
`ogDescription`	string	Open Graph description
`ogImage`	string	Open Graph image URL
`twitterCard`	string	Twitter card type
`twitterTitle`	string	Twitter card title
`robots`	string	Robots meta directive
`lang`	string	HTML lang attribute

Output Example

{
  "url": "https://example.com/product/1",
  "title": "Premium Widget — Example Store",
  "description": "Buy the best widget...",
  "canonical": "https://example.com/product/1",
  "og": {"title": "Premium Widget", "image": "https://example.com/widget.jpg", "type": "product"},
  "twitter": {"card": "summary_large_image"},
  "hreflang": [{"lang": "en-us", "href": "..."}],
  "jsonLd": [{"@type": "Product", "name": "Premium Widget"}],
  "issues": []
}
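
The Output table lists flat field names such as `ogImage`, while the example above nests the same values under `og` and `twitter`. Which shape a given run returns is not confirmed here, so downstream code may want to tolerate both. A minimal sketch, with `social_preview` as a hypothetical helper:

```python
def social_preview(item: dict) -> dict:
    """Collect social-preview fields from either the nested or the flat shape."""
    og = item.get("og") or {}
    twitter = item.get("twitter") or {}
    return {
        "url": item.get("url"),
        # Prefer the nested value, fall back to the flat field name.
        "og_title": og.get("title") or item.get("ogTitle"),
        "og_image": og.get("image") or item.get("ogImage"),
        "twitter_card": twitter.get("card") or item.get("twitterCard"),
    }


nested = {
    "url": "https://example.com/product/1",
    "og": {"title": "Premium Widget", "image": "https://example.com/widget.jpg"},
    "twitter": {"card": "summary_large_image"},
}
flat = {"url": "https://example.com/product/1", "ogTitle": "Premium Widget"}

print(social_preview(nested))
print(social_preview(flat))
```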

API Usage

Run this actor programmatically using the Apify API. Replace YOUR_API_TOKEN with your token from Apify Console → Settings → Integrations.

cURL

curl -X POST "https://api.apify.com/v2/acts/taroyamada~meta-tag-analyzer/run-sync-get-dataset-items?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "urls": ["https://example.com/product/1", "https://example.com/blog/post-1"], "concurrency": 10 }'

Python

from apify_client import ApifyClient

# Authenticate with your Apify API token.
client = ApifyClient("YOUR_API_TOKEN")

# Start the actor and wait for the run to finish.
run = client.actor("taroyamada/meta-tag-analyzer").call(run_input={
    "urls": ["https://example.com/product/1", "https://example.com/blog/post-1"],
    "concurrency": 10,
})

# Print every item from the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

JavaScript / Node.js

import { ApifyClient } from 'apify-client';

// Authenticate with your Apify API token.
const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Start the actor and wait for the run to finish.
const run = await client.actor('taroyamada/meta-tag-analyzer').call({
  urls: ['https://example.com/product/1', 'https://example.com/blog/post-1'],
  concurrency: 10,
});

// Fetch and print the items from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);

Tips & Limitations

Use this to audit social sharing previews before launching a campaign.
Check the canonical field to detect duplicate content issues across your site.
Run on a schedule (at least monthly) to catch accidental robots noindex tags after deploys.
Pair with Article Content Extractor for full content + metadata analysis.

FAQ

Which meta tags are most important for SEO?

title, description, canonical, and robots are critical. og:image matters for social shares. hreflang matters if you have multilingual content.

Can it follow redirects?

Yes. Meta tags are extracted from the final URL after redirects.

Does it render JavaScript?

No. Tags are extracted from initial HTML only. If your site renders tags client-side, this actor won't see them.
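
To illustrate what "initial HTML only" means, here is a minimal sketch using Python's standard `html.parser`. It is not the actor's implementation; the `HeadTagParser` class and the sample markup are made up for illustration. The parser picks up the static tags but not the one the script would add at runtime, because the script is never executed.

```python
from html.parser import HTMLParser


class HeadTagParser(HTMLParser):
    """Collect <meta> name/property -> content pairs from static HTML."""

    def __init__(self) -> None:
        super().__init__()
        self.meta: dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        key = attributes.get("property") or attributes.get("name")
        content = attributes.get("content")
        if key and content is not None:
            self.meta[key.lower()] = content


RAW_HTML = """
<html><head>
  <meta name="robots" content="noindex">
  <meta property="og:title" content="Static title">
  <script>
    // A browser would run this and add og:image; a raw-HTML parser will not.
    const m = document.createElement('meta');
    m.setAttribute('property', 'og:image');
    m.content = 'https://example.com/widget.jpg';
    document.head.appendChild(m);
  </script>
</head><body></body></html>
"""

parser = HeadTagParser()
parser.feed(RAW_HTML)
print(parser.meta)  # {'robots': 'noindex', 'og:title': 'Static title'}
```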

What issues are flagged?

Truncated titles (>60 chars), missing description, missing canonical, missing og:image, duplicate meta description.
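
The checks listed above can be approximated on exported items. A minimal sketch, not the actor's own logic: `find_issues` is a hypothetical helper, the flat field names follow the Output table, and treating an empty string the same as a missing value is an assumption.

```python
from collections import Counter

TITLE_LIMIT = 60  # ">60 chars" from the answer above


def find_issues(items: list[dict]) -> dict[str, list[str]]:
    """Map each URL to the list of issue labels found for it."""
    # Count descriptions across the whole batch to spot duplicates.
    description_counts = Counter(
        item["description"] for item in items if item.get("description")
    )

    report: dict[str, list[str]] = {}
    for item in items:
        issues = []
        if len(item.get("title") or "") > TITLE_LIMIT:
            issues.append("truncated title")
        if not item.get("description"):
            issues.append("missing description")
        elif description_counts[item["description"]] > 1:
            issues.append("duplicate meta description")
        if not item.get("canonical"):
            issues.append("missing canonical")
        if not item.get("ogImage"):
            issues.append("missing og:image")
        report[item["url"]] = issues
    return report
```

Duplicate descriptions are only detected within the batch passed in, so comparing pages from different runs means merging their items first.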

Can I extract structured data (JSON-LD)?

Yes. JSON-LD blocks present in the initial HTML are returned in the jsonLd array (see Output Example), alongside Open Graph and Twitter Card metadata.
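
For context, JSON-LD is embedded in `<script type="application/ld+json">` elements, so it is available in raw HTML whenever the server renders it. A minimal standard-library sketch of reading those blocks, independent of this actor; `JsonLdParser` and the sample markup are illustrative only.

```python
import json
from html.parser import HTMLParser


class JsonLdParser(HTMLParser):
    """Collect parsed payloads from <script type="application/ld+json"> blocks."""

    def __init__(self) -> None:
        super().__init__()
        self._in_json_ld = False
        self._buffer: list[str] = []
        self.payloads: list = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").strip().lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                self.payloads.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed blocks instead of failing the whole page.


SAMPLE = """
<head>
  <script type="application/ld+json">
    {"@context": "https://schema.org", "@type": "Product", "name": "Premium Widget"}
  </script>
</head>
"""

parser = JsonLdParser()
parser.feed(SAMPLE)
print([p.get("@type") for p in parser.payloads])  # ['Product']
```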

Complete Your Website Health Audit

Website Health Suite — Build a comprehensive compliance and trust monitoring workflow:

1. Link & URL Health

🔗 Broken Link Checker — Find broken links across your entire site structure
🔗 Bulk URL Health Checker — Validate HTTP status, redirects, SSL, and response times

2. SEO & Metadata Quality (you are here)

🏷️ Meta Tag Analyzer — Audit title tags, Open Graph, Twitter Cards, and hreflang
Schema.org Validator — Validate JSON-LD and Microdata with quality scoring

3. Security & Email Deliverability

DNS/DMARC Security Checker — Audit SPF, DKIM, DMARC, and MX records

4. Historical Data & Recovery

📚 Wayback Machine Checker — Find archived snapshots for content recovery

Recommended workflow: Weekly meta tag audit → Fix truncated titles and missing OG images → Validate structured data with Schema Validator → Monitor canonical tags → Track social sharing performance.

Other Website Tools:

Sitemap Analyzer — SEO sitemap audit
Site Governance Monitor — Robots.txt and schema monitoring
Domain Trust Monitor — SSL expiry and security headers

Cost

Pay Per Event:

actor-start: $0.01 (flat fee per run)
dataset-item: $0.003 per output item

Example: 1,000 items = $0.01 + (1,000 × $0.003) = $3.01

No subscription required — you only pay for what you use.
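
The same arithmetic as a small helper, which can be handy when a large URL list is split across several runs. A minimal sketch using only the two event prices listed above; `estimate_cost` is a hypothetical function, and `Decimal` is used to avoid floating-point rounding in the totals.

```python
from decimal import Decimal

ACTOR_START_FEE = Decimal("0.01")    # flat fee per run
DATASET_ITEM_FEE = Decimal("0.003")  # per output item


def estimate_cost(items: int, runs: int = 1) -> Decimal:
    """Estimated charge in USD for `items` output items spread over `runs` runs."""
    if items < 0 or runs < 1:
        raise ValueError("items must be >= 0 and runs must be >= 1")
    return runs * ACTOR_START_FEE + items * DATASET_ITEM_FEE


print(estimate_cost(1_000))          # 3.010
print(estimate_cost(2_000, runs=4))  # 6.040
```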

⭐ Was this helpful?

If this actor saved you time, please leave a ★ rating on Apify Store. It takes 10 seconds, helps other developers discover it, and keeps updates free.

Bug report or feature request? Open an issue on the Issues tab of this actor.
