🏷️ Meta Tag Analyzer & Scraper

Pricing

Pay per event

Crawl any website to extract schema.org JSON-LD, Open Graph tags, and robots directives for comprehensive technical SEO audits.

Rating: 0.0 (0)

Developer: 太郎 山田 (Maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 2 days ago

🏷️ Meta Tag Analyzer

Part of the Website Health Suite — Comprehensive website trust, compliance, and technical SEO monitoring.

Crawl websites and extract JSON-LD structured data, meta tags, and other key page elements. The actor is built for developers, data engineers, and technical SEOs who need an API to collect page metadata at scale. Instead of inspecting elements manually in your browser, you can extract schema.org data, viewport settings, charset declarations, and custom meta tags from hundreds of URLs in a single run.

Well suited to recurring SEO metadata audits: schedule weekly runs to catch missing canonical tags, truncated titles, or broken Open Graph images before they hurt social sharing and search rankings. You can also collect product details from Schema markup, extract organization data to feed into other applications, or monitor competitor websites. Seeing the metadata that search engines and social platforms read from your pages helps you prevent broken previews and missing rich snippets.

The output is machine-readable JSON covering general meta tags, Open Graph tags, robots directives, and JSON-LD structured data; the Output section lists the field names. You can feed the results into technical audits to track how Google and other search engine crawlers interpret your pages, or export them to a data warehouse or BI tool for competitive analysis and automated website governance.

Store Quickstart

Start with the Quickstart template (3 demo URLs). For full SEO audits, use the SEO Audit template with 50+ URLs. For ongoing social sharing validation, use the Open Graph Monitor template.

Key Features

  • 🏷️ All meta tags extracted — Title, description, keywords, canonical, robots, viewport, charset
  • 📱 Open Graph parsed — og:title, og:image, og:description, og:type, og:url and more
  • 🐦 Twitter Card detected — twitter:card, twitter:image, twitter:creator, twitter:site
  • 🌍 Hreflang inspected — Multi-language alternate links for international SEO
  • 🧩 JSON-LD structured data — Schema.org types detected (Article, Product, Organization, etc.)
  • ⚠️ Issues flagged — Missing canonical, truncated titles, missing OG image, hreflang errors

Use Cases

| Who | Why |
| --- | --- |
| SEO specialists | Audit title/meta tags for length, missing canonical, duplicate content |
| Social media managers | Verify Open Graph images render correctly on Facebook/LinkedIn |
| International teams | Validate hreflang tags on multilingual sites |
| Content marketers | Ensure Twitter Cards display properly when content is shared |
| Schema.org auditors | Detect missing or malformed structured data on product pages |

Input

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| urls | string[] | (required) | URLs to analyze (max 500) |
| concurrency | integer | 10 | Parallel requests (1-10) |

Input Example

{
  "urls": ["https://example.com/product/1", "https://example.com/blog/post-1"],
  "concurrency": 10
}
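
The Input table caps urls at 500 entries and concurrency at 1-10. The following is a minimal sketch of preparing input that respects those limits, assuming you want to split a longer URL list across several runs. `build_inputs` is a hypothetical helper written for this example, not part of the actor or the Apify client.

```python
from typing import Dict, Iterator, List

MAX_URLS_PER_RUN = 500   # limit stated in the Input table
MAX_CONCURRENCY = 10     # upper bound stated in the Input table


def build_inputs(urls: List[str], concurrency: int = 10) -> Iterator[Dict]:
    """Yield one actor input dict per batch of at most 500 unique URLs."""
    if not 1 <= concurrency <= MAX_CONCURRENCY:
        raise ValueError("concurrency must be between 1 and 10")
    # dict.fromkeys drops duplicates while keeping the original order
    unique = list(dict.fromkeys(u.strip() for u in urls if u.strip()))
    for start in range(0, len(unique), MAX_URLS_PER_RUN):
        yield {
            "urls": unique[start:start + MAX_URLS_PER_RUN],
            "concurrency": concurrency,
        }
```

Each yielded dict has the same shape as the Input Example and can be passed as the run input of one actor run.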

Output

| Field | Type | Description |
| --- | --- | --- |
| url | string | Page URL analyzed |
| title | string | Contents of the <title> tag |
| description | string | Meta description content |
| canonical | string | Canonical URL if specified |
| ogTitle | string | Open Graph title |
| ogDescription | string | Open Graph description |
| ogImage | string | Open Graph image URL |
| twitterCard | string | Twitter card type |
| twitterTitle | string | Twitter card title |
| robots | string | Robots meta directive |
| lang | string | HTML lang attribute |

Output Example

{
  "url": "https://example.com/product/1",
  "title": "Premium Widget — Example Store",
  "description": "Buy the best widget...",
  "canonical": "https://example.com/product/1",
  "og": {"title": "Premium Widget", "image": "https://example.com/widget.jpg", "type": "product"},
  "twitter": {"card": "summary_large_image"},
  "hreflang": [{"lang": "en-us", "href": "..."}],
  "jsonLd": [{"@type": "Product", "name": "Premium Widget"}],
  "issues": []
}
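
The Output table lists flat field names such as ogTitle, while the Output Example nests the same values under og. The sketch below reads either shape so downstream code does not depend on which one a run returns. Both helper names are hypothetical, and the flat-name fallback is an assumption based only on the table above.

```python
from typing import Any, Dict, List, Optional


def og_field(item: Dict[str, Any], name: str) -> Optional[str]:
    """Return an Open Graph value from either the nested or the flat shape.

    name is the short key, e.g. "title", "description" or "image".
    """
    nested = item.get("og") or {}
    if name in nested:
        return nested[name]
    # Flat shape from the Output table: ogTitle, ogDescription, ogImage
    return item.get("og" + name[0].upper() + name[1:])


def schema_types(item: Dict[str, Any]) -> List[str]:
    """List the @type values found in the jsonLd array, if present."""
    types: List[str] = []
    for block in item.get("jsonLd") or []:
        if not isinstance(block, dict):
            continue
        value = block.get("@type")
        # In JSON-LD, @type may be a single string or a list of strings
        if isinstance(value, list):
            types.extend(value)
        elif value:
            types.append(value)
    return types
```

For the item in the Output Example, `og_field(item, "image")` returns the widget image URL and `schema_types(item)` returns a list containing "Product".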

API Usage

Run this actor programmatically using the Apify API. Replace YOUR_API_TOKEN with your token from Apify Console → Settings → Integrations.

cURL

curl -X POST "https://api.apify.com/v2/acts/taroyamada~meta-tag-analyzer/run-sync-get-dataset-items?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "urls": ["https://example.com/product/1", "https://example.com/blog/post-1"], "concurrency": 10 }'

Python

from apify_client import ApifyClient

# Authenticate with your Apify API token
client = ApifyClient("YOUR_API_TOKEN")

# Start the actor and wait for the run to finish
run = client.actor("taroyamada/meta-tag-analyzer").call(run_input={
    "urls": ["https://example.com/product/1", "https://example.com/blog/post-1"],
    "concurrency": 10,
})

# Print every item from the run's default dataset
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

JavaScript / Node.js

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('taroyamada/meta-tag-analyzer').call({
    "urls": ["https://example.com/product/1", "https://example.com/blog/post-1"],
    "concurrency": 10
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);

Tips & Limitations

  • Use this to audit social sharing previews before launching a campaign.
  • Check canonical to detect duplicate content issues across your site.
  • Run monthly to catch accidental robots noindex tags after deploys.
  • Pair with Article Content Extractor for full content + metadata analysis.
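
The canonical and noindex tips above can be automated against the dataset items. This is a rough sketch, assuming each item carries the url, canonical, and robots fields from the Output table. The function name and warning labels are hypothetical, and the trailing-slash normalization is a simplification you may want to tighten.

```python
from typing import Any, Dict, List


def deploy_warnings(item: Dict[str, Any]) -> List[str]:
    """Return warning labels for one dataset item."""
    warnings = []

    # Robots directives are comma-separated and case-insensitive;
    # "none" is shorthand for "noindex, nofollow".
    robots = (item.get("robots") or "").lower()
    directives = {d.strip() for d in robots.split(",")}
    if "noindex" in directives or "none" in directives:
        warnings.append("noindex")

    # A canonical that points at a different URL may signal duplicate content.
    canonical = item.get("canonical")
    url = item.get("url", "")
    if canonical and canonical.rstrip("/") != url.rstrip("/"):
        warnings.append("canonical-differs")

    return warnings
```

Running this over every item after a deploy gives a short list of pages to review by hand.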

FAQ

Which meta tags are most important for SEO?

title, description, canonical, and robots are critical. og:image matters for social shares. hreflang matters if you have multilingual content.

Can it follow redirects?

Yes. Meta tags are extracted from the final URL after redirects.

Does it render JavaScript?

No. Tags are extracted from initial HTML only. If your site renders tags client-side, this actor won't see them.
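
To illustrate why client-side tags are missed, here is a small sketch of static HTML parsing using Python's standard `html.parser`. It is not the actor's implementation, only an example of the general technique. `MetaTagParser` and the sample page are made up for this example.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect <meta> name/property -> content pairs from static HTML."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        key = attributes.get("name") or attributes.get("property")
        if key and attributes.get("content") is not None:
            self.meta[key.lower()] = attributes["content"]


html = """
<html><head>
  <meta name="description" content="Static description">
  <script>
    // A browser would run this and add og:image; a raw-HTML parser does not.
    var m = document.createElement('meta');
    m.setAttribute('property', 'og:image');
    m.setAttribute('content', 'https://example.com/widget.jpg');
    document.head.appendChild(m);
  </script>
</head></html>
"""

parser = MetaTagParser()
parser.feed(html)
print(parser.meta)  # {'description': 'Static description'}
```

The og:image tag exists only after the script runs in a browser, so it never appears in the parsed result.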

What issues are flagged?

Truncated titles (>60 chars), missing description, missing canonical, missing og:image, duplicate meta description.
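
If you want to reproduce these checks in your own pipeline, they could be approximated as below. This is a sketch under assumptions, not the actor's logic: the issue labels are hypothetical, the 60-character limit is taken from the answer above, and duplicate descriptions are detected only within the list you pass in.

```python
from collections import Counter
from typing import Any, Dict, List

TITLE_LIMIT = 60  # threshold quoted in the FAQ answer above


def flag_issues(items: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Map each URL to the issue labels that apply to it."""
    # Count how often each non-empty description appears across all items
    description_counts = Counter(
        i["description"] for i in items if i.get("description")
    )
    report = {}
    for item in items:
        issues = []
        if len(item.get("title") or "") > TITLE_LIMIT:
            issues.append("title-too-long")
        if not item.get("description"):
            issues.append("missing-description")
        elif description_counts[item["description"]] > 1:
            issues.append("duplicate-description")
        if not item.get("canonical"):
            issues.append("missing-canonical")
        # Accept either the nested og.image or the flat ogImage field
        og_image = (item.get("og") or {}).get("image") or item.get("ogImage")
        if not og_image:
            issues.append("missing-og-image")
        report[item["url"]] = issues
    return report
```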

Can I extract structured data (JSON-LD)?

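Yes. The feature list and the Output Example include a jsonLd array with the Schema.org objects found on the page (for example Product or Article), alongside Open Graph and Twitter Card metadata.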

Complete Your Website Health Audit

Website Health Suite — Build a comprehensive compliance and trust monitoring workflow:

1. Link & URL Health

2. SEO & Metadata Quality (you are here)

3. Security & Email Deliverability

4. Historical Data & Recovery

Recommended workflow: Weekly meta tag audit → Fix truncated titles and missing OG images → Validate structured data with Schema Validator → Monitor canonical tags → Track social sharing performance.

Cost

Pay Per Event:

  • actor-start: $0.01 (flat fee per run)
  • dataset-item: $0.003 per output item

Example: 1,000 items = $0.01 + (1,000 × $0.003) = $3.01

No subscription required — you only pay for what you use.
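
The pricing example above generalizes to any number of items and runs. Here is a small estimator, assuming the two event prices listed in this section stay as shown. `estimate_cost` is a hypothetical helper, and `Decimal` is used to avoid floating-point rounding in currency amounts.

```python
from decimal import Decimal

ACTOR_START_FEE = Decimal("0.01")   # flat fee per run, from the price list above
PRICE_PER_ITEM = Decimal("0.003")   # per output item, from the price list above


def estimate_cost(items: int, runs: int = 1) -> Decimal:
    """Estimated charge in USD for a number of output items across runs."""
    return runs * ACTOR_START_FEE + items * PRICE_PER_ITEM


print(estimate_cost(1000))  # 3.010
```

Splitting the same 1,000 items across two runs adds one more start fee, giving $3.02.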

⭐ Was this helpful?

If this actor saved you time, please leave a ★ rating on Apify Store. It takes 10 seconds, helps other developers discover it, and keeps updates free.

Bug report or feature request? Open an issue on the Issues tab of this actor.