Universal Metadata Extractor

Pricing

Pay per usage

An Apify Actor that accepts a single URL and extracts two types of structured data from the page, metadata and website contact information, using plain HTTP requests.

Developer

Umair Butt

Maintained by Community

2 days ago

Last modified

Universal Metadata & Contact Info Extractor

An Apify Actor that accepts a single URL and extracts two types of structured data from the page using plain HTTP requests (no browser required).

What it does

📋 Metadata extraction

| Field | Description |
|---|---|
| title | Page <title> tag |
| description | <meta name="description"> |
| keywords | <meta name="keywords"> (split into array) |
| og:title | Open Graph title |
| og:description | Open Graph description |
| og:image | Open Graph image URL (resolved to absolute) |
| og:url | Open Graph URL |
| og:type | Open Graph type |
| og:site_name | Open Graph site name |
| twitter:title | Twitter Card title |
| twitter:description | Twitter Card description |
| twitter:image | Twitter Card image |
| twitter:card | Twitter Card type |
| canonical | <link rel="canonical"> href |
| robots | <meta name="robots"> |
| author | <meta name="author"> |
| viewport | <meta name="viewport"> |
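To make the field list concrete, here is a minimal sketch of how these values could be read from a page. It is not the Actor's actual code: it swaps in the standard library's html.parser for BeautifulSoup + lxml so that it runs without dependencies, and the names FIELDS, MetadataParser, and extract_metadata are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Field names follow the table above; everything else is an assumption.
FIELDS = [
    "title", "description", "keywords",
    "og:title", "og:description", "og:image", "og:url", "og:type", "og:site_name",
    "twitter:title", "twitter:description", "twitter:image", "twitter:card",
    "canonical", "robots", "author", "viewport",
]


class MetadataParser(HTMLParser):
    """Collects <title>, <meta>, and <link rel="canonical"> values."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.metadata = {name: "" for name in FIELDS}
        self.metadata["keywords"] = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = {key: (value or "") for key, value in attrs}
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Open Graph tags usually use property=, most others use name=.
            key = (attrs.get("property") or attrs.get("name") or "").lower()
            content = attrs.get("content", "").strip()
            if key == "keywords":
                self.metadata["keywords"] = [
                    word.strip() for word in content.split(",") if word.strip()
                ]
            elif key == "og:image" and content:
                # Resolve relative image paths against the page URL.
                self.metadata[key] = urljoin(self.base_url, content)
            elif key in self.metadata and key not in ("title", "canonical"):
                self.metadata[key] = content
        elif tag == "link" and "canonical" in attrs.get("rel", "").lower().split():
            self.metadata["canonical"] = attrs.get("href", "")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.metadata["title"] += data


def extract_metadata(html, base_url):
    """Return a dict shaped like the "metadata" object in the output."""
    parser = MetadataParser(base_url)
    parser.feed(html)
    parser.close()
    parser.metadata["title"] = parser.metadata["title"].strip()
    return parser.metadata
```

Fields that are missing from the page stay as empty strings (or an empty list for keywords), which matches the shape of the output example in this README.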

📞 Contact Info extraction

| Field | Description |
|---|---|
| emails | All unique email addresses found on the page |
| phone_numbers | Phone numbers in E.164 format (e.g. +14155552671) |
| social_links | Array of {"platform": "...", "url": "..."} objects |
| contact_page | URL of a detected contact/about/support page |
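As an illustration of two of these fields, the sketch below collects unique email addresses with a regular expression and picks the first link whose path looks like a contact, about, or support page. The pattern, the keyword list, the empty-string fallback, and the names extract_emails and find_contact_page are all assumptions rather than the Actor's real logic, and the code again uses only the standard library.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# A deliberately simple pattern; real-world email matching is messier.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Path keywords taken from the field description above.
CONTACT_HINTS = ("contact", "about", "support")


class LinkCollector(HTMLParser):
    """Gathers the href of every <a> tag in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def extract_emails(html):
    """Return unique, lower-cased addresses in first-seen order."""
    seen = {}
    for address in EMAIL_RE.findall(html):
        seen.setdefault(address.lower(), None)
    return list(seen)


def find_contact_page(html, base_url):
    """Return the first link whose path contains a contact hint, else ""."""
    collector = LinkCollector()
    collector.feed(html)
    for href in collector.hrefs:
        if href.startswith(("mailto:", "tel:", "#")):
            continue
        absolute = urljoin(base_url, href)
        path = urlparse(absolute).path.lower()
        if any(hint in path for hint in CONTACT_HINTS):
            return absolute
    return ""
```

Lower-casing before de-duplication treats Info@Example.com and info@example.com as the same address, which is one reasonable reading of "unique".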

Input

The input has a single field, url, which holds the address of the page to process:

{
  "url": "https://example.com"
}

Output example

{
  "url": "https://example.com",
  "metadata": {
    "title": "Example Domain",
    "description": "This domain is for use in illustrative examples.",
    "keywords": [],
    "og:title": "",
    "og:description": "",
    "og:image": "",
    "og:url": "",
    "og:type": "",
    "og:site_name": "",
    "twitter:title": "",
    "twitter:description": "",
    "twitter:image": "",
    "twitter:card": "",
    "canonical": "https://example.com/",
    "robots": "",
    "author": "",
    "viewport": ""
  },
  "contacts": {
    "emails": ["info@example.com"],
    "phone_numbers": ["+14155552671"],
    "social_links": [
      {"platform": "Twitter/X", "url": "https://x.com/example"},
      {"platform": "LinkedIn", "url": "https://linkedin.com/company/example"}
    ],
    "contact_page": "https://example.com/contact"
  }
}
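Because absent fields come back as empty strings or empty lists, a consumer may want to drop them before storing a record. The following is a small sketch of that step, assuming a record shaped like the example above; the function name non_empty_metadata is hypothetical.

```python
import json


def non_empty_metadata(record):
    """Keep only the metadata fields that actually carry a value."""
    return {
        key: value
        for key, value in record["metadata"].items()
        if value not in ("", [], None)
    }


# A trimmed-down record in the same shape as the output example.
raw = """
{
  "url": "https://example.com",
  "metadata": {
    "title": "Example Domain",
    "keywords": [],
    "og:title": "",
    "canonical": "https://example.com/"
  },
  "contacts": {"emails": ["info@example.com"]}
}
"""
record = json.loads(raw)
print(non_empty_metadata(record))
```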

Technical details

  • HTTP client: httpx with HTTP/2 support and realistic browser headers
  • HTML parser: BeautifulSoup + lxml
  • Phone parsing: Google's libphonenumber via the phonenumbers Python library
  • Social platforms detected: Twitter/X, LinkedIn, Facebook, Instagram, YouTube, GitHub, TikTok, Pinterest, Reddit, Telegram, WhatsApp, Discord, Medium, Threads, Vimeo, Tumblr, Twitch
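The social platform detection in the list above could plausibly work by matching link hostnames against a lookup table. The sketch below shows that idea with the standard library only. The domain list is an assumption (the README names the platforms but not the domains it checks), and SOCIAL_HOSTS and classify_social_link are hypothetical names.

```python
from urllib.parse import urlparse

# Platform labels follow this README; the domains are assumed.
SOCIAL_HOSTS = {
    "twitter.com": "Twitter/X",
    "x.com": "Twitter/X",
    "linkedin.com": "LinkedIn",
    "facebook.com": "Facebook",
    "instagram.com": "Instagram",
    "youtube.com": "YouTube",
    "github.com": "GitHub",
    "tiktok.com": "TikTok",
    "pinterest.com": "Pinterest",
    "reddit.com": "Reddit",
    "t.me": "Telegram",
    "wa.me": "WhatsApp",
    "discord.gg": "Discord",
    "discord.com": "Discord",
    "medium.com": "Medium",
    "threads.net": "Threads",
    "vimeo.com": "Vimeo",
    "tumblr.com": "Tumblr",
    "twitch.tv": "Twitch",
}


def classify_social_link(url):
    """Return {"platform", "url"} for a known social host, else None."""
    host = (urlparse(url).hostname or "").lower()
    parts = host.split(".")
    # Try the full host first, then each parent domain, so that
    # www.linkedin.com matches linkedin.com but notx.com does not match x.com.
    for start in range(len(parts) - 1):
        candidate = ".".join(parts[start:])
        if candidate in SOCIAL_HOSTS:
            return {"platform": SOCIAL_HOSTS[candidate], "url": url}
    return None
```

Matching on whole domain labels rather than substrings avoids false positives from unrelated hosts that merely contain a platform name.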