Metadata Extractor

Developed by

Jan Čurn

A small, efficient actor that loads a web page, parses its HTML using the Cheerio library, and extracts metadata from the <head> element, such as the page title, description, and author.

0.0 (0)

Pricing: Pay per usage
Total users: 1.3k
Monthly users
Runs succeeded: 87%
Last modified: 2 years ago

Developer tools

Open source

You can access the Metadata Extractor programmatically from your own applications by using the Apify API. To use the Apify API, you’ll need an Apify account and your API token, which you can find under Integrations in Apify Console. Examples are available for Python, JavaScript, the CLI, OpenAPI, HTTP, and MCP; the HTTP version, using curl, follows.

# Set API token
API_TOKEN='<YOUR_API_TOKEN>'

# Prepare Actor input
cat > input.json << 'EOF'
{
  "urls": [
    "https://www.apify.com/",
    "https://blog.apify.com"
  ],
  "proxy": {
    "useApifyProxy": true
  }
}
EOF

# Run the Actor using an HTTP API
# See the full API reference at https://docs.apify.com/api/v2
curl "https://api.apify.com/v2/acts/jancurn~extract-metadata/runs?token=$API_TOKEN" \
  -X POST \
  -d @input.json \
  -H 'Content-Type: application/json'

Metadata Extractor API

Below, you can find a list of relevant HTTP API endpoints for calling the Metadata Extractor Actor. For this, you’ll need an Apify account. Replace <YOUR_API_TOKEN> in the URLs with your Apify API token, which you can find under Integrations in Apify Console. For details, see the API reference.

Run Actor

POST https://api.apify.com/v2/acts/jancurn~extract-metadata/runs?token=<YOUR_API_TOKEN>

Note: By adding the method=POST query parameter, this API endpoint can be called using a GET request and thus used in third-party webhooks. Please refer to our Run Actor API documentation.
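The GET variant described in this note can be sketched as a small URL builder. This is only an illustration: the helper name run_actor_get_url is hypothetical, and the sketch sends no input payload. Only the token and method=POST query parameters come from the text above.

```shell
# Hypothetical helper: build a Run Actor URL that can be triggered with a plain
# GET request, for example from a third-party webhook that cannot send POST.
run_actor_get_url() {
  # $1 = Apify API token
  printf 'https://api.apify.com/v2/acts/jancurn~extract-metadata/runs?token=%s&method=POST\n' "$1"
}

# Print the URL for a placeholder token; replace the placeholder before use.
run_actor_get_url '<YOUR_API_TOKEN>'
```

The printed URL can then be pasted into the webhook configuration of the third-party service.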

Run Actor synchronously and get dataset items

POST https://api.apify.com/v2/acts/jancurn~extract-metadata/run-sync-get-dataset-items?token=<YOUR_API_TOKEN>

Note: This endpoint supports both POST and GET request methods. However, only the POST method allows you to pass input data. For more information, please refer to our Run Actor synchronously and get dataset items API documentation.
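A synchronous call might look like the sketch below. It assumes that API_TOKEN is set and that an input.json file with the Actor input exists in the current directory. The helper names run_sync_url and run_sync_to_file are hypothetical and not part of any Apify tooling.

```shell
# Hypothetical helper: build the synchronous run URL for a given token.
run_sync_url() {
  # $1 = Apify API token
  printf 'https://api.apify.com/v2/acts/jancurn~extract-metadata/run-sync-get-dataset-items?token=%s\n' "$1"
}

# Hypothetical helper: POST the input and save the returned dataset items.
# Assumes API_TOKEN is set and input.json exists in the current directory.
run_sync_to_file() {
  # $1 = output file; --fail makes curl exit non-zero on HTTP error responses
  curl --fail --silent --show-error "$(run_sync_url "$API_TOKEN")" \
    -X POST \
    -H 'Content-Type: application/json' \
    -d @input.json \
    -o "$1"
}

# Usage (not executed here, because it starts a real run):
#   run_sync_to_file items.json
```

Because the response body is the dataset content itself, no second request is needed to fetch the results.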

Get Actor

GET https://api.apify.com/v2/acts/jancurn~extract-metadata?token=<YOUR_API_TOKEN>

For more information, please refer to our Get Actor API documentation.

Actors can be used to scrape web pages, extract data, or automate browser tasks. You can use the Metadata Extractor programmatically via the Apify API.

You can choose from:

Metadata Extractor API in Python

Metadata Extractor API in JavaScript

Metadata Extractor API through CLI

Metadata Extractor OpenAPI definition

You can start Metadata Extractor with the Apify API by sending an HTTP POST request to the Run Actor endpoint. An Actor’s input and its content type can be passed as the payload of the POST request, and additional options can be specified using URL query parameters. The Metadata Extractor is identified within the API by its ID, which combines the creator’s username and the name of the Actor (jancurn~extract-metadata in the URLs above).
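How the Actor ID and the optional query parameters fit into the URL can be sketched as follows. The helper name run_actor_url is hypothetical, and the timeout and memory options are used here only as assumed examples of run options; check the Run Actor API documentation for the parameters that are actually supported.

```shell
# Hypothetical helper: build the Run Actor URL from the ID parts plus options.
# The Actor ID has the form "<username>~<actor-name>".
run_actor_url() {
  # $1 = username, $2 = Actor name, $3 = API token, $4 = extra query string (optional)
  url="https://api.apify.com/v2/acts/$1~$2/runs?token=$3"
  if [ -n "${4:-}" ]; then
    url="$url&$4"
  fi
  printf '%s\n' "$url"
}

# Example with assumed run options (timeout in seconds, memory in megabytes).
run_actor_url jancurn extract-metadata '<YOUR_API_TOKEN>' 'timeout=60&memory=256'
```

The resulting URL is used as the target of the POST request, with the input JSON sent as the request body and Content-Type set to application/json.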

When the Metadata Extractor run finishes, you can list the data from its default dataset (storage) via the API, or you can preview the data directly in Apify Console.
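Listing the data from the default dataset could be sketched like this. The sketch assumes that jq is installed, that API_TOKEN is set, and that input.json holds the Actor input. The helper names are hypothetical, and the use of waitForFinish and of the defaultDatasetId field in the run response should be verified against the API reference.

```shell
# Hypothetical helper: build the URL that lists the items of a dataset as JSON.
dataset_items_url() {
  # $1 = dataset ID, $2 = Apify API token
  printf 'https://api.apify.com/v2/datasets/%s/items?token=%s&format=json\n' "$1" "$2"
}

# Hypothetical helper: start a run, read its default dataset ID, fetch the items.
# Assumes jq is installed, API_TOKEN is set, and input.json exists.
fetch_run_items() {
  # waitForFinish asks the API to wait up to 60 seconds for the run to finish;
  # a longer run may still be in progress, so the dataset can be incomplete.
  dataset_id=$(curl --fail --silent --show-error \
    "https://api.apify.com/v2/acts/jancurn~extract-metadata/runs?token=$API_TOKEN&waitForFinish=60" \
    -X POST -H 'Content-Type: application/json' -d @input.json \
    | jq -r '.data.defaultDatasetId')
  curl --fail --silent --show-error "$(dataset_items_url "$dataset_id" "$API_TOKEN")"
}

# Usage (not executed here, because it starts a real run):
#   fetch_run_items > items.json
```

For runs that take longer than the wait period, polling the run status before fetching the items would be the safer design.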

Meta Data Extractor

dainty_screw/metadata-extractor-reliable-web-page-metadata-extraction

Metadata Extractor is your go-to tool for extracting metadata from web pages. Using Cheerio, it parses HTML to extract titles, descriptions, authors, and more. Perfect for content managers and SEO experts.

codemaster devops

Metadata Scraper

louisdeconinck/metadata-scraper

Automatically scrape metadata such as title, description, heading, and article from websites. It crawls the start URLs and then scrapes the metadata from the detail pages, automatically navigating through the pagination.

Louis Deconinck

Website Metadata Extractor (meta tags, sitemap, robots)

powerful_bachelor/website-metadata-extractor

🔍 Website Metadata Extractor 🌐 Extract essential website data: meta tags, robots.txt, and sitemap.xml in one scan. 📊 Analyze SEO elements, crawler directives, and site structure. ✅ Perfect for SEO audits, 🔎 competitor research, and 🚀 understanding how search engines view your website.

Powerful Bachelor

URL Metadata Crawler

easyapi/url-metadata-crawler

Extract comprehensive metadata from web pages. Gather vital information like meta tags, favicons, Open Graph tags, and more, all while enjoying flexible options for customization. Perfect for SEO specialists, developers, and content creators looking to enhance their web presence! 🌐

EasyApi

URL to Metadata

njoylab/url-summary-scraper

A powerful Apify actor that extracts essential website information, including title, description, images, and social media links. Perfect for quick data gathering and insights from any URL.

njoylab

Example Sitemap Cheerio

jancurn/example-sitemap-cheerio

An example actor that first downloads a sitemap in XML format and then crawls each page from the sitemap using the fast CheerioCrawler from the Apify SDK.

Jan Čurn

Favicon Scraper & Archiver

embion/favicon-scraper-archiver

Automatically discover, download, and archive favicons from a list of websites, ensuring you get the icons you need in a clean and organized manner. Supported formats: SVG, PNG, ICO.

Embion

Apify public actor scraper

eloquent_mountain/apify-public-actor-scraper

Apify Store Actor Scraper. Efficiently scrape comprehensive data from Apify Store actors using Selenium. Extracts key information such as URL, Title, Creator, Description, Users, and Stars.

Paco

Metadata Scraper

autofacts/metadata-scraper

A powerful web scraper that extracts various types of structured metadata from web pages, including JSON-LD, Microdata, Open Graph, Twitter Cards, and more. Perfect for SEO analysis, content aggregation, and research purposes.

Autofactor

Web Accessibility Scanner

accessibility_team/a11y-scanner-public

Looking for an Accessibility Checker API for WCAG compliance? Our tool scans per the latest guidelines, offering login-screen bypass, detailed error reports, and automated scanning. Ideal for inclusive design and accessibility testing. Ensure your site meets accessibility standards.