Pricing

Pay per usage

Metadata Extractor

A small, efficient Actor that loads a web page, parses its HTML using the Cheerio library, and extracts metadata from the <head> element, such as the page title, description, and author.
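To make the description above concrete, here is a minimal, dependency-free sketch of the same idea: isolate the <head> element, read the title, and collect the meta tags. It is not the Actor's code. The Actor uses Cheerio, while this sketch swaps in regular expressions so it runs without any packages, and the function name extractMetadata and the shape of its return value are assumptions made for illustration.

```javascript
// Hypothetical sketch, not the Actor's implementation: the Actor parses HTML
// with Cheerio, while this version uses regular expressions to stay
// dependency-free. Regex parsing is fragile on malformed HTML.
function extractMetadata(html) {
    // Isolate the <head> element (\b prevents matching <header>)
    const headMatch = /<head\b[^>]*>([\s\S]*?)<\/head>/i.exec(html);
    const head = headMatch ? headMatch[1] : '';

    const titleMatch = /<title\b[^>]*>([\s\S]*?)<\/title>/i.exec(head);

    const meta = {};
    for (const [tag] of head.matchAll(/<meta\b[^>]*>/gi)) {
        // Collect the quoted attributes of this <meta> tag, in any order
        const attrs = {};
        for (const [, name, dq, sq] of tag.matchAll(/([\w:-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')/g)) {
            attrs[name.toLowerCase()] = dq !== undefined ? dq : sq;
        }
        // Standard tags use "name"; Open Graph tags use "property"
        const key = attrs.name || attrs.property;
        if (key && attrs.content !== undefined) {
            meta[key] = attrs.content;
        }
    }

    return { title: titleMatch ? titleMatch[1].trim() : null, meta };
}
```

A real HTML parser such as Cheerio handles unquoted attributes, comments, and broken markup that this sketch does not, which is a good reason to prefer one in production code.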

Rating

0.0 (0)

Developer

Jan Čurn

Actor stats

Bookmarked

Total users: 1.4K

Monthly active users

Last modified: 3 years ago

Categories

Developer tools

Open source

You can access the Metadata Extractor programmatically from your own applications through the Apify API. To use the API, you need an Apify account and your API token, which you can find in the Integrations settings in Apify Console. Examples are available for Python, JavaScript, CLI, OpenAPI, HTTP, and MCP; the JavaScript example follows.

import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "urls": [
        "https://www.apify.com/",
        "https://blog.apify.com"
    ],
    "proxy": {
        "useApifyProxy": true
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("jancurn/extract-metadata").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs
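When the URL list comes from user input or a file, it can help to validate it before starting a run. The following is a minimal sketch of a helper that builds the same input object as the example above. The helper name buildInput is hypothetical, and the validation and de-duplication rules are assumptions about what is useful, not requirements stated by the Actor.

```javascript
// Hypothetical helper (not part of apify-client): validates URLs and builds
// an input object with the same "urls" and "proxy" fields as the example.
function buildInput(urls, useApifyProxy = true) {
    const valid = [];
    for (const candidate of urls) {
        let parsed;
        try {
            parsed = new URL(candidate);
        } catch {
            throw new TypeError(`Not a valid absolute URL: ${candidate}`);
        }
        if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
            throw new TypeError(`Unsupported protocol in URL: ${candidate}`);
        }
        // href is the normalized form, e.g. a bare origin gains a trailing "/"
        valid.push(parsed.href);
    }
    // Remove duplicates while preserving the original order
    return { urls: [...new Set(valid)], proxy: { useApifyProxy } };
}
```

The returned object can be passed directly to the call(input) method used in the example. Normalizing through the URL class means that "https://blog.apify.com" and "https://blog.apify.com/" count as the same page.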

Metadata Extractor API in JavaScript

The Apify API client for JavaScript is the official library for using the Metadata Extractor API from JavaScript or TypeScript. It provides convenience functions and automatic retries on errors.

Install the apify-client

$ npm install apify-client

Other API clients include:

Metadata Extractor API in Python

Metadata Extractor API through CLI

Metadata Extractor OpenAPI definition

Metadata Extractor API
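If you would rather not install a client library, the Actor can also be started over plain HTTP. The sketch below prepares a request for the Apify API endpoint that runs an Actor synchronously and returns its dataset items. The helper name buildRunRequest is hypothetical, and the sketch assumes a runtime with a global fetch function, such as a recent Node.js version.

```javascript
// Hypothetical helper: prepares the URL and fetch options for running the
// Actor synchronously and receiving its dataset items in the response.
function buildRunRequest(token, input) {
    // In API paths, "~" replaces "/" between the username and the Actor name
    const actorId = 'jancurn/extract-metadata'.replace('/', '~');
    const url = `https://api.apify.com/v2/acts/${actorId}/run-sync-get-dataset-items`;
    return {
        url,
        options: {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
                Authorization: `Bearer ${token}`,
            },
            body: JSON.stringify(input),
        },
    };
}

// Usage sketch (performs a real network call, so it is left commented out):
// const { url, options } = buildRunRequest('<YOUR_API_TOKEN>', {
//     urls: ['https://www.apify.com/'],
//     proxy: { useApifyProxy: true },
// });
// const response = await fetch(url, options);
// const items = await response.json();
```

The synchronous endpoint suits short runs. For longer runs, the client library's pattern of starting the run and then reading the dataset is likely the safer choice.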

Meta Data Extractor

dainty_screw/metadata-extractor-reliable-web-page-metadata-extraction

Metadata Extractor is your go-to tool for extracting metadata from web pages. Using Cheerio, it parses HTML to extract titles, descriptions, authors, and more. Perfect for content managers and SEO experts.

codemaster devops

5.0

Cheerio Scraper

apify/cheerio-scraper

Crawls websites using raw HTTP requests, parses the HTML with the Cheerio library, and extracts data from the pages using Node.js code. Supports both recursive crawling and lists of URLs. This Actor is a high-performance alternative to apify/web-scraper for websites that do not require JavaScript.

Apify

18K

4.6

Meta Tags Scraper

rl1987/meta-tags-scraper

Web page metadata scraper.

R.L.

Web Page Metadata Extractor — Title, OG Tags, Author & More

maged120/get-metadata

Extract all metadata from any web page in one request — title, meta description, Open Graph tags, Twitter Card data, canonical URL, author, publish date, and more.

Maged

Price Drop Tracker - Monitor Any E-commerce Product

alizarin_refrigerator-owner/price-drop-tracker---monitor-any-e-commerce-product

Actor for scraping data from a single web page. The URL of the web page is passed in via input, as defined by the input schema. It uses the Axios client to fetch the page's HTML and the Cheerio library to parse data from it. The data is then stored in a dataset, where you can easily access it.

The Howlers

Noon Product Info Scraper

getdataforme/noon-productInfo-scraper

Project Cheerio Crawler Typescript is a web scraping tool that extracts detailed product data from e-commerce sites using the Cheerio library....

GetDataForMe

Meta Ad Library Page Resolver

adside/meta-ad-library-page-resolver

Find a brand's Meta & Facebook advertiser pages in the Meta Ad Library by name. Get the page ID, profile, likes, verification & Instagram handle.

Adside

Redwoodcity Profile Scraper

getdataforme/redwoodcity-profile-scraper

Project Cheerio Crawler Typescript is a web scraping tool using the Cheerio library to efficiently extract structured data from multiple web pages....

GetDataForMe

Metascraper — Web Metadata Extractor

ntriqpro/metascraper-actor

Extract structured metadata (title, description, author, image, publisher, date) from any web page using the metascraper library.

daehwan kim

Universal Metadata Extractor

umair.log/universal-metadata-extractor

An Apify Actor that accepts a single URL and extracts two types of structured data from the page, Meta Data and website Contacts using plain HTTP requests,