Pricing

Pay per usage

Metadata Extractor

Developed by

Jan Čurn

A small, efficient actor that loads a web page, parses its HTML using the Cheerio library, and extracts metadata from the <HEAD> tag, such as the page title, description, and author.

Total users

1.3K

Monthly users

Runs succeeded

86%

Last modified

2 years ago

Developer tools

Open source

The actor takes a list of web page URLs as input, loads the HTML of each page, and extracts metadata from it. The results are stored as JSON in the default dataset.

For example, for https://www.apify.com, the JSON result looks as follows:

{
    "url": "https://www.apify.com/",
    "title": "Web Scraping, Data Extraction and Automation · Apify",
    "meta": {
        "X-UA-Compatible": "IE=edge,chrome=1",
        "viewport": "width=device-width,minimum-scale=1,initial-scale=1",
        "copyright": "Copyright&copy; 2019 Apify Technologies s.r.o. All rights reserved.",
        "keywords": "web scraper, web crawler, scraping, data extraction, API",
        "robots": "index,follow",
        "referrer": "origin",
        "googlebot": "index,follow",
        "description": "Apify extracts data from websites, crawls lists of URLs and automates workflows on the web. Turn any website into an API in a few minutes!",
        "twitter:card": "summary_large_image",
        "twitter:creator": "@apify",
        "fb:app_id": "1636933253245869",
        "og:url": "https://apify.com/",
        "og:type": "website",
        "og:title": "Web Scraping, Data Extraction and Automation · Apify",
        "og:description": "Apify extracts data from websites, crawls lists of URLs and automates workflows on the web. Turn any website into an API in a few minutes!",
        "og:image": "https://apify.com/img/og-image.png",
        "og:image:alt": "Apify",
        "og:image:width": "1200",
        "og:image:height": "630",
        "og:locale": "en_IE",
        "og:site_name": "Apify",
        "next-head-count": "19"
    }
}
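
The extraction step described above can be illustrated with a minimal sketch. The actor itself uses Cheerio; this sketch swaps in Python's standard-library `html.parser` so it runs without dependencies, and it is not the actor's actual code. The names `HeadMetadataParser`, `extract_metadata`, and `SAMPLE_HTML` are hypothetical, and the choice to key each `<meta>` tag by `name`, then `property`, then `http-equiv` is an assumption inferred from the sample output.

```python
import json
from html.parser import HTMLParser


class HeadMetadataParser(HTMLParser):
    """Collects the <title> text and <meta> key/content pairs found inside <head>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.title = ""
        self.meta = {}
        self._in_head = False
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "head":
            self._in_head = True
        elif tag == "title" and self._in_head:
            self._in_title = True
        elif tag == "meta" and self._in_head:
            # Assumption: key by name, then property (Open Graph), then http-equiv.
            key = attrs.get("name") or attrs.get("property") or attrs.get("http-equiv")
            if key and attrs.get("content") is not None:
                self.meta[key] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "head":
            self._in_head = False
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.title += data


def extract_metadata(url, html):
    """Returns a dict shaped like the sample result: url, title, and meta."""
    parser = HeadMetadataParser()
    parser.feed(html)
    parser.close()
    return {"url": url, "title": parser.title.strip(), "meta": parser.meta}


# Hypothetical input page, used only for illustration.
SAMPLE_HTML = """
<html><head>
<title>Example page</title>
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<meta name="robots" content="index,follow">
<meta property="og:type" content="website">
</head><body><p>Hello</p></body></html>
"""

result = extract_metadata("https://www.example.com/", SAMPLE_HTML)
print(json.dumps(result, indent=4))
```

The sketch parses an HTML string that is already in memory; fetching each URL from the input list and writing the result to a dataset are left out.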

Meta Data Extractor

dainty_screw/metadata-extractor-reliable-web-page-metadata-extraction

Metadata Extractor is your go-to tool for extracting metadata from web pages. Using Cheerio, it parses HTML to extract titles, descriptions, authors, and more. Perfect for content managers and SEO experts.

codemaster devops

Metadata Scraper

louisdeconinck/metadata-scraper

Automatically scrape metadata such as the title, description, heading, and article from websites. It crawls the start URLs and then scrapes the metadata from the detail pages, automatically navigating through the pagination.

Louis Deconinck

5.0

Simple SEO Data Extractor

onescales/simple-seo-data-extractor

Grab SEO data from any webpage / URL and export the URL, Title Tag, Meta Description, Meta Keywords, Status Code, Canonical Tag and Meta Robots easily. Run the scraper for 1-100,000 pages. Run one time or on schedule or via API.

One Scales

5.0

General Purpose Web Scraping and Metadata Extraction

moving_beacon-owner1/my-actor-10

This project uses the Apify platform to scrape data from web pages, collect metadata, and store results in an Apify dataset. It features functions for managing date ranges, encoding identifiers, and handling large datasets, aiming to efficiently extract and store structured data for analysis.

Jamshaid Arif

Single page web scraping

krishnapada.m.99/single-page-web-scraping

Scrapes the <title> tag or H1 tag from a single webpage provided by the user. Useful for SEO audits or content previews.

Somnath Mandal

Metadata Scraper

autofacts/metadata-scraper

A powerful web scraper that extracts various types of structured metadata from web pages, including JSON-LD, Microdata, Open Graph, Twitter Cards, and more. Perfect for SEO analysis, content aggregation, and research purposes.

Autofactor

5.0

Website Metadata Extractor (meta tags, sitemap, robots) 🔎

powerful_bachelor/website-metadata-extractor

🔍 Website Metadata Extractor 🌐 Extract essential website data: meta tags, robots.txt, and sitemap.xml in one scan. 📊 Analyze SEO elements, crawler directives, and site structure. ✅ Perfect for SEO audits, 🔎 competitor research, and 🚀 understanding how search engines view your website.

Powerful Bachelor

Ai Ready Web Page To Markdown Converter

mustafa.irshaid.113/ai-ready-web-page-to-markdown-converter

Convert any webpage into structured Markdown and HTML using just a URL. Get the page title, link, and content—perfect for SEO, devs, and AI crawlers. Fast, clean, and ideal for repurposing or analysis. Start turning websites into Markdown instantly.

Mustafa Irshaid

Website to MarkDown (AI-Ready)

mintii/website-to-markdown-ai-ready

Use this to scrape web pages for use with AI tools and LLMs.

Martin from Mintii

URL Metadata Crawler

easyapi/url-metadata-crawler

Extracts comprehensive metadata from web pages. Gather vital information like meta tags, favicons, Open Graph tags, and more, all while enjoying flexible options for customization. Perfect for SEO specialists, developers, and content creators looking to enhance their web presence! 🌐