Pricing

from $3.72 / 1,000 pages

Website Content Extractor

Crawl public pages and extract page titles, meta descriptions, headings, readable text, source URLs, and crawl metadata.

Rating

0.0 (0)

Developer

Ushba Khan

Actor stats

Last modified: 25 days ago

What You Get

page title, meta description, H1/H2/H3 headings, readable body text, and final URL
crawl depth controls, same-domain filtering, and page limit settings
clean content rows for SEO audits, research, AI preprocessing, and archiving
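The crawl controls listed above (depth, same-domain filtering, page limit) can be illustrated with a minimal sketch. This is not the actor's implementation: `plan_crawl`, its parameters, and the in-memory `SITE` link map are hypothetical names used only to show how the three limits interact in a breadth-first crawl.

```python
from collections import deque
from urllib.parse import urlparse


def plan_crawl(start_url, links, max_depth=2, max_pages=100):
    """Return URLs in breadth-first visit order.

    `links` is a hypothetical map of page URL -> outgoing URLs that
    stands in for real HTTP fetches, so the sketch runs offline.
    """
    start_host = urlparse(start_url).netloc
    visited = []
    seen = {start_url}
    queue = deque([(start_url, 0)])

    # Page limit: stop as soon as enough pages have been visited.
    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        visited.append(url)

        # Depth control: do not follow links from pages at the depth limit.
        if depth >= max_depth:
            continue

        for target in links.get(url, []):
            if target in seen:
                continue
            # Same-domain filter: skip links that leave the start host.
            if urlparse(target).netloc != start_host:
                continue
            seen.add(target)
            queue.append((target, depth + 1))

    return visited


# Hypothetical site structure used only for illustration.
SITE = {
    "https://example.com/": [
        "https://example.com/a",
        "https://other.org/x",
        "https://example.com/b",
    ],
    "https://example.com/a": ["https://example.com/a/deep"],
    "https://example.com/a/deep": ["https://example.com/a/deeper"],
}

print(plan_crawl("https://example.com/", SITE, max_depth=1))
```

With `max_depth=1` the sketch visits the start page and its two same-domain links, and never queues the `other.org` link. Lowering `max_pages` cuts the visit list short regardless of depth.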

Best For

lead generation, research, monitoring, enrichment, and reporting workflows
exporting clean rows to CSV, Excel, JSON, APIs, CRMs, or automation tools
scheduled runs where predictable output and clear result limits matter

How To Use

Add the public start URLs and the crawl settings (depth, same-domain filtering, page limit) required by the input form.
Set the result limit to match the number of rows you want to pay for.
Run the actor once for a sample, then schedule it if you need monitoring.
Export the dataset or connect it to your workflow through the Apify API or integrations.
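The steps above can be sketched in a few lines: estimate the lowest likely cost for a given result limit, then build the URL for downloading the run's dataset through the Apify API. The helper names are hypothetical, and `DATASET_ID` is a placeholder. The cost figure assumes the listed "from $3.72 / 1,000 pages" starting rate applies linearly, so treat it as a lower bound; actual billing may differ. Private datasets also need an API token, which this sketch leaves out.

```python
from urllib.parse import urlencode

# Starting rate from the listing; an assumption for estimation only.
PRICE_PER_1000_PAGES = 3.72


def estimate_min_cost(page_limit):
    """Lower-bound cost in USD for `page_limit` pages at the listed rate."""
    return round(page_limit * PRICE_PER_1000_PAGES / 1000, 2)


def dataset_items_url(dataset_id, fmt="csv", limit=None):
    """Build an Apify API URL for downloading dataset items.

    `fmt` is the export format (for example "csv" or "json");
    `clean=true` asks the API to skip empty items and hidden fields.
    """
    params = {"format": fmt, "clean": "true"}
    if limit is not None:
        params["limit"] = str(limit)
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{urlencode(params)}"


print(estimate_min_cost(250))
print(dataset_items_url("DATASET_ID", fmt="json", limit=10))
```

Running a small sample first, as suggested above, keeps the estimate easy to check against the actual charge before scheduling larger runs.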

Output

The default dataset returns structured rows using the fields listed above. Empty, blocked, or failed targets are handled clearly so downstream tools can filter results without guessing.
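As an illustration of downstream filtering, the sketch below drops failed and empty rows from an exported dataset. The listing does not document the exact field names, so `url`, `title`, `text`, and `error` are assumptions; check a sample run and adjust the keys to match the real output.

```python
def usable_rows(rows, min_chars=1):
    """Keep rows that have no error marker and enough readable text.

    The keys "error" and "text" are assumed field names, not confirmed
    by the actor's documentation.
    """
    kept = []
    for row in rows:
        # Skip rows that were flagged as blocked or failed.
        if row.get("error"):
            continue
        # Skip rows whose text is missing, None, or only whitespace.
        text = (row.get("text") or "").strip()
        if len(text) < min_chars:
            continue
        kept.append(row)
    return kept


# Hypothetical sample rows used only for illustration.
sample = [
    {"url": "https://example.com/", "title": "Home", "text": "Welcome to the site."},
    {"url": "https://example.com/empty", "title": "", "text": "   "},
    {"url": "https://example.com/blocked", "error": "HTTP 403"},
]

for row in usable_rows(sample):
    print(row["url"])
```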

Notes

Works with public data that the target website exposes during the run.
Uses result caps and error handling to avoid runaway runs.
Private, login-only, or heavily blocked pages may return fewer rows than requested.

Sitemap Content Extractor

darknezz/sitemap-content-extractor

Crawl any website sitemap.xml and extract structured content from each page. Full-text extraction, metadata, headings, and word counts for SEO audits and content inventories.

Oaida Adrian

Crawl4ai

kael_odin/crawl4ai

Extract page content (markdown/HTML/text), metadata, and link stats. Uses crawl4ai.

Kael Odin

Website Content Miner

seeb/website-content-miner

Extract clean website content at scale: page titles, meta descriptions, H1-H3 headings, readable main text, and URLs. Includes smart noise removal, Readability fallback, optional internal crawling, and structured output for SEO audits, AI datasets, research, and automation.

Techionik

5.0

Webpage Content & Metadata Extractor

aetheragent/webpage-content-extractor

Extract the full content, metadata, and structure from any webpage. Get Open Graph tags, Twitter cards, JSON-LD structured data, meta tags, all images with alt text, headings hierarchy, and clean readable text. Perfect for content research, competitive analysis, and data collection.

Grant Mitchell

Website Metadata Extractor

scrapers-hub/website-metadata-extractor

Website metadata extractor to extract titles, descriptions, keywords, and meta tags from any website 🌐📊 Perfect for SEO analysis, auditing, and research. Fast, accurate, and scalable extraction.

Scrapers Hub

Website Crawler

elcon/website-crawler

Crawls a website starting from one or more URLs and extracts the title, meta description, headings and text from each page.

elcon software

Keywords Extractor

lukaskrivka/keywords-extractor

Use our free website keyword extractor to crawl any website and extract keyword counts on each page.

Lukáš Křivka

846

4.8

Website Content Crawler

parseforge/website-content-crawler

Crawl any website and pull clean Markdown content ready for AI! Follow links across a whole domain and extract page text, titles, headings, images, and metadata. Perfect for building RAG pipelines, training datasets, knowledge bases, and vector databases. Start crawling content in minutes!

ParseForge

Website URL Crawler & Link Extractor

maximedupre/website-url-crawler

Crawl JavaScript-rendered websites and export a URL link map. Get source pages, depth, anchor text, link type, HTTP metadata, and crawl status.

Maxime Dupré

SEO Audit Scraper

actorfordge/seo-audit-scraper

Crawl any website and run an on-page SEO audit. This Actor checks page titles, meta descriptions, H1/H2 headings, word count, canonical URLs, image alt text, internal links, external links, and gives each page an SEO score with clear issues found.