Pricing

Pay per usage

SEO Content Extraction

Extract SEO-ready content from public web pages with robots.txt checks, strict limits, SSRF protection, and clean structured output.

Developer: ping

Last modified: 2 months ago

What It Returns

Each dataset item contains:

url and finalUrl
HTTP statusCode and contentType
title
meta description
headings (h1, h2, h3)
cleaned text
normalized outbound links
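
To make the item shape concrete, here is a rough Python sketch of one dataset item. Only url, finalUrl, statusCode, contentType, and title are named in the list above; the keys metaDescription, headings (as a nested object), text, and links are assumptions made for illustration and may not match the Actor's real output.

```python
from typing import TypedDict


class Headings(TypedDict):
    h1: list[str]
    h2: list[str]
    h3: list[str]


class PageItem(TypedDict):
    url: str
    finalUrl: str
    statusCode: int
    contentType: str
    title: str
    metaDescription: str  # assumed key name
    headings: Headings    # assumed nesting by heading level
    text: str             # assumed key name for the cleaned text
    links: list[str]      # assumed key name for normalized outbound links


def heading_count(item: PageItem) -> int:
    """Return the total number of h1, h2, and h3 headings on a page."""
    return sum(len(item["headings"][level]) for level in ("h1", "h2", "h3"))


# Hypothetical item, written by hand for illustration only.
sample_item: PageItem = {
    "url": "https://example.com",
    "finalUrl": "https://example.com/",
    "statusCode": 200,
    "contentType": "text/html",
    "title": "Example Domain",
    "metaDescription": "",
    "headings": {"h1": ["Example Domain"], "h2": ["About", "Contact"], "h3": []},
    "text": "Example Domain ...",
    "links": ["https://www.iana.org/domains/example"],
}
```

A helper such as heading_count is the kind of lightweight check an SEO page inventory might run over each item.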

Example Input

{
  "startUrls": ["https://example.com"],
  "maxPages": 3,
  "maxDepth": 1,
  "sameDomainOnly": true,
  "respectRobotsTxt": true,
  "includeLinks": true,
  "textMaxChars": 4000
}

Input Notes

startUrls: 1 to 10 public HTTP/HTTPS URLs.
maxPages: 1 to 25 pages per run.
maxDepth: 0 to 3 link-following depth.
sameDomainOnly: enabled by default.
respectRobotsTxt: enabled by default.
includeHtml: disabled by default.
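
The documented ranges can be checked on the caller's side before starting a run. The sketch below is a hypothetical pre-flight validator, not the Actor's own validation code; the function name validate_input and the error messages are invented for illustration.

```python
from urllib.parse import urlparse

# Inclusive ranges taken from the input notes above.
LIMITS = {"maxPages": (1, 25), "maxDepth": (0, 3)}


def validate_input(actor_input: dict) -> list[str]:
    """Return a list of problems found in the input; empty means it looks valid."""
    errors = []

    start_urls = actor_input.get("startUrls", [])
    if not 1 <= len(start_urls) <= 10:
        errors.append("startUrls must contain 1 to 10 URLs")
    for url in start_urls:
        if urlparse(url).scheme not in ("http", "https"):
            errors.append(f"not an HTTP/HTTPS URL: {url}")

    for field, (low, high) in LIMITS.items():
        value = actor_input.get(field)
        # Missing fields are left to whatever defaults the Actor applies.
        if value is not None and not low <= value <= high:
            errors.append(f"{field} must be between {low} and {high}")

    return errors
```

Called with the example input shown earlier, this returns an empty list; an input with maxPages set to 30 would produce one error for that field.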

Good Uses

SEO page inventory
Title, meta description, and heading extraction
Lightweight content checks for public websites
RAG and agent data collection from public pages
Internal link discovery within a small site section

Security And Privacy

The Actor blocks:

localhost and private network targets
link-local and metadata IP targets
special-use hostnames such as .local and .internal
URLs with embedded credentials
shell/process/proxy override fields in input JSON
script-like input strings

The Actor does not accept custom proxy settings, shell commands, environment variables, worker URLs, or worker tokens from callers.
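
As an illustration of the URL-level rules listed above, here is a minimal sketch using only the Python standard library. It is not the Actor's implementation: the function name is_blocked_target is hypothetical, and the sketch only inspects the URL string. A production SSRF guard would also need to check the IP addresses a hostname resolves to, including after redirects.

```python
import ipaddress
from urllib.parse import urlsplit

# Special-use hostname suffixes named in the list above.
BLOCKED_SUFFIXES = (".local", ".internal")


def is_blocked_target(url: str) -> bool:
    """Return True if the URL should be rejected before any request is made."""
    parts = urlsplit(url)

    if parts.scheme not in ("http", "https"):
        return True
    # URLs with embedded credentials, e.g. http://user:pass@host/
    if parts.username is not None or parts.password is not None:
        return True

    host = (parts.hostname or "").lower()
    if not host or host == "localhost" or host.endswith(BLOCKED_SUFFIXES):
        return True

    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. This sketch stops here; resolved addresses
        # are not checked.
        return False

    # Covers loopback, private ranges, and link-local addresses such as
    # the 169.254.169.254 metadata endpoint.
    return ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved
```

For example, is_blocked_target("http://169.254.169.254/latest/meta-data") is True, while is_blocked_target("https://example.com") is False.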

Limitations

This is a public-page content extractor. It is not a browser automation Actor, does not render JavaScript-only content, and is not designed for login-only sites, CAPTCHA flows, anti-bot bypass, or high-volume harvesting.
