Pricing

Pay per usage

Sitemap Content Crawler

Rating

0.0

(0)

Developer

Donny Nguyen

Actor stats

Last modified

2 days ago

Sitemap Change Orchestrator

tri_angle/sitemap-change-orchestrator

Monitor website sitemaps for new, updated, or removed URLs. Integration with the Website Content Crawler (WCC) lets you feed it only the relevant URLs, so your web crawls stay efficient, targeted, and resource-optimized, and your datasets stay fresh for any application.

Tri⟁angle

Updated Content Checker

tomas.gabik/updated-content-checker

Monitors sitemaps for new/updated content. Returns only URLs modified since a specified date for efficient incremental scraping.

Tomáš Gabík

Website Content Crawler

apify/website-content-crawler

Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.

Apify

102K

4.5

(167)

Website Metadata Extractor (meta tags, sitemap, robots) 🔎

powerful_bachelor/website-metadata-extractor

🔍 Website Metadata Extractor 🌐 Extract essential website data: meta tags, robots.txt, and sitemap.xml in one scan. 📊 Analyze SEO elements, crawler directives, and site structure. ✅ Perfect for SEO audits, 🔎 competitor research, and 🚀 understanding how search engines view your website.

Powerful Bachelor

CNN Article Scraper

filip_cicvarek/cnn-article-scraper

Extract CNN articles by category or search query with date filtering. Scrape news from politics, business, world, tech, sports, and more. Get structured data: title, author, publication date, full content. Perfect for media monitoring, research, and content analysis.

Filip Cicvárek

5.0

(3)

Website Content Crawler

alizarin_refrigerator-owner/website-crawler

Crawl websites for SEO audits. Extracts HTML, title, meta tags, headings, links, and text content from pages. Features: automatic sitemap detection and parsing; metadata extraction (title, description, OG tags); heading structure (H1, H2, H3); internal and external link analysis; image extraction with alt text; word count.

John Rippy

SEO Data Extractor

nocodeventure/seo-data-extractor

Extract comprehensive SEO metadata, headings, links, images, Open Graph tags, Twitter Cards, and technical data from websites. Perfect for SEO audits, competitor analysis, and content optimization. Runs on the Apify platform with structured JSON output.

No-Code Venture

Website Content Crawler for LLM's

salesblaster-ai/website-content-crawler

Extract contact information and turn any website into clean, structured content ready for LLMs (e.g. AI lead magnets, RAG pipelines, and outbound personalization). Most web scrapers dump raw HTML or unstructured text. This crawler is purpose-built for LLMs and optimized for lead generation.

SalesBlaster AI

Website Content Crawler Rag

tropical_quince/website-content-crawler-rag

Donny Nguyen

Sqoosh Image Compressor

eunit/sqoosh-image-compressor

Optimize images for SEO with the Squoosh Actor on Apify. Batch compress, resize, and convert images to WebP, AVIF, and MozJPEG to boost site speed and Core Web Vitals. Automate high-performance image optimization for web scraping and developer workflows with ease.