Extract Sitemap Parser Url — URLs, Structure & Metadata
Pricing
Pay per usage
Extract URLs, structure, and metadata from XML sitemaps at scale with this Apify actor, which handles pagination and proxy rotation automatically. Suited to market research, competitive intelligence, and other data-driven work.
Developer

Donny Nguyen
Sitemap Parser - Extract All URLs from XML Sitemaps
Parse any XML sitemap and extract every URL with metadata (lastmod, changefreq, priority). Handles sitemap indexes, auto-discovers sitemaps from robots.txt, and classifies pages by type (blog, product, category, etc.).
What does Sitemap Parser do?
Give it a sitemap URL, or just a website URL, in which case it finds the sitemap automatically via robots.txt. It then parses the sitemap (including nested sitemap indexes) and returns every URL with its metadata. Each URL is enriched with domain info, path depth, and automatic page type classification.
Key features:
- ✅ Auto-discovers sitemaps from robots.txt
- ✅ Handles sitemap indexes (nested sitemaps)
- ✅ Extracts lastmod, changefreq, priority metadata
- ✅ Classifies pages: homepage, blog, product, category, support, etc.
- ✅ Filter/exclude URLs by pattern
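The discovery and parsing steps above can be sketched in a few lines of Python. This is an illustration of the general technique, not the actor's actual implementation: the function names (`sitemaps_from_robots`, `parse_sitemap`, `collect_pages`) and the `fetch` callable are hypothetical, and it assumes sitemaps that follow the standard sitemaps.org XML namespace.

```python
import re
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap and sitemap index documents.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
FIELDS = ("loc", "lastmod", "changefreq", "priority")


def sitemaps_from_robots(robots_txt: str) -> list[str]:
    """Return the URLs named in 'Sitemap:' lines of a robots.txt body."""
    return re.findall(r"(?im)^[ \t]*sitemap:[ \t]*(\S+)", robots_txt)


def parse_sitemap(xml_text: str) -> tuple[list[str], list[dict]]:
    """Split one sitemap document into child sitemap URLs and page entries."""
    root = ET.fromstring(xml_text)
    # A sitemap index lists further sitemaps under <sitemap><loc>.
    children = [
        loc.text.strip()
        for loc in root.iterfind(f"{SITEMAP_NS}sitemap/{SITEMAP_NS}loc")
        if loc.text
    ]
    # A URL set lists pages under <url>, with optional metadata fields.
    pages = []
    for url in root.iterfind(f"{SITEMAP_NS}url"):
        entry = {
            field: (url.findtext(f"{SITEMAP_NS}{field}") or "").strip() or None
            for field in FIELDS
        }
        if entry["loc"]:
            pages.append(entry)
    return children, pages


def collect_pages(sitemap_url: str, fetch, max_urls: int = 10000) -> list[dict]:
    """Walk a sitemap (following nested indexes) and gather page entries.

    `fetch` is a hypothetical callable that maps a URL to its XML text.
    """
    seen, queue, pages = set(), [sitemap_url], []
    while queue and len(pages) < max_urls:
        current = queue.pop(0)
        if current in seen:  # guard against indexes that reference each other
            continue
        seen.add(current)
        children, found = parse_sitemap(fetch(current))
        queue.extend(children)
        pages.extend(found[: max_urls - len(pages)])
    return pages
```

Passing `fetch` in as a parameter keeps the sketch independent of any HTTP library; in practice it could wrap whatever client you already use.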
How much does it cost?
- Pricing: Pay per result — $0.50 per 1,000 URLs extracted
- Typical use: Parsing a 5,000-URL sitemap ≈ $2.50
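The arithmetic behind that estimate can be checked with a short helper. This is a rough sketch that assumes billing scales linearly with the number of extracted URLs at the rate quoted above; `estimate_cost` is a hypothetical name, and any minimum charges or platform fees are not modeled.

```python
from decimal import Decimal

# Rate quoted in this listing: $0.50 per 1,000 URLs extracted.
PRICE_PER_1000_URLS = Decimal("0.50")


def estimate_cost(url_count: int) -> Decimal:
    """Estimate the charge in USD, assuming strictly linear per-URL billing."""
    cost = Decimal(url_count) * PRICE_PER_1000_URLS / 1000
    return cost.quantize(Decimal("0.01"))


print(estimate_cost(5000))  # 2.50
```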
Input example
{
  "sitemapUrls": ["https://example.com"],
  "autoDiscoverSitemap": true,
  "maxUrls": 10000
}
Output example
{
  "url": "https://example.com/blog/post-title",
  "lastmod": "2025-01-10",
  "changefreq": "weekly",
  "priority": "0.8",
  "domain": "example.com",
  "path": "/blog/post-title",
  "depth": 2,
  "pageType": "blog"
}
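The derived fields in that record (`domain`, `path`, `depth`, `pageType`) can plausibly be computed from the URL alone. The sketch below shows one way to do it; the actor's real classification rules are not documented here, so the keyword table `PAGE_TYPE_HINTS`, the `"other"` fallback, and the function name `enrich` are all assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical keyword table: first path segment -> page type.
PAGE_TYPE_HINTS = {
    "blog": ("blog", "news", "articles"),
    "product": ("product", "products", "shop"),
    "category": ("category", "categories", "collections"),
    "support": ("support", "help", "faq", "docs"),
}


def enrich(url: str) -> dict:
    """Derive domain, path, depth, and a guessed page type from a URL."""
    parts = urlsplit(url)
    # Depth is counted as the number of non-empty path segments.
    segments = [s for s in parts.path.split("/") if s]
    if not segments:
        page_type = "homepage"
    else:
        first = segments[0].lower()
        page_type = next(
            (name for name, hints in PAGE_TYPE_HINTS.items() if first in hints),
            "other",  # assumed fallback label, not taken from the listing
        )
    return {
        "url": url,
        "domain": parts.hostname or "",
        "path": parts.path or "/",
        "depth": len(segments),
        "pageType": page_type,
    }
```

Applied to `https://example.com/blog/post-title`, this yields the same `domain`, `path`, `depth`, and `pageType` values as the output example; the metadata fields (`lastmod`, `changefreq`, `priority`) come from the sitemap itself rather than from the URL.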
Use cases
- SEO auditing: Map a site's full URL structure and find indexation issues
- Content inventory: List every page on a website for migration planning
- Competitive analysis: See how competitors organize their content
- Data pipelines: Feed URL lists into other scrapers or crawlers
Built by Donny Dev