Sitemap Url Extractor
Pricing
Pay per usage
Developer: Donny Nguyen
What does Sitemap URL Extractor do?
Sitemap URL Extractor is an Apify actor that parses XML sitemaps and extracts all listed URLs along with their metadata such as last modification date, priority, and change frequency. It handles standard sitemaps, sitemap indexes, and nested sitemaps automatically. Feed it any sitemap URL like https://crawlee.dev/sitemap.xml and get a clean, structured list of every page the site exposes.
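The parsing step described above can be sketched with Python's standard library. This is only an illustration of how a sitemap document maps to records, not the actor's actual implementation. The function name `parse_sitemap` and the inline sample XML are assumptions made for the example. The namespace and the element names come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_sitemap(xml_text, source_url):
    """Return (url_records, child_sitemap_urls) for one sitemap document."""
    root = ET.fromstring(xml_text)
    records, children = [], []

    # A sitemap index lists child sitemaps under <sitemap><loc>.
    for node in root.findall("sm:sitemap", NS):
        loc = node.findtext("sm:loc", default="", namespaces=NS).strip()
        if loc:
            children.append(loc)

    # A regular sitemap lists pages under <url>, with optional metadata.
    for node in root.findall("sm:url", NS):
        loc = node.findtext("sm:loc", default="", namespaces=NS).strip()
        if not loc:
            continue
        record = {"url": loc, "sitemapSource": source_url}
        for field in ("lastmod", "priority", "changefreq"):
            value = node.findtext(f"sm:{field}", namespaces=NS)
            if value is not None:
                record[field] = value.strip()
        records.append(record)

    return records, children


# Illustrative sample document (the second entry is hypothetical).
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://crawlee.dev/docs/introduction</loc>
    <lastmod>2025-11-15</lastmod>
    <priority>0.8</priority>
    <changefreq>weekly</changefreq>
  </url>
  <url><loc>https://example.com/no-metadata</loc></url>
</urlset>"""

records, children = parse_sitemap(SAMPLE, "https://crawlee.dev/sitemap.xml")
```

For a sitemap index, `children` would hold the child sitemap URLs. A caller could fetch each one and run `parse_sitemap` on it again, which is one way to handle nested sitemaps.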
Why use Sitemap URL Extractor?
- Bulk URL discovery -- Instantly extract thousands of URLs from any XML sitemap without manual parsing.
- Complete metadata -- Get `lastmod`, `priority`, and `changefreq` values alongside every URL for SEO analysis.
- Sitemap index support -- Automatically follows sitemap indexes to extract URLs from all child sitemaps.
- API integration -- Retrieve results programmatically via the Apify API for use in crawling pipelines and SEO tools.
- Proxy support -- Leverages Apify Proxy to access sitemaps behind geo-restrictions or rate limits.
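As a rough sketch of the API integration point, the snippet below builds the Apify API URL for downloading a run's dataset items and fetches them with the standard library. The helper names `dataset_items_url` and `fetch_items` are hypothetical, and the dataset ID and token are placeholders you would take from your own run and account. Check the endpoint and its parameters against the Apify API reference before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id, fmt="json", limit=None):
    """Build the URL for downloading the items of a dataset."""
    params = {"format": fmt}
    if limit is not None:
        params["limit"] = limit
    return f"{API_BASE}/datasets/{dataset_id}/items?{urlencode(params)}"


def fetch_items(dataset_id, token):
    """Download dataset items as parsed JSON (performs a network request)."""
    request = Request(
        dataset_items_url(dataset_id),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urlopen(request) as response:
        return json.load(response)


# "abc123" is a placeholder dataset ID, not a real one.
example_url = dataset_items_url("abc123", limit=100)
```

Only `dataset_items_url` runs in the example; `fetch_items` needs a real dataset ID and API token.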
How to use Sitemap URL Extractor
- Visit the Apify Store and find Sitemap URL Extractor.
- Click Try for free to open the actor configuration page.
- Enter one or more XML sitemap URLs in the Sitemap URLs field.
- Optionally set a Max URLs limit to cap the number of extracted URLs.
- Click Start, then download the dataset as JSON, CSV, or Excel.
Input configuration
| Field | Type | Description | Default |
|---|---|---|---|
| sitemapUrls | Array of strings | XML sitemap URLs to parse | ["https://crawlee.dev/sitemap.xml"] |
| maxUrls | Integer | Maximum number of URLs to extract | 10000 |
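When starting the actor programmatically, the input is a JSON object with the two fields from the table. The sketch below builds and sanity-checks that object. The helper `build_input` is hypothetical, and the validation rules (http/https URLs only, `maxUrls` of at least 1) are assumptions made for the example rather than documented constraints of the actor.

```python
import json


def build_input(sitemap_urls, max_urls=10000):
    """Return an input dict using the field names from the table above."""
    if isinstance(sitemap_urls, str):
        sitemap_urls = [sitemap_urls]
    urls = [u.strip() for u in sitemap_urls if u.strip()]
    if not urls:
        raise ValueError("at least one sitemap URL is required")
    for u in urls:
        # Assumption: only http(s) sitemap URLs make sense here.
        if not u.startswith(("http://", "https://")):
            raise ValueError(f"not an http(s) URL: {u}")
    # Assumption: maxUrls must be a positive integer (bool is rejected).
    if isinstance(max_urls, bool) or not isinstance(max_urls, int) or max_urls < 1:
        raise ValueError("max_urls must be a positive integer")
    return {"sitemapUrls": urls, "maxUrls": max_urls}


actor_input = build_input("https://crawlee.dev/sitemap.xml")
input_json = json.dumps(actor_input, indent=2)
```

The resulting `input_json` string can be pasted into the actor's JSON input editor or sent as the run input through the API.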
Output data
Each record in the output dataset represents a single URL found in the sitemap. Metadata fields are included when available in the source XML.
{"url": "https://crawlee.dev/docs/introduction", "lastmod": "2025-11-15", "priority": "0.8", "changefreq": "weekly", "sitemapSource": "https://crawlee.dev/sitemap.xml"}
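Note that `priority` and `lastmod` arrive as strings in the record above. A consumer may want typed values, and the sketch below shows one possible conversion. The `SitemapEntry` class and `entry_from_record` helper are hypothetical names, and the date handling assumes `lastmod` starts with a full `YYYY-MM-DD` date.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class SitemapEntry:
    url: str
    sitemap_source: str
    lastmod: Optional[date] = None
    priority: Optional[float] = None
    changefreq: Optional[str] = None


def entry_from_record(record):
    """Convert one dataset record (a dict of strings) into a typed entry."""
    lastmod = record.get("lastmod")
    priority = record.get("priority")
    return SitemapEntry(
        url=record["url"],
        sitemap_source=record["sitemapSource"],
        # Only the first 10 characters (YYYY-MM-DD) are parsed; any time
        # component after the date is ignored.
        lastmod=date.fromisoformat(lastmod[:10]) if lastmod else None,
        priority=float(priority) if priority else None,
        changefreq=record.get("changefreq"),
    )


record = {
    "url": "https://crawlee.dev/docs/introduction",
    "lastmod": "2025-11-15",
    "priority": "0.8",
    "changefreq": "weekly",
    "sitemapSource": "https://crawlee.dev/sitemap.xml",
}
entry = entry_from_record(record)
```

Missing metadata fields simply stay `None`, which matches the note that they are included only when the source XML provides them.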
Cost of usage
Sitemap URL Extractor uses pay-per-event (PPE) pricing at the Utility tier:
| Tier | Cost per 1,000 events | Free events per month |
|---|---|---|
| Utility | $0.30 | ~16,600 |
Parsing a single sitemap with 500 URLs costs approximately 1 event, which is about $0.0003 at the Utility rate. At that rate, extracting tens of thousands of URLs uses only a small fraction of the roughly 16,600 free events per month.
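The arithmetic behind the table can be checked with a few lines of code. The constants below are copied from the pricing table, and the free allowance is approximate. How many events a given run actually generates is not specified beyond the 500-URL example, so the functions take an event count as input rather than a URL count. The function names are hypothetical.

```python
COST_PER_1000_EVENTS_USD = 0.30   # Utility tier, from the table above
FREE_EVENTS_PER_MONTH = 16_600    # approximate, from the table above


def list_price_usd(events):
    """Price of a number of events at the Utility rate, ignoring free events."""
    return events * COST_PER_1000_EVENTS_USD / 1000


def within_free_allowance(events_this_month):
    """True if the monthly event count fits in the approximate free allowance."""
    return events_this_month <= FREE_EVENTS_PER_MONTH


one_event = list_price_usd(1)                            # about $0.0003
free_value = list_price_usd(FREE_EVENTS_PER_MONTH)       # about $4.98
```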
Tips and advanced usage
- Feed into a crawler -- Use extracted URLs as the start URL list for another Apify actor or a Crawlee-based scraper.
- Schedule weekly extractions -- Set up scheduled runs to track when new pages appear on a competitor's site.
- Filter by priority -- After extraction, filter the dataset by `priority` to focus on the most important pages.
- Monitor sitemap freshness -- Compare `lastmod` dates across runs to detect stale or outdated content.
- Combine with SEO tools -- Pair the output with a keyword rank checker or backlink analyzer for a comprehensive SEO audit.
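Several of the tips above amount to simple post-processing of the dataset. The sketch below shows one possible way to filter by priority, spot new pages between two runs, and list pages whose `lastmod` has not changed. All function names and the sample records are hypothetical, and treating an unchanged `lastmod` as "stale" is an assumption made for the example.

```python
def filter_by_priority(records, minimum=0.8):
    """Keep records whose priority parses as a number >= minimum."""
    kept = []
    for record in records:
        try:
            priority = float(record.get("priority", ""))
        except (TypeError, ValueError):
            continue  # missing or non-numeric priority
        if priority >= minimum:
            kept.append(record)
    return kept


def new_urls(previous, current):
    """URLs present in the current run but not in the previous one."""
    seen = {record["url"] for record in previous}
    return [r["url"] for r in current if r["url"] not in seen]


def unchanged_lastmod(previous, current):
    """URLs whose lastmod is present and identical in both runs."""
    old = {r["url"]: r.get("lastmod") for r in previous}
    return [
        r["url"]
        for r in current
        if r.get("lastmod") is not None and old.get(r["url"]) == r["lastmod"]
    ]


# Made-up records standing in for two dataset exports.
run_1 = [
    {"url": "https://example.com/a", "lastmod": "2025-01-01", "priority": "0.9"},
    {"url": "https://example.com/b", "lastmod": "2025-01-01", "priority": "0.3"},
]
run_2 = [
    {"url": "https://example.com/a", "lastmod": "2025-01-01", "priority": "0.9"},
    {"url": "https://example.com/b", "lastmod": "2025-02-01", "priority": "0.3"},
    {"url": "https://example.com/c", "priority": "0.8"},
]

important = filter_by_priority(run_2)
added = new_urls(run_1, run_2)
stale = unchanged_lastmod(run_1, run_2)
```

The list returned by `new_urls` or `filter_by_priority` could then serve as the start URL list for another crawler.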
Built with Crawlee and Apify SDK. See more scrapers by consummate_mandala on Apify Store.