Sitemap Extractor
Pricing
From $20.00 / 1,000 successful API calls
Extract all URLs from a website's sitemap (XML, robots.txt, or crawl discovery).
Developer
Alex Jordan
Maintained by Community
What does Sitemap Extractor do?
Sitemap Extractor discovers and extracts all URLs from a website's sitemap — supporting XML sitemaps, sitemap index files, robots.txt discovery, and automatic crawl-based detection — returning a clean list of page URLs ready for bulk scraping or SEO analysis.
Built on the Apify platform, the Actor returns results in seconds and integrates with Apify's scheduling, webhooks, and 1,500+ tools.
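To make the discovery sources above more concrete, here is a minimal sketch of how `Sitemap:` lines in robots.txt and the two XML document types (regular sitemap and sitemap index) can be told apart and read. It is an illustration based on the public sitemaps.org format, not the Actor's actual implementation; the function names `sitemaps_from_robots` and `parse_sitemap` are hypothetical.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemaps_from_robots(robots_txt: str) -> list[str]:
    """Return sitemap URLs declared via 'Sitemap:' lines in robots.txt."""
    found = []
    for line in robots_txt.splitlines():
        # Split on the first colon only, so the URL's own colons survive.
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap":
            found.append(value.strip())
    return found


def parse_sitemap(xml_text: str) -> tuple[str, list[str]]:
    """Classify a sitemap document and return its <loc> entries.

    Returns ("index", child_sitemap_urls) for a sitemap index file,
    or ("urlset", page_urls) for a regular XML sitemap.
    """
    root = ET.fromstring(xml_text)
    kind = "index" if root.tag.endswith("sitemapindex") else "urlset"
    entry = "sm:sitemap" if kind == "index" else "sm:url"
    locs = [
        loc.text.strip()
        for loc in root.findall(f"{entry}/sm:loc", SITEMAP_NS)
        if loc.text
    ]
    return kind, locs
```

A crawler built on this sketch would follow each child URL returned for an "index" document and collect page URLs from every "urlset" it reaches.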
Why use Sitemap Extractor?
- Bulk scraping setup — Get the full list of URLs on a site to feed into other scraping workflows
- SEO auditing — Audit all indexed pages across a domain for on-page SEO issues
- Content monitoring — Track new pages added to competitor sites over time with scheduled runs
- Website migration — Extract all source URLs before a site migration to verify redirects
- Competitor analysis — Understand the full content architecture of any competitor website
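For the content monitoring and migration use cases, the URL lists from two runs can be compared with a plain set difference. The sketch below assumes you have already saved the `urls` array from an earlier run and a later run; `diff_url_lists` is a hypothetical helper, not part of the Actor.

```python
def diff_url_lists(previous: list[str], current: list[str]) -> dict[str, list[str]]:
    """Compare two runs' URL lists and report added and removed pages."""
    prev_set, curr_set = set(previous), set(current)
    return {
        "added": sorted(curr_set - prev_set),    # present now, absent before
        "removed": sorted(prev_set - curr_set),  # present before, absent now
    }


# Illustrative data only (example.com placeholders, not real results).
last_week = ["https://example.com/", "https://example.com/pricing"]
today = ["https://example.com/", "https://example.com/blog"]
print(diff_url_lists(last_week, today))
```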
How to use Sitemap Extractor
- Click Try for free on this Actor's page
- Enter the website root URL or a direct sitemap URL (e.g. https://stripe.com or https://stripe.com/sitemap.xml)
- Set max_urls to limit results if needed (default 1000)
- Click Start and wait a few seconds
- Download your results from the Output tab in JSON, CSV, or Excel
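The same steps can likely be scripted instead of clicked through. The sketch below assumes the `apify-client` Python package and its `ApifyClient.actor(...).call(run_input=...)` and `dataset(...).iterate_items()` methods; the Actor ID and token are placeholders you must supply, and `run_sitemap_extractor` is a hypothetical wrapper name.

```python
def run_sitemap_extractor(client, actor_id: str, url: str, max_urls: int = 1000) -> list[dict]:
    """Start the Actor, wait for it to finish, and return its dataset items.

    `client` is expected to behave like apify_client.ApifyClient.
    """
    run_input = {"url": url, "max_urls": max_urls}
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return run details")
    # Results are stored in the run's default dataset.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Usage (requires `pip install apify-client` and an Apify API token):
#
#   from apify_client import ApifyClient
#   client = ApifyClient("<YOUR_APIFY_TOKEN>")
#   items = run_sitemap_extractor(
#       client, "<username>/<actor-name>", "https://stripe.com", max_urls=500
#   )
```

Passing the client in as an argument keeps the function easy to exercise with a stub before spending paid API calls.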
Input
| Field | Type | Required | Description |
|---|---|---|---|
| url | string | ✅ | Website root URL or direct sitemap URL |
| max_urls | integer | ❌ | Maximum number of URLs to return (default 1000) |
| cache | boolean | ❌ | Use cached result if available (default true) |
Example input:
{"url": "https://stripe.com","max_urls": 500}
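If you build the input programmatically, a small helper can apply the defaults from the table and catch obvious mistakes before a paid run starts. This is a sketch: `build_run_input` is a hypothetical name, and the checks (absolute http(s) URL, positive `max_urls`) are assumptions about what the Actor accepts rather than documented rules.

```python
import json
from urllib.parse import urlparse


def build_run_input(url: str, max_urls: int = 1000, cache: bool = True) -> dict:
    """Build the Actor input using the defaults from the input table."""
    parsed = urlparse(url)
    # Assumption: only absolute http(s) URLs are meaningful here.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"url must be an absolute http(s) URL, got {url!r}")
    # Assumption: max_urls must be a positive integer (bool is rejected).
    if isinstance(max_urls, bool) or not isinstance(max_urls, int) or max_urls < 1:
        raise ValueError("max_urls must be a positive integer")
    return {"url": url, "max_urls": max_urls, "cache": bool(cache)}


print(json.dumps(build_run_input("https://stripe.com", max_urls=500)))
# prints {"url": "https://stripe.com", "max_urls": 500, "cache": true}
```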
Output
Example output:
{"urls": ["https://stripe.com/","https://stripe.com/payments","https://stripe.com/billing","https://stripe.com/docs"],"total_urls": 487,"sitemap_source": "https://stripe.com/sitemap.xml","meta": { "cache_hit": false, "execution_time_ms": 1100 }}
You can download the dataset in various formats such as JSON, HTML, CSV, or Excel.
Data fields
| Field | Type | Description |
|---|---|---|
| urls | array | List of all discovered page URLs |
| total_urls | integer | Total number of URLs found |
| sitemap_source | string | The sitemap URL that was ultimately used |
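For downstream processing, a record can be loaded into a typed structure. The sketch below assumes each record has the shape of the example output, including the `meta` object that appears there but not in the table; `SitemapResult` and `parse_result` are hypothetical names, and the fallback values for a missing `meta` are an assumption.

```python
import json
from dataclasses import dataclass


@dataclass
class SitemapResult:
    urls: list[str]
    total_urls: int
    sitemap_source: str
    cache_hit: bool
    execution_time_ms: int


def parse_result(raw: str) -> SitemapResult:
    """Parse one output record shaped like the example output."""
    data = json.loads(raw)
    meta = data.get("meta", {})  # treated as optional in this sketch
    return SitemapResult(
        urls=list(data["urls"]),
        total_urls=int(data["total_urls"]),
        sitemap_source=data["sitemap_source"],
        cache_hit=bool(meta.get("cache_hit", False)),
        execution_time_ms=int(meta.get("execution_time_ms", 0)),
    )


# Abbreviated version of the example output above.
raw = (
    '{"urls": ["https://stripe.com/", "https://stripe.com/payments"],'
    ' "total_urls": 487, "sitemap_source": "https://stripe.com/sitemap.xml",'
    ' "meta": {"cache_hit": false, "execution_time_ms": 1100}}'
)
result = parse_result(raw)
print(result.sitemap_source, result.total_urls)
```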
Pricing / Cost estimation
$0.02 per successful API call on Apify.
- 1,000 successful Apify runs = $20.00
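The arithmetic above generalizes to any run count. The sketch below assumes a flat rate of $0.02 per successful call with no tiers, discounts, or platform fees, which the listing does not spell out; `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

# USD per successful API call, taken from the pricing stated above.
PRICE_PER_CALL = Decimal("0.02")


def estimate_cost(successful_calls: int) -> Decimal:
    """Estimate the charge in USD, assuming a flat per-call rate."""
    if successful_calls < 0:
        raise ValueError("successful_calls must not be negative")
    return PRICE_PER_CALL * successful_calls


print(estimate_cost(1000))  # prints 20.00
```

`Decimal` is used instead of `float` so that currency amounts are not affected by binary rounding.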
FAQ & Support
Is this legal? Sitemaps are intentionally published by website owners to help search engines index their pages — reading them is fully legitimate.
Known limitations: Password-protected sitemaps or those requiring authentication cannot be accessed. Very large sitemaps (100k+ URLs) will be truncated to max_urls.
Need help? Open an issue in the Issues tab or contact the support team for custom solutions.