Sitemap Extractor

Pricing

from $20.00 / 1,000 successful API calls

Extract all URLs from a website's sitemap (XML, robots.txt, or crawl discovery).

Rating: 0.0 (0)

Developer: Alex Jordan (Maintained by Community)

Actor stats: 0 bookmarked, 2 total users, 1 monthly active user, last modified 3 days ago

What does Sitemap Extractor do?

Sitemap Extractor discovers and extracts the URLs listed in a website's sitemap. It supports XML sitemaps, sitemap index files, robots.txt discovery, and automatic crawl-based detection, and returns a clean list of page URLs ready for bulk scraping or SEO analysis.

Sitemap Extractor is built on the Apify platform: results are returned in seconds and integrate with Apify's scheduling, webhooks, and 1,500+ tools.
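
To make the discovery modes concrete, here is a minimal Python sketch of how sitemap locations can be read from a robots.txt file and how the URLs can be pulled out of an XML sitemap or a sitemap index. It illustrates the general sitemaps.org protocol only and is not this Actor's internal implementation; the function names are hypothetical and the sample robots.txt is made up.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemaps_from_robots(robots_txt: str) -> List[str]:
    """Return sitemap URLs declared with 'Sitemap:' lines in robots.txt."""
    found = []
    for line in robots_txt.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap":
            found.append(value.strip())
    return found


def parse_sitemap(xml_text: str) -> Tuple[str, List[str]]:
    """Return ('index' or 'urlset', <loc> values) for one sitemap document."""
    root = ET.fromstring(xml_text)
    kind = "index" if root.tag == SITEMAP_NS + "sitemapindex" else "urlset"
    locs = [el.text.strip() for el in root.iter(SITEMAP_NS + "loc") if el.text]
    return kind, locs


# Illustrative robots.txt content (not taken from a real site).
robots = "User-agent: *\nSitemap: https://example.com/sitemap.xml\n"
print(sitemaps_from_robots(robots))
```

When `parse_sitemap` reports an `index`, each returned location is itself a sitemap that would need to be fetched and parsed in turn.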

Why use Sitemap Extractor?

  • Bulk scraping setup — Get the full list of URLs on a site to feed into other scraping workflows
  • SEO auditing — Audit all indexed pages across a domain for on-page SEO issues
  • Content monitoring — Track new pages added to competitor sites over time with scheduled runs
  • Website migration — Extract all source URLs before a site migration to verify redirects
  • Competitor analysis — Understand the full content architecture of any competitor website

How to use Sitemap Extractor

  1. Click Try for free on this Actor's page
  2. Enter the website root URL or a direct sitemap URL (e.g. https://stripe.com or https://stripe.com/sitemap.xml)
  3. Set max_urls to limit results if needed (default 1000)
  4. Click Start and wait a few seconds
  5. Download your results from the Output tab in JSON, CSV, or Excel

Input

Field    | Type    | Required | Description
url      | string  | Yes      | Website root URL or direct sitemap URL
max_urls | integer | No       | Maximum number of URLs to return (default 1000)
cache    | boolean | No       | Use cached result if available (default true)

Example input:

{
  "url": "https://stripe.com",
  "max_urls": 500
}
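
The same input can also be sent programmatically. The sketch below assumes the official `apify-client` Python package is installed and that each dataset item has the shape shown in the example output. The Actor ID is not given on this page, so it is passed in as a parameter; `build_input` and `run_sitemap_extractor` are hypothetical helper names, and the defaults mirror the Input fields.

```python
from typing import Any, Dict, Iterator


def build_input(url: str, max_urls: int = 1000, cache: bool = True) -> Dict[str, Any]:
    """Build the Actor input, mirroring the documented fields and defaults."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must be a website root URL or a direct sitemap URL")
    if max_urls < 1:
        raise ValueError("max_urls must be a positive integer")
    return {"url": url, "max_urls": max_urls, "cache": cache}


def run_sitemap_extractor(
    token: str, actor_id: str, run_input: Dict[str, Any]
) -> Iterator[dict]:
    """Start a run, wait for it to finish, and yield its dataset items."""
    # Imported lazily so build_input works without the package installed.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    yield from client.dataset(run["defaultDatasetId"]).iterate_items()
```

A call would look like `run_sitemap_extractor(token, actor_id, build_input("https://stripe.com", max_urls=500))`, where `token` is your Apify API token and `actor_id` is the ID shown on the Actor's page.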

Output

Example output:

{
  "urls": [
    "https://stripe.com/",
    "https://stripe.com/payments",
    "https://stripe.com/billing",
    "https://stripe.com/docs"
  ],
  "total_urls": 487,
  "sitemap_source": "https://stripe.com/sitemap.xml",
  "meta": { "cache_hit": false, "execution_time_ms": 1100 }
}

You can download the dataset in various formats such as JSON, HTML, CSV, or Excel.

Data fields

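Field          | Type    | Description
urls           | array   | List of all discovered page URLs
total_urls     | integer | Total number of URLs found
sitemap_source | string  | The sitemap URL that was ultimately used
meta           | object  | Run metadata: cache_hit (boolean) and execution_time_ms (integer), as in the example output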
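
As a hedged illustration of consuming these fields, the snippet below groups the returned URLs by their first path segment, a common first step for the content-architecture and SEO use cases listed earlier. It assumes a result shaped like the example output; `group_by_section` is a hypothetical helper and not part of the Actor.

```python
from collections import Counter
from urllib.parse import urlsplit


def group_by_section(urls):
    """Count URLs by first path segment ('/' stands for the site root)."""
    counts = Counter()
    for url in urls:
        segments = [s for s in urlsplit(url).path.split("/") if s]
        section = ("/" + segments[0]) if segments else "/"
        counts[section] += 1
    return counts


# Sample result copied from the example output (the urls list is abbreviated there).
item = {
    "urls": [
        "https://stripe.com/",
        "https://stripe.com/payments",
        "https://stripe.com/billing",
        "https://stripe.com/docs",
    ],
    "total_urls": 487,
    "sitemap_source": "https://stripe.com/sitemap.xml",
}

for section, count in group_by_section(item["urls"]).most_common():
    print(section, count)
```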

Pricing / Cost estimation

$0.02 per successful API call on Apify.

  • 1,000 successful Apify runs = $20.00
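
The per-call price makes cost estimation simple arithmetic. A small sketch, assuming the flat $0.02 rate stated above applies to every successful call and ignoring any other platform charges; `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

# USD per successful call, taken from the pricing stated above.
PRICE_PER_CALL = Decimal("0.02")


def estimate_cost(successful_calls: int) -> Decimal:
    """Estimated cost in USD for a given number of successful calls."""
    if successful_calls < 0:
        raise ValueError("successful_calls must be non-negative")
    return PRICE_PER_CALL * successful_calls


print(estimate_cost(1000))  # 20.00
```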

FAQ & Support

Is this legal? Sitemaps are intentionally published by website owners to help search engines index their pages — reading them is fully legitimate.

Known limitations: Password-protected sitemaps or those requiring authentication cannot be accessed. Results are capped at max_urls, so very large sitemaps (100k+ URLs) will be truncated.

Need help? Open an issue in the Issues tab or contact the support team for custom solutions.