Sitemap Extractor

Extract all URLs from website sitemaps. Pages, images, PDFs. Handles sitemap indexes and WordPress.

Pricing: Pay per usage

Developer: Benny (maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 19 hours ago

Sitemap URL Extractor

Give it websites. Get back every URL from their sitemap.

What it does

  1. Checks common sitemap locations (/sitemap.xml, /wp-sitemap.xml, etc.)
  2. Reads robots.txt for sitemap declarations
  3. Follows sitemap indexes to sub-sitemaps
  4. Deduplicates and categorizes URLs (pages, images, PDFs)
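
The four steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the Actor's actual code: the function names, the `fetch` callback (any callable that returns a sitemap body as text, or `None` on failure), and the extension-based categorization are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Assumption: files are categorized by extension; the Actor may use other rules.
IMAGE_EXTS = (".jpg", ".jpeg", ".png", ".gif", ".webp", ".svg")


def local_name(tag):
    """Strip the XML namespace: '{ns}loc' -> 'loc'."""
    return tag.rsplit("}", 1)[-1]


def sitemaps_from_robots(robots_txt):
    """Return the URLs declared on 'Sitemap:' lines of a robots.txt body."""
    found = []
    for line in robots_txt.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "sitemap" and value.strip():
            found.append(value.strip())
    return found


def parse_sitemap(xml_text):
    """Return (kind, locs), where kind is 'index' or 'urlset'."""
    root = ET.fromstring(xml_text)
    kind = "index" if local_name(root.tag) == "sitemapindex" else "urlset"
    locs = [el.text.strip() for el in root.iter()
            if local_name(el.tag) == "loc" and el.text]
    return kind, locs


def collect_urls(start_sitemaps, fetch, max_urls=10000):
    """Follow sitemap indexes breadth-first and return deduplicated URLs."""
    seen_sitemaps, seen_urls, urls = set(), set(), []
    queue = list(start_sitemaps)
    while queue and len(urls) < max_urls:
        sitemap_url = queue.pop(0)
        if sitemap_url in seen_sitemaps:
            continue
        seen_sitemaps.add(sitemap_url)
        xml_text = fetch(sitemap_url)
        if xml_text is None:  # location did not exist or could not be read
            continue
        kind, locs = parse_sitemap(xml_text)
        if kind == "index":
            queue.extend(locs)  # sub-sitemaps are visited later
        else:
            for loc in locs:
                if loc not in seen_urls and len(urls) < max_urls:
                    seen_urls.add(loc)
                    urls.append(loc)
    return urls


def categorize(urls):
    """Split URLs into pages, images and PDFs by file extension."""
    buckets = {"pages": [], "images": [], "pdfs": []}
    for url in urls:
        path = urlparse(url).path.lower()
        if path.endswith(".pdf"):
            buckets["pdfs"].append(url)
        elif path.endswith(IMAGE_EXTS):
            buckets["images"].append(url)
        else:
            buckets["pages"].append(url)
    return buckets
```

In practice, `start_sitemaps` would combine the common locations from step 1 with whatever `sitemaps_from_robots` finds, and `fetch` would wrap an HTTP client. Passing `fetch` in as a parameter keeps the parsing logic testable without network access.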

Input

{
  "websites": ["https://stripe.com", "https://github.com"],
  "maxUrls": 10000
}
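
As a rough illustration of preparing this input and starting a run from Python, the sketch below builds the input object and hands it to a client. The helper names `build_run_input` and `run_and_collect` are hypothetical, the HTTPS default for bare domains is an assumption of this example rather than documented Actor behavior, and the Actor ID is not stated on this page, so it stays a parameter.

```python
def build_run_input(websites, max_urls=10000):
    """Build the input object with the 'websites' and 'maxUrls' fields."""
    if not websites:
        raise ValueError("at least one website is required")
    if max_urls <= 0:
        raise ValueError("max_urls must be positive")
    cleaned = []
    for site in websites:
        site = site.strip()
        # Assumption: default to HTTPS when the scheme is missing.
        if not site.startswith(("http://", "https://")):
            site = "https://" + site
        if site not in cleaned:  # drop duplicate entries, keep order
            cleaned.append(site)
    return {"websites": cleaned, "maxUrls": int(max_urls)}


def run_and_collect(client, actor_id, run_input):
    """Start a run and return its dataset items as a list.

    `client` is expected to behave like apify_client.ApifyClient:
    actor(id).call(run_input=...) returns run info containing
    'defaultDatasetId', and dataset(id).iterate_items() yields items.
    """
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

With the official `apify-client` package installed, `client` would be an `ApifyClient` created with your API token, and `actor_id` the ID shown on the Actor's store page.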

Output

{
  "website": "https://stripe.com",
  "domain": "stripe.com",
  "totalUrls": 847,
  "pages": 612,
  "images": 235,
  "pdfs": 0,
  "urls": ["https://stripe.com/payments", "..."],
  "imageUrls": ["..."],
  "pdfUrls": []
}
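
One way to consume a record like this is to print a one-line summary and sanity-check the counts. The sketch below assumes that `totalUrls` equals `pages + images + pdfs`, which holds for the example above (612 + 235 + 0 = 847) but is not stated as a guarantee; `summarize` and `counts_are_consistent` are hypothetical helper names.

```python
def counts_are_consistent(record):
    """Check the assumed invariant totalUrls == pages + images + pdfs."""
    parts = record["pages"] + record["images"] + record["pdfs"]
    return record["totalUrls"] == parts


def summarize(record):
    """Return a short human-readable summary of one output record."""
    return (
        f"{record['domain']}: {record['totalUrls']} URLs "
        f"({record['pages']} pages, {record['images']} images, "
        f"{record['pdfs']} PDFs)"
    )


example = {
    "website": "https://stripe.com",
    "domain": "stripe.com",
    "totalUrls": 847,
    "pages": 612,
    "images": 235,
    "pdfs": 0,
}
print(summarize(example))
```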

Use cases

  • SEO audits: list every URL a site declares in its sitemaps
  • Competitive analysis: map a competitor's content
  • Migration planning: get a full URL inventory before redesigning
  • Content analysis: understand site structure