Sitemap Url Extractor

Pricing

Pay per usage

Developer

Donny Nguyen

Maintained by Community

What does Sitemap URL Extractor do?

Sitemap URL Extractor is an Apify actor that parses XML sitemaps and extracts all listed URLs along with their metadata such as last modification date, priority, and change frequency. It handles standard sitemaps, sitemap indexes, and nested sitemaps automatically. Feed it any sitemap URL like https://crawlee.dev/sitemap.xml and get a clean, structured list of every page the site exposes.
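
To make the distinction between a standard sitemap and a sitemap index concrete, here is a minimal sketch using only Python's standard library. It is not the actor's actual implementation; the function name `parse_sitemap` and the sample XML are illustrative assumptions, and fetching the XML over HTTP is left out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Illustrative sample only, not real content from crawlee.dev.
SAMPLE_URLSET = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://crawlee.dev/docs/introduction</loc>
    <lastmod>2025-11-15</lastmod>
    <priority>0.8</priority>
    <changefreq>weekly</changefreq>
  </url>
  <url>
    <loc>https://crawlee.dev/docs/quick-start</loc>
  </url>
</urlset>"""


def parse_sitemap(xml_text, source):
    """Return (records, child_sitemaps) for one sitemap document.

    Hypothetical helper: a sitemap index yields child sitemap URLs,
    a standard sitemap yields one record per <url> entry.
    """
    root = ET.fromstring(xml_text)
    records, children = [], []
    if root.tag.endswith("sitemapindex"):
        # A sitemap index lists other sitemaps rather than pages.
        for loc in root.findall("sm:sitemap/sm:loc", NS):
            if loc.text:
                children.append(loc.text.strip())
        return records, children
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        if not loc:
            continue  # skip entries without a location
        record = {"url": loc.strip(), "sitemapSource": source}
        # Metadata is optional, so only copy the fields that are present.
        for field in ("lastmod", "priority", "changefreq"):
            value = url.findtext(f"sm:{field}", namespaces=NS)
            if value is not None:
                record[field] = value.strip()
        records.append(record)
    return records, children
```

A caller that wants nested sitemaps handled would keep a queue of URLs, fetch each one, and push the returned child sitemap URLs back onto the queue until it is empty.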

Why use Sitemap URL Extractor?

  • Bulk URL discovery -- Instantly extract thousands of URLs from any XML sitemap without manual parsing.
  • Complete metadata -- Get lastmod, priority, and changefreq values alongside every URL for SEO analysis.
  • Sitemap index support -- Automatically follows sitemap indexes to extract URLs from all child sitemaps.
  • API integration -- Retrieve results programmatically via the Apify API for use in crawling pipelines and SEO tools.
  • Proxy support -- Leverages Apify Proxy to access sitemaps behind geo-restrictions or rate limits.

How to use Sitemap URL Extractor

  1. Visit the Apify Store and find Sitemap URL Extractor.
  2. Click Try for free to open the actor configuration page.
  3. Enter one or more XML sitemap URLs in the Sitemap URLs field.
  4. Optionally set a Max URLs limit to cap the number of extracted URLs.
  5. Click Start, then download the dataset as JSON, CSV, or Excel.

Input configuration

Field        Type              Description                        Default
sitemapUrls  Array of strings  XML sitemap URLs to parse          ["https://crawlee.dev/sitemap.xml"]
maxUrls      Integer           Maximum number of URLs to extract  10000
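
For programmatic runs, the input can be assembled and submitted from Python. The sketch below assumes the `apify-client` package is installed and that you supply your own API token and the actor's ID; the ID is deliberately left as a parameter because this page does not state it. The helper names and the validation rules (non-empty list, absolute HTTP(S) URLs, positive limit) are assumptions for illustration, not documented constraints of the actor.

```python
def build_run_input(sitemap_urls, max_urls=10000):
    """Build the input object using the sitemapUrls and maxUrls fields."""
    if not sitemap_urls:
        raise ValueError("sitemapUrls must contain at least one URL")
    for url in sitemap_urls:
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"not an absolute HTTP(S) URL: {url!r}")
    if max_urls < 1:
        raise ValueError("maxUrls must be a positive integer")
    return {"sitemapUrls": list(sitemap_urls), "maxUrls": int(max_urls)}


def run_extractor(token, actor_id, run_input):
    """Start a run, wait for it to finish, and return its dataset items."""
    # Imported lazily so build_run_input works without apify-client installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Calling `build_run_input(["https://crawlee.dev/sitemap.xml"])` yields the default input shown in the table; pass the result to `run_extractor` together with your token and the actor ID copied from the Apify Console.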

Output data

Each record in the output dataset represents a single URL found in the sitemap. Metadata fields are included when available in the source XML.

{
  "url": "https://crawlee.dev/docs/introduction",
  "lastmod": "2025-11-15",
  "priority": "0.8",
  "changefreq": "weekly",
  "sitemapSource": "https://crawlee.dev/sitemap.xml"
}
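
Note that `priority` arrives as a string and that metadata fields may be missing. A minimal sketch of converting a dataset item into typed values follows; the `SitemapRecord` class and `to_record` function are hypothetical names, and the code assumes `lastmod`, when present, starts with a full `YYYY-MM-DD` date.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class SitemapRecord:
    url: str
    sitemap_source: str
    lastmod: Optional[date] = None
    priority: Optional[float] = None
    changefreq: Optional[str] = None


def to_record(item):
    """Convert one dataset item (a dict) into a typed SitemapRecord."""
    lastmod = item.get("lastmod")
    priority = item.get("priority")
    return SitemapRecord(
        url=item["url"],
        sitemap_source=item["sitemapSource"],
        # Keep only the date part in case a time component follows it.
        lastmod=date.fromisoformat(lastmod[:10]) if lastmod else None,
        # Missing or empty priority stays None instead of defaulting.
        priority=float(priority) if priority else None,
        changefreq=item.get("changefreq"),
    )
```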

Cost of usage

Sitemap URL Extractor uses pay-per-event (PPE) pricing at the Utility tier:

Tier     Cost per 1,000 events  Free events per month
Utility  $0.30                  ~16,600

At $0.30 per 1,000 events, one event costs $0.0003. Parsing a single sitemap with 500 URLs costs approximately 1 event, so even extracting tens of thousands of URLs should stay well within the free allowance of roughly 16,600 events per month.
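
The arithmetic can be sketched as a small helper. It assumes the free events behave as a monthly allowance that is subtracted before billing, and that you already know how many events your runs generated; both are simplifying assumptions, and `billable_cost` is a hypothetical name.

```python
from decimal import Decimal

PRICE_PER_1000_EVENTS = Decimal("0.30")  # Utility tier, from the pricing table
FREE_EVENTS_PER_MONTH = 16_600           # approximate, from the pricing table


def billable_cost(events_this_month):
    """Return the estimated monthly charge in USD as a Decimal."""
    # Only events beyond the free allowance are charged (assumption).
    billable = max(0, events_this_month - FREE_EVENTS_PER_MONTH)
    return billable * PRICE_PER_1000_EVENTS / 1000
```

For example, 26,600 events in a month leaves 10,000 billable events, which comes to $3.00 under these assumptions.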

Tips and advanced usage

  • Feed into a crawler -- Use extracted URLs as the start URL list for another Apify actor or a Crawlee-based scraper.
  • Schedule weekly extractions -- Set up scheduled runs to track when new pages appear on a competitor's site.
  • Filter by priority -- After extraction, filter the dataset by priority to focus on the most important pages.
  • Monitor sitemap freshness -- Compare lastmod dates across runs to detect stale or outdated content.
  • Combine with SEO tools -- Pair the output with a keyword rank checker or backlink analyzer for a comprehensive SEO audit.
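
Two of the tips above, filtering by priority and comparing lastmod across runs, can be sketched in plain Python over the dataset items. The function names and the 0.8 threshold are illustrative assumptions; items are the dicts shown under Output data.

```python
def filter_by_priority(items, minimum=0.8):
    """Keep items whose priority is at least `minimum`; skip items without one."""
    kept = []
    for item in items:
        priority = item.get("priority")
        # Priority is stored as a string, so convert before comparing.
        if priority is not None and float(priority) >= minimum:
            kept.append(item)
    return kept


def diff_runs(previous, current):
    """Return (new_urls, changed_urls) between two runs' items.

    new_urls: URLs present only in the current run.
    changed_urls: URLs in both runs whose lastmod value differs.
    """
    before = {item["url"]: item.get("lastmod") for item in previous}
    new_urls, changed_urls = [], []
    for item in current:
        url = item["url"]
        if url not in before:
            new_urls.append(url)
        elif item.get("lastmod") != before[url]:
            changed_urls.append(url)
    return new_urls, changed_urls
```

With scheduled weekly runs, passing last week's and this week's items to `diff_runs` gives a list of newly published pages and a list of pages whose lastmod moved.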

Built with Crawlee and the Apify SDK. See more scrapers by consummate_mandala on Apify Store.