Sitemap Diff Tool

Compare two XML sitemaps and find added, removed, or changed URLs. Detects lastmod, priority, and changefreq changes. Supports sitemap index files. Export results as JSON, CSV, or Excel.

Developer

Stas Persiianenko

Maintained by Community

What does Sitemap Diff Tool do?

Sitemap Diff Tool compares two XML sitemaps and returns a structured list of changes — URLs added, removed, or modified between the two versions. It fetches both sitemaps via HTTP, parses the XML, and diffs the URL sets including metadata fields like lastmod, priority, and changefreq.

Feed it two sitemap URLs (e.g., your current sitemap and a snapshot from last week) and get back every change in a clean, filterable dataset. It supports sitemap index files — all child sitemaps are fetched automatically.
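The comparison described above can be pictured with a small Python sketch. It is an illustration only, not the actor's source code: the `Entry` class and the `diff_sitemaps` function are hypothetical names, and the record shape simply mirrors the output fields documented on this page.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

META_FIELDS = ("lastmod", "priority", "changefreq")


@dataclass(frozen=True)
class Entry:
    """Metadata of one <url> entry (hypothetical helper type)."""
    lastmod: Optional[str] = None
    priority: Optional[float] = None
    changefreq: Optional[str] = None


def _meta(entry: Entry, suffix: str) -> dict:
    # Emit e.g. lastmodA / priorityB, skipping fields the sitemap did not set.
    return {f"{name}{suffix}": getattr(entry, name)
            for name in META_FIELDS if getattr(entry, name) is not None}


def diff_sitemaps(a: Dict[str, Entry], b: Dict[str, Entry],
                  compare_metadata: bool = True,
                  include_unchanged: bool = False) -> List[dict]:
    """Classify every URL found in baseline `a` or current `b`."""
    records = []
    for url in sorted(a.keys() | b.keys()):
        if url not in a:
            records.append({"changeType": "added", "url": url, **_meta(b[url], "B")})
        elif url not in b:
            records.append({"changeType": "removed", "url": url, **_meta(a[url], "A")})
        else:
            changed = [f for f in META_FIELDS
                       if compare_metadata and getattr(a[url], f) != getattr(b[url], f)]
            both = {**_meta(a[url], "A"), **_meta(b[url], "B")}
            if changed:
                records.append({"changeType": "changed", "url": url,
                                **both, "changedFields": changed})
            elif include_unchanged:
                records.append({"changeType": "unchanged", "url": url, **both})
    return records
```

URLs are matched by exact string, so membership tests on the two dictionaries are enough to separate added, removed, and shared URLs; metadata is only inspected for URLs present in both.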

Try it on the Apify Store page →

Who is Sitemap Diff Tool for?

SEO Managers and Content Strategists who monitor site health:

  • 📉 Detect when important category or product pages get removed from the sitemap accidentally
  • 🔄 Confirm that new content is correctly indexed via the sitemap after publish
  • 📋 Audit sitemap changes before and after CMS migrations

DevOps and Site Reliability Engineers automating deployments:

  • ✅ Validate that a deployment didn't accidentally de-index critical pages
  • 🔔 Build automated alerts when sitemap structure changes unexpectedly
  • 🧪 Compare staging vs production sitemaps to catch regressions

Digital Agencies managing multiple client websites:

  • 📊 Schedule weekly sitemap checks across all client domains
  • 🗂️ Export change reports to Google Sheets for client review
  • 🔁 Automate SEO monitoring pipelines via API and webhooks

E-commerce Teams protecting product visibility:

  • 🛍️ Detect product pages removed from sitemap (SEO traffic risk)
  • 💹 Monitor lastmod changes to see which products were recently updated
  • 🏷️ Track priority changes that affect how search engines crawl your site

Why use Sitemap Diff Tool?

  • Pure HTTP actor — no browser overhead, no proxy required for most sitemaps, runs fast
  • Sitemap index support — auto-follows sitemap index files and collects all child sitemaps
  • Metadata diffing — detects changes to lastmod, priority, and changefreq, not just URL presence
  • Structured output — every change has a changeType field (added, removed, changed, unchanged) for easy filtering
  • Pay-per-event pricing — pay only for the URLs you compare, no monthly subscription
  • API & scheduling — integrate via Apify API, schedule weekly checks, trigger webhooks on changes
  • Fast — compares thousands of URLs in seconds using efficient set operations

What data can you extract?

Each result in the dataset represents one URL from either sitemap:

| Field | Type | Description |
| --- | --- | --- |
| changeType | string | One of: added, removed, changed, unchanged |
| url | string | The full URL from the sitemap |
| lastmodA | string | lastmod value in sitemap A (baseline) |
| lastmodB | string | lastmod value in sitemap B (current) |
| priorityA | number | priority in sitemap A (0.0–1.0) |
| priorityB | number | priority in sitemap B (current) |
| changefreqA | string | changefreq in sitemap A (e.g., weekly) |
| changefreqB | string | changefreq in sitemap B |
| changedFields | array | For changed entries: which metadata fields changed |

The summary (saved to the key-value store as SUMMARY) includes overall statistics:

| Stat | Description |
| --- | --- |
| totalUrlsA | Total URLs parsed from sitemap A |
| totalUrlsB | Total URLs parsed from sitemap B |
| added | URLs present in B but not in A |
| removed | URLs present in A but not in B |
| changed | URLs in both sitemaps with metadata changes |
| unchanged | URLs present in both with identical metadata |
| totalCompared | Total unique URLs across both sitemaps |
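The definitions in this table imply a few arithmetic relationships between the counters. The sketch below checks them; it assumes that neither sitemap lists the same URL twice, and `check_summary` is a hypothetical helper, not part of the actor.

```python
from typing import Dict, List


def check_summary(stats: Dict[str, int]) -> List[str]:
    """Return a list of inconsistencies found in a SUMMARY `stats` object."""
    problems = []
    in_both = stats["changed"] + stats["unchanged"]

    # Every unique URL falls into exactly one of the four change types.
    if stats["totalCompared"] != stats["added"] + stats["removed"] + in_both:
        problems.append("totalCompared != added + removed + changed + unchanged")

    # Sitemap A holds the removed URLs plus the URLs shared with B.
    if stats["totalUrlsA"] != stats["removed"] + in_both:
        problems.append("totalUrlsA != removed + changed + unchanged")

    # Sitemap B holds the added URLs plus the URLs shared with A.
    if stats["totalUrlsB"] != stats["added"] + in_both:
        problems.append("totalUrlsB != added + changed + unchanged")

    return problems
```

A non-empty result is a hint that duplicates, a maxUrlsPerSitemap cap, or a parsing problem affected the run, and is worth a closer look before acting on the diff.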

How much does it cost to compare sitemaps?

This Actor uses pay-per-event pricing — you pay only for URLs compared. No monthly subscription. All platform costs are included.

| | Free | Starter ($29/mo) | Scale ($199/mo) | Business ($999/mo) |
| --- | --- | --- | --- | --- |
| Per URL compared | $0.00115 | $0.001 | $0.00078 | $0.0006 |
| 1,000 URLs | $1.15 | $1.00 | $0.78 | $0.60 |
| 10,000 URLs | $11.50 | $10.00 | $7.80 | $6.00 |

Each run also incurs a $0.005 start fee.

Real-world cost examples:

| Sitemaps compared | Total URLs | Duration | Cost (Free tier) |
| --- | --- | --- | --- |
| Small blog vs same (100 URLs each) | ~100 | ~3s | ~$0.12 |
| Medium site (1,000 URLs each) | ~2,000 | ~8s | ~$2.31 |
| Large e-commerce (10,000 URLs each) | ~20,000 | ~25s | ~$23.01 |

Free tier estimate: On Apify's $5 free credit, you can compare ~4,300 unique URLs.
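The figures above follow from a simple formula: the start fee plus the per-URL price times the number of URLs compared. A minimal sketch, using the prices from the table on this page (the function names and the plan keys are illustrative, and prices may change):

```python
# Per-URL prices copied from the pricing table above (USD).
PRICE_PER_URL = {
    "free": 0.00115,
    "starter": 0.001,
    "scale": 0.00078,
    "business": 0.0006,
}
START_FEE = 0.005  # charged once per run


def estimate_cost(urls_compared: int, plan: str = "free") -> float:
    """Estimated cost in USD of one run."""
    return START_FEE + urls_compared * PRICE_PER_URL[plan]


def urls_for_budget(budget_usd: float, plan: str = "free") -> int:
    """How many URLs a single run can compare within a budget."""
    return max(0, int((budget_usd - START_FEE) / PRICE_PER_URL[plan]))
```

For example, `estimate_cost(2000)` gives roughly $2.31, matching the medium-site row, and `urls_for_budget(5.0)` lands in the low 4,300s, in line with the free-credit estimate.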

How to compare two sitemaps

  1. Go to the Sitemap Diff Tool page on Apify Store
  2. Click Try for free (no credit card required for the free tier)
  3. Enter your Sitemap URL A (the baseline/older version)
  4. Enter your Sitemap URL B (the current/newer version)
  5. Configure output options (include unchanged URLs, compare metadata)
  6. Click Start and wait for the run to complete
  7. Download results as JSON, CSV, or Excel from the dataset tab

Tip: To monitor sitemap changes over time, schedule the actor to run daily or weekly via Apify's built-in scheduler.

Example JSON inputs:

{
    "sitemapUrlA": "https://example.com/sitemap-2024-01-01.xml",
    "sitemapUrlB": "https://example.com/sitemap.xml",
    "compareMetadata": true,
    "includeUnchanged": false
}

Only compare URL presence (ignore metadata):

{
    "sitemapUrlA": "https://staging.example.com/sitemap.xml",
    "sitemapUrlB": "https://example.com/sitemap.xml",
    "compareMetadata": false,
    "includeUnchanged": false
}

Cap large sitemaps:

{
    "sitemapUrlA": "https://example.com/sitemap-old.xml",
    "sitemapUrlB": "https://example.com/sitemap.xml",
    "maxUrlsPerSitemap": 5000,
    "compareMetadata": true
}

Input parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| sitemapUrlA | string | Yes | - | URL of the baseline (old) sitemap |
| sitemapUrlB | string | Yes | - | URL of the current (new) sitemap |
| includeUnchanged | boolean | No | false | Include URLs with no changes in output |
| compareMetadata | boolean | No | true | Detect lastmod/priority/changefreq changes |
| maxUrlsPerSitemap | integer | No | 0 (unlimited) | Cap URLs per sitemap for large sites |

Output examples

Added URL:

{
    "changeType": "added",
    "url": "https://example.com/new-product",
    "lastmodB": "2026-04-01",
    "priorityB": 0.8,
    "changefreqB": "weekly"
}

Removed URL:

{
    "changeType": "removed",
    "url": "https://example.com/old-product",
    "lastmodA": "2025-12-15",
    "priorityA": 0.5,
    "changefreqA": "monthly"
}

Changed URL (metadata updated):

{
    "changeType": "changed",
    "url": "https://example.com/homepage",
    "lastmodA": "2025-11-01",
    "lastmodB": "2026-04-07",
    "priorityA": 0.9,
    "priorityB": 1.0,
    "changefreqA": "weekly",
    "changefreqB": "daily",
    "changedFields": ["lastmod", "priority", "changefreq"]
}

Run summary (saved to key-value store as SUMMARY):

{
    "sitemapUrlA": "https://example.com/sitemap-old.xml",
    "sitemapUrlB": "https://example.com/sitemap.xml",
    "stats": {
        "totalUrlsA": 1200,
        "totalUrlsB": 1245,
        "added": 52,
        "removed": 7,
        "changed": 38,
        "unchanged": 1155,
        "totalCompared": 1252
    },
    "completedAt": "2026-04-08T10:23:45.000Z"
}

Tips for best results

  • 🗂️ Start with a snapshot — before making site changes, run the actor once to capture the current sitemap as a baseline. Store the output URL for future comparisons.
  • 📅 Schedule weekly checks — use Apify Scheduler to run this actor every Monday morning. Set a webhook to notify Slack when added or removed counts are non-zero.
  • 🔢 Use maxUrlsPerSitemap for large sites — sitemaps with 50,000+ URLs are valid but slow to fetch. Cap at 10,000 for faster monitoring runs.
  • 🧩 Compare staging vs production — before deploying a site update, run a diff to confirm the sitemap changes match your intent.
  • 📤 Filter by changeType — in the dataset viewer, filter to changeType = removed to quickly see URLs that disappeared.
  • 🔄 Combine with link checkers — once you know which URLs were removed, pass them through a link checker actor to verify they return 301 redirects.
  • ⚠️ Sitemap index files are supported — if your sitemap URL returns <sitemapindex> XML, all child sitemaps are fetched automatically (up to 3 levels deep).
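To make the sitemap index behaviour concrete, here is a rough Python sketch of following an index down to its child sitemaps with a depth limit. It is not the actor's implementation: it uses the standard-library XML parser, `collect_urls` is a hypothetical name, the `fetch` callable is supplied by the caller, and reading "3 levels deep" as "at most three nested index levels are followed" is an assumption.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List


def _local(tag: str) -> str:
    # "{http://www.sitemaps.org/schemas/sitemap/0.9}loc" -> "loc"
    return tag.rsplit("}", 1)[-1]


def _locs(root: ET.Element, item_tag: str) -> List[str]:
    """<loc> values of the direct <url> or <sitemap> children of the root."""
    found = []
    for item in root:
        if _local(item.tag) != item_tag:
            continue
        for field in item:
            if _local(field.tag) == "loc" and field.text:
                found.append(field.text.strip())
    return found


def collect_urls(sitemap_url: str, fetch: Callable[[str], str],
                 max_depth: int = 3, _depth: int = 0) -> List[str]:
    """Return page URLs, following sitemap index files recursively."""
    root = ET.fromstring(fetch(sitemap_url))
    if _local(root.tag) != "sitemapindex":
        return _locs(root, "url")      # ordinary <urlset>
    if _depth >= max_depth:
        return []                      # stop descending past the limit
    urls: List[str] = []
    for child_url in _locs(root, "sitemap"):
        urls.extend(collect_urls(child_url, fetch, max_depth, _depth + 1))
    return urls
```

Passing `fetch` in as a parameter keeps the sketch independent of any HTTP library; in practice it could wrap `urllib.request.urlopen` or a similar client.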

Integrations

Sitemap Diff Tool integrates with Apify's ecosystem for automated SEO monitoring workflows:

Sitemap Diff Tool → Google Sheets via Make/Zapier After each run, export the dataset to Google Sheets. Set up a Make scenario that filters changeType = removed and emails your team when pages disappear.

Sitemap Diff Tool → Slack alerts Use Apify webhooks to POST the run summary to a Slack channel on completion. Your SEO team sees same-day alerts when critical URLs drop off the sitemap.

Sitemap Diff Tool → Scheduled weekly audit Use Apify's built-in scheduler (no external tool needed). Schedule a weekly run comparing a pinned baseline URL with the live sitemap. Review dataset changes in the Apify Console.

Sitemap Diff Tool → CI/CD pipeline via API Call the actor via Apify API in your deployment pipeline. If removed count exceeds a threshold, fail the deployment and notify the team.

Sitemap Diff Tool → Data warehouses (BigQuery, Snowflake) Export datasets to your data warehouse via Apify's direct integrations. Track sitemap change history over months to identify SEO trends.
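As one way to picture the CI/CD idea above, the following sketch turns a run summary into a pass/fail decision. It assumes the summary has the shape shown under "Output examples"; the function name `deployment_gate` and the `max_removed` threshold are hypothetical.

```python
from typing import Tuple


def deployment_gate(summary: dict, max_removed: int = 0) -> Tuple[int, str]:
    """Return (exit_code, message) for a pipeline step.

    exit_code is 1 when more URLs were removed than the threshold allows,
    otherwise 0.
    """
    removed = summary["stats"]["removed"]
    if removed > max_removed:
        return 1, (f"FAIL: {removed} URLs removed from sitemap "
                   f"(allowed: {max_removed})")
    return 0, f"OK: {removed} URLs removed (allowed: {max_removed})"
```

A pipeline step could print the message and pass the exit code to `sys.exit`, so that the deployment stops when the threshold is exceeded.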

Using the Apify API

Run the actor programmatically and integrate it into your SEO monitoring stack.

Node.js (apify-client):

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Start the actor and wait for the run to finish.
const run = await client.actor('automation-lab/sitemap-diff-tool').call({
    sitemapUrlA: 'https://example.com/sitemap-old.xml',
    sitemapUrlB: 'https://example.com/sitemap.xml',
    compareMetadata: true,
    includeUnchanged: false,
});

// Read the diff records from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
const removed = items.filter(i => i.changeType === 'removed');
console.log(`${removed.length} URLs removed from sitemap`);

Python:

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

# Start the actor and wait for the run to finish.
run = client.actor("automation-lab/sitemap-diff-tool").call(run_input={
    "sitemapUrlA": "https://example.com/sitemap-old.xml",
    "sitemapUrlB": "https://example.com/sitemap.xml",
    "compareMetadata": True,
})

# Read the diff records from the run's default dataset.
items = client.dataset(run["defaultDatasetId"]).list_items().items
removed = [i for i in items if i["changeType"] == "removed"]
print(f"{len(removed)} URLs removed from sitemap")
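The run summary can be read in a similar way. The sketch below assumes the summary is stored under the key SUMMARY in the run's default key-value store, as described earlier on this page; `fetch_summary` is a hypothetical helper name, while `key_value_store(...).get_record(...)` is the apify-client call for reading a record.

```python
from typing import Optional


def fetch_summary(client, run: dict) -> Optional[dict]:
    """Return the SUMMARY record of a finished run, or None if it is missing.

    `client` is an ApifyClient instance and `run` is the dict returned
    by `.call()`.
    """
    store = client.key_value_store(run["defaultKeyValueStoreId"])
    record = store.get_record("SUMMARY")
    # get_record returns None when the key does not exist.
    return record["value"] if record else None
```

With the client and run from the example above, `fetch_summary(client, run)["stats"]["removed"]` would give the removed count without paging through the dataset.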

cURL:

curl -X POST "https://api.apify.com/v2/acts/automation-lab~sitemap-diff-tool/runs?token=YOUR_API_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{
        "sitemapUrlA": "https://example.com/sitemap-old.xml",
        "sitemapUrlB": "https://example.com/sitemap.xml",
        "compareMetadata": true
    }'

Find your API token at console.apify.com/settings/integrations.

Use with AI agents via MCP

Sitemap Diff Tool is available as a tool for AI assistants that support the Model Context Protocol (MCP).

Add the Apify MCP server to your AI client — this gives you access to all Apify actors, including this one:

Setup for Claude Code

claude mcp add --transport http apify "https://mcp.apify.com"

Setup for Claude Desktop, Cursor, or VS Code

Add this to your MCP config file:

{
    "mcpServers": {
        "apify": {
            "url": "https://mcp.apify.com"
        }
    }
}

Your AI assistant will use OAuth to authenticate with your Apify account on first use.

Example prompts

Once connected, try asking your AI assistant:

  • "Use automation-lab/sitemap-diff-tool to compare https://example.com/sitemap-2025.xml with https://example.com/sitemap.xml and show me the removed URLs"
  • "Check if any URLs were added or removed from my sitemap this week by comparing the live sitemap with last week's version"
  • "Run a sitemap diff between our staging site and production and list pages that are in production but missing from staging"

Learn more in the Apify MCP documentation.

Is it legal to compare sitemaps?

Yes. XML sitemaps are public files that website owners publish intentionally to help search engines discover their content. Fetching and comparing public sitemap files is standard SEO practice, used by tools like Screaming Frog, Sitebulb, and Google Search Console.

This actor only accesses publicly available sitemap URLs — it does not crawl page content, bypass authentication, or interact with private resources. Always ensure your use complies with the website's terms of service and robots.txt.

FAQ

How fast does it compare sitemaps? Very fast — for two 1,000-URL sitemaps, a typical run takes under 10 seconds. The actor fetches both sitemaps in parallel and uses O(n) set operations for diffing. For large sitemaps (50,000+ URLs), expect 30–60 seconds.

What's the cost to compare a 10,000-URL sitemap? Two 10,000-URL sitemaps contain up to 20,000 unique URLs in total (the worst case, when they share no URLs). On the Free tier ($0.00115/URL) that is $23.00 plus the $0.005 start fee, about $23.01. On the Starter plan ($0.001/URL) it is about $20.01.

Is there an official Apify sitemap comparison tool or API? No — Apify does not provide a built-in sitemap diff feature. This actor fills that gap for SEO and DevOps teams. Unlike manual comparisons using spreadsheet formulas, this actor handles sitemap index files, metadata changes, and large URL sets automatically.

Why are results empty even though both sitemaps have URLs? This can happen if your sitemaps use non-standard XML formatting. The actor uses regex-based parsing tuned for standard <url> and <sitemap> tags. If your sitemap uses unusual namespaces or custom tags, the parser may miss entries. Try fetching the sitemap directly in a browser and checking its XML structure.
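To see why non-standard markup can produce empty results, consider a deliberately simple regex parser. The patterns below are illustrative assumptions, not the actor's actual expressions; they show how a parser that looks for literal <url> and <loc> tags finds nothing when the tags carry a namespace prefix.

```python
import re
from typing import List

# Illustrative patterns only: they match un-prefixed tags without attributes.
URL_BLOCK = re.compile(r"<url>(.*?)</url>", re.DOTALL)
LOC = re.compile(r"<loc>\s*(.*?)\s*</loc>", re.DOTALL)


def parse_locs(xml: str) -> List[str]:
    """Extract the <loc> value from every <url> block."""
    locs = []
    for block in URL_BLOCK.findall(xml):
        match = LOC.search(block)
        if match:
            locs.append(match.group(1))
    return locs


standard = "<urlset><url><loc>https://example.com/</loc></url></urlset>"
prefixed = ("<sm:urlset><sm:url><sm:loc>https://example.com/</sm:loc>"
            "</sm:url></sm:urlset>")

print(parse_locs(standard))  # one URL found
print(parse_locs(prefixed))  # nothing found: the tags are <sm:url>, not <url>
```

If your sitemap looks like the second string, rewriting it to use un-prefixed tags before comparison is one possible workaround.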

Why do some URLs show as both added and removed instead of changed? URL comparison is done by exact string match. If a URL changed from http:// to https:// or gained/lost a trailing slash, it appears as a remove + add pair rather than a single "changed" entry. This is intentional — the URL itself changed, which is significant for SEO redirects.
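If you would rather see such pairs side by side, you can post-process the dataset yourself. The sketch below pairs removed and added URLs that differ only by scheme, host letter case, or a trailing slash; `normalize` and `pair_moved_urls` are hypothetical helpers, and the normalization rules are an assumption you may want to adjust.

```python
from typing import List, Tuple
from urllib.parse import urlsplit


def normalize(url: str) -> Tuple[str, str, str]:
    """Comparison key that ignores scheme, host case, and a trailing slash."""
    parts = urlsplit(url)
    return (parts.netloc.lower(), parts.path.rstrip("/"), parts.query)


def pair_moved_urls(records: List[dict]) -> List[Tuple[str, str]]:
    """Return (old_url, new_url) pairs hidden inside remove + add records."""
    removed = {normalize(r["url"]): r["url"]
               for r in records if r["changeType"] == "removed"}
    pairs = []
    for record in records:
        if record["changeType"] != "added":
            continue
        old_url = removed.get(normalize(record["url"]))
        if old_url is not None:
            pairs.append((old_url, record["url"]))
    return pairs
```

Each pair is a candidate for a redirect check: the old URL should normally redirect to the new one.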

Can I compare very large sitemaps (100,000+ URLs)? Yes, but you may want to use maxUrlsPerSitemap to cap the comparison. The actor runs in 256 MB memory and handles tens of thousands of URLs comfortably. For 100K+ URL sitemaps, memory may become a constraint — consider processing in batches using maxUrlsPerSitemap.

Does it support password-protected or authenticated sitemaps? No — this actor only fetches publicly accessible sitemaps. If your sitemap requires authentication, you would need to pre-download it and serve it from a public URL first.

Other SEO and utility tools

Explore other tools from automation-lab for your SEO and DevOps workflows: