Sitemap Xml Monitor

Pricing

from $0.005 / sitemap compare

Monitor sitemap.xml files for structural, availability, and content changes. Detect critical SEO issues like URL removals, broken sitemaps, index changes, and formatting errors with severity-based alerts.

Developer

Datawinder

Maintained by Community

Sitemap.xml Monitor

Stateful sitemap.xml monitoring Actor with baseline awareness, diff-based detection, and severity-classified alerts.

This Actor is designed for monitoring, not validation or SEO auditing.
It reports only meaningful changes over time and avoids noisy false positives.

This Actor is stateful. Alerts are emitted only after a baseline snapshot exists (from the second run onward).


Snapshot Contract

This Actor uses a versioned, stable snapshot schema.

  • Snapshot version: v1
  • Schema changes require explicit migration
  • Downstream consumers may rely on field names and severity semantics

What this Actor monitors

  • sitemap.xml availability (HTTP reachability)
  • Sitemap type changes (index vs urlset)
  • Large-scale URL removals (mass deletion protection)
  • New URL additions
  • Metadata changes (lastmod regressions, priority updates)
  • Formatting-only edits (comments / whitespace)

The Actor stores a baseline snapshot on first run and compares all subsequent runs against it.
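
As a rough illustration of what a normalized snapshot could contain, the sketch below parses a sitemap into an order-independent dictionary. It assumes the standard sitemaps.org namespace; the function name `normalize_sitemap` and the keys `type` and `entries` are hypothetical and are not the Actor's actual v1 schema.

```python
import xml.etree.ElementTree as ET

# Standard sitemaps.org namespace, in ElementTree's {uri}tag notation.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def _text(node, tag):
    """Return the stripped text of a child element, or None if absent or empty."""
    value = (node.findtext(SITEMAP_NS + tag) or "").strip()
    return value or None


def normalize_sitemap(xml_text: str) -> dict:
    """Parse sitemap XML into a snapshot keyed by <loc> (hypothetical shape)."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return {"type": "unknown", "entries": {}}

    if root.tag == SITEMAP_NS + "urlset":
        kind, child = "urlset", "url"
    elif root.tag == SITEMAP_NS + "sitemapindex":
        kind, child = "index", "sitemap"
    else:
        return {"type": "unknown", "entries": {}}

    entries = {}
    for node in root.findall(SITEMAP_NS + child):
        loc = _text(node, "loc")
        if loc is None:
            continue  # entries without a location cannot be tracked
        entries[loc] = {
            "lastmod": _text(node, "lastmod"),
            "changefreq": _text(node, "changefreq"),
            "priority": _text(node, "priority"),
        }
    return {"type": kind, "entries": entries}
```

Keying entries by URL means that element order and whitespace in the source file do not affect later comparisons.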


Alert Semantics (Severity Contract)

This Actor follows a strict severity contract.

Each severity level has a clear operational meaning, so you can wire alerts to it without causing alert fatigue.

Severity levels

🔴 Critical

Meaning: Access restriction, structural breakage, or mass data loss.

You should act immediately if this affects your SEO coverage.

Triggered when:

  • sitemap.xml becomes unreachable (HTTP error or network failure)
  • Sitemap type changes unexpectedly (e.g., urlset → unknown)
  • Mass removal of URLs (≥ 30%) or Sitemap Index entries (≥ 50%)

Critical alerts are intentionally rare.
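
The mass-removal thresholds above can be sketched as a simple ratio check. This is a minimal example that assumes the ratio is measured against the number of entries in the baseline; the function names are hypothetical and the Actor's real calculation may differ.

```python
# Thresholds taken from the severity contract above.
URL_REMOVAL_THRESHOLD = 0.30
INDEX_REMOVAL_THRESHOLD = 0.50


def removal_ratio(baseline: set[str], current: set[str]) -> float:
    """Share of baseline entries missing from the current snapshot."""
    if not baseline:
        return 0.0  # nothing to lose, so nothing was removed
    return len(baseline - current) / len(baseline)


def is_mass_removal(baseline: set[str], current: set[str], *, is_index: bool = False) -> bool:
    """True when removals reach the critical threshold (assumed baseline-relative)."""
    threshold = INDEX_REMOVAL_THRESHOLD if is_index else URL_REMOVAL_THRESHOLD
    return removal_ratio(baseline, current) >= threshold
```

With a baseline of 10 URLs, losing 3 reaches the 30% threshold for a urlset, while a sitemap index would need to lose 5 of 10 entries.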


🟠 Warning

Meaning: Potential quality issues or minor regressions.

Triggered when:

  • Individual URLs are removed (below the mass-removal threshold)
  • lastmod timestamps move backwards (regression)
  • Sitemap becomes unparseable but still reachable

🔵 Info

Meaning: Operational visibility and growth tracking.

Triggered when:

  • New URLs are added
  • Metadata changes (changefreq, priority)
  • Service recovers from an outage
  • Formatting-only changes detected
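
The warning and info rules for individual entries could be expressed as a diff over two snapshots. The sketch below is illustrative only: the change type names (`url_removed`, `lastmod_regression`, and so on) are hypothetical, and it compares only the date portion of lastmod, assuming values start with YYYY-MM-DD.

```python
from datetime import date
from typing import Optional


def _day(value: Optional[str]) -> Optional[date]:
    """Parse the leading YYYY-MM-DD of a lastmod value; None if not parseable."""
    try:
        return date.fromisoformat(value[:10]) if value else None
    except ValueError:
        return None


def diff_entries(baseline: dict, current: dict) -> list[dict]:
    """Compare two {loc: metadata} mappings and classify each change."""
    changes = []
    for loc in sorted(baseline.keys() - current.keys()):
        changes.append({"severity": "warning", "type": "url_removed", "loc": loc})
    for loc in sorted(current.keys() - baseline.keys()):
        changes.append({"severity": "info", "type": "url_added", "loc": loc})
    for loc in sorted(baseline.keys() & current.keys()):
        old, new = baseline[loc], current[loc]
        old_day, new_day = _day(old.get("lastmod")), _day(new.get("lastmod"))
        if old_day and new_day and new_day < old_day:
            # lastmod moved backwards: treated as a regression
            changes.append({"severity": "warning", "type": "lastmod_regression", "loc": loc})
        for field in ("changefreq", "priority"):
            if old.get(field) != new.get(field):
                changes.append({"severity": "info", "type": f"{field}_changed", "loc": loc})
    return changes
```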

First Run (Baseline)

On the first execution:

  • sitemap.xml is fetched
  • A normalized snapshot is stored
  • No diff or alerts are emitted
  • unchanged is null

This behavior is intentional. Monitoring begins on the second run onward.
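
The baseline behavior can be pictured with a small sketch that uses a plain dictionary in place of the KV store. The function name `run_once`, the boolean meaning given to `baseline`, and the choice to leave the stored baseline untouched on later runs are all assumptions made for illustration.

```python
def run_once(store: dict, site_key: str, snapshot: dict) -> dict:
    """Store a baseline on the first run; compare against it afterwards."""
    baseline = store.get(site_key)
    if baseline is None:
        # First run: persist the snapshot and emit no diff or alerts.
        store[site_key] = snapshot
        return {"baseline": True, "unchanged": None, "changes": []}
    # Later runs: a baseline exists, so a comparison is possible.
    return {"baseline": False, "unchanged": baseline == snapshot}
```

Calling `run_once` twice with the same snapshot yields `unchanged` of `None` on the first call and `True` on the second.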


Output Contract

Each run produces:

  • One snapshot stored in a KV store (per monitored site)
  • One dataset row summarizing the run
  • A structured OUTPUT object containing:
    • baseline
    • unchanged
    • summary (critical / warning / info counts)
    • changes[]

This makes the Actor safe for:

  • Scheduling
  • Webhooks
  • Alert automation
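
For downstream consumers, the OUTPUT object described above might be modeled roughly as follows. The top-level field names come from the contract; the types, the boolean reading of `baseline`, the per-change `severity` key, and the helper `build_output` are assumptions.

```python
from collections import Counter
from typing import Optional, TypedDict


class Summary(TypedDict):
    critical: int
    warning: int
    info: int


class Output(TypedDict):
    baseline: bool              # assumed: True on the first (baseline) run
    unchanged: Optional[bool]   # None on the baseline run
    summary: Summary
    changes: list[dict]


def build_output(changes: list[dict], *, is_baseline: bool) -> Output:
    """Assemble an OUTPUT-like object from classified changes (illustrative)."""
    counts = Counter(change["severity"] for change in changes)
    return {
        "baseline": is_baseline,
        "unchanged": None if is_baseline else not changes,
        "summary": {
            "critical": counts["critical"],
            "warning": counts["warning"],
            "info": counts["info"],
        },
        "changes": changes,
    }
```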

Fetch Failure Semantics

  • httpStatus = 0 indicates a network error or timeout
  • Fetch timeouts are treated as unreachable
  • Output is still produced even on failure
  • Snapshots are still stored for continuity
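
A fetch that follows these semantics could look like the sketch below, written with Python's standard library. Only `httpStatus` is taken from the contract; the function name, the `body` key, and the injectable `opener` parameter (included to make the function easy to test) are assumptions.

```python
import urllib.error
import urllib.request


def fetch_sitemap(url: str, timeout: float = 10.0, opener=urllib.request.urlopen) -> dict:
    """Fetch a sitemap and always return a result, even on failure."""
    try:
        with opener(url, timeout=timeout) as response:
            return {"httpStatus": response.status, "body": response.read()}
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 503.
        return {"httpStatus": err.code, "body": None}
    except OSError:
        # URLError, timeouts, and socket errors all derive from OSError:
        # report them as httpStatus 0 (network error or timeout).
        return {"httpStatus": 0, "body": None}
```

Because every branch returns a dictionary, the caller can still produce output and store a snapshot for a failed run.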

Deliberately Ignored Changes

The following do NOT trigger rule-level alerts:

  • Attribute order changes
  • Whitespace differences
  • Tag reordering (normalized by parsing)
  • Namespace prefix changes

These may still appear as formatting_only info events.
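
One way to tell a formatting-only edit from a real change is to compare a digest of the raw text with a digest of the parsed content. The sketch below is an assumption about how such a check could work, not the Actor's implementation; it looks only at the direct children of each entry and ignores attributes.

```python
import hashlib
import xml.etree.ElementTree as ET


def raw_digest(xml_text: str) -> str:
    """Digest of the file exactly as served."""
    return hashlib.sha256(xml_text.encode("utf-8")).hexdigest()


def semantic_digest(xml_text: str) -> str:
    """Digest of parsed content; comments, whitespace, order, and prefixes drop out."""
    root = ET.fromstring(xml_text)  # the default parser discards comments
    entries = sorted(
        tuple(sorted((child.tag, (child.text or "").strip()) for child in node))
        for node in root
    )
    return hashlib.sha256(repr((root.tag, entries)).encode("utf-8")).hexdigest()


def is_formatting_only(old_xml: str, new_xml: str) -> bool:
    """True when the bytes changed but the parsed content did not."""
    return (
        raw_digest(old_xml) != raw_digest(new_xml)
        and semantic_digest(old_xml) == semantic_digest(new_xml)
    )
```

ElementTree resolves namespace prefixes to full URIs, so a prefix rename alone leaves the semantic digest unchanged.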


Design Philosophy

  • Stateful, not stateless
  • Monitoring, not auditing
  • Low noise over high sensitivity
  • Safe to run indefinitely
  • Clear alert meaning

If you wire alerts:

  • Page on critical
  • Notify on warning
  • Log info
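
The routing above can be sketched as a small dispatcher over the `changes` array. It assumes each change carries a `severity` key with the values critical, warning, or info; the `page`, `notify`, and `log` callables are placeholders for whatever alerting channels you use.

```python
from typing import Callable


def route_alerts(
    output: dict,
    page: Callable[[dict], None],
    notify: Callable[[dict], None],
    log: Callable[[dict], None],
) -> None:
    """Send each change to the channel that matches its severity."""
    for change in output.get("changes", []):
        severity = change.get("severity")
        if severity == "critical":
            page(change)
        elif severity == "warning":
            notify(change)
        else:
            log(change)  # info and anything unrecognized is only logged
```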

Recommended usage:

  • Run daily or hourly
  • Combine with robots.txt and URL monitors
  • Use Apify webhooks for alerting
  • Treat sitemap.xml as a coverage signal, not a static file