Sitemap Xml Monitor
Pricing
from $0.005 / sitemap compare
Monitor sitemap.xml files for structural, availability, and content changes. Detect critical SEO issues like URL removals, broken sitemaps, index changes, and formatting errors with severity-based alerts.
Developer
DatawinderLabs
Maintained by Community
Sitemap.xml Monitor
Stateful sitemap.xml monitoring Actor with baseline awareness, diff-based detection, and severity-classified alerts.
This Actor is designed for monitoring, not validation or SEO auditing.
It reports only meaningful changes over time and avoids noisy false positives.
This Actor is stateful. Alerts are emitted only after a baseline snapshot exists (from the second run onward).
Snapshot Contract
This Actor uses a versioned, stable snapshot schema.
- Snapshot version: v1
- Schema changes require explicit migration
- Downstream consumers may rely on field names and severity semantics
What this Actor monitors
- sitemap.xml availability (HTTP reachability)
- Sitemap type changes (index vs urlset)
- Large-scale URL removals (mass deletion protection)
- New URL additions
- Metadata changes (lastmod regressions, priority updates)
- Formatting-only edits (comments / whitespace)
The Actor stores a baseline snapshot on first run and compares all subsequent runs against it.
Alert Semantics (Severity Contract)
This Actor follows a strict severity contract.
Each severity level has a clear operational meaning so you can safely wire alerts without alert fatigue.
Severity levels
🔴 Critical
Meaning: Access restriction, structural breakage, or mass data loss.
You should act immediately if this affects your SEO coverage.
Triggered when:
- sitemap.xml becomes unreachable (HTTP error or network failure)
- Sitemap type changes unexpectedly (e.g., urlset → unknown)
- Mass removal of URLs (≥ 30%) or Sitemap Index entries (≥ 50%)
Critical alerts are intentionally rare.
🟠 Warning
Meaning: Potential quality issues or minor regressions.
Triggered when:
- Individual URLs are removed
- lastmod timestamps move backwards (regression)
- Sitemap becomes unparseable but still reachable
🔵 Info
Meaning: Operational visibility and growth tracking.
Triggered when:
- New URLs are added
- Metadata changes (changefreq, priority)
- Service recovers from an outage
- Formatting-only changes detected
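The removal thresholds above can be sketched as a small classifier. This is a minimal illustration, not the Actor's implementation: the function name `classify_removals`, the choice to measure the ratio against the baseline count, and the treatment of below-threshold index removals as warnings are all assumptions.

```python
from typing import Optional, Set

# Thresholds quoted from the severity contract above.
URL_MASS_REMOVAL_RATIO = 0.30
INDEX_MASS_REMOVAL_RATIO = 0.50


def classify_removals(baseline: Set[str], current: Set[str],
                      is_index: bool = False) -> Optional[str]:
    """Return 'critical', 'warning', or None for removals between two snapshots.

    Hypothetical helper: the ratio is computed against the baseline size,
    which is an assumption about how the Actor measures mass removal.
    """
    if not baseline:
        return None  # nothing to compare against
    removed = baseline - current
    if not removed:
        return None  # no removals, so no removal alert
    threshold = INDEX_MASS_REMOVAL_RATIO if is_index else URL_MASS_REMOVAL_RATIO
    ratio = len(removed) / len(baseline)
    return "critical" if ratio >= threshold else "warning"
```

With ten baseline URLs, losing one would classify as a warning and losing four as critical under these assumptions.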
First Run (Baseline)
On the first execution:
- sitemap.xml is fetched
- A normalized snapshot is stored
- No diff or alerts are emitted
- unchanged is null
This behavior is intentional. Monitoring begins on the second run onward.
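The first-run behavior can be sketched as follows. The snapshot shape (`{"urls": [...]}`), the change type names `url_added` / `url_removed`, and the boolean `baseline` flag are assumptions made for illustration; only the rule that the first run emits no diff and leaves `unchanged` as null comes from the description above.

```python
from typing import Optional


def compare_run(previous: Optional[dict], current: dict) -> dict:
    """Build a run result; `previous` is None when no baseline snapshot exists.

    Hypothetical sketch of the baseline-then-diff flow, not the Actor's code.
    """
    if previous is None:
        # First run: the snapshot becomes the baseline, no diff, no alerts.
        return {"baseline": True, "unchanged": None, "changes": []}

    prev_urls = set(previous["urls"])
    curr_urls = set(current["urls"])
    changes = [{"type": "url_removed", "url": u} for u in sorted(prev_urls - curr_urls)]
    changes += [{"type": "url_added", "url": u} for u in sorted(curr_urls - prev_urls)]
    return {"baseline": False, "unchanged": not changes, "changes": changes}
```

Keeping `unchanged` as `None` on the first run lets downstream consumers distinguish "nothing to compare yet" from "compared and identical".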
Output Contract
Each run produces:
- One snapshot stored in a KV store (per monitored site)
- One dataset row summarizing the run
- A structured OUTPUT object containing:
  - baseline
  - unchanged
  - summary (critical / warning / info counts)
  - changes[]
This makes the Actor safe for:
- Scheduling
- Webhooks
- Alert automation
Fetch Failure Semantics
- httpStatus = 0 indicates a network error or timeout
- Fetch timeouts are treated as unreachable
- Output is still produced even on failure
- Snapshots are still stored for continuity
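One way to get these semantics with the Python standard library is sketched below. The function name `fetch_sitemap` and the `reachable` / `body` keys are hypothetical; only the `httpStatus = 0` convention for network errors and timeouts is taken from the list above.

```python
import urllib.error
import urllib.request


def fetch_sitemap(url: str, timeout: float = 10.0) -> dict:
    """Fetch a sitemap and always return a result dict, even on failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return {"httpStatus": resp.status, "reachable": True, "body": resp.read()}
    except urllib.error.HTTPError as exc:
        # The server answered with an error status (4xx / 5xx).
        return {"httpStatus": exc.code, "reachable": False, "body": None}
    except OSError:
        # URLError, connection errors, and socket timeouts are all OSError
        # subclasses: no HTTP response was received at all.
        return {"httpStatus": 0, "reachable": False, "body": None}
```

`HTTPError` is caught first because it is itself a subclass of `URLError` (and therefore of `OSError`); reversing the order would collapse real HTTP error codes into 0.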
Deliberately Ignored Changes
The following do NOT trigger rule-level alerts:
- Attribute order changes
- Whitespace differences
- Tag reordering (normalized by parsing)
- Namespace prefix changes
These may still appear as formatting_only info events.
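To illustrate why such edits need not register as rule-level changes, here is a minimal normalization sketch using `xml.etree.ElementTree`. It is an assumed approach, not the Actor's actual normalizer, and the names `normalize` and `local_name` are hypothetical. Parsing discards comments and prefix spelling, and keying entries by `loc` removes any dependence on element order.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip the '{namespace}' part ElementTree prepends to tag names."""
    return tag.rsplit("}", 1)[-1]


def normalize(xml_text: str) -> dict:
    """Reduce a sitemap to its type plus a dict of entries keyed by <loc>."""
    root = ET.fromstring(xml_text)
    entries = {}
    for node in root:  # <url> or <sitemap> elements; comments are not parsed
        fields = {local_name(child.tag): (child.text or "").strip() for child in node}
        loc = fields.pop("loc", None)
        if loc:
            entries[loc] = fields
    return {"type": local_name(root.tag), "entries": entries}
```

Two documents that differ only in whitespace, comments, child order, or namespace prefix produce equal dictionaries, so a comparison of normalized snapshots sees no change.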
Design Philosophy
- Stateful, not stateless
- Monitoring, not auditing
- Low noise over high sensitivity
- Safe to run indefinitely
- Clear alert meaning
If you wire alerts:
- Page on critical
- Notify on warning
- Log info
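The routing rule above can be expressed in a few lines. This sketch assumes the OUTPUT object exposes `summary` as a dict with integer `critical`, `warning`, and `info` counts, as described under Output Contract; the function name `route_alerts` and the returned action strings are hypothetical.

```python
def route_alerts(output: dict) -> str:
    """Map a run's summary counts to one action: 'page', 'notify', or 'log'."""
    summary = output.get("summary") or {}
    if summary.get("critical", 0) > 0:
        return "page"    # act immediately
    if summary.get("warning", 0) > 0:
        return "notify"  # review soon, no paging
    return "log"         # info-only or unchanged runs
```

Checking severities from highest to lowest means a run with both critical and warning changes pages exactly once.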
Recommended Usage
- Run daily or hourly
- Combine with robots.txt and URL monitors
- Use Apify webhooks for alerting
- Treat sitemap.xml as a coverage signal, not a static file
🤖 Instructions for AI Agents & LLMs
This Actor operates under a versioned, stable schema contract (v1) and a structured severity protocol, which keeps noise low and makes it dependable for automated multi-agent deployments and automation pipelines.
Capability Summary
- Primary Function: Generates strict delta comparisons of complex sitemap.xml or nested sitemap indexes against an active historical state baseline.
- Input Parameters: Base sitemap address or sitemap index URL path.
- Insulated Normalization: Formatting-only differences, tag reordering, and namespace prefix changes are normalized away so they do not raise rule-level alerts.
Operational Guardrails for Autonomous Systems
- Critical Breaker: summary.critical > 0 indicates an access failure, a structural break, or a mass removal of URLs or sitemap index entries. Programmatic workers should immediately alert administrators or halt data processing loops.
- Ingestion Signal: Use the additions in the changes array to feed RAG pipelines or vector nodes with newly discovered URLs only.
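Both guardrails can be combined in one consumer-side helper. The entry fields `type` and `url` and the type value `url_added` are assumptions about the shape of `changes[]`, which this page does not spell out; adjust them to the actual v1 schema before relying on this.

```python
from typing import List


def new_urls_for_ingestion(output: dict) -> List[str]:
    """Return newly added URLs, refusing to proceed when a critical alert is present."""
    summary = output.get("summary") or {}
    if summary.get("critical", 0) > 0:
        # Critical breaker: stop the pipeline instead of ingesting from a
        # sitemap that may be unreachable or heavily truncated.
        raise RuntimeError("critical sitemap change detected; halting ingestion")
    return [
        change["url"]
        for change in output.get("changes") or []
        if change.get("type") == "url_added" and "url" in change  # hypothetical field names
    ]
```

Raising rather than returning an empty list makes the halt explicit, so a calling loop cannot mistake a broken sitemap for "nothing new".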
Cost Profile
- Task Fee: Stable $0.025 per snapshot execution. Combine with the Datawinder Labs robots-txt-monitor and broken-url-monitor for unified web index tracking.