Updated Content Checker
Pricing
from $0.10 / 1,000 results
Monitors sitemaps for new/updated content. Returns only URLs modified since a specified date for efficient incremental scraping.
Developer: Tomáš Gabík (maintained by Community)
Apify Actor that monitors sitemaps for new/updated content. Returns only URLs modified since a specified date, enabling efficient incremental scraping with tools like Website Content Crawler (WCC).
Features
- Parses regular sitemaps and sitemap indexes
- Relative date filter: "7 days", "2 weeks", "1 month", "24 hours"
- Absolute date filter: ISO 8601 dates like "2026-01-15"
- Optional regex pattern filtering for URLs
- Persists last check timestamp for automatic incremental checks
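The core filtering step can be sketched as follows. This is a minimal illustration, not the Actor's actual source: the function names `parse_lastmod` and `updated_urls` are hypothetical, and it assumes that entries without a `lastmod` value are skipped and that a URL counts as updated when its `lastmod` is strictly later than the cutoff.

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_lastmod(value: str) -> datetime:
    """Parse a lastmod value such as 2026-01-20 or 2026-01-20T15:59:22Z."""
    parsed = datetime.fromisoformat(value.strip().replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: values without a timezone are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def updated_urls(sitemap_xml: str, cutoff: datetime) -> list[dict]:
    """Return url/lastModified records for entries modified after cutoff."""
    root = ET.fromstring(sitemap_xml)
    results = []
    for entry in root.findall("sm:url", SITEMAP_NS):
        loc = entry.findtext("sm:loc", namespaces=SITEMAP_NS)
        lastmod = entry.findtext("sm:lastmod", namespaces=SITEMAP_NS)
        if not loc or not lastmod:
            continue  # Assumption: entries without lastmod are skipped.
        if parse_lastmod(lastmod) > cutoff:
            results.append({"url": loc.strip(), "lastModified": lastmod.strip()})
    return results
```

A sitemap index would need one extra pass that collects the `loc` of each `sitemap` element and fetches the child sitemaps before applying the same filter.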
Input
| Field | Type | Required | Description |
|---|---|---|---|
| sitemapUrl | string | Yes | URL of the sitemap.xml file |
| newerThan | string | No | Relative filter: "7 days", "2 weeks", "1 month", "24 hours" |
| newerThanDate | string | No | Absolute date filter (ISO 8601) |
| urlPattern | string | No | Regex pattern to filter URLs |
| storeLastCheck | boolean | No | Store timestamp for future runs (default: false) |
| storeName | string | No | Key-value store name for persistence |
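Cutoff priority: if more than one cutoff is available, newerThan takes precedence over newerThanDate, which in turn takes precedence over the stored last-check timestamp.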
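The precedence rule can be sketched like this. It is an illustration under stated assumptions, not the Actor's implementation: `parse_relative` and `resolve_cutoff` are hypothetical names, "1 month" is approximated as 30 days, and only the "relative time: ..." source label comes from the sample summary in the Output section; the other labels are made up for the sketch.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: a month is approximated as 30 days.
_UNITS = {
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
    "month": timedelta(days=30),
}


def parse_relative(text: str) -> timedelta:
    """Turn a string such as '7 days' or '24 hours' into a timedelta."""
    match = re.fullmatch(r"\s*(\d+)\s+(hour|day|week|month)s?\s*", text.lower())
    if match is None:
        raise ValueError(f"Unsupported relative filter: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def resolve_cutoff(
    now: datetime,
    newer_than: Optional[str] = None,
    newer_than_date: Optional[str] = None,
    stored: Optional[datetime] = None,
) -> tuple[Optional[datetime], str]:
    """Apply the priority newerThan > newerThanDate > stored date."""
    if newer_than:
        return now - parse_relative(newer_than), f"relative time: {newer_than}"
    if newer_than_date:
        parsed = datetime.fromisoformat(newer_than_date.replace("Z", "+00:00"))
        if parsed.tzinfo is None:
            parsed = parsed.replace(tzinfo=timezone.utc)  # Assumption: UTC.
        return parsed, "absolute date"  # Hypothetical label.
    if stored:
        return stored, "stored last check"  # Hypothetical label.
    return None, "none"  # Hypothetical label: no cutoff available.
```

The sample summary shows a cutoff at midnight for a "7 days" filter, so the Actor may round the cutoff to the start of the day; this sketch subtracts the interval from the current time without rounding.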
Output
Dataset
{
  "url": "https://example.com/article",
  "lastModified": "2026-01-20T15:59:22Z"
}
OUTPUT (Summary)
{
  "sitemapUrl": "https://example.com/sitemap.xml",
  "totalUrlsInSitemap": 500,
  "filteredUrlCount": 500,
  "updatedUrlCount": 10,
  "cutoffDate": "2026-01-13T00:00:00Z",
  "cutoffSource": "relative time: 7 days",
  "checkedAt": "2026-01-20T16:00:00Z"
}
Usage Examples
Get URLs updated in the last 7 days
{
  "sitemapUrl": "https://help.wealthsimple.com/hc/sitemap.xml",
  "newerThan": "7 days"
}
Get URLs updated since a specific date
{
  "sitemapUrl": "https://help.wealthsimple.com/hc/sitemap.xml",
  "newerThanDate": "2026-01-15"
}
Return only English (en-ca) articles
{
  "sitemapUrl": "https://help.wealthsimple.com/hc/sitemap.xml",
  "newerThan": "7 days",
  "urlPattern": "/en-ca/articles/"
}
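To show how a pattern such as "/en-ca/articles/" narrows the results, here is a small sketch. It assumes the pattern is matched anywhere in the URL (search semantics rather than a full match), which the description does not confirm; `filter_urls` is a hypothetical helper and the article URLs are made-up examples.

```python
import re
from typing import Iterable, Optional


def filter_urls(urls: Iterable[str], pattern: Optional[str]) -> list[str]:
    """Keep URLs that contain a match for pattern; keep all if no pattern."""
    if not pattern:
        return list(urls)
    compiled = re.compile(pattern)
    # Assumption: the pattern may match anywhere in the URL.
    return [url for url in urls if compiled.search(url)]


# Hypothetical URLs for illustration only.
candidates = [
    "https://help.wealthsimple.com/hc/en-ca/articles/100",
    "https://help.wealthsimple.com/hc/fr-ca/articles/100",
    "https://help.wealthsimple.com/hc/en-ca/categories/200",
]
english_articles = filter_urls(candidates, "/en-ca/articles/")
```

With these inputs, only the first URL remains in `english_articles`.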
Integration with Website Content Crawler
Use the output dataset as input for WCC to scrape only updated pages:
- Run this Actor to get updated URLs
- Pass the dataset URLs to Website Content Crawler
- Website Content Crawler then scrapes only the pages that changed
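The hand-off between the two Actors can be sketched as a small transformation from this Actor's dataset items to a crawler input. This is a sketch under assumptions: `build_wcc_input` is a hypothetical helper, and `startUrls` and `maxCrawlDepth` are used as I understand the Website Content Crawler input schema, so check them against its documentation before relying on them.

```python
from typing import Iterable


def build_wcc_input(items: Iterable[dict]) -> dict:
    """Build a crawler input that visits each updated URL exactly once."""
    seen = set()
    start_urls = []
    for item in items:
        url = item.get("url")
        if url and url not in seen:
            seen.add(url)
            start_urls.append({"url": url})
    return {
        "startUrls": start_urls,
        # Assumption: depth 0 keeps the crawler from following links
        # away from the updated pages.
        "maxCrawlDepth": 0,
    }


# Items shaped like the dataset records shown in the Output section.
dataset_items = [
    {"url": "https://example.com/article", "lastModified": "2026-01-20T15:59:22Z"},
    {"url": "https://example.com/article", "lastModified": "2026-01-20T15:59:22Z"},
]
wcc_input = build_wcc_input(dataset_items)
```

The resulting dictionary can be passed as the run input when starting Website Content Crawler, for example through the Apify API or a scheduled task.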