Sitemap Change Detector

Pricing

Pay per usage

Developed by

Tri⟁angle

Maintained by Apify

Identify and monitor sitemaps for specified websites. Retrieve only the new, updated, or removed URLs since the last crawl.

0.0 (0)

2

Total users

13

Monthly users

13

Runs succeeded

89%

Last modified

24 days ago

Start URLs

startUrls (array, Required)

List of start URLs to scrape. These can be direct sitemap URLs, or websites on which the sitemaps will be discovered if discoverSitemaps is enabled.

Discover sitemaps

discoverSitemaps (boolean, Optional)

If enabled, the actor fetches each start URL's robots.txt and enqueues every sitemap URL it finds. This is useful if you don't want to enter direct sitemap URLs. Note that this only works if the website has a robots.txt file.

Default value of this property is true
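
The discovery step described above can be sketched as follows. This is a minimal illustration, not the actor's actual code: the helper names `robots_txt_url` and `sitemap_urls_from_robots` are hypothetical, and the sketch assumes that sitemaps are declared in robots.txt with the standard `Sitemap:` directive.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(start_url: str) -> str:
    """Build the robots.txt URL for the site that hosts start_url (hypothetical helper)."""
    parts = urlsplit(start_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def sitemap_urls_from_robots(robots_txt: str) -> list[str]:
    """Return the URLs declared by 'Sitemap:' lines in a robots.txt body."""
    found = []
    for line in robots_txt.splitlines():
        # Drop trailing comments; this also cuts URLs containing '#',
        # which is an accepted simplification in this sketch.
        line = line.split("#", 1)[0].strip()
        key, sep, value = line.partition(":")
        # The directive name is matched case-insensitively.
        if sep and key.strip().lower() == "sitemap" and value.strip():
            found.append(value.strip())
    return found
```

A site whose robots.txt has no `Sitemap:` line yields an empty list here, so for such sites direct sitemap URLs would have to be supplied in startUrls.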

Change types

changeTypes (array, Optional)

Types of URL changes to include in the output, relative to the previous snapshot.

Default value of this property is ["NEW","UPDATED"]
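
To illustrate how change types could be derived, here is a rough sketch that compares two snapshots. It rests on assumptions the page does not confirm: a snapshot is modeled as a mapping from URL to its sitemap `lastmod` string, a differing `lastmod` counts as an update, and `"REMOVED"` is used as the label for removed URLs. The function names are hypothetical.

```python
def diff_snapshots(previous: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Classify URLs by comparing two {url: lastmod} snapshots (assumed format)."""
    new = sorted(url for url in current if url not in previous)
    removed = sorted(url for url in previous if url not in current)
    # Assumption: a URL is "updated" when its lastmod value differs.
    updated = sorted(
        url for url in current
        if url in previous and current[url] != previous[url]
    )
    return {"NEW": new, "UPDATED": updated, "REMOVED": removed}


def select_changes(changes: dict[str, list[str]],
                   change_types=("NEW", "UPDATED")) -> dict[str, list[str]]:
    """Keep only the requested change types, mirroring the changeTypes default."""
    return {kind: urls for kind, urls in changes.items() if kind in change_types}
```

With the default selection, URLs that disappeared from the sitemap are computed but left out of the result.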

Snapshot key prefix

snapshotKeyPrefix (string, Optional)

Prefix for the snapshot record key stored in the snapshots key-value store, to separate runs by website or project.

Default value of this property is "DEFAULT"

URL filter regex

urlFilterRegex (string, Optional)

Regex pattern to filter which URLs are included in the output and snapshot. This filter applies only to the final URLs and not to intermediate sitemap URLs.

Add removed URLs to key-value store

addRemovedUrlsToKvs (boolean, Optional)

If enabled, the actor also stores the URLs that were removed since the previous snapshot in the key-value store.

Default value of this property is false

Proxy configuration

proxyConfiguration (object, Optional)

Proxy configuration used for crawling.

Default value of this property is {"useApifyProxy":true}
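
Putting the fields together, a run input might look like the sketch below. The property names and defaults come from the list above; the example website, the snapshot prefix, the regex, and the shape of each startUrls entry (an object with a `url` key) are assumptions made for illustration.

```python
import json

# Hypothetical run input; only the property names and defaults are from the schema.
run_input = {
    "startUrls": [{"url": "https://example.com"}],  # assumed entry shape
    "discoverSitemaps": True,                       # default
    "changeTypes": ["NEW", "UPDATED"],              # default
    "snapshotKeyPrefix": "EXAMPLE-SITE",            # illustrative; default is "DEFAULT"
    "urlFilterRegex": "/blog/",                     # illustrative
    "addRemovedUrlsToKvs": False,                   # default
    "proxyConfiguration": {"useApifyProxy": True},  # default
}

# Serialize to JSON, the format in which actor inputs are submitted.
input_json = json.dumps(run_input, indent=2)
```

Using a distinct snapshotKeyPrefix per website or project keeps each one's snapshot separate, so a run for one site is not compared against another site's previous state.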