Website Backup
Creates a backup of any website by crawling it, so that you don't lose content by accident. Ideal, for example, for a personal or company blog.
Start URLs
startURLs
array, optional
List of URL entry points. Each entry is an object of type {'url': 'http://www.example.com'}
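As a minimal sketch of the expected shape, the snippet below wraps plain URL strings into the `{'url': ...}` objects described above. The helper name `to_start_urls` and the surrounding `actor_input` dictionary are hypothetical; only the `startURLs` key and the object shape come from this page.

```python
import json


def to_start_urls(urls):
    """Wrap plain URL strings in the {'url': ...} objects described above."""
    return [{"url": u} for u in urls]


# Hypothetical input object; only the startURLs key and its item shape are documented.
actor_input = {"startURLs": to_start_urls(["http://www.example.com"])}
print(json.dumps(actor_input, indent=2))
```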
Link selector
linkSelector
string, optional
CSS selector matching elements with 'href' attributes that should be enqueued. To enqueue urls from
Max pages per run
maxRequestsPerCrawl
integer, optional
The maximum number of pages that the scraper will load. The scraper will stop when this limit is reached. It's always a good idea to set this limit in order to prevent excess platform usage for misconfigured scrapers. Note that the actual number of pages loaded might be slightly higher than this value.
If set to 0, there is no limit.
Default value of this property is 10
Max crawling depth
maxCrawlingDepth
integer, optional
Defines how many links away from the start URLs the scraper will descend. 0 means unlimited.
Default value of this property is 0
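To illustrate how a depth limit of this kind typically behaves, here is a small sketch of a breadth-first walk over an in-memory link graph. It is not the actor's implementation: the function `crawl_order`, the toy `site` graph, and the choice to count the start URLs as depth 0 are all assumptions made for the example.

```python
from collections import deque


def crawl_order(links, start_urls, max_depth=0):
    """Breadth-first walk over a toy link graph.

    links: dict mapping a URL to the URLs it links to (stand-in for real pages).
    max_depth: 0 means unlimited, mirroring the documented default.
    Assumption: start URLs are at depth 0.
    """
    seen = set(start_urls)
    queue = deque((url, 0) for url in start_urls)
    visited = []
    while queue:
        url, depth = queue.popleft()
        visited.append(url)
        if max_depth and depth >= max_depth:
            continue  # page is processed, but its links are not followed
        for nxt in links.get(url, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return visited


# Hypothetical site structure used only for illustration.
site = {"/": ["/a", "/b"], "/a": ["/a/1"], "/a/1": ["/a/1/x"]}
print(crawl_order(site, ["/"], max_depth=1))
print(crawl_order(site, ["/"], max_depth=0))
```

With a limit of 1, only pages linked directly from the start URL are visited; with 0, the walk continues until no unseen links remain.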
Max concurrency
maxConcurrency
integer, optional
Defines how many pages can be processed by the scraper in parallel. The scraper automatically increases and decreases concurrency based on available system resources. Use this option to set a hard limit.
Default value of this property is 50
Custom key value store
customKeyValueStore
string, optional
Use a custom named key-value store for saving results. If a key-value store with this name doesn't exist yet, it is created. The snapshots of the pages are saved in this key-value store.
Default value of this property is ""
Custom dataset
customDataset
string, optional
Use a custom named dataset for saving metadata. If a dataset with this name doesn't exist yet, it is created. The metadata about the page snapshots is saved in this dataset.
Default value of this property is ""
Timeout (in seconds) for backing up a single URL
timeoutForSingleUrlInSeconds
integer, optional
Timeout in seconds for backing up a single URL. Try increasing this timeout if you see the error "Error: handlePageFunction timed out after X seconds."
Default value of this property is 120
Timeout (in seconds) in which page navigation needs to finish
navigationTimeoutInSeconds
integer, optional
Timeout in seconds within which page navigation needs to finish. Try increasing this value if you see the error "Navigation timeout of XXX ms exceeded".
Default value of this property is 120
URL search parameters to ignore
searchParamsToIgnore
array, optional
Names of URL search parameters (such as 'source', 'sourceid', etc.) that should be ignored in the URLs when crawling.
Default value of this property is []
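The effect of ignoring search parameters can be sketched as a URL normalization step: two URLs that differ only in ignored parameters collapse to the same address. The function name `strip_ignored_params` and the example URL are hypothetical, and how the actor itself normalizes URLs is not stated on this page.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_ignored_params(url, ignored):
    """Return url with every query parameter named in `ignored` removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ignored
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Hypothetical URL; 'source' and 'sourceid' are the names used in the description.
cleaned = strip_ignored_params(
    "https://blog.example.com/post?id=7&source=rss&sourceid=42",
    {"source", "sourceid"},
)
print(cleaned)
```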
Only consider pages under the same domain as one of the provided URLs.
sameOrigin
boolean, optional
Only back up URLs with the same origin as one of the start URLs. For example, when turned on for a single start URL https://blog.apify.com, only links with the prefix https://blog.apify.com are backed up recursively.
Default value of this property is true
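A same-origin filter of this kind can be sketched by comparing the scheme, host, and port of each link against those of the start URLs. This is an illustration under stated assumptions, not the actor's code: the names `origin` and `is_same_origin` are hypothetical, and the sketch does not treat an explicit default port (such as `:443` for https) as equal to an omitted one.

```python
from urllib.parse import urlsplit


def origin(url):
    """Return (scheme, host, port) for a URL; port is None when omitted."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port)


def is_same_origin(url, start_urls):
    """True if url shares an origin with at least one start URL."""
    return origin(url) in {origin(start) for start in start_urls}


start = ["https://blog.apify.com"]
print(is_same_origin("https://blog.apify.com/post/1", start))
print(is_same_origin("https://apify.com/store", start))
```

Note that a different scheme or subdomain counts as a different origin here, so `http://blog.apify.com/` would not match the start URL above.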
Actor Metrics
5 monthly users
4 stars
>99% runs succeeded
Created in Jul 2020
Modified 4 years ago