Website Backup

mhamas/website-backup

Enables you to create a backup of any website by crawling it, so that you don’t lose any content by accident. Ideal, for example, for your personal or company blog.

Author: Matej Hamas
  • Users: 111
  • Runs: 5,440
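
All of the options below are passed to the actor as a single input object. As a quick orientation, here is a minimal sketch of starting a run through the apify-client npm package, assuming an ES module context (top-level await); the token and input values are placeholders:

    import { ApifyClient } from 'apify-client';

    // Placeholder token; replace with your own Apify API token.
    const client = new ApifyClient({ token: 'YOUR_APIFY_TOKEN' });

    // Start the actor and wait for the run to finish.
    const run = await client.actor('mhamas/website-backup').call({
        startURLs: [{ url: 'https://blog.apify.com' }],
        maxRequestsPerCrawl: 100,
        sameOrigin: true,
    });
    console.log(`Run ${run.id} finished with status ${run.status}`);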

Start URLs

startURLs

Optional

array

List of URL entry points. Each entry is an object of the form {"url": "http://www.example.com"}.
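
For illustration, a startURLs value with two entry points might look like this (the URLs are placeholders):

    const startURLs = [
        { url: 'https://blog.apify.com' },
        { url: 'https://www.example.com/docs' },
    ];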

Link selector

linkSelector

Optional

string

CSS selector matching elements with 'href' attributes that should be enqueued. For example, to enqueue URLs from <div class="my-class" href="..."> elements, you would enter div.my-class. Leave empty to ignore all links.
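
As a sketch, a selector that follows only links inside the main content of a page could look like this (the class name is hypothetical):

    // Enqueue only links inside the post body; 'div.post-content' is a made-up class.
    const linkSelector = 'div.post-content a[href]';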

Max pages per run

maxRequestsPerCrawl

Optional

integer

The maximum number of pages that the scraper will load. The scraper stops when this limit is reached. It's always a good idea to set this limit to prevent excess platform usage by misconfigured scrapers. Note that the actual number of pages loaded might be slightly higher than this value. If set to 0, there is no limit.

Max crawling depth

maxCrawlingDepth

Optional

integer

Defines how many links away from the start URLs the scraper will descend. 0 means unlimited depth.

Max concurrency

maxConcurrency

Optional

integer

Defines how many pages can be processed by the scraper in parallel. The scraper automatically increases and decreases concurrency based on available system resources. Use this option to set a hard limit.
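
The three limits above combine into a single input object; a plausible fragment with illustrative values:

    const input = {
        maxRequestsPerCrawl: 500, // stop after roughly 500 pages
        maxCrawlingDepth: 3,      // follow links at most 3 hops from the start URLs
        maxConcurrency: 10,       // hard cap on parallel page loads
    };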

Custom key value store

customKeyValueStore

Optional

string

Use a custom named key-value store for saving results. If a key-value store with this name doesn't exist yet, it is created. Snapshots of the pages will be saved in this key-value store.

Custom dataset

customDataset

Optional

string

Use a custom named dataset for saving metadata. If a dataset with this name doesn't exist yet, it is created. Metadata about the page snapshots will be saved in this dataset.
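
A sketch of pointing both storage options above at named storages and later reading the metadata back with apify-client (the store and dataset names are placeholders; named storages are typically addressed as 'username~storage-name'):

    const input = {
        customKeyValueStore: 'my-site-backup',    // page snapshots are saved here
        customDataset: 'my-site-backup-metadata', // snapshot metadata is saved here
    };

    // After the run, list the snapshot metadata records.
    const { items } = await client.dataset('YOUR_USERNAME~my-site-backup-metadata').listItems();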

Timeout (in seconds) for backing up a single URL

timeoutForSingleUrlInSeconds

Optional

integer

Timeout in seconds for backing up a single URL. Increase this timeout if you see an error like 'Error: handlePageFunction timed out after X seconds'.
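
For example, raising the per-URL timeout for a slow site might look like this (the value is illustrative):

    const input = { timeoutForSingleUrlInSeconds: 300 }; // allow up to 5 minutes per URL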

URL search parameters to ignore

searchParamsToIgnore

Optional

array

Names of URL search parameters (such as 'source', 'sourceid', etc.) that should be ignored in the URLs when crawling.
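
Conceptually, ignoring a search parameter means that URLs differing only in that parameter are treated as the same page. A minimal sketch of such normalization with the standard URL API (illustrative, not the actor's internal code):

    // Strip ignored search params so URL variants deduplicate to one page.
    function normalizeUrl(rawUrl: string, paramsToIgnore: string[]): string {
        const url = new URL(rawUrl);
        for (const name of paramsToIgnore) {
            url.searchParams.delete(name);
        }
        return url.toString();
    }

    // normalizeUrl('https://example.com/post?id=1&source=rss', ['source'])
    // => 'https://example.com/post?id=1'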

Only consider pages under the same domain as one of the provided URLs

sameOrigin

Optional

boolean

Only back up URLs with the same origin as one of the start URLs. For example, when turned on with the single start URL https://blog.apify.com, only links with the prefix https://blog.apify.com will be backed up recursively.
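
A sketch of the origin test described above, again using the standard URL API (illustrative, not the actor's internal code):

    // True when the candidate link shares an origin with any start URL.
    function isSameOrigin(candidate: string, startUrls: string[]): boolean {
        const origin = new URL(candidate).origin;
        return startUrls.some((start) => new URL(start).origin === origin);
    }

    // isSameOrigin('https://blog.apify.com/post', ['https://blog.apify.com']) // => true
    // isSameOrigin('https://apify.com/store', ['https://blog.apify.com'])     // => false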

Proxy configuration

proxyConfiguration

Optional

object

Choose to use no proxy, Apify Proxy, or provide custom proxy URLs.
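
Apify actors commonly accept a proxy configuration object in the shape sketched below; treat this as an assumption and check the actor's input schema for the exact fields:

    const input = {
        proxyConfiguration: {
            useApifyProxy: true, // route requests through Apify Proxy
            // apifyProxyGroups: ['RESIDENTIAL'],                      // optional; group name illustrative
            // proxyUrls: ['http://user:pass@proxy.example.com:8000'], // or supply custom proxy URLs
        },
    };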