Check any website you plan to scrape for expected compute unit consumption, anti-scraping software, and reliability.
Google Sheets Import & Export
Import data from datasets or JSON files to Google Sheets. Programmatically process data in Sheets. Easier and faster than the official Google Sheets API and perfect for importing data from scraping.
Smart Article Extractor
📰 Smart Article Extractor extracts articles from any scientific, academic, or news website with just one click. The extractor crawls the whole website and automatically distinguishes articles from other web pages. Download your data as an HTML table, JSON, Excel, RSS feed, and more.
Use our free website keyword extractor to crawl any website and extract keyword counts on each page.
Image Downloader & Uploader
Download image files from image URLs in your datasets or key-value stores and save them to our key-value store or your AWS S3 bucket.
Example Image Download
Download a single image from a URL and store it in a key-value store.
Check your dataset for duplicates. Accept only the highest-quality data!
Merge, Dedup & Transform Datasets
The ultimate dataset processor. Extremely fast merging, deduplication, and transformation, all in a single run.
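To illustrate what merging and deduplicating datasets involves, here is a minimal Python sketch. It is not the actor's implementation; the function name `merge_and_dedup` and the `key_fields` parameter are hypothetical, and items are assumed to be plain dictionaries already loaded into memory.

```python
from typing import Any, Iterable


def merge_and_dedup(
    datasets: Iterable[list[dict[str, Any]]],
    key_fields: list[str],
) -> list[dict[str, Any]]:
    """Concatenate datasets in order and keep the first item for each key.

    Hypothetical helper for illustration only. Two items count as
    duplicates when all of their `key_fields` values match.
    """
    seen: set[tuple[str, ...]] = set()
    merged: list[dict[str, Any]] = []
    for items in datasets:
        for item in items:
            # repr() makes unhashable values (lists, dicts) usable in a set key.
            key = tuple(repr(item.get(field)) for field in key_fields)
            if key in seen:
                continue  # duplicate of an earlier item, skip it
            seen.add(key)
            merged.append(item)
    return merged


first = [{"url": "https://a.example", "price": 1}, {"url": "https://b.example", "price": 2}]
second = [{"url": "https://a.example", "price": 3}]
result = merge_and_dedup([first, second], key_fields=["url"])
```

Keeping the first occurrence is only one possible policy; a real pipeline might prefer the most recent item instead.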
Foursquare Reviews Scraper
Scrape a massive number of reviews in a few seconds for any number of locations and categories.
Free trial · $1/month
Website Checker Runner Puppeteer
Checks the provided website using Puppeteer. This is a low-level runner; you most likely want to use the high-level master actor: https://apify.com/lukaskrivka/website-checker
Website Checker Runner Cheerio
Checks the provided website using Cheerio. This is a low-level runner; you most likely want to use the high-level master actor: https://apify.com/lukaskrivka/website-checker
Find IPs from Proxy Groups
Simple actor to list IPs that you have allocated in any of your proxy groups. You have to specify the total count of the IPs you have in the groups you want to test for this to work properly.
Speed-of-light scraping with the Rust programming language! This is an early alpha version for experimenting; use at your own risk!
Check the results of your scrapers with this flexible checker. Just supply a dataset or key-value store ID and a few simple rules to get a detailed report.
Actor Compute Units Aggregator
Aggregates daily or monthly usage of compute units for all your actors. Please don't use this if you have thousands of daily runs as it will overload the Apify API.
Website Checker Runner Playwright
Checks the provided website using Playwright. This is a low-level runner; you most likely want to use the high-level master actor: https://apify.com/lukaskrivka/website-checker
Log scanner helps you find particular text in your log files. It can scan [Apify](https://apify.com/) runs, tasks, or actors, as well as arbitrary text files. If you have ever had trouble finding that one error in a thousand runs, this is the tool to use.
Website Checker Workload
Creates reasonable workloads for analyzing any website with the Website Checker actor and combines the resulting data. This is the easiest way to analyze any website for compute unit usage and anti-scraping blocking.
Rebirth failed requests
Rebirth failed requests of past runs into a pristine state with no retries so you can rescrape them by resurrecting the run.
Example Public Actor Input
An example input schema for public scraping actors.
Free Large Video Converter
Flexible and powerful conversion tool built on the popular ffmpeg program, ideal for very large video and audio files. Convert any audio or video file to a different format and adjust any settings. Automatically recognizes the source format.
Free trial · $1/month
Generate People Pairs
Creates pairs from a list of people and posts them to Slack. Keeps track of history and always creates new and shuffled pairs.
Github Issues Tracker
Collect all open issues from GitHub repositories across multiple accounts and upload them to your spreadsheet. Make teams more productive with efficient issue tracking and never miss one.
Resurrect run on Out of memory
Simple helper actor to resurrect your runs when they run out of memory.
Sort Dataset Items
Add this actor as a webhook to your scraper to sort the dataset by an index field.
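As a rough illustration of sorting dataset items by an index field, the following Python sketch orders a list of dictionaries. It is not the actor's code; the function name `sort_items_by_index` and the default field name `index` are assumptions made for the example.

```python
from typing import Any


def sort_items_by_index(
    items: list[dict[str, Any]],
    index_field: str = "index",  # hypothetical field name
) -> list[dict[str, Any]]:
    """Return a new list ordered by `index_field`.

    Items that lack the field are placed at the end, keeping their
    original relative order (Python's sort is stable).
    """
    return sorted(
        items,
        # The first tuple element pushes items without the field to the end;
        # the second orders the remaining items by their index value.
        key=lambda item: (index_field not in item, item.get(index_field, 0)),
    )


unsorted_items = [
    {"index": 2, "url": "https://c.example"},
    {"url": "https://no-index.example"},
    {"index": 0, "url": "https://a.example"},
    {"index": 1, "url": "https://b.example"},
]
sorted_items = sort_items_by_index(unsorted_items)
```

Placing items without an index last is a design choice for this sketch; dropping them or raising an error would be equally reasonable.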