Actor for comparing two JSON arrays of objects. By default, the final result set contains only new and updated records.
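The default behavior described above can be illustrated with a minimal sketch. This is not the Actor's own code: the function name `diff_records`, the `id` key used to match records, and the `status` labels are all assumptions chosen for illustration.

```python
import json


def diff_records(old_records, new_records, id_key="id"):
    """Return records from new_records that are new or changed.

    Records are matched on id_key (a hypothetical field name); records
    that are identical in both arrays are left out of the result.
    """
    old_by_id = {record[id_key]: record for record in old_records}
    changed = []
    for record in new_records:
        if record[id_key] not in old_by_id:
            # No record with this key existed before.
            changed.append({**record, "status": "NEW"})
        elif old_by_id[record[id_key]] != record:
            # Same key, but at least one field differs.
            changed.append({**record, "status": "UPDATED"})
    return changed


old = json.loads('[{"id": 1, "price": 10}, {"id": 2, "price": 20}]')
new = json.loads(
    '[{"id": 1, "price": 10}, {"id": 2, "price": 25}, {"id": 3, "price": 30}]'
)

result = diff_records(old, new)
print(json.dumps(result))
# [{"id": 2, "price": 25, "status": "UPDATED"}, {"id": 3, "price": 30, "status": "NEW"}]
```

Indexing the old array by key keeps the comparison linear in the number of records. Deleted records (present only in the old array) are not reported in this sketch, which mirrors the "only new and updated" default stated above.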
apify/web-scraper
Crawls arbitrary websites using the Chrome browser and extracts data from pages using provided JavaScript code. The actor supports both recursive crawling and lists of URLs, and automatically manages concurrency for maximum performance. This is Apify's basic tool for web crawling and scraping.
Apify
52.4k
apify/website-content-crawler
Automatically crawl and extract text content from websites with documentation, knowledge bases, help centers, or blogs. This Actor is designed to provide data to feed, fine-tune, or train large language models such as ChatGPT or LLaMA.
7k
apify/cheerio-scraper
Crawls websites using raw HTTP requests, parses the HTML with the Cheerio library, and extracts data from the pages using Node.js code. Supports both recursive crawling and lists of URLs. This actor is a high-performance alternative to apify/web-scraper for websites that do not require JavaScript.
3.3k
apify/puppeteer-scraper
Crawls websites with headless Chrome and the Puppeteer library using provided server-side Node.js code. This crawler is an alternative to apify/web-scraper that gives you finer control over the process. Supports both recursive crawling and lists of URLs. Supports logging in to websites.
2.6k
apify/beautifulsoup-scraper
Crawls websites using raw HTTP requests. It parses the HTML with the BeautifulSoup library and extracts data from the pages using Python code. Supports both recursive crawling and lists of URLs. This Actor is a Python alternative to Cheerio Scraper.
232
apify/playwright-scraper
Crawls websites with a headless Chromium, Chrome, or Firefox browser and the Playwright library using provided server-side Node.js code. Supports both recursive crawling and lists of URLs. Supports logging in to websites.
455
mstephen190/proxy-scraper
Free proxy scraper and checker. Search dozens of free proxy websites. Get a list of 100% working public proxies in seconds. Automatically test proxies based on a target URL and maximum timeout.
Matthias Stephens
2.2k
apify/screenshot-url
Create a screenshot of a website based on a specified URL. The screenshot is stored as the output in a key-value store. It can be used to monitor web changes regularly after setting up the scheduler.
2k
apify/legacy-phantomjs-crawler
Replacement for the legacy Apify Crawler product with a backward-compatible interface. The actor uses the PhantomJS headless browser to recursively crawl websites and extract data from them using a piece of front-end JavaScript code.
1.4k
lukaskrivka/actor-fail-manager
Automatically triggered on a failed run to analyze whether the run should be resurrected and to create an error report for the author.
Lukáš Křivka