Status: Open to develop
Categories: Developer tools
Submitted: Sep 7, 2021
Easily scrape Tor's anonymized content and download the data as an HTML table, JSON, CSV, Excel, or XML.
apify/website-content-crawler
Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.
Apify
46.9k
apify/cheerio-scraper
Crawls websites using raw HTTP requests, parses the HTML with the Cheerio library, and extracts data from the pages using Node.js code. Supports both recursive crawling and lists of URLs. This Actor is a high-performance alternative to apify/web-scraper for websites that do not require JavaScript.
7.3k
apify/web-scraper
Crawls arbitrary websites using a web browser and extracts structured data from web pages using a provided JavaScript function. The Actor supports both recursive crawling and lists of URLs, and automatically manages concurrency for maximum performance.
81.4k
apify/puppeteer-scraper
Crawls websites with headless Chrome and the Puppeteer library using provided server-side Node.js code. This crawler is an alternative to apify/web-scraper that gives you finer control over the process. Supports both recursive crawling and lists of URLs. Supports logging in to websites.
6.4k
topaz_sharingan/Youtube-Transcript-Scraper
Are you in search of a robust solution for extracting transcripts from YouTube videos? Look no further 😉, YouTube-Transcript-Scraper will meet your needs. Our software not only retrieves transcripts efficiently but also provides additional valuable information. 👍 😀 Scrape away 🕵♂️.
Moses Bilal
1.7k
apify/legacy-phantomjs-crawler
Replacement for the legacy Apify Crawler product with a backward-compatible interface. The Actor uses the PhantomJS headless browser to recursively crawl websites and extract data from them using a piece of front-end JavaScript code.
1.6k
muhammetakkurtt/truth-social-scraper
Extract Truth Social profile posts with this professional Apify Actor. Collect content from Donald Trump and other key profiles. Analyze interactions, media, and replies with real-time data. Ideal for political monitoring, market research, and trend analysis. API integration for real-time data flow.
Muhammet Akkurt
320
apify/playwright-scraper
Crawls websites with a headless Chromium, Chrome, or Firefox browser and the Playwright library using provided server-side Node.js code. Supports both recursive crawling and lists of URLs. Supports logging in to websites.
1.3k
danpoletaev/product-hunt-scraper
Scrape the Product Hunt "Top Products Launching Today" section. The Actor crawls products and extracts information about each one: title, description, categories, images, maker info with contact links, and website info with raw text and email. Export scraped datasets in JSON, CSV, etc. Run via API.
Danil Poletaev
341
circ_le/gcp-uploader
Upload datasets and key-value store records to Google Cloud Storage.
Cir◎cle
14