Pay per usage
hamza.alwan/website-content-vector-retriever
Rating
0.0
(0)
Developer
Hamza Alwan
Actor stats
5
Bookmarked
21
Total users
1
Monthly active users
2 years ago
Last modified
Categories
AI
6sigmag/fast-website-content-crawler
A high-performance web scraper that rapidly extracts and analyzes content from multiple websites simultaneously. Perfect for competitive research, content aggregation, and website structure analysis.
David Deng
2.4K
4.3
mrahil/my-actor
It is a website extractor.
Mohammed Rahil
128
apify/website-content-crawler
Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.
Apify
86K
4.6
datascoutapi/website-content-crawler-pro
Crawl websites and extract clean, structured content in Markdown, JSON, or plain text for AI models, LLMs, vector DBs, or RAG pipelines. Fast, reliable, and stealthy, with bulk processing, advanced metadata extraction, and seamless integration with LangChain, LlamaIndex, and AI workflows.
halam
107
quaking_pail/ai-website-content-markdown-scraper
This Apify Actor, "Website Content Crawler with Markdown Extraction," crawls the specified websites, extracts their text content, converts it to Markdown, and stores it in a structured dataset. The extracted content is suitable for feeding LLMs.
AI_Builder
798
3.9
akash9078/website-content-crawler
Powerful website content crawler tool to extract, analyze, and index web pages automatically. Streamline data collection with fast, accurate web scraping technology.
Akash Kumar Naik
17
mrahil/extract-website-with-url
The Extract Website with URL API allows users to extract structured data from any webpage by providing a URL. It retrieves HTML, metadata, tables, and images, returning data in JSON format. Ideal for web scraping, SEO analysis, and content extraction. Use it for e-commerce data, news scraping
170
assertive_analogy/advanced-crawler
A fast, Python-powered web crawler with smart content extraction, JS support, metadata capture, and duplicate detection. Ideal for SEO, content migration, and e-commerce scraping. Reliable, scalable, and easy to customize.
Gideon Nesh
19
1.0
husseinbuilds/instagram-content-intelligence-pro
Extract complete data from Instagram Reels — captions, hashtags, views, likes, comments, author info, and AI-generated transcripts. Supports bulk links, no login needed, fast and reliable for creators, analysts, and automation workflows.
Hussein Sbeiti
4
muhammadsaifkhalid4/my-actor
Scrape webpages for data. What changed? Multiple URLs. Error handling: each URL is handled independently, and failures are logged and stored. Anti-blocking: added User-Agent and Accept-Language headers. Data structure: instead of a flat heading list, you now get per-URL results with metadata.
Saif Khalid
57
3.5