Website Content Crawler

apify/website-content-crawler

Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.

Can't crawl while logged in

Closed

rust_chimta opened this issue

I can't seem to pass initial cookies to the Actor and reliably use a logged-in session. I believe new sessions are being created without the cookies, so the crawler fails to scrape content that is only visible while logged in.

Oscar Rodriguez (Oscardz)

Hello, thank you for trying the Website Content Crawler. I tried it, and it seems to work with my cookies. I suspect this may be related to cookie expiration. Can you try it again?
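To make the cookie-expiration suspicion easier to check, here is a minimal sketch that looks for expired cookies before they are handed to the Actor. It assumes the cookies were exported as a JSON-style list of objects with an optional `expirationDate` field in Unix seconds, which is the shape many browser cookie-export extensions produce. The helper names `split_expired` and `build_run_input` are hypothetical, and the `startUrls` and `initialCookies` input keys are assumptions that should be verified against the Actor's input schema.

```python
import time
from typing import Any, Optional

Cookie = dict[str, Any]


def split_expired(
    cookies: list[Cookie], now: Optional[float] = None
) -> tuple[list[Cookie], list[Cookie]]:
    """Split cookies into (still valid, already expired).

    Assumes an optional 'expirationDate' field in Unix seconds.
    Cookies without that field are treated as session cookies and kept.
    """
    now = time.time() if now is None else now
    valid: list[Cookie] = []
    expired: list[Cookie] = []
    for cookie in cookies:
        expires = cookie.get("expirationDate")
        if expires is not None and float(expires) <= now:
            expired.append(cookie)
        else:
            valid.append(cookie)
    return valid, expired


def build_run_input(
    start_url: str, cookies: list[Cookie], now: Optional[float] = None
) -> dict[str, Any]:
    """Build a run input dict, refusing to continue with expired cookies.

    The 'startUrls' and 'initialCookies' keys are assumed field names;
    check them against the Actor's input schema before use.
    """
    valid, expired = split_expired(cookies, now)
    if expired:
        names = ", ".join(str(c.get("name", "?")) for c in expired)
        raise ValueError(f"Expired cookies, log in again and re-export: {names}")
    return {"startUrls": [{"url": start_url}], "initialCookies": valid}
```

If `build_run_input` raises, logging in again in the browser and re-exporting the cookies is the first thing to try. Note that a site can also invalidate a session on the server before the cookie's stated expiry, so a clean result here does not guarantee that the login is still active.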

Jiří Spilka (jiri.spilka)

I’ll go ahead and close this issue now, but please feel free to ask additional questions or raise a new issue.

Developer

Apify

Actor Metrics

3.9k monthly users
713 stars
>99% runs succeeded
2.2 days response time
Created in Mar 2023
Modified 14 hours ago

Categories

Developer tools

Fast Website Content Crawler

6sigmag/fast-website-content-crawler

A high-performance web scraper that rapidly extracts and analyzes content from multiple websites simultaneously. Perfect for competitive research, content aggregation, and website structure analysis.

David Deng

Deep Website Content Crawler

6sigmag/deep-website-content-crawler

Scrape Failed Killer! A high-performance web scraper that rapidly extracts and analyzes content from multiple websites simultaneously. Perfect for competitive research, content aggregation, and website structure analysis.

David Deng

AI Website Content Markdown Scraper

quaking_pail/ai-website-content-markdown-scraper

This Apify Actor, "Website Content Crawler with Markdown Extraction," is designed to perform a comprehensive crawl of specified websites, extract their text content, convert it into Markdown format, and store it in a structured dataset. The extracted content is suitable for feeding LLMs.

AI_Builder

217

Example Website Screenshot Crawler

dz_omar/example-website-screenshot-crawler

Automated website screenshot crawler using Pyppeteer and Apify. This open-source actor captures screenshots from specified URLs, uploads them to the Apify Key-Value Store, and provides easy access to the results, making it ideal for monitoring website changes and archiving web content.

Omar Abdlhakim

Instagram Scraper

apify/instagram-scraper

Scrape and download Instagram posts, profiles, places, hashtags, photos, and comments. Get data from Instagram using one or more Instagram URLs or search queries. Export scraped data, run the scraper via API, schedule and monitor runs or integrate with other tools.

Apify

62.1k

597

Google Maps Reviews Scraper

compass/Google-Maps-Reviews-Scraper

Extract all reviews of Google Maps places using place URLs. Get review text, published date, response from owner, review URL, and reviewer's details. Download scraped data, run the scraper via API, schedule and monitor runs or integrate with other tools.

Compass

4.7k

Airbnb Scraper

tri_angle/airbnb-scraper

Scrape Airbnb rentals in your chosen destinations. Extract descriptions, locations, prices per night, ratings, reviews count, host details, amenities and more. Download scraped data in various formats including HTML, JSON and Excel.

Tri⟁angle

7.9k

Linkedin Profile Scraper - People & Company

saswave/linkedin-profile

Scrape LinkedIn people and company profile URLs at scale. The input can also be a search URL. Get information such as connections, followers, location, full experience, education, languages, about, latest activities, and personal contact info (first name, last name, email, phone, birthday, creation date, picture URL ..)

SASWAVE

588

Facebook Hashtag Scraper

apify/facebook-hashtag-scraper

Extract data from hundreds of Facebook posts using one or multiple hashtags. Get post text and URL, time of posting, basic poster info, image and video URLs, OCR text, counts of likes, comments, and shares, and more. Download the data in JSON, CSV, or Excel and use it in apps, spreadsheets, and reports.

Apify

3.4k