Website Content Crawler

apify/website-content-crawler

Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.

Cannot pass Basic authentication.

Closed

t-toda opened this issue

I cannot pass Basic authentication. I tried setting the headers in startUrls, but the result is a 401 error.

When I make a request from the terminal using curl with the same settings in -H, I can get the result. Why is this happening?
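For reference, a Basic authentication header is the string "Basic " followed by the Base64 encoding of "username:password". The sketch below builds that value in Python so it can be compared with whatever was passed to curl via -H. The function name and the user/pass credentials are placeholders for illustration and do not come from the original report.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Return an Authorization header for HTTP Basic authentication.

    The credentials are joined with a colon, encoded as UTF-8, and then
    Base64-encoded, which is the scheme HTTP Basic authentication uses.
    """
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials, used only to show the shape of the header.
headers = basic_auth_header("user", "pass")
print(headers)
```

If the value printed here differs from the header given to the crawler, the 401 is likely caused by the header content rather than by how it is delivered.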

Jindřich Bär (jindrich.bar)

Hello and thank you for your interest in this Actor!

Could you please share a link to the run where this happened? If the run contains sensitive information, the Run ID alone would be enough.

Without a reproduction scenario, we cannot provide much support.

I'll be looking forward to your response. Cheers!

Jindřich Bär (jindrich.bar)

Hello - just letting you know that we've released a new version of Website Content Crawler (0.3.37) that fixes header passing. The Actor now correctly processes the HTTP headers, methods, and payloads passed to it.
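As a rough illustration of what such an input might look like, the sketch below assembles a startUrls entry carrying a method, headers, and an optional payload. The field names (url, method, headers, payload) are an assumption modelled on the usual request-object shape, and the helper name, URL, and header value are placeholders; check the Actor's input schema before relying on them.

```python
import json


def build_start_url(url, method="GET", headers=None, payload=None):
    """Build one start URL entry (hypothetical helper, assumed field names)."""
    entry = {"url": url, "method": method}
    if headers:
        # Copy so later changes to the caller's dict do not leak in.
        entry["headers"] = dict(headers)
    if payload is not None:
        entry["payload"] = payload
    return entry


# Placeholder URL and Base64 token ("user:pass"), for illustration only.
run_input = {
    "startUrls": [
        build_start_url(
            "https://example.com/protected",
            headers={"Authorization": "Basic dXNlcjpwYXNz"},
        )
    ]
}
print(json.dumps(run_input, indent=2))
```

Omitting the payload key for GET requests keeps the entry minimal; a POST entry would pass method="POST" together with a payload string.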

I'll close this issue now, but feel free to ping us in case of any other issues or questions.

Cheers!
