
Website Content Crawler
Pricing
Pay per usage

Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.
Rating: 3.7 (41)
1526
Total users: 59K
Monthly users: 7.8K
Runs succeeded: >99%
Issues response: 7.6 days
Last modified: 4 days ago
How to ignore broken SSL when using PROXY
Closed
Hi, I'm trying to use a proxy from scrapingbee.com, but no request gets processed because of SSL errors when connecting to the proxy (a test with "curl -k" works). In the "Apify Integration" section of the scrapingbee.com manual, they recommend enabling the "Ignore SSL errors" checkbox, but I don't see that option in the Actor settings.
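For reference, the "curl -k" check mentioned above corresponds roughly to disabling certificate verification on the client side. The following is a minimal Python sketch of that idea using only the standard library. It is not something Website Content Crawler exposes, and the proxy URL, credentials, and the helper name `build_insecure_proxy_opener` are hypothetical placeholders. Skipping verification removes protection against man-in-the-middle attacks, so it is only suitable for a proxy you trust.

```python
import ssl
import urllib.request


def build_insecure_proxy_opener(proxy_url: str):
    """Return (opener, context) that route traffic via proxy_url without TLS verification."""
    ctx = ssl.create_default_context()
    # check_hostname must be turned off before verify_mode can be set to CERT_NONE.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE

    opener = urllib.request.build_opener(
        # Send both plain and TLS traffic through the same proxy.
        urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url}),
        # Use the non-verifying context for HTTPS requests.
        urllib.request.HTTPSHandler(context=ctx),
    )
    return opener, ctx


# Hypothetical placeholder; substitute the host, port, and credentials of your proxy.
PROXY_URL = "http://USER:PASSWORD@proxy.example.com:8080"
opener, ctx = build_insecure_proxy_opener(PROXY_URL)

# Uncomment to send a real request through the proxy:
# with opener.open("https://example.com", timeout=30) as resp:
#     print(resp.status)
```

This only shows what the missing "Ignore SSL errors" option would amount to in a standalone client.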

Hello, and thank you for your interest in the Actor! You are right that there is currently no way to do this with Website Content Crawler. We will look into it and let you know here once it is addressed.
sash2s
Is there a way to clone the "Website Content Crawler" Docker image so we can make some changes to the code? We really need this feature.

Unfortunately, the package is not open source, so you cannot modify the code. We will add this option, but I cannot promise a timeline right now. In the meantime, you can use the Apify proxy, which is optimized for this use case.

I’ll go ahead and close this issue now, but please feel free to ask additional questions or raise a new issue.