
Website Content Crawler
Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.
4.6 (38)
Pricing: Pay per usage
1.1k
Monthly users: 5.9k
Runs succeeded: >99%
Response time: 2.3 days
Last modified: 6 days ago
Page Title
Our company would like to know whether this Actor could be modified to capture the HTML page title of each URL it crawls and expose it as a new field alongside the existing ones (e.g. url, text). For example, for a URL like https://www.tps.org/board_of_education, we would like the Actor to include the page's title as a new field among the fields available for download (e.g. "Board of Education - Toledo Public Schools").

Hi, thank you for using Website Content Crawler!
I'm sorry, but I don't quite understand your feature request.
When Website Content Crawler scrapes a URL, it saves detailed information, including the page title. You can see an example here: Run.
The page title is stored in metadata.title.
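As a rough illustration of where the title sits, here is a minimal Python sketch that reads it from a single dataset item. The item below is a hand-written stand-in, not real crawler output: only the url, text, and metadata.title names come from this thread, and the helper name page_title is hypothetical.

```python
from typing import Optional

# Hypothetical dataset item; values are illustrative, not actual crawler output.
item = {
    "url": "https://www.tps.org/board_of_education",
    "text": "...",
    "metadata": {"title": "Board of Education - Toledo Public Schools"},
}


def page_title(item: dict) -> Optional[str]:
    """Return the nested metadata.title value, or None if it is absent."""
    metadata = item.get("metadata") or {}
    return metadata.get("title")


print(page_title(item))
```

Returning None for a missing title keeps the helper safe to call on items that have no metadata object.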
If you meant something else, could you provide an example run and specify what exactly is missing in the output?
Thank you, Jiri
CtrlAltElite
Hi Jiri, the specific request is to add "title" to the fields available for download when choosing "export data set" (see attached image). An example run would be this: https://console.apify.com/organization/3vNBAWdW4tPWMWAre/actors/runs/jHxCHJiLQS1CNhWvd#output And then "Export Result" for dataset "rz64spECyO0DiArCW".

Hi,
Since metadata is an object that has a title field, the title itself is not visible, but is contained within metadata. When you actually download the CSV export file, you can access the page title in the metadata/title column.
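To make the metadata/title column name less surprising, the sketch below shows one way a nested item can be flattened into slash-separated CSV columns. It is an assumption about the general technique, not the platform's actual export code; the flatten helper and the sample item are hypothetical.

```python
import csv
import io


def flatten(obj: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into one level, joining keys with '/'."""
    flat = {}
    for key, value in obj.items():
        column = f"{prefix}/{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, column))
        else:
            flat[column] = value
    return flat


# Hypothetical dataset items; values are illustrative only.
items = [
    {
        "url": "https://www.tps.org/board_of_education",
        "metadata": {"title": "Board of Education - Toledo Public Schools"},
    }
]

rows = [flatten(item) for item in items]
fieldnames = sorted({column for row in rows for column in row})

# Write the flattened rows to an in-memory CSV.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

In this sketch the nested title ends up under a metadata/title header, which matches the column name mentioned above.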
Let me know if that solves your issue.
Jakub
Pricing
Pricing model
Pay per usage
This Actor is paid per platform usage. The Actor is free to use, and you only pay for the Apify platform usage.