
Web Scraper
Pricing
Pay per usage

Crawls arbitrary websites using a web browser and extracts structured data from web pages using a provided JavaScript function. The Actor supports both recursive crawling and lists of URLs, and automatically manages concurrency for maximum performance.
Rating: 4.5 (22)
715
Total users: 83k
Monthly users: 4.1k
Runs succeeded: >99%
Issue response: 38 days
Last modified: 23 days ago
How do I associate each output with its source URL when doing bulk crawling?
Closed
How do I associate which output is for which URL when doing bulk crawling? I want to be able to map the results of the crawl to the source URL. How do I do that?
Hello and thank you for your interest in this Actor!
I'm assuming you've found the solution because you've already closed this issue. If that was a mistake (or you're still looking for the "official" answer to your question), here you go:
The page function can return a JS object with multiple fields. The current page's URL is stored in context.request.url and can be accessed from there. The following snippet stores the page URL and its content in one dataset record, so you can map the content to its URL.
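async function pageFunction(context) {
    // jQuery handle from the context (this assumes jQuery injection is enabled for the run)
    const $ = context.jQuery;
    // Text content of the page body
    const content = $('body').first().text();
    // The returned object becomes one dataset record; `url` ties it to its source page
    return {
        url: context.request.url,
        content,
    };
}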
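Once every record carries its own url field, the results can be joined back to the list of URLs you submitted. The following is only a minimal sketch of that join, assuming the dataset was exported as a JSON array of { url, content } objects. The names indexByUrl, records, and sourceUrls, as well as the sample data, are hypothetical and not part of the Actor.

```javascript
// Hypothetical helper: index dataset records by their `url` field.
function indexByUrl(records) {
    const byUrl = new Map();
    for (const record of records) {
        byUrl.set(record.url, record);
    }
    return byUrl;
}

// Illustrative stand-in for an exported dataset (not real output).
const records = [
    { url: 'https://example.com/a', content: 'Page A' },
    { url: 'https://example.com/b', content: 'Page B' },
];

// Illustrative stand-in for the URLs that were submitted to the crawl.
const sourceUrls = [
    'https://example.com/a',
    'https://example.com/b',
    'https://example.com/c',
];

const byUrl = indexByUrl(records);

// Pair every source URL with its record, or null when no record has that URL.
const mapped = sourceUrls.map((url) => ({ url, record: byUrl.get(url) ?? null }));
```

The lookup uses exact string matching, so it assumes the URL stored in each record is identical to the one you submitted. Source URLs with no matching record come back with record set to null, which makes missing or failed pages easy to spot.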
Does this answer your question? Let us know if you have any more questions for us!