Web Scraper
Crawls arbitrary websites using the Chrome browser and extracts structured data from web pages using a provided JavaScript function. The Actor supports both recursive crawling and lists of URLs, and automatically manages concurrency for maximum performance.
Am I doing something wrong?
I have been trying your solution for a long time, again and again. I want to scrape a web page, but I only get 2 entries as output. Why, and what do I have to do? Please help me.
Hello, and thank you for your interest in this Actor!
This behavior is caused by the Glob Patterns input option. By setting it to https://www.management-qualifizierung.de/, you're telling the Actor to crawl only this one specific URL (example.com is added in the default Page function as an example).
You can set the Glob Patterns option to https://www.management-qualifizierung.de/**/* to tell the Actor to crawl all the pages under this domain. See my "fixed" run here. Note that in this run, I set the maxResultsPerCrawl option to 100 to crawl more pages (in your run, this was set to 10). Adjust this value if you want more or fewer results.
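To make the difference between the two patterns concrete, here is a minimal Python sketch. It uses a simplified, hand-written glob matcher for illustration only; it is not the matcher the Actor uses, and real glob libraries handle more cases. The input field names startUrls and globs are assumptions about the Actor's input schema (only maxResultsPerCrawl is named in this thread), and the /seminare paths are hypothetical.

```python
import re

EXACT = "https://www.management-qualifizierung.de/"
WIDE = "https://www.management-qualifizierung.de/**/*"


def glob_to_regex(glob: str) -> re.Pattern:
    """Translate a glob into a regex (simplified illustration only)."""
    parts = []
    i = 0
    while i < len(glob):
        if glob.startswith("**/", i):
            parts.append("(?:.*/)?")  # zero or more whole path segments
            i += 3
        elif glob.startswith("**", i):
            parts.append(".*")  # anything, including slashes
            i += 2
        elif glob[i] == "*":
            parts.append("[^/]*")  # anything within one path segment
            i += 1
        else:
            parts.append(re.escape(glob[i]))  # literal character
            i += 1
    return re.compile("".join(parts))


def matches(glob: str, url: str) -> bool:
    """Return True if the whole URL matches the glob."""
    return glob_to_regex(glob).fullmatch(url) is not None


# Sketch of the relevant part of the Actor input.
# "startUrls" and "globs" are assumed field names; check the input schema.
run_input = {
    "startUrls": [{"url": EXACT}],
    "globs": [{"glob": WIDE}],
    "maxResultsPerCrawl": 100,  # the original run used 10
}

if __name__ == "__main__":
    # Hypothetical subpages used only to compare the two patterns.
    for url in (
        "https://www.management-qualifizierung.de/seminare",
        "https://www.management-qualifizierung.de/seminare/fuehrung",
    ):
        print(url, "exact:", matches(EXACT, url), "wide:", matches(WIDE, url))
```

With the exact pattern, links to subpages found on the start page are never enqueued, which is why the run stops after the start URL. The /**/* suffix lets links at any depth under the domain through, and maxResultsPerCrawl then caps how many results are stored.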
I'll close this issue now, but feel free to ask additional questions if you have any. Cheers!
Actor Metrics
3.3k monthly users
456 bookmarks
>99% runs succeeded
4.8 days response time
Created in Mar 2019
Modified a month ago