Web Scraper

apify/web-scraper

Crawls arbitrary websites using the Chrome browser and extracts data from pages using user-provided JavaScript code. The Actor supports both recursive crawling and lists of URLs, and it automatically manages concurrency for maximum performance. This is Apify's basic tool for web crawling and scraping.

Unable to scrape website

Closed

vevo_api opened this issue
2 months ago

Hello and thank you for your interest in this Actor!

Looking at your page function, it seems there has been a small misunderstanding. This Actor (Web Scraper) is a "one size fits all" solution for web scraping: you give it a list of URLs and a function that extracts the data from each page, and then you just run it. It handles browser and proxy management, navigation, and so on by itself.
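For illustration, here is a minimal sketch of the shape the "page function" field expects: a single function definition, not a full script. The `h1` selector and the returned field names are assumptions made for this example and are not taken from your input. The function runs inside the loaded page, so `document` refers to that page's DOM.

```javascript
// Sketch of a page function in the shape Web Scraper expects.
// It is executed in the browser page, so `document` is the page's DOM.
async function pageFunction(context) {
    // context.request describes the page currently being processed.
    const { request } = context;

    // Collect the text of every <h1> on the page (illustrative selector).
    const headings = Array.from(document.querySelectorAll('h1'))
        .map((el) => el.textContent.trim());

    // The returned object is stored as one result record.
    return { url: request.url, title: document.title, headings };
}
```

Note that there is no browser launch, no navigation, and no `require` call here: the Actor does all of that before it invokes the function.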

What you are passing to it as the "page function" seems to be a whole Node.js script that opens a browser, navigates to a page, and interacts with it. Passing a script where the scraper expects a function definition results in the syntax error you are getting.

You can run your script as a separate Actor instead. That way you control everything yourself, which is what your script already does. Start with the JS & Puppeteer Actor template, which comes with Puppeteer, the Apify SDK, and Chrome preinstalled, so you don't need to install the dependencies yourself. Then copy your "page function" from here into the src/main.js of the newly created Actor, build the Actor, and start it. It should then do what you expect it to do.
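As a rough sketch of what such a standalone script can look like, the browser-facing part can be kept in one small function. The name `scrapeTitle` and the example URL are hypothetical. The wiring shown in the trailing comments assumes the Apify SDK v3 style (`Actor.init()`, `Actor.pushData()`, `Actor.exit()`), and the template you get may be organised differently.

```javascript
// Hypothetical helper: the part of a standalone script that drives the browser.
// `browser` is expected to be a Puppeteer Browser, e.g. from puppeteer.launch().
async function scrapeTitle(browser, url) {
    const page = await browser.newPage();
    try {
        // Unlike in Web Scraper, navigation is your own responsibility here.
        await page.goto(url, { waitUntil: 'domcontentloaded' });
        return { url, title: await page.title() };
    } finally {
        // Close the tab even if navigation or extraction throws.
        await page.close();
    }
}

// Assumed wiring in src/main.js (Apify SDK v3 style; not taken from this thread):
//   import { Actor } from 'apify';
//   import puppeteer from 'puppeteer';
//   await Actor.init();
//   const browser = await puppeteer.launch();
//   await Actor.pushData(await scrapeTitle(browser, 'https://example.com'));
//   await browser.close();
//   await Actor.exit();
```

Keeping the scraping logic in a function that receives the browser makes it easy to try out locally before building the Actor.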

I'll close this issue now, but feel free to ask more questions here - be it about custom Actor development or this specific Actor here. Cheers!

Developer
Maintained by Apify
Actor metrics
  • 3.7k monthly users
  • 98.8% runs succeeded
  • 3.6 days response time
  • Created in Mar 2019
  • Modified about 1 month ago