Fast News Scraper

Pay $3.00 for 1,000 articles

timgreen/fast-news-scraper

Extract full article text and metadata from popular news sites like The New York Times, Bloomberg, Reuters, BBC, CNBC, and Wired. Scrape thousands of articles in just a few minutes. Scrape a single site or provide a list of article URLs to scrape.

Using specific website

Open

Intellicon opened this issue
5 months ago

An error appears whenever I use the "search by URL" option as a way to work around my earlier issue.

This is the link I am trying to access: https://www.wired.com/

Here is my log:

2024-07-18T06:25:43.163Z ACTOR: Pulling Docker image of build d0DmsxfYtWaVJGb6I from repository.
2024-07-18T06:25:45.912Z ACTOR: Creating Docker container.
2024-07-18T06:25:46.417Z ACTOR: Starting Docker container.
2024-07-18T06:25:48.731Z INFO System info {"apifyVersion":"3.2.0","apifyClientVersion":"2.9.3","crawleeVersion":"3.9.2","osType":"Linux","nodeVersion":"v20.14.0"}
2024-07-18T06:25:49.981Z INFO Scraping provided list of articleURLs. Ignoring other inputs...
2024-07-18T06:25:51.653Z node:internal/url:797
2024-07-18T06:25:53.934Z this.#updateContext(bindingUrl.parse(input, base));
2024-07-18T06:25:53.978Z ^
2024-07-18T06:25:53.983Z
2024-07-18T06:25:53.986Z TypeError: Invalid URL
2024-07-18T06:25:53.992Z at new URL (node:internal/url:797:36)
2024-07-18T06:25:53.993Z at file:///usr/src/app/src/main.js:137:17
2024-07-18T06:25:54.015Z at Array.map ()
2024-07-18T06:25:54.017Z at file:///usr/src/app/src/main.js:136:33
2024-07-18T06:25:54.018Z at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
2024-07-18T06:25:54.019Z code: 'ERR_INVALID_URL',
2024-07-18T06:25:54.020Z input: '[object Object]'
2024-07-18T06:25:54.021Z }
2024-07-18T06:25:54.021Z
2024-07-18T06:25:54.022Z Node.js v20.14.0
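
The `input: '[object Object]'` field in the log suggests that an object, not a string, reached `new URL()` inside an `Array.map` call. The actor's source is not shown in this thread, so the following is only a minimal sketch of how that kind of error can arise and one way to guard against it. The `{ url }` item shape and the function names `parseNaively` and `parseDefensively` are assumptions made for illustration.

```javascript
// Hypothetical input shape: the actor's real input schema is not shown in this thread.
const articleUrls = [{ url: 'https://www.wired.com/' }];

// Passing an object to `new URL()` coerces it to the string "[object Object]",
// which is not a valid absolute URL, so Node throws a TypeError.
function parseNaively(items) {
  return items.map((item) => new URL(item));
}

// Accept both plain strings and `{ url }` objects before parsing.
function parseDefensively(items) {
  return items.map((item) => new URL(typeof item === 'string' ? item : item.url));
}

let naiveError = null;
try {
  parseNaively(articleUrls);
} catch (err) {
  naiveError = err; // TypeError carrying code 'ERR_INVALID_URL'
}

const parsed = parseDefensively(articleUrls);
console.log(naiveError.code, parsed[0].hostname);
```

Normalizing each item to a string before constructing the `URL` avoids the crash, although it does not change which URLs the scraper can actually handle.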

timgreen

A bug related to individual URLs has now been fixed. However, this feature is meant to scrape individual article URLs, so entering a site's base URL won't work. See my comment on your other issue for next steps.
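
To illustrate the difference between a base URL and an article URL, here is a rough pre-check that could be run on a list before submitting it. It is a heuristic sketch only, not the actor's own validation: the helper name `looksLikeArticleUrl` and the example article path are hypothetical.

```javascript
// Hypothetical helper: a heuristic only, not the actor's actual validation.
function looksLikeArticleUrl(input) {
  let url;
  try {
    url = new URL(input);
  } catch {
    return false; // not a parseable absolute URL
  }
  // A homepage has an empty path ("/"); article pages have at least one path segment.
  const segments = url.pathname.split('/').filter(Boolean);
  return segments.length > 0;
}

// The base URL from this issue is rejected; a URL with a path is accepted.
const baseUrlOk = looksLikeArticleUrl('https://www.wired.com/');
const articleUrlOk = looksLikeArticleUrl('https://www.wired.com/story/example-article/'); // made-up path
console.log(baseUrlOk, articleUrlOk);
```

This check only rules out bare homepages. Section or category pages also have a path, so they would pass it even though they are not articles.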

Developer
Maintained by Community

Actor Metrics

  • 34 monthly users

  • 6 stars

  • >99% runs succeeded

  • 12 days response time

  • Created in May 2024

  • Modified 5 months ago
