Super Fast Google News Scraper (pay per result)

Pricing

$1.00 / 1,000 results

Developed by

Alwin Morato

Maintained by Community

Efficiently extract direct links to the latest Google News articles from the past 24 hours.

0.0 (0)

47

Total users: 765
Monthly users: 152
Runs succeeded: 96%
Issues response: 22 hours
Last modified: 13 days ago

Google link comes as "sorry, unusual activity detected"

Closed

enviable_lighter opened this issue
a month ago

Hi there,

first of all a big thank you for adding the directLink option :)

tried to use it but sometimes it'll return links like https://www.google.com/sorry/index?continue=https://news.google.com/articles/... saying that google detected "unusual activity"

is there a way around it? proxies to rotate the ip or?
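
Until the scraper handles this itself, affected rows can at least be flagged on the consumer side. The following is a minimal sketch, not part of the scraper: `isGoogleSorryUrl` and `recoverContinueTarget` are hypothetical helper names, and the only assumption about the data is that the blocked link has the `www.google.com/sorry/...?continue=...` shape quoted above.

```javascript
// Hypothetical helpers for post-processing results; not part of the scraper.

/** Returns true when the link points at Google's "unusual activity" page. */
function isGoogleSorryUrl(link) {
  try {
    const url = new URL(link);
    return url.hostname === 'www.google.com' && url.pathname.startsWith('/sorry/');
  } catch {
    return false; // not a parseable absolute URL
  }
}

/** Returns the URL Google was asked to continue to, or null if unavailable. */
function recoverContinueTarget(link) {
  if (!isGoogleSorryUrl(link)) return null;
  return new URL(link).searchParams.get('continue');
}
```

The recovered target is still a news.google.com link rather than the publisher's URL, so this only identifies which rows were blocked and would need a retry.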

aymorato

Thanks for your feedback! I’ll be adding a proxy option ASAP in the next update. For now, I’ve set a 5-second delay to help prevent this error. Let me know if you run into any issues!

enviable_lighter

a month ago

thanks for the quick reply Alwin 🙏

if you need a sample run/dataset to see the result you can take a look at 1aYdqi1B4CGHmyYsN

please let me know once you've deployed the update I'll test and let you know if still seeing the issue

enviable_lighter

a month ago

hey Alwin, I've been testing it and works much better already

in the dataset 2M0wEhaG69dNcz1KI some of the URLs still came through as Google ones (not the "sorry" one, just the google link to the article). hopefully useful feedback: I'll try to omit those that still come through as news.google.com, but maybe it helps you identify what's going on there?

enviable_lighter

a month ago

hey Alwin, in my recent tests more and more non-direct links managed to sneak in even though I ran it with directLink set to true, any ideas?
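
For anyone wanting to omit the non-direct rows on their side in the meantime, a rough sketch of that filter could look like the following. It assumes each dataset item carries its URL in a `link` field, which is a guess about the output shape; the field name is a parameter for that reason, and `partitionByDirectLink` is a hypothetical helper.

```javascript
// Hosts treated as "not a direct publisher link" in this sketch.
const GOOGLE_LINK_HOSTS = new Set(['news.google.com', 'www.google.com']);

/** Returns true when the link parses and is not hosted on a Google domain above. */
function isDirectLink(link) {
  try {
    return !GOOGLE_LINK_HOSTS.has(new URL(link).hostname);
  } catch {
    return false; // missing or malformed links count as unresolved
  }
}

/** Splits items into direct publisher links and ones still pointing at Google. */
function partitionByDirectLink(items, field = 'link') {
  const direct = [];
  const unresolved = [];
  for (const item of items) {
    (isDirectLink(item[field]) ? direct : unresolved).push(item);
  }
  return { direct, unresolved };
}
```

Keeping the `unresolved` list instead of discarding it makes it easy to report how many rows slipped through in a given run.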

aymorato

I’ve updated the scraper with a proxy option, which should improve the accuracy. Let me know if you see any changes or if further tweaks are needed!

enviable_lighter

a month ago

thanks Alwin, one of my workflows failed with an error saying "message": "Input is not valid: Field input.proxy is required, Field inp (truncated...)"

might want to default to false so it doesn't break people's existing workflows? :)

aymorato

Thanks for pointing that out! I've updated the proxy setting so it's now optional instead of required. It should no longer cause issues with existing workflows.
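
As an illustration of what "optional" can mean in code, here is a small sketch of filling in a default when older workflows send no proxy field at all. The shape of the `proxy` input is not shown in this thread, so treating it as a boolean that defaults to `false` is an assumption taken from the suggestion above, and `applyInputDefaults` is a hypothetical name.

```javascript
/**
 * Returns a copy of the input with `proxy` defaulted to false when it is
 * missing or null. Other fields are passed through unchanged.
 */
function applyInputDefaults(rawInput) {
  const input = rawInput ?? {}; // tolerate runs started with no input at all
  return { ...input, proxy: input.proxy ?? false };
}
```

Using `??` rather than `||` keeps any explicitly provided value and only replaces `undefined` or `null`.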

aymorato

The scraper issue has been successfully resolved with today's update. It's now running smoothly, ensuring efficient and reliable data collection.

enviable_lighter

18 days ago

Hi Alwin,

Thank you, and sorry for the delay in getting back to you. Please see run 11COJQWe4UEaS7MX9. It is from an existing workflow that used to work but is now throwing errors, even though I can see proxy is now optional:

2025-06-06T10:00:25.243Z Will run command: xvfb-run -a -s "-ac -screen 0 1920x1080x24+32 -nolisten tcp" /bin/sh -c npm start --silent
2025-06-06T10:00:26.371Z INFO System info {"apifyVersion":"3.4.2","apifyClientVersion":"2.12.4","crawleeVersion":"3.13.4","osType":"Linux","nodeVersion":"v20.19.2"}
2025-06-06T10:00:27.029Z Crawler started.
2025-06-06T10:00:30.924Z /home/myuser/node_modules/ow/dist/index.js:36
2025-06-06T10:00:30.925Z (0, test_1.default)(value, labelOrPredicate, predicate);
2025-06-06T10:00:30.926Z ^
2025-06-06T10:00:30.926Z
2025-06-06T10:00:30.927Z ArgumentError: Expected property proxyConfiguration to be of type object but received type null
2025-06-06T10:00:30.928Z Expected argument 'object proxyConfiguration' to be a ProxyConfiguration, got something else. in object PuppeteerCrawlerOptions
2025-06-06T10:00:30.928Z at ow (/home/myuser/node_modules/ow/dist/index.js:36:24)
2025-06-06T10:00:30.929Z at new PuppeteerCrawler (/home/myuser/node_modules/@crawlee/puppeteer/internals/puppeteer-crawler.js:78:26)
2025-06-06T10:00:30.930Z at file:///home/myuser/src/main.js:29:21
2025-06-06T10:00:30.930Z at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
2025-06-06T10:00:30.931Z validationErrors: Map(1) {
2025-06-06T10:00:30.932Z 'PuppeteerCrawlerOptions' => Set(1) {
2025-06-06T10:00:30.933Z 'Expected property proxyConfiguration to be of type object but received type null\n' +
2025-06-06T10:00:30.933Z "Expected argument 'object proxyConfiguration' to be a ProxyConfiguration, got something else. in object PuppeteerCrawlerOptions"
2025-06-06T10:00:30.934Z }
2025-06-06T10:00:30.934Z }
2025-06-06T10:00:30.935Z }
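
The trace shows `proxyConfiguration` reaching the `PuppeteerCrawler` constructor as `null` when no proxy is supplied. One possible guard, sketched below, is to leave the key out of the options object entirely in that case. This assumes the option validator accepts a missing key, which the thread does not confirm, and `buildCrawlerOptions` is a hypothetical helper rather than anything from the actor's `main.js`.

```javascript
/**
 * Builds the crawler options object, adding `proxyConfiguration` only when a
 * real value was provided. A null or undefined proxy leaves the key absent.
 */
function buildCrawlerOptions(baseOptions, proxyConfiguration) {
  const options = { ...baseOptions };
  if (proxyConfiguration != null) {
    options.proxyConfiguration = proxyConfiguration;
  }
  return options;
}
```

The resulting object would then be passed to the crawler constructor in place of one that always sets the key.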