
Super Fast Google News Scraper (pay per result)
Pricing: $1.00 / 1,000 results

Efficiently extract direct links to the latest Google News articles from the past 24 hours.
Rating: 0.0 (0 reviews)
Total users: 765
Monthly users: 152
Runs succeeded: 96%
Issues response: 22 hours
Last modified: 13 days ago
Google links come back as "sorry, unusual activity detected"
Closed
Hi there,
First of all, a big thank you for adding the directLink option :)
I tried to use it, but sometimes it returns links like https://www.google.com/sorry/index?continue=https://news.google.com/articles/... saying that Google detected "unusual activity".
Is there a way around it? Maybe proxies to rotate the IP?
Thanks for your feedback! I’ll be adding a proxy option ASAP in the next update. For now, I’ve set a 5-second delay to help prevent this error. Let me know if you run into any issues!
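The fixed delay mentioned above can be sketched roughly like this (a minimal illustration only; `fetchWithDelay`, its parameters, and the 5-second default are assumptions, not the actor's actual source):

```javascript
// Hypothetical sketch of a fixed inter-request delay (assumed approach, not
// the actor's real code). Pausing a few seconds between successive Google
// requests makes the traffic look less bursty and less bot-like.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Process URLs one at a time, backing off between requests.
async function fetchWithDelay(urls, handler, delayMs = 5000) {
    const results = [];
    for (const url of urls) {
        results.push(await handler(url)); // fetch/scrape one page
        await sleep(delayMs);             // then wait before the next one
    }
    return results;
}
```

A flat delay trades throughput for a lower chance of hitting Google's rate limiter; proxy rotation attacks the same problem from the IP side instead.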
enviable_lighter
Thanks for the quick reply, Alwin 🙏
If you need a sample run/dataset to see the result, take a look at 1aYdqi1B4CGHmyYsN.
Please let me know once you've deployed the update; I'll test it and let you know if I'm still seeing the issue.
enviable_lighter
Hey Alwin, I've been testing it and it already works much better.
In the dataset 2M0wEhaG69dNcz1KI, some of the URLs still got through as Google ones (not the "sorry" one, just a Google link to the article). Hopefully that's useful feedback; I'll try to omit the ones that still come through as news.google.com, but maybe it helps you identify what's going on there?
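The client-side filtering described above could look roughly like this (a sketch under assumptions: the dataset field is assumed to be called `url`, and the two Google patterns are taken from the links quoted in this thread):

```javascript
// Hypothetical post-filter: keep only results that already point at the
// publisher's site, dropping unresolved news.google.com article links and
// google.com/sorry rate-limit pages. The `url` field name is an assumption.
function isDirectLink(url) {
    const { hostname, pathname } = new URL(url);
    if (hostname === 'news.google.com') return false; // not resolved to the publisher yet
    if (hostname.endsWith('google.com') && pathname.startsWith('/sorry')) {
        return false; // the "unusual activity" interstitial
    }
    return true;
}

// Filter a dataset of { url: ... } items down to direct links only.
const directItems = (items) => items.filter((item) => isDirectLink(item.url));
```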
enviable_lighter
Hey Alwin, in my recent tests more and more non-direct links managed to sneak in, even though I ran it with directLink set to true. Any ideas?
I’ve updated the scraper with a proxy option, which should improve the accuracy. Let me know if you see any changes or if further tweaks are needed!
enviable_lighter
Thanks, Alwin. One of my workflows failed with an error saying "message": "Input is not valid: Field input.proxy is required, Field inp (truncated...)"
You might want to default it to false so it doesn't break people's existing workflows? :)
Thanks for pointing that out! I've updated the proxy setting so it's now optional instead of required. It should no longer cause issues with existing workflows.
The issue has been resolved with today's update; the scraper is running smoothly again.
enviable_lighter
Hi Alwin,
Thank you, and sorry for the delay in getting back to you. Please see run 11COJQWe4UEaS7MX9: it's from an existing workflow but now throws errors (I can see the proxy is now optional, but my existing workflow that used to work is broken):
2025-06-06T10:00:25.243Z Will run command: xvfb-run -a -s "-ac -screen 0 1920x1080x24+32 -nolisten tcp" /bin/sh -c npm start --silent
2025-06-06T10:00:26.371Z INFO System info {"apifyVersion":"3.4.2","apifyClientVersion":"2.12.4","crawleeVersion":"3.13.4","osType":"Linux","nodeVersion":"v20.19.2"}
2025-06-06T10:00:27.029Z Crawler started.
2025-06-06T10:00:30.924Z /home/myuser/node_modules/ow/dist/index.js:36
2025-06-06T10:00:30.925Z (0, test_1.default)(value, labelOrPredicate, predicate);
2025-06-06T10:00:30.926Z ^
2025-06-06T10:00:30.926Z
2025-06-06T10:00:30.927Z ArgumentError: Expected property proxyConfiguration to be of type object but received type null
2025-06-06T10:00:30.928Z Expected argument 'object proxyConfiguration' to be a ProxyConfiguration, got something else. in object PuppeteerCrawlerOptions
2025-06-06T10:00:30.928Z     at ow (/home/myuser/node_modules/ow/dist/index.js:36:24)
2025-06-06T10:00:30.929Z     at new PuppeteerCrawler (/home/myuser/node_modules/@crawlee/puppeteer/internals/puppeteer-crawler.js:78:26)
2025-06-06T10:00:30.930Z     at file:///home/myuser/src/main.js:29:21
2025-06-06T10:00:30.930Z     at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
2025-06-06T10:00:30.931Z   validationErrors: Map(1) {
2025-06-06T10:00:30.932Z     'PuppeteerCrawlerOptions' => Set(1) {
2025-06-06T10:00:30.933Z       'Expected property proxyConfiguration to be of type object but received type null\n' +
2025-06-06T10:00:30.933Z         "Expected argument 'object proxyConfiguration' to be a ProxyConfiguration, got something else. in object PuppeteerCrawlerOptions"
2025-06-06T10:00:30.934Z     }
2025-06-06T10:00:30.934Z   }
2025-06-06T10:00:30.935Z }
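The trace shows PuppeteerCrawler being handed proxyConfiguration: null, which Crawlee's option validation rejects; the property must be a real ProxyConfiguration object or be absent entirely. A minimal sketch of the kind of guard that avoids this (buildCrawlerOptions and the surrounding shapes are assumptions for illustration, not the actor's actual main.js):

```javascript
// Hypothetical sketch: only attach `proxyConfiguration` to the crawler
// options when a proxy was actually configured. Passing `null` fails
// Crawlee's validation; omitting the key entirely does not, so inputs
// from older workflows that never set a proxy keep working.
function buildCrawlerOptions(baseOptions, proxyConfiguration) {
    const options = { ...baseOptions };
    if (proxyConfiguration) {
        options.proxyConfiguration = proxyConfiguration; // a real ProxyConfiguration
    }
    // No proxy requested: leave the key out instead of setting it to null.
    return options;
}
```

In an Apify actor this guard would typically sit around the result of `Actor.createProxyConfiguration(...)`, which yields `undefined` rather than a configuration object when no proxy is requested, so only a truthy result ever gets attached.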