Reddit Scraper

Pricing

$45.00/month + usage


Developed by

Gustavo Rudiger

Maintained by Community

Unlimited Reddit web scraper to crawl posts, comments, communities, and users without login. Limit web scraping by number of posts or items and extract all data in a dataset in multiple formats.

3.9 (2)

115

Total users: 6.9K

Monthly users: 441

Runs succeeded: >99%

Issues response: 2.2 days

Last modified: an hour ago

Fails on communities

Closed

jonbarrow opened this issue a month ago

The Actor seems to fail when scraping a community with default settings. The Lite version does not have this issue. Log:

2025-05-05T03:37:46.113Z ACTOR: Pulling Docker image of build S4UiFDwYiCKgmZtw2 from registry.
2025-05-05T03:37:46.201Z ACTOR: Creating Docker container.
2025-05-05T03:37:46.452Z ACTOR: Starting Docker container.
2025-05-05T03:37:48.482Z INFO System info {"apifyVersion":"3.4.0","apifyClientVersion":"2.12.3","crawleeVersion":"3.13.2","osType":"Linux","nodeVersion":"v20.19.1"}
2025-05-05T03:37:48.667Z INFO Found startUrl. Search params will be ignored.
2025-05-05T03:37:49.366Z INFO Starting the crawl.
2025-05-05T03:37:49.517Z INFO PuppeteerCrawler: Starting the crawler.
2025-05-05T03:37:52.830Z INFO Processing https://www.reddit.com/r/worldnews/?include_over_18=on ...
2025-05-05T03:37:56.879Z WARN PuppeteerCrawler: Reclaiming failed request back to the list or queue. SyntaxError: Unexpected token '<', "<body clas"... is not valid JSON
2025-05-05T03:37:56.881Z at JSON.parse (<anonymous>)
2025-05-05T03:37:56.883Z at getURLDataAsJSON (file:///home/myuser/src/tools.js:577:15)
2025-05-05T03:37:56.885Z at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
2025-05-05T03:37:56.887Z at async Promise.all (index 1)
2025-05-05T03:37:56.898Z at async communityJSONParser (file:///home/myuser/src/parsers/communityJSONParser.js:20:35)
2025-05-05T03:37:56.903Z at async handleCommunityInfo (file:///home/myuser/src/routes.js:28:21)
2025-05-05T03:37:56.905Z at async wrap (/home/myuser/node_modules/@apify/timeout/cjs/index.cjs:54:21) {"id":"7htqwWEFkKdWZ4s","url":"https://www.reddit.com/r/worldnews/.json?include_over_18=on","retryCount":1}
2025-05-05T03:37:58.726Z INFO Processing https://www.reddit.com/r/worldnews/?include_over_18=on ...
2025-05-05T03:38:04.491Z WARN PuppeteerCrawler: Reclaiming failed request back to the list or queue. SyntaxError: Unexpected token '<', "<body clas"... is not valid JSON
2025-05-05T03:38:04.498Z at JSON.parse (<anonymous>)
2025-05-05T03:38:04.500Z at getURLDataAsJSON (file:///home/myuser/src/tools.js:577:15)
2025-05-05T03:38:04.503Z at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
2025-05-05T03:38:04.505Z at async Promise.all (index 0)
2025-05-05T03:38:04.506Z at async communityJSONParser (file:///home/myuser/src/parsers/communityJSONParser.js:20:35)
2025-05-05T03:38:04.508Z at async handleCommunityInfo (file:///home/myuser/src/routes.js:28:21)
2025-05-05T03:38:04.511Z at async wrap (/home/myuser/node_modules/@apify/timeout/cjs/index.cjs:54:21) {"id":"7htqwWEFkKdWZ4s","url":"https://www.reddit.com/r/worldnews/.json?include_over_18=on","retryCount":2}
2025-05-05T03:38:05.845Z INFO Processing https://www.reddit.com/r/worldnews/?include_over_18=on ...
2025-05-05T03:38:33.189Z ACTOR: The Actor run was aborted by the user.
trudax

Is the issue still happening for you? I was able to scrape using the same input as your run.
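
For anyone hitting the same error: the log shows JSON.parse receiving a body that starts with "<body clas", meaning the .json URL returned HTML instead of JSON on that attempt. The log does not say why (a block or consent page is one possibility). Below is a minimal sketch of a guard that fails with a clearer message in that case. The function name parseJsonBody and its parameters are hypothetical and are not the Actor's actual getURLDataAsJSON, whose source is not shown here.

```javascript
// Hypothetical helper, not taken from the Actor's source code.
// Parses a response body as JSON, but first checks whether the body
// looks like HTML so the error names the real problem.
function parseJsonBody(body, url) {
    const trimmed = body.trimStart();

    // Valid JSON never starts with '<', so this is most likely an HTML page.
    if (trimmed.startsWith('<')) {
        const preview = trimmed.slice(0, 40);
        throw new Error(`Expected JSON from ${url} but received HTML: ${preview}`);
    }

    try {
        return JSON.parse(trimmed);
    } catch (err) {
        // Re-throw with the URL attached so the failing request is easy to find.
        throw new Error(`Invalid JSON from ${url}: ${err.message}`);
    }
}
```

With a guard like this, the retry log would report that HTML came back from the .json URL instead of a bare SyntaxError, which makes it easier to tell an intermittent block from a parser bug.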