Reddit Scraper

1 day trial then $45.00/month - No credit card required now

trudax/reddit-scraper
Unlimited Reddit web scraper to crawl posts, comments, communities, and users without login. Limit web scraping by number of posts or items and extract all data in a dataset in multiple formats.

returning 0 comments, tested a few different subreddits

Closed
harshhellohello opened this issue
a month ago

output from my code:

Scraping r/devops...

Debug: Found 20 total items in dataset
Debug: Processing item: 'Getting into DevOps' with 0 comments
Debug: Processing item: No title with 0 comments
Debug: Processing item: No title with 0 comments
Debug: Processing item: How should this sub respond to reddit's api changes, part 2 with 0 comments
Debug: Processing item: No title with 0 comments
Successfully scraped 0 comments from r/devops

First few rows of the scraped data:
Empty DataFrame
Columns: []
Index: []

trudax

Can you share the run id?

harshhellohello

a month ago

AcOrjJHODyGFbkinv

I've shared access to all runs from my account

Thank you!

harshhellohello

a month ago

or can you give me sample code that I can try out to fetch comments for posts in a specific subreddit?

trudax

You have to increase the maxItems value in your input. Right now, it is the same as the number of posts, so the run will stop once 20 items are stored.
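
The arithmetic behind this advice can be sketched roughly as follows. This is only an illustration, not the actor's documented accounting: it assumes every stored post and every stored comment counts as one item against maxItems, and that maxComments applies per post. The function name min_max_items and the extra_items parameter are made up for this example.

```python
def min_max_items(max_post_count: int, max_comments: int, extra_items: int = 0) -> int:
    """Rough lower bound for maxItems so that posts alone do not use up the budget.

    Assumptions (not confirmed by the actor's documentation):
    - each post and each comment is stored as one dataset item;
    - maxComments is a per-post limit;
    - extra_items covers anything else stored, e.g. a community record.
    """
    if max_post_count < 0 or max_comments < 0 or extra_items < 0:
        raise ValueError("counts must be non-negative")
    return max_post_count * (1 + max_comments) + extra_items


# With no room for comments, 20 posts already need 20 items,
# which matches the run that stopped after 20 items.
print(min_max_items(20, 0))

# 20 posts with up to 10 comments each, plus one community record.
print(min_max_items(20, 10, extra_items=1))

# Values from the 750-post input later in this thread.
print(min_max_items(750, 1000, extra_items=1))
```

Under these assumptions, any maxItems value at or above the returned number leaves room for the requested comments; a smaller value would end the run early.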

harshhellohello

a month ago

I'm still having trouble, can you give me sample code that I can try out to fetch comments for posts in a specific subreddit?

trudax

I don't know what you mean by sample code. You mean the input for the actor?

harshhellohello

a month ago

yes, something that I can build on.

trudax

{
  "startUrls": [
    {
      "url": "https://www.reddit.com/r/devops/hot/"
    }
  ],
  "sort": "top",
  "maxItems": 20000,
  "maxPostCount": 20,
  "maxComments": 10,
  "maxCommunitiesCount": 1,
  "maxUserCount": 100,
  "scrollTimeout": 60,
  "proxy": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  },
  "searchFilter": "top",
  "skipDailyThreads": false,
  "skipComments": false,
  "skipUserPosts": false,
  "skipCommunity": false,
  "searchPosts": true,
  "searchComments": false,
  "searchCommunities": false,
  "searchUsers": false,
  "includeNSFW": true,
  "debugMode": false
}
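
For anyone who wants to start from Python rather than raw JSON, here is a minimal sketch of sending an input like the one above through the apify-client package. The input keys are copied from the sample; build_input, run_scraper, and the APIFY_TOKEN environment variable name are hypothetical names chosen for this example, and sizing maxItems from the post and comment limits is an assumption about how items are counted, not documented behavior.

```python
import json
import os

ACTOR_ID = "trudax/reddit-scraper"

# Keys and values copied from the sample input in this thread.
BASE_INPUT = {
    "sort": "top",
    "maxCommunitiesCount": 1,
    "maxUserCount": 100,
    "scrollTimeout": 60,
    "proxy": {"useApifyProxy": True, "apifyProxyGroups": ["RESIDENTIAL"]},
    "searchFilter": "top",
    "skipDailyThreads": False,
    "skipComments": False,
    "skipUserPosts": False,
    "skipCommunity": False,
    "searchPosts": True,
    "searchComments": False,
    "searchCommunities": False,
    "searchUsers": False,
    "includeNSFW": True,
    "debugMode": False,
}


def build_input(subreddit: str, max_posts: int, max_comments: int) -> dict:
    """Build an actor input for one subreddit (hypothetical helper)."""
    run_input = dict(BASE_INPUT)
    run_input["startUrls"] = [{"url": f"https://www.reddit.com/r/{subreddit}/hot/"}]
    run_input["maxPostCount"] = max_posts
    run_input["maxComments"] = max_comments
    # Assumption: posts and comments each count as one item, plus one
    # community record, so maxItems must exceed the post count.
    run_input["maxItems"] = max_posts * (1 + max_comments) + 1
    return run_input


def run_scraper(run_input: dict, token: str) -> list:
    """Start the actor, wait for it to finish, and return all dataset items."""
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


if __name__ == "__main__":
    run_input = build_input("devops", max_posts=20, max_comments=10)
    token = os.environ.get("APIFY_TOKEN")  # hypothetical variable name
    if token:
        items = run_scraper(run_input, token)
        print(f"Fetched {len(items)} items")
    else:
        # Without a token, just show the input that would be sent.
        print(json.dumps(run_input, indent=2))
```

The import of apify_client sits inside run_scraper so the input can be built and inspected without the package installed.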

harshhellohello

a month ago

Great thanks, I've got it working.

Just one more thing: even with the settings below, the output is exactly the same as with the settings you shared above (comments from only two posts). Thanks!

How come I can't output, let's say, all comments from the last 750 posts?

run_input = {
    "startUrls": [
        {"url": "https://www.reddit.com/r/devops/hot/"}
    ],
    "sort": "top",
    "maxItems": 20000,
    "maxPostCount": 750,
    "maxComments": 1000,
    "maxCommunitiesCount": 1,
    "maxUserCount": 1000,
    "scrollTimeout": 120,
    "proxy": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"]
    },
    "searchFilter": "top",
    "skipDailyThreads": False,
    "skipComments": False,
    "skipUserPosts": False,
    "skipCommunity": False,
    "searchPosts": True,
    "searchComments": False,
    "searchCommunities": False,
    "searchUsers": False,
    "includeNSFW": True,
    "debugMode": True
}
