Reddit Scraper
1 day trial then $45.00/month - No credit card required now
Unlimited Reddit web scraper to crawl posts, comments, communities, and users without login. Limit web scraping by number of posts or items and extract all data in a dataset in multiple formats.
Returning 0 comments (tested a few different subreddits)
output from my code:
Scraping r/devops...
Debug: Found 20 total items in dataset
Debug: Processing item: 'Getting into DevOps' with 0 comments
Debug: Processing item: No title with 0 comments
Debug: Processing item: No title with 0 comments
Debug: Processing item: How should this sub respond to reddit's api changes, part 2 with 0 comments
Debug: Processing item: No title with 0 comments
Successfully scraped 0 comments from r/devops

First few rows of the scraped data:
Empty DataFrame
Columns: []
Index: []
trudax
Can you share the run id?
harshhellohello
AcOrjJHODyGFbkinv
I've shared access to all runs from my account
Thank you!
harshhellohello
or can you give me sample code that I can try out to fetch comments for posts in a specific subreddit?
trudax
You have to increase the maxItems value in your input. Right now, it is the same as the number of posts, so the run will stop once 20 items are stored.
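The reply above implies that posts and comments draw from the same maxItems budget. A minimal sketch of that arithmetic follows. It assumes every post and every comment is stored as one dataset item and that maxComments is a per-post cap; neither is confirmed in this thread. The helper name min_max_items is hypothetical.

```python
def min_max_items(max_post_count: int, max_comments: int) -> int:
    """Worst-case number of dataset items a run could store.

    Assumptions (not confirmed by the actor's author): each post and each
    comment counts as one item, and max_comments applies per post.
    """
    if max_post_count < 0 or max_comments < 0:
        raise ValueError("counts must be non-negative")
    # One item per post, plus up to max_comments items for each post.
    return max_post_count * (1 + max_comments)


# 20 posts with up to 10 comments each: 20 + 200 = 220 items.
print(min_max_items(20, 10))
```

Under these assumptions, a maxItems equal to the post count leaves no room for comments, which would match the 0 comments reported above.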
harshhellohello
I'm still having trouble, can you give me sample code that I can try out to fetch comments for posts in a specific subreddit?
trudax
I don't know what you mean by sample code. You mean the input for the actor?
harshhellohello
yes, something that I can build on.
trudax
{
  "startUrls": [
    {
      "url": "https://www.reddit.com/r/devops/hot/"
    }
  ],
  "sort": "top",
  "maxItems": 20000,
  "maxPostCount": 20,
  "maxComments": 10,
  "maxCommunitiesCount": 1,
  "maxUserCount": 100,
  "scrollTimeout": 60,
  "proxy": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  },
  "searchFilter": "top",
  "skipDailyThreads": false,
  "skipComments": false,
  "skipUserPosts": false,
  "skipCommunity": false,
  "searchPosts": true,
  "searchComments": false,
  "searchCommunities": false,
  "searchUsers": false,
  "includeNSFW": true,
  "debugMode": false
}
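For anyone who wants something to build on from Python, here is a rough sketch that wraps the input above in a function and sizes maxItems from the post and comment limits. The helper names build_run_input and run_actor, the actor ID "trudax/reddit-scraper", and the sizing rule (one item per post or comment) are assumptions, not something the author confirmed. The run_actor helper needs the apify-client package and an API token, and is only defined here, not called.

```python
import json


def build_run_input(subreddit: str, max_posts: int, max_comments: int) -> dict:
    """Build an actor input modeled on the JSON shared in this thread."""
    # Assumption: posts and comments each count as one item toward maxItems.
    max_items = max_posts * (1 + max_comments)
    return {
        "startUrls": [{"url": f"https://www.reddit.com/r/{subreddit}/hot/"}],
        "sort": "top",
        "maxItems": max_items,
        "maxPostCount": max_posts,
        "maxComments": max_comments,
        "maxCommunitiesCount": 1,
        "maxUserCount": 100,
        "scrollTimeout": 60,
        "proxy": {"useApifyProxy": True, "apifyProxyGroups": ["RESIDENTIAL"]},
        "searchFilter": "top",
        "skipDailyThreads": False,
        "skipComments": False,
        "skipUserPosts": False,
        "skipCommunity": False,
        "searchPosts": True,
        "searchComments": False,
        "searchCommunities": False,
        "searchUsers": False,
        "includeNSFW": True,
        "debugMode": False,
    }


def run_actor(token: str, run_input: dict,
              actor_id: str = "trudax/reddit-scraper") -> list:
    """Start a run and return its dataset items (actor_id is an assumption)."""
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


if __name__ == "__main__":
    # Print the input only; calling run_actor would start a billable run.
    print(json.dumps(build_run_input("devops", 20, 10), indent=2))
```

Keeping the input builder separate from the network call makes it possible to inspect or log the exact input before spending a run on it.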
harshhellohello
Great thanks, I've got it working.
Just one more thing, even with the settings below it outputs exactly the same content as with the settings you shared above (only outputs comments from two posts). Thanks!
How come I'm not able to output let's say all comments from the last 750 posts?
run_input = {
    "startUrls": [
        {"url": "https://www.reddit.com/r/devops/hot/"}
    ],
    "sort": "top",
    "maxItems": 20000,
    "maxPostCount": 750,
    "maxComments": 1000,
    "maxCommunitiesCount": 1,
    "maxUserCount": 1000,
    "scrollTimeout": 120,
    "proxy": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"]
    },
    "searchFilter": "top",
    "skipDailyThreads": False,
    "skipComments": False,
    "skipUserPosts": False,
    "skipCommunity": False,
    "searchPosts": True,
    "searchComments": False,
    "searchCommunities": False,
    "searchUsers": False,
    "includeNSFW": True,
    "debugMode": True
}
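One way to check how many posts actually contributed comments is to tally the comment items per post in the dataset. The sketch below assumes each item carries a type field named dataType (with the value "comment" for comments) and that comments reference their post through a postId field. Both field names are guesses about the output schema, so they are exposed as parameters, and the sample items are made up for illustration.

```python
from collections import Counter


def comments_per_post(items, type_key="dataType", post_key="postId"):
    """Count comment items per post.

    type_key and post_key are assumed field names; adjust them to match
    the fields that actually appear in your dataset.
    """
    counts = Counter()
    for item in items:
        if item.get(type_key) == "comment":
            counts[item.get(post_key, "unknown")] += 1
    return counts


# Hypothetical items shaped like the assumed schema, for illustration only.
sample_items = [
    {"dataType": "post", "id": "post_a"},
    {"dataType": "comment", "postId": "post_a"},
    {"dataType": "comment", "postId": "post_a"},
    {"dataType": "post", "id": "post_b"},
    {"dataType": "comment", "postId": "post_b"},
]

for post_id, n in sorted(comments_per_post(sample_items).items()):
    print(f"{post_id}: {n} comments")
```

If the tally shows only a couple of post IDs, comparing it against the number of post items in the same dataset would indicate whether the run stopped early or the remaining posts simply returned no comments.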
Actor Metrics
360 monthly users
82 bookmarks
>99% runs succeeded
4.4 days response time
Created in Feb 2022
Modified 2 days ago