Reddit Scraper
1 day trial then $45.00/month - No credit card required now
Unlimited Reddit web scraper to crawl posts, comments, communities, and users without login. Limit web scraping by number of posts or items and extract all data in a dataset in multiple formats.
Even when I set all limits to 5,000,000, this doesn't scrape all the posts. It only goes back about a year and 2-3 months, even in communities that have posts going back 5-7 years. I tried changing the URL to the "top of all time" format, and that gets far more results, but it still doesn't get the full community.
It also doesn't work for user comments or posts.
The community URL that won't go past about a year: https://www.reddit.com/r/subreddit/
Top of all time format: https://www.reddit.com/r/subreddit/top/?t=all
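For anyone trying both URL formats, here is a minimal Python sketch of how the two start URLs above can be built. The helper name listing_url and its parameters are hypothetical, made up for illustration; only the two URL shapes shown above come from this thread.

```python
from urllib.parse import urlencode


def listing_url(subreddit, sort=None, time_filter=None):
    """Build a community start URL (hypothetical helper, not part of the scraper).

    sort        -- optional listing name such as "top"
    time_filter -- optional value for the "t" query parameter such as "all"
    """
    url = f"https://www.reddit.com/r/{subreddit}/"
    if sort:
        url += f"{sort}/"
    if time_filter:
        url += "?" + urlencode({"t": time_filter})
    return url


# The two formats discussed in this thread:
default_url = listing_url("subreddit")
top_all_time_url = listing_url("subreddit", sort="top", time_filter="all")
```

Either URL can then be pasted into the scraper's start URL field in place of the hand-typed one.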
Can you share a runID of this?
Here are two runs I tested: DQWz17oMJy3HD0MUn and dEr1rbSY1Vq8SvBkX
I get the same results no matter what the limits are set to, and the only reason these runs got so many results is that I used the "Top of all time" format. When scraping a community's normal URL, it stops at around 3,000 results and goes back maybe a year. If you can fix this to scrape a full community, I bet I can send a lot of people your way. Thousands of us are struggling since Reddit revoked Pushshift's API access, and there is no other scraper with a GUI :)
My scraper only returns what the Reddit webpage provides. The scraper returned all results up to the last page, so Reddit may be limiting the data on its site. I will try to find an alternative, but this could take some time.
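To illustrate why raising the limits does not help, here is a rough Python sketch of page-by-page collection. It is not the scraper's actual code: collect_items, make_fake_listing, and the cursor scheme are assumptions made for illustration, and the fake listing simply stands in for a site that stops serving pages after about 3,000 items.

```python
def collect_items(fetch_page, max_items):
    """Follow pages until max_items is reached or the source has no next page."""
    items = []
    cursor = None
    while len(items) < max_items:
        page_items, cursor = fetch_page(cursor)
        items.extend(page_items)
        if cursor is None:
            # The source offers no further page, so the limit is irrelevant.
            break
    return items[:max_items]


def make_fake_listing(total, page_size):
    """Hypothetical stand-in for a listing that exposes only `total` items."""
    def fetch_page(cursor):
        start = cursor or 0
        end = min(start + page_size, total)
        next_cursor = end if end < total else None
        return list(range(start, end)), next_cursor
    return fetch_page


# A huge limit still yields only what the listing exposes.
capped = collect_items(make_fake_listing(3000, 100), max_items=5_000_000)
# A small limit stops early, as expected.
limited = collect_items(make_fake_listing(3000, 100), max_items=50)
```

In this sketch the limit only ever cuts a run short. It cannot make the source serve pages it does not offer, which matches the behavior described above.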
Oh I see! That makes sense, and is helpful to know. Thank you!
- 329 monthly users
- 47 stars
- 99.9% runs succeeded
- 1.2 days response time
- Created in Feb 2022
- Modified 15 days ago