Reddit Scraper

Pricing

$45.00/month + usage

Developed by

Gustavo Rudiger

Maintained by Community

Unlimited Reddit web scraper to crawl posts, comments, communities, and users without login. Limit web scraping by number of posts or items and extract all data in a dataset in multiple formats.

3.9 (2)

120

Total users: 7.2K

Monthly users: 432

Runs succeeded: >99%

Issues response: 2.7 days

Last modified: 19 days ago

Full communities aren't being scraped

Closed

researcher_1999 opened this issue
2 years ago

Even when I set all limits to 5000000, the scraper doesn't return all the posts. It only goes back about a year and 2-3 months, even in communities whose posts go back 5-7 years. I tried changing the URL to the "top of all time" format, which gets far more results but still doesn't cover the full community.

It also doesn't work for user comments or posts.

The community URL that won't go past about a year: https://www.reddit.com/r/subreddit/

Top of all time format: https://www.reddit.com/r/subreddit/top/?t=all
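
The two URL shapes above differ only in a sort segment and a `t` query parameter. A minimal sketch of building them, assuming a hypothetical helper named `listing_url` (it is not part of the scraper; only the two URLs quoted in this issue are taken from the thread):

```python
from urllib.parse import urlencode

BASE = "https://www.reddit.com"


def listing_url(subreddit, sort=None, time_filter=None):
    """Build a community listing URL (hypothetical helper for illustration).

    sort        -- optional path segment such as "top"
    time_filter -- optional value for the `t` query parameter such as "all"
    """
    path = f"/r/{subreddit}/"
    if sort:
        path += f"{sort}/"
    query = f"?{urlencode({'t': time_filter})}" if time_filter else ""
    return f"{BASE}{path}{query}"


# The community's normal URL from the report
print(listing_url("subreddit"))
# The "top of all time" format from the report
print(listing_url("subreddit", sort="top", time_filter="all"))
```

Either URL can then be pasted into the scraper's start URL field, as described in the report.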

trudax

Can you share a runID of this?

researcher_1999

2 years ago

Here are two runs I tested: DQWz17oMJy3HD0MUn and dEr1rbSY1Vq8SvBkX

I get the same results no matter what the limits are set to, and the only reason these runs got so many results is that I used the "Top of all time" format. When scraping a community's normal URL, it stops at around 3,000 results and goes back maybe a year. If you can fix this to scrape a full community, I bet I can send a lot of people your way. Thousands of us are struggling since Reddit revoked Pushshift's API access and there is no other scraper with a GUI :)

trudax

My scraper only returns what the Reddit webpage provides. The scraper returned all results up to the last page, so Reddit may be limiting the data on its site. I will try to find an alternative, but this could take some time.
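
The behaviour described here can be illustrated with a toy model. This is only a sketch, not the scraper's actual code: `make_capped_source` is a hypothetical in-memory stand-in for a site that stops offering pages after a fixed number of items. The cap of 3,000 echoes the figure reported above, and the total of 50,000 is made up for the example.

```python
from typing import Callable, List, Optional, Tuple

# A page is (items, next cursor); a cursor of None means "no further page".
Page = Tuple[List[int], Optional[int]]


def make_capped_source(total_items: int, visible_cap: int,
                       page_size: int) -> Callable[[Optional[int]], Page]:
    """Fake listing that holds total_items posts but exposes only visible_cap."""
    visible = list(range(min(total_items, visible_cap)))

    def fetch(cursor: Optional[int]) -> Page:
        start = cursor or 0
        items = visible[start:start + page_size]
        nxt = start + page_size
        return items, (nxt if nxt < len(visible) else None)

    return fetch


def scrape(fetch: Callable[[Optional[int]], Page], max_items: int) -> List[int]:
    """Follow next-page cursors until max_items is reached or pages run out."""
    results: List[int] = []
    cursor: Optional[int] = None
    while len(results) < max_items:
        items, cursor = fetch(cursor)
        results.extend(items)
        if cursor is None:  # the source offers no more pages
            break
    return results[:max_items]


fetch = make_capped_source(total_items=50_000, visible_cap=3_000, page_size=100)
print(len(scrape(fetch, max_items=500)))        # limit reached first: 500
print(len(scrape(fetch, max_items=5_000_000)))  # pages run out first: 3000
```

In this model, raising `max_items` past the cap changes nothing, because the loop ends when the source stops returning a next page rather than when the limit is hit.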

researcher_1999

2 years ago

Oh I see! That makes sense, and is helpful to know. Thank you!