Reddit Scraper

Pricing

$45.00/month + usage

Developed by

Gustavo Rudiger

Maintained by Community

Unlimited Reddit web scraper to crawl posts, comments, communities, and users without login. Limit web scraping by number of posts or items and extract all data in a dataset in multiple formats.

3.9 (2)

120

Total users: 7.1K

Monthly users: 464

Runs succeeded: >99%

Issues response: 2.4 days

Last modified: 15 days ago

won't collect beyond a certain limit

Closed

lemon_normalcy opened this issue
a year ago

I've tried two different runs now of trying to collect a significant portion of a subreddit's activity (2+ years), but the scraper stops after about 50,000 lines of data

runs: https://li2vbaesoevb.runs.apify.net and https://cargcpdn5cbn.runs.apify.net

one further detail or question - it'd be great to pick up where a previous run stopped. is there some way to use the previous pagination marker to start collecting where it left off?

trudax

can you share the run ID?

lemon_normalcy

a year ago

I think they're both in the original post aren't they? Those 2 links?

trudax

I don't think so; there is no useful information when I click on those links. If you go to the run, you should see a share button in the upper right that will give you the correct link.

lemon_normalcy

a year ago

what do you think?

trudax

I will try to replicate the run here, but at first glance I don't see anything wrong besides some blocked requests, which is expected.

lemon_normalcy

a year ago

okay - what about picking up a new run where the old one left off though? currently there isn't an option to do that.

lemon_normalcy

a year ago

what do you think?

trudax

I got the same results; it seems to be returning everything it can.

lemon_normalcy

a year ago

right - but my question is why we can't start the scraping at somewhere other than the beginning of the subreddit?

trudax

You can, you just need to know the last URL used for paginating.
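For context, Reddit's public JSON listings page through results with an `after` cursor, so "the last URL used for paginating" would plausibly be a listing URL carrying that cursor. The following is only a sketch under that assumption; the thread does not confirm how this actor paginates internally, and `next_page_url` is a hypothetical helper, not part of the actor.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def next_page_url(current_url: str, listing: dict) -> Optional[str]:
    """Return the URL of the next listing page, or None if there is no cursor.

    Assumes `listing` is a parsed Reddit listing response, where the cursor
    for the following page is stored under data.after.
    """
    after = listing.get("data", {}).get("after")
    if not after:
        return None  # no cursor: the listing has nothing further to return

    parts = urlsplit(current_url)
    query = dict(parse_qsl(parts.query))
    query["after"] = after  # add or overwrite the cursor, keep other params
    return urlunsplit(parts._replace(query=urlencode(query)))
```

A URL built this way records exactly where a crawl stopped, which is why knowing the last one is enough to continue from that point.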

lemon_normalcy

a year ago

Okay. How do I implement that within this actor then?

trudax

I will add a change to log the last page so you can copy from the logs and use it.

lemon_normalcy

a year ago

okay - where do I paste in the last URL used?

lemon_normalcy

a year ago

hello?

trudax

The last used URL is in the logs.
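The thread never says where to paste that URL. One possible workflow, sketched here under explicit assumptions: the run log contains the pagination URLs as plain text, those URLs carry an `after=` query parameter, and the actor accepts a `startUrls` input field (a common Apify convention, not confirmed in this thread). The log format and both helper names are hypothetical.

```python
import re
from typing import Optional

# Matches any http(s) URL up to the next whitespace character.
URL_PATTERN = re.compile(r"https?://\S+")


def last_pagination_url(log_text: str) -> Optional[str]:
    """Return the last URL in the log that carries an `after=` cursor, if any."""
    urls = [u.rstrip(".,)") for u in URL_PATTERN.findall(log_text)]
    paginated = [u for u in urls if "after=" in u]
    return paginated[-1] if paginated else None


def build_resume_input(log_text: str) -> dict:
    """Build a run input that starts from the last logged pagination URL.

    The `startUrls` field name is an assumption about the actor's input.
    """
    url = last_pagination_url(log_text)
    if url is None:
        raise ValueError("no pagination URL found in log")
    return {"startUrls": [{"url": url}]}
```

If the actor's input does use a start URL list, the returned dictionary would be the input for the follow-up run; check the actor's input schema before relying on the field name.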