Github Profile Scraper

3 days trial then $30.00/month - No credit card required now

saswave/github-profile-scraper

GitHub User Profile Scraper. Extracts data from GitHub profiles, including followers, following, LinkedIn, Twitter, achievements, and much more. Ideal for developers, researchers, and marketers. Takes a list of GitHub profiles or a link to a repository's stargazers as input.

Starts over after resurrect

Closed

aleksandrmoshkov opened this issue
a month ago

Please see my run. After the timeout, the restart began from the beginning. I have nearly 4,000 URLs, and any failure will cost a lot of usage and time.

aleksandrmoshkov

a month ago

UPD: No, not from the beginning. For some reason the first run didn't start from the beginning of the list, but after I resurrected the actor it started from the first line.

aleksandrmoshkov

a month ago

UPD 2: No, it started over after all. I'm pausing for now, waiting for your response.

saswave

Thank you for reporting the issue. We are working on adding a queue system that will handle container migration, so that a run can restart from where it stopped instead of restarting from 0.
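
The resume behavior described in this reply can be illustrated with a small checkpointing sketch. It is not the actor's actual code, only a minimal illustration of the general idea. The function name `process_with_checkpoint`, the `handle` callback, and the JSON state file are all hypothetical. A local file would not survive a container migration, so a real actor would need to keep this state in storage that outlives the container.

```python
import json
import os


def process_with_checkpoint(urls, handle, state_path):
    """Process urls in order, saving the index of the next unprocessed item.

    Hypothetical helper for illustration only: `handle` is any callable that
    processes one URL, and `state_path` is where progress is persisted.
    """
    # Resume from the saved position if an earlier run left one behind.
    start = 0
    if os.path.exists(state_path):
        with open(state_path) as f:
            start = json.load(f)["next_index"]

    for i in range(start, len(urls)):
        handle(urls[i])
        # Write to a temporary file and rename it, so an interruption
        # never leaves a half-written state file.
        tmp_path = state_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"next_index": i + 1}, f)
        os.replace(tmp_path, state_path)
```

With this design an item can be processed twice if the run is interrupted after `handle` returns but before the checkpoint is written, so the output may need deduplication.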

aleksandrmoshkov

a month ago

Thank you for the quick response! What are your timelines for the queue system? Meanwhile, what do you recommend? Should I split the processing of 4000 accounts into 8-10 parts?
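
For anyone considering the same workaround, splitting a long URL list into roughly equal parts could look like the sketch below. The helper name `split_into_batches` and the example URLs are made up for illustration, and the thread does not say whether the actor needs this after the update.

```python
def split_into_batches(items, batch_count):
    """Split items into batch_count contiguous batches whose sizes differ by at most one."""
    if batch_count < 1:
        raise ValueError("batch_count must be at least 1")
    base, extra = divmod(len(items), batch_count)
    batches = []
    start = 0
    for i in range(batch_count):
        # The first `extra` batches each take one additional item.
        size = base + (1 if i < extra else 0)
        batches.append(items[start:start + size])
        start += size
    return batches


# Hypothetical input: 4000 placeholder profile URLs split into 8 runs of 500.
urls = [f"https://github.com/user{i}" for i in range(4000)]
batches = split_into_batches(urls, 8)
```

Each batch can then be submitted as a separate run, so a failure costs at most one batch of usage rather than the whole list.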

saswave

Not sure, since the infrastructure is 100% managed by Apify and a migration event can be triggered at any time.

The actor should be updated by the end of the morning (French time).

aleksandrmoshkov

a month ago

> The actor should be updated by the end of the morning (French time).

Do you mean today?

saswave

The actor has been updated. Give it a try, and you can close the issue if the 4000 accounts are scraped successfully.

aleksandrmoshkov

a month ago

Tried it. I would say it is done; however, I resurrected the run once and it ended with an error.

kj7oeyVgCDasn2Vvo

saswave

Running with your input and trying to reproduce the issue.

saswave

Did you remove the timeout limit from the actor settings (or increase the default limit, which is 3600 seconds)?

saswave

Did you build the actor before starting? Maybe you started the run without the code being updated. It's running now, 450/4000.

I will get back to you if it throws an unhandled error before the last URL (4000).

saswave

It timed out after 1h with 2000+ profiles scraped. I resurrected the run and it started from where it stopped.
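
The figures in this comment allow a rough estimate of the timeout a full run would need. The sketch below assumes throughput stays constant, which the thread does not confirm. The function name `estimate_timeout_seconds` and the 1.25 safety factor are arbitrary choices made for illustration.

```python
import math


def estimate_timeout_seconds(total_items, items_done, elapsed_seconds, safety_factor=1.25):
    """Extrapolate the time needed for total_items from observed progress.

    Hypothetical helper: assumes a constant processing rate and pads the
    result by safety_factor.
    """
    if items_done <= 0 or elapsed_seconds <= 0:
        raise ValueError("items_done and elapsed_seconds must be positive")
    projected = total_items * elapsed_seconds / items_done
    return math.ceil(projected * safety_factor)


# About 2000 profiles in the default 3600 seconds, 4000 URLs in total.
needed = estimate_timeout_seconds(4000, 2000, 3600)
```

Under these assumptions the projection is 7200 seconds without padding and 9000 seconds with it. Since the run scraped "2000+" profiles in the hour, the real requirement would be somewhat lower.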

aleksandrmoshkov

a month ago

> Did you remove the timeout limit from the actor settings (or increase the default limit, which is 3600 seconds)?

No, it was default.

> Did you build the actor before starting? Maybe you started the run without the code being updated. It's running now, 450/4000.

Not sure here, tbh.

> It timed out after 1h with 2000+ profiles scraped. I resurrected the run and it started from where it stopped.

Yes, last time after your update it successfully continued after resurrection. And I checked the final result. The list is clear and tidy, without missing lines. Thank you! I think the issue is closed now.

Developer
Maintained by Community

Actor Metrics

  • 3 monthly users

  • 3 stars

  • >99% runs succeeded

  • 0.77 hours response time

  • Created in Mar 2024

  • Modified a month ago