Instagram Post Scraper

Pricing

$2.30 / 1,000 results

Developed by

Apify

Maintained by Apify

Scrape Instagram posts. Just add one or more Instagram usernames and get your data in seconds including text, hashtags, mentions, comments, images, URLs, likes, locations, and metadata. Export scraped data, run the scraper via API, schedule and monitor runs or integrate with other tools.

4.5 (32)

364

Total users: 34k

Monthly users: 5k

Runs succeeded: >99%

Issue response: 2.3 days

Last modified: a day ago

Crawling performance significantly degraded with continuous scrolling feature

Closed

anicim48 opened this issue
2 months ago

"Since the introduction of the continuous scrolling feature, the crawling process has experienced a significant slowdown. Runs that previously completed in a reasonable timeframe now take 10-20 minutes or longer.

Based on the logs, it appears the direct URL scraping completes quickly and without errors. However, the post scraping phase is where the performance issue manifests. The logs show a large number of warnings related to the RequestQueue, specifically:

WARN RequestQueue(L7Iimtar5iBZhk0qD, no-name): Queue head returned a request that is already in progress?!

This suggests that requests are being processed multiple times, potentially due to an issue with queue management or asynchronous request handling. This duplication of effort likely contributes to the extended crawling times.

Additionally, the CheerioCrawler statistics indicate that the average request completion time is much higher during the post scraping phase compared to the direct URL scraping.

The AutoscaledPool does not seem to be increasing concurrency, indicating that the system does not perceive resource overload. However, the repeated processing of requests due to the RequestQueue issue could still be causing the performance bottleneck.

The actor run URL for reference is: https://console.apify.com/actors/runs/aoZXzEQgR46r4ephD

Could you please investigate this issue, focusing on the RequestQueue warnings and the performance of the post scraping phase with the continuous scrolling feature enabled?"

alexey replied:

Hi!

Fixed in the latest build: https://console.apify.com/view/runs/8wfIzVdC6XTprqjQo

I will close the issue now, but if there is anything else we can help with, please let us know.
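
To check whether a newer build still emits the RequestQueue warning quoted in the report, one option is to export the run log and tally the matching lines. The Python sketch below is a minimal illustration, not part of the actor or of any Apify tooling: the function name `count_in_progress_warnings` is hypothetical, and the regular expression assumes every warning follows the exact shape of the single line quoted above.

```python
import re
from collections import Counter

# Assumed log shape, taken from the one warning quoted in the report:
# WARN RequestQueue(<queue id>, <name>): Queue head returned a request that is already in progress?!
WARNING_RE = re.compile(
    r"WARN\s+RequestQueue\((?P<queue_id>[^,)]+),\s*(?P<name>[^)]*)\):\s*"
    r"Queue head returned a request that is already in progress"
)


def count_in_progress_warnings(log_lines):
    """Return a Counter mapping queue ID to the number of matching warnings."""
    counts = Counter()
    for line in log_lines:
        match = WARNING_RE.search(line)
        if match:
            counts[match.group("queue_id")] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample: the quoted warning twice, plus one unrelated line.
    sample = [
        "WARN RequestQueue(L7Iimtar5iBZhk0qD, no-name): "
        "Queue head returned a request that is already in progress?!",
        "some unrelated log line",
        "WARN RequestQueue(L7Iimtar5iBZhk0qD, no-name): "
        "Queue head returned a request that is already in progress?!",
    ]
    for queue_id, total in count_in_progress_warnings(sample).items():
        print(f"{queue_id}: {total} warning(s)")
```

Running the same tally on logs from a run before the fix and a run after it gives a rough comparison of how often the queue head returned an in-progress request. It does not measure crawl time itself.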