Instagram Post Scraper

Pricing

$2.30 / 1,000 results

apify/instagram-post-scraper

Developed by

Apify

Maintained by Apify

Scrape Instagram posts. Add one or more Instagram usernames and get your data in seconds, including text, hashtags, mentions, comments, images, URLs, likes, locations, and metadata. Export the scraped data, run the scraper via API, schedule and monitor runs, or integrate with other tools.
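As a rough illustration of running the scraper via API, the sketch below builds a request for the Apify API v2 "run-sync-get-dataset-items" endpoint using only the Python standard library. The input field names `username` and `resultsLimit`, the helper names, and the placeholder username are assumptions made for illustration; check them against the Actor's input schema before relying on them.

```python
import json
import urllib.request

# Apify API v2 endpoint that runs an Actor and returns its dataset items.
# In the URL, "~" separates the account name from the Actor name.
API_URL = (
    "https://api.apify.com/v2/acts/"
    "apify~instagram-post-scraper/run-sync-get-dataset-items"
)


def build_run_request(token, usernames, results_limit):
    """Build (but do not send) a POST request that starts a scraper run."""
    # Assumed input fields; confirm them in the Actor's input schema.
    payload = {"username": list(usernames), "resultsLimit": results_limit}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def fetch_posts(token, usernames, results_limit=10):
    """Send the request and return the scraped items as a list of dicts."""
    request = build_run_request(token, usernames, results_limit)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # "example_user" is a placeholder, not a real account from this page.
    req = build_run_request("YOUR_API_TOKEN", ["example_user"], 5)
    print(req.get_method(), req.full_url)
```

Calling `fetch_posts` performs a real, billable run, so the example only prints the prepared request. Keeping request construction separate from sending makes the payload easy to inspect first.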

4.4 (30)

309

Monthly users: 2.8k

Runs succeeded: >99%

Response time: 19 hours

Last modified: an hour ago

| id | caption | url | commentsCount | likesCount | firstComment |
|---|---|---|---|---|---|
| 359***66 | Books allow readers to *** at the link in bio. | https://www.instagram.com/p/DH***C/ | 41 | 9028 | More about books 😍 |
| 359***88 | Photos by @ingo*** about these pumas at the link in bio. | https://www.instagram.com/p/DH***o/ | 143 | 68801 | ❤️ |
| 359***22 | Meet the Nat*** their stories at the link in bio. | https://www.instagram.com/p/DH***O/ | 136 | 13385 | Omg!!! ❤️ @dzennypha_ |

The data above is synthetic and does not reflect real-world values.
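To show how the output fields listed above might be consumed, here is a minimal sketch that loads items into a small dataclass and ranks them by likes plus comments. The `Post` class, its `interactions` property, and the ranking are illustrative assumptions, not part of the Actor. The sample items reuse the masked, synthetic values from the preview.

```python
from dataclasses import dataclass


@dataclass
class Post:
    """One scraped item, limited to the fields shown in the preview."""
    id: str
    caption: str
    url: str
    comments_count: int
    likes_count: int
    first_comment: str

    @classmethod
    def from_item(cls, item):
        # Map the camelCase output fields to Python attribute names.
        return cls(
            id=str(item["id"]),
            caption=item.get("caption", ""),
            url=item["url"],
            comments_count=int(item.get("commentsCount", 0)),
            likes_count=int(item.get("likesCount", 0)),
            first_comment=item.get("firstComment", ""),
        )

    @property
    def interactions(self):
        # Hypothetical metric: likes and comments added together.
        return self.likes_count + self.comments_count


# Synthetic, masked sample values copied from the preview.
SAMPLE_ITEMS = [
    {"id": "359***66", "caption": "Books allow readers to *** at the link in bio.",
     "url": "https://www.instagram.com/p/DH***C/",
     "commentsCount": 41, "likesCount": 9028, "firstComment": "More about books 😍"},
    {"id": "359***88", "caption": "Photos by @ingo*** about these pumas at the link in bio.",
     "url": "https://www.instagram.com/p/DH***o/",
     "commentsCount": 143, "likesCount": 68801, "firstComment": "❤️"},
    {"id": "359***22", "caption": "Meet the Nat*** their stories at the link in bio.",
     "url": "https://www.instagram.com/p/DH***O/",
     "commentsCount": 136, "likesCount": 13385, "firstComment": "Omg!!! ❤️ @dzennypha_"},
]

posts = sorted(
    (Post.from_item(item) for item in SAMPLE_ITEMS),
    key=lambda post: post.interactions,
    reverse=True,
)

for post in posts:
    print(post.url, post.interactions)
```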

Crawling performance significantly degraded with continuous scrolling feature

Closed
anicim48 opened this issue 11 days ago

"Since the introduction of the continuous scrolling feature, the crawling process has experienced a significant slowdown. Runs that previously completed in a reasonable timeframe now take 10-20 minutes or longer.

Based on the logs, it appears the direct URL scraping completes quickly and without errors. However, the post scraping phase is where the performance issue manifests. The logs show a large number of warnings related to the RequestQueue, specifically:

WARN RequestQueue(L7Iimtar5iBZhk0qD, no-name): Queue head returned a request that is already in progress?!

This suggests that requests are being processed multiple times, potentially due to an issue with queue management or asynchronous request handling. This duplication of effort likely contributes to the extended crawling times.

Additionally, the CheerioCrawler statistics indicate that the average request completion time is much higher during the post scraping phase compared to the direct URL scraping.

The AutoscaledPool does not seem to be increasing concurrency, indicating that the system does not perceive resource overload. However, the repeated processing of requests due to the RequestQueue issue could still be causing the performance bottleneck.

The actor run URL for reference is: https://console.apify.com/actors/runs/aoZXzEQgR46r4ephD

Could you please investigate this issue, focusing on the RequestQueue warnings and the performance of the post scraping phase with the continuous scrolling feature enabled?"
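One way to quantify the RequestQueue warnings described in this report is to count them per queue ID in an exported run log. The sketch below is a hypothetical helper, not part of Apify or Crawlee. Its regular expression is based only on the single warning line quoted above, so other log formats may need adjustments.

```python
import re
from collections import Counter

# Matches the warning quoted in the report and captures the queue ID.
WARNING_RE = re.compile(
    r"WARN\s+RequestQueue\((?P<queue_id>[^,)]+),\s*(?P<name>[^)]*)\):\s*"
    r"Queue head returned a request that is already in progress"
)


def count_queue_warnings(log_lines):
    """Return a Counter of 'already in progress' warnings keyed by queue ID."""
    counts = Counter()
    for line in log_lines:
        match = WARNING_RE.search(line)
        if match:
            counts[match.group("queue_id")] += 1
    return counts


if __name__ == "__main__":
    sample_log = [
        "WARN RequestQueue(L7Iimtar5iBZhk0qD, no-name): "
        "Queue head returned a request that is already in progress?!",
        "INFO some unrelated log line",
    ]
    for queue_id, total in count_queue_warnings(sample_log).items():
        print(queue_id, total)
```

Comparing these counts between a slow run and a normal one would indicate whether the warnings track the slowdown, although the count alone does not establish the cause.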

alexey

Hi!

Fixed in the latest build: https://console.apify.com/view/runs/8wfIzVdC6XTprqjQo

I will close the issue now, but if there is anything else we can help with, please let us know.

Pricing

Pricing model

Pay per result 

This Actor is paid per result. You are not charged for Apify platform usage; you pay only a fixed price for every 1,000 items in the Actor's output dataset.

Price per 1,000 items

$2.30
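For budgeting, the per-result price can be turned into a quick estimate. The sketch below assumes the cost scales linearly with the number of items and rounds to whole cents. The actual billing granularity and rounding are not stated here, so treat the function name and rounding rule as assumptions.

```python
from decimal import Decimal, ROUND_HALF_UP

# Listed price: $2.30 per 1,000 items.
PRICE_PER_1000 = Decimal("2.30")


def estimate_cost(result_count, price_per_1000=PRICE_PER_1000):
    """Estimate the cost in USD for a given number of output items."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    # Assumption: linear pro-rata pricing, rounded half-up to the cent.
    cost = price_per_1000 * Decimal(result_count) / Decimal(1000)
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    for count in (150, 1000, 2500):
        print(count, "items:", estimate_cost(count), "USD")
```

`Decimal` is used instead of `float` so that amounts such as 2.30 are represented exactly.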