Instagram Hashtag Scraper

apify/instagram-hashtag-scraper

Scrape Instagram hashtag data. Just add one or more hashtags and extract posts, images, URLs, comments, likes, users, locations, timestamps, and more. Export scraped datasets, run the scraper via API, schedule and monitor runs, or integrate with other tools.

Scraper finishes early.

Closed

congruent_spider opened this issue
5 months ago

Even with an exact number of requested posts set for each hashtag, the scraper finishes early and returns only a small number of posts. Yesterday there was no such problem: a request for 10k returned 10k. Today a run can stop at 2k (with no errors, and it can't be resurrected), while more than a million posts exist on Instagram.

Hi! Thanks for the feedback. We don't know Instagram's conditions for maximum post availability; however, we know that they use antibot AI, so some hashtags may be less available at times. We will check your case in more detail and keep you informed about progress.

For now, try scraping a single hashtag per run. With 3 hashtags in the same run, posts might simply be filtered out: if a given post is tagged with both #A and #B, the scraper will save it only once.
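
To illustrate why a multi-hashtag run can return fewer records than the sum of its per-hashtag limits, here is a minimal Python sketch of the deduplication described above. It is not the Actor's code: the function name `merge_unique_posts` and the `id` field are assumptions made for the example.

```python
from typing import Dict, Iterable, List


def merge_unique_posts(posts_by_hashtag: Dict[str, Iterable[dict]]) -> List[dict]:
    """Merge per-hashtag results, keeping each post only once.

    Posts are keyed by a hypothetical "id" field; the real dataset
    may use a different identifier.
    """
    seen = set()
    merged = []
    for posts in posts_by_hashtag.values():
        for post in posts:
            if post["id"] in seen:
                continue  # already saved under another hashtag
            seen.add(post["id"])
            merged.append(post)
    return merged


# Post "p2" is tagged with both #A and #B, so it is saved only once.
posts_by_hashtag = {
    "A": [{"id": "p1"}, {"id": "p2"}],
    "B": [{"id": "p2"}, {"id": "p3"}],
}
merged = merge_unique_posts(posts_by_hashtag)
print(len(merged))
```

Four posts go in across two hashtags, but only three distinct posts come out, which is the effect described for runs with overlapping hashtags.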

congruent_spider

5 months ago

Hi! Thanks for the response! Yes, I tried scraping a single hashtag per run (even before this multiple-hashtag run), with the same results. For instance, in run omennOT2YAuA1RJT1 I set a limit of 10k and got only 2075 records (more than 1 million posts exist), and run eZbTrb3lYqey5Dh8Y, with the same 10k limit, returned 3071 results. The same goes for VMXilgHqYin6LtRo1, 1yL3yWLhrVsqmBNrB, and sIz1cUlZp5EtINGsa.

Hi, thanks for the runs, we will check them out specifically and let you know!

This is an intermittent issue: a false-positive end of results per hashtag happens because of either the proxy itself or antibot AI blocking. It is solved by a triple re-check for the end of results. Sample run: https://console.apify.com/view/runs/MrP80GWz5GN3W3UJs The sample hashtag https://www.instagram.com/explore/tags/rayongsport has 142 posts by its counter and 106 available without login. In other words, this hashtag was used to quickly check the real end of results right from the second scroll page. We added logging to monitor EndOfResults as follows:

2023-12-11T04:39:10.947Z INFO  CheerioCrawler: [RECHECK]: rayongsport end of results {"has_next_page":false,"end_cursor":null}
2023-12-11T04:39:12.981Z INFO  CheerioCrawler: [RECHECK]: rayongsport end of results {"has_next_page":false,"end_cursor":null}
2023-12-11T04:39:14.088Z INFO  CheerioCrawler: [RECHECK]: rayongsport end of results {"has_next_page":false,"end_cursor":null}
2023-12-11T04:39:18.501Z INFO  CheerioCrawler: [END-OF-RESULTS]: rayongsport confirmed end of results at pos 106
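
For anyone who wants to check their own run logs for these markers, the following Python sketch counts re-checks and extracts the confirmed end position per hashtag. The regular expressions are inferred only from the four sample lines above, so the exact log format is an assumption, and `summarize_end_of_results` is a hypothetical helper name.

```python
import re

# Patterns inferred from the sample log lines; the real format may vary.
RECHECK = re.compile(r"\[RECHECK\]: (\S+) end of results")
CONFIRMED = re.compile(
    r"\[END-OF-RESULTS\]: (\S+) confirmed end of results at pos (\d+)"
)


def summarize_end_of_results(log_lines):
    """Return {hashtag: {"rechecks": int, "confirmed_at": int or None}}."""
    summary = {}
    for line in log_lines:
        match = RECHECK.search(line)
        if match:
            entry = summary.setdefault(
                match.group(1), {"rechecks": 0, "confirmed_at": None}
            )
            entry["rechecks"] += 1
            continue
        match = CONFIRMED.search(line)
        if match:
            entry = summary.setdefault(
                match.group(1), {"rechecks": 0, "confirmed_at": None}
            )
            entry["confirmed_at"] = int(match.group(2))
    return summary


sample_log = [
    '2023-12-11T04:39:10.947Z INFO  CheerioCrawler: [RECHECK]: rayongsport end of results {"has_next_page":false,"end_cursor":null}',
    '2023-12-11T04:39:12.981Z INFO  CheerioCrawler: [RECHECK]: rayongsport end of results {"has_next_page":false,"end_cursor":null}',
    '2023-12-11T04:39:14.088Z INFO  CheerioCrawler: [RECHECK]: rayongsport end of results {"has_next_page":false,"end_cursor":null}',
    "2023-12-11T04:39:18.501Z INFO  CheerioCrawler: [END-OF-RESULTS]: rayongsport confirmed end of results at pos 106",
]
summary = summarize_end_of_results(sample_log)
print(summary)
```

A hashtag that shows a confirmed position far below the requested limit, on a tag known to have many more posts, would be a candidate for reporting to the maintainers.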

If blocking per hashtag happens again, the logs will give us leads for further investigation.
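
The thread does not show how the triple re-check is implemented, so the following Python sketch is only one plausible reading of it: when a page reports no next page, the same cursor is requested again up to three times before the end of results is accepted. The `fetch_page` callable, the page dictionary shape, and the function name `scrape_with_recheck` are all assumptions made for illustration.

```python
def scrape_with_recheck(fetch_page, limit, rechecks=3):
    """Collect posts page by page, re-checking any "end of results" signal.

    fetch_page(cursor) is a hypothetical callable returning a dict with
    "posts", "has_next_page", and "end_cursor" keys.
    """
    posts = []
    cursor = None
    while len(posts) < limit:
        page = fetch_page(cursor)
        attempts = 0
        # A "no next page" answer may be a false positive caused by the
        # proxy or by blocking, so ask again for the same cursor.
        while not page["has_next_page"] and attempts < rechecks:
            attempts += 1
            page = fetch_page(cursor)
        posts.extend(page["posts"])
        if not page["has_next_page"]:
            break  # end of results confirmed after all re-checks
        cursor = page["end_cursor"]
    return posts[:limit]


def make_flaky_fetcher():
    """Fake fetcher whose second page reports a false end on first request."""
    calls = {"c1": 0}

    def fetch_page(cursor):
        if cursor is None:
            return {"posts": [1, 2], "has_next_page": True, "end_cursor": "c1"}
        if cursor == "c1":
            calls["c1"] += 1
            if calls["c1"] == 1:
                return {"posts": [], "has_next_page": False, "end_cursor": None}
            return {"posts": [3, 4], "has_next_page": True, "end_cursor": "c2"}
        return {"posts": [5], "has_next_page": False, "end_cursor": None}

    return fetch_page


result = scrape_with_recheck(make_flaky_fetcher(), limit=10)
print(result)
```

Without the inner re-check loop, the fake fetcher above would stop the run after the first two posts, which mirrors the early finish reported in this issue. The cost of this approach is a few extra requests at the true end of every hashtag.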

congruent_spider

5 months ago

Thanks!

I'm going to close the issue now, but if there is anything else we can help with, please let us know.

Developer
Maintained by Apify
Actor metrics
  • 1.5k monthly users
  • 99.6% runs succeeded
  • 1.1 days response time
  • Created in Nov 2021
  • Modified 2 days ago