Instagram Scraper

Under maintenance

Pricing

from $0.50 / 1,000 results

Standalone Instagram scraper for profile feeds, direct post/reel URLs, profile details, and visible comments. Uses Apify Proxy and supports lower-bandwidth scraping by blocking heavy media resources.

Rating

0.0

(0)

Developer

insomniac dev

Maintained by Community

Actor stats

0

Bookmarked

5

Total users

3

Monthly active users

14 days ago

Last modified

Instagram Standalone Post/Profile Scraper

Standalone Apify Actor for scraping Instagram profile feeds, direct post/reel links, profile details, and visible/latest comments.

This actor is designed to be compatible with the post/profile scraping workflow used in fabulate-dagster while avoiding the wrapper approach.

Key points

  • standalone scraper, not a wrapper around apify/instagram-scraper
  • uses Apify Proxy by default
  • blocks images, media, fonts, and stylesheets during page load to reduce bandwidth and proxy spend
  • supports profile URLs and direct post/reel URLs
  • uses an API-first profile feed path and only uses DOM comment extraction on direct post/reel pages
  • keeps output compatible with the fields used by the Dagster Instagram post pipeline
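
The split between profile URLs and direct post/reel URLs can be illustrated with a small classifier. This is a minimal sketch, not the actor's code: the function name `classifyInstagramUrl`, the returned object shape, and the set of post-like path prefixes are all assumptions made for illustration.

```javascript
// Hypothetical helper: classify an Instagram URL by the shape of its path.
// The prefixes below are an assumption, not taken from the actor's source.
const POST_PREFIXES = new Set(["p", "reel", "reels", "tv"]);

function classifyInstagramUrl(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return { kind: "invalid" };
  }

  // Accept instagram.com and its subdomains only.
  const host = url.hostname.toLowerCase();
  if (host !== "instagram.com" && !host.endsWith(".instagram.com")) {
    return { kind: "invalid" };
  }

  const segments = url.pathname.split("/").filter(Boolean);

  // e.g. /p/<shortcode>/ or /reel/<shortcode>/
  if (segments.length === 2 && POST_PREFIXES.has(segments[0])) {
    return { kind: "post", shortcode: segments[1] };
  }

  // e.g. /<username>/
  // Note: reserved paths such as /explore/ would also land here and
  // would need extra handling in a real implementation.
  if (segments.length === 1) {
    return { kind: "profile", username: segments[0] };
  }

  return { kind: "unsupported" };
}
```

With this sketch, `https://www.instagram.com/instagram/` is classified as a profile, and a `/reel/<shortcode>/` URL as a post.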

Supported input modes

Currently supported:

  • resultsType: "posts"
  • resultsType: "details"
  • resultsType: "comments"
  • resultsType: "reels"
  • directUrls

Currently not supported in standalone mode:

  • search-only scraping
  • mentions
  • stories
  • hashtag/place discovery

If no directUrls are provided, the actor fails with a clear error.
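
The rules above can be sketched as an input check. This is an illustrative sketch only: `validateInput` and its error messages are hypothetical, and treating a missing `resultsType` as an error is an assumption, since this README does not state a default.

```javascript
// Values taken from the "Currently supported" list above.
const SUPPORTED_RESULTS_TYPES = ["posts", "details", "comments", "reels"];

// Hypothetical validator; the actor's real checks and messages may differ.
function validateInput(input) {
  const { resultsType, directUrls } = input ?? {};

  // Assumption: a missing resultsType is rejected rather than defaulted.
  if (!SUPPORTED_RESULTS_TYPES.includes(resultsType)) {
    throw new Error(
      `Unsupported resultsType: ${JSON.stringify(resultsType)}. ` +
        `Supported values: ${SUPPORTED_RESULTS_TYPES.join(", ")}.`
    );
  }

  // The README states that the actor fails when no directUrls are given.
  if (!Array.isArray(directUrls) || directUrls.length === 0) {
    throw new Error("At least one entry in directUrls is required.");
  }

  return { resultsType, directUrls };
}
```

Validating before any browser or proxy session starts keeps a misconfigured run from spending proxy traffic.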


Input example

{
  "resultsType": "posts",
  "directUrls": [
    "https://www.instagram.com/instagram/"
  ],
  "resultsLimit": 3,
  "addParentData": true,
  "skipPinnedPosts": true,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}

Proxy behavior

If you do not pass proxyConfiguration, the actor defaults to:

{
  "useApifyProxy": true,
  "apifyProxyGroups": ["RESIDENTIAL"]
}

Residential proxies are recommended for Instagram on Apify cloud, because datacenter IPs are more likely to be redirected to the Instagram login page.
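
The fallback described above can be sketched as follows. The helper name `resolveProxyConfiguration` is hypothetical, and the sketch assumes that a supplied proxyConfiguration is used as-is, without merging in the residential group.

```javascript
// Default taken from the README; frozen so callers cannot mutate it.
const DEFAULT_PROXY_CONFIGURATION = Object.freeze({
  useApifyProxy: true,
  apifyProxyGroups: Object.freeze(["RESIDENTIAL"]),
});

// Hypothetical helper: pick the proxy settings for a run.
function resolveProxyConfiguration(input) {
  const supplied = input?.proxyConfiguration;

  if (supplied === undefined || supplied === null) {
    // Return a fresh copy so each run gets its own mutable object.
    return {
      useApifyProxy: DEFAULT_PROXY_CONFIGURATION.useApifyProxy,
      apifyProxyGroups: [...DEFAULT_PROXY_CONFIGURATION.apifyProxyGroups],
    };
  }

  // Assumption: an explicit configuration is respected unchanged.
  return supplied;
}
```

Under this assumption, an input that passes only `"useApifyProxy": true` would not pick up the RESIDENTIAL group automatically, so it may be worth setting `apifyProxyGroups` explicitly when overriding the default.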


Output

The actor emits Instagram-like result objects for:

  • profile feed posts
  • direct post/reel URLs
  • profile detail objects
  • comment objects

For profile feeds, latestComments comes from timeline/preview data when available. For direct post/reel URLs and resultsType: "comments", the actor extracts only the comments that are visibly rendered on the page.

Reference schemas are documented here:

  • docs/schemas/instagram-output.schema.json
  • docs/schemas/instagram-post.schema.json
  • docs/schemas/instagram-comment.schema.json
  • docs/schemas/instagram-profile.schema.json
  • docs/schemas/instagram-hashtag.schema.json
  • docs/schemas/instagram-place.schema.json

These schema files are documentation aids. The actual actor dataset remains permissive.


Why page loading is cheaper

This standalone actor blocks these resource types during navigation:

  • images
  • media
  • fonts
  • stylesheets

It also blocks common Instagram CDN media hosts.

That means page HTML and JavaScript still load, but heavy media payloads do not.
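
The blocking decision can be sketched as a pure predicate. This is a sketch under stated assumptions: `shouldBlockRequest` is a hypothetical name, the resource type strings follow the naming used by browser automation libraries such as Playwright, and the CDN host suffixes are assumed for illustration because this README does not list them.

```javascript
// Resource types named in the README, in the singular form that
// Playwright-style request objects report.
const BLOCKED_RESOURCE_TYPES = new Set(["image", "media", "font", "stylesheet"]);

// Assumed host suffixes; the actor's real list is not given in this README.
const BLOCKED_HOST_SUFFIXES = ["cdninstagram.com", "fbcdn.net"];

// Hypothetical predicate: should this request be aborted?
function shouldBlockRequest(resourceType, requestUrl) {
  if (BLOCKED_RESOURCE_TYPES.has(resourceType)) {
    return true;
  }

  let hostname;
  try {
    hostname = new URL(requestUrl).hostname.toLowerCase();
  } catch {
    // Unparseable URLs are left alone rather than blocked.
    return false;
  }

  // Match the suffix itself or any subdomain of it.
  return BLOCKED_HOST_SUFFIXES.some(
    (suffix) => hostname === suffix || hostname.endsWith(`.${suffix}`)
  );
}
```

In a Playwright-based crawler, a predicate like this could be called from a `page.route` handler that aborts the request when it returns true and continues it otherwise. Whether the actor is built on Playwright is not stated here.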


Local run

npm install
apify run --input-file=input.json

Dagster compatibility scope

This actor is intended to replace the post/profile scraping use of apify/instagram-scraper in fabulate_dagster/ops/scrape_insta.py.

It is not intended to replace the separate Instagram story actor.


Testing

See:

  • docs/TESTING.md

Technical details

See:

  • docs/IMPLEMENTATION.md

License

ISC