Instagram Tracker
Pricing
from $0.01 / 1,000 results
This actor collects public Instagram content with a browser-based approach that mirrors real user behaviour. It works with profiles, posts, reels, and stories, and returns a consistent JSON structure you can feed into automation, analytics, or reporting workflows.
Developer

vivid travelogue
6 days ago
Last modified
The Instagram Tracker is a powerful and reliable Instagram scraper built as an Apify Actor. It automates the collection of public Instagram data, including posts, reels, stories, and profile insights. Designed for accuracy and long-term reliability, it uses real browser automation to reduce blocking and adapt to Instagram’s changing structure.
This actor uses a Business Account, which makes it suitable for use cases that require authenticated access. It supports multiple fallback scraping modes by combining official web endpoints, mobile API endpoints, and JSON structures embedded inside Instagram pages. This helps maintain stable output even when Instagram adjusts its API or HTML.
You can customize how many posts, reels, and stories to collect per profile. Each media item is normalized into a clean, consistent dataset with the essential fields: media ID, shortcode, URL, thumbnail, video source, caption, timestamps, and more. The original Instagram payload is preserved inside a raw object for advanced analysis.
Optional raw filtering lets you target items with specific characteristics such as video duration, play count, audio presence, engagement metrics, or any available raw field. You can also enable media downloading to save images and videos directly to the Key-Value Store.
The actor exports results as JSON and CSV and pushes each item to the Apify Dataset, making it easy to integrate with automation tools, dashboards, or machine-learning workflows.
Ideal for social media analytics, competitor research, influencer monitoring, content archiving, and Instagram automation, the Instagram Tracker is a scalable way to extract structured Instagram data.
Use it to streamline research, power content strategies, or build automated social intelligence systems.
Features
- Scrape Instagram posts, profiles, tags, and places.
- Extract metadata from og:* meta tags and embedded JSON (__NEXT_DATA__, ld+json).
- Optionally download media to the Key-Value store ({outputFileName}/media/...).
- Export results as JSON and/or CSV and push items to the dataset.
Input (high-level)
Configure runtime options in storage\key_value_stores\default\INPUT.json. Example values below are taken from the repository's sample INPUT.json.
- startUrls (array): array of Instagram URLs to crawl. Default: [].
- usernames (array): list of Instagram usernames to explicitly fetch (e.g. ["nick_saraev"]). Default: [].
- queries (array): optional array of search queries (unused by default). Default: [].
- maxItems (integer): maximum number of total items to collect. Sample: 25.
- numberOfPosts (integer): per-user number of posts to fetch. Sample: 3.
- numberOfReels (integer): per-user number of reels to fetch. Sample: 3.
- numberOfStories (integer): per-user number of stories to fetch. Sample: 3.
- downloadMedia (boolean): whether to download media into the Key-Value store. Sample: true.
- mediaTypes (string): which media to download: image, video, or both. Sample: both.
- useApifyProxy (boolean): use Apify Proxy for requests (recommended when scraping at scale). Sample: false.
- apifyProxyCountry (string): two-letter country code for Apify Proxy routing (when enabled). Sample: US.
- maxConcurrency (integer): Playwright concurrency/workers. Sample: 2.
- requestTimeoutSecs (integer): per-request timeout in seconds. Sample: 60.
- outputFormat (string): which outputs to write: json, csv, or both. Sample: both.
- outputFileName (string): base filename/key for outputs. Sample: instagram-output.
- rawFilter (array or object): optional filter applied to the raw node data.
  - The actor supports two filter shapes:
    - Object map: { "path.to.field": <value>, ... } (strings may be prefixed with re: for regex or contains: for substring).
    - Array of rules: [{ "path": "path.to.field", "op": "eq|contains|re|gt|lt|exists", "value": ... }, ...].
  - The sample INPUT.json in this repo uses a simple array-of-keys/paths (e.g. ["has_audio","video_duration","play_count",...]) meaning "only include items that contain these raw keys".
When present, rawFilter is applied to each item's raw object and only matching items are included in the final dataset and exported outputs.
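The filter shapes described above can be illustrated with a short Python sketch. This is not the actor's own code: the function names (`normalize`, `rule_matches`, `matches`) are hypothetical, and combining rules with AND is an assumption, since the text only says that matching items are kept. Paths here are plain dot paths relative to the raw object.

```python
import re
from typing import Any

_MISSING = object()  # sentinel: tells "key absent" apart from a stored None


def get_path(obj: Any, path: str) -> Any:
    """Walk a plain dot path such as 'caption.text'; return _MISSING if absent."""
    for key in path.split("."):
        if not isinstance(obj, dict) or key not in obj:
            return _MISSING
        obj = obj[key]
    return obj


def normalize(raw_filter: Any) -> list:
    """Turn an object map, an array of rules, or an array of keys into rule dicts."""
    if not raw_filter:
        return []
    if isinstance(raw_filter, dict):  # object map shape
        rules = []
        for path, value in raw_filter.items():
            if isinstance(value, str) and value.startswith("re:"):
                rules.append({"path": path, "op": "re", "value": value[3:]})
            elif isinstance(value, str) and value.startswith("contains:"):
                rules.append({"path": path, "op": "contains", "value": value[9:]})
            else:
                rules.append({"path": path, "op": "eq", "value": value})
        return rules
    # array shape: bare strings mean "key must exist", dicts are already rules
    return [{"path": r, "op": "exists"} if isinstance(r, str) else r for r in raw_filter]


def _is_number(value: Any) -> bool:
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def rule_matches(raw: dict, rule: dict) -> bool:
    actual = get_path(raw, rule["path"])
    op, expected = rule.get("op", "eq"), rule.get("value")
    if op == "exists":
        return actual is not _MISSING
    if actual is _MISSING:
        return False
    if op == "eq":
        return actual == expected
    if op == "contains":
        return str(expected) in str(actual)
    if op == "re":
        return re.search(str(expected), str(actual)) is not None
    if op in ("gt", "lt"):
        if not (_is_number(actual) and _is_number(expected)):
            return False
        return actual > expected if op == "gt" else actual < expected
    raise ValueError(f"unsupported op: {op}")


def matches(raw: dict, raw_filter: Any) -> bool:
    """Assumption: an item is kept only if every rule matches (logical AND)."""
    return all(rule_matches(raw, rule) for rule in normalize(raw_filter))
```

For example, `matches(item["raw"], [{"path": "play_count", "op": "gt", "value": 1000}])` would keep only items whose raw payload reports more than 1000 plays, under the assumptions above.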
See .actor/input_schema.json for the authoritative schema used by the Apify Console.
Output
- JSON: saved to the Key-Value store as {outputFileName}.json.
- CSV: saved as {outputFileName}.csv when requested.
- Media: saved under the Key-Value store prefix {outputFileName}/media/ when downloadMedia is enabled.
Normalized item fields
Each dataset item contains a small set of normalized top-level fields plus the original platform payload under raw.
- id: internal item id (usually {mediaId}_{ownerId})
- shortcode: Instagram shortcode (post code)
- url: canonical web URL for the post
- is_video: boolean flag set when the post contains video
- display_url: preferred image thumbnail URL
- video_url: best-effort direct MP4 URL (when available)
- sourceTab: source type (e.g. reels, profile)
- sourceProfile: source username
- crawledAt: ISO timestamp of when the item was crawled
- raw: full original JSON object returned by Instagram endpoints (very large, nested)
Common raw paths (useful with rawFilter)
Use dot paths into the raw object for rawFilter rules. Common keys present in dataset examples:
- raw.id: internal media id string
- raw.pk: numeric PK for the media
- raw.fbid: platform FB id (when present)
- raw.code: shortcode string
- raw.is_dash_eligible: numeric/video eligibility flag
- raw.media_type: numeric media type (1=image, 2=video, ...)
- raw.is_video: video flag (sometimes provided at normalized level; check raw.media_type)
- raw.play_count / raw.ig_play_count: view counts for videos
- raw.video_duration: video duration in seconds
- raw.number_of_qualities: available quality count
- raw.taken_at: Unix timestamp when media was taken
- raw.like_count: number of likes
- raw.comment_count: comment count
- raw.caption.text: caption text
- raw.caption.user.username: caption author username
- raw.user.username: owner's username
- raw.user.pk / raw.user.id: owner id fields
- raw.owner.username: owner username (duplicate of user in many responses)
- raw.image_versions2.candidates[].url: candidate image URLs (array)
- raw.image_versions2.additional_candidates.first_frame.url: first-frame preview URL
- raw.video_versions[].url: array of direct video URLs (may be multiple)
- raw.video_dash_manifest: DASH MPD XML (string)
- raw.original_sound_info.dash_manifest: audio DASH MPD XML
- raw.scrubber_spritesheet_info_candidates.default.sprite_urls[]: thumbnail sprite URLs
Notes:
- Many fields are nested and arrays use canonical names like candidates[] or video_versions[]. Use bracket notation for index-specific checks in rawFilter (e.g. image_versions2.candidates[0].url).
- Not all responses contain every field. Prefer checking for existence (or using op: "exists") when building filters.
- The raw object is large and can include platform-specific manifests, metrics, and nested user objects. The rawFilter engine supports equality, contains, regex, and basic numeric ops.
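To make the bracket notation in these notes concrete, here is a minimal Python sketch of a path resolver. It is an illustration only: `resolve` is a hypothetical helper, and treating `[]` as "every element" and `[0]` as "one index" is an assumption about how such paths could be read, not a description of the actor's engine. Paths are written relative to the raw object, as in the example above.

```python
import re
from typing import Any, List

# A path token is either a key name or a bracket step: "[0]" (index) or "[]" (all elements).
_TOKEN = re.compile(r"([^.\[\]]+)|\[(\d*)\]")


def resolve(obj: Any, path: str) -> List[Any]:
    """Return every value reachable at `path`; an empty list means "not present"."""
    current = [obj]
    for match in _TOKEN.finditer(path):
        key, index = match.group(1), match.group(2)
        following = []
        for node in current:
            if key is not None:
                # Key step: only dicts that actually contain the key contribute.
                if isinstance(node, dict) and key in node:
                    following.append(node[key])
            elif isinstance(node, list):
                if index == "":
                    following.extend(node)  # "[]" fans out over all elements
                elif int(index) < len(node):
                    following.append(node[int(index)])  # "[n]" picks one element
        current = following
    return current
```

Because a missing field simply yields an empty list, an existence check reduces to `bool(resolve(raw, path))`, which fits the advice to test for presence before comparing values.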
Notes
- Instagram actively blocks automated access. Use low concurrency and consider using Apify Proxy for reliable runs.
- This actor is intended as a general-purpose starter; deep pagination of comments or API-level scraping is out of scope in this initial version.
Example Input
{
  "maxItems": 5,
  "downloadMedia": true,
  "mediaTypes": "both",
  "useApifyProxy": false,
  "apifyProxyCountry": "US",
  "maxConcurrency": 2,
  "requestTimeoutSecs": 60,
  "outputFormat": "both",
  "outputFileName": "instagram-output",
  "startUrls": [],
  "queries": [],
  "numberOfPosts": 3,
  "numberOfReels": 3,
  "numberOfStories": 3,
  "usernames": ["nick_saraev"],
  "rawFilter": ["has_audio", "video_duration", "play_count", "has_liked", "commerciality_status", "like_count", "comment_count"]
}
Configuration Summary
This section documents the Instagram actor's configuration options (sample defaults taken from storage\key_value_stores\default\INPUT.json). These are the primary inputs you can set when running the Actor locally or in the Apify Console.
| Field | Description |
|---|---|
| startUrls | Array of Instagram URLs to crawl (e.g. profile/post/tag pages). Default: [] |
| usernames | Array of Instagram usernames to fetch directly (e.g. ["nick_saraev"]). Default: [] |
| queries | Optional free-text queries (not used by default). Default: [] |
| maxItems | Maximum number of total items to collect (across users). Sample: 5 |
| numberOfPosts | Per-user number of posts to fetch. Sample: 3 |
| numberOfReels | Per-user number of reels to fetch. Sample: 3 |
| numberOfStories | Per-user number of stories to fetch. Sample: 3 |
| downloadMedia | When true, downloads media into the Key-Value store under {outputFileName}/media/. Sample: true |
| mediaTypes | Which media to download: image, video, or both. Sample: both |
| useApifyProxy | When true, use Apify Proxy for requests (recommended for large runs). Sample: false |
| apifyProxyCountry | Two-letter country code for Apify Proxy routing (when enabled). Sample: US |
| maxConcurrency | Playwright concurrency/workers. Sample: 2 |
| requestTimeoutSecs | Per-request timeout in seconds. Sample: 60 |
| outputFormat | Which outputs to write: json, csv, or both. Sample: both |
| outputFileName | Base filename/key for outputs. JSON and CSV are written using this base (sample: instagram-output) |
| rawFilter | Optional filter applied to the raw node data. Can be an object map or an array of rules (see README rawFilter docs). Sample: array of keys ["has_audio","video_duration",...] |
Notes:
- Prefer low maxConcurrency and useApifyProxy: true for larger or long-running runs to reduce blocking.
- rawFilter can be used to include only items that match provided raw-field checks (exists/equality/contains/regex/numeric ops).
- See .actor/input_schema.json for the Console schema and field validations.
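Before starting a run, it can help to check an input object against the field types and allowed values listed in the table. The Python sketch below is a hypothetical pre-flight check (`validate_input` is not part of the actor), and it only encodes what this section states; .actor/input_schema.json remains the authoritative source and may enforce more.

```python
from typing import Any, Dict, List

# Field types and enumerations as listed in the configuration table above.
TYPES = {
    "startUrls": list, "usernames": list, "queries": list,
    "maxItems": int, "numberOfPosts": int, "numberOfReels": int, "numberOfStories": int,
    "downloadMedia": bool, "mediaTypes": str,
    "useApifyProxy": bool, "apifyProxyCountry": str,
    "maxConcurrency": int, "requestTimeoutSecs": int,
    "outputFormat": str, "outputFileName": str,
    "rawFilter": (list, dict),
}
ENUMS = {
    "mediaTypes": {"image", "video", "both"},
    "outputFormat": {"json", "csv", "both"},
}


def validate_input(cfg: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    for key, value in cfg.items():
        expected = TYPES.get(key)
        if expected is None:
            problems.append(f"{key}: not a field documented in this section")
        elif expected is int and (isinstance(value, bool) or not isinstance(value, int)):
            # bool is a subclass of int in Python, so reject it explicitly
            problems.append(f"{key}: expected an integer")
        elif not isinstance(value, expected):
            problems.append(f"{key}: unexpected type {type(value).__name__}")
        elif key in ENUMS and value not in ENUMS[key]:
            problems.append(f"{key}: must be one of {sorted(ENUMS[key])}")
        elif key == "apifyProxyCountry" and not (len(value) == 2 and value.isalpha()):
            problems.append(f"{key}: expected a two-letter country code")
    return problems
```

Running it on the example input shown earlier should report no problems, while a value such as "mediaTypes": "audio" would be flagged.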
Output
- Dataset (OUTPUT): structured JSON objects
- output.csv: exported CSV
Each dataset object includes the normalized Instagram fields plus the original platform payload under raw. Example item:
{
  "id": "1234567890",
  "shortcode": "Ck1aB2XhYz",
  "url": "https://www.instagram.com/p/Ck1aB2XhYz/",
  "is_video": true,
  "display_url": "https://scontent.cdninstagram.com/v/t51.2885-15/......jpg",
  "video_url": "https://r2---sn-xxx.googlevideo.com/videoplayback?....mp4",
  "sourceTab": "posts",
  "sourceProfile": "nick_saraev",
  "raw": {
    "id": "1234567890_0987654321",
    "pk": 1234567890,
    "code": "Ck1aB2XhYz",
    "is_video": 1,
    "media_type": 2,
    "video_versions": [{ "url": "https://...mp4" }],
    "image_versions2": { "candidates": [{ "url": "https://...jpg" }] },
    "play_count": 45231,
    "video_duration": 38.5,
    "like_count": 1024,
    "comment_count": 12,
    "caption": { "text": "Check out our new product!", "user": { "username": "brand_xyz" } },
    "user": { "username": "nick_saraev", "pk": 987654321 }
  },
  "crawledAt": "2025-11-26T07:32:47.389Z"
}
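Since each item mixes normalized fields with a nested raw payload, downstream tools often want a flat view. The Python sketch below shows one way to flatten items into CSV rows. The column selection and the helper names (`to_row`, `items_to_csv`) are assumptions chosen for illustration; the actor's own CSV export may use a different layout.

```python
import csv
import io
from typing import Any, Dict, Iterable

# Hypothetical column choice: normalized fields first, then a few raw metrics.
COLUMNS = [
    "id", "shortcode", "url", "is_video", "sourceProfile", "crawledAt",
    "like_count", "comment_count", "caption",
]


def to_row(item: Dict[str, Any]) -> Dict[str, Any]:
    """Pick normalized fields from the item and selected metrics from item['raw']."""
    raw = item.get("raw") or {}
    caption = raw.get("caption") or {}
    return {
        "id": item.get("id"),
        "shortcode": item.get("shortcode"),
        "url": item.get("url"),
        "is_video": item.get("is_video"),
        "sourceProfile": item.get("sourceProfile"),
        "crawledAt": item.get("crawledAt"),
        "like_count": raw.get("like_count"),      # may be absent in some responses
        "comment_count": raw.get("comment_count"),
        "caption": caption.get("text"),
    }


def items_to_csv(items: Iterable[Dict[str, Any]]) -> str:
    """Serialize items to CSV text; missing values become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for item in items:
        writer.writerow(to_row(item))
    return buffer.getvalue()
```

Using `.get()` throughout keeps the flattening tolerant of the missing fields that the notes on raw paths warn about.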
Proxy Notes
Instagram frequently blocks bots. For stable scraping:
```json
{
  "useApifyProxy": true,
  "apifyProxyGroups": ["RESIDENTIAL"]
}
```
Set token locally:
$ export APIFY_TOKEN="your-token"
License
No license specified.