Instagram Dataset Media Downloader
Pricing
from $1.00 / 1,000 media
Automatically downloads expiring media (videos and images) from Instagram Reel Scraper and Post Scraper datasets. Stores files in a named key-value store with full traceability back to the original reel or post.
Developer
Vít Tuhý
Maintained by Community
Instagram Reels and Posts Media Downloader
Downloads expiring media (videos and images) from Instagram Reel Scraper and Instagram Post Scraper datasets before the URLs expire.
Instagram CDN links typically expire within 7 days. This Actor grabs the media and stores it permanently in an Apify key-value store, with full traceability back to the original content.
How it works
- Reads a dataset by ID (from a Reel or Post Scraper run)
- Auto-detects whether the dataset came from the Reel Scraper or Post Scraper
- Downloads all media files (videos, images, carousel children)
- Stores each file in a named key-value store linked to the source dataset
- Creates a _manifest JSON record with metadata for every downloaded file
Source detection
| Source | Detected by | Fields downloaded |
|---|---|---|
| Instagram Reel Scraper | downloadedVideo field present | downloadedVideo (video) |
| Instagram Post Scraper | displayUrl field present | displayUrl (image), videoUrl (video, when present), plus all childPosts media for carousel/sidecar posts |
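The detection rule in the table can be sketched as a small function. This is a minimal illustration, not the Actor's actual code: `detect_source` is a hypothetical name, the `"post-scraper"` label is assumed by analogy with the `"reel-scraper"` value shown in the output summary, and the order of the two checks is an assumption.

```python
from typing import Any, Mapping, Optional


def detect_source(item: Mapping[str, Any]) -> Optional[str]:
    """Guess which scraper produced a dataset item, following the table above."""
    # Assumption: the Reel Scraper check runs first if both fields are present.
    if item.get("downloadedVideo"):
        return "reel-scraper"
    if item.get("displayUrl"):
        return "post-scraper"  # label assumed by analogy with "reel-scraper"
    return None  # neither marker field found


print(detect_source({"downloadedVideo": "https://example.com/v.mp4"}))
print(detect_source({"displayUrl": "https://example.com/i.jpg"}))
```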
File naming in the key-value store
Every file is keyed by the Instagram shortcode so you can trace it back to the original content:
| Source | Media | Key format | Example |
|---|---|---|---|
| Reel Scraper | Video | reel-{shortCode} | reel-CxYz123 |
| Post Scraper | Image | post-{shortCode}-image | post-AbCd456-image |
| Post Scraper | Video | post-{shortCode}-video | post-AbCd456-video |
| Post Scraper | Carousel child image | post-{childShortCode}-image | post-EfGh789-image |
| Post Scraper | Carousel child video | post-{childShortCode}-video | post-EfGh789-video |
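The key formats above are regular enough to express as one helper. The sketch below assumes the source labels `"reel-scraper"` and `"post-scraper"`; `build_key` is a hypothetical name used only for illustration.

```python
def build_key(source: str, short_code: str, media_type: str) -> str:
    """Return the key-value store key for one media file.

    source: "reel-scraper" or "post-scraper" (labels are assumptions)
    media_type: "image" or "video"
    """
    if source == "reel-scraper":
        return f"reel-{short_code}"  # reels are always a single video
    if media_type not in ("image", "video"):
        raise ValueError(f"unexpected media type: {media_type}")
    # Carousel children use their own shortcode, not the parent's.
    return f"post-{short_code}-{media_type}"


print(build_key("reel-scraper", "CxYz123", "video"))
print(build_key("post-scraper", "AbCd456", "image"))
```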
Manifest
A _manifest key is stored in the same key-value store. It contains a JSON array with metadata for every downloaded file:
[
  {
    "shortCode": "CxYz123",
    "originalId": "3210987654321",
    "ownerUsername": "natgeo",
    "caption": "Amazing wildlife footage...",
    "timestamp": "2026-06-20T12:00:00.000Z",
    "postType": "Video",
    "kvsKey": "reel-CxYz123",
    "mediaType": "video",
    "sourceUrl": "https://...",
    "contentType": "video/mp4",
    "sizeBytes": 5242880
  }
]
For carousel children, the manifest also includes parentShortCode and childIndex.
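As an illustration of how parentShortCode and childIndex might be used, the following sketch groups carousel children under their parent post in display order. The function name and the sample entries (including the shortcode `IjKl012`) are made up for the example; only the field names come from the manifest description.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_carousel_children(manifest: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Map each parent shortcode to its children's store keys, ordered by childIndex."""
    children = defaultdict(list)
    for entry in manifest:
        parent = entry.get("parentShortCode")
        if parent is not None:  # only carousel children carry this field
            children[parent].append(entry)
    return {
        parent: [e["kvsKey"] for e in sorted(entries, key=lambda e: e["childIndex"])]
        for parent, entries in children.items()
    }


# Made-up sample data: two carousel children and one standalone reel.
sample = [
    {"shortCode": "EfGh789", "kvsKey": "post-EfGh789-image",
     "parentShortCode": "AbCd456", "childIndex": 1},
    {"shortCode": "IjKl012", "kvsKey": "post-IjKl012-video",
     "parentShortCode": "AbCd456", "childIndex": 0},
    {"shortCode": "CxYz123", "kvsKey": "reel-CxYz123"},
]
print(group_carousel_children(sample))
```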
Input
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| datasetId | String | Yes | - | ID of the dataset to read media URLs from |
| keyValueStoreName | String | No | media-{datasetId} | Custom name for the output key-value store |
| concurrency | Integer | No | 5 | Number of parallel downloads (1-20) |
Example input
{
  "datasetId": "aXgbg1XUeBakPyWpT",
  "concurrency": 10
}
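To make the defaults in the input table concrete, here is a minimal sketch of how an input object could be normalized before use. `normalize_input` is a hypothetical helper, and rejecting an out-of-range concurrency (rather than clamping it) is an assumption; the Actor itself may behave differently.

```python
from typing import Any, Dict


def normalize_input(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Apply the documented defaults and range check to an Actor input dict."""
    dataset_id = raw.get("datasetId")
    if not isinstance(dataset_id, str) or not dataset_id:
        raise ValueError("datasetId is required")
    concurrency = raw.get("concurrency", 5)
    is_int = isinstance(concurrency, int) and not isinstance(concurrency, bool)
    if not is_int or not 1 <= concurrency <= 20:
        # Assumption: invalid values are rejected, not clamped.
        raise ValueError("concurrency must be an integer from 1 to 20")
    return {
        "datasetId": dataset_id,
        # Default store name follows the media-{datasetId} pattern.
        "keyValueStoreName": raw.get("keyValueStoreName") or f"media-{dataset_id}",
        "concurrency": concurrency,
    }


print(normalize_input({"datasetId": "aXgbg1XUeBakPyWpT", "concurrency": 10}))
```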
Output
The Actor pushes a summary object to its default dataset:
{
  "source": "reel-scraper",
  "sourceDatasetId": "aXgbg1XUeBakPyWpT",
  "keyValueStoreName": "media-aXgbg1XUeBakPyWpT",
  "totalItems": 17,
  "totalFiles": 10,
  "downloaded": 10,
  "failed": 0
}
All downloaded files are in the named key-value store. You can access them via:
- Apify Console: Storage > Key-value stores > media-{datasetId}
- API: https://api.apify.com/v2/key-value-stores/{storeId}/records/{key}
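A short sketch of fetching one file through the API endpoint above, using only the Python standard library. `STORE_ID` is a placeholder, the helper names are hypothetical, and sending the API token as a Bearer header assumes the store is private and that you have a valid token.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

API_BASE = "https://api.apify.com/v2"


def record_url(store_id: str, key: str) -> str:
    """Build the record URL from the template above, percent-encoding both parts."""
    store = quote(store_id, safe="~")
    return f"{API_BASE}/key-value-stores/{store}/records/{quote(key, safe='')}"


def fetch_record(store_id: str, key: str, token: str) -> bytes:
    """Download one record; the token is sent as a Bearer header (assumed setup)."""
    request = Request(
        record_url(store_id, key),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urlopen(request) as response:
        return response.read()


# Only builds the URL; calling fetch_record needs network access and a real token.
print(record_url("STORE_ID", "reel-CxYz123"))
```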
Integration setup
The primary use case is running this Actor automatically after a Reel or Post Scraper run finishes. Set this up using Apify's built-in integrations.
Step-by-step
- Open your Instagram Reel Scraper or Instagram Post Scraper Actor (or Task) in Apify Console
- Go to the Integrations tab
- Click Add integration > Run Actor
- Configure:
  - When: Run succeeded
  - Actor: Select instagram-media-downloader (this Actor)
  - Input: {"datasetId": "{{resource.defaultDatasetId}}"}
- Click Save
The {{resource.defaultDatasetId}} template variable is automatically replaced with the dataset ID from the scraper run that triggered the integration.
Setting up on a Task (recommended)
If you run the scraper via a saved Task (e.g., "Scrape NatGeo reels daily"):
- Open the Task in Console
- Go to the Integrations tab
- Follow the same steps as above
This way, every scheduled or manual run of that Task will automatically trigger the media download.
Custom key-value store name
If you want a descriptive store name instead of the dataset ID:
{
  "datasetId": "{{resource.defaultDatasetId}}",
  "keyValueStoreName": "natgeo-reels-{{resource.id}}"
}
{{resource.id}} is the run ID of the source scraper.
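To show the effect of the template variables, the sketch below substitutes them into the input by hand. It only illustrates the result; it is not how Apify implements the replacement, and `render`, `RUN_ID`, and the shape of the `run` dictionary are assumptions made for the example.

```python
import json
import re
from typing import Any, Dict

PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def render(template: str, context: Dict[str, Any]) -> str:
    """Replace {{dotted.path}} placeholders with values looked up in context."""
    def lookup(match: "re.Match[str]") -> str:
        value: Any = context
        for part in match.group(1).split("."):
            value = value[part]  # walk one level per dotted segment
        return str(value)

    return PLACEHOLDER.sub(lookup, template)


template = (
    '{"datasetId": "{{resource.defaultDatasetId}}", '
    '"keyValueStoreName": "natgeo-reels-{{resource.id}}"}'
)
# Hypothetical run object; RUN_ID stands in for a real run ID.
run = {"resource": {"id": "RUN_ID", "defaultDatasetId": "aXgbg1XUeBakPyWpT"}}
print(json.loads(render(template, run)))
```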
Data retention
Downloaded media is stored in a named key-value store. Named stores in Apify are retained indefinitely (they don't expire with the run). However, storage counts toward your Apify plan limits.
To manage storage:
- Delete old key-value stores via Console or API when you no longer need the media
- Use the manifest (_manifest key) to audit what's stored before cleanup
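One way to audit a store before cleanup is to total the file counts and sizes recorded in the manifest. This is a minimal sketch that assumes the manifest has already been loaded as a list of dictionaries; `summarize_manifest` is a hypothetical name and the second sample entry is made up.

```python
from typing import Any, Dict, List


def summarize_manifest(manifest: List[Dict[str, Any]]) -> Dict[str, Dict[str, int]]:
    """Count files and total bytes per mediaType."""
    summary: Dict[str, Dict[str, int]] = {}
    for entry in manifest:
        bucket = summary.setdefault(entry["mediaType"], {"files": 0, "bytes": 0})
        bucket["files"] += 1
        bucket["bytes"] += entry.get("sizeBytes", 0)  # tolerate a missing size
    return summary


manifest = [
    {"kvsKey": "reel-CxYz123", "mediaType": "video", "sizeBytes": 5242880},
    {"kvsKey": "post-AbCd456-image", "mediaType": "image", "sizeBytes": 204800},
]
print(summarize_manifest(manifest))
```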
Limitations
- URLs that have already expired will fail to download (the Actor logs a warning and continues)
- Very large datasets (1000+ items) may need higher memory allocation
- The Actor does not re-upload to external storage (S3, GCS, etc.) - files stay in Apify KVS
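The "log a warning and continue" behavior for expired URLs, combined with the concurrency setting, might look roughly like the sketch below. It is an illustration under assumptions, not the Actor's implementation: `download_all` and `fake_fetch` are hypothetical, and the fetch function is injected so the example runs without network access.

```python
import logging
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Tuple

log = logging.getLogger("media-downloader-sketch")


def download_all(
    jobs: Iterable[Tuple[str, str]],
    fetch: Callable[[str], bytes],
    concurrency: int = 5,
) -> Dict[str, int]:
    """Fetch every (key, url) job in parallel; count failures instead of raising."""
    def attempt(job: Tuple[str, str]) -> bool:
        key, url = job
        try:
            fetch(url)  # a real run would also store the bytes under `key`
            return True
        except Exception as error:  # e.g. an expired CDN link
            log.warning("Skipping %s: %s", key, error)
            return False

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(attempt, jobs))
    return {
        "totalFiles": len(results),
        "downloaded": results.count(True),
        "failed": results.count(False),
    }


def fake_fetch(url: str) -> bytes:
    """Stand-in for an HTTP GET: URLs containing 'expired' fail."""
    if "expired" in url:
        raise IOError("403 Forbidden")
    return b"data"


jobs = [("reel-A", "https://cdn.example/a"), ("reel-B", "https://cdn.example/expired")]
print(download_all(jobs, fake_fetch))
```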