Instagram Dataset Media Downloader

Pricing

from $1.00 / 1,000 media

Automatically downloads expiring media (videos and images) from Instagram Reel Scraper and Post Scraper datasets. Stores files in a named key-value store with full traceability back to the original reel or post.

Developer

Vít Tuhý

Maintained by Community

Instagram Reels and Posts Media Downloader

Downloads expiring media (videos and images) from Instagram Reel Scraper and Instagram Post Scraper datasets before the URLs expire.

Instagram CDN links typically expire within 7 days. This Actor grabs the media and stores it permanently in an Apify key-value store, with full traceability back to the original content.

How it works

  1. Reads a dataset by ID (from a Reel or Post Scraper run)
  2. Auto-detects whether the dataset came from the Reel Scraper or Post Scraper
  3. Downloads all media files (videos, images, carousel children)
  4. Stores each file in a named key-value store linked to the source dataset
  5. Creates a _manifest JSON with metadata for every downloaded file

Source detection

| Source | Detected by | Fields downloaded |
| --- | --- | --- |
| Instagram Reel Scraper | downloadedVideo field present | downloadedVideo (video) |
| Instagram Post Scraper | displayUrl field present | displayUrl (image), videoUrl (video, when present), plus all childPosts media for carousel/sidecar posts |
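
The detection rule can be sketched in a few lines of Python. This is only an illustration of the table above, not the Actor's actual code: the function names, the "post-scraper" label (chosen by analogy with the "reel-scraper" value in the output summary), the order of the two checks, and the assumption that each childPosts entry carries its own displayUrl/videoUrl fields are all hypothetical.

```python
def detect_source(item):
    """Guess which scraper produced a dataset item.

    Assumption: "present" is treated as "has a non-empty value", and the
    Reel Scraper check runs first.
    """
    if item.get("downloadedVideo"):
        return "reel-scraper"
    if item.get("displayUrl"):
        return "post-scraper"  # hypothetical label
    return None


def collect_media(item):
    """Return (media_type, url) pairs for one dataset item."""
    source = detect_source(item)
    if source == "reel-scraper":
        return [("video", item["downloadedVideo"])]
    if source == "post-scraper":
        media = [("image", item["displayUrl"])]
        if item.get("videoUrl"):
            media.append(("video", item["videoUrl"]))
        # Assumption: carousel children expose the same two fields.
        for child in item.get("childPosts") or []:
            if child.get("displayUrl"):
                media.append(("image", child["displayUrl"]))
            if child.get("videoUrl"):
                media.append(("video", child["videoUrl"]))
        return media
    return []
```

An item matching neither rule yields an empty list here; how the Actor itself treats such items is not stated.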

File naming in the key-value store

Every file is keyed by the Instagram shortcode so you can trace it back to the original content:

| Source | Media | Key format | Example |
| --- | --- | --- | --- |
| Reel Scraper | Video | reel-{shortCode} | reel-CxYz123 |
| Post Scraper | Image | post-{shortCode}-image | post-AbCd456-image |
| Post Scraper | Video | post-{shortCode}-video | post-AbCd456-video |
| Post Scraper | Carousel child image | post-{childShortCode}-image | post-EfGh789-image |
| Post Scraper | Carousel child video | post-{childShortCode}-video | post-EfGh789-video |
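
If you need to build or parse these keys in your own tooling, a small helper along the following lines may be enough. The function names are hypothetical, and the parser assumes only the key formats listed in the table. Because Instagram shortcodes can themselves contain hyphens, it strips the known prefix and suffix rather than splitting on "-".

```python
def kvs_key(source, short_code, media_type=None):
    """Build a key in the formats listed above (source is "reel" or "post")."""
    if source == "reel":
        return f"reel-{short_code}"
    if source == "post":
        if media_type not in ("image", "video"):
            raise ValueError("post keys need media_type 'image' or 'video'")
        return f"post-{short_code}-{media_type}"
    raise ValueError(f"unknown source: {source!r}")


def parse_kvs_key(key):
    """Split a key back into (source, short_code, media_type)."""
    if key.startswith("reel-"):
        return ("reel", key[len("reel-"):], "video")
    if key.startswith("post-"):
        for media_type in ("image", "video"):
            suffix = "-" + media_type
            if key.endswith(suffix):
                return ("post", key[len("post-"):-len(suffix)], media_type)
    raise ValueError(f"unrecognised key: {key!r}")
```

For carousel children, the shortcode recovered this way is the child's shortcode, as in the table.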

Manifest

A _manifest key is stored in the same key-value store. It contains a JSON array with metadata for every downloaded file:

[
  {
    "shortCode": "CxYz123",
    "originalId": "3210987654321",
    "ownerUsername": "natgeo",
    "caption": "Amazing wildlife footage...",
    "timestamp": "2026-06-20T12:00:00.000Z",
    "postType": "Video",
    "kvsKey": "reel-CxYz123",
    "mediaType": "video",
    "sourceUrl": "https://...",
    "contentType": "video/mp4",
    "sizeBytes": 5242880
  }
]

For carousel children, the manifest also includes parentShortCode and childIndex.
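
As a sketch of how parentShortCode and childIndex could be used, the snippet below groups carousel children under their parent post and orders them by childIndex. The sample entries are made up for illustration, and the snippet assumes childIndex is an integer; only the field names come from the manifest description above.

```python
import json
from collections import defaultdict


def group_carousels(manifest):
    """Map each parentShortCode to its children's kvsKeys, ordered by childIndex."""
    groups = defaultdict(list)
    for entry in manifest:
        parent = entry.get("parentShortCode")
        if parent is not None:
            groups[parent].append(entry)
    return {
        parent: [e["kvsKey"] for e in sorted(children, key=lambda e: e["childIndex"])]
        for parent, children in groups.items()
    }


# Hypothetical manifest content, trimmed to the fields used here.
sample_manifest = json.loads("""
[
  {"shortCode": "CxYz123", "kvsKey": "reel-CxYz123"},
  {"shortCode": "Child2", "kvsKey": "post-Child2-video",
   "parentShortCode": "AbCd456", "childIndex": 1},
  {"shortCode": "Child1", "kvsKey": "post-Child1-image",
   "parentShortCode": "AbCd456", "childIndex": 0}
]
""")

carousels = group_carousels(sample_manifest)
```

Entries without parentShortCode (reels and top-level post media) are left out of the grouping.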

Input

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| datasetId | String | Yes | - | ID of the dataset to read media URLs from |
| keyValueStoreName | String | No | media-{datasetId} | Custom name for the output key-value store |
| concurrency | Integer | No | 5 | Number of parallel downloads (1-20) |

Example input

{
  "datasetId": "aXgbg1XUeBakPyWpT",
  "concurrency": 10
}
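
When the input is generated programmatically, it can help to check it against the documented constraints before starting a run. The helper below is a minimal sketch: build_input and effective_store_name are hypothetical names, and the checks mirror only what the input table states (required datasetId, concurrency between 1 and 20, default store name media-{datasetId}).

```python
def build_input(dataset_id, key_value_store_name=None, concurrency=5):
    """Build an input object that respects the documented constraints."""
    if not dataset_id:
        raise ValueError("datasetId is required")
    if not isinstance(concurrency, int) or not 1 <= concurrency <= 20:
        raise ValueError("concurrency must be an integer between 1 and 20")
    actor_input = {"datasetId": dataset_id, "concurrency": concurrency}
    if key_value_store_name:
        actor_input["keyValueStoreName"] = key_value_store_name
    return actor_input


def effective_store_name(actor_input):
    """Name of the output store, applying the documented default."""
    return actor_input.get("keyValueStoreName") or f"media-{actor_input['datasetId']}"


example = build_input("aXgbg1XUeBakPyWpT", concurrency=10)
```

Here `example` equals the example input above, and `effective_store_name(example)` gives "media-aXgbg1XUeBakPyWpT".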

Output

The Actor pushes a summary object to its default dataset:

{
  "source": "reel-scraper",
  "sourceDatasetId": "aXgbg1XUeBakPyWpT",
  "keyValueStoreName": "media-aXgbg1XUeBakPyWpT",
  "totalItems": 17,
  "totalFiles": 10,
  "downloaded": 10,
  "failed": 0
}
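
A downstream script might use the summary to decide whether a run needs attention. The following sketch assumes that downloaded is counted against totalFiles, which matches the example above but is not stated explicitly. The function name is hypothetical.

```python
def summarize(summary):
    """Derive a simple health check from the Actor's summary object."""
    total = summary["totalFiles"]
    # Assumption: "downloaded" is a count out of "totalFiles".
    success_rate = summary["downloaded"] / total if total else 1.0
    return {"ok": summary["failed"] == 0, "success_rate": success_rate}


report = summarize({
    "source": "reel-scraper",
    "sourceDatasetId": "aXgbg1XUeBakPyWpT",
    "keyValueStoreName": "media-aXgbg1XUeBakPyWpT",
    "totalItems": 17,
    "totalFiles": 10,
    "downloaded": 10,
    "failed": 0,
})
```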

All downloaded files are in the named key-value store. You can access them via:

  • Apify Console: Storage > Key-value stores > media-{datasetId}
  • API: https://api.apify.com/v2/key-value-stores/{storeId}/records/{key}
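
For scripted access, the record URL can be assembled from the store ID and a key from the naming scheme. This sketch only builds the URL shown above and performs no request. The record_url name is hypothetical, and whether an API token is needed depends on how your store is shared.

```python
from urllib.parse import quote

API_BASE = "https://api.apify.com/v2"


def record_url(store_id, key):
    """Build the record URL for one file in a key-value store."""
    # Percent-encode both parts so unusual characters cannot break the path.
    return (
        f"{API_BASE}/key-value-stores/{quote(store_id, safe='')}"
        f"/records/{quote(key, safe='')}"
    )


manifest_url = record_url("YOUR_STORE_ID", "_manifest")
```

The same helper works for media keys such as reel-CxYz123, whose characters need no escaping.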

Integration setup

The primary use case is running this Actor automatically after a Reel or Post Scraper run finishes. Set this up using Apify's built-in integrations.

Step-by-step

  1. Open your Instagram Reel Scraper or Instagram Post Scraper Actor (or Task) in Apify Console
  2. Go to the Integrations tab
  3. Click Add integration > Run Actor
  4. Configure:
    • When: Run succeeded
    • Actor: Select instagram-media-downloader (this Actor)
    • Input:
      {
        "datasetId": "{{resource.defaultDatasetId}}"
      }
  5. Click Save

The {{resource.defaultDatasetId}} template variable is automatically replaced with the dataset ID from the scraper run that triggered the integration.

If you run the scraper via a saved Task (e.g., "Scrape NatGeo reels daily"):

  1. Open the Task in Console
  2. Go to the Integrations tab
  3. Follow the same steps as above, starting from Add integration > Run Actor

This way, every scheduled or manual run of that Task will automatically trigger the media download.

Custom key-value store name

If you want a descriptive store name instead of the dataset ID:

{
  "datasetId": "{{resource.defaultDatasetId}}",
  "keyValueStoreName": "natgeo-reels-{{resource.id}}"
}

{{resource.id}} is the run ID of the source scraper.
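
To make the effect of the template variables concrete, here is a small Python illustration of the substitution. It is not Apify's implementation, only a model of the outcome. The render_template helper and the run ID "RUN123" are made up for the example.

```python
import re

PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def render_template(value, context):
    """Replace {{dotted.path}} placeholders in a string with context values."""
    def lookup(match):
        node = context
        for part in match.group(1).split("."):
            node = node[part]
        return str(node)
    return PLACEHOLDER.sub(lookup, value)


# Hypothetical details of the scraper run that triggered the integration.
context = {"resource": {"id": "RUN123", "defaultDatasetId": "aXgbg1XUeBakPyWpT"}}

template = {
    "datasetId": "{{resource.defaultDatasetId}}",
    "keyValueStoreName": "natgeo-reels-{{resource.id}}",
}
rendered = {key: render_template(value, context) for key, value in template.items()}
```

With this context, the downloader would receive the dataset ID aXgbg1XUeBakPyWpT and the store name natgeo-reels-RUN123.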

Data retention

Downloaded media is stored in a named key-value store. Named stores in Apify are retained indefinitely, unlike a run's unnamed default storages, which are removed once your plan's data retention period ends. However, the stored data counts toward your Apify plan limits.

To manage storage:

  • Delete old key-value stores via Console or API when you no longer need the media
  • Use the manifest (_manifest key) to audit what's stored before cleanup
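
One way to audit a store before cleanup is to total sizeBytes from its manifest. The sketch below works on an already-loaded manifest list. The helper name and the sample entries are hypothetical, and entries without sizeBytes are counted as zero.

```python
from collections import Counter


def storage_by_media_type(manifest):
    """Sum sizeBytes per mediaType across all manifest entries."""
    totals = Counter()
    for entry in manifest:
        totals[entry["mediaType"]] += entry.get("sizeBytes", 0)
    return dict(totals)


# Hypothetical manifest entries, trimmed to the fields used here.
sample = [
    {"kvsKey": "reel-CxYz123", "mediaType": "video", "sizeBytes": 5242880},
    {"kvsKey": "post-AbCd456-image", "mediaType": "image", "sizeBytes": 204800},
    {"kvsKey": "post-AbCd456-video", "mediaType": "video", "sizeBytes": 1048576},
]

totals = storage_by_media_type(sample)
total_mib = sum(totals.values()) / (1024 * 1024)
```

Grouping by ownerUsername instead of mediaType follows the same pattern.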

Limitations

  • URLs that have already expired will fail to download (the Actor logs a warning and continues)
  • Very large datasets (1000+ items) may need higher memory allocation
  • The Actor does not re-upload to external storage (S3, GCS, etc.) - files stay in Apify KVS
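
The first limitation, combined with the concurrency input, describes a common pattern: a bounded pool of parallel downloads where a failure is logged and skipped rather than aborting the run. The sketch below shows that pattern in Python under stated assumptions. The Actor's real implementation is not documented here, and download_all and the injected fetch callable are hypothetical.

```python
import logging
from concurrent.futures import ThreadPoolExecutor

logger = logging.getLogger("media-downloader-sketch")


def download_all(urls, fetch, concurrency=5):
    """Fetch every URL with at most `concurrency` parallel workers.

    `fetch` is any callable taking a URL and returning bytes. It is injected
    so the sketch stays independent of a particular HTTP library.
    """
    downloaded, failed = {}, []

    def attempt(url):
        try:
            return url, fetch(url), None
        except Exception as exc:  # e.g. an expired CDN link
            return url, None, exc

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        for url, body, error in pool.map(attempt, urls):
            if error is not None:
                # Log a warning and continue with the remaining files.
                logger.warning("Skipping %s: %s", url, error)
                failed.append(url)
            else:
                downloaded[url] = body
    return downloaded, failed
```

Raising concurrency speeds up large datasets at the cost of more files in flight at once, which is one reason very large runs may need more memory.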