Instagram Video Scraper and Downloader pro

Pricing

$8.99/month + usage

🚀 Lightning-fast Instagram Scraper & Downloader! Fetch videos, captions, hashtags & owner info in seconds. Privacy-safe ✅, ready-to-use JSON & download links. Analytics has never been this effortless! 🔥

Rating

0.0

(0)

Developer

Neuro Scraper

Maintained by Community

Actor stats

0

Bookmarked

3

Total users

2

Monthly active users

2 days ago

Last modified

🌟 Instagram Video Scraper and Downloader Pro

Instantly fetch metadata and (optionally) download videos from public Instagram posts and reels: production-ready, privacy-safe, and plug‑and‑play on Apify Console.


📖 Quick summary

This Actor returns normalized metadata for Instagram posts/reels (owner, captions, tags, views, duration, thumbnails, and downloadable links). Optionally it can fetch best-effort direct media download links or save video files. Runs in Apify Console with zero local setup.


💡 Use cases / When to use

  • Extract post/reel metadata at scale for analytics, or to ingest into BI tools or search indices.
  • Generate preview datasets (thumbnails, captions, tags) for moderation or cataloging.
  • Produce downloadable video assets (best-effort) for archiving or processing pipelines.
  • Combine metadata + download for media QA workflows and automated ingestion.

⚡ Quick Start (Console, one-click)

  1. Open the Actor page in Apify Console.
  2. Paste a startUrls array (see example below) into the Input field.
  3. Click Run. Results appear in the Dataset and Key-value store in seconds.

Runs in seconds for single URLs; scales with maxConcurrency for batches.


βš™οΈ Quick Start (CLI + API)

CLI (one-liner)

$ apify run --input-file input.example.json

Python (apify-client), compact example

from apify_client import ApifyClient

client = ApifyClient('<APIFY_TOKEN>')

# .call() starts the run and waits for it to finish before returning.
run = client.actor('username/instagram-merged-actor').call(run_input={
    'startUrls': ['https://www.instagram.com/reel/EXAMPLE/'],
    'mode': 'both',
    'desired_resolution': '1080p',
})
print('Finished run:', run['id'])

πŸ“ Inputs (fields & schema)

Below is a concise Console JSON input example (also saved as input.example.json):

{
  "mode": "both",
  "startUrls": [
    "https://www.instagram.com/reel/EXAMPLE/"
  ],
  "desired_resolution": "1080p",
  "download": false,
  "merge_if_ffmpeg": false,
  "hide_media_links": true,
  "preserve_thumbnails": true,
  "maxConcurrency": 3,
  "diagnostic": false
}

Tip: Use the Actor input editor (generated from INPUT_SCHEMA.json) to provide values in a friendly UI.


βš™οΈ Configuration (Quick reference table)

πŸ”‘ NameπŸ“ Type❓ Requiredβš™οΈ DefaultπŸ“Œ Example🧠 Notes
modestringβœ… Yes"scrape""both"scrape = metadata only; download = media extraction; both = do both
startUrlsarrayβœ… Yes[]["https://.../reel/ID/"]List of URLs or { "url": "..." } objects
proxyConfigurationobjectβš™οΈ Optional{}{ "useApifyProxy": true }Use Console Proxy input or provide custom proxy JSON
preferred_proxy_typestringβš™οΈ Optional"auto""residential"residential forces residential groups if available
force_residentialbooleanβš™οΈ OptionalfalsetrueAlias for preferred residential mode
downloadbooleanβš™οΈ OptionalfalsetrueWhether to save video files (best-effort)
desired_resolutionstringβš™οΈ Optional"1080p""720p"Preferred download resolution (e.g. 1080p)
merge_if_ffmpegbooleanβš™οΈ OptionalfalsetrueIf true and ffmpeg is available, merges video+audio
cookie_filestringβš™οΈ OptionalNone"/tmp/cookies.txt"Use when authenticated views are required (store as secret)
hide_media_linksbooleanβš™οΈ OptionaltruefalseHides/Redacts raw media URLs in outputs
preserve_thumbnailsbooleanβš™οΈ OptionaltruefalseIf false, thumbnails are redacted in outputs
maxConcurrencyintegerβš™οΈ Optional35Number of parallel tasks (1–10)
diagnosticbooleanβš™οΈ OptionalfalsetrueMore verbose logs and raw debug fields
suppress_errorsbooleanβš™οΈ OptionaltruefalseWhen true the Actor suppresses noisy failures (recommended)

Example Console setup: paste a URL into startUrls, set mode to both and click Run.
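
When inputs are generated by a script rather than typed into Console, it can help to assemble and sanity-check them first. The sketch below is a client-side pre-check only, not the Actor's own validation: `normalize_start_urls` and `build_input` are hypothetical helper names, and the rules they enforce (the three `mode` values, `maxConcurrency` between 1 and 10, `startUrls` as strings or `{ "url": "..." }` objects) are taken from the configuration notes above.

```python
import json

ALLOWED_MODES = {"scrape", "download", "both"}


def normalize_start_urls(start_urls):
    """Accept plain URL strings or {"url": "..."} objects; return a list of URL strings."""
    urls = []
    for entry in start_urls:
        url = entry.get("url") if isinstance(entry, dict) else entry
        if not isinstance(url, str) or not url.strip():
            raise ValueError(f"Invalid startUrls entry: {entry!r}")
        urls.append(url.strip())
    return urls


def build_input(start_urls, mode="scrape", max_concurrency=3, **extra):
    """Build an input dict using the field names from the configuration table."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {sorted(ALLOWED_MODES)}")
    if not 1 <= max_concurrency <= 10:
        raise ValueError("maxConcurrency must be between 1 and 10")
    urls = normalize_start_urls(start_urls)
    if not urls:
        raise ValueError("startUrls must contain at least one URL")
    # Extra keyword arguments (e.g. desired_resolution) pass through unchanged.
    return {"mode": mode, "startUrls": urls, "maxConcurrency": max_concurrency, **extra}


if __name__ == "__main__":
    run_input = build_input(
        ["https://www.instagram.com/reel/EXAMPLE/"],
        mode="both",
        desired_resolution="1080p",
    )
    print(json.dumps(run_input, indent=2))
```

The resulting dict can be written to input.example.json or passed as `run_input` to the apify-client call shown in the Quick Start.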


📄 Outputs (Dataset / Key-value samples)

Example dataset item (metadata-first, download links redacted by default):

{
  "original_url": "https://www.instagram.com/reel/EXAMPLE/",
  "id": "EXAMPLE",
  "ownerUsername": "creator",
  "description": "Short caption text",
  "likesCount": 1234,
  "likesDisplay": "1.2k",
  "commentsCount": 10,
  "commentsDisplay": "10",
  "videoViewCount": 54321,
  "viewsDisplay": "54k",
  "thumbnail": "https://.../thumbnail.jpg",
  "download_links_summary": { "available": { "merged_video": true, "audio": false } },
  "_scraped_at": "2025-11-13T12:00:00Z"
}

The Actor writes records to the Dataset and stores the full run array under Key-value store key OUTPUT.
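
A downstream consumer will usually read these items back and check which ones actually have a downloadable video. The sketch below assumes items shaped like the sample above; `summarize_item` and `fetch_items` are hypothetical helper names, and because some fields are best-effort the code reads them with `.get()` instead of assuming they exist. `fetch_items` relies on the apify-client `dataset(...).iterate_items()` call and needs a valid token and network access.

```python
def summarize_item(item):
    """Reduce one dataset item to the fields a report usually needs."""
    available = (item.get("download_links_summary") or {}).get("available") or {}
    return {
        "id": item.get("id"),
        "owner": item.get("ownerUsername"),
        "likes": item.get("likesCount") or 0,
        "views": item.get("videoViewCount") or 0,
        "has_merged_video": bool(available.get("merged_video")),
    }


def fetch_items(token, dataset_id):
    """Yield items from a run's default dataset (network call)."""
    # Imported lazily so summarize_item works without apify-client installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    yield from client.dataset(dataset_id).iterate_items()


if __name__ == "__main__":
    sample = {
        "id": "EXAMPLE",
        "ownerUsername": "creator",
        "likesCount": 1234,
        "videoViewCount": 54321,
        "download_links_summary": {"available": {"merged_video": True, "audio": False}},
    }
    print(summarize_item(sample))
```

After a run started through apify-client, the run object's `defaultDatasetId` value is the `dataset_id` to pass to `fetch_items`.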


🔑 Environment Variables

  • APIFY_TOKEN: Required when using CLI or apify-client. Use placeholder <APIFY_TOKEN> in examples.
  • HTTP_PROXY / HTTPS_PROXY: Use only for custom proxy setups. Example: http://<PROXY_USER>:<PASS>@<HOST>:<PORT>.
  • COOKIE_FILE: (Optional) path to cookies file when passing through the cookie_file input.

Always store credentials as secrets in Apify Console; do not place credentials directly in input JSON.


▶️ How to Run (Console, CLI, API)

Console

  • Paste input or use the generated input editor and click Run.

CLI

  • apify login && apify run --input-file input.example.json

Python / apify-client

  • See Quick Start snippet above; replace <APIFY_TOKEN> and actor ID accordingly.
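
The Python client wraps Apify's REST API, so a run can also be started with a plain HTTP request. The sketch below only builds the request using the standard library: the actor ID is the same placeholder used in the Quick Start (the REST API expects `~` in place of `/`), and `build_run_request` is a hypothetical helper name. Actually sending it requires a real token.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id, token, run_input):
    """Build a POST request that starts an Actor run with the given input."""
    # The REST API addresses Actors as "username~actor-name".
    api_actor_id = actor_id.replace("/", "~")
    return urllib.request.Request(
        url=f"{API_BASE}/acts/{api_actor_id}/runs",
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_run_request(
        "username/instagram-merged-actor",  # placeholder actor ID from the Quick Start
        "<APIFY_TOKEN>",
        {"mode": "scrape", "startUrls": ["https://www.instagram.com/reel/EXAMPLE/"]},
    )
    print(request.get_method(), request.full_url)
    # To start the run for real (network call, needs a valid token):
    # with urllib.request.urlopen(request) as response:
    #     run = json.load(response)["data"]
```

Unlike the client's `.call()`, this endpoint returns as soon as the run is created, so poll the run or use a webhook to learn when it finishes.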

⏰ Scheduling & Webhooks

  • Schedule runs in the Apify Console using the built-in scheduler (cron-style). Choose run frequency and input to automate.
  • For webhooks: configure a run webhook in Console to receive run finished events and point them to your endpoint. Use the run payload to fetch Dataset/KV.
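
A webhook receiver mostly needs to pull the run ID and storage IDs out of the POST body. The sketch below assumes Apify's default webhook payload template, in which the run object sits under `resource`; if you customize the payload template in Console, the keys will differ. `extract_run_refs` is a hypothetical helper name, and the sample body is illustrative rather than a captured payload.

```python
import json


def extract_run_refs(payload):
    """Pick out the identifiers needed to fetch a finished run's results."""
    resource = payload.get("resource") or {}
    return {
        "event_type": payload.get("eventType"),
        "run_id": resource.get("id"),
        "status": resource.get("status"),
        "dataset_id": resource.get("defaultDatasetId"),
        "key_value_store_id": resource.get("defaultKeyValueStoreId"),
    }


if __name__ == "__main__":
    # Illustrative body; real payloads carry more fields.
    body = json.dumps({
        "eventType": "ACTOR.RUN.SUCCEEDED",
        "resource": {
            "id": "RUN_ID",
            "status": "SUCCEEDED",
            "defaultDatasetId": "DATASET_ID",
            "defaultKeyValueStoreId": "STORE_ID",
        },
    })
    refs = extract_run_refs(json.loads(body))
    if refs["status"] == "SUCCEEDED":
        print("Fetch dataset", refs["dataset_id"], "and OUTPUT from store", refs["key_value_store_id"])
```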

πŸ•ΎοΈ Logs & Troubleshooting

  • Console β†’ Runs β†’ Select run β†’ Logs β€” primary place for errors and diagnostic messages.

  • Common issues & quick fixes:

    • No startUrls provided β€” add URLs to input (actor will exit with error saved to KV).
    • Proxy session failures β€” enable Apify Proxy in Console or provide proxyConfiguration (see Proxy Configuration below).
    • Missing download links β€” Downloader is best-effort; enable diagnostic to surface raw formats for debugging.

🔒 Permissions & Storage Notes

  • Outputs: Dataset for records, Key-value store under key OUTPUT for full run dump. Files (when download is true) are saved to the actor filesystem during the run and may be pushed to external storage using post-run steps.
  • Privacy: By default the Actor redacts direct media links when hide_media_links is true and preserves thumbnails unless preserve_thumbnails is set to false.
  • Security: Store tokens and proxy credentials as Console secrets (do not commit into repo).
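
If you run with hide_media_links set to false for internal processing but still want to redact items before sharing them, the same two switches can be mirrored on the consumer side. The sketch below illustrates the behaviour described above and is not the Actor's implementation: `apply_redaction` is a hypothetical helper, and both the `download_links` key and the `[redacted]` placeholder are assumptions (only `thumbnail` appears in the sample output).

```python
REDACTED = "[redacted]"  # assumed placeholder; the Actor's own marker may differ


def apply_redaction(item, hide_media_links=True, preserve_thumbnails=True):
    """Return a copy of item with media links and/or thumbnail redacted."""
    redacted = dict(item)  # shallow copy so the caller's item is left untouched
    # "download_links" is a hypothetical key for raw media URLs.
    if hide_media_links and "download_links" in redacted:
        redacted["download_links"] = REDACTED
    if not preserve_thumbnails and "thumbnail" in redacted:
        redacted["thumbnail"] = REDACTED
    return redacted


if __name__ == "__main__":
    item = {
        "id": "EXAMPLE",
        "thumbnail": "https://.../thumbnail.jpg",
        "download_links": {"merged_video": "https://.../video.mp4"},
    }
    print(apply_redaction(item))
    print(apply_redaction(item, hide_media_links=False, preserve_thumbnails=False))
```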

🔟 Changelog / Versioning

Use semantic versioning. Example format in your repo:

  • v1.0.0: Initial production-ready release (metadata + downloader, proxy-safe flows)
  • v1.0.1: Bugfix: improved HTML fallback detection

🖌 Notes / TODOs

  • TODO: confirm output schema β€” some fields are best-effort and may vary by source.
  • TODO: add demo GIF/screenshots (upload 5–7s run GIF to Console for best conversion).

🌍 Proxy Configuration

Quick single-line (Apify Proxy)

  • In Console: set Proxy configuration in the Actor Input → choose Use Apify Proxy and pick groups (e.g., RESIDENTIAL).

Custom proxy example (as input)

  • Provide proxy JSON in proxyConfiguration input or set HTTP_PROXY/HTTPS_PROXY env vars.
export HTTP_PROXY="http://<PROXY_USER>:<PASS>@<HOST>:<PORT>"
export HTTPS_PROXY="http://<PROXY_USER>:<PASS>@<HOST>:<PORT>"

Security reminder: store proxy credentials as secrets or use Apify Console proxy configuration.
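
The two options above correspond to two shapes of the proxyConfiguration input. The sketch below uses the field names of Apify's standard proxy input (`useApifyProxy`, `apifyProxyGroups`, `proxyUrls`); whether this Actor honours every field is an assumption worth verifying with a test run. The helper names are hypothetical, and the proxy URL is read from the environment so that credentials stay out of the code.

```python
import os


def apify_proxy_config(groups=("RESIDENTIAL",)):
    """Proxy input that routes traffic through Apify Proxy with the given groups."""
    return {"useApifyProxy": True, "apifyProxyGroups": list(groups)}


def custom_proxy_config(proxy_url):
    """Proxy input that routes traffic through a single custom proxy URL."""
    if not proxy_url.startswith(("http://", "https://")):
        raise ValueError("proxy URL must start with http:// or https://")
    return {"useApifyProxy": False, "proxyUrls": [proxy_url]}


def proxy_config_from_env(environ=None):
    """Prefer HTTPS_PROXY/HTTP_PROXY if set; otherwise fall back to Apify Proxy."""
    environ = os.environ if environ is None else environ
    proxy_url = environ.get("HTTPS_PROXY") or environ.get("HTTP_PROXY")
    return custom_proxy_config(proxy_url) if proxy_url else apify_proxy_config()


if __name__ == "__main__":
    run_input = {
        "mode": "scrape",
        "startUrls": ["https://www.instagram.com/reel/EXAMPLE/"],
        "proxyConfiguration": proxy_config_from_env(),
    }
    # Print only the proxy mode so credentials never reach the logs.
    print("useApifyProxy:", run_input["proxyConfiguration"]["useApifyProxy"])
```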

Advanced: TODO: Consider proxy rotation for large-scale scraping.


🤔 What I inferred from main.py

  • The Actor accepts startUrls, supports mode (scrape|download|both) and a download flag.
  • It uses a metadata-first design with an HTML fallback for resilient scraping and optional media extraction.
  • Proxy configuration is supported and the Actor creates proxy sessions when available.
  • Outputs are written to Dataset and Key-value store key OUTPUT.

✅ Why this Actor

  • Instant insights: get normalized metadata and download links in seconds.
  • Production-ready: built for high-volume runs with sane defaults and error handling.
  • Privacy-safe: default redaction of raw media links and thumbnail-preservation controls.
  • Plug & play: run directly in Apify Console, CLI, or via apify-client.

Run this Actor on Apify Console and get results in seconds.

Configuration & Advanced Notes

This optional section provides extra guidance for operators and integrators.

Secrets & Storage

  • Apify token: use APIFY_TOKEN as a Console secret for CLI/API usage.
  • Proxy credentials: prefer Apify Proxy or Console secret storage; never hardcode credentials.
  • Cookie file: if using cookie_file, upload the cookie file as a secure secret/artifact and reference the path from input.

Files & Local artifacts

  • Downloaded files remain on the container filesystem for the duration of the run. If you need persistent storage, push files to external storage after the run.

Diagnostic mode

  • diagnostic = true enables verbose logs and saves raw format lists in outputs for debugging.

FFmpeg merging

  • Merging audio+video requires ffmpeg to be available in the runtime. Use merge_if_ffmpeg: true only when you expect separate audio/video streams and ffmpeg is present.
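
For reference, merging separate streams generally amounts to a single ffmpeg call that copies both inputs into one container. How the Actor performs the merge internally is not documented here; the sketch below shows the general technique for files you have already downloaded, with hypothetical helper names and file paths.

```python
import shutil
import subprocess


def build_merge_command(video_path, audio_path, output_path):
    """Return an ffmpeg command that muxes one video and one audio file."""
    return [
        "ffmpeg", "-y",    # -y overwrites the output file if it exists
        "-i", video_path,
        "-i", audio_path,
        "-c", "copy",      # copy both streams as-is instead of re-encoding
        output_path,
    ]


def merge_streams(video_path, audio_path, output_path):
    """Run the merge if ffmpeg is on PATH; return False if ffmpeg is missing."""
    if shutil.which("ffmpeg") is None:
        return False
    subprocess.run(build_merge_command(video_path, audio_path, output_path), check=True)
    return True


if __name__ == "__main__":
    print(" ".join(build_merge_command("video.mp4", "audio.m4a", "merged.mp4")))
```

The `shutil.which` check mirrors the "only if ffmpeg is available" condition of merge_if_ffmpeg: without the binary, the merge is skipped rather than failing.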

Proxy tips

  • For high-volume scraping use preferred_proxy_type = "residential" or configure Apify Proxy groups to reduce rate limits.
  • If you supply proxyConfiguration manually, ensure session IDs are set to stabilize session affinity.

Limitations & TODOs

  • This Actor is best for public posts and may not return private or gated content.
  • TODO: Add explicit INPUT_SCHEMA.json to expose UI-friendly validation.