Instagram Video Scraper and Downloader

Pricing

$8.75/month + usage

🚀 Unlock Instagram content like never before! Scrape, download, & explore reels, posts & videos with AI-powered fallback, smart proxies, and hidden media links. Perfect for creators & researchers seeking full control & insights. 🔍✨

Developer

Neuro Scraper

Maintained by Community

🌟 Instagram Video Scraper and Downloader

One-line hero: Instantly fetch metadata and download media from public Reels or posts. Production-ready, privacy-safe, and built for fast batch runs in Apify Console.


📖 Short summary

This Actor extracts clean metadata for Instagram Reels/posts and (optionally) fetches downloadable media. It returns structured records to Dataset / Key-Value store and is designed for reliability, proxy-safety, and enterprise-scale runs.


💡 Use cases: when to use

  • Bulk-collect metadata (title, author, upload date, views, likes) for analytics or feeds.
  • Attach best-effort download links and optionally download media for archival or processing.
  • Quick HTML metadata fallback when the primary scraper hits rate limits.
  • Privacy-sensitive workflows where raw media links should be redacted.

⚑ Quick Start (Console, one-click)

One-liner: Paste a list of startUrls into the Input pane and click Run; results appear in the Dataset and Key-Value Store in seconds.


⚙️ Quick Start (CLI + API)

CLI (one-liner)

$ apify run --token=<APIFY_TOKEN>

Python (apify-client): minimal example

from apify_client import ApifyClient

# Authenticate with your Apify API token.
client = ApifyClient(token="<APIFY_TOKEN>")

# Actor input for a single Reel.
run_input = {
    "mode": "both",
    "startUrls": [{"url": "https://www.instagram.com/reel/SHORTCODE/"}],
    "desired_resolution": "1080p",
}

# Start the Actor and wait for the run to finish.
run = client.actor("your-username/your-actor").call(run_input=run_input)
print(run)
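
Once the run has finished, the records can be read back from the run's default Dataset. The following is a minimal sketch, not part of the Actor itself: `collect_items` is a hypothetical helper name, and it assumes the `run` dictionary returned by `call()` carries a `defaultDatasetId` field, as run objects returned by the Apify API do.

```python
from typing import Any


def collect_items(client: Any, run: dict) -> list:
    """Return all Dataset records produced by a finished run.

    `client` is expected to be an ApifyClient (or any object exposing
    the same `dataset(id).iterate_items()` interface).
    """
    dataset_id = run["defaultDatasetId"]
    return list(client.dataset(dataset_id).iterate_items())
```

With the snippet above, `collect_items(client, run)` would return a list of dictionaries shaped like the example record in the Outputs section.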

📝 Inputs (fields & schema)

Console JSON input example (also saved as input.example.json):

{
    "mode": "scrape",
    "startUrls": [
        {"url": "https://www.instagram.com/reel/SHORTCODE/"}
    ],
    "desired_resolution": "1080p",
    "download": false,
    "merge_if_ffmpeg": false,
    "cookie_file": "<COOKIE_FILE_STORE_KEY_OR_PATH>",
    "hide_media_links": true,
    "preserve_thumbnails": true,
    "maxConcurrency": 3,
    "preferred_proxy_type": "auto",
    "diagnostic": false
}

Tip: The Platform can validate inputs with an input schema. Provide startUrls as an array of objects {"url": "..."} for the Console UI.
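
If your URLs live in a plain list, they need to be wrapped into the `{"url": "..."}` objects described in the tip. A minimal sketch of that conversion follows; `to_start_urls` is a hypothetical helper, and the host check is an assumption added for illustration, not something the Actor is known to enforce.

```python
from urllib.parse import urlparse


def to_start_urls(urls):
    """Wrap plain URL strings into the [{"url": ...}] shape used by startUrls."""
    start_urls = []
    for raw in urls:
        url = raw.strip()
        parsed = urlparse(url)
        host = parsed.hostname or ""
        # Assumed sanity check: accept only http(s) links on instagram.com.
        on_instagram = host == "instagram.com" or host.endswith(".instagram.com")
        if parsed.scheme not in ("http", "https") or not on_instagram:
            raise ValueError(f"Not an Instagram URL: {raw!r}")
        start_urls.append({"url": url})
    return start_urls
```

The result can be assigned directly to the `startUrls` key of the run input.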


⚙️ Configuration (actor inputs)

| 🔑 Name | 📝 Type | ❓ Required | ⚙️ Default | 📌 Example | 🧠 Notes |
| --- | --- | --- | --- | --- | --- |
| mode | string | ✅ Yes | "scrape" | "scrape" / "download" | Choose what to run (metadata vs media) |
| startUrls | array | ✅ Yes | None | [{"url":"https://..."}] | List of target post/reel URLs |
| proxyConfiguration | object | ⚙️ Optional | {} | {"useApifyProxy": true} | Override actor proxy settings |
| preferred_proxy_type | string | ⚙️ Optional | "auto" | "residential" | Preferred proxy type for sessions |
| force_residential | boolean | ⚙️ Optional | false | true | Alias to force residential proxy |
| download | boolean | ⚙️ Optional | false | true | Whether to download media files |
| desired_resolution | string | ⚙️ Optional | "1080p" | "720p" | Preferred media resolution (UI: string) |
| merge_if_ffmpeg | boolean | ⚙️ Optional | false | true | Use system merger to combine audio+video (optional) |
| cookie_file | string | ⚙️ Optional | None | "<COOKIE_FILE>" | Cookie file key if authenticated access is needed |
| hide_media_links | boolean | ⚙️ Optional | true | false | Redact raw media URLs in output (privacy-safe) |
| preserve_thumbnails | boolean | ⚙️ Optional | true | false | If false, thumbnails are redacted from output |
| maxConcurrency | integer | ⚙️ Optional | 3 | 5 | Concurrency cap (1–10) |
| diagnostic | boolean | ⚙️ Optional | false | true | Enable verbose logs for debugging |

Example Console setup: Paste https://www.instagram.com/reel/SHORTCODE/ into startUrls input and click Run Actor.
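
Before submitting a run from a script, it can help to fill in the documented defaults and check the one required field locally. The sketch below is a client-side pre-check written for illustration; `normalize_input` is a hypothetical helper, the defaults are copied from the configuration table, and the Actor's own validation may behave differently.

```python
# Defaults as documented in the configuration table.
DEFAULTS = {
    "mode": "scrape",
    "proxyConfiguration": {},
    "preferred_proxy_type": "auto",
    "force_residential": False,
    "download": False,
    "desired_resolution": "1080p",
    "merge_if_ffmpeg": False,
    "hide_media_links": True,
    "preserve_thumbnails": True,
    "maxConcurrency": 3,
    "diagnostic": False,
}


def normalize_input(user_input):
    """Merge user input over the documented defaults and sanity-check it."""
    merged = {**DEFAULTS, **user_input}
    if not merged.get("startUrls"):
        raise ValueError("startUrls is required and must not be empty")
    # Keep concurrency inside the documented 1-10 range.
    merged["maxConcurrency"] = min(10, max(1, int(merged["maxConcurrency"])))
    return merged
```

For example, passing only `startUrls` yields an input with `hide_media_links` set to true and `maxConcurrency` set to 3.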


📄 Outputs (Dataset / KV examples)

Example output (one record)

{
    "original_url": "https://www.instagram.com/reel/SHORTCODE/",
    "id": "SHORTCODE",
    "ownerUsername": "creator_handle",
    "description": "Post caption text",
    "likesCount": 1234,
    "likesDisplay": "1.2k",
    "commentsCount": 12,
    "commentsDisplay": "12",
    "videoViewCount": 45678,
    "viewsDisplay": "45.7k",
    "upload_date_iso": "2025-03-01T12:34:56Z",
    "upload_date": "1st March 2025",
    "thumbnail": "https://.../thumbnail.jpg",
    "download_links": {"merged_video": "https://..."},
    "_scraped_at": "2025-11-13T12:00:00Z",
    "_source_index": 1
}

Notes: Records are written to Dataset (rows) and a full array is stored in Key-Value under key OUTPUT.
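
For analytics, a record like the one above can be reduced to a few typed fields. The following sketch assumes the field names shown in the example record; `summarize` and the "engagement rate" formula (likes plus comments, divided by views) are illustrative choices, not something the Actor computes.

```python
from datetime import datetime


def summarize(record):
    """Extract a typed upload time and a simple engagement rate from one record."""
    # Replace the trailing "Z" so older Python versions can parse the timestamp.
    uploaded = datetime.fromisoformat(record["upload_date_iso"].replace("Z", "+00:00"))
    views = record.get("videoViewCount") or 0
    interactions = (record.get("likesCount") or 0) + (record.get("commentsCount") or 0)
    rate = interactions / views if views else None
    return {"id": record["id"], "uploaded": uploaded, "engagement_rate": rate}
```

Records without a view count get an `engagement_rate` of `None` rather than raising a division error.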


🔑 Environment Variables

  • APIFY_TOKEN: use in CLI / API calls. Use the placeholder <APIFY_TOKEN> in examples.
  • HTTP_PROXY / HTTPS_PROXY: optional, for use with a custom proxy such as <PROXY_USER:PASS@HOST:PORT>.

⚠️ Always store credentials as Secrets in Console (do not paste plaintext into input fields).
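
In scripts, the token can be read from the environment instead of being hard-coded. A minimal sketch, assuming the `APIFY_TOKEN` variable named above; `require_token` is a hypothetical helper.

```python
import os


def require_token(env=None):
    """Return the Apify token from the environment, failing loudly if it is missing."""
    env = os.environ if env is None else env
    token = env.get("APIFY_TOKEN", "").strip()
    if not token:
        raise RuntimeError("APIFY_TOKEN is not set")
    return token
```

The returned value can then be passed as `ApifyClient(token=require_token())`.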


▢️ How to Run (Console, CLI, API)

  1. Apify Console: open the Actor, paste the startUrls JSON, choose a mode, and click Run.
  2. CLI: apify run --token=<APIFY_TOKEN> (ensure the Actor is published, or run from the project folder).
  3. API / apify-client: call the Actor run endpoint with the run_input JSON (see the snippet above).

Quick checklist before running

  • Provide startUrls (required).
  • If you need consistent sessions, enable proxyConfiguration or set preferred_proxy_type.
  • Toggle hide_media_links to redact raw media URLs for privacy.

⏰ Scheduling & Webhooks

  • Schedule recurring runs from the Console (Runs → Schedule): pick a frequency and input.
  • Webhooks: configure a webhook on successful run completion to receive run payloads (Dataset / Key-Value links) for automation.
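
A webhook receiver typically needs to turn the payload into links to the run's results. The sketch below assumes Apify's default webhook payload shape (an `eventType` string and a `resource` object holding the run, including `defaultDatasetId` and `defaultKeyValueStoreId`); if you use a custom payload template, the field names will differ. `result_links` is a hypothetical helper.

```python
API_BASE = "https://api.apify.com/v2"


def result_links(payload):
    """Build result URLs from a webhook payload, or return None for other events."""
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    run = payload["resource"]  # assumed: the run object from the default payload
    dataset_id = run["defaultDatasetId"]
    store_id = run["defaultKeyValueStoreId"]
    return {
        "dataset_items": f"{API_BASE}/datasets/{dataset_id}/items",
        # The Actor stores the full result array under the OUTPUT key.
        "output_record": f"{API_BASE}/key-value-stores/{store_id}/records/OUTPUT",
    }
```

Requests to these URLs still need to be authenticated with your API token unless the storages are shared publicly.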

🕾️ Logs & Troubleshooting

  • Check Run logs in Console for step-by-step messages.

  • Common issues:

    • No startUrls: the Actor exits early; supply a startUrls array.
    • Rate limits / access errors: enable Proxy or try preferred_proxy_type: "residential".
    • Download fails: ensure download is enabled and the proxy/cookie settings are correct.

Quick fixes: enable diagnostic: true for verbose logs, or reduce maxConcurrency to avoid bursts.
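
The quick fixes can be applied programmatically before retrying a failed run. This is a sketch of one possible retry policy, not behavior built into the Actor: `relax_input` is a hypothetical helper that halves `maxConcurrency`, prefers residential proxies, and turns on diagnostics.

```python
def relax_input(run_input):
    """Return a gentler copy of the run input for a retry after rate limiting."""
    adjusted = dict(run_input)  # leave the caller's input untouched
    current = int(adjusted.get("maxConcurrency", 3))
    adjusted["maxConcurrency"] = max(1, current // 2)
    adjusted["preferred_proxy_type"] = "residential"
    adjusted["diagnostic"] = True
    return adjusted
```

Calling it repeatedly converges on a concurrency of 1, the lower bound of the documented range.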


🔒 Permissions & Storage Notes

  • Output storage: Dataset (records) and Key-Value (OUTPUT key) for full run JSON.
  • Privacy-first defaults: hide_media_links = true, preserve_thumbnails = true.
  • Do not store secrets in plain input; use Console Secrets or environment variables.

🔟 Changelog / Versioning (example)

  • v1.0.0: Initial public release with a metadata-first scraper, HTML fallback, optional downloader, and privacy defaults.

🖌 Notes / TODOs

  • TODO: confirm the output schema. It is inferred from the Actor; a formal schema.json will improve the Console UI.
  • TODO: add demo GIF/screenshots (provide images or Console screenshots for best conversion).

🌍 Proxy configuration

Enable Apify Proxy (quick): In Console → Actor run Options → toggle Use Apify proxy.

Custom proxy (example env vars):

export HTTP_PROXY="http://<PROXY_USER:PASS@HOST:PORT>"
export HTTPS_PROXY="http://<PROXY_USER:PASS@HOST:PORT>"

Notes

  • Store proxy credentials as Console Secrets, not plaintext in inputs.
  • The Actor supports session-aware proxy URLs for consistent sessions.
  • TODO: Consider proxy rotation for large-scale scraping.
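
Because proxy URLs embed credentials, it is worth masking them before they reach logs. The following sketch uses only the standard library; `redact_proxy_url` is a hypothetical helper, and the example host is a placeholder.

```python
from urllib.parse import urlsplit


def redact_proxy_url(proxy_url):
    """Return the proxy URL with any username and password masked."""
    parts = urlsplit(proxy_url)
    if parts.hostname is None:
        raise ValueError(f"Not a valid proxy URL: {proxy_url!r}")
    host = parts.hostname
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    if parts.username or parts.password:
        host = f"***:***@{host}"
    return f"{parts.scheme}://{host}"
```

For example, `redact_proxy_url(os.environ["HTTP_PROXY"])` can be logged safely, while the unmasked value stays in the environment.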

🤔 What I inferred from main.py

  • Primary behavior: metadata-first scraper for public Reels/posts with an HTML fallback when the primary scraper is rate-limited.
  • Optional media extraction/download flow that selects best-resolution streams and can merge audio+video using a system merger when enabled.
  • Uses a proxy configuration (session-aware) and exposes flags to prefer residential proxies.
  • Outputs are written to Dataset and the Key-Value store under key OUTPUT.
  • Defaults are privacy-focused: hide_media_links: true, preserve_thumbnails: true, and maxConcurrency capped.
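
The "best-resolution" selection mentioned above could work along the following lines. This is a guess at the general technique, not the Actor's actual code: `pick_stream` is hypothetical, and it assumes each candidate stream is a dictionary with an integer `height` key.

```python
def pick_stream(streams, desired_resolution="1080p"):
    """Pick the tallest stream not exceeding the desired resolution.

    Falls back to the smallest available stream when every candidate
    is larger than requested.
    """
    if not streams:
        raise ValueError("no streams available")
    target = int(desired_resolution.lower().rstrip("p"))
    fitting = [s for s in streams if s["height"] <= target]
    if fitting:
        return max(fitting, key=lambda s: s["height"])
    return min(streams, key=lambda s: s["height"])
```

With streams of height 480, 720, and 1080, a `desired_resolution` of "720p" selects the 720 stream, and "360p" falls back to 480.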

Why this Actor?

Quick benefits: production-ready, privacy-safe defaults, plug-and-play in Console, and robust fallback for stable metadata collection. Run it now and get insights in seconds.

Run this Actor on Apify Console and get results instantly.

CONFIG.md: Advanced configuration & proxy notes

This optional config file explains advanced options and recommended Console setup for high-volume or sensitive runs.

Proxy & session hygiene

  • Prefer using the Actor's Proxy configuration option in Console (actor run Options) for session-aware URLs.
  • If you provide a custom proxy, store credentials as a Console Secret and reference them via environment variables or proxyConfiguration input.

Example env vars

HTTP_PROXY="http://<PROXY_USER:PASS@HOST:PORT>"
HTTPS_PROXY="http://<PROXY_USER:PASS@HOST:PORT>"

Large-scale / reliability tips

  • Use preferred_proxy_type: "residential" for heavy runs when access errors occur.
  • Lower maxConcurrency to reduce bursts when you encounter rate limits.
  • Enable diagnostic: true to collect detailed logs for support triage.

Security & privacy

  • hide_media_links defaults to true; keep it enabled if you must not expose direct media URLs.
  • preserve_thumbnails defaults to true; set it to false to redact thumbnails as well.
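
If you post-process records yourself, the same two flags can be mirrored downstream. The sketch below illustrates the idea only: `redact_record` is hypothetical, the `"[redacted]"` marker is an assumption, and the Actor's real redaction format may differ.

```python
import copy


def redact_record(record, hide_media_links=True, preserve_thumbnails=True):
    """Return a copy of a record with media URLs masked according to the flags."""
    redacted = copy.deepcopy(record)
    if hide_media_links and "download_links" in redacted:
        # Keep the keys so consumers can see which variants existed.
        redacted["download_links"] = {key: "[redacted]" for key in redacted["download_links"]}
    if not preserve_thumbnails and "thumbnail" in redacted:
        redacted["thumbnail"] = "[redacted]"
    return redacted
```

The input record is never modified, so the unredacted version can still be kept in a restricted store.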

TODOs

  • Add an INPUT_SCHEMA.json to the repo for Console UI form validation.
  • Add demo screenshots/GIFs to README for higher conversion.