Instagram Transcript Scraper

Pricing

from $5.00 / 1,000 results

Extract transcripts from Instagram videos and reels using auto-generated captions or AI-powered speech-to-text. Returns clean, timestamped transcript segments with full video metadata.

Rating

5.0 (13)

Developer

Crawler Bros

Maintained by Community

Actor stats

  • Bookmarked: 15
  • Total users: 18
  • Monthly active users: 11
  • Last modified: 8 days ago

Turn Instagram videos and reels into clean, structured transcripts with ease.

Extract spoken content from public Instagram videos — designed for developers, data teams, and AI agents that need text, not just media.

Overview

The Instagram Transcript Scraper extracts transcripts from Instagram video posts and reels using a dual-strategy approach:

  1. Native Captions — Extracts Instagram's auto-generated captions when available (fastest, zero extra cost)
  2. Whisper AI — Falls back to OpenAI's Whisper speech-to-text model for accurate transcription of any video with speech

What You Get

For each Instagram video URL, the scraper returns:

  • Full transcript text (complete speech-to-text)
  • Timestamped segments for precise alignment
  • Video title & thumbnail
  • Video and audio playback URLs
  • Engagement metrics (likes, comments)
  • Creator profile information (username, display name, avatar)
  • Transcription method used (native or whisper)
  • Error status per request (safe for automation)

All data is returned in structured JSON, ready to feed into databases, AI pipelines, or downstream services.

Input Parameters

| Field | Type | Required | Description |
|---|---|---|---|
| videoUrls | array | Yes | List of Instagram video/reel URLs to transcribe |
| transcriptionMethod | string | No | auto (default), native, or whisper |
| whisperModel | string | No | tiny, base (default), or small |
| language | string | No | Language code (e.g. en, es). Empty = auto-detect |

Input Example

{
  "videoUrls": [
    "https://www.instagram.com/reel/DGtml1wNRoM/",
    "https://www.instagram.com/p/DULBkEngpxg/"
  ],
  "transcriptionMethod": "auto",
  "whisperModel": "base"
}
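A minimal sketch of building this input programmatically, checked against the parameter table above. The helper name `build_run_input` and its validation rules are assumptions made for illustration and are not part of the Actor.

```python
from typing import Optional

# Allowed values taken from the Input Parameters table.
ALLOWED_METHODS = {"auto", "native", "whisper"}
ALLOWED_MODELS = {"tiny", "base", "small"}


def build_run_input(
    video_urls: list[str],
    transcription_method: str = "auto",
    whisper_model: str = "base",
    language: Optional[str] = None,
) -> dict:
    """Build an input dict for the scraper (hypothetical helper)."""
    if not video_urls:
        raise ValueError("videoUrls is required and must not be empty")
    if transcription_method not in ALLOWED_METHODS:
        raise ValueError(f"unknown transcriptionMethod: {transcription_method!r}")
    if whisper_model not in ALLOWED_MODELS:
        raise ValueError(f"unknown whisperModel: {whisper_model!r}")

    run_input = {
        "videoUrls": list(video_urls),
        "transcriptionMethod": transcription_method,
        "whisperModel": whisper_model,
    }
    # The Input Example omits "language" when auto-detection is wanted;
    # this sketch does the same and only sets it when a code is given.
    if language:
        run_input["language"] = language
    return run_input
```

The resulting dict is what you would pass as the run input, for example through the Apify Console or an Apify API client.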

Output Format

Each transcript segment is returned as a separate dataset row, with full video metadata included on every row for easy filtering and export.

| Field | Type | Description |
|---|---|---|
| url | string | Normalized Instagram video URL |
| code | string | Instagram shortcode |
| pk | string | Instagram internal post ID |
| id | string | Combined media identifier |
| title | string | Video caption text |
| img | string | Video thumbnail URL |
| videoUrl | string | Direct video playback URL |
| audioUrl | string | Direct audio playback URL |
| createTime | number | Video creation timestamp (Unix seconds) |
| likeCount | number | Number of likes |
| commentCount | number | Number of comments |
| userPk | string | Creator user ID |
| userName | string | Creator username |
| userFullName | string | Creator display name |
| avatarUri | string | Creator avatar image URL |
| fullText | string | Complete transcript text |
| segmentIndex | number | Segment index (0-based) |
| segmentStart | number | Segment start time in seconds |
| segmentEnd | number | Segment end time in seconds |
| segmentText | string | Text for this specific segment |
| totalSegments | number | Total number of segments for this video |
| transcriptionMethod | string | Method used: native or whisper |
| errMsg | string | Error message (empty if success) |
| timestamp | string | Response timestamp (ISO 8601) |

Output Example

{
  "url": "https://www.instagram.com/p/DTxiM0Ijqvz/",
  "code": "DTxiM0Ijqvz",
  "pk": "3814980773552761843",
  "id": "3814980773552761843_48143082417",
  "title": "#explore",
  "img": "https://scontent-...",
  "videoUrl": "https://scontent-...",
  "audioUrl": "https://scontent-...",
  "createTime": 1769001195,
  "likeCount": 171650,
  "commentCount": 781,
  "userPk": "48143082417",
  "userName": "theavamariee",
  "userFullName": "Ava Marie",
  "avatarUri": "https://scontent-...",
  "fullText": "Excuse me, miss. What? Your shirt's backwards.",
  "transcriptionMethod": "whisper",
  "timestamp": "2026-03-14T12:00:00.000Z",
  "segmentIndex": 0,
  "segmentStart": 0.3,
  "segmentEnd": 1.16,
  "segmentText": "Excuse me, miss.",
  "totalSegments": 3,
  "errMsg": ""
}
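Because each segment arrives as its own row, downstream code usually needs to regroup rows per video. The sketch below shows one way to do that using only the field names from the output table. The helper names are hypothetical, and it assumes that rows with a non-empty errMsg carry no usable segment.

```python
from collections import defaultdict


def group_segments(rows: list[dict]) -> dict[str, list[dict]]:
    """Group dataset rows by shortcode, ordered by segmentIndex."""
    by_video: dict[str, list[dict]] = defaultdict(list)
    for row in rows:
        # Assumption: error rows have no usable segment, so skip them.
        if row.get("errMsg"):
            continue
        by_video[row["code"]].append(row)
    for segments in by_video.values():
        segments.sort(key=lambda r: r["segmentIndex"])
    return dict(by_video)


def is_complete(segments: list[dict]) -> bool:
    """True if indices 0 .. totalSegments - 1 are all present, in order."""
    if not segments:
        return False
    expected = segments[0]["totalSegments"]
    return [s["segmentIndex"] for s in segments] == list(range(expected))
```

Checking `is_complete` before using a group guards against partially exported datasets. Since fullText is repeated on every row, the complete transcript can be read from any single row of a group.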

Typical Use Cases

  • AI agents & LLM pipelines — Speech-to-text ingestion for content understanding
  • Content summarization — Get the spoken word from any Instagram video
  • Social media analysis — Analyze what creators are saying at scale
  • Subtitle generation — Create captions for repurposing content
  • Trend & language research — Study speech patterns across video content
  • Batch transcription — Process hundreds of videos in one run
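For the subtitle generation use case, the timestamped segment rows map directly onto SubRip (SRT) cues. The following is a sketch under the assumption that all rows passed in belong to one video; the function names are illustrative and not provided by the Actor.

```python
def to_srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def rows_to_srt(rows: list[dict]) -> str:
    """Render the segment rows of a single video as SRT text."""
    ordered = sorted(rows, key=lambda r: r["segmentIndex"])
    blocks = []
    for number, row in enumerate(ordered, start=1):  # SRT cues are 1-based
        start = to_srt_timestamp(row["segmentStart"])
        end = to_srt_timestamp(row["segmentEnd"])
        text = row["segmentText"].strip()
        blocks.append(f"{number}\n{start} --> {end}\n{text}\n")
    # A blank line separates consecutive cues.
    return "\n".join(blocks)
```

Applied to the row from the Output Example, this produces a single cue running from 00:00:00,300 to 00:00:01,160 with the text "Excuse me, miss."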

Transcription Methods

Auto (default)

Tries native Instagram captions first. If none are available, it automatically falls back to Whisper AI. This gives the best balance of speed and coverage.
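The auto behavior described above is a simple try-then-fall-back pattern. The sketch below illustrates the idea only. It is not the Actor's actual implementation, and `fetch_native` and `run_whisper` are hypothetical callables supplied by the caller.

```python
from typing import Callable, Optional

Segments = list[dict]


def transcribe_auto(
    url: str,
    fetch_native: Callable[[str], Optional[Segments]],
    run_whisper: Callable[[str], Segments],
) -> tuple[str, Segments]:
    """Return (method, segments), preferring native captions."""
    native = fetch_native(url)  # hypothetical: None or [] when no captions exist
    if native:
        return "native", native
    # No native captions found, so fall back to speech-to-text.
    return "whisper", run_whisper(url)
```

Returning the method name alongside the segments mirrors the transcriptionMethod field in the output, so consumers can tell which path produced a transcript.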

Native

Only uses Instagram's auto-generated captions. Fastest and cheapest, but may not be available for all videos (especially older posts or non-Reels content).

Whisper

Always uses the Whisper AI speech-to-text model. Works for any video with speech. Choose model size based on your accuracy/speed needs:

| Model | Size | Speed | Accuracy | Best For |
|---|---|---|---|---|
| tiny | 39 MB | Fastest | Basic | Quick previews |
| base | 74 MB | Fast | Good | Most use cases |
| small | 244 MB | Moderate | Very good | High-accuracy needs |

FAQ

Q: Does this work with Instagram Reels?
A: Yes. Reels, regular video posts, and IGTV are all supported.

Q: What if the video has no speech?
A: The scraper returns an empty transcript with an appropriate message. Music-only or silent videos produce no transcript.

Q: What languages are supported?
A: Whisper supports 99+ languages with automatic detection. You can also specify a language code to improve accuracy.

Q: How accurate is the transcription?
A: Native captions use Instagram's own speech recognition. The Whisper base model provides good accuracy for most content. Use small for better results with accented or complex speech.

Q: Can I process videos in bulk?
A: Yes. Provide multiple URLs in the videoUrls array. The scraper processes them sequentially with automatic delays between requests.
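After a bulk run, the per-row errMsg field makes it possible to separate usable rows from failed URLs, for instance to retry the failures in a second run. This is a sketch that assumes error rows still include the url field; `split_by_status` is a hypothetical helper.

```python
def split_by_status(rows: list[dict]) -> tuple[list[dict], dict[str, str]]:
    """Split rows into successful rows and a {url: errMsg} map of failures."""
    ok_rows = []
    failed: dict[str, str] = {}
    for row in rows:
        message = row.get("errMsg")
        if message:
            # Assumption: error rows still include "url". Keep the first
            # message seen for each URL.
            failed.setdefault(row["url"], message)
        else:
            ok_rows.append(row)
    return ok_rows, failed
```

The keys of the failure map can then be passed as the videoUrls array of a follow-up run.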

Notes

  • Transcript accuracy depends on audio clarity and language
  • Each transcript segment is a separate dataset row for easy export and filtering
  • The scraper uses browser automation with anti-detection measures for reliable access
  • Please ensure compliance with Instagram's terms and local regulations