TikTok & Instagram Reels Transcription — AI Captions

Pricing: Pay per usage

Transcribe TikTok videos and Instagram Reels to text via automation. Get SRT captions for accessibility, subtitles for repurposing, and text content for scheduling tools. Batch multiple URLs. No Wisprs account needed.

Rating: 0.0 (0)

Developer: Gitonga Mwaura (maintained by Community)

Actor stats: 0 bookmarked, 2 total users, 1 monthly active user. Last modified 11 hours ago.

TikTok & Instagram Reels Transcription — AI Subtitles & Captions

Transcribe TikTok videos, Instagram Reels, YouTube Shorts, and Facebook videos to text, SRT, and VTT subtitle files. 100+ languages. Self-hosted Whisper AI — no OpenAI API key required. Saves to your Apify Dataset automatically.

This Actor uses the Wisprs API, which transcribes audio from short-form social media videos using Whisper-based speech-to-text. Caption-scraping approaches fail on videos without auto-generated captions; Wisprs transcribes the actual audio, so it works on public videos whether or not captions exist. Accuracy is highest on clear audio and varies by language, accent, and recording quality.


What does this Actor do?

  1. Accepts a list of TikTok, Instagram Reel, YouTube Shorts, or Facebook video URLs
  2. Submits each URL to the Wisprs transcription API (short-form videos typically complete in 30–90 seconds)
  3. Exports the transcript in your chosen formats: TXT, SRT, VTT, JSON
  4. Optionally generates a 1–2 sentence summary or a Twitter/X thread from the transcript (repurposeMode set to summary or thread)
  5. Saves one dataset row per video, ready for bulk processing, captioning, or content analysis

How do I use this Actor to transcribe TikTok videos?

Step 1 — Run the Actor

{
  "startUrls": [
    { "url": "https://www.tiktok.com/@username/video/EXAMPLE" },
    { "url": "https://www.instagram.com/reel/EXAMPLE/" }
  ],
  "language": "auto",
  "exportFormats": ["txt", "srt", "vtt"]
}
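The same input can be built and submitted from a script. The following is a minimal sketch in Python, not the Actor's official client code. It assumes the `apify-client` package is installed and that an API token is available in an `APIFY_TOKEN` environment variable. The helper names `build_run_input` and `run_transcription` are hypothetical, and `actor_id` is a placeholder for the ID shown on this Actor's Apify Store page.

```python
import os

ALLOWED_FORMATS = {"txt", "srt", "vtt", "json"}  # export formats named in this README


def build_run_input(urls, language="auto", export_formats=("txt", "srt", "vtt")):
    """Build the Actor input shown above from a plain list of video URLs."""
    unknown = set(export_formats) - ALLOWED_FORMATS
    if unknown:
        raise ValueError(f"Unsupported export formats: {sorted(unknown)}")
    return {
        "startUrls": [{"url": u} for u in urls],
        "language": language,
        "exportFormats": list(export_formats),
    }


def run_transcription(actor_id, urls):
    """Start the Actor, wait for the run to finish, and return its dataset rows.

    Assumption: `apify-client` is installed and APIFY_TOKEN is set.
    `actor_id` is a placeholder; copy the real ID from the Store page.
    """
    from apify_client import ApifyClient  # imported lazily: third-party dependency

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    run = client.actor(actor_id).call(run_input=build_run_input(urls))
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Validating the export formats locally catches typos before a run is started and billed.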

Step 2 — Check your Dataset

Each video produces one dataset row with the full transcript and subtitle files.


What data does the Actor extract?

| Field | Description |
| --- | --- |
| url | The submitted video URL |
| jobId | Wisprs job identifier |
| transcriptionId | Transcription identifier |
| status | completed or failed |
| durationSeconds | Video duration in seconds |
| detectedLanguage | Detected language ISO 639-1 code |
| transcript_txt | Full plain-text transcript |
| transcript_srt | SRT subtitle file content |
| transcript_vtt | WebVTT subtitle file content |
| transcript_json | Word-level timestamps in JSON |
| repurposed_summary | 1–2 sentence summary (if repurposeMode=summary) |
| repurposed_thread | Twitter/X thread array (if repurposeMode=thread) |
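Because the subtitle fields hold file contents as strings, turning dataset rows into files on disk takes only a few lines. The sketch below uses the field names from the table above. The function name `save_transcripts` is hypothetical, and it assumes that `jobId` is safe to use as a file name and that formats you did not request are missing or empty in the row.

```python
from pathlib import Path

# Dataset field -> file extension, using the field names listed above.
FIELD_TO_EXT = {
    "transcript_txt": "txt",
    "transcript_srt": "srt",
    "transcript_vtt": "vtt",
}


def save_transcripts(rows, out_dir):
    """Write one file per available format for every completed row.

    Returns the list of paths written. Files are named after `jobId`
    (assumed here to be filesystem-safe).
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for row in rows:
        if row.get("status") != "completed":
            continue  # skip rows that did not complete
        for field, ext in FIELD_TO_EXT.items():
            content = row.get(field)
            if content:  # skip formats that are missing or empty
                path = out / f"{row['jobId']}.{ext}"
                path.write_text(content, encoding="utf-8")
                written.append(path)
    return written
```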

How much will it cost to transcribe 1,000 TikTok videos?

Pricing is pay-per-event at $1.00 per 1,000 transcriptions ($0.001 per video):

  • Cheaper than the closest competitor, tictechid Transcriber ($1.50 per 1,000)
  • No per-minute surcharge for short-form video (most TikToks are under 3 minutes)

Examples:

  • 1,000 TikTok videos: ~$1.00
  • 10,000 Instagram Reels: ~$10.00

The Apify free plan includes $5/month in credits, enough to transcribe 5,000 short-form videos at $0.001 each.
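The arithmetic behind these figures can be reproduced with a small estimator. This is a sketch that covers only the per-event price stated in this section; the function names are hypothetical, and `Decimal` is used to avoid floating-point rounding on currency values.

```python
from decimal import Decimal

PRICE_PER_1000 = Decimal("1.00")  # USD per 1,000 transcriptions, as stated above
FREE_CREDIT = Decimal("5.00")     # USD per month on the Apify free plan, as stated above


def estimate_cost(videos, price_per_1000=PRICE_PER_1000):
    """Return the transcription cost in USD for `videos` videos."""
    return Decimal(videos) * price_per_1000 / Decimal(1000)


def videos_covered(credit=FREE_CREDIT, price_per_1000=PRICE_PER_1000):
    """Return how many videos a given credit balance pays for."""
    return int(credit * Decimal(1000) / price_per_1000)


print(estimate_cost(1_000))   # 1.00
print(estimate_cost(10_000))  # 10.00
print(videos_covered())       # 5000
```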


Wisprs vs the competition

| Feature | Wisprs | tictechid Transcriber |
| --- | --- | --- |
| TikTok + IG Reels + YT Shorts + FB | Yes | Yes |
| Language support | 100+ | 35 |
| Self-hosted AI (no external API key) | Yes | Unspecified |
| SRT / VTT subtitle export | Yes | Text only |
| Content repurposing (summary, thread) | Yes | No |
| Price per 1,000 transcriptions | $1.00 | $1.50 |

What can I build with this?

Bulk caption generation — scrape TikTok or Instagram profiles for public videos, submit all URLs, and get SRT/VTT files ready to upload back to the platforms. Full caption coverage for a creator's entire library in one run.

Content moderation pipeline — transcribe flagged social media videos for text analysis. The transcript_txt field is ready to pipe into a content classifier or keyword filter.

Trend analysis — transcribe trending TikTok videos in a niche to extract what creators are saying. Identify recurring phrases, topics, and hooks at scale.
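As an illustration of the trend-analysis idea, recurring phrases can be counted directly from the `transcript_txt` values. The sketch below is one simple approach (word n-grams with document frequency), not a feature of the Actor; the function name `top_phrases` is hypothetical.

```python
import re
from collections import Counter


def top_phrases(transcripts, n=2, top=10):
    """Return the most common n-word phrases across transcripts.

    Each phrase is counted at most once per transcript, so a single
    repetitive video cannot dominate the ranking.
    """
    counts = Counter()
    for text in transcripts:
        words = re.findall(r"[\w']+", text.lower())
        phrases = {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}
        counts.update(phrases)
    return counts.most_common(top)
```

Pass in the `transcript_txt` field of each completed row, for example `top_phrases([r["transcript_txt"] for r in rows], n=3)`.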

Multilingual caption localization — transcribe a video in its original language (with word-level timestamps via exportFormats: ["json"]), pass the JSON to a translation API, and re-align the translated text to the original timestamps. The timing structure carries through the pipeline.
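The localization pipeline can be sketched as follows. The schema of `transcript_json` is not documented here, so this example assumes a list of objects with `word`, `start`, and `end` keys, with times in seconds; check a real dataset row before relying on it. The `translate` argument is a hypothetical stand-in for whatever translation API you call, and translating per cue (rather than per word) is a simplification that keeps the timing of each cue intact.

```python
def format_timestamp(seconds):
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def group_into_cues(words, max_words=7):
    """Group word-level entries into cues of at most `max_words` words."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        cues.append({
            "start": chunk[0]["start"],  # cue starts with its first word
            "end": chunk[-1]["end"],     # and ends with its last word
            "text": " ".join(w["word"] for w in chunk),
        })
    return cues


def localized_srt(words, translate, max_words=7):
    """Translate each cue's text while keeping the original cue timing."""
    blocks = []
    for index, cue in enumerate(group_into_cues(words, max_words), start=1):
        start = format_timestamp(cue["start"])
        end = format_timestamp(cue["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{translate(cue['text'])}\n")
    return "\n".join(blocks)
```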

Creator research tool — transcribe competitor content and run keyword analysis to understand their messaging strategy. Identify content gaps and talking points your audience responds to.

Social listening — transcribe public videos mentioning your brand, product, or keywords. Extract what customers are saying in video format that standard social listening tools miss.


Supported platforms

| Platform | URL format |
| --- | --- |
| TikTok | tiktok.com/@username/video/ID |
| Instagram Reels | instagram.com/reel/ID/ |
| YouTube Shorts | youtube.com/shorts/ID |
| Facebook Reels/Watch | facebook.com/watch?v=ID |
| YouTube (standard) | youtube.com/watch?v=ID |

Public posts only. Private, followers-only, or age-restricted content cannot be transcribed.
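Before submitting a batch, it can help to check each URL against the formats in the table above. The sketch below is a client-side pre-check only and is an assumption about what counts as valid: it requires a scheme such as https://, and the Actor itself may accept other variants (mobile hosts, short links) that this check rejects. The name `detect_platform` and the platform labels are hypothetical.

```python
import re
from urllib.parse import parse_qs, urlparse

# (label, domain, path pattern, requires a ?v= query parameter)
PLATFORM_RULES = [
    ("tiktok", "tiktok.com", re.compile(r"^/@[^/]+/video/[^/]+"), False),
    ("instagram_reel", "instagram.com", re.compile(r"^/reel/[^/]+"), False),
    ("youtube_shorts", "youtube.com", re.compile(r"^/shorts/[^/]+"), False),
    ("youtube", "youtube.com", re.compile(r"^/watch/?$"), True),
    ("facebook", "facebook.com", re.compile(r"^/watch/?$"), True),
]


def detect_platform(url):
    """Return a platform label for a recognized URL, or None otherwise."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    has_video_id = bool(parse_qs(parsed.query).get("v"))
    for label, domain, path_re, needs_v in PLATFORM_RULES:
        if host != domain or not path_re.match(parsed.path):
            continue
        if needs_v and not has_video_id:
            continue  # watch-style URLs carry the video ID in ?v=
        return label
    return None
```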


Language support

100+ languages with automatic detection. The detected language appears as detectedLanguage in each dataset row. Notable languages: English, Spanish, Portuguese, Hindi, Indonesian, Arabic, French, German, Japanese, Korean, Mandarin, and 90+ more.

Short-form video presents unique transcription challenges (background music, cuts, fast speech). Accuracy is excellent on speech-forward content; results vary on heavily music-backed or ASMR content.



FAQ

Does this require an OpenAI API key or a Wisprs account?
No. No external API key or account is required. Wisprs runs Whisper on its own infrastructure and handles all transcription automatically; you pay only via Apify credits.

Does it work for videos without auto-generated captions?
Yes. Wisprs transcribes the audio directly and does not rely on platform-generated captions.

What about private or age-restricted videos?
Private, followers-only, and age-restricted content cannot be downloaded. The dataset row will have status: "failed" and an errorMessage.

What's the maximum video length?
This Actor is optimized for short-form content (under 10 minutes). For long-form video and podcasts, use the Wisprs Audio & Video Transcription Actor.

Can I process a full TikTok profile at once?
Yes. Pair this Actor with a TikTok scraper to extract all public video URLs from a profile, then pass them into this Actor.
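When a batch mixes public and restricted videos, the `status` and `errorMessage` fields make it easy to separate results from failures. A minimal sketch, with hypothetical function names, assuming each failed row carries the `url` and `errorMessage` fields described above:

```python
def split_by_status(rows):
    """Separate completed rows from failed ones."""
    completed = [r for r in rows if r.get("status") == "completed"]
    failed = [r for r in rows if r.get("status") == "failed"]
    return completed, failed


def failure_report(rows):
    """Map each failed URL to its errorMessage, for logging or retries."""
    _, failed = split_by_status(rows)
    return {r["url"]: r.get("errorMessage", "unknown error") for r in failed}
```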


Support


100+ languages. $1.00 per 1,000 videos. No account or API key required.