Snapchat Transcript Scraper

Pricing

from $3.00 / 1,000 results

Extract transcripts from Snapchat Spotlight videos. Uses native WebVTT captions when available, with Whisper AI speech recognition as fallback. No login or cookies required.

Developer

Crawler Bros

Maintained by Community

Extract transcripts from any Snapchat Spotlight video — no login, no API key, no browser required. Automatically retrieves native WebVTT captions when Snapchat provides them, and falls back to Whisper AI speech recognition for videos without captions.

Perfect for content repurposing, accessibility workflows, market research, brand monitoring, and building subtitle databases from Snapchat Spotlight content.


What You Get

  • Native WebVTT transcript segments from Snapchat's embedded caption system (fast, zero AI cost)
  • Whisper AI fallback using OpenAI's speech recognition model when no native captions exist
  • Timestamps for every segment (startTime, endTime, text), written as WebVTT-style HH:MM:SS.mmm values
  • Full concatenated transcript as a single string (fullText)
  • Video metadata: creator username, caption, duration, upload date, thumbnail URL
  • Language detection with confidence score (Whisper mode)
  • Works on public Spotlight videos — no Snapchat account required

Input

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| spotlightUrls | string[] | Yes | | One or more Snapchat Spotlight URLs. Supports snapchat.com/spotlight/..., profile-scoped snapchat.com/@user/spotlight/..., and short links t.snapchat.com/... |
| transcriptionMethod | select | No | auto | auto: native WebVTT first, Whisper fallback; native: native captions only (fast, no AI); whisper: always use Whisper AI |
| whisperModel | select | No | base | Whisper model size: tiny (~30s/video), base (~60s/video, recommended), small (~120s/video, best for non-English) |
| language | string | No | | Optional ISO 639-1 language hint for Whisper (e.g. en, ur, es, ar). Leave empty for auto-detection. Only applies when Whisper is used |
| proxyConfiguration | object | No | | Optional Apify proxy configuration |
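
As an illustration of the input fields above, the sketch below assembles and sanity-checks an input object before a run is started. The helper name `build_run_input` and the host allow-list are assumptions made for this example; only the field names and allowed values come from the table.

```python
from urllib.parse import urlparse

# Allowed values copied from the input table.
ALLOWED_METHODS = {"auto", "native", "whisper"}
ALLOWED_MODELS = {"tiny", "base", "small"}
# Assumption: hosts inferred from the URL forms listed in the table.
ALLOWED_HOSTS = {"snapchat.com", "www.snapchat.com", "t.snapchat.com"}


def build_run_input(urls, method="auto", model="base", language=None, proxy=None):
    """Return a dict shaped like the actor input (hypothetical helper)."""
    if not urls:
        raise ValueError("spotlightUrls needs at least one URL")
    for url in urls:
        if urlparse(url).netloc.lower() not in ALLOWED_HOSTS:
            raise ValueError(f"not a Snapchat URL: {url}")
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unknown transcriptionMethod: {method}")
    if model not in ALLOWED_MODELS:
        raise ValueError(f"unknown whisperModel: {model}")

    run_input = {
        "spotlightUrls": list(urls),
        "transcriptionMethod": method,
        "whisperModel": model,
    }
    # Optional fields are left out entirely when not set.
    if language:
        run_input["language"] = language
    if proxy is not None:
        run_input["proxyConfiguration"] = proxy
    return run_input
```

The resulting dict could then be passed as the run input through the Apify console, API, or a client library; the actor ID is not stated on this page, so it is left out here.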

Output

Each input URL produces one dataset record. Fields are omitted when not available.

| Field | Type | Description |
| --- | --- | --- |
| snapId | string | Spotlight snap ID |
| inputUrl | string | Original URL supplied as input |
| videoUrl | string | Direct CDN video URL (time-limited) |
| thumbnailUrl | string | Video thumbnail URL |
| username | string | Creator Snapchat username |
| caption | string | Video caption or description |
| durationSeconds | number | Video duration in seconds |
| uploadedAt | string | Upload timestamp (ISO 8601 UTC) |
| transcriptSource | string | "native" or "whisper" |
| transcriptAvailable | boolean | Whether a usable transcript was extracted |
| transcriptUrl | string | URL of the original WebVTT file (native only) |
| language | string | Language code (e.g. "en", "ur") |
| languageProbability | number | Whisper language confidence 0–1 (Whisper only) |
| whisperModel | string | Model used: "tiny", "base", or "small" (Whisper only) |
| segments | array | Timed segments: [{startTime, endTime, text}] |
| segmentCount | integer | Number of transcript segments |
| fullText | string | Full transcript as a single string |
| scrapedAt | string | Scrape timestamp (ISO 8601 UTC) |
| error | string | Error message if the URL failed |
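
Because fields are omitted when not available, downstream code is safer reading records with `.get()` than with direct indexing. The sketch below separates usable transcripts from records that failed or came back empty; `split_records` and the fallback message are names invented for this example.

```python
def split_records(records):
    """Split dataset records into usable transcripts and (url, reason) problems."""
    usable = []
    problems = []
    for record in records:
        if record.get("transcriptAvailable"):
            usable.append(record)
        else:
            # "error" may be absent, so fall back to a generic reason.
            reason = record.get("error", "no transcript available")
            problems.append((record.get("inputUrl", "?"), reason))
    return usable, problems
```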

Example Output

{
  "snapId": "W7_EDlXWTBiXAEEniNoMPwAAYaXFxdmVobmp6AZ499vl-AZ499rsXAAAAAQ",
  "inputUrl": "https://www.snapchat.com/spotlight/W7_EDlXWTBiXAEEniNoMPwAAYaXFxdmVobmp6AZ499vl-AZ499rsXAAAAAQ",
  "username": "brentrivera",
  "durationSeconds": 60.0,
  "transcriptSource": "native",
  "transcriptAvailable": true,
  "language": "en",
  "segmentCount": 8,
  "fullText": "Baby, I'm going to need 10 more minutes. Baby, it's been an hour...",
  "segments": [
    {"startTime": "00:00:00.000", "endTime": "00:00:02.500", "text": "Baby, I'm going to need 10 more minutes."},
    {"startTime": "00:00:02.500", "endTime": "00:00:04.000", "text": "Baby, it's been an hour."}
  ],
  "videoUrl": "https://cf-st.sc-cdn.net/...",
  "thumbnailUrl": "https://bolt-gcdn.sc-cdn.net/...",
  "uploadedAt": "2025-10-01T14:22:00+00:00",
  "scrapedAt": "2026-06-28T09:45:00+00:00"
}
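
For subtitle workflows, a record like the one above can be turned back into a WebVTT file. This is a minimal sketch that assumes every segment carries HH:MM:SS.mmm timestamps as in the example; `record_to_vtt` and `to_seconds` are helper names chosen for illustration.

```python
def to_seconds(timestamp):
    """Convert an HH:MM:SS.mmm string into seconds as a float."""
    hours, minutes, seconds = timestamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def record_to_vtt(record):
    """Render a record's segments as WebVTT text (header plus one cue per segment)."""
    lines = ["WEBVTT", ""]
    for segment in record.get("segments", []):
        lines.append(f"{segment['startTime']} --> {segment['endTime']}")
        lines.append(segment["text"])
        lines.append("")  # blank line terminates the cue
    return "\n".join(lines)
```

Writing the returned string to a `.vtt` file should give something most players accept as a caption track.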

How Transcription Works

Native (fast, free)

Snapchat embeds WebVTT captions in the __NEXT_DATA__ JSON of most Spotlight pages. The scraper extracts them directly — no audio download, no AI processing, typically < 5 seconds per video.
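
The native path can be approximated as follows. This is a sketch, not the actor's actual code: the location of captions inside `__NEXT_DATA__` is not documented here, so the example walks the whole JSON tree looking for WebVTT text or `.vtt` URLs (an assumption), and the cue parser only handles timestamps that include the hours field.

```python
import json
import re

NEXT_DATA_RE = re.compile(
    r'<script[^>]*id="__NEXT_DATA__"[^>]*>(.*?)</script>', re.DOTALL
)
# Simplification: expects HH:MM:SS.mmm on both sides of the arrow.
CUE_RE = re.compile(r"(\d{2}:\d{2}:\d{2}\.\d{3})\s+-->\s+(\d{2}:\d{2}:\d{2}\.\d{3})")


def extract_next_data(html):
    """Return the parsed __NEXT_DATA__ JSON, or None if the tag is missing."""
    match = NEXT_DATA_RE.search(html)
    return json.loads(match.group(1)) if match else None


def find_vtt_strings(node):
    """Yield strings in the JSON tree that look like WebVTT text or a .vtt URL."""
    if isinstance(node, dict):
        for value in node.values():
            yield from find_vtt_strings(value)
    elif isinstance(node, list):
        for value in node:
            yield from find_vtt_strings(value)
    elif isinstance(node, str):
        if node.lstrip().startswith("WEBVTT") or ".vtt" in node:
            yield node


def parse_vtt(vtt_text):
    """Parse WebVTT text into [{startTime, endTime, text}] segments."""
    segments = []
    normalized = vtt_text.replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        for index, line in enumerate(lines):
            match = CUE_RE.search(line)
            if match:
                # Everything after the timing line is the cue text.
                text = " ".join(part.strip() for part in lines[index + 1:] if part.strip())
                segments.append(
                    {"startTime": match.group(1), "endTime": match.group(2), "text": text}
                )
                break
    return segments
```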

Whisper AI (universal fallback)

When no native captions exist, the scraper:

  1. Downloads the video from Snapchat's CDN
  2. Extracts the audio track with ffmpeg (16 kHz mono MP3)
  3. Runs the audio through faster-whisper (CTranslate2 backend, CPU, int8)
  4. Returns timestamped segments with language detection
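
Steps 2 to 4 can be sketched roughly as below, assuming the video is already on disk (step 1 is omitted) and that `ffmpeg` and the `faster-whisper` package are installed. The function names and the returned dict layout are illustrative; the actor's real implementation may differ.

```python
import subprocess


def ffmpeg_audio_command(video_path, audio_path):
    """Build the ffmpeg call for step 2: drop video, mono, 16 kHz, output by extension."""
    return ["ffmpeg", "-y", "-i", video_path, "-vn", "-ac", "1", "-ar", "16000", audio_path]


def format_timestamp(seconds):
    """Render seconds as HH:MM:SS.mmm, matching the segment format in the output."""
    millis = round(seconds * 1000)
    hours, rest = divmod(millis, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def transcribe(video_path, audio_path="audio.mp3", model_size="base", language=None):
    """Extract audio, run faster-whisper on CPU with int8, return timed segments."""
    # Imported lazily so the helpers above work without faster-whisper installed.
    from faster_whisper import WhisperModel

    subprocess.run(ffmpeg_audio_command(video_path, audio_path), check=True)
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    # language=None lets the model detect the language itself.
    segments, info = model.transcribe(audio_path, language=language)
    return {
        "language": info.language,
        "languageProbability": info.language_probability,
        "segments": [
            {
                "startTime": format_timestamp(segment.start),
                "endTime": format_timestamp(segment.end),
                "text": segment.text.strip(),
            }
            for segment in segments
        ],
    }
```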

Whisper hallucination detection automatically discards output where language confidence is below 40% or the output is repetitive — the record will have transcriptAvailable: false with an explanatory error field.
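
A check along these lines could look like the sketch below. The 0.4 confidence threshold comes from the description above; the repetition heuristic (share of duplicate segments, with a minimum of four segments) is an assumption, since the actual rule is not specified.

```python
def looks_hallucinated(segments, language_probability,
                       min_probability=0.4, max_repeat_ratio=0.5):
    """Return a reason string if the transcript should be discarded, else None."""
    if language_probability < min_probability:
        return f"language confidence {language_probability:.2f} below {min_probability:.2f}"

    texts = [s["text"].strip().lower() for s in segments if s["text"].strip()]
    if not texts:
        return "no speech detected"

    # Assumed heuristic: fraction of segments that duplicate another segment.
    repeat_ratio = 1 - len(set(texts)) / len(texts)
    if len(texts) >= 4 and repeat_ratio > max_repeat_ratio:
        return f"repetitive output ({repeat_ratio:.0%} duplicate segments)"
    return None
```

A non-None result would map to a record with transcriptAvailable set to false and the reason stored in error.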

Model Comparison

| Model | Speed | Quality | Best For |
| --- | --- | --- | --- |
| tiny | ~30s/video | Lower | Quick drafts, high-volume batch |
| base | ~60s/video | Good | General use (recommended) |
| small | ~120s/video | Better | Non-English, accented speech |
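
The per-video figures in the table allow a rough batch estimate. The sketch below assumes videos are processed one after another and that a known fraction is served by native captions (treated here as taking negligible time); `native_fraction` is a hypothetical parameter, not an actor setting.

```python
# Approximate per-video Whisper times from the comparison table, in seconds.
SECONDS_PER_VIDEO = {"tiny": 30, "base": 60, "small": 120}


def estimate_whisper_minutes(video_count, model="base", native_fraction=0.0):
    """Estimate total Whisper processing time in minutes for a sequential batch."""
    whisper_videos = video_count * (1 - native_fraction)
    return whisper_videos * SECONDS_PER_VIDEO[model] / 60
```

For example, 100 videos on the base model with no native captions come to about 100 minutes, while the same batch on small with half the videos captioned natively also lands near 100 minutes.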

FAQ

Does this require a Snapchat account or login? No. All data is extracted from public Spotlight pages without authentication.

Will every video have a transcript? Most popular Spotlight videos have native WebVTT captions. For videos without them, Whisper AI can transcribe the audio. Videos that are silent, mostly music, or very short may return transcriptAvailable: false.

How accurate are the transcripts? Native transcripts are Snapchat's own captions and are very accurate for English speech. The Whisper base model reaches roughly 95% word accuracy (about a 5% word error rate) on clear English audio. Use small for non-English or heavily accented content.

Can I get transcripts in other languages? Yes. Native transcripts are language-specific (Snapchat generates them per language). For Whisper, set language to the ISO 639-1 code (e.g. ur for Urdu, es for Spanish) for better accuracy, or leave it empty for auto-detection.

How do I handle rate limits? The scraper retries automatically on 429 and 5xx responses with exponential backoff. For large batches, Apify proxy is recommended.
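
The retry behaviour described above follows a common pattern, sketched here in generic form. The attempt count, base delay, and added jitter are assumptions for illustration, and `fetch` stands in for whatever HTTP call returns a `(status, body)` pair.

```python
import random
import time


def should_retry(status):
    """Retry on 429 (rate limited) and any 5xx server error."""
    return status == 429 or 500 <= status <= 599


def fetch_with_retry(fetch, url, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url) -> (status, body), backing off exponentially between retries."""
    for attempt in range(max_attempts):
        status, body = fetch(url)
        if not should_retry(status):
            return status, body
        if attempt < max_attempts - 1:
            # Delay doubles each attempt; jitter spreads out simultaneous clients.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    # Out of attempts: hand back the last retryable response.
    return status, body
```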

What video formats does Whisper support? The scraper downloads the native Snapchat CDN video (typically H.264 MP4) and converts it to 16 kHz mono MP3 before transcription. Any video with an audio track will work.

How long does Whisper transcription take? Roughly 0.5–2× the video duration on CPU (the base model on a typical Snapchat Spotlight video of 15–60 seconds takes 15–60 seconds of processing time). Cloud actors run on 2–4 vCPU machines.


Other Snapchat Scrapers

Explore the full Snapchat scraper suite on Apify: