Twitter / X Video Transcript Scraper

Pricing

from $3.00 / 1,000 results

Developer

Crawler Bros

Maintained by Community

Extract full, timestamped transcripts from Twitter/X video posts. The actor uses native Twitter captions (WebVTT) when they are available and falls back to Whisper AI speech-to-text for uncaptioned videos.

Features

  • Native captions first — intercepts Twitter's built-in WebVTT subtitle tracks for fastest, most accurate results
  • Whisper AI fallback — uses faster-whisper to transcribe audio when no native captions are available
  • Timestamped segments — every output row includes startTime, endTime, and text for precise video navigation
  • Full transcript — each row also carries the complete joined transcript for easy search
  • Flexible method control — choose auto (native → Whisper), native only, or Whisper only
  • Multi-language support — native captions in any language; optional language hint for Whisper
  • Anti-detection — Playwright Firefox with stealth fingerprinting, randomised viewports/user-agents, and human-like delays
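To illustrate how WebVTT cues map onto the startTime, endTime, and text fields, here is a minimal parsing sketch. It is not the actor's implementation: the function name `parse_webvtt` and the cue timings in the usage note are made up for illustration, and the parser ignores cue settings, styling, and NOTE blocks.

```python
import re

# Matches a WebVTT cue timing line such as "00:00:03.440 --> 00:00:07.000".
# The hours component is optional in WebVTT.
CUE_TIMING = re.compile(
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})"
)


def _to_seconds(hours, minutes, seconds, millis):
    """Convert the captured timestamp parts into seconds as a float."""
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_webvtt(vtt_text):
    """Turn WebVTT text into a list of {startTime, endTime, text} segments."""
    segments = []
    # Cues (and the "WEBVTT" header) are separated by blank lines.
    for block in re.split(r"\n\s*\n", vtt_text.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = CUE_TIMING.search(line)
            if not match:
                continue  # header line or optional cue identifier
            parts = match.groups()
            # Everything after the timing line is the cue payload.
            text = " ".join(l.strip() for l in lines[i + 1:] if l.strip())
            if text:
                segments.append({
                    "startTime": _to_seconds(*parts[:4]),
                    "endTime": _to_seconds(*parts[4:]),
                    "text": text,
                })
            break
    return segments
```

Calling `parse_webvtt` on a file with two cues returns two dictionaries in cue order; multi-line cue text is joined with single spaces.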

Input

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| postUrls | string[] | Yes | Twitter/X video post URLs (twitter.com or x.com both accepted) |
| cookies | string | Yes | Twitter/X session cookies JSON (auth_token + ct0 required) |
| transcriptionMethod | select | No | auto (default), native, or whisper |
| whisperModel | select | No | tiny, base (default), small, medium, large-v2 |
| language | string | No | ISO 639-1 hint for Whisper (e.g. en, es, fr) |
| proxyConfiguration | object | No | Apify proxy settings |
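The input fields above can be assembled and sanity-checked before a run. The sketch below is only an illustration: `build_input` is a hypothetical helper, the `www.` host variants are an assumption beyond what the field description states, and proxyConfiguration is left out because its structure is not described here.

```python
import json
from urllib.parse import urlparse

# Assumption: "www." variants are accepted alongside the bare hosts.
ALLOWED_HOSTS = {"twitter.com", "www.twitter.com", "x.com", "www.x.com"}
METHODS = {"auto", "native", "whisper"}
MODELS = {"tiny", "base", "small", "medium", "large-v2"}


def build_input(post_urls, cookies_json, method="auto", model="base", language=None):
    """Validate the values and return a dict shaped like the actor input."""
    for url in post_urls:
        host = (urlparse(url).hostname or "").lower()
        if host not in ALLOWED_HOSTS:
            raise ValueError(f"Not a twitter.com or x.com URL: {url}")
    if method not in METHODS:
        raise ValueError(f"Unknown transcriptionMethod: {method}")
    if model not in MODELS:
        raise ValueError(f"Unknown whisperModel: {model}")
    # The cookies field is a string holding JSON; json.loads raises a
    # ValueError subclass if the string is not valid JSON.
    json.loads(cookies_json)

    actor_input = {
        "postUrls": list(post_urls),
        "cookies": cookies_json,
        "transcriptionMethod": method,
        "whisperModel": model,
    }
    if language:
        actor_input["language"] = language  # ISO 639-1 hint for Whisper
    return actor_input
```

The returned dictionary can be serialised with `json.dumps` and pasted into the actor's JSON input editor.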

How to get Twitter cookies

  1. Log in to x.com in your browser
  2. Open DevTools → Application → Cookies → https://x.com
  3. Copy the auth_token and ct0 cookie values
  4. Export all cookies as JSON (e.g. using the EditThisCookie browser extension)
  5. Paste the JSON array into the cookies input field

Cookies expire periodically — re-export if you see expired_cookies errors.
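Before pasting an export, it can help to confirm that both required cookies are present. This sketch assumes the export is a JSON array of objects with `name` and `value` keys, which is the shape EditThisCookie produces; `missing_cookies` is a hypothetical helper, not part of the actor.

```python
import json

REQUIRED_COOKIES = ("auth_token", "ct0")


def missing_cookies(cookies_json):
    """Return the required cookie names that are absent or empty in the export."""
    cookies = json.loads(cookies_json)
    present = {
        cookie.get("name")
        for cookie in cookies
        if isinstance(cookie, dict) and cookie.get("value")
    }
    return [name for name in REQUIRED_COOKIES if name not in present]
```

An empty list means both auth_token and ct0 were found with non-empty values. The check says nothing about whether the session is still valid.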

Output

Each dataset row represents one transcript segment. Tweet metadata is repeated on every row for easy filtering.

| Field | Type | Description |
| --- | --- | --- |
| tweetUrl | string | Canonical x.com/…/status/… URL |
| tweetId | string | Numeric tweet ID |
| authorUsername | string | Twitter handle (without @) |
| authorName | string | Display name |
| tweetText | string | Tweet caption / body text |
| publishedAt | string | ISO 8601 publish timestamp |
| language | string | ISO 639-1 language code |
| transcriptMethod | string | native or whisper |
| transcriptAvailable | boolean | false for tweets with no extractable transcript |
| segmentIndex | integer | 0-based position within the transcript |
| startTime | float | Segment start time in seconds |
| endTime | float | Segment end time in seconds |
| text | string | Segment transcript text |
| fullTranscript | string | All segments joined into one string |
| scrapedAt | string | ISO 8601 scrape timestamp |

Sample output record

{
  "tweetUrl": "https://x.com/NASA/status/1858131747319566780",
  "tweetId": "1858131747319566780",
  "authorUsername": "NASA",
  "authorName": "NASA",
  "tweetText": "Watch our latest discovery announcement…",
  "publishedAt": "2024-11-17T18:30:00.000Z",
  "language": "en",
  "transcriptMethod": "native",
  "transcriptAvailable": true,
  "segmentIndex": 0,
  "startTime": 0.0,
  "endTime": 3.44,
  "text": "We made a remarkable discovery this week",
  "fullTranscript": "We made a remarkable discovery this week that changes our understanding of the solar system.",
  "scrapedAt": "2025-01-15T10:22:33.456Z"
}
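Because each row is one segment, a common first step is to regroup rows by tweet and put the segments back in order. The sketch below assumes the dataset has already been loaded as a list of dictionaries with the fields documented above; `transcripts_by_tweet` and `format_timestamp` are hypothetical helper names.

```python
from collections import defaultdict


def transcripts_by_tweet(rows):
    """Group segment rows by tweetId and order them by segmentIndex."""
    grouped = defaultdict(list)
    for row in rows:
        # Rows with transcriptAvailable false carry no segment fields.
        if row.get("transcriptAvailable"):
            grouped[row["tweetId"]].append(row)
    result = {}
    for tweet_id, segments in grouped.items():
        segments.sort(key=lambda r: r["segmentIndex"])
        result[tweet_id] = [(s["startTime"], s["endTime"], s["text"]) for s in segments]
    return result


def format_timestamp(seconds):
    """Render a time in seconds as MM:SS for display next to a segment."""
    total = int(seconds)
    return f"{total // 60:02d}:{total % 60:02d}"
```

Each value in the returned dictionary is a list of (startTime, endTime, text) tuples, which is convenient for building subtitle files or jump links into the video.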

Transcription Methods

| Method | When to use | Speed | Accuracy |
| --- | --- | --- | --- |
| auto | Default: tries native first, falls back to Whisper | Fast when native available | High |
| native | Only want videos with Twitter captions | Fastest | Highest (verbatim) |
| whisper | All videos, including those without captions | Slower | High (model-dependent) |
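The three methods differ only in which transcription source is tried and in what order. The following sketch shows that control flow under stated assumptions: `fetch_native` and `run_whisper` are hypothetical callables supplied by the caller, and the value of transcriptMethod when nothing is extracted is a guess, since it is not documented here.

```python
def transcribe(video, method, fetch_native, run_whisper):
    """Pick a transcription source according to the requested method.

    fetch_native and run_whisper are caller-supplied functions that take the
    video and return a list of segments (an empty list if none were found).
    """
    if method not in ("auto", "native", "whisper"):
        raise ValueError(f"Unknown method: {method}")

    if method in ("auto", "native"):
        segments = fetch_native(video)
        if segments:
            return {"transcriptMethod": "native", "transcriptAvailable": True, "segments": segments}
        if method == "native":
            # Native-only run without captions: report that nothing is available.
            # Assumption: transcriptMethod is left empty in this case.
            return {"transcriptMethod": None, "transcriptAvailable": False, "segments": []}

    # Reached for method "whisper", or for "auto" when no native captions exist.
    segments = run_whisper(video)
    return {"transcriptMethod": "whisper", "transcriptAvailable": bool(segments), "segments": segments}
```

With auto, Whisper only runs when the native lookup comes back empty, which is why auto is fast for captioned videos.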

Whisper Model Selection

| Model | Size | Speed | Use case |
| --- | --- | --- | --- |
| tiny | 32 MB | Fastest | Quick drafts, high-volume runs |
| base | 74 MB | Fast | Default; good balance |
| small | 244 MB | Medium | Better accuracy for accented speech |
| medium | 769 MB | Slow | High accuracy |
| large-v2 | 1550 MB | Slowest | Best quality, multiple languages |

Memory Requirements for Long Videos (Whisper)

The actor automatically splits long audio into 10-minute chunks, so there is no video length limit. However, Whisper keeps the model and current chunk in RAM simultaneously:

| Video length | Recommended memory |
| --- | --- |
| Up to ~30 minutes | 2048 MB (default) |
| 30 min – 2 hours | 4096 MB |
| 2 hours+ | 8192 MB |

To set memory in the Apify UI: open your actor run → Input → Options → Memory. Native-caption runs have no meaningful memory requirement regardless of video length.
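The chunking and memory guidance can be expressed as two small helpers. This is a sketch rather than the actor's code: the function names are hypothetical, and treating exactly 30 minutes and exactly 2 hours as belonging to the lower tier is an assumption, since the table boundaries are approximate.

```python
CHUNK_SECONDS = 10 * 60  # the actor splits long audio into 10-minute chunks


def chunk_ranges(duration_seconds, chunk_seconds=CHUNK_SECONDS):
    """Return (start, end) pairs in seconds that cover the whole audio."""
    ranges = []
    start = 0.0
    while start < duration_seconds:
        end = min(start + chunk_seconds, duration_seconds)
        ranges.append((start, end))
        start = end
    return ranges


def recommended_memory_mb(duration_seconds):
    """Map video length to the recommended memory tier for Whisper runs."""
    minutes = duration_seconds / 60
    if minutes <= 30:
        return 2048
    if minutes <= 120:
        return 4096
    return 8192
```

For a 25-minute video, `chunk_ranges(1500)` yields three ranges (two full 10-minute chunks and a final 5-minute chunk), and `recommended_memory_mb(1500)` returns 2048.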

Limitations

  • Cookies required — Twitter restricts video access to authenticated sessions
  • Native captions availability — Not all Twitter videos have auto-generated captions; use whisper method for full coverage
  • Rate limits — Twitter may throttle rapid scraping; the actor applies human-like delays between requests
  • Proxy recommended — For high-volume runs, use Apify residential proxy to avoid IP bans

FAQ

Q: Why do I need cookies? Twitter requires authentication to serve video pages and caption tracks. Without cookies the actor cannot access video content.

Q: What if a video has no captions and I set transcriptionMethod to native? The actor outputs a single row for that tweet with transcriptAvailable: false and no segment fields. Set transcriptionMethod to auto or whisper to transcribe those videos with Whisper AI.

Q: Can I scrape multiple videos at once? Yes — add multiple URLs to postUrls. The actor processes them sequentially with delays to avoid rate limiting.

Q: Does this work with Twitter Spaces audio? No — Twitter Spaces use a different streaming format. This actor targets video posts only.

Q: How do I filter by language? All output rows include a language field. Use Apify's dataset filtering to select rows by language code.
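The same filtering can be done locally once the dataset has been exported. The sketch below assumes the rows are already loaded as a list of dictionaries (for example from a JSON export); both helper names are hypothetical.

```python
def rows_in_language(rows, language_code):
    """Keep only rows whose language field equals the given ISO 639-1 code."""
    return [row for row in rows if row.get("language") == language_code]


def urls_without_transcript(rows):
    """List tweet URLs that produced no transcript, in first-seen order."""
    urls = []
    for row in rows:
        if not row.get("transcriptAvailable") and row["tweetUrl"] not in urls:
            urls.append(row["tweetUrl"])
    return urls
```

The second helper is useful after a native-only run: the returned URLs can be fed into a follow-up run with transcriptionMethod set to whisper.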