Twitter Video Transcript API – AI Video to Text for X (Twitter)
Pricing
from $1.00 / 1,000 results
Twitter Video Transcript API converts video audio into accurate text using AI. Extract transcripts, spoken content, and metadata from X (Twitter) videos, tweets, and threads. Fast, reliable, and built for developers, AI agents, and automation workflows.
Developer
APISmith
What does Twitter Video Transcript API do?
Twitter Video Transcript API uses advanced AI to transcribe spoken audio from X (Twitter) videos into accurate text. Provide a video URL, get back:
- 📝 AI-generated transcript — word-by-word transcription with precise timestamps
- ⏱️ Segmented timing — time ranges for each speech segment
- 📊 Full metadata — tweet content, engagement stats, author info, video details
- ⚡ Instant results — transcription completes in seconds
Perfect for AI training data collection, content analysis pipelines, social listening, and automation workflows that need reliable access to Twitter video transcripts.
What can this API extract?
| Field | Type | Description |
|---|---|---|
| url | string | Original X (Twitter) post URL |
| title | string | Tweet text/content |
| createTime | number | Unix timestamp of tweet creation |
| img | string | Thumbnail image URL |
| videoUrl | string | Direct MP4 video download link |
| bookmarkCount | number | Number of bookmarks |
| favoriteCount | number | Number of likes |
| quoteCount | number | Number of quotes |
| replyCount | number | Number of replies |
| retweetCount | number | Number of retweets |
| nickname | string | Author's display name |
| screenName | string | Author's username |
| avatarUri | string | Author's profile picture URL |
| authorDesc | string | Author's bio/description |
| authorFavouritesCount | number | Author's total likes |
| authorFollowersCount | number | Author's follower count |
| authorFriendsCount | number | Author's following count |
| authorListedCount | number | Author's list count |
| authorMediaCount | number | Author's media count |
| authorStatusesCount | number | Author's tweet count |
| duration | number | Video duration in seconds |
| text | string | Complete AI transcript text |
| segments | array | Transcript segments with timestamps |
| timestamp | string | ISO 8601 timestamp of extraction |
| errMsg | string | Error message (if any) |
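As a rough illustration of how these fields might be consumed, the sketch below maps a result item onto a small typed record and converts `createTime` from a Unix timestamp to a UTC datetime. The names `TranscriptResult` and `parse_result` are hypothetical helpers, not part of the API, and only a subset of the fields is covered.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class TranscriptResult:
    """Hypothetical typed view over a subset of the output fields."""
    url: str
    screen_name: str
    created_at: datetime
    duration: float
    text: str
    segments: list = field(default_factory=list)
    err_msg: str = ""


def parse_result(item: dict) -> TranscriptResult:
    """Build a TranscriptResult from one raw result item (a plain dict)."""
    return TranscriptResult(
        url=item.get("url", ""),
        screen_name=item.get("screenName", ""),
        # createTime is documented as a Unix timestamp (seconds).
        created_at=datetime.fromtimestamp(item.get("createTime", 0), tz=timezone.utc),
        duration=float(item.get("duration", 0)),
        text=item.get("text", ""),
        segments=item.get("segments") or [],
        err_msg=item.get("errMsg", ""),
    )
```

Missing keys fall back to empty values here, so a result that carries only an `errMsg` still parses.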
Our Video Downloader & Transcript Suite
Need to extract videos or transcripts from other platforms? Check out our complete collection:
| Platform | Video Downloader | Transcript Extractor |
|---|---|---|
| TikTok | TikTok Video Downloader | TikTok Transcript Extractor |
| Instagram | Instagram Video Scraper API | Instagram Transcript Extractor |
| X (Twitter) | X Video Downloader | X Transcript Extractor ⭐ |
| Facebook | Facebook Video Downloader | Facebook Transcript Extractor |
| Douyin | Douyin Video Downloader | Douyin Transcript Extractor |
| Bilibili | Bilibili Video Downloader | Bilibili Transcript Extractor |
🌟 Special Offerings
- High Accuracy Douyin Transcripts Scraper — Premium quality transcription specifically optimized for Douyin (抖音) content with enhanced accuracy for Mandarin speech.
Quick Start
Option 1: No-Code (Apify Console)
The easiest way to get started without writing any code:
- Sign up for a free Apify account at console.apify.com
- Open Twitter Video Transcript API
- Paste your X (Twitter) video URL in the input field
- Click the "Start" button to run the transcript extraction
- Download your results in JSON, CSV, or Excel format from the Storage tab
Option 2: API Integration
For developers who want to integrate into their applications:
1. Get your API token
Sign up at Apify Console and copy your API token from the Integrations page.
2. Make a request
cURL:
```shell
# Apify REST paths identify an Actor as username~actor-name (tilde, not slash).
curl -X POST 'https://api.apify.com/v2/acts/apple_yang~twitter-video-transcript-api/runs?token=YOUR_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{"videoUrl": "https://x.com/i/status/2040031089696162156"}'
```
JavaScript:
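```javascript
const { ApifyClient } = require('apify-client');

const client = new ApifyClient({ token: 'YOUR_TOKEN' });

// CommonJS modules cannot use top-level await, so wrap the calls in an async function.
async function main() {
    // Start the Actor and wait for the run to finish.
    const run = await client.actor('apple_yang/twitter-video-transcript-api').call({
        videoUrl: 'https://x.com/i/status/2040031089696162156',
    });

    // Read the results from the run's default dataset.
    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    console.log(items[0].text); // Complete transcript
}

main().catch(console.error);
```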
Python:
```python
from apify_client import ApifyClient

client = ApifyClient('YOUR_TOKEN')

# Start the Actor and wait for the run to finish.
run = client.actor('apple_yang/twitter-video-transcript-api').call(
    run_input={'videoUrl': 'https://x.com/i/status/2040031089696162156'}
)

# Iterate over the items in the run's default dataset.
for item in client.dataset(run['defaultDatasetId']).iterate_items():
    print(item['text'])  # Complete transcript
```
3. Download your data
Retrieve results via API or download from the Apify Console in JSON, CSV, or Excel format.
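If you prefer not to install a client library, results can also be pulled from the Apify dataset items endpoint. The sketch below uses only the standard library; `dataset_items_url` and `fetch_items` are hypothetical helper names, and the dataset ID is assumed to come from the `defaultDatasetId` of a finished run.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Formats mentioned in this README; "xlsx" is the Excel export.
EXPORT_FORMATS = {"json", "csv", "xlsx"}


def dataset_items_url(dataset_id: str, fmt: str = "json") -> str:
    """Build the dataset items URL for the requested export format."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    query = urlencode({"format": fmt})
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"


def fetch_items(dataset_id: str, token: str) -> list:
    """Download all items as parsed JSON, sending the token as a Bearer header."""
    request = Request(
        dataset_items_url(dataset_id, "json"),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urlopen(request) as response:
        return json.load(response)
```

Sending the token in a header rather than the query string keeps it out of logged URLs.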
Input
| Parameter | Type | Required | Description |
|---|---|---|---|
| videoUrl | string | ✅ Yes | X (Twitter) video URL to transcribe |
Input example
{"videoUrl": "https://x.com/i/status/2040031089696162156"}
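Because each processed video is billed, it can be worth rejecting obviously malformed URLs before starting a run. The following is a minimal sketch; `build_run_input` is a hypothetical helper, and the accepted hosts and path pattern are assumptions based on the example URL above, not a documented list of what the API accepts.

```python
import re
from urllib.parse import urlparse

# Assumed set of hosts; the API may accept more or fewer forms.
_ALLOWED_HOSTS = {"x.com", "www.x.com", "twitter.com", "www.twitter.com", "mobile.twitter.com"}
# Matches paths such as /i/status/2040031089696162156 or /someuser/status/123.
_STATUS_PATH = re.compile(r"^/[^/]+/status/\d+")


def build_run_input(video_url: str) -> dict:
    """Return the Actor input for a post URL, or raise ValueError if it looks wrong."""
    cleaned = video_url.strip()
    parsed = urlparse(cleaned)
    if parsed.scheme != "https" or parsed.hostname not in _ALLOWED_HOSTS:
        raise ValueError(f"not an X (Twitter) URL: {video_url!r}")
    if not _STATUS_PATH.match(parsed.path):
        raise ValueError(f"URL does not point to a post: {video_url!r}")
    return {"videoUrl": cleaned}
```

This only checks the shape of the URL. Whether the post is public and actually contains a video can only be determined by running the Actor.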
Output
Results are returned as a JSON array. Download via API or export as JSON, CSV, or Excel.
Output example
```json
[
  {
    "url": "https://x.com/i/status/2040031089696162156",
    "title": "People don't read.\n\nThey scan.\n\nMost people scan in what's call an 'F' shape pattern.\n\nYour eyes go from 1 to 2 to 3 to 4 to 5.\n\nAnd, words don't have to be spelled in the correct order for your brain to know what it is.\n\nThat's because you don't read them.\n\nYou can scan them. https://t.co/mStX7IIGgz",
    "createTime": 1775216288,
    "img": "https://pbs.twimg.com/ext_tw_video_thumb/2040031059455123456/pu/img/eATQIrbhqR-OU2Yi.jpg",
    "videoUrl": "https://video.twimg.com/ext_tw_video/2040031059455123456/pu/vid/avc1/720x1280/KdRUI2A40sbho_h_.mp4?tag=12",
    "bookmarkCount": 0,
    "favoriteCount": 0,
    "quoteCount": 0,
    "replyCount": 0,
    "retweetCount": 0,
    "nickname": "Growth Hackers 🚀",
    "screenName": "StartGrowthHack",
    "avatarUri": "https://pbs.twimg.com/profile_images/1303560724988395520/FT9pk2Om_normal.jpg",
    "authorDesc": "Grow your business with https://t.co/S7ETrP8icC\nAward-Winning Growth Hacking Agency | Lead Generation | Customer Acquisition | AI-Powered Digital Marketing",
    "authorFavouritesCount": 65099,
    "authorFollowersCount": 171030,
    "authorFriendsCount": 12841,
    "authorListedCount": 2446,
    "authorMediaCount": 30521,
    "authorStatusesCount": 171928,
    "duration": 21.461333,
    "text": "People don't read, they scan. Most people scan in what's called an F-shaped pattern. Your eyes go from 1 to 2 to 3 to 4 to 5. And words don't have to be spelled in the correct order for your brain to know what it is. That's because you don't read them, you scan them. So if you're making a website, make it scannable. If you're running ads, make them scannable because...",
    "segments": [
      {"start": 0, "end": 1.78, "text": "People don't read, they scan."},
      {"start": 2, "end": 4.9, "text": "Most people scan in what's called an F-shaped pattern."},
      {"start": 5.22, "end": 8.16, "text": "Your eyes go from 1 to 2 to 3 to 4 to 5."},
      {"start": 8.38, "end": 12.54, "text": "And words don't have to be spelled in the correct order for your brain to know what it is."},
      {"start": 12.7, "end": 15.14, "text": "That's because you don't read them, you scan them."},
      {"start": 15.32, "end": 18.02, "text": "So if you're making a website, make it scannable."},
      {"start": 18.2, "end": 21.28, "text": "If you're running ads, make them scannable because..."}
    ],
    "errMsg": "",
    "timestamp": "2026-04-04T02:04:30.980Z"
  }
]
```
How much does it cost?
Two-part pricing — pay for data extraction + AI transcription separately.
Part 1: Data Extraction (Fixed Rate)
Extract video metadata, author info, and engagement data from X (Twitter).
| Component | Price |
|---|---|
| Per result | $0.001 |
| Per 1,000 results | $1.00 |
💡 Note: This fee applies to every video processed, regardless of duration.
Part 2: AI Transcription (Tier-Based)
Convert video audio to text. Pricing varies by your Apify plan tier.
| Apify Tier | Price per Minute | Discount | Best For |
|---|---|---|---|
| Free | $0.0025 | — | Testing, small projects |
| Starter | $0.0020 | Bronze | Individual developers |
| Scale | $0.0015 | Silver | Growing teams |
| Business | $0.0010 | Gold | High-volume operations |
Billing Rules
| Rule | Description |
|---|---|
| Duration rounding | Charged per minute, rounded up (e.g., 1.5 min = 2 min) |
| Minimum charge | Videos under 1 minute billed as 1 minute |
| No audio, no charge | Videos without audio tracks are not charged for transcription |
| AI processing fee applies | Videos with audio are charged even if no speech is detected, because AI compute resources are still required to analyze and verify the absence of speech |
| Free plan credits | The Free plan includes $5/month in credits. At $0.0025 per minute, that covers up to ~2,000 billed minutes of transcription per month if spent on transcription alone; the $0.001 per-result extraction fee and per-minute rounding reduce the usable total |
Cost Example
Processing 1,000 videos (1 minute each):
| Cost Component | Free Tier | Business Tier |
|---|---|---|
| Data extraction (1,000 results) | $1.00 | $1.00 |
| AI transcription (1,000 min) | $2.50 | $1.00 |
| Total | $3.50 | $2.00 |
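The billing rules above can be expressed as a small estimator. This is a sketch using the rates from the pricing tables. `billed_minutes` and `estimate_cost` are hypothetical helpers, and actual charges are determined by Apify, not by this code.

```python
import math
from decimal import Decimal

# Rates copied from the pricing tables in this README.
EXTRACTION_FEE = Decimal("0.001")  # per result, regardless of duration
RATE_PER_MINUTE = {
    "free": Decimal("0.0025"),
    "starter": Decimal("0.0020"),
    "scale": Decimal("0.0015"),
    "business": Decimal("0.0010"),
}


def billed_minutes(duration_seconds: float, has_audio: bool = True) -> int:
    """Round up to whole minutes with a 1-minute minimum; no audio means no charge."""
    if not has_audio:
        return 0
    return max(1, math.ceil(duration_seconds / 60))


def estimate_cost(durations_seconds: list, tier: str = "free") -> Decimal:
    """Estimate the total cost in USD for a batch of videos that all have audio."""
    minutes = sum(billed_minutes(d) for d in durations_seconds)
    return EXTRACTION_FEE * len(durations_seconds) + RATE_PER_MINUTE[tier] * minutes
```

`Decimal` is used instead of floats so that sums of small per-minute amounts stay exact. Calling `estimate_cost([60] * 1000, "free")` reproduces the $3.50 total in the table above.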
Use cases
🤖 AI & Machine Learning
- Build speech recognition models
- Train NLP systems on social media content
- Create multimodal datasets with video + transcript
📊 Social Listening
- Monitor brand mentions and sentiment
- Track trending topics and discussions
- Analyze competitor content strategies
📈 Content Analysis
- Extract insights from viral videos
- Identify key messaging patterns
- Build recommendation engines
🔄 Automation Pipelines
- Auto-transcribe creator content
- Archive tweet videos with full text
- Feed transcribed data into content management systems
🎯 Research & Analytics
- Academic research on social media discourse
- Market research based on video content
- Journalism and news gathering
Important notes
Video requirements
- Public content only: This API only works with publicly accessible X (Twitter) videos. Private tweets and protected accounts cannot be accessed.
- Speech required: Videos without spoken audio will return metadata but no transcript.
- Duration limits: Long videos may take additional processing time.
Transcript accuracy
AI transcription accuracy typically exceeds 95% for clear speech with good audio quality. Accuracy may vary based on:
- Background noise and music
- Multiple speakers or overlapping speech
- Accents and dialects
- Audio quality and bit rate
Rate limiting
Requests are processed through Apify's infrastructure with built-in resource management. Large batches may take longer to complete.
FAQ
Is it legal to transcribe X (Twitter) videos?
This API only transcribes publicly available content that users have chosen to share openly. We do not access private content or personal data.
However, you should be aware that:
- Results may contain personal data protected by GDPR and other regulations
- You should only transcribe data you have a legitimate reason to use
- Transcribed content should be used in compliance with copyright and platform policies
- When in doubt, consult legal counsel
Read more about web scraping legality.
What happens if a video has no speech?
If a video contains only music, sound effects, or no detectable speech, the API will return all metadata but the text and segments fields will be empty or minimal.
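Downstream code usually needs to tell apart failed extractions, videos without speech, and usable transcripts. The sketch below does this from the `errMsg` and `text` fields. `classify_result` is a hypothetical helper, and treating an empty `text` as "no speech" is an assumption, since the API may also return a minimal non-empty transcript in that case.

```python
def classify_result(item: dict) -> str:
    """Return "error", "no_speech", or "ok" for one result item."""
    if item.get("errMsg"):
        return "error"
    # Assumption: a blank transcript means no speech was detected.
    if not (item.get("text") or "").strip():
        return "no_speech"
    return "ok"
```

A pipeline might keep only items where `classify_result(item) == "ok"` and log the rest for review.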
Can I transcribe threads?
Currently, this API processes individual video URLs. For threads with multiple videos, submit each video URL separately.
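Submitting each URL separately can be wrapped in a short loop. In the sketch below, `transcribe_many` is a hypothetical helper and `run_actor` is any callable you supply that takes the Actor input and returns the list of result items, for example a thin wrapper around the Python Quick Start code. Duplicate URLs are skipped so the same video is not billed twice.

```python
from typing import Callable, Iterable, Optional


def transcribe_many(
    video_urls: Iterable[str],
    run_actor: Callable[[dict], list],
) -> dict:
    """Run one Actor call per unique URL and map each URL to its first result item."""
    results: dict = {}
    # dict.fromkeys removes duplicates while preserving the original order.
    for url in dict.fromkeys(video_urls):
        try:
            items = run_actor({"videoUrl": url})
            first: Optional[dict] = items[0] if items else None
            results[url] = first
        except Exception as exc:  # keep going if a single URL fails
            results[url] = {"errMsg": str(exc)}
    return results


# Example runner built on the Quick Start code (needs apify-client and a token):
# def run_actor(run_input):
#     run = client.actor('apple_yang/twitter-video-transcript-api').call(run_input=run_input)
#     return list(client.dataset(run['defaultDatasetId']).iterate_items())
```

Passing the runner in as a parameter keeps the loop independent of any particular client and makes it easy to test with a stub.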
Do you support X (Twitter) Spaces audio?
This API is designed for video content. Spaces audio is not currently supported.
How long does transcription take?
Most videos are transcribed within 5-30 seconds, depending on duration. Shorter videos (< 1 minute) typically complete in under 10 seconds.
Do you offer custom solutions?
For high-volume needs, custom features, or enterprise integrations, contact us to discuss tailored solutions.
Support
We're committed to making your experience smooth and productive. Whether you're just getting started or running large-scale operations, we're here to help.
Get in touch for anything:
- 🐛 Found a bug? — Let us know what went wrong and we'll fix it quickly
- 💡 Need a feature? — Tell us what would make your workflow better
- ❓ Have questions? — Stuck on something? Ask us anything about using the API
- 🚀 Hit a limitation? — If our current setup doesn't quite fit your needs, let's talk
- 🏢 Enterprise needs? — High-volume requirements, custom integrations, or tailored solutions — we can build it
📧 Contact us: support@transcript365.com
We read every email and typically respond within 24 hours. Don't hesitate to reach out — your feedback helps us improve!
Built for developers who need reliable X (Twitter) video transcription without complexity.