Twitter Video Transcript API – AI Video to Text for X (Twitter)

Pricing

from $1.00 / 1,000 results

Twitter Video Transcript API converts the audio of X (Twitter) videos into text using AI. It extracts transcripts and metadata from videos posted in tweets and threads. Fast, reliable, and built for developers, AI agents, and automation workflows.

Developer: APISmith

Maintained by Community



What does Twitter Video Transcript API do?

Twitter Video Transcript API uses AI to transcribe the spoken audio of X (Twitter) videos into text. Provide a video URL and get back:

  • 📝 AI-generated transcript: the complete text of the spoken audio
  • ⏱️ Segmented timing: start and end times, in seconds, for each speech segment
  • 📊 Full metadata: tweet content, engagement stats, author info, video details
  • Fast results: most videos are transcribed within 5-30 seconds

Perfect for AI training data collection, content analysis pipelines, social listening, and automation workflows that need reliable access to Twitter video transcripts.


What can this API extract?

| Field | Type | Description |
|---|---|---|
| url | string | Original X (Twitter) post URL |
| title | string | Tweet text/content |
| createTime | number | Unix timestamp of tweet creation |
| img | string | Thumbnail image URL |
| videoUrl | string | Direct MP4 video download link |
| bookmarkCount | number | Number of bookmarks |
| favoriteCount | number | Number of likes |
| quoteCount | number | Number of quotes |
| replyCount | number | Number of replies |
| retweetCount | number | Number of retweets |
| nickname | string | Author's display name |
| screenName | string | Author's username |
| avatarUri | string | Author's profile picture URL |
| authorDesc | string | Author's bio/description |
| authorFavouritesCount | number | Author's total likes |
| authorFollowersCount | number | Author's follower count |
| authorFriendsCount | number | Author's following count |
| authorListedCount | number | Author's list count |
| authorMediaCount | number | Author's media count |
| authorStatusesCount | number | Author's tweet count |
| duration | number | Video duration in seconds |
| text | string | Complete AI transcript text |
| segments | array | Transcript segments with timestamps |
| timestamp | string | ISO 8601 timestamp of extraction |
| errMsg | string | Error message (if any) |
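
As a rough illustration of how a consumer might normalize one result item, the sketch below maps a few of the fields listed above onto a small Python container. The field names come from the table; `TranscriptResult` and `parse_result` are hypothetical names chosen for this example, and the code assumes `createTime` is expressed in seconds, as the output example suggests.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class TranscriptResult:
    """Hypothetical container for the fields most pipelines need."""
    url: str
    screen_name: str
    created_at: datetime  # createTime converted from Unix seconds to UTC
    duration: float       # video duration in seconds
    text: str
    segments: list
    error: str


def parse_result(item: dict) -> TranscriptResult:
    """Map one raw dataset item onto TranscriptResult, tolerating missing keys."""
    return TranscriptResult(
        url=item.get("url", ""),
        screen_name=item.get("screenName", ""),
        created_at=datetime.fromtimestamp(item.get("createTime", 0), tz=timezone.utc),
        duration=float(item.get("duration", 0.0)),
        text=item.get("text", ""),
        segments=item.get("segments") or [],
        error=item.get("errMsg", ""),
    )
```

Converting `createTime` to a timezone-aware `datetime` up front avoids mixing it up with the ISO 8601 `timestamp` field, which records when the extraction ran rather than when the tweet was posted.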

Our Video Downloader & Transcript Suite

Need to extract videos or transcripts from other platforms? Check out our complete collection:

| Platform | Video Downloader | Transcript Extractor |
|---|---|---|
| TikTok | TikTok Video Downloader | TikTok Transcript Extractor |
| Instagram | Instagram Video Scraper API | Instagram Transcript Extractor |
| X (Twitter) | X Video Downloader | X Transcript Extractor |
| Facebook | Facebook Video Downloader | Facebook Transcript Extractor |
| Douyin | Douyin Video Downloader | Douyin Transcript Extractor |
| Bilibili | Bilibili Video Downloader | Bilibili Transcript Extractor |


Quick Start

Option 1: No-Code (Apify Console)

The easiest way to get started without writing any code:

  1. Sign up for a free Apify account at console.apify.com
  2. Open Twitter Video Transcript API
  3. Paste your X (Twitter) video URL in the input field
  4. Click the "Start" button to run the transcript extraction
  5. Download your results in JSON, CSV, or Excel format from the Storage tab

Option 2: API Integration

For developers who want to integrate into their applications:

1. Get your API token

Sign up at Apify Console and copy your API token from the Integrations page.

2. Make a request

cURL:

# In API URLs the Actor is addressed as username~actor-name (tilde, not slash)
curl -X POST 'https://api.apify.com/v2/acts/apple_yang~twitter-video-transcript-api/runs?token=YOUR_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{
    "videoUrl": "https://x.com/i/status/2040031089696162156"
  }'

JavaScript:

const { ApifyClient } = require('apify-client');
const client = new ApifyClient({ token: 'YOUR_TOKEN' });

// Top-level await is not available in CommonJS, so wrap the calls in an async function
(async () => {
    const run = await client.actor('apple_yang/twitter-video-transcript-api').call({
        videoUrl: 'https://x.com/i/status/2040031089696162156',
    });
    // Read the results from the run's default dataset
    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    console.log(items[0].text); // Complete transcript
})();

Python:

from apify_client import ApifyClient

client = ApifyClient('YOUR_TOKEN')
# Start the Actor and wait for the run to finish
run = client.actor('apple_yang/twitter-video-transcript-api').call(
    run_input={'videoUrl': 'https://x.com/i/status/2040031089696162156'}
)
# Print the transcript of every item in the run's default dataset
for item in client.dataset(run['defaultDatasetId']).iterate_items():
    print(item['text'])  # Complete transcript

3. Download your data

Retrieve results via API or download from the Apify Console in JSON, CSV, or Excel format.
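
For the API route, one possible approach is to build the download URL for a run's dataset directly. The sketch below assumes Apify's dataset items endpoint (`/v2/datasets/{datasetId}/items`) with its `format` query parameter; the helper name `dataset_items_url` is hypothetical, and the token is passed in the query string only to mirror the cURL example.

```python
from urllib.parse import urlencode

APIFY_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id: str, token: str, fmt: str = "json") -> str:
    """Build the URL that returns a dataset's items in the requested format."""
    # Only the formats mentioned in this README are accepted here.
    if fmt not in {"json", "csv", "xlsx"}:
        raise ValueError(f"unsupported format: {fmt}")
    query = urlencode({"format": fmt, "token": token})
    return f"{APIFY_BASE}/datasets/{dataset_id}/items?{query}"
```

The `dataset_id` is the `defaultDatasetId` of the finished run, as used in the JavaScript and Python examples.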


Input

| Parameter | Type | Required | Description |
|---|---|---|---|
| videoUrl | string | ✅ Yes | X (Twitter) video URL to transcribe |

Input example

{
  "videoUrl": "https://x.com/i/status/2040031089696162156"
}
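
Since `videoUrl` is the only input, it can be worth checking the URL locally before starting a run. The following is a minimal sketch under the assumption that valid inputs are `x.com` or `twitter.com` links containing `/status/<numeric id>`; the API's actual acceptance rules are not documented here, and `extract_status_id` and `build_input` are hypothetical helpers.

```python
import re
from typing import Optional

# Assumed URL shape: x.com or twitter.com, one or more path segments, then /status/<digits>.
_STATUS_URL = re.compile(
    r"^https?://(?:www\.|mobile\.)?(?:x|twitter)\.com/(?:[^/?#]+/)+status/(\d+)",
    re.IGNORECASE,
)


def extract_status_id(url: str) -> Optional[str]:
    """Return the numeric post ID from a status URL, or None if the URL does not match."""
    match = _STATUS_URL.match(url.strip())
    return match.group(1) if match else None


def build_input(url: str) -> dict:
    """Build the Actor input object, rejecting URLs that do not look like X posts."""
    cleaned = url.strip()
    if extract_status_id(cleaned) is None:
        raise ValueError(f"not an X (Twitter) status URL: {url!r}")
    return {"videoUrl": cleaned}
```

A local check like this only catches malformed links; it cannot tell whether the post is public or actually contains a video.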

Output

Results are returned as a JSON array. Download via API or export as JSON, CSV, or Excel.

Output example

[
  {
    "url": "https://x.com/i/status/2040031089696162156",
    "title": "People don't read.\n\nThey scan.\n\nMost people scan in what's call an 'F' shape pattern.\n\nYour eyes go from 1 to 2 to 3 to 4 to 5.\n\nAnd, words don't have to be spelled in the correct order for your brain to know what it is.\n\nThat's because you don't read them.\n\nYou can scan them. https://t.co/mStX7IIGgz",
    "createTime": 1775216288,
    "img": "https://pbs.twimg.com/ext_tw_video_thumb/2040031059455123456/pu/img/eATQIrbhqR-OU2Yi.jpg",
    "videoUrl": "https://video.twimg.com/ext_tw_video/2040031059455123456/pu/vid/avc1/720x1280/KdRUI2A40sbho_h_.mp4?tag=12",
    "bookmarkCount": 0,
    "favoriteCount": 0,
    "quoteCount": 0,
    "replyCount": 0,
    "retweetCount": 0,
    "nickname": "Growth Hackers 🚀",
    "screenName": "StartGrowthHack",
    "avatarUri": "https://pbs.twimg.com/profile_images/1303560724988395520/FT9pk2Om_normal.jpg",
    "authorDesc": "Grow your business with https://t.co/S7ETrP8icC\nAward-Winning Growth Hacking Agency | Lead Generation | Customer Acquisition | AI-Powered Digital Marketing",
    "authorFavouritesCount": 65099,
    "authorFollowersCount": 171030,
    "authorFriendsCount": 12841,
    "authorListedCount": 2446,
    "authorMediaCount": 30521,
    "authorStatusesCount": 171928,
    "duration": 21.461333,
    "text": "People don't read, they scan. Most people scan in what's called an F-shaped pattern. Your eyes go from 1 to 2 to 3 to 4 to 5. And words don't have to be spelled in the correct order for your brain to know what it is. That's because you don't read them, you scan them. So if you're making a website, make it scannable. If you're running ads, make them scannable because...",
    "segments": [
      {
        "start": 0,
        "end": 1.78,
        "text": "People don't read, they scan."
      },
      {
        "start": 2,
        "end": 4.9,
        "text": "Most people scan in what's called an F-shaped pattern."
      },
      {
        "start": 5.22,
        "end": 8.16,
        "text": "Your eyes go from 1 to 2 to 3 to 4 to 5."
      },
      {
        "start": 8.38,
        "end": 12.54,
        "text": "And words don't have to be spelled in the correct order for your brain to know what it is."
      },
      {
        "start": 12.7,
        "end": 15.14,
        "text": "That's because you don't read them, you scan them."
      },
      {
        "start": 15.32,
        "end": 18.02,
        "text": "So if you're making a website, make it scannable."
      },
      {
        "start": 18.2,
        "end": 21.28,
        "text": "If you're running ads, make them scannable because..."
      }
    ],
    "errMsg": "",
    "timestamp": "2026-04-04T02:04:30.980Z"
  }
]

How much does it cost?

Two-part pricing — pay for data extraction + AI transcription separately.

Part 1: Data Extraction (Fixed Rate)

Extract video metadata, author info, and engagement data from X (Twitter).

| Component | Price |
|---|---|
| Per result | $0.001 |
| Per 1,000 results | $1.00 |

💡 Note: This fee applies to every video processed, regardless of duration.

Part 2: AI Transcription (Tier-Based)

Convert video audio to text. Pricing varies by your Apify plan tier.

| Apify Tier | Price per Minute | Discount | Best For |
|---|---|---|---|
| Free | $0.0025 | | Testing, small projects |
| Starter | $0.0020 | Bronze | Individual developers |
| Scale | $0.0015 | Silver | Growing teams |
| Business | $0.0010 | Gold | High-volume operations |

Billing Rules

| Rule | Description |
|---|---|
| Duration rounding | Charged per minute, rounded up (e.g., 1.5 min = 2 min) |
| Minimum charge | Videos under 1 minute billed as 1 minute |
| No audio, no charge | Videos without audio tracks are not charged for transcription |
| AI processing fee applies | Videos with audio are charged even if no speech is detected, because AI compute resources are still required to analyze and verify the absence of speech |
| Free plan credits | Includes $5/month in compute credits. At a rate of $0.0025 per minute, this covers up to ~2,000 minutes of video transcription per month (actual usage depends on rounding rules) |
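
The rounding and minimum-charge rules can be sketched in a few lines of Python. This is an illustration of the rules as stated above, not the billing implementation itself: the per-minute rates are copied from the tier table, while `billable_minutes`, `transcription_cost`, and the `has_audio` flag are hypothetical names introduced for the example.

```python
import math

# Per-minute transcription rates by Apify plan tier, copied from the pricing table.
RATE_PER_MINUTE = {"free": 0.0025, "starter": 0.0020, "scale": 0.0015, "business": 0.0010}


def billable_minutes(duration_seconds: float, has_audio: bool = True) -> int:
    """Apply the billing rules: no audio means no charge, otherwise round up with a 1-minute minimum."""
    if not has_audio:
        return 0
    return max(1, math.ceil(duration_seconds / 60))


def transcription_cost(duration_seconds: float, tier: str = "free", has_audio: bool = True) -> float:
    """Transcription fee in USD for one video (the data extraction fee is separate)."""
    return billable_minutes(duration_seconds, has_audio) * RATE_PER_MINUTE[tier]
```

Under these assumptions a 90-second video is billed as 2 minutes, and the 21-second clip from the output example is billed as 1 minute.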

Cost Example

Processing 1,000 videos (1 minute each):

| Cost Component | Free Tier | Business Tier |
|---|---|---|
| Data extraction (1,000 results) | $1.00 | $1.00 |
| AI transcription (1,000 min) | $2.50 | $1.00 |
| Total | $3.50 | $2.00 |
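
The same arithmetic can be reproduced for an arbitrary batch. The sketch below is a rough estimator that assumes every video has an audio track and applies the fixed $0.001 extraction fee plus the per-minute rate passed in; `estimate_batch_cost` is a hypothetical helper.

```python
import math

EXTRACTION_FEE_PER_RESULT = 0.001  # fixed data extraction fee per video, in USD


def estimate_batch_cost(durations_seconds: list, rate_per_minute: float) -> dict:
    """Estimate extraction, transcription, and total cost for a batch of videos."""
    extraction = len(durations_seconds) * EXTRACTION_FEE_PER_RESULT
    # Each video is billed per started minute, with a minimum of one minute.
    minutes = sum(max(1, math.ceil(d / 60)) for d in durations_seconds)
    transcription = minutes * rate_per_minute
    return {
        "extraction": round(extraction, 4),
        "transcription": round(transcription, 4),
        "total": round(extraction + transcription, 4),
    }
```

Calling it with 1,000 one-minute videos at the Free rate (`0.0025`) or the Business rate (`0.0010`) gives the totals shown in the table.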

Use cases

🤖 AI & Machine Learning

  • Build speech recognition models
  • Train NLP systems on social media content
  • Create multimodal datasets with video + transcript

📊 Social Listening

  • Monitor brand mentions and sentiment
  • Track trending topics and discussions
  • Analyze competitor content strategies

📈 Content Analysis

  • Extract insights from viral videos
  • Identify key messaging patterns
  • Build recommendation engines

🔄 Automation Pipelines

  • Auto-transcribe creator content
  • Archive tweet videos with full text
  • Feed transcribed data into content management systems

🎯 Research & Analytics

  • Academic research on social media discourse
  • Market research based on video content
  • Journalism and news gathering

Important notes

Video requirements

  • Public content only: This API only works with publicly accessible X (Twitter) videos. Private tweets and protected accounts cannot be accessed.
  • Speech required: Videos without spoken audio will return metadata but no transcript.
  • Duration limits: Long videos may take additional processing time.

Transcript accuracy

AI transcription accuracy typically exceeds 95% for clear speech with good audio quality. Accuracy may vary based on:

  • Background noise and music
  • Multiple speakers or overlapping speech
  • Accents and dialects
  • Audio quality and bit rate

Rate limiting

Requests are processed through Apify's infrastructure with built-in resource management. Large batches may take longer to complete.


FAQ

Is it legal to transcribe X (Twitter) videos?

This API only transcribes publicly available content that users have chosen to share openly. We do not access private content or personal data.

However, you should be aware that:

  • Results may contain personal data protected by GDPR and other regulations
  • You should only transcribe data you have a legitimate reason to use
  • Transcribed content should be used in compliance with copyright and platform policies
  • When in doubt, consult legal counsel

Read more about web scraping legality.

What happens if a video has no speech?

If a video contains only music, sound effects, or no detectable speech, the API will return all metadata but the text and segments fields will be empty or minimal.
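
A pipeline will usually want to tell these cases apart before using a transcript. The following is a minimal sketch, assuming that an error is signalled by a non-empty `errMsg` and that "no speech" shows up as an empty `text` field; what counts as a "minimal" transcript is not specified here, so this check does not try to detect it. `classify_result` is a hypothetical helper.

```python
def classify_result(item: dict) -> str:
    """Label a result item as 'error', 'no_speech', or 'ok'."""
    if item.get("errMsg"):
        return "error"
    # Treat a missing, empty, or whitespace-only transcript as no detectable speech.
    text = (item.get("text") or "").strip()
    if not text:
        return "no_speech"
    return "ok"
```

Items labelled `no_speech` still carry the tweet and author metadata, so they can be kept for analytics even when there is nothing to transcribe.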

Can I transcribe threads?

Currently, this API processes individual video URLs. For threads with multiple videos, submit each video URL separately.
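
Submitting each URL separately can be wrapped in a small loop. The sketch below is one way to do it, reusing the client calls from the Python example in Quick Start; `transcribe_many` and `make_apify_runner` are hypothetical helpers, runs are started one after another, and a failure for one URL is recorded in the result's `errMsg` instead of stopping the batch.

```python
from typing import Callable, Iterable


def transcribe_many(urls: Iterable[str], run_one: Callable[[str], list]) -> dict:
    """Run one transcription per URL and collect the resulting items by URL."""
    results = {}
    for url in urls:
        try:
            results[url] = run_one(url)
        except Exception as exc:  # keep going if a single video fails
            results[url] = [{"url": url, "errMsg": str(exc)}]
    return results


def make_apify_runner(token: str, actor_id: str = "apple_yang/twitter-video-transcript-api"):
    """Return a run_one function backed by the Apify client (needs `pip install apify-client`)."""
    from apify_client import ApifyClient  # imported lazily so the loop above has no hard dependency

    client = ApifyClient(token)

    def run_one(url: str) -> list:
        run = client.actor(actor_id).call(run_input={"videoUrl": url})
        return list(client.dataset(run["defaultDatasetId"]).iterate_items())

    return run_one
```

Usage would look like `transcribe_many(thread_video_urls, make_apify_runner("YOUR_TOKEN"))`, where `thread_video_urls` is the list of video URLs collected from the thread.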

Do you support X (Twitter) Spaces audio?

This API is designed for video content. Spaces audio is not currently supported.

How long does transcription take?

Most videos are transcribed within 5-30 seconds, depending on duration. Shorter videos (< 1 minute) typically complete in under 10 seconds.

Do you offer custom solutions?

For high-volume needs, custom features, or enterprise integrations, contact us to discuss tailored solutions.


Support

We're committed to making your experience smooth and productive. Whether you're just getting started or running large-scale operations, we're here to help:

Get in touch for anything:

  • 🐛 Found a bug? — Let us know what went wrong and we'll fix it quickly
  • 💡 Need a feature? — Tell us what would make your workflow better
  • Have questions? — Stuck on something? Ask us anything about using the API
  • 🚀 Hit a limitation? — If our current setup doesn't quite fit your needs, let's talk
  • 🏢 Enterprise needs? — High-volume requirements, custom integrations, or tailored solutions — we can build it

📧 Contact us: support@transcript365.com

We read every email and typically respond within 24 hours. Don't hesitate to reach out — your feedback helps us improve!


Built for developers who need reliable X (Twitter) video transcription without complexity.