
📹Video to Text (YouTube, TikTok, Twitter/X, Ins...)
Pricing
Pay per event

Easily convert videos to text from over 1000 platforms, including YouTube, TikTok, Twitter/X, Instagram... Supports 12+ languages with translation options.
Last modified
9 days ago
Transform any video into an accurate text transcript and translate it into 100+ languages. Our AI-powered service extracts audio from videos across 1000+ platforms (YouTube, TikTok, Instagram, Twitter, Facebook, Vimeo, Twitch, and more), generates time-stamped transcripts, and provides high-quality translations, all with enterprise-grade reliability and fast processing (typically 30 seconds to 10 minutes, depending on video length).
🏆 Key Features
🎤 AI-Powered Transcription
- 🤖 Advanced AI Model - State-of-the-art speech recognition
- 🌍 Auto Language Detection - Automatically detects spoken language
- ⏱️ Time-Segmented Output - Precise timestamps for each segment
- 🎯 High Accuracy - Optimized for various accents and speaking styles
🌐 Multi-Language Translation
- 📝 100+ Languages - Support for major world languages
- 🔄 Smart Translation - Advanced translation service
- ⚡ Fast Processing - Efficient translation pipeline
- 🎯 Context-Aware - Maintains meaning and context
📊 Comprehensive Data Extraction
- 📹 Video Metadata - Title, description, duration, publish date
- 👤 Author Information - Creator details, channel URLs
- 📈 Engagement Metrics - Views, likes, comments, shares
- 🎵 Audio Details - Track titles, artists, audio quality
- 🖼️ Thumbnail - High-quality video thumbnail
💻 Input Parameters
Parameter | Type | Required | Description | Example |
---|---|---|---|---|
video_url | string | ✅ | Video URL from supported platforms | https://www.youtube.com/watch?v=2TK9tFZoBRg |
target_lang | string | ❌ | Target language for translation; "none" keeps the original language only | "english", "chinese", "spanish", "none" |
🌍 Supported Languages
The service supports 100+ languages including:
- Major Languages: English, Chinese (Simplified/Traditional), Spanish, French, German, Japanese, Korean, Arabic, Hindi, Portuguese, Russian, Italian, Dutch, Swedish, Norwegian, Danish, Finnish, Polish, Czech, Hungarian, Romanian, Bulgarian, Greek, Turkish, Hebrew, Thai, Vietnamese, Indonesian, Malay, Filipino, Swahili, and many more.
- Regional Languages: Various African, Asian, European, and Indigenous American languages.
Default Value: target_lang defaults to "none" (original language only) if not specified.
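As a rough illustration of the two input parameters, the sketch below builds a run input and passes it to the Actor with the `apify-client` Python package. The Actor ID `nextapi/video-to-text` is an assumption inferred from the `processor` URL in the sample output, and the helper names `build_run_input` and `run_actor` are hypothetical. It also assumes that results are written to the run's default dataset, which this page does not state.

```python
from typing import Any

# Assumption: Actor ID inferred from the "processor" URL in the sample output.
ACTOR_ID = "nextapi/video-to-text"


def build_run_input(video_url: str, target_lang: str = "none") -> dict[str, str]:
    """Build the Actor input; target_lang defaults to "none" (no translation)."""
    if not video_url.startswith(("http://", "https://")):
        raise ValueError(f"video_url must be an http(s) URL, got {video_url!r}")
    # Lowercase language names follow the examples in the parameter table.
    lang = target_lang.strip().lower() or "none"
    return {"video_url": video_url, "target_lang": lang}


def run_actor(token: str, run_input: dict[str, str]) -> list[dict[str, Any]]:
    """Run the Actor and collect its items (assumes the default dataset is used)."""
    from apify_client import ApifyClient  # third-party: pip install apify-client

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return run details")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Calling `run_actor("<YOUR_APIFY_TOKEN>", build_run_input("https://www.youtube.com/watch?v=2TK9tFZoBRg", "spanish"))` would request a Spanish translation alongside the original transcript; omitting the second argument keeps the original language only.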
📤 Output Structure
{
  "source_url": "https://www.youtube.com/watch?v=2TK9tFZoBRg",
  "processor": "https://apify.com/nextapi/video-to-text",
  "processed_at": "2024-01-15T10:30:00Z",
  "platform": "Youtube",
  "title": "Video Title",
  "description": "Video description...",
  "duration": 180,
  "published_at": "2024-01-01T00:00:00Z",
  "author": "Channel Name",
  "author_id": "UC123456789",
  "categories": ["Entertainment", "Technology"],
  "tags": ["tutorial", "programming", "python"],
  "view_count": 1000000,
  "like_count": 50000,
  "dislike_count": 100,
  "shares_count": 5000,
  "comment_count": 2500,
  "audio_title": "Background Music",
  "audio_artist": "Music Artist",
  "thumbnail": "https://apify.com/kv-store/thumbnail.png",
  "source_transcript": {
    "language": "English",
    "text": "This is the full transcribed text from the video...",
    "segments": [
      {"start": "00:00:00.000", "end": "00:00:05.000", "text": "This is the first segment of transcribed text."},
      {"start": "00:00:05.000", "end": "00:00:10.000", "text": "This is the second segment of transcribed text."}
    ]
  },
  "target_transcript": {
    "language": "Hindi",
    "text": "यह वीडियो से ट्रांसक्राइब किया गया पूरा हिंदी अनुवादित पाठ है...",
    "segments": [
      {"start": "00:00:00.000", "end": "00:00:05.000", "text": "यह ट्रांसक्राइब किए गए पाठ के पहले खंड का हिंदी अनुवाद है।"},
      {"start": "00:00:05.000", "end": "00:00:10.000", "text": "यह ट्रांसक्राइब किए गए पाठ के दूसरे खंड का हिंदी अनुवाद है।"}
    ]
  }
}
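One way to work with this structure is to pair each original segment with its translation. The sketch below is a minimal example that assumes source and target segments share identical `start` timestamps, as they do in the sample above; the service does not document that guarantee. The function names are hypothetical.

```python
def timestamp_to_seconds(ts: str) -> float:
    """Convert "HH:MM:SS.mmm" (the format in the sample output) to seconds."""
    hours, minutes, seconds = ts.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def pair_segments(item: dict) -> list[tuple[float, str, str]]:
    """Return (start_seconds, source_text, translated_text) for each source segment.

    Assumption: translated segments reuse the source "start" timestamps.
    Segments without a matching translation get an empty string.
    """
    target = item.get("target_transcript") or {}
    translated = {seg["start"]: seg["text"] for seg in target.get("segments", [])}
    pairs = []
    for seg in item["source_transcript"]["segments"]:
        start = timestamp_to_seconds(seg["start"])
        pairs.append((start, seg["text"], translated.get(seg["start"], "")))
    return pairs
```

When `target_lang` is "none" and no translation is present, every pair simply carries an empty translated text.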
📊 Output Fields Description
Field | Type | Description |
---|---|---|
source_url | string | Original video URL |
processor | string | Actor processor URL |
processed_at | string | ISO timestamp when processed |
platform | string | Source platform (YouTube, TikTok, etc.) |
title | string | Video title |
description | string | Video description |
duration | number | Duration in seconds |
published_at | string | Publication date (ISO format) |
author | string | Channel/creator name |
author_id | string | Channel/creator ID |
categories | array | Video categories |
tags | array | Video tags |
view_count | integer | View count |
like_count | integer | Like count |
dislike_count | integer | Dislike count |
shares_count | integer | Share count |
comment_count | integer | Comment count |
audio_title | string | Audio track title (if music present) |
audio_artist | string | Audio artist (if music present) |
thumbnail | string | Thumbnail URL |
source_transcript | object | Original transcription data |
source_transcript.language | string | Detected language name |
source_transcript.text | string | Full transcribed text |
source_transcript.segments | array | Time-segmented transcription |
target_transcript | object | Translated transcription data |
target_transcript.language | string | Target language name |
target_transcript.text | string | Full translated text |
target_transcript.segments | array | Time-segmented translation |
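Because each segment carries `start`, `end`, and `text`, the segments map naturally onto subtitle formats. The following sketch converts a `segments` array into SubRip (SRT) text, assuming timestamps always use the `HH:MM:SS.mmm` form shown in the sample output. The helper name `to_srt` is hypothetical and not part of the service.

```python
def to_srt(segments: list[dict]) -> str:
    """Render transcript segments as SubRip (SRT) subtitle text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        # SRT uses a comma before the milliseconds instead of a dot.
        start = seg["start"].replace(".", ",")
        end = seg["end"].replace(".", ",")
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}")
    # Cues are separated by a blank line; an empty input yields an empty file.
    return "\n\n".join(blocks) + "\n" if blocks else ""
```

Passing `item["source_transcript"]["segments"]` produces subtitles in the original language, and `item["target_transcript"]["segments"]` produces the translated track.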
🎯 Use Cases
📊 Content Research & Analysis
- Generate transcripts for market research and competitive analysis
- Translate content for global market insights and localization
- Analyze spoken content across different languages and cultures
- Extract key information and insights from video content
- Create searchable databases from video interviews and surveys
🤖 Automation & Integration
- Batch process video collections for enterprise transcription
- Integrate with content management systems and workflows
- Automate subtitle generation for large video libraries
- Create searchable text databases from video content archives
- Streamline content localization and translation pipelines
📚 Educational & Training
- Generate study materials from educational videos and lectures
- Create multilingual learning resources for global students
- Extract key concepts from tutorials and training sessions
- Build accessible content for hearing-impaired and diverse learners
- Support online course creation and e-learning platforms
🎬 Media & Entertainment
- Generate professional subtitles for movies, TV shows, and documentaries
- Create multilingual versions of content for global distribution
- Extract dialogue and scripts for content analysis and adaptation
- Build searchable video libraries and content archives
- Support content creators with automated transcription services
💰 Pricing
Resource | Cost | Description |
---|---|---|
Actor Usage | $0.00001 | Base execution cost |
0-300s | $0.0054/s | $0.32 per minute |
300-900s | $0.0041/s | $0.25 per minute |
900-1800s | $0.0034/s | $0.20 per minute |
1800s+ | $0.0026/s | $0.16 per minute |
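The table does not say whether a tier's rate applies to the whole video or only to the seconds that fall inside that tier, so the estimator below supports both readings and should be treated as a sketch. It also assumes that the tiers refer to video duration and that each upper bound is inclusive. The function name `estimate_cost` and the `marginal` flag are hypothetical.

```python
BASE_COST = 0.00001  # "Actor Usage" row: base execution cost in USD

# (upper bound in seconds, USD per second); None means no upper bound.
TIERS = [(300, 0.0054), (900, 0.0041), (1800, 0.0034), (None, 0.0026)]


def estimate_cost(duration_s: float, marginal: bool = False) -> float:
    """Estimate the cost in USD for a video of duration_s seconds.

    marginal=False: the rate of the tier containing the total duration
    applies to every second. marginal=True: each second is billed at the
    rate of the tier it falls into. Which reading is correct is an assumption.
    """
    if duration_s < 0:
        raise ValueError("duration_s must be non-negative")
    cost = BASE_COST
    lower = 0.0
    for upper, rate in TIERS:
        in_this_tier = upper is None or duration_s <= upper
        if not marginal:
            if in_this_tier:
                return cost + duration_s * rate
            continue
        # Marginal reading: bill only the seconds inside this tier.
        span_end = duration_s if in_this_tier else upper
        cost += max(span_end - lower, 0.0) * rate
        if in_this_tier:
            break
        lower = upper
    return cost
```

Comparing `estimate_cost(600)` with `estimate_cost(600, marginal=True)` shows how far apart the two readings are for a ten-minute video; check an actual invoice before relying on either.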
🔧 Technical Details
Supported Platforms
- ✅ YouTube - Videos, Shorts, Music, Live recordings, YouTube Ads
- ✅ TikTok - All public videos and content, TikTok Ads
- ✅ Instagram - Reels, IGTV, and video posts, Instagram Ads
- ✅ Twitter/X - Video tweets and Spaces, Twitter Ads
- ✅ Facebook - Public videos and reels, Facebook Ads
- ✅ LinkedIn - Video posts and advertisements, LinkedIn Ads
- ✅ Pinterest - Video pins and ads, Pinterest Ads
- ✅ Snapchat - Public videos and stories, Snapchat Ads
- ✅ Google Ads - Video advertisements and display ads
- ✅ Amazon Ads - Video advertisements and sponsored content
- ✅ Vimeo - All public video content
- ✅ Twitch - Clips and recorded streams
- ✅ And 1000+ more - Any platform supported by our extraction engine
❓ FAQ
Q: What video platforms are supported?
A: We support 1000+ platforms including YouTube, TikTok, Instagram, Twitter, Facebook, Vimeo, Twitch, and many more through our advanced extraction system.
Q: How accurate is the transcription?
A: Our advanced AI model provides high-accuracy transcription, especially for clear speech. Accuracy may vary with background noise, accents, or poor audio quality.
Q: How long does processing take?
A: Processing time depends on video length:
- Short videos (0-5 min): 30-60 seconds
- Medium videos (5-15 min): 1-3 minutes
- Long videos (15+ min): 3-10 minutes
Q: Can I transcribe videos in any language?
A: Yes! We support 100+ languages with automatic detection. You can also translate to any supported target language.
Q: What audio formats are processed?
A: We extract audio from any video format and convert it to WAV for optimal transcription quality.
Q: How can I speed up processing for large videos?
A: For faster processing of large videos, you can increase the Memory setting in Run options. Higher memory allocation will significantly improve processing speed for longer content.
🛠️ Troubleshooting
Common Issues
- ❌ Video download failed: Video may be private, deleted, or region-restricted
- ❌ Transcription failed: Video may have no speech or poor audio quality
- ❌ Budget exceeded: Task execution exceeds your Apify budget limit
- ❌ Processing timeout: Very long videos may not finish before the run times out; increasing the Memory setting in Run options speeds up processing
🤝 Support & Community
- 📧 Support: Contact us
- 💬 Community: Telegram Group