📹Video to Text (YouTube, TikTok, Twitter/X, Ins...)

Pricing

Pay per event

Developed by

NextAPI

Maintained by Community

Easily convert videos to text from over 1000 platforms, including YouTube, TikTok, Twitter/X, Instagram... Supports 12+ languages with translation options.

3.4 (3)

Last modified

2 hours ago

Video to Text

A powerful AI-driven video transcription API that extracts audio from videos and converts speech to accurate text with automatic translation capabilities across 1000+ platforms.

This Actor streamlines video transcription by combining advanced audio extraction, AI speech recognition, and multi-language translation into a single, efficient API endpoint. Get precise transcripts with timestamps from any video content including YouTube, TikTok, Instagram, and hundreds of other platforms.

Key Features

  • Multi-Platform Support: Extract and transcribe audio from 1000+ video platforms including YouTube, TikTok, Instagram, Facebook, Twitter, and more
  • Advanced AI Transcription: High-accuracy speech-to-text conversion using Faster Whisper AI models with 99.8% accuracy rate
  • Multi-Language Processing: Automatic language detection with transcription in original language plus translation to 12 target languages
  • Precise Timestamps: SRT-format timestamps for each segment enabling perfect subtitle synchronization
  • Dual Output: Receive both original transcript in detected language and translated version in your target language
  • Enterprise-Grade Infrastructure: Smart proxy rotation, retry mechanisms, and concurrent processing for reliable extraction
  • Format Flexibility: Support for all major audio/video formats with automatic format conversion
  • Cost-Effective Processing: Volume-based pricing starting at $0.16/minute with significant savings over traditional services

Configuration

Required Parameters

| Parameter | Type | Description |
|---|---|---|
| video_urls | array | List of video URLs to transcribe (1-5 videos max) |
| target_lang | string | Target language for translation output |

Video URLs Examples:

Target Language Options:

  • Available languages: "English", "Chinese", "Japanese", "Korean", "Spanish", "French", "German", "Russian", "Portuguese", "Italian", "Arabic", "Hindi"
  • Default: "English"
  • If the video is already in the target language, both transcripts will be identical
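
The limits above (1-5 URLs per run, one of the 12 listed target languages, "English" as the default) can be checked on the client side before a run is started. The following is a minimal sketch, not part of the Actor itself; `build_input`, `SUPPORTED_LANGUAGES`, and `MAX_VIDEOS` are hypothetical names introduced here for illustration.

```python
from typing import Dict, List

# Language names copied from the "Target Language Options" list above.
SUPPORTED_LANGUAGES = (
    "English", "Chinese", "Japanese", "Korean", "Spanish", "French",
    "German", "Russian", "Portuguese", "Italian", "Arabic", "Hindi",
)
MAX_VIDEOS = 5  # the parameter table allows 1-5 URLs per run


def build_input(video_urls: List[str], target_lang: str = "English") -> Dict[str, object]:
    """Return an input dict, raising ValueError if a documented limit is broken."""
    if not 1 <= len(video_urls) <= MAX_VIDEOS:
        raise ValueError(f"expected 1-{MAX_VIDEOS} video URLs, got {len(video_urls)}")
    if target_lang not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported target language: {target_lang!r}")
    # Copy the list so later changes by the caller do not alter the input.
    return {"video_urls": list(video_urls), "target_lang": target_lang}
```

Failing early on the client avoids starting a run that the Actor would presumably reject anyway; how the Actor itself reacts to invalid input is not documented here.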

Usage Example

{
  "video_urls": [
    "https://www.youtube.com/watch?v=WQNgQVRG9_U&ab_channel=Apify",
    "https://www.tiktok.com/@openai/video/7368854068212583726"
  ],
  "target_lang": "English"
}
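
One way to submit an input like the one above is through Apify's REST API. The sketch below only builds the HTTP request and does not send it. It assumes the general Apify endpoint for running an Actor synchronously and returning its dataset items; the Actor ID and token are placeholders the caller must supply, and `build_run_request` is a hypothetical helper, not something this Actor provides.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST that runs an Actor and returns its dataset items."""
    # Apify addresses Actors as "username~actor-name" inside URL paths.
    path_id = urllib.parse.quote(actor_id.replace("/", "~"), safe="~")
    url = f"{API_BASE}/acts/{path_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Sending is left to the caller, for example:
#   req = build_run_request("<username>/<actor-name>", "<APIFY_TOKEN>", run_input)
#   with urllib.request.urlopen(req) as resp:
#       items = json.load(resp)
```

Transcription can take a while, so for long videos an asynchronous run followed by polling may be a better fit than a synchronous call.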

Data Output

The Actor returns an array of transcription objects with comprehensive video metadata and dual-language transcripts. Output is available in multiple formats including JSON, CSV, Excel, and more.

Data Fields

Video Metadata

| Field | Type | Description |
|---|---|---|
| source_url | string | Original video URL that was processed |
| processor | string | URL of the Apify actor that processed this video |
| processed_at | ISO date | ISO formatted timestamp when the video was processed |
| platform | string | Video platform (YouTube, TikTok, Twitter, Instagram, etc.) |
| title | string | Original title of the video |
| description | string | Video description |
| duration | number | Video duration in seconds |
| author | string | Video creator or uploader username/name |
| author_id | string | Author's channel or user ID |

Content Metrics

| Field | Type | Description |
|---|---|---|
| view_count | integer | Number of views on the video |
| like_count | integer | Number of likes on the video |
| shares_count | integer | Number of shares/reposts |
| dislike_count | integer | Number of dislikes on the video |
| comment_count | integer | Number of comments on the video |
| categories | array | Video categories |
| tags | array | Video tags |
| published_at | ISO date | ISO formatted timestamp when the video was published |

Audio Information

| Field | Type | Description |
|---|---|---|
| audio_title | string | Track name if the video contains music |
| audio_artist | string | Artist name if the video contains music |

Transcription Data

| Field | Type | Description |
|---|---|---|
| source_transcript | object | Original transcribed text in detected language |
| target_transcript | object | Translated text in target language |

Transcript Object Structure:

| Field | Type | Description |
|---|---|---|
| language | string | Detected language code |
| text | string | Full transcribed text |
| segments | array | Time-segmented transcription |

Segment Object Structure:

| Field | Type | Description |
|---|---|---|
| start | string | Segment start time in SRT format (HH:MM:SS,mmm) |
| end | string | Segment end time in SRT format (HH:MM:SS,mmm) |
| text | string | Transcribed text for this segment |
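
Because `start` and `end` arrive as SRT-style strings rather than numbers, any arithmetic on them (segment length, filtering by time range) needs a parsing step first. A minimal sketch, assuming the `HH:MM:SS,mmm` layout described above; `srt_to_ms` and `segment_duration_ms` are hypothetical helper names. Integer milliseconds are used to avoid floating-point rounding.

```python
import re

# HH:MM:SS,mmm with two or more hour digits, as in the segment fields above.
_SRT_TIME = re.compile(r"(\d{2,}):([0-5]\d):([0-5]\d),(\d{3})")


def srt_to_ms(stamp: str) -> int:
    """Convert an SRT timestamp such as '00:00:03,500' to integer milliseconds."""
    match = _SRT_TIME.fullmatch(stamp)
    if match is None:
        raise ValueError(f"not an SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(group) for group in match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def segment_duration_ms(segment: dict) -> int:
    """Length of one segment object in milliseconds."""
    return srt_to_ms(segment["end"]) - srt_to_ms(segment["start"])
```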

Example Output

{
  "source_url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "processor": "https://apify.com/marketingme/video-to-text",
  "processed_at": "2025-01-15T10:30:45.000Z",
  "platform": "YouTube",
  "title": "Never Gonna Give You Up - Rick Astley",
  "description": "The official video for Rick Astley's 1987 hit...",
  "duration": 212,
  "author": "Rick Astley",
  "author_id": "RickAstleyVEVO",
  "audio_title": "Never Gonna Give You Up",
  "audio_artist": "Rick Astley",
  "view_count": 1400000000,
  "like_count": 15000000,
  "shares_count": 850000,
  "dislike_count": 180000,
  "comment_count": 2500000,
  "categories": ["Music", "Entertainment"],
  "tags": ["rick astley", "never gonna give you up", "80s music"],
  "published_at": "2009-10-25T07:57:33.000Z",
  "source_transcript": {
    "language": "English",
    "text": "We're no strangers to love, you know the rules and so do I...",
    "segments": [
      {
        "start": "00:00:00,000",
        "end": "00:00:03,500",
        "text": "We're no strangers to love"
      },
      {
        "start": "00:00:03,500",
        "end": "00:00:07,200",
        "text": "You know the rules and so do I"
      }
    ]
  },
  "target_transcript": {
    "language": "Spanish",
    "text": "No somos extraños al amor, conoces las reglas y yo también...",
    "segments": [
      {
        "start": "00:00:00,000",
        "end": "00:00:03,500",
        "text": "No somos extraños al amor"
      }
    ]
  }
}
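
Since each segment already carries SRT-format timestamps, a transcript object like the ones in the example can be turned into a subtitle file with very little code. The sketch below assumes only the `segments`, `start`, `end`, and `text` fields shown above; `to_srt` is a hypothetical helper name.

```python
def to_srt(transcript: dict) -> str:
    """Render a transcript object's segments as SubRip (.srt) text."""
    blocks = []
    for index, segment in enumerate(transcript["segments"], start=1):
        # Each SRT cue is: counter, "start --> end" timing line, then the text.
        timing = f"{segment['start']} --> {segment['end']}"
        blocks.append(f"{index}\n{timing}\n{segment['text'].strip()}")
    if not blocks:
        return ""
    # Cues are separated by one blank line; end the file with a newline.
    return "\n\n".join(blocks) + "\n"
```

Passing `item["target_transcript"]` instead of `item["source_transcript"]` would produce subtitles in the translated language.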

Supported Platforms

  • YouTube: World's largest video platform with full metadata extraction
  • TikTok: Short-form content with viral video processing
  • Instagram: Reels, IGTV, Stories, and live video content
  • Twitter/X: Video tweets and live streaming content
  • Facebook: Video posts, Stories, and live broadcasts
  • Reddit: Community video content and discussions
  • Twitch: Gaming streams and live content
  • LinkedIn: Professional video content
  • Vimeo: Creative and high-quality video content
  • And 990+ more platforms: Including educational, news, business, and international platforms

Supported Languages

This Actor supports automatic detection and transcription in the following 12 languages:

English, Chinese, Japanese, Korean, Spanish, French, German, Russian, Portuguese, Italian, Arabic, Hindi