Universal Speech to Text Transcriber

Transcribe audio from video and audio files hosted on Google Drive, Dropbox, GitHub Raw, OneDrive, SoundCloud, iCloud, AWS S3, GCS, X, TikTok, and more. Convert share links to direct downloads for fast, accurate transcripts with timestamps and easy API integration.

Developer: TicTech

Multi-Provider Audio Transcriber

Extract and transcribe audio from video and audio files across cloud storage, social platforms, and direct media links. Paste a public URL, get a timestamped transcript with automatic language detection.

What You Can Transcribe

Paste one public URL per run from any of the sources below.

Video & audio platforms

TikTok — public video links
SoundCloud — public tracks and podcasts (single track URLs only, not /sets/ playlists)
Vimeo — public videos
Twitter / X — public video posts
Facebook — public videos (facebook.com, fb.watch)
Twitch — public clips and VODs
Reddit — public video posts
Dailymotion — public videos
LinkedIn — public video posts

Many other public video/audio URLs are attempted automatically when a link looks downloadable.

Cloud storage

Google Drive, Dropbox, GitHub Raw, OneDrive / SharePoint, Box, iCloud Drive
AWS S3, Google Cloud Storage, Azure Blob, Backblaze B2

Share links are converted to direct downloads automatically where needed.

Direct media & CDN links

Public URLs that return the actual file (not an HTML page)
CDN links — CloudFront, Akamai, custom CDNs
Direct file URLs — .mp4, .mp3, .wav, .m4a, and other common formats
Presigned or signed URLs that point straight to media

{ "start_urls": "https://soundcloud.com/artist/track-name" }
{ "start_urls": "https://cdn.example.com/podcast/episode.mp3" }

Note: Only publicly accessible content is supported. Private, login-required, geo-blocked, or age-restricted media may fail.

Smart processing: Detects file type automatically — extracts audio from video or transcribes audio files directly.

Why Use This Tool?

Automate tedious work: No more manual transcription—get results in minutes, not hours.
Unlock hidden insights: Search, analyze, and repurpose your video and audio content for new opportunities.
Boost accessibility: Make your content inclusive for all audiences with accurate transcripts.
Enhance discoverability: Improve SEO and compliance with rich, timestamped text.
Stay ahead: Track trends, monitor competitors, and optimize your content strategy with ease.

Why Choose This Tool?

🔄 Multi-Provider Support: Cloud storage, social/video platforms, and direct CDN/media links
🎯 Dual File Processing: Handle both video and audio files with a single tool - no need for separate solutions
⚡ Smart Automation: Automatic link conversion, file type detection, and processing optimization
🛡️ Enterprise Reliability: Built-in retry logic, memory optimization, and robust error handling
🌍 Language Intelligence: Automatic language detection (see Supported Languages below for the documented language codes)
📊 Rich Output: Timestamped transcripts with metadata for comprehensive content analysis
💾 Memory Efficient: Optimized for cloud environments with automatic cleanup and resource management

Features

Multi-Provider Support: Cloud storage, video platforms (TikTok, Vimeo, etc.), and direct CDN links
Dual File Support: Handles both video and audio files seamlessly
Smart Processing: Automatically detects file type and processes accordingly
Automatic Link Conversion: Converts share links to direct download URLs automatically
Smart Download Strategy: Platform-specific optimization for each provider
Retry Logic: Robust download with automatic retry mechanisms
Timestamped Output: Returns the transcript with timestamps
Language Detection: Detects and outputs the dominant spoken language (see Supported Languages below)
Memory Optimization: Memory-optimized processing with automatic cleanup
File Size Limits: Maximum 3GB per file to ensure reliable processing and cost efficiency
Memory Requirements: Larger files require increased Apify run memory allocation (see Memory Requirements below for details)

Pricing

Base + Per-Second Model

This Actor uses a pay-per-event pricing model with a base charge plus per-second transcription:

Pricing Structure

actor start: $0.01, charged once per Actor run
transcription-second: $0.0025 per second of audio transcribed

Key Features

Actor start cost - $0.01 charged once per run
Per-second transcription - additional cost based on actual audio duration
Transcription duration is rounded up to the nearest second
No hidden fees or additional charges
Automatic event tracking by the Apify platform

Pricing Examples

Successful Transcription (30-second file)

actor start: $0.01
transcription-second × 30: $0.075
Total: $0.085

Short Video (2 seconds)

actor start: $0.01
transcription-second × 2: $0.005
Total: $0.015

Failed Transcription

actor start: $0.01
Total: $0.01

Long Video (5 minutes)

actor start: $0.01
transcription-second × 300: $0.75
Total: $0.76
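
The totals above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming only the two event prices listed under Pricing Structure; the function name estimate_cost is illustrative and is not part of the Actor.

```python
import math
from decimal import Decimal

# Event prices copied from the Pricing Structure section.
ACTOR_START_USD = Decimal("0.01")
PER_SECOND_USD = Decimal("0.0025")


def estimate_cost(duration_sec: float, succeeded: bool = True) -> Decimal:
    """Estimate the charge for one run, in US dollars.

    The duration is rounded up to the next whole second, as the pricing
    notes describe. A failed run is charged only the start event.
    """
    if duration_sec < 0:
        raise ValueError("duration_sec must be non-negative")
    if not succeeded:
        return ACTOR_START_USD
    billed_seconds = math.ceil(duration_sec)
    return ACTOR_START_USD + PER_SECOND_USD * billed_seconds
```

Decimal is used instead of float so that the sub-cent prices add up exactly. Calling estimate_cost(300) gives the same total as the 5-minute example.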

Why This Pricing Model?

Fair: Base charge covers setup costs, per-second covers actual processing
Transparent: Clear base + usage pricing structure
Predictable: Easy to estimate costs before running
Cost-effective: Only pay for actual audio duration beyond base charge
Simple: Just two event types to track
Low cost on failure: A failed run is charged only the $0.01 actor start event

Note: Pricing is based on actual processing events, so beyond the $0.01 actor start charge you pay only for the seconds of audio transcribed. There are no hidden fees.

Supported Sources (reference)

The list below mirrors What You Can Transcribe above with a few extra format notes.

Cloud Storage

Google Drive – share links auto-converted to direct downloads
Dropbox – share links auto-converted to direct downloads
GitHub Raw – direct links to files in repositories (/raw/ URLs)
OneDrive & SharePoint – share links auto-converted to direct downloads
Box – share links auto-converted to direct downloads
iCloud Drive – share links auto-converted to direct downloads
AWS S3 – public buckets, objects, and presigned URLs
Google Cloud Storage – public objects and signed URLs
Azure Blob Storage – public blobs and SAS URLs
Backblaze B2 – public files and download URLs
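
This README does not document how the Actor rewrites share links, so the following is only an illustrative sketch of the idea. It assumes two widely used URL conventions: Google Drive's uc?export=download&id=FILE_ID form and Dropbox's dl=1 query parameter. The function name to_direct_download is hypothetical.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Captures FILE_ID from paths such as /file/d/FILE_ID/view
_DRIVE_FILE_ID = re.compile(r"/file/d/([^/]+)")


def to_direct_download(url: str) -> str:
    """Rewrite a known share link to a direct-download URL.

    URLs from hosts this sketch does not recognize are returned unchanged.
    """
    parts = urlsplit(url)
    host = parts.netloc.lower()

    if host == "drive.google.com":
        match = _DRIVE_FILE_ID.search(parts.path)
        if match:
            file_id = match.group(1)
            return f"https://drive.google.com/uc?export=download&id={file_id}"

    if host in ("dropbox.com", "www.dropbox.com"):
        # Assumed convention: dl=1 asks Dropbox to serve the file itself.
        query = dict(parse_qsl(parts.query))
        query["dl"] = "1"
        return urlunsplit(parts._replace(query=urlencode(query)))

    return url
```

You do not need to perform this conversion yourself; the Actor accepts the share link as-is. The sketch only shows what "auto-converted" typically means for these two providers.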

Video & Social Platforms

Public posts and videos from these platforms (via yt-dlp):

TikTok – tiktok.com, vm.tiktok.com
Vimeo – public videos
Twitter / X – public video posts
Facebook – public videos (facebook.com, fb.watch)
Twitch – public clips and VODs
Reddit – public video posts
Dailymotion – public videos
SoundCloud – public single tracks (not /sets/ playlists)
LinkedIn – public video posts

Note: Only publicly accessible content is supported. Private, login-required, geo-blocked, or age-restricted media may fail.

Direct Media & CDN Links

Any direct file URL that returns the actual audio or video file over HTTP(S):

CDN links (e.g. CloudFront, Akamai, custom CDNs)
Signed or presigned URLs that point directly to .mp4, .mp3, .wav, etc.
Public hosting URLs where the response is the media file (not an HTML page)

Examples:

{ "start_urls": "https://cdn.example.com/media/episode.mp3" }
{ "start_urls": "https://d111111abcdef8.cloudfront.net/video.mp4" }
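
To check in advance whether a URL returns a media file rather than an HTML page, you can inspect its Content-Type header. The sketch below is one possible pre-flight check using only the Python standard library; the helper names are hypothetical, and treating application/octet-stream as "possibly media" is an assumption of this sketch, not documented Actor behavior.

```python
from urllib.request import Request, urlopen

MEDIA_PREFIXES = ("audio/", "video/")
# Some servers label media files generically; this sketch accepts that type.
GENERIC_TYPES = ("application/octet-stream",)


def is_media_content_type(content_type: str) -> bool:
    """Return True if a Content-Type header value names audio or video."""
    mime = content_type.split(";", 1)[0].strip().lower()
    return mime.startswith(MEDIA_PREFIXES) or mime in GENERIC_TYPES


def url_returns_media(url: str, timeout: float = 10.0) -> bool:
    """Send a HEAD request and inspect the Content-Type of the response."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return is_media_content_type(response.headers.get("Content-Type", ""))
```

Some hosts reject HEAD requests, so a False result or an error here does not prove that the Actor would fail on the same URL.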

Supported File Types

Video Files

MP4, AVI, MOV, WMV, FLV, WebM, MKV, 3GP, and other common video formats
Audio extraction: Automatically extracts audio track for transcription
Optimized audio extraction: 16 kHz mono output at a 64 kbps bitrate for minimal file size
Format detection: Automatically detects video format and handles accordingly
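
For readers who want to pre-extract audio locally, the settings above (16 kHz, mono, 64k bitrate) map onto standard ffmpeg options. This is a sketch under the assumption that ffmpeg is installed and on the PATH; the README does not state which tool the Actor uses internally, and the function names are illustrative.

```python
import subprocess
from pathlib import Path


def build_extract_command(source: Path, target: Path) -> list[str]:
    """Build an ffmpeg command that writes 16 kHz mono audio at 64 kbit/s."""
    return [
        "ffmpeg",
        "-y",            # overwrite the target if it already exists
        "-i", str(source),
        "-vn",           # drop the video stream
        "-ac", "1",      # one channel (mono)
        "-ar", "16000",  # 16 kHz sample rate
        "-b:a", "64k",   # 64 kbit/s audio bitrate
        str(target),
    ]


def extract_audio(source: Path, target: Path) -> None:
    """Run ffmpeg; raises CalledProcessError if the extraction fails."""
    subprocess.run(build_extract_command(source, target), check=True)
```

Uploading the resulting audio file instead of the original video keeps the input well under the 3GB limit.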

Audio Files

MP3, WAV, FLAC, M4A, AAC, OGG, WMA, AIFF, and other common audio formats
Direct transcription: Transcribes audio files without conversion
High-quality processing: Maintains original audio quality during transcription
Format detection: Automatically identifies audio format and handles accordingly
Longer duration support: Audio files are much smaller than video files, allowing transcription of longer content within the 3GB limit

Note: This tool intelligently processes both video and audio files. For videos, it extracts the audio track before transcription. For audio files, it transcribes directly for maximum efficiency.

💡 Audio vs Video: A 1-hour audio file (MP3) is typically 50-100MB, while a 1-hour video file can be 1-5GB. Use audio files for longer content to maximize transcription duration within the file size limit!

Use Cases

Content Creators: Generate captions, repurpose content, and analyze performance.
Marketers: Track brand mentions, analyze trends, and optimize strategy.
Researchers: Study video and audio content and extract insights from public discourse.
Businesses: Create training materials, monitor feedback, and maintain compliance records.
Developers: Transcribe documentation videos, tutorials, and demos stored in cloud platforms.
Podcasters: Transcribe audio episodes for accessibility and SEO.
Educators: Convert lecture recordings to searchable text.

Supported Languages

Spanish: es
English: en
Hindi: hi
Japanese: ja
Russian: ru
Ukrainian: uk
Swedish: sv
Chinese: zh
Portuguese: pt
Dutch: nl
Turkish: tr
French: fr
German: de
Indonesian: id
Korean: ko
Italian: it

Input Examples

Basic Input

{
  "start_urls": "https://drive.google.com/file/d/example/view"
}

Provider-Specific Examples

Google Drive

{
  "start_urls": "https://drive.google.com/file/d/1ABC123DEF456/view"
}

Dropbox

{
  "start_urls": "https://www.dropbox.com/s/abc123def456/video.mp4?dl=0"
}

GitHub Raw

{
  "start_urls": "https://raw.githubusercontent.com/username/repo/main/video.mp4"
}

OneDrive

{
  "start_urls": "https://1drv.ms/v/s!ABC123DEF456"
}

Box

{
  "start_urls": "https://app.box.com/s/abc123def456"
}

AWS S3

{
  "start_urls": "https://bucket-name.s3.amazonaws.com/video.mp4"
}

TikTok

{
  "start_urls": "https://www.tiktok.com/@user/video/VIDEO_ID"
}

Direct CDN / media file

{
  "start_urls": "https://cdn.example.com/files/podcast.mp3"
}

Output Example (Success)

{
  "sourceUrl": "https://drive.google.com/file/d/example/view",
  "videoId": "unique-video-id",
  "status": "success",
  "durationSec": 45.2,
  "transcript": "[0.00s - 2.50s] Welcome to my channel! [2.50s - 8.10s] Today I'm going to show you how to make the perfect pasta dish.",
  "detected_language": "en",
  "timestamp": "2024-01-01T12:00:00.000Z"
}

Output Example (Unsupported URL)

{
  "sourceUrl": "https://example.com/not-a-media-link",
  "videoId": "unique-video-id",
  "status": "failed",
  "transcript": "The input link is not supported by this Actor. Please review the README for supported cloud storage services, video platforms, and direct media/CDN link formats, then submit a publicly accessible URL.",
  "durationSec": 0,
  "timestamp": "2024-01-01T12:00:00.000Z"
}

Output Fields

sourceUrl: Original file URL
videoId: Unique identifier for the file
status: "success" or "failed"
durationSec: File duration in seconds (0 when nothing was transcribed)
transcript: Transcription with timestamps in the form [start - end] text (for success) or an error message (for failed outputs)
detected_language: Two-letter language code for the detected language, for example en (only for success; see Supported Languages)
timestamp: ISO 8601 UTC timestamp of when the file was processed
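
The transcript field is a single string with inline [start - end] markers, so downstream code usually needs to split it into segments. The sketch below assumes the marker format shown in the success example ("[0.00s - 2.50s] text"); the Segment type and parse_transcript name are illustrative.

```python
import re
from typing import NamedTuple


class Segment(NamedTuple):
    start: float  # seconds
    end: float    # seconds
    text: str


# Matches markers such as "[0.00s - 2.50s]" and captures both times.
_MARKER = re.compile(r"\[(\d+(?:\.\d+)?)s\s*-\s*(\d+(?:\.\d+)?)s\]")


def parse_transcript(transcript: str) -> list[Segment]:
    """Split a timestamped transcript string into segments.

    Returns an empty list when the string has no markers, which is the
    case for the error message returned for unsupported URLs.
    """
    markers = list(_MARKER.finditer(transcript))
    segments = []
    for index, marker in enumerate(markers):
        # A segment's text runs from this marker to the start of the next.
        if index + 1 < len(markers):
            text_end = markers[index + 1].start()
        else:
            text_end = len(transcript)
        text = transcript[marker.end():text_end].strip()
        segments.append(
            Segment(float(marker.group(1)), float(marker.group(2)), text)
        )
    return segments
```

From the parsed segments it is straightforward to build captions or to search for the time at which a phrase was spoken.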

Usage

Provide a single URL – cloud storage, video platform, or direct media/CDN link
Ensure the file is publicly accessible
Check file size - maximum 3GB per file
Set appropriate Apify run memory - larger files require more memory (see Memory Requirements below)
Run the Actor to generate the transcript with automatic language detection
Read the timestamped transcript from the run output in JSON format

Note: Only one URL can be processed per Actor run. For multiple files, run the Actor separately for each URL.
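
Because each run accepts a single URL, batch jobs need a loop that starts one run per file. The sketch below assumes a client object that behaves like ApifyClient from the apify-client Python package (actor(...).call(...) and dataset(...).iterate_items()); the function name transcribe_each and the placeholder Actor ID are hypothetical.

```python
from typing import Any, Iterable


def transcribe_each(client: Any, actor_id: str, urls: Iterable[str]) -> list[dict]:
    """Start one Actor run per URL and collect every dataset item.

    `client` is expected to behave like apify_client.ApifyClient.
    """
    results: list[dict] = []
    for url in urls:
        # One URL per run: start_urls is a single string, not a list.
        run = client.actor(actor_id).call(run_input={"start_urls": url})
        if run is None:
            # Placeholder record created by this sketch, not by the Actor.
            results.append({"sourceUrl": url, "status": "failed"})
            continue
        dataset = client.dataset(run["defaultDatasetId"])
        results.extend(dataset.iterate_items())
    return results


# Usage (requires `pip install apify-client` and a real API token):
#
#   from apify_client import ApifyClient
#   client = ApifyClient("<YOUR_APIFY_TOKEN>")
#   items = transcribe_each(client, "<ACTOR_ID>", ["https://cdn.example.com/a.mp3"])
```

Each run is billed the $0.01 start event, so a batch of N files costs N start events plus the transcribed seconds.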

Memory Requirements

Important: Larger files require more memory allocation. Please adjust the Apify platform's run memory setting based on your file size:

Small files (< 100MB): 128MB - 512MB memory is sufficient
Medium files (100MB - 1GB): 512MB - 2GB memory recommended
Large files (1GB - 2GB): 2GB - 4GB memory recommended
Very large files (2GB - 3GB): 4GB - 8GB memory required
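
The tiers above can be turned into a small lookup when runs are started programmatically. This sketch makes two assumptions that the README does not state: it picks the upper end of each recommended range, and it treats 1MB as 1024 × 1024 bytes. The function name is illustrative.

```python
MB = 1024 * 1024
GB = 1024 * MB
MAX_FILE_BYTES = 3 * GB  # file size limit stated in this README


def recommended_memory_mb(file_size_bytes: int) -> int:
    """Map a file size to the upper end of the recommended memory range."""
    if file_size_bytes < 0:
        raise ValueError("file_size_bytes must be non-negative")
    if file_size_bytes > MAX_FILE_BYTES:
        raise ValueError("file exceeds the 3GB limit")
    if file_size_bytes < 100 * MB:
        return 512    # small files
    if file_size_bytes <= 1 * GB:
        return 2048   # medium files
    if file_size_bytes <= 2 * GB:
        return 4096   # large files
    return 8192       # very large files
```

The returned values are powers of two, which matches the memory options offered in the Apify run settings.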

⚠️ Memory Warning: If you encounter memory errors or timeouts when processing larger files, increase the Apify run memory setting in the Actor configuration. The Actor supports up to 8GB of memory. Larger files use proportionally more memory during download, audio extraction, and transcription.

💡 Pro Tip: Audio files are much smaller than video files! If you have long audio content to transcribe, consider using audio files (MP3, WAV, etc.) instead of video files. This allows you to transcribe much longer durations while staying within the 3GB file size limit.

Technical Details

Unified Processing: Single interface handles all supported providers and file types
Intelligent Detection: Automatically identifies cloud storage provider and file format
Smart Link Conversion: Converts share links to direct download URLs for maximum compatibility
Robust Download: Built-in retry logic and error handling for reliable file processing
Optimized Processing:
- Videos: Efficient audio extraction with minimal quality loss
- Audio: Direct transcription without unnecessary conversion
Quality Assurance: Automatic duration validation and file size limits
Memory Management: Optimized for cloud environments with automatic cleanup
Security: No persistent data storage - files are processed and immediately deleted

Memory Optimization Features

Efficient Processing: Optimized for cloud environments with automatic cleanup
Chunked downloads (8KB chunks) to minimize memory usage
Automatic file cleanup after processing
Garbage collection at key processing stages
Memory monitoring throughout the process
Optimized audio extraction with minimal quality loss
Retry logic prevents memory waste from failed downloads
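
Chunked downloading keeps memory use flat because only one chunk is held in memory at a time, regardless of file size. The following is a generic illustration of the technique with the 8KB chunk size mentioned above; it is not the Actor's own code, and the function name is hypothetical.

```python
from typing import BinaryIO

CHUNK_SIZE = 8 * 1024  # 8KB, the chunk size named in the list above


def copy_in_chunks(source: BinaryIO, target: BinaryIO,
                   chunk_size: int = CHUNK_SIZE) -> int:
    """Copy source to target one chunk at a time; return bytes copied."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read means the stream is exhausted
            break
        target.write(chunk)
        total += len(chunk)
    return total
```

The same function works for a network download: pass the response object returned by urllib.request.urlopen as source and a file opened in "wb" mode as target.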

⚠️ Important: While the Actor is optimized for memory efficiency, processing larger files (especially videos over 1GB) requires adequate memory allocation. Always increase the Apify run memory setting in the Actor configuration when processing larger files to avoid out-of-memory errors. See the Memory Requirements section above for recommended memory allocations based on file size.

Example INPUT.json

{
  "start_urls": "https://drive.google.com/file/d/1ABC123DEF456/view"
}

Supported URL formats:

Google Drive: https://drive.google.com/file/d/FILE_ID/view
Dropbox: https://www.dropbox.com/s/FILE_ID/filename.mp4?dl=0
GitHub Raw: https://raw.githubusercontent.com/user/repo/main/video.mp4
OneDrive: https://1drv.ms/v/s!FILE_ID
Box: https://app.box.com/s/FILE_ID
TikTok: https://www.tiktok.com/@user/video/VIDEO_ID
Direct CDN: https://cdn.example.com/media/file.mp3

For more details, contact the maintainer or email us at contact@tictech.id

