
Universal Speech to Text Transcriber
Pricing
Pay per event

Transcribe audio from videos stored on Google Drive, Dropbox, GitHub raw, OneDrive, Box, iCloud, AWS S3, GCS, Azure Blob, and Backblaze B2. Convert share links to direct downloads for fast, accurate transcripts with timestamps and easy API integration.
Last modified
15 hours ago
Multi-Provider Audio Transcriber
Extract and transcribe audio from video and audio files stored across multiple cloud storage services! This powerful tool automatically converts share links to direct downloads, processes both video and audio files, and generates accurate, timestamped transcripts with automatic language detection. Unlock the full value of your media content for analysis, accessibility, and growth.
Note: Only publicly accessible files can be transcribed. Private or restricted files are not supported.
Smart Processing: Automatically detects file type and processes accordingly - extracts audio from videos or transcribes audio files directly.
Why Use This Tool?
- Automate tedious work: No more manual transcription. Get results in minutes, not hours.
- Unlock hidden insights: Search, analyze, and repurpose your video and audio content for new opportunities.
- Boost accessibility: Make your content inclusive for all audiences with accurate transcripts.
- Enhance discoverability: Improve SEO and compliance with rich, timestamped text.
- Stay ahead: Track trends, monitor competitors, and optimize your content strategy with ease.
Why Choose This Tool?
- 🔄 Multi-Provider Support: Unlike single-platform tools, we support 10+ major cloud storage services
- 🎯 Dual File Processing: Handle both video and audio files with a single tool - no need for separate solutions
- ⚡ Smart Automation: Automatic link conversion, file type detection, and processing optimization
- 🛡️ Enterprise Reliability: Built-in retry logic, memory optimization, and robust error handling
- 🌍 Language Intelligence: Automatic detection of the dominant spoken language (see Supported Languages below)
- 📊 Rich Output: Timestamped transcripts with metadata for comprehensive content analysis
- 💾 Memory Efficient: Optimized for cloud environments with automatic cleanup and resource management
Features
- Multi-Provider Support: Works with 10+ major cloud storage services
- Dual File Support: Handles both video and audio files seamlessly
- Smart Processing: Automatically detects file type and processes accordingly
- Automatic Link Conversion: Converts share links to direct download URLs automatically
- Smart Download Strategy: Platform-specific optimization for each provider
- Retry Logic: Robust download with automatic retry mechanisms
- Outputs transcript with timestamps and detected language
- Detects and outputs the dominant spoken language (see supported languages below)
- Memory-optimized processing with automatic cleanup
- File Size Limits: Maximum 500MB per file to ensure reliable processing
Pricing
Base + Per-Second Model
This Actor uses a pay-per-event pricing model with a base charge plus per-second transcription:
Pricing Structure
- actor start: $0.01 per Actor run (actor start cost)
- transcription-second: $0.0025 per second of audio transcribed
Key Features
- Actor start cost - $0.01 charged once per run
- Per-second transcription - additional cost based on actual audio duration
- Rounded to the nearest second for transcription duration
- No hidden fees or additional charges
- Automatic event tracking by the Apify platform
Pricing Examples
Successful Transcription (30-second file)
- actor start: $0.01
- transcription-second × 30: $0.075
- Total: $0.085
Short Video (2 seconds)
- actor start: $0.01
- transcription-second × 2: $0.005
- Total: $0.015
Failed Transcription
- actor start: $0.01
- Total: $0.01
Long Video (5 minutes)
- actor start: $0.01
- transcription-second × 300: $0.75
- Total: $0.76
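The arithmetic in these examples can be reproduced with a small estimator. This is a minimal sketch, not the Actor's billing code: the function name `estimate_cost` is hypothetical, and rounding halves upward is an assumption, since the pricing notes only say durations are rounded to the nearest second.

```python
from decimal import Decimal, ROUND_HALF_UP

ACTOR_START = Decimal("0.01")    # charged once per run
PER_SECOND = Decimal("0.0025")   # charged per second of audio transcribed


def estimate_cost(duration_sec: float, succeeded: bool = True) -> Decimal:
    """Estimate the charge in USD for a single Actor run."""
    if not succeeded:
        # A failed transcription is billed the start event only.
        return ACTOR_START
    # Assumption: halves round up; the source only says "nearest second".
    seconds = Decimal(str(duration_sec)).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return ACTOR_START + PER_SECOND * seconds


if __name__ == "__main__":
    for label, secs in [("30-second file", 30), ("2-second video", 2), ("5-minute video", 300)]:
        print(f"{label}: ${estimate_cost(secs)}")
```

`Decimal` is used instead of `float` so that sums such as 0.01 + 0.075 stay exact.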
Why This Pricing Model?
- Fair: Base charge covers setup costs, per-second covers actual processing
- Transparent: Clear base + usage pricing structure
- Predictable: Easy to estimate costs before running
- Cost-effective: Only pay for actual audio duration beyond base charge
- Simple: Just two event types to track
- Failed runs: Only the $0.01 actor start charge applies when transcription fails
Note: Pricing is based on actual processing events, so you only pay for what the Actor does. Apart from the $0.01 actor start charge, there are no hidden fees or minimum charges.
Supported Sources
Google Drive
- Automatically converts share links to direct download URLs
- Supports various share link formats:
https://drive.google.com/file/d/FILE_ID/view
https://drive.google.com/file/d/FILE_ID/edit
https://drive.google.com/open?id=FILE_ID
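To illustrate what share-link conversion can look like for these three formats, here is a rough sketch. The Actor's actual conversion logic is not documented here; the helper names are hypothetical, and the `uc?export=download` target is a commonly used direct-download pattern that is assumed rather than confirmed by this document.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse


def drive_file_id(share_url: str) -> Optional[str]:
    """Extract FILE_ID from the share link formats listed above."""
    parsed = urlparse(share_url)
    if parsed.netloc != "drive.google.com":
        return None
    # Matches /file/d/FILE_ID/view and /file/d/FILE_ID/edit
    match = re.match(r"^/file/d/([^/]+)(?:/|$)", parsed.path)
    if match:
        return match.group(1)
    # Matches /open?id=FILE_ID
    if parsed.path == "/open":
        ids = parse_qs(parsed.query).get("id")
        return ids[0] if ids else None
    return None


def drive_direct_url(share_url: str) -> Optional[str]:
    """Build a direct-download URL (assumed pattern, see lead-in)."""
    file_id = drive_file_id(share_url)
    if file_id is None:
        return None
    return f"https://drive.google.com/uc?export=download&id={file_id}"
```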
Dropbox
- Automatically converts share links to direct download URLs
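For Dropbox, a common way to turn a share link into a direct download is to set the `dl` query parameter to `1`. The sketch below assumes that convention; whether the Actor does exactly this is not stated, and `dropbox_direct_url` is a hypothetical helper name.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def dropbox_direct_url(share_url: str) -> str:
    """Return the share URL with dl=1, keeping any other query parameters."""
    parsed = urlparse(share_url)
    # Drop an existing dl=... value, then append dl=1.
    params = [
        (key, value)
        for key, value in parse_qsl(parsed.query, keep_blank_values=True)
        if key != "dl"
    ]
    params.append(("dl", "1"))
    return urlunparse(parsed._replace(query=urlencode(params)))
```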
GitHub Raw Content
- Direct links to raw video or audio files in GitHub repositories
OneDrive & SharePoint
- Automatically converts share links to direct download URLs
- Supports OneDrive personal and SharePoint business accounts
Box
- Automatically converts share links to direct download URLs
iCloud Drive
- Automatically converts share links to direct download URLs
AWS S3
- Direct access to S3 buckets and objects
- Supports presigned URLs and public objects
Google Cloud Storage
- Direct access to GCS buckets and objects
Azure Blob Storage
- Direct access to Azure Blob containers and blobs
Backblaze B2
- Direct access to B2 buckets and files
Supported File Types
Video Files
- MP4, AVI, MOV, WMV, FLV, WebM, MKV, 3GP, and other common video formats
- Audio extraction: Automatically extracts audio track for transcription
- Optimized audio extraction: 16 kHz mono output at a 64 kbps bitrate for minimal file size
- Format detection: Automatically detects video format and handles accordingly
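The extraction settings above (16 kHz, mono, 64k bitrate) map directly onto standard ffmpeg options. The following is a sketch under the assumption that ffmpeg is the extraction tool and is on the PATH; the document does not name the tool or the output codec, and both function names are hypothetical.

```python
import subprocess
from typing import List


def build_extract_cmd(video_path: str, audio_path: str) -> List[str]:
    """Build an ffmpeg command that writes a small mono audio track."""
    return [
        "ffmpeg",
        "-y",             # overwrite the output file if it exists
        "-i", video_path,
        "-vn",            # drop the video stream
        "-ac", "1",       # mono
        "-ar", "16000",   # 16 kHz sample rate
        "-b:a", "64k",    # 64 kbps audio bitrate
        audio_path,       # container/codec inferred from the extension
    ]


def extract_audio(video_path: str, audio_path: str) -> None:
    """Run the extraction; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_extract_cmd(video_path, audio_path), check=True)
```

For example, `extract_audio("talk.mp4", "talk.mp3")` would produce an MP3 suitable for speech transcription.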
Audio Files
- MP3, WAV, FLAC, M4A, AAC, OGG, WMA, AIFF, and other common audio formats
- Direct transcription: Transcribes audio files without conversion
- High-quality processing: Maintains original audio quality during transcription
- Format detection: Automatically identifies audio format and handles accordingly
- Longer duration support: Audio files are much smaller than video files, allowing transcription of longer content within the 500MB limit
Note: This tool intelligently processes both video and audio files. For videos, it extracts the audio track before transcription. For audio files, it transcribes directly for maximum efficiency.
💡 Audio vs Video: A 1-hour audio file (MP3) is typically 50-100MB, while a 1-hour video file can be 1-5GB. Use audio files for longer content to maximize transcription duration within the file size limit!
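To make the size comparison concrete, the relationship between bitrate, duration, and file size can be worked out directly. This sketch assumes a constant bitrate and decimal megabytes (1 MB = 1,000,000 bytes); the document does not say which definition the 500MB limit uses, and the function names are illustrative.

```python
LIMIT_BYTES = 500 * 1000 * 1000  # assumption: 500MB means decimal megabytes


def size_mb(bitrate_kbps: float, duration_sec: float) -> float:
    """Approximate file size in MB for a constant-bitrate stream."""
    return bitrate_kbps * 1000 / 8 * duration_sec / 1_000_000


def max_hours(bitrate_kbps: float, limit_bytes: int = LIMIT_BYTES) -> float:
    """Longest duration (in hours) that fits within the size limit."""
    return limit_bytes * 8 / (bitrate_kbps * 1000) / 3600


if __name__ == "__main__":
    print(f"1 hour at 128 kbps is about {size_mb(128, 3600):.1f} MB")
    print(f"500MB holds about {max_hours(128):.1f} hours at 128 kbps")
```

Under these assumptions a 1-hour MP3 at 128 kbps lands inside the 50-100MB range quoted above.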
Use Cases
- Content Creators: Generate captions, repurpose content, and analyze performance.
- Marketers: Track brand mentions, analyze trends, and optimize strategy.
- Researchers: Study video and audio content and extract insights from public discourse.
- Businesses: Create training materials, monitor feedback, and maintain compliance records.
- Developers: Transcribe documentation videos, tutorials, and demos stored in cloud platforms.
- Podcasters: Transcribe audio episodes for accessibility and SEO.
- Educators: Convert lecture recordings to searchable text.
Supported Languages
- Spanish: es
- English: en
- Hindi: hi
- Japanese: ja
- Russian: ru
- Ukrainian: uk
- Swedish: sv
- Chinese: zh
- Portuguese: pt
- Dutch: nl
- Turkish: tr
- French: fr
- German: de
- Indonesian: id
- Korean: ko
- Italian: it
Input Examples
Basic Input
{"start_urls": "https://drive.google.com/file/d/example/view"}
Provider-Specific Examples
Google Drive
{"start_urls": "https://drive.google.com/file/d/1ABC123DEF456/view"}
Dropbox
{"start_urls": "https://www.dropbox.com/s/abc123def456/video.mp4?dl=0"}
GitHub Raw
{"start_urls": "https://raw.githubusercontent.com/username/repo/main/video.mp4"}
OneDrive
{"start_urls": "https://1drv.ms/v/s!ABC123DEF456"}
Box
{"start_urls": "https://app.box.com/s/abc123def456"}
AWS S3
{"start_urls": "https://bucket-name.s3.amazonaws.com/video.mp4"}
Output Example (Success)
{
  "sourceUrl": "https://drive.google.com/file/d/example/view",
  "videoId": "unique-video-id",
  "status": "success",
  "durationSec": 45.2,
  "transcript": "[0.00s - 2.50s] Welcome to my channel! [2.50s - 8.10s] Today I'm going to show you how to make the perfect pasta dish.",
  "detected_language": "en",
  "timestamp": "2024-01-01T12:00:00.000Z"
}
Output Example (Failed - Unsupported URL)
{
  "sourceUrl": "https://www.youtube.com/watch?v=example",
  "videoId": "unique-video-id",
  "status": "failed",
  "transcript": "Unsupported URL type. Please provide a valid URL from supported cloud storage services.",
  "durationSec": 0,
  "timestamp": "2024-01-01T12:00:00.000Z"
}
Output Fields
- sourceUrl: Original file URL
- videoId: Unique identifier for the file
- status: "success" or "failed"
- durationSec: File duration in seconds (0 for failed outputs)
- transcript: Timestamped transcription (on success) or an error message (on failure)
- detected_language: BCP-47 code for the detected language (only for success)
- timestamp: Time the file was processed, as an ISO 8601 UTC timestamp
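Because the transcript arrives as one string with inline `[start - end]` markers, downstream code usually needs to split it into segments. The parser below is a minimal sketch based only on the format shown in the success example; `parse_transcript` and `Segment` are hypothetical names, and other marker formats are not handled.

```python
import re
from typing import List, NamedTuple


class Segment(NamedTuple):
    start: float  # seconds
    end: float    # seconds
    text: str


# Matches "[0.00s - 2.50s] some text" up to the next marker or end of string.
_SEGMENT_RE = re.compile(
    r"\[(\d+(?:\.\d+)?)s - (\d+(?:\.\d+)?)s\]\s*(.*?)(?=\s*\[\d+(?:\.\d+)?s - |\Z)",
    re.DOTALL,
)


def parse_transcript(item: dict) -> List[Segment]:
    """Split a successful output item into timed segments."""
    if item.get("status") != "success":
        # For failed items, "transcript" holds the error message instead.
        raise ValueError(item.get("transcript", "transcription failed"))
    return [
        Segment(float(start), float(end), text.strip())
        for start, end, text in _SEGMENT_RE.findall(item["transcript"])
    ]
```

Checking `status` first matters because a failed item reuses the `transcript` field for its error text.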
Usage
- Provide a single file URL from any supported cloud storage service
- Ensure the file is publicly accessible
- Check file size - maximum 500MB per file
- Generate transcript with automatic language detection
- Receive timestamped transcript in JSON format
Note: Only one URL can be processed per actor run. For multiple files, run the actor separately for each URL.
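Since each run accepts a single URL, batch jobs need one run per file. The sketch below shows one way to do that with the third-party `apify-client` package (`pip install apify-client`). The `actor_id` and `token` values are placeholders you must supply, as this document does not state the Actor's ID, and `make_run_inputs` and `run_each` are hypothetical helper names.

```python
from typing import Dict, Iterable, List


def make_run_inputs(urls: Iterable[str]) -> List[Dict[str, str]]:
    """Build one input object per URL, matching the input examples above."""
    return [{"start_urls": url} for url in urls]


def run_each(urls: Iterable[str], token: str, actor_id: str) -> List[dict]:
    """Start one Actor run per URL and collect the output items."""
    # Imported lazily so the input helper works without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    results: List[dict] = []
    for run_input in make_run_inputs(urls):
        run = client.actor(actor_id).call(run_input=run_input)
        if run is None:  # the run did not finish
            continue
        results.extend(client.dataset(run["defaultDatasetId"]).iterate_items())
    return results
```

Each run incurs its own actor start charge, so the per-run base cost applies once per file.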
💡 Pro Tip: Audio files are much smaller than video files! If you have long audio content to transcribe, consider using audio files (MP3, WAV, etc.) instead of video files. This allows you to transcribe much longer durations while staying within the 500MB file size limit.
Technical Details
- Unified Processing: Single interface handles all supported providers and file types
- Intelligent Detection: Automatically identifies cloud storage provider and file format
- Smart Link Conversion: Converts share links to direct download URLs for maximum compatibility
- Robust Download: Built-in retry logic and error handling for reliable file processing
- Optimized Processing:
  - Videos: Efficient audio extraction with minimal quality loss
  - Audio: Direct transcription without unnecessary conversion
- Quality Assurance: Automatic duration validation and file size limits
- Memory Management: Optimized for cloud environments with automatic cleanup
- Security: No persistent data storage - files are processed and immediately deleted
Memory Optimization Features
- Efficient Processing: Optimized for cloud environments with automatic cleanup
- Chunked downloads (8KB chunks) to minimize memory usage
- Automatic file cleanup after processing
- Garbage collection at key processing stages
- Memory monitoring throughout the process
- Optimized audio extraction with minimal quality loss
- Retry logic prevents memory waste from failed downloads
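The chunked-download and retry ideas above can be sketched with the standard library alone. This is an illustration, not the Actor's implementation: the 8KB chunk size comes from the list above, while the retry count, the exponential backoff, and the function names are assumptions.

```python
import time
from typing import BinaryIO, Callable, TypeVar

T = TypeVar("T")
CHUNK_SIZE = 8 * 1024  # 8KB, as stated above


def copy_in_chunks(src: BinaryIO, dst: BinaryIO, chunk_size: int = CHUNK_SIZE) -> int:
    """Stream src to dst without holding the whole file in memory."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        total += len(chunk)
    return total


def with_retries(
    action: Callable[[], T],
    attempts: int = 3,          # assumed value
    base_delay: float = 1.0,    # assumed value, doubles after each failure
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call action, retrying on OSError (includes urllib's URLError)."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except OSError:
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

A download could then be written as `with_retries(lambda: copy_in_chunks(urllib.request.urlopen(url), out_file))`, where `url` and `out_file` are supplied by the caller.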
Example INPUT.json
{"start_urls": "https://drive.google.com/file/d/1ABC123DEF456/view"}
Supported URL formats:
- Google Drive: https://drive.google.com/file/d/FILE_ID/view
- Dropbox: https://www.dropbox.com/s/FILE_ID/filename.mp4?dl=0
- GitHub Raw: https://raw.githubusercontent.com/user/repo/main/video.mp4
- OneDrive: https://1drv.ms/v/s!FILE_ID
- Box: https://app.box.com/s/FILE_ID
For more details, contact the maintainer or email us at contact@tictech.id