Auto Caption Video
Pricing
from $0.01 / 1,000 results
Auto-generate TikTok & Reels-style captions for videos. AI-powered transcription creates animated word-by-word subtitles with customizable colors and fonts. Supports 50+ languages. Just paste a URL - get professional hardcoded captions. Ideal for social media content creators.
Developer: Santhej Kallada
Automatically add beautiful, animated TikTok/Reels-style subtitles to your videos using AI-powered transcription.
Features
- Accurate Transcription: AI-powered speech-to-text with word-level timestamps
- Animated Subtitles: Modern TikTok/Reels-style animations with word-by-word highlighting
- Professional Styling: Bold fonts, color highlighting, smooth pop-up animations
- Multiple Sources: Supports direct video URLs and Google Drive links
- Customizable: Adjust font size, colors, position, words per line, and more
- Fast Processing: Optimized ffmpeg settings for quality and speed
- Multiple Formats: Supports MP4, MOV, AVI, MKV, and WEBM videos
- 50+ Languages: Auto-detection or specify your language
How It Works
- Download: Fetches your video from a URL or Google Drive
- Extract: Extracts audio optimized for transcription
- Transcribe: AI generates text with precise word timings
- Generate: Creates animated ASS subtitles with professional styling
- Process: Burns subtitles into the video permanently
- Deliver: Returns a downloadable video with embedded subtitles
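The Extract step can be illustrated with a small sketch that builds an ffmpeg command line. The actor's real settings are not documented here; mono 16 kHz WAV is an assumption, chosen because it is a common input format for speech-to-text models. The function name `build_audio_extract_cmd` is hypothetical.

```python
def build_audio_extract_cmd(video_path: str, audio_path: str) -> list:
    """Return an ffmpeg argument list that extracts a mono 16 kHz audio track."""
    return [
        "ffmpeg",
        "-y",              # overwrite the output file if it already exists
        "-i", video_path,  # input video
        "-vn",             # drop the video stream, keep audio only
        "-ac", "1",        # downmix to mono
        "-ar", "16000",    # resample to 16 kHz (assumed, not confirmed by the actor)
        audio_path,
    ]


if __name__ == "__main__":
    print(" ".join(build_audio_extract_cmd("input.mp4", "audio.wav")))
```

The list can be passed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed.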
Pricing
Simple, Fixed Pricing: $0.17 per minute of video
| Video Length | Cost |
|---|---|
| 1 minute | $0.17 |
| 2 minutes | $0.34 |
| 3 minutes | $0.51 |
| 5 minutes | $0.85 |
All-inclusive pricing. No hidden fees or API costs.
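A rough cost estimate follows directly from the per-minute rate. The sketch below works in whole cents to avoid floating-point drift. How partial minutes are billed is not stated above; rounding up to the next full minute is an assumption, and `estimate_cost_cents` is a hypothetical helper.

```python
import math

PRICE_CENTS_PER_MINUTE = 17  # $0.17 per minute, from the pricing table


def estimate_cost_cents(duration_seconds: float) -> int:
    """Estimate the cost in cents, billing each started minute in full (assumption)."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    minutes = math.ceil(duration_seconds / 60)
    return minutes * PRICE_CENTS_PER_MINUTE


if __name__ == "__main__":
    for minutes in (1, 2, 3, 5):
        cents = estimate_cost_cents(minutes * 60)
        print(f"{minutes} min -> ${cents / 100:.2f}")
```

For whole-minute durations this reproduces the table: 17, 34, 51, and 85 cents.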
Memory Requirements
Use at least 4GB RAM for optimal performance. Running with 1GB RAM will result in significantly slower processing times. The actor is configured to use 4GB by default.
Input Parameters
Required
- videoUrl (string): Direct URL to your video file or Google Drive share link
  - Example: https://example.com/video.mp4
  - Example: https://drive.google.com/file/d/{FILE_ID}/view
Optional Styling
- language (string): Language code for transcription (e.g., 'en', 'es', 'fr')
  - Default: Auto-detect
  - Improves accuracy if specified
- fontSize (number): Base font size in pixels
  - Default: 52
  - Range: 20-100
- fontColor (string): Primary font color in hex format
  - Default: 'FFFFFF' (white)
  - Example: 'FF0000' for red
- highlightColor (string): Color for highlighted words in hex format
  - Default: 'FFFF00' (yellow)
  - Creates the TikTok-style word emphasis
- wordsPerLine (number): Maximum words per subtitle line
  - Default: 4
  - Range: 1-10
  - Lower values = shorter, punchier subtitles
- position (string): Subtitle position on screen
  - Options: 'top', 'center', 'bottom'
  - Default: 'bottom'
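The defaults and ranges listed above can be checked on the client side before starting a run. This is a minimal sketch, not the actor's own validation; `normalize_input` is a hypothetical helper, and the assumption is that hex colors are exactly six hex digits without a leading `#`, as in the examples.

```python
import re

# Defaults taken from the parameter list; language is left unset for auto-detect.
DEFAULTS = {
    "fontSize": 52,
    "fontColor": "FFFFFF",
    "highlightColor": "FFFF00",
    "wordsPerLine": 4,
    "position": "bottom",
}
HEX_COLOR = re.compile(r"[0-9A-Fa-f]{6}")


def normalize_input(raw: dict) -> dict:
    """Apply defaults and reject values outside the documented ranges."""
    if not raw.get("videoUrl"):
        raise ValueError("videoUrl is required")
    cfg = {**DEFAULTS, **raw}
    if not 20 <= cfg["fontSize"] <= 100:
        raise ValueError("fontSize must be between 20 and 100")
    if not 1 <= cfg["wordsPerLine"] <= 10:
        raise ValueError("wordsPerLine must be between 1 and 10")
    for key in ("fontColor", "highlightColor"):
        if not HEX_COLOR.fullmatch(cfg[key]):
            raise ValueError(f"{key} must be a 6-digit hex color")
    if cfg["position"] not in ("top", "center", "bottom"):
        raise ValueError("position must be 'top', 'center', or 'bottom'")
    return cfg
```

Calling `normalize_input({"videoUrl": "https://example.com/video.mp4"})` returns the input with all five styling defaults filled in.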
Example Input
    {
      "videoUrl": "https://example.com/my-video.mp4",
      "language": "en",
      "fontSize": 52,
      "fontColor": "FFFFFF",
      "highlightColor": "FFFF00",
      "wordsPerLine": 4,
      "position": "bottom"
    }
Output
The actor returns:
- outputUrl: Direct download link to your processed video
- videoInfo: Metadata about the original video
- transcriptionInfo: Details about the transcription (language detected, word count)
- status: Status updates throughout the process
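One way to start a run and read these fields programmatically is the Apify REST API. The sketch below only builds the HTTP request with the standard library, so it can be inspected without network access. The actor ID `username~actor-name` is a placeholder, not this actor's real ID, and it is an assumption that the fields above are stored as items in the run's default dataset.

```python
import json
import urllib.request
from typing import Optional

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Build a POST request that starts an actor run with the given input."""
    return urllib.request.Request(
        f"{API_BASE}/acts/{actor_id}/runs",
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def extract_output_url(items: list) -> Optional[str]:
    """Return outputUrl from the first result item, if any (assumed item layout)."""
    return items[0].get("outputUrl") if items else None


if __name__ == "__main__":
    req = build_run_request(
        "username~actor-name",  # placeholder actor ID
        "YOUR_APIFY_TOKEN",
        {"videoUrl": "https://example.com/my-video.mp4", "language": "en"},
    )
    print(req.get_method(), req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` returns a run object. Since processing takes several minutes, the run would then need to be polled until it finishes before reading its dataset items.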
Processing Time
| Video Duration | Estimated Time |
|---|---|
| 1 minute | ~2-3 minutes |
| 3 minutes | ~4-5 minutes |
| 5 minutes | ~6-8 minutes |
Processing time depends on video resolution, complexity, and server load.
Limitations
- Maximum video size: 500MB
- Maximum duration: 5 minutes
- Supported formats: MP4, MOV, AVI, MKV, WEBM
- Audio quality: Must have clear speech for accurate transcription
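These limits can be checked locally before submitting a video, which avoids paying for a run that will be rejected. The following is a sketch under assumptions: `check_limits` is a hypothetical helper, 500MB is interpreted as 500 × 1024 × 1024 bytes, and the format is judged by file extension only.

```python
from pathlib import PurePosixPath

MAX_SIZE_BYTES = 500 * 1024 * 1024  # 500MB, assuming binary megabytes
MAX_DURATION_SECONDS = 5 * 60       # 5 minutes
SUPPORTED_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv", ".webm"}


def check_limits(filename: str, size_bytes: int, duration_seconds: float) -> list:
    """Return a list of problems; an empty list means the video is within limits."""
    problems = []
    if PurePosixPath(filename).suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append("Invalid video format")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("Video file too large")
    if duration_seconds > MAX_DURATION_SECONDS:
        problems.append("Video duration exceeds maximum")
    return problems
```

The returned messages reuse the error names from the Troubleshooting section, so each one maps to its fix there.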
Google Drive Setup
To use Google Drive links:
- Right-click your video in Google Drive
- Click "Get link"
- Set permissions to "Anyone with the link"
- Copy the link (it should look like https://drive.google.com/file/d/...)
- Paste it into the videoUrl input
The actor will automatically convert it to a direct download link.
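How the actor performs this conversion is not documented. A common approach, shown here as an assumption, is to extract the file ID from the share link and rebuild it as a `uc?export=download` URL. The function name `to_direct_download` is hypothetical.

```python
import re

# Matches the file ID in links like https://drive.google.com/file/d/{FILE_ID}/view
_DRIVE_FILE_ID = re.compile(r"drive\.google\.com/file/d/([A-Za-z0-9_-]+)")


def to_direct_download(url: str) -> str:
    """Convert a Google Drive share link to a direct download URL (assumed scheme)."""
    match = _DRIVE_FILE_ID.search(url)
    if match is None:
        return url  # not a Drive share link; use the URL unchanged
    return f"https://drive.google.com/uc?export=download&id={match.group(1)}"
```

URLs that do not match the Drive pattern pass through untouched, so the same function can be applied to every `videoUrl` value.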
Tips for Best Results
Video Quality
- Use videos with clear audio and minimal background noise
- Higher resolution videos (1080p) look better but take longer to process
- Ensure good lighting and framing for professional results
Subtitle Styling
- White text with yellow highlights is most readable (default)
- Use 3-5 words per line for TikTok/Reels style
- Larger font sizes (50-60px) work better for mobile viewing
Language
- Specify the language code if you know it for better accuracy
- Supports 50+ languages
- Auto-detect works well but may be slightly less accurate
Troubleshooting
"Video file too large"
- Compress your video before uploading
- Maximum size is 500MB
"Video duration exceeds maximum"
- Maximum duration is 5 minutes
- Trim your video before processing
"Transcription failed"
- Verify the video has audible speech
- Check audio quality is clear
"Download failed"
- Check the video URL is accessible
- For Google Drive, ensure link sharing is enabled
- Try downloading the video directly to test
"Invalid video format"
- Convert your video to MP4, MOV, AVI, MKV, or WEBM
- Use a tool like HandBrake or ffmpeg to convert
Technical Details
Transcription
- Accuracy: Word-level timestamps for precise synchronization
- Languages: 50+ supported with auto-detection
Subtitle Format
- Format: ASS (Advanced SubStation Alpha)
- Features: Supports animations, colors, and effects
- Style: TikTok/Reels-inspired with word-by-word highlighting
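To make the word-by-word highlighting concrete, here is a minimal sketch that turns word timings into ASS `Dialogue` lines. It is not the actor's generator: the function names are hypothetical, the pop-up animation is omitted, and a complete file would also need the `[Script Info]`, `[V4+ Styles]`, and `[Events]` sections. Note that ASS writes colors in blue-green-red order, so the RGB hex values from the input parameters have to be reordered.

```python
def ass_time(seconds: float) -> str:
    """Format seconds as an ASS timestamp, H:MM:SS.cc (centiseconds)."""
    cs = round(seconds * 100)
    hours, rem = divmod(cs, 360000)
    minutes, rem = divmod(rem, 6000)
    secs, centis = divmod(rem, 100)
    return f"{hours}:{minutes:02d}:{secs:02d}.{centis:02d}"


def rgb_to_ass(hex_rgb: str) -> str:
    """Convert 'RRGGBB' to the ASS color form '&HBBGGRR&'."""
    r, g, b = hex_rgb[0:2], hex_rgb[2:4], hex_rgb[4:6]
    return f"&H{b}{g}{r}&".upper()


def build_dialogues(words, words_per_line=4, font_color="FFFFFF", highlight_color="FFFF00"):
    """words is a list of (text, start_seconds, end_seconds) tuples.

    Emits one Dialogue line per word: the whole group is shown, with the
    currently spoken word wrapped in a color override tag.
    """
    base, hi = rgb_to_ass(font_color), rgb_to_ass(highlight_color)
    lines = []
    for i in range(0, len(words), words_per_line):
        group = words[i:i + words_per_line]
        for active, (_, start, end) in enumerate(group):
            parts = []
            for j, (text, _, _) in enumerate(group):
                if j == active:
                    parts.append(f"{{\\c{hi}}}{text}{{\\c{base}}}")
                else:
                    parts.append(text)
            body = " ".join(parts)
            lines.append(
                f"Dialogue: 0,{ass_time(start)},{ass_time(end)},Default,,0,0,0,,{body}"
            )
    return lines
```

Because each line lasts only as long as its word, any silence between words leaves the screen briefly empty; a production generator would likely extend each event to the start of the next word.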
Video Processing
- Codec: H.264 (libx264) for wide compatibility
- Quality: CRF 23 (high quality)
- Audio: AAC codec at 192kbps
- Subtitles: Permanently burned into video (hard subtitles)
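The settings above correspond to a fairly standard ffmpeg invocation. The sketch below builds such a command with the codec, CRF, and audio bitrate listed; the exact flags the actor uses are not published, so treat the use of the `ass` filter (which requires an ffmpeg build with libass) and the helper name `build_burn_cmd` as assumptions.

```python
def build_burn_cmd(video_path: str, ass_path: str, output_path: str) -> list:
    """Return an ffmpeg argument list that burns ASS subtitles into the video."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it already exists
        "-i", video_path,         # input video
        "-vf", f"ass={ass_path}", # render the ASS file onto every frame
        "-c:v", "libx264",        # H.264 video
        "-crf", "23",             # quality setting from the list above
        "-c:a", "aac",            # AAC audio
        "-b:a", "192k",           # audio bitrate from the list above
        output_path,
    ]


if __name__ == "__main__":
    print(" ".join(build_burn_cmd("input.mp4", "captions.ass", "output.mp4")))
```

Subtitle paths containing characters such as `:` or `'` need escaping inside the filter argument, which this sketch does not handle.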
Support
For issues, questions, or feature requests:
- Check the troubleshooting section above
- Review your input parameters
License
MIT
Version
1.0.0