
Audio And Video Transcriber (OpenAI GPT-4o-transcribe)
Pricing
$5.00/month + usage

Downloads videos from public URLs, extracts audio, and transcribes them using OpenAI
Video Transcriber Actor 🎤🎬
This Apify Actor automates the process of downloading videos from public URLs, extracting their audio content, and then transcribing the audio into text using OpenAI's powerful speech-to-text models (GPT-4o Mini Transcribe or GPT-4o Transcribe).
Features
- Batch Video Processing: Provide a list of video URLs and get transcriptions for all of them.
- Powered by OpenAI: Utilizes state-of-the-art AI models (`gpt-4o-mini-transcribe` or `gpt-4o-transcribe`) for accurate transcriptions.
- Configurable Transcription: Adjust settings like language, prompt, and temperature to fine-tune transcription results.
- Robust Error Handling: Implements retries for network issues or temporary API failures.
- Parallel Processing: Downloads and transcribes multiple videos concurrently for faster results.
- Secure API Key Handling: Your OpenAI API key is treated as a secret input.
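
The retry behaviour listed under "Robust Error Handling" could look roughly like the sketch below. This is not the actor's actual code: the helper name `with_retries`, the exponential backoff schedule, and the broad `except` clause are all assumptions made for illustration. Only the idea of retrying a failed step a limited number of times comes from this README.

```python
import time


def with_retries(task, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call task() and retry on failure, allowing up to max_retries retries.

    Hypothetical helper: the backoff schedule is an assumption, not the
    actor's documented behaviour.
    """
    last_error = None
    for attempt in range(max_retries + 1):
        try:
            return task()
        except Exception as exc:  # real code would catch narrower error types
            last_error = exc
            if attempt < max_retries:
                # Wait base_delay, then 2x, then 4x, ... before the next try.
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.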
Use Cases
- Transcribing lectures, talks, or presentations.
- Generating subtitles or text content from video podcasts.
- Making video content searchable by transcribing its audio.
- Analyzing spoken content in a collection of videos.
Input Configuration
The actor requires the following input fields. Your OpenAI API key is essential for the transcription service to work.
Field | Type | Description | Default Value |
---|---|---|---|
video_urls | Array | Required. A list of public direct URLs to video files (e.g., MP4, MOV, AVI). Each URL will be processed. | [] (Example prefilled) |
openai_api_key | String | Required. Your OpenAI API key. This is treated as a secret and stored securely. | N/A |
openai_model | String | The OpenAI model for transcription. gpt-4o-mini-transcribe is fast & cost-effective; gpt-4o-transcribe may offer higher accuracy. | gpt-4o-mini-transcribe |
openai_transcription_language | String | Optional. Language of the audio in ISO-639-1 format (e.g., en for English). If omitted, OpenAI attempts auto-detection. | "" (Empty String) |
openai_transcription_prompt | String | Optional. Text prompt to guide the model's style or vocabulary (e.g., for specific jargon or names). | N/A |
openai_transcription_temperature | String | Sampling temperature (0.0-1.0, provided as a string e.g., "0.2" ). Lower values are more deterministic. | "0.0" |
max_concurrent_tasks | Integer | Maximum number of videos to process in parallel. | 5 |
max_retries | Integer | Number of times to retry processing a video if an error occurs. | 3 |
Example Input JSON:
{
  "video_urls": [
    "https://www.ffmpeg.org/example-assets/Counting_Atoms_preview.mp4",
    "https://another-public-domain.com/another-video.mp4"
  ],
  "openai_api_key": "sk-yourSecretOpenAiApiKeyGoesHere",
  "openai_model": "gpt-4o-mini-transcribe",
  "openai_transcription_language": "en",
  "openai_transcription_prompt": "Focus on scientific terminology.",
  "openai_transcription_temperature": "0.2",
  "max_concurrent_tasks": 5,
  "max_retries": 3
}
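If you build the input programmatically, a small helper can catch common mistakes before a run is started. The sketch below is an illustration only: `build_input` and `ALLOWED_MODELS` are hypothetical names, and the validation rules are inferred from the input table rather than taken from the actor's source. Note that the temperature is converted to a string, as the table requires.

```python
ALLOWED_MODELS = {"gpt-4o-mini-transcribe", "gpt-4o-transcribe"}


def build_input(video_urls, openai_api_key, model="gpt-4o-mini-transcribe",
                language="", prompt=None, temperature=0.0,
                max_concurrent_tasks=5, max_retries=3):
    """Return an input dict using the field names from the input table."""
    if not video_urls:
        raise ValueError("video_urls must contain at least one URL")
    if model not in ALLOWED_MODELS:
        raise ValueError(f"unsupported model: {model}")
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0.0 and 1.0")
    run_input = {
        "video_urls": list(video_urls),
        "openai_api_key": openai_api_key,
        "openai_model": model,
        "openai_transcription_language": language,
        # The actor expects the temperature as a string, e.g. "0.2".
        "openai_transcription_temperature": str(temperature),
        "max_concurrent_tasks": max_concurrent_tasks,
        "max_retries": max_retries,
    }
    if prompt:
        # The prompt is optional, so it is only included when provided.
        run_input["openai_transcription_prompt"] = prompt
    return run_input
```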
Output
The actor saves each transcription result as a separate item in the Apify Dataset. Each item has the following structure, where `status` is either `"succeeded"` or `"failed"`:
{
  "download_url": "https://www.example.com/video.mp4",
  "transcription": "This is the transcribed text from the video...",
  "status": "succeeded"
}
If a video fails to process after all retries, the `transcription` will be `null`, `status` will be `failed`, and an `error` field will contain the error message.
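When post-processing a run, it is usually convenient to separate successful transcripts from failures. The following is a minimal sketch that assumes the items have exactly the fields described above (`download_url`, `transcription`, `status`, and `error` on failure). `split_results` is a hypothetical helper name, and the `"unknown error"` fallback is an assumption.

```python
def split_results(items):
    """Separate dataset items into transcripts and failures, keyed by URL."""
    transcripts = {}
    failures = {}
    for item in items:
        url = item["download_url"]
        if item.get("status") == "succeeded":
            transcripts[url] = item["transcription"]
        else:
            # Failed items carry a null transcription and an error message.
            failures[url] = item.get("error", "unknown error")
    return transcripts, failures
```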
How to Use
- Go to the Actor page on the Apify Store.
- Click on "Try actor".
- Fill in the input configuration fields, especially `video_urls` and your `openai_api_key`.
- Click "Start" to run the actor.
- When the run finishes, you can find the results in the "Dataset" tab of the run console.
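
The same steps can also be scripted instead of using the console. The sketch below assumes the official `apify-client` Python package, whose `ApifyClient.actor(...).call(run_input=...)` and `ApifyClient.dataset(...).iterate_items()` methods are used here. `run_transcriber` is a hypothetical wrapper, and the actor ID is a placeholder because this README does not state it.

```python
def run_transcriber(client, actor_id, run_input):
    """Start the actor, wait for it to finish, and return its dataset items.

    `client` is expected to behave like apify_client.ApifyClient.
    """
    # call() blocks until the run finishes and returns the run record.
    run = client.actor(actor_id).call(run_input=run_input)
    # Results are stored in the run's default dataset.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Usage (requires `pip install apify-client` and an Apify API token):
#   from apify_client import ApifyClient
#   client = ApifyClient("<YOUR_APIFY_TOKEN>")
#   items = run_transcriber(client, "<username>/<actor-name>", run_input)
```

Taking the client as a parameter keeps the wrapper independent of how the token is loaded.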
Technical Details
- The actor uses `ffmpeg` to extract audio from video files. Ensure the video formats are compatible with common `ffmpeg` builds.
- Video downloads are performed asynchronously.
- Transcription tasks are processed in parallel using Python's `multiprocessing`.
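
To check locally whether a video is compatible with your `ffmpeg` build, you can try an audio extraction yourself. The sketch below shows one plausible command. The README does not document the actor's exact `ffmpeg` arguments, so the mono downmix, 16 kHz sample rate, and 64k bitrate are assumptions, and both function names are hypothetical.

```python
import subprocess


def ffmpeg_extract_audio_cmd(video_path, audio_path):
    """Build an ffmpeg command that drops the video stream and keeps audio."""
    return [
        "ffmpeg",
        "-y",              # overwrite the output file if it exists
        "-i", video_path,  # input video
        "-vn",             # no video stream in the output
        "-ac", "1",        # downmix to mono (assumed setting)
        "-ar", "16000",    # resample to 16 kHz (assumed setting)
        "-b:a", "64k",     # audio bitrate (assumed setting)
        audio_path,        # output format is inferred from the extension
    ]


def extract_audio(video_path, audio_path):
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits non-zero."""
    subprocess.run(
        ffmpeg_extract_audio_cmd(video_path, audio_path),
        check=True,
        capture_output=True,
    )
    return audio_path
```

If `extract_audio("video.mp4", "audio.mp3")` fails on your machine, the file is likely to cause problems in the actor as well.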
Limitations
- URL Accessibility: Video URLs must be publicly accessible and direct links to video files. Redirects are followed, but complex authentication or sites requiring browser interaction are not supported.
- OpenAI API Limits: Your OpenAI API usage is subject to your OpenAI account's rate limits and quotas. Long videos or large batches might take time or hit these limits.
- Video Size/Length: Extremely large video files might lead to increased processing time or memory usage. The actor downloads the entire video into memory before audio extraction.
- CDN Link Stability: If using temporary CDN links (e.g., from some social media platforms), they may expire. Prefer stable, direct URLs.
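
Before starting a large batch, a quick offline check can filter out URLs that are unlikely to be direct video links. The sketch below is a heuristic only, not something the actor does: `looks_like_direct_video_url` is a hypothetical helper, and the extension list goes beyond the MP4, MOV, and AVI formats named in this README. A valid direct link without a file extension would be rejected, so treat a negative result as a warning rather than a verdict.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# .mp4, .mov and .avi are named in the README; .mkv and .webm are assumptions.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv", ".webm"}


def looks_like_direct_video_url(url):
    """Heuristic: http(s) URL whose path ends in a known video extension."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return False
    # Query strings (e.g. CDN tokens) are ignored; only the path is checked.
    return PurePosixPath(parsed.path).suffix.lower() in VIDEO_EXTENSIONS
```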
Support & Issues
If you encounter any issues or have suggestions for improvement, please open an issue on the GitHub repository for this actor, if one is available.
Happy Transcribing!