AI Text-to-Speech Voiceover

Pricing

from $40.00 / 1,000 generated voiceovers

Turn text or a script into a natural AI voiceover audio file. Choose a voice, a speed, and an output format (MP3, WAV, Opus, or AAC). Long scripts are handled automatically. Suited to faceless videos, narration, audiobooks, and podcasts. Bring your own OpenAI API key.

Rating

5.0

(1)

Developer

Dami's Studio

Maintained by Community

Last modified: 3 days ago

Turns a block of text or a full script into a natural-sounding AI voiceover file. Pick a voice, set the speed, and choose MP3, WAV, Opus, or AAC. It's meant for the usual narration jobs: faceless videos, audiobooks, IVR prompts, explainer voiceovers.

How it works

The actor sends your text to an OpenAI-compatible TTS endpoint. Long scripts get split at sentence boundaries into chunks under ~3,500 characters, each chunk is synthesized separately, and the parts are stitched back into one file with ffmpeg (using stream copy, so there's no re-encode and no quality loss). Each finished audio file is saved to the run's key-value store and a row is pushed to the dataset.
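
The splitting and stitching steps described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actor's actual code: the function names are hypothetical, the sentence-boundary rule is a naive regex, and the stitching is assumed to use ffmpeg's concat demuxer (the text only says "stream copy").

```python
import re


def split_into_chunks(text: str, max_chars: int = 3500) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars."""
    # Naive boundary rule (an assumption): split after . ! or ? plus whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Fallback: a single sentence longer than the limit is hard-split.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def concat_list_text(part_paths: list[str]) -> str:
    """Contents of the list file read by ffmpeg's concat demuxer."""
    return "".join(f"file '{path}'\n" for path in part_paths)


def build_concat_command(list_file: str, output_path: str) -> list[str]:
    """ffmpeg invocation that joins the parts with stream copy (no re-encode)."""
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
            "-i", list_file, "-c", "copy", output_path]
```

Each chunk returned by `split_into_chunks` would be synthesized on its own, and the resulting part files passed to ffmpeg in order. Because `-c copy` only copies the encoded streams, the parts need to share the same codec and parameters.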

Input

Nothing is strictly required by the schema, but in practice you need an openaiApiKey (or the OPENAI_API_KEY env var) and at least one of text or texts. If neither text nor texts is provided, the run errors out.

| Field | Required | Notes |
| --- | --- | --- |
| text | one of text/texts | The script to voice, as a single string. |
| texts | one of text/texts | Batch mode. Array of strings, or objects keyed by script / scriptText / text / narration. One audio file per item. |
| voice | no | alloy, echo, fable, onyx, nova, shimmer. Default onyx (deep male). nova and shimmer are female. |
| model | no | tts-1 (fast, default) or tts-1-hd (higher quality, costs more on the OpenAI side). |
| format | no | mp3 (default), wav, opus, or aac. |
| speed | no | Playback speed from 0.25 to 4.0. Default 1.0. Values outside that range are clamped. |
| openaiApiKey | yes in practice | Your OpenAI key, used for the TTS call. Stored as a secret. Falls back to the OPENAI_API_KEY env var if set. |
| baseUrl | no | Advanced. Point at any OpenAI-compatible /audio/speech endpoint. Defaults to https://api.openai.com/v1. |
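
To make the defaults, clamping, and fallbacks in the table concrete, here is a minimal sketch of how such an input could be normalized. The function name, the returned shape, the choice to combine text and texts when both are given, and the lookup order for object keys are all assumptions for illustration; the actor's own handling may differ.

```python
import os

# Keys an object in `texts` may use for its script, per the table above.
# The lookup order is an assumption.
TEXT_KEYS = ("script", "scriptText", "text", "narration")


def normalize_input(raw: dict) -> dict:
    """Apply the documented defaults and collect the scripts to voice."""
    api_key = raw.get("openaiApiKey") or os.environ.get("OPENAI_API_KEY")
    if not api_key:
        raise ValueError("openaiApiKey (or OPENAI_API_KEY) is required")

    items: list[str] = []
    if raw.get("text"):
        items.append(raw["text"])
    for entry in raw.get("texts") or []:
        if isinstance(entry, str):
            items.append(entry)
        elif isinstance(entry, dict):
            value = next((entry[k] for k in TEXT_KEYS if entry.get(k)), None)
            if value:
                items.append(value)
    if not items:
        raise ValueError("provide at least one of text or texts")

    # Out-of-range speeds are clamped to [0.25, 4.0] rather than rejected.
    speed = min(4.0, max(0.25, float(raw.get("speed", 1.0))))

    return {
        "items": items,
        "voice": raw.get("voice") or "onyx",
        "model": raw.get("model") or "tts-1",
        "format": raw.get("format") or "mp3",
        "speed": speed,
        "baseUrl": raw.get("baseUrl") or "https://api.openai.com/v1",
        "apiKey": api_key,
    }
```

For example, an input with `"speed": 9` would come out with a speed of 4.0, and an input with neither text nor texts would raise an error, matching the behavior described above.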

Output

Each input item produces one audio file in the key-value store and one dataset record. The record includes audioKey and audioUrl (where to fetch the file), durationSeconds, characters, chunks (how many pieces the script was split into), plus the voice, model, and resolved format. Failed items get a record with ok: false and the error message instead of stopping the whole run.
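
Because failed items are reported as records rather than aborting the run, a consumer has to separate successes from failures. The sketch below shows one way to do that. It assumes the error message lives in a field named `error` and that successful records either carry `ok: true` or omit `ok`; neither detail is confirmed by the description above.

```python
def summarize_records(records: list[dict]) -> dict:
    """Split dataset records into successes and failures."""
    # Assumption: a record counts as failed only when ok is explicitly false.
    succeeded = [r for r in records if r.get("ok", True)]
    failed = [r for r in records if not r.get("ok", True)]
    return {
        "audioUrls": [r["audioUrl"] for r in succeeded],
        "totalSeconds": sum(r.get("durationSeconds", 0) for r in succeeded),
        # Assumption: the error message is stored under "error".
        "errors": [r.get("error", "unknown error") for r in failed],
    }
```

Checking the `errors` list after a batch run is a cheap way to notice items that need to be resubmitted.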

Example

{
  "text": "Welcome back to the channel. Today we're looking at one of the strangest mysteries of the deep ocean.",
  "voice": "onyx",
  "model": "tts-1",
  "format": "mp3",
  "speed": 1.0,
  "openaiApiKey": "sk-..."
}
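
An input like the one above could be submitted from Python with the `apify-client` package, roughly as sketched here. The helper names are hypothetical, and the actor ID is left as a parameter because this page does not state it; copy it from the actor's Apify Store page. The run is assumed to come back as a dictionary with a `defaultDatasetId` key, which may differ between client versions.

```python
def build_run_input(text: str, api_key: str, voice: str = "onyx",
                    model: str = "tts-1", audio_format: str = "mp3",
                    speed: float = 1.0) -> dict:
    """Assemble an input object with the same fields as the example above."""
    return {
        "text": text,
        "voice": voice,
        "model": model,
        "format": audio_format,
        "speed": speed,
        "openaiApiKey": api_key,
    }


def run_voiceover(apify_token: str, actor_id: str, run_input: dict) -> list[dict]:
    """Start the actor, wait for it to finish, and return its dataset records."""
    # Imported lazily so the module loads without the package installed
    # (pip install apify-client).
    from apify_client import ApifyClient

    client = ApifyClient(apify_token)
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("actor run did not return a result")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Each returned record should then contain the `audioUrl` of the generated file, as described under Output.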

Pricing

$0.04 per voiceover, pay per result, no subscription. The OpenAI TTS usage is billed separately on your own key.
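
As a rough budgeting aid, the two cost components can be estimated separately. The sketch below hard-codes only the $0.04 actor fee stated above. The OpenAI side is modeled as a per-character rate that the caller supplies, which is an assumption about how that usage is billed; check OpenAI's current pricing for the real figure.

```python
from decimal import Decimal

ACTOR_FEE_PER_VOICEOVER = Decimal("0.04")  # from the pricing stated above


def actor_fee(voiceovers: int) -> Decimal:
    """Apify-side charge: a flat fee per generated voiceover."""
    return ACTOR_FEE_PER_VOICEOVER * voiceovers


def total_cost(voiceovers: int, total_characters: int,
               openai_usd_per_million_chars: Decimal) -> Decimal:
    """Actor fee plus an estimate of the separately billed OpenAI usage."""
    # Assumption: OpenAI usage scales with the number of characters synthesized.
    openai_cost = (Decimal(total_characters) / Decimal(1_000_000)
                   * openai_usd_per_million_chars)
    return actor_fee(voiceovers) + openai_cost
```

At the stated rate, `actor_fee(1000)` works out to $40.00, which matches the "from $40.00 / 1,000" figure on the listing.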

Notes

This actor calls OpenAI for synthesis, so it needs your own OpenAI API key. Long scripts are split into chunks of under ~3,500 characters, and any individual chunk is capped at 4,000 characters before it is sent, which keeps each request within the model's per-call limit. There is no hard limit on total script length, since long inputs are chunked and concatenated.