Speech To Text
Under maintenance

Pricing

Pay per usage

Convert speech to text with high accuracy using Azure AI. Supports 100+ languages, speaker detection, and timestamps. Perfect for transcription, subtitles, and voice-to-text applications.

Developer

Fabio Suizu

Maintained by Community

Speech to Text - Audio Transcription

Convert audio files to text using AI-powered speech recognition. Supports multiple languages and engines.

Features

  • Fast Processing: Fast audio transcription powered by Azure
  • Reliable: 99.9% uptime with automatic failover
  • Scalable: Handle single requests or bulk operations
  • Secure: Enterprise-grade security with API key authentication
  • Well Documented: Comprehensive API documentation and examples

Use Cases

  • Content Generation: Automate content creation workflows
  • Data Analysis: Extract insights from unstructured data
  • Automation: Integrate AI capabilities into your apps

Input Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| audioUrl | string | No | URL of the audio file to transcribe |
| audioBase64 | string | No | Base64-encoded audio data (alternative to URL) |
| language | string | No | Language code (e.g., 'en', 'es', 'fr'). Leave empty for auto-detection |
| includeSegments | boolean | No | Include time-stamped segments in the response |
| engine | string | No | Speech recognition engine to use |
| detectLanguageOnly | boolean | No | Only detect the language without full transcription |
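
Since audioBase64 is described as an alternative to audioUrl, a small helper can make sure a run input carries only one audio source. The following is a minimal sketch, not part of the Actor: `build_input` is a hypothetical helper name, and it assumes the Actor expects exactly one source and accepts plain standard base64 (no data-URI prefix), neither of which the table above confirms.

```python
import base64
from typing import Optional


def build_input(
    audio_url: Optional[str] = None,
    audio_bytes: Optional[bytes] = None,
    language: str = "",
    include_segments: bool = False,
    detect_language_only: bool = False,
) -> dict:
    """Build an Actor input dict that carries exactly one audio source."""
    # Assumption: exactly one of audioUrl / audioBase64 should be supplied.
    if (audio_url is None) == (audio_bytes is None):
        raise ValueError("Provide exactly one of audio_url or audio_bytes")

    run_input = {
        "includeSegments": include_segments,
        "detectLanguageOnly": detect_language_only,
    }
    if audio_url is not None:
        run_input["audioUrl"] = audio_url
    else:
        # Assumption: plain standard base64; decoded to str so the dict
        # can be serialized as JSON.
        run_input["audioBase64"] = base64.b64encode(audio_bytes).decode("ascii")
    if language:
        # An empty language is left out so the Actor can auto-detect it.
        run_input["language"] = language
    return run_input
```

The returned dict can be passed as the run input in any of the client examples. The engine parameter is left out here because its accepted values are not listed.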

Output Format

{
  "success": true,
  "result": { ... },
  "timestamp": "2026-01-07T00:00:00Z"
}
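
A result item with this envelope can be checked before its payload is used. The sketch below relies only on the three top-level fields shown above; the contents of "result" are not documented here, so the code makes no assumption about them. `unwrap_item` and `parse_timestamp` are hypothetical helper names.

```python
from datetime import datetime


def unwrap_item(item: dict) -> dict:
    """Return the "result" payload of an item, or raise if it is not successful."""
    if not item.get("success"):
        raise RuntimeError(f"Transcription failed: {item!r}")
    return item["result"]


def parse_timestamp(item: dict) -> datetime:
    """Parse the ISO 8601 "timestamp" field into a timezone-aware datetime."""
    # Older Python versions reject a trailing "Z" in fromisoformat,
    # so it is rewritten as an explicit UTC offset first.
    return datetime.fromisoformat(item["timestamp"].replace("Z", "+00:00"))
```

Raising on a false "success" flag keeps failed transcriptions from silently flowing into later processing steps.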

Code Examples

JavaScript (Node.js)

// Requires Node.js with ES modules enabled (the script uses top-level await).
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Provide either audioUrl or audioBase64, not both.
const input = {
    audioUrl: 'https://example.com/audio.mp3',
    language: 'en',
    includeSegments: true,
    engine: 'azure',
    detectLanguageOnly: false,
};

// Start the Actor and wait for the run to finish.
const run = await client.actor('vivid_astronaut/speech-to-text').call(input);

// Read the results from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);

Python

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

# Provide either audioUrl or audioBase64, not both.
run_input = {
    "audioUrl": "https://example.com/audio.mp3",
    "language": "en",
    "includeSegments": True,
    "engine": "azure",
    "detectLanguageOnly": False,
}

# Start the Actor and wait for the run to finish.
run = client.actor("vivid_astronaut/speech-to-text").call(run_input=run_input)

# Print each result from the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

cURL

curl -X POST "https://api.apify.com/v2/acts/vivid_astronaut~speech-to-text/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "audioUrl": "https://example.com/audio.mp3",
    "language": "en",
    "includeSegments": true,
    "engine": "azure",
    "detectLanguageOnly": false
  }'
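
Unlike the client examples, this request only starts a run; the transcription is written to the run's default dataset and has to be fetched separately once the run has finished. The sketch below derives the dataset items URL from the JSON returned by the request. It assumes the usual Apify API response shape, in which the run object sits under a top-level "data" key and carries "defaultDatasetId". `dataset_items_url` is a hypothetical helper name.

```python
from urllib.parse import quote

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(run_response: dict) -> str:
    """Build the URL that lists the items of the run's default dataset."""
    # Assumption: the response wraps the run object in a top-level "data" key.
    dataset_id = run_response["data"]["defaultDatasetId"]
    # The ID is percent-encoded so it is always safe inside a URL path.
    return f"{API_BASE}/datasets/{quote(dataset_id, safe='')}/items"
```

The resulting URL can be requested with the same API token after the run has completed.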

Pricing

Model: Pay per result
Price: $0.020 per result

You only pay for successful results. Platform usage costs are included.
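
A rough budget can be worked out from the per-result price. This is a minimal sketch of that arithmetic; `estimate_cost` is a hypothetical helper, and how many results a single audio file produces is not stated here, so the count of successful results is taken as an input.

```python
from decimal import Decimal

PRICE_PER_RESULT = Decimal("0.020")  # USD per successful result, as listed above


def estimate_cost(successful_results: int) -> Decimal:
    """Estimate the charge in USD for a number of successful results."""
    if successful_results < 0:
        raise ValueError("successful_results must not be negative")
    # Decimal avoids the rounding drift that float arithmetic can introduce.
    return PRICE_PER_RESULT * successful_results
```

For example, 250 successful results come to $5.00, and failed results add nothing to the total.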

API Documentation

Full API documentation is available at:

Support

Version History

See ./CHANGELOG.md for version history.


Powered by Azure Cloud Infrastructure