Hugging Face Audio AI

Pricing

from $0.01 / 1,000 results

Audio processing with Hugging Face models: speech recognition, text-to-speech, and audio analysis.

  • Speech-to-Text: Transcribe audio
  • Text-to-Speech: Generate natural speech
  • Audio Classification: Classify sounds
  • Voice Activity Detection: Detect speech
  • Speaker Diarization: Identify speakers
  • Music Generation: Create music

Developer: The Howlers

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 18
  • Monthly active users: 5
  • Last modified: 6 days ago

Hugging Face Audio

Audio processing with Hugging Face - speech recognition, text-to-speech, audio classification, and voice cloning.

BYOK (Bring Your Own Key) -- you provide your own API credentials.


Before You Start

This actor requires your own Hugging Face API token to process real audio.

Where to get your key: create an access token in your Hugging Face account settings, under Access Tokens.

You can test with Demo Mode first (free, no key needed) to see the output format before committing.


Quick Start

Test with Demo Mode (free, no API key needed)

{
  "demoMode": true,
  "audioUrl": "https://example.com"
}

Run with real data

{
  "demoMode": false,
  "task": "speech_to_text",
  "apiToken": "YOUR_API_KEY_HERE",
  "audioUrl": "https://example.com",
  "language": "en",
  "voicePreset": "default",
  "duration": 10,
  "sampleRate": 22050,
  "returnTimestamps": false,
  "waitForModel": true
}
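The two inputs above can also be built programmatically, so the token never has to be written into a saved JSON file. The following is a minimal sketch, not part of the actor itself: the function name `build_run_input` and the environment variable `HF_API_TOKEN` are assumptions chosen for illustration, and the field names are the ones shown in the examples above.

```python
import json
import os


def build_run_input(audio_url, env_var="HF_API_TOKEN"):
    """Build a run input dict, falling back to Demo Mode without a token.

    `env_var` is a hypothetical environment variable name, not one the
    actor reads itself; the token is copied into the `apiToken` field.
    """
    token = os.environ.get(env_var)
    if not token:
        # No credentials available: Demo Mode needs no key.
        return {"demoMode": True, "audioUrl": audio_url}
    return {
        "demoMode": False,
        "task": "speech_to_text",
        "apiToken": token,
        "audioUrl": audio_url,
        "language": "en",
        "returnTimestamps": False,
        "waitForModel": True,
    }


if __name__ == "__main__":
    # Print the JSON that would be pasted into the actor's input editor.
    print(json.dumps(build_run_input("https://example.com"), indent=2))
```

Falling back to Demo Mode when the variable is unset means a misconfigured environment produces a free sample run rather than an "API key is required" failure.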

Input Parameters

| Parameter | Type | Default | Required | Description |
| --- | --- | --- | --- | --- |
| task | string | "speech_to_text" | Yes* | The audio operation to perform |
| apiToken | string | - | Yes* | Your Hugging Face API token |
| model | string | - | No | Hugging Face model ID (leave blank for default) |
| audioUrl | string | - | No | URL of input audio file |
| text | string | - | No | Text to convert to speech |
| language | string | "en" | No | Language code (e.g., en, es, fr, de) |
| targetLanguage | string | - | No | Target language for translation |
| voicePreset | string | "default" | No | Voice style for TTS |
| speakerId | integer | - | No | Speaker voice ID for multi-speaker models |
| musicPrompt | string | - | No | Text description for music generation |
| duration | number | 10 | No | Duration of generated audio |
| sampleRate | integer | 22050 | No | Audio sample rate in Hz |
| returnTimestamps | boolean | false | No | Include word-level timestamps in transcription |
| waitForModel | boolean | true | No | Wait for model to load if not ready |
| webhookUrl | string | - | No | URL to receive results via webhook |
| demoMode | boolean | true | No | Run with sample data (no API calls) |

*Required when Demo Mode is off.
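
The defaults and the "required when Demo Mode is off" rule can be checked on the client side before a run is started. This is a rough sketch under stated assumptions: `DEFAULTS` copies the default values listed in the parameter table, while `resolve_input` is a hypothetical helper and does not reflect how the actor validates input internally.

```python
# Default values as listed in the Input Parameters table.
DEFAULTS = {
    "task": "speech_to_text",
    "language": "en",
    "voicePreset": "default",
    "duration": 10,
    "sampleRate": 22050,
    "returnTimestamps": False,
    "waitForModel": True,
    "demoMode": True,
}


def resolve_input(user_input):
    """Merge user input over the defaults and report missing required fields.

    Returns a (merged_input, errors) tuple; `errors` is empty when the
    input satisfies the documented requirement rule.
    """
    merged = {**DEFAULTS, **user_input}
    errors = []
    if not merged["demoMode"]:
        # task and apiToken are only required when Demo Mode is off.
        for field in ("task", "apiToken"):
            if not merged.get(field):
                errors.append(f"{field} is required when demoMode is false")
    return merged, errors


merged, errors = resolve_input({"demoMode": False})
print(errors)
```

Because `task` has a default, the only field that can realistically be missing in this sketch is `apiToken`.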


Pricing

This actor uses pay-per-event billing:

| Event | Description | Price |
| --- | --- | --- |
| Audio Processed | Each audio processing request completed | $0.02 |

Demo mode is free -- no charges for sample data.
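
As a worked example of the billing above, the cost of a batch can be estimated by multiplying the number of completed requests by the per-event price. This sketch assumes exactly one "Audio Processed" event per completed request and ignores any other charges; `estimate_cost` is a hypothetical helper, not part of the actor.

```python
from decimal import Decimal

# Per-event price from the pricing table above.
PRICE_PER_REQUEST = Decimal("0.02")


def estimate_cost(requests, demo_mode=False):
    """Estimate the actor charge in USD for a number of completed requests."""
    if requests < 0:
        raise ValueError("requests must be non-negative")
    if demo_mode:
        # Demo Mode runs are not charged.
        return Decimal("0")
    return PRICE_PER_REQUEST * requests


print(estimate_cost(250))  # 250 requests at $0.02 each
```

`Decimal` is used instead of `float` so that currency amounts do not pick up binary rounding error.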


Troubleshooting

"API key is required"

You have Demo Mode turned off but didn't provide an API key. Either:

  • Turn Demo Mode on to test with sample data
  • Add your API key in the input

"API error 403" or "Unauthorized"

Your API key is invalid, expired, or doesn't have access to this specific API endpoint. Double-check your key and account permissions.

"API error 429" or "Rate limit"

Too many requests. Wait a minute and try again, or reduce the number of items per run.
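
The "wait and try again" advice can be automated with a small retry wrapper. The sketch below is an assumption-heavy illustration: `RateLimitError` is a hypothetical exception that your own code would raise when it sees "API error 429" in a run, and `run` stands for whatever callable starts the actor in your setup.

```python
import time


class RateLimitError(Exception):
    """Hypothetical error raised by the caller's code on 'API error 429'."""


def retry_on_rate_limit(run, attempts=3, wait_seconds=60.0, sleep=time.sleep):
    """Call `run()` and retry after a growing pause when it is rate limited.

    The pause grows linearly: wait_seconds, 2 * wait_seconds, and so on.
    The last failure is re-raised once `attempts` calls have been made.
    """
    for attempt in range(1, attempts + 1):
        try:
            return run()
        except RateLimitError:
            if attempt == attempts:
                raise
            sleep(wait_seconds * attempt)
```

The `sleep` parameter is injectable so the wrapper can be exercised in tests without actually waiting a minute.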

No results or empty dataset

Check the run log for error messages. Common causes:

  • Invalid input format (check the examples above)
  • API key without proper permissions
  • The target audio doesn't exist at the given URL or is too short to process

How do I test without an API key?

Enable Demo Mode in the input. This returns realistic sample data so you can verify the output format works for your workflow.


Built by John Rippy | Actor Arsenal