Brainiall Speech MCP Server

Pricing

from $20.00 / 1,000 tool calls

Production speech AI tools for AI agents: Brainiall Pronunciation (0-100 scoring at phoneme, word, and sentence level), Brainiall Speech (transcription with timestamps), Brainiall Speech Pro (99 languages plus speaker diarization), and Brainiall Voice (12 voices). Sub-second p50 latency.

Rating

0.0 (0)

Developer

Fabio Suizu

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 1
  • Monthly active users: 0
  • Last modified: 20 days ago

Production speech AI tools for MCP-enabled AI agents: pronunciation scoring, speech-to-text, text-to-speech, and multilingual transcription — powered by the Brainiall Speech engines.

Tools

| Tool | Description |
| --- | --- |
| assess_pronunciation | Brainiall Pronunciation: score English pronunciation from audio (0-100 at overall, sentence, word, and phoneme levels) |
| transcribe_audio | Brainiall Speech: convert spoken English to text with word-level timestamps |
| synthesize_speech | Brainiall Voice: generate natural speech from text (12 voices, American and British) |
| transcribe_audio_pro | Brainiall Speech Pro: transcription in 99 languages with speaker diarization |
| list_tts_voices | List the available Brainiall Voice voices |
| check_pronunciation_service | Health check for the pronunciation engine |
| check_stt_service | Health check for the transcription engine |
| check_tts_service | Health check for the voice engine |
| check_whisper_service | Health check for the Speech Pro engine |
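
As a rough illustration of how an MCP client invokes one of these tools, the sketch below builds the JSON-RPC 2.0 request body that the MCP protocol uses for its `tools/call` method. The helper name `build_tool_call` is made up for this example, and the transport (how the request reaches the Actor's standby endpoint) is deliberately left out because it is not described here.

```python
import json
from itertools import count
from typing import Optional

# JSON-RPC ids only need to be unique within one client session.
_request_ids = count(1)


def build_tool_call(name: str, arguments: Optional[dict] = None) -> dict:
    """Return a JSON-RPC 2.0 request body for MCP's `tools/call` method.

    `build_tool_call` is a hypothetical helper, not part of any Brainiall SDK.
    """
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


if __name__ == "__main__":
    # Assumption: list_tts_voices takes no arguments.
    print(json.dumps(build_tool_call("list_tts_voices"), indent=2))
```

In practice an MCP client library would normally build and send this message for you; the sketch only shows what travels over the wire.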

Pronunciation Scoring

The assess_pronunciation tool returns scores (0-100) at four levels of granularity:

| Level | Description |
| --- | --- |
| Overall | Global pronunciation quality |
| Sentence | Sentence-level fluency and accuracy |
| Word | Per-word pronunciation scores |
| Phoneme | Individual sound accuracy (IPA + ARPAbet) |
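
One way an agent might use the word-level scores is to pick out the words that need practice. The sketch below assumes a response shaped like `{"overall": ..., "words": [{"word": ..., "score": ...}]}`; that shape, the field names, and the threshold of 60 are all assumptions made for illustration, since the actual response schema is not documented here.

```python
from typing import List


def weakest_words(result: dict, threshold: float = 60.0) -> List[str]:
    """Return words scoring below `threshold`, lowest score first.

    Assumes a HYPOTHETICAL result shape: {"words": [{"word": str, "score": number}]}.
    Adjust the keys to match the real assess_pronunciation response.
    """
    words = result.get("words", [])
    low = [w for w in words if w["score"] < threshold]
    return [w["word"] for w in sorted(low, key=lambda w: w["score"])]


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    sample = {
        "overall": 78,
        "words": [
            {"word": "quick", "score": 91},
            {"word": "fox", "score": 54},
            {"word": "jumps", "score": 47},
        ],
    }
    print(weakest_words(sample))
```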

Performance

  • Accuracy: Exceeds human inter-annotator agreement (Pearson correlation coefficient, PCC, of 0.590 vs 0.555)
  • Validated: 9,259 utterances from speakers of 7 first-language (L1) backgrounds, zero errors
  • Latency: Sub-second p50 for pronunciation and transcription
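
For readers unfamiliar with the metric, the PCC quoted above measures how closely two sets of scores move together (1.0 is perfect agreement in trend, 0 is none). The sketch below is a plain-Python Pearson correlation; it is a generic illustration of the statistic, not the evaluation code behind the figures above.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)
```

Calling `pearson(model_scores, human_scores)` on paired per-utterance scores would give a figure comparable in kind to the ones reported; both variable names are placeholders.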

How to Use

Connect to this Actor in MCP server (standby) mode and call the tools above. Audio is exchanged as base64-encoded data: tools that take audio expect it in that form, and tools that produce audio return it in that form.

Example: Pronunciation Assessment

{
  "audio_base64": "<base64-encoded-audio>",
  "text": "The quick brown fox jumps over the lazy dog"
}
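
The `audio_base64` value can be produced from raw audio bytes with Python's standard library. This is a minimal sketch: `build_pronunciation_args` is a hypothetical helper name, and only the two fields shown in the example above are assumed to exist.

```python
import base64


def build_pronunciation_args(audio_bytes: bytes, reference_text: str) -> dict:
    """Build the arguments object for assess_pronunciation.

    Hypothetical helper: only the field names come from the example above.
    """
    return {
        # base64 produces ASCII bytes; decode so the value is JSON-serializable.
        "audio_base64": base64.b64encode(audio_bytes).decode("ascii"),
        "text": reference_text,
    }


# Typical use, assuming a local recording named sample.wav:
#   from pathlib import Path
#   args = build_pronunciation_args(
#       Path("sample.wav").read_bytes(),
#       "The quick brown fox jumps over the lazy dog",
#   )
```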

Example: Text-to-Speech

{
  "text": "Hello, how are you today?",
  "voice": "af_heart",
  "speed": 1.0
}
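
On the client side, the synthesized audio arrives base64-encoded and has to be decoded before it can be played or saved. The sketch below builds the request arguments from the example above and decodes a returned payload. Both helper names are invented for illustration, and where exactly the base64 string sits in the tool result is not documented here, so extracting it is left to the caller.

```python
import base64


def build_tts_args(text: str, voice: str = "af_heart", speed: float = 1.0) -> dict:
    """Arguments for synthesize_speech, using the field names from the example."""
    return {"text": text, "voice": voice, "speed": speed}


def decode_audio(audio_b64: str) -> bytes:
    """Decode a base64 audio payload into raw bytes.

    validate=True rejects characters outside the base64 alphabet instead of
    silently skipping them.
    """
    return base64.b64decode(audio_b64, validate=True)


# Typical use, assuming `payload` holds the base64 string from the tool result
# and that the output container matches the chosen file extension:
#   from pathlib import Path
#   Path("hello.wav").write_bytes(decode_audio(payload))
```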

Pricing

$0.02 per tool call (pay-per-event).
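
To budget for a workload, the per-call price can be multiplied out directly. The sketch below assumes a flat $0.02 for every tool call; the listing's "from $20.00 / 1,000 tool calls" wording suggests prices may vary by plan, and whether health-check calls are billed is not stated here, so treat the result as an estimate.

```python
from decimal import Decimal

# Assumption: flat rate taken from the pricing line above.
PRICE_PER_CALL_USD = Decimal("0.02")


def estimate_cost(tool_calls: int) -> Decimal:
    """Estimated cost in USD for a number of tool calls at the flat rate."""
    if tool_calls < 0:
        raise ValueError("tool_calls must be non-negative")
    return PRICE_PER_CALL_USD * tool_calls


if __name__ == "__main__":
    print(estimate_cost(1000))  # 1,000 calls at $0.02 each
```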

Technical Details

  • Engines: Brainiall Pronunciation, Brainiall Speech, Brainiall Speech Pro, Brainiall Voice
  • Audio: Supports WAV, MP3, OGG, FLAC, WebM
  • Backend: Brainiall production API (https://api.brainiall.com), auto-scaling
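
Since only certain audio formats are listed as supported, a client may want to reject other files before spending a tool call. The sketch below checks the file extension only; the extension set is inferred from the format names above, and a file's real encoding can differ from what its name suggests.

```python
from pathlib import Path

# Extensions inferred from the listed formats (WAV, MP3, OGG, FLAC, WebM).
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".ogg", ".flac", ".webm"}


def is_supported_audio(filename: str) -> bool:
    """Return True if the file extension matches a listed audio format."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```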

More from Brainiall

This MCP server covers the speech tools. The full Brainiall platform offers 19 specialty AI APIs + 5 bundles (NLP, vision, documents, fraud, authenticity and more) under one key.