Pricing

from $20.00 / 1,000 tool calls

Brainiall Speech MCP Server

Production speech AI tools for AI agents: Brainiall Pronunciation (0-100 phoneme/word/sentence scoring), Brainiall Speech (transcription with timestamps), Brainiall Speech Pro (99 languages + diarization), Brainiall Voice (12 voices). Sub-second p50.

Developer

Fabio Suizu

Last modified: a month ago

Tools

Tool	Description
assess_pronunciation	Brainiall Pronunciation: score English pronunciation from audio (0-100 at overall, sentence, word, phoneme levels)
transcribe_audio	Brainiall Speech: convert spoken English to text with word-level timestamps
synthesize_speech	Brainiall Voice: generate natural speech from text (12 voices, American & British)
transcribe_audio_pro	Brainiall Speech Pro: 99 languages, speaker diarization
list_tts_voices	List available Brainiall Voice voices
check_pronunciation_service	Health check for the pronunciation engine
check_stt_service	Health check for the transcription engine
check_tts_service	Health check for the voice engine
check_whisper_service	Health check for the Speech Pro engine

Pronunciation Scoring

Returns scores (0-100) at four granularity levels:

Level	Description
Overall	Global pronunciation quality
Sentence	Sentence-level fluency and accuracy
Word	Per-word pronunciation scores
Phoneme	Individual sound accuracy (IPA + ARPAbet)
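As an illustration of working with the word-level scores, the sketch below picks out words that fall under a chosen threshold. The result layout (`overall`, `words`, `word`, `score` keys), the sample values, and the threshold of 60 are all assumptions made for this example; the actual response schema is not documented here.

```python
def weakest_words(result: dict, threshold: float = 60.0) -> list[str]:
    """Return words whose score is below the threshold, in spoken order."""
    return [w["word"] for w in result.get("words", []) if w["score"] < threshold]


# Made-up sample in an assumed shape, not real output from the tool.
sample_result = {
    "overall": 72,
    "words": [
        {"word": "quick", "score": 55},
        {"word": "fox", "score": 88},
    ],
}

low_scoring = weakest_words(sample_result)
```

The same filtering idea would apply at the phoneme level if the response nests phoneme scores under each word.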

Performance

Accuracy: Exceeds human inter-annotator agreement (Pearson correlation coefficient, PCC: 0.590 vs 0.555)
Validated: 9,259 utterances across 7 L1 backgrounds, zero errors
Latency: Sub-second p50 for pronunciation and transcription

How to Use

Connect to this Actor in MCP server (standby) mode and call the tools listed above. Audio is passed to the tools, and returned by them, as base64-encoded strings.
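MCP clients invoke tools with a JSON-RPC 2.0 request whose method is `tools/call`. The following is a minimal sketch of building that message in Python. It does not cover the transport or the Actor's endpoint URL, the request id is arbitrary, and it assumes `list_tts_voices` takes no arguments. In practice an MCP client library would normally build this envelope for you.

```python
import json
from typing import Any


def build_tool_call(name: str, arguments: dict[str, Any], request_id: int = 1) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


# Assumption: list_tts_voices needs no arguments.
request_body = build_tool_call("list_tts_voices", {})
```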

Example: Pronunciation Assessment

{
  "audio_base64": "<base64-encoded-audio>",
  "text": "The quick brown fox jumps over the lazy dog"
}
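A short sketch of preparing those arguments in Python, assuming the audio is available as a local file. The field names `audio_base64` and `text` come from the example above; the helper names and the file path in the usage comment are hypothetical.

```python
import base64
from pathlib import Path


def encode_audio(audio_bytes: bytes) -> str:
    """Return raw audio bytes as a base64 ASCII string."""
    return base64.b64encode(audio_bytes).decode("ascii")


def build_pronunciation_arguments(audio_bytes: bytes, text: str) -> dict[str, str]:
    """Build the arguments object for assess_pronunciation."""
    return {"audio_base64": encode_audio(audio_bytes), "text": text}


def build_from_file(path: str, text: str) -> dict[str, str]:
    """Read an audio file from disk and build the arguments from its bytes."""
    return build_pronunciation_arguments(Path(path).read_bytes(), text)


# Usage (hypothetical path):
# args = build_from_file("recording.wav", "The quick brown fox jumps over the lazy dog")
```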

Example: Text-to-Speech

{
  "text": "Hello, how are you today?",
  "voice": "af_heart",
  "speed": 1.0
}
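The sketch below builds the same arguments in Python and decodes base64 audio back into bytes. The input fields match the example above. The name of the response field holding the audio is not documented here, so `audio_base64` in `sample_result` is an assumption, and the sample payload is placeholder bytes rather than real audio.

```python
import base64


def build_tts_arguments(text: str, voice: str = "af_heart", speed: float = 1.0) -> dict:
    """Build the arguments object for synthesize_speech."""
    return {"text": text, "voice": voice, "speed": speed}


def decode_audio(audio_base64: str) -> bytes:
    """Decode a base64 audio string into raw bytes."""
    return base64.b64decode(audio_base64)


tts_arguments = build_tts_arguments("Hello, how are you today?")

# Assumed response shape with placeholder content, for illustration only.
sample_result = {"audio_base64": base64.b64encode(b"fake-audio").decode("ascii")}
audio_bytes = decode_audio(sample_result["audio_base64"])
```

The decoded bytes can then be written to a file with `Path(...).write_bytes(audio_bytes)`.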

Pricing

$0.02 per tool call (pay-per-event).
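For budgeting, the per-call price can be turned into a simple estimate. This sketch assumes every tool call is billed at the same $0.02 rate; whether health-check calls are billed is not stated here. `Decimal` is used to avoid floating-point rounding in currency arithmetic.

```python
from decimal import Decimal

PRICE_PER_CALL = Decimal("0.02")  # USD per tool call, as stated above


def estimate_cost(tool_calls: int) -> Decimal:
    """Estimate the total cost in USD for a number of tool calls."""
    if tool_calls < 0:
        raise ValueError("tool_calls must be non-negative")
    return PRICE_PER_CALL * tool_calls


monthly_estimate = estimate_cost(1000)
```

At this rate, 1,000 calls come to $20.00, matching the headline price.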

Technical Details

Engines: Brainiall Pronunciation, Brainiall Speech, Brainiall Speech Pro, Brainiall Voice
Audio: Supports WAV, MP3, OGG, FLAC, WebM
Backend: Brainiall production API (https://api.brainiall.com), auto-scaling
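A client can reject unsupported files before spending a tool call on them. The sketch below checks the file extension against the formats listed above. Mapping each format to a single extension is an assumption, and the server may inspect the audio content rather than the file name.

```python
from pathlib import Path

# Extensions assumed to correspond to the supported formats listed above.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".ogg", ".flac", ".webm"}


def is_supported_audio(filename: str) -> bool:
    """Return True if the file extension matches a supported audio format."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```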

More from Brainiall

This MCP server covers the speech tools. The full Brainiall platform offers 19 specialty AI APIs + 5 bundles (NLP, vision, documents, fraud, authenticity and more) under one key.

Platform: https://app.brainiall.com
Docs: https://app.brainiall.com/en/docs
Demo: https://huggingface.co/spaces/fabiosuizu/pronunciation-assessment
Company: https://www.brainiall.com

Speech AI MCP Server

vivid_astronaut/pronunciation-assessment-mcp

Speech AI MCP server with 9 tools: pronunciation scoring (0-100 at phoneme/word/sentence level), speech-to-text with timestamps, text-to-speech with 12 English voices, and multilingual Whisper transcription (99 languages + speaker diarization). Sub-300ms latency. Pay-per-use: $0.02/call.

Fabio Suizu

Hugging Face Audio AI

alizarin_refrigerator-owner/hugging-face-audio-ai

Audio with Hugging Face models: speech recognition, text-to-speech, and audio analysis. Speech-to-Text: transcribe audio. Text-to-Speech: generate natural speech. Audio Classification: classify sounds. Voice Activity Detection: detect speech. Speaker Diarization: identify speakers. Music Generation: create music.

The Howlers

Text To Speech

vivid_astronaut/text-to-speech

Convert text to natural speech using AI voices. Multiple voices and languages available. Generate audio files for podcasts, videos, accessibility, and voice assistants.

Fabio Suizu

AI Voice Generator MCP Server

szoni/apify-tts-mcp

Convert text to natural speech (text-to-speech / TTS) via MCP — multiple AI voices and models. Pay per character, no provider account or API key needed. Ready for Claude, Cursor and other AI agents.

Szoni

Text to speech generator

akash9078/advanced-text-to-speech

Professional-grade Text-to-Speech (TTS) actor powered by advanced AI models. Convert any text into natural, human-like speech with 50+ premium voices across 9 languages. Perfect for content creation, accessibility, voiceovers, audiobooks, podcasts, and multilingual applications.

Akash Kumar Naik

Text to Speech Generator

moving_beacon-owner1/my-actor-30

Convert text into natural-sounding speech in multiple languages with ease.

Jamshaid Arif

Speech Lang Pathologist Email Scraper

contacts-api/speech-lang-pathologist-email-scraper

Speech-language pathologist email scraper to extract verified speech therapist emails from clinics, hospitals, rehabilitation centers, schools, and healthcare directories 📧🗣️ Perfect for healthcare outreach, recruitment, and speech therapy lead generation.

Lead Heaven

Google Free Text to Speech

jupri/google-speech

Use free Google Text to Speech to translate text into voice

cat

Text To Speech

calm_necessity/text-to-speech

AI Text-to-Speech API that converts written text into high-quality natural voice audio. Supports multiple voices, languages, adjustable speed and pitch, ideal for audiobooks, podcasts, accessibility, automation, and voice-enabled applications.

Taher Ali Badnawarwala

Text to Speech

hgservices/text-to-speech

Turn any text into natural-sounding speech with AI voices in seconds. Powered by world class AI models, with multilingual voices and MP3, WAV, FLAC, Opus & AAC output. No setup or coding required.