Speech AI MCP Server
Pricing
Pay per usage
AI-powered speech tools for MCP agents: pronunciation scoring (0-100 at the phoneme, word, and sentence levels), speech-to-text with word timestamps, and text-to-speech with 12 English voices. Median latency is under 300 ms.
Developer: Fabio Suizu (maintained by Community)
Pronunciation Assessment MCP Server
AI-powered English pronunciation scoring for MCP-enabled AI agents.
What it does
This MCP server provides pronunciation assessment tools that can be called by AI agents:
- assess_pronunciation: Score pronunciation from audio (WAV/MP3/OGG/WebM, base64-encoded)
- check_pronunciation_service: Health check for the backend service
Scoring Levels
Each assessment returns scores (0-100) at four levels of granularity:
| Level | Description | Example Use |
|---|---|---|
| Overall | Global pronunciation quality | Quick assessment |
| Sentence | Sentence-level fluency & accuracy | Feedback on flow |
| Word | Per-word pronunciation scores | Identify problem words |
| Phoneme | Individual sound accuracy | Detailed correction |
Performance
- Accuracy: Exceeds human inter-annotator agreement (PCC 0.576 vs 0.555 on phoneme scoring)
- Validated: 9,259 utterances across 7 native language backgrounds, zero errors
- Latency: p50=257ms, p95=423ms
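The accuracy figure above is reported as a PCC. Assuming PCC stands for the Pearson correlation coefficient (the usual reading, though the listing does not spell it out), the following minimal sketch shows how such a number could be computed between model scores and human reference scores. The score lists are made-up illustrative values, not data from the validation set.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, left unnormalized because the factor cancels.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical phoneme scores (illustration only).
model_scores = [80.0, 65.0, 90.0, 40.0, 72.0]
human_scores = [75.0, 70.0, 95.0, 50.0, 60.0]
pcc = pearson_correlation(model_scores, human_scores)
print(f"PCC = {pcc:.3f}")
```

A value of 1.0 means the two score lists rise and fall together perfectly; values near 0 mean no linear relationship.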
How to Use
With MCP Client
Connect to the MCP endpoint:
https://Ym2gS88TksnTdTcPq.apify.actor/mcp?token=YOUR_APIFY_TOKEN
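MCP tool invocations are JSON-RPC 2.0 messages that use the `tools/call` method. As a rough sketch of what a client sends to this endpoint, the snippet below builds the URL and a request body without touching the network. The helper names are hypothetical, and a real MCP client library would also handle the initialization handshake and transport details that are omitted here.

```python
import json
from urllib.parse import urlencode

MCP_BASE_URL = "https://Ym2gS88TksnTdTcPq.apify.actor/mcp"


def build_endpoint(token):
    """Append the Apify token as a query parameter (hypothetical helper)."""
    return f"{MCP_BASE_URL}?{urlencode({'token': token})}"


def build_tool_call(tool_name, arguments, request_id=1):
    """Build a JSON-RPC 2.0 message for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


url = build_endpoint("YOUR_APIFY_TOKEN")
body = json.dumps(build_tool_call("check_pronunciation_service", {}))
print(url)
print(body)
```

In practice most agents would point an MCP client at the URL and let it construct these messages.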
Tool: assess_pronunciation
Input:
{"audio_base64": "<base64-encoded-audio>", "text": "The quick brown fox jumps over the lazy dog"}
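A minimal sketch of preparing this input in Python, assuming the audio is already available as raw bytes (for example, read from a file). The function name is hypothetical, and the placeholder bytes are not real audio.

```python
import base64


def build_assessment_input(audio_bytes, reference_text):
    """Build the arguments for assess_pronunciation (hypothetical helper)."""
    if not audio_bytes:
        raise ValueError("audio_bytes must not be empty")
    text = reference_text.strip()
    if not text:
        raise ValueError("reference_text must not be empty")
    return {
        # Binary audio cannot travel in JSON directly, so it is base64-encoded.
        "audio_base64": base64.b64encode(audio_bytes).decode("ascii"),
        "text": text,
    }


# Placeholder bytes standing in for a real recording.
fake_audio = b"RIFF\x00\x00\x00\x00WAVE"
payload = build_assessment_input(
    fake_audio, "The quick brown fox jumps over the lazy dog"
)
print(payload["text"], len(payload["audio_base64"]))
```

With a real recording, `audio_bytes` would typically come from `open(path, "rb").read()`.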
Output:
{"overall_score": 72.5, "sentence_score": 75.0, "words": [{"word": "The", "score": 85.0, "phonemes": [...]}, ...]}
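One way an agent might use this output is to flag the lowest-scoring words for targeted practice. The sketch below assumes the response shape shown above; the extra words, their scores, and the threshold of 70 are invented for illustration, and the phoneme entries are left empty because their structure is elided in the example.

```python
import json

# Hypothetical response in the shape shown above (phoneme details omitted).
sample_response = """
{"overall_score": 72.5, "sentence_score": 75.0, "words": [
  {"word": "The", "score": 85.0, "phonemes": []},
  {"word": "quick", "score": 58.0, "phonemes": []},
  {"word": "fox", "score": 66.5, "phonemes": []}
]}
"""


def weakest_words(result, threshold=70.0):
    """Return (word, score) pairs below the threshold, lowest score first."""
    flagged = [
        (entry["word"], entry["score"])
        for entry in result.get("words", [])
        if entry["score"] < threshold
    ]
    return sorted(flagged, key=lambda pair: pair[1])


result = json.loads(sample_response)
problem_words = weakest_words(result)
print(problem_words)
```

The threshold is a design choice for the calling application, not something the service defines.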
Tool: check_pronunciation_service
Returns the service health status, model version, and model size.
Pricing
$0.02 per assessment (pay-per-event).
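For budgeting, the per-assessment price can be turned into a quick estimate. This sketch counts only `assess_pronunciation` calls at the listed rate and assumes no other charges apply, which the listing does not confirm.

```python
from decimal import Decimal

PRICE_PER_ASSESSMENT = Decimal("0.02")  # USD, from the pricing line above


def estimate_cost(assessments):
    """Estimated USD cost for a number of assess_pronunciation calls."""
    if assessments < 0:
        raise ValueError("assessments must be non-negative")
    # Decimal avoids the rounding drift of binary floats for currency.
    return PRICE_PER_ASSESSMENT * assessments


print(estimate_cost(1500))
```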
Technical Details
- Model: Conformer-CTC Small (17MB, INT8 quantized)
- Audio: 16kHz mono, supports WAV/MP3/OGG/WebM/M4A
- Backend: Azure Container Apps, auto-scaling
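The listing does not say whether audio in other sample rates is resampled by the service. If you prefer to send 16 kHz mono WAV up front, a check along these lines could be run before encoding. It uses only the standard-library `wave` module, handles WAV input only, and its helper names are hypothetical.

```python
import io
import wave

TARGET_RATE_HZ = 16000
TARGET_CHANNELS = 1


def wav_matches_target(wav_bytes):
    """True if the WAV data is 16 kHz mono, judged from its header."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return (
            wav.getframerate() == TARGET_RATE_HZ
            and wav.getnchannels() == TARGET_CHANNELS
        )


def make_silent_wav(rate_hz, channels, seconds=0.1):
    """Generate a short silent 16-bit WAV in memory, for demonstration."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00\x00" * channels * int(rate_hz * seconds))
    return buffer.getvalue()


print(wav_matches_target(make_silent_wav(16000, 1)))
print(wav_matches_target(make_silent_wav(44100, 2)))
```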
Links
- Demo: https://huggingface.co/spaces/fabiosuizu/pronunciation-assessment
- API Docs: https://apim-ai-apis.azure-api.net/pronunciation/docs
- Company: Brainiall Inc