Sonartext Speech To Text
Under maintenance

Pricing

Pay per event

Developed by

Kyle

Maintained by Community

SonarText Speech to Text Transcription Service

Last modified

20 days ago

Sonartext STT Actor - Professional Speech-to-Text Transcription

Transform audio content into high-quality text transcriptions with enterprise-grade accuracy using state-of-the-art AI models. Perfect for content creators, researchers, businesses, and developers who need reliable speech-to-text conversion.

🚀 Key Features

  • 🎯 State-of-the-Art AI: Powered by Whisper large-v3-turbo for industry-leading accuracy
  • 👥 Speaker Diarization: Automatically identify and separate different speakers
  • 📁 Multiple Input Sources: Support for direct uploads, YouTube videos, Twitter audio, Google Drive, AWS S3, and URLs
  • ⚡ Fast Processing: Optimized for speed without compromising quality
  • 💰 Cost Control: Built-in cost estimation and limits to manage expenses
  • 📊 Flexible Output: Multiple formats including JSON, plain text, SRT, and VTT
  • 🔐 Enterprise Security: Secure API integration with proper error handling
  • 📈 Usage Analytics: Detailed processing statistics and performance metrics

🎯 Use Cases

Content Creation & Media

  • Podcast Transcription: Convert episodes to searchable text for show notes and SEO
  • Video Subtitles: Generate accurate captions for YouTube, social media, and streaming
  • Interview Processing: Transcribe interviews with automatic speaker identification
  • Course Materials: Create accessible text versions of educational audio content

Business & Research

  • Meeting Transcription: Document important discussions with speaker attribution
  • Research Analysis: Convert audio interviews and focus groups to analyzable text
  • Call Center Analytics: Process customer service calls for quality assurance
  • Legal Documentation: Transcribe depositions and hearings with high accuracy

Accessibility & Compliance

  • ADA Compliance: Provide text alternatives for audio content
  • Multi-language Support: Transcribe content in multiple languages
  • Hearing Accessibility: Make audio content accessible to deaf and hard-of-hearing users

📥 Input Configuration

Required Parameters

{
  "audioSource": {
    "method": "url",
    "fileUrl": "https://example.com/audio.mp3"
  }
}

Complete Configuration Example

{
  "audioSource": {
    "method": "youtube",
    "fileUrl": "https://youtube.com/watch?v=example",
    "youTubeOptions": {
      "audioQuality": "highest"
    }
  },
  "transcriptionOptions": {
    "outputFormat": "json",
    "speakerDiarization": true,
    "wordTimestamps": true,
    "maxCostCents": 500
  },
  "outputOptions": {
    "saveFiles": true,
    "includeRawResponse": false
  }
}

Input Source Options

| Method | Description | Example |
| --- | --- | --- |
| url | Direct HTTP/HTTPS URL or local file | https://cdn.example.com/audio.mp3 |
| upload | File upload via buffer | Used with file upload interface |
| youtube | YouTube video URL | https://youtube.com/watch?v=dQw4w9WgXcQ |
| twitter | Twitter/X post with audio/video | https://twitter.com/user/status/123456 |
| gdrive | Google Drive file (public) | https://drive.google.com/file/d/FILE_ID |
| s3 | AWS S3 object | s3://bucket-name/path/to/file.mp3 |
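
As an illustration of how the methods in the table might be selected on the client side, here is a minimal Python sketch that picks a method from the location string. The host-to-method mapping and the reuse of `fileUrl` for every method (including `s3`) are assumptions drawn from the examples above, not the actor's documented behavior; `build_audio_source` is a hypothetical helper, and the `upload` method and local files are not covered.

```python
from urllib.parse import urlparse

# Assumed mapping from host name to input method, based on the table's examples.
_HOST_METHODS = {
    "youtube.com": "youtube",
    "youtu.be": "youtube",
    "twitter.com": "twitter",
    "x.com": "twitter",
    "drive.google.com": "gdrive",
}


def build_audio_source(location: str) -> dict:
    """Return an audioSource object for an http(s) URL or an s3:// location."""
    parsed = urlparse(location)
    if parsed.scheme == "s3":
        method = "s3"
    elif parsed.scheme in ("http", "https"):
        host = (parsed.hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        # Anything that is not a known platform is treated as a plain URL.
        method = _HOST_METHODS.get(host, "url")
    else:
        raise ValueError(f"unsupported location: {location!r}")
    # Assumption: every method accepts the location in "fileUrl".
    return {"method": method, "fileUrl": location}


if __name__ == "__main__":
    print(build_audio_source("https://youtube.com/watch?v=dQw4w9WgXcQ"))
    print(build_audio_source("s3://bucket-name/path/to/file.mp3"))
```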

Advanced Options

{
  "transcriptionOptions": {
    "outputFormat": "srt",        // json, text, srt, vtt
    "speakerDiarization": true,   // Identify different speakers
    "wordTimestamps": true,       // Include word-level timing
    "maxCostCents": 1000          // Maximum cost limit (in cents)
  },
  "outputOptions": {
    "saveFiles": true,            // Save files to Apify dataset
    "includeRawResponse": true    // Include full API response
  }
}
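
The `//` comments in the example above are annotations only; strict JSON parsers reject them, so they must be removed before the input is submitted. One way to avoid hand-written JSON is to build the options in code and serialize them. The sketch below assumes the four output formats listed above are the only valid ones and that `maxCostCents` must be a positive integer; `build_transcription_options` is a hypothetical helper, and its defaults are illustrative rather than the actor's own.

```python
import json

# Formats named in the annotated example above.
ALLOWED_FORMATS = {"json", "text", "srt", "vtt"}


def build_transcription_options(output_format="json", speaker_diarization=False,
                                word_timestamps=False, max_cost_cents=None):
    """Validate the options locally and return a transcriptionOptions dict."""
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"outputFormat must be one of {sorted(ALLOWED_FORMATS)}")
    options = {
        "outputFormat": output_format,
        "speakerDiarization": bool(speaker_diarization),
        "wordTimestamps": bool(word_timestamps),
    }
    if max_cost_cents is not None:
        # Assumption: the limit is a positive whole number of cents.
        if not isinstance(max_cost_cents, int) or max_cost_cents <= 0:
            raise ValueError("maxCostCents must be a positive integer")
        options["maxCostCents"] = max_cost_cents
    return options


if __name__ == "__main__":
    run_input = {
        "transcriptionOptions": build_transcription_options("srt", True, True, 1000),
        "outputOptions": {"saveFiles": True, "includeRawResponse": True},
    }
    # json.dumps emits comment-free JSON that any parser accepts.
    print(json.dumps(run_input, indent=2))
```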

📤 Output Structure

JSON Format (Default)

{
  "success": true,
  "result": {
    "transcription": "Hello, welcome to our podcast. Today we're discussing AI innovations.",
    "speakers": [
      {
        "speaker": "Speaker 1",
        "start": 0.0,
        "end": 3.2,
        "text": "Hello, welcome to our podcast."
      },
      {
        "speaker": "Speaker 2",
        "start": 3.5,
        "end": 7.8,
        "text": "Today we're discussing AI innovations."
      }
    ],
    "metadata": {
      "duration": 45.6,
      "wordCount": 124,
      "confidence": 0.94,
      "language": "en"
    }
  },
  "usage": {
    "costCents": 23,
    "processingTime": 12.5,
    "audioMinutes": 0.76
  }
}
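
To show how the diarized output might be consumed, the following sketch totals speaking time per speaker from the `speakers` array. It assumes the field names shown in the example above and that `start` and `end` are in seconds; `speaking_time` is a hypothetical helper.

```python
from collections import defaultdict


def speaking_time(output: dict) -> dict:
    """Sum segment durations (assumed to be seconds) for each speaker label."""
    totals = defaultdict(float)
    for segment in output["result"]["speakers"]:
        totals[segment["speaker"]] += segment["end"] - segment["start"]
    # Round to milliseconds to hide floating-point noise.
    return {speaker: round(total, 3) for speaker, total in totals.items()}


if __name__ == "__main__":
    sample = {
        "success": True,
        "result": {
            "speakers": [
                {"speaker": "Speaker 1", "start": 0.0, "end": 3.2,
                 "text": "Hello, welcome to our podcast."},
                {"speaker": "Speaker 2", "start": 3.5, "end": 7.8,
                 "text": "Today we're discussing AI innovations."},
            ]
        },
    }
    print(speaking_time(sample))
```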

SRT Format Example

1
00:00:00,000 --> 00:00:03,200
Hello, welcome to our podcast.

2
00:00:03,500 --> 00:00:07,800
Today we're discussing AI innovations.
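
If only the JSON output is available, the same cues can be produced on the client side. The sketch below converts speaker segments (field names taken from the JSON example) into SRT text, with `HH:MM:SS,mmm` timestamps and a blank line between cues. It is a client-side conversion under those assumptions, not a description of how the actor generates its own SRT output.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, e.g. 3.2 -> 00:00:03,200."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text: numbered cues separated by one blank line."""
    cues = []
    for index, segment in enumerate(segments, start=1):
        timing = f"{srt_timestamp(segment['start'])} --> {srt_timestamp(segment['end'])}"
        cues.append(f"{index}\n{timing}\n{segment['text']}")
    return "\n\n".join(cues) + "\n"


if __name__ == "__main__":
    print(segments_to_srt([
        {"start": 0.0, "end": 3.2, "text": "Hello, welcome to our podcast."},
        {"start": 3.5, "end": 7.8, "text": "Today we're discussing AI innovations."},
    ]))
```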

Error Handling

{
  "success": false,
  "error": {
    "code": "AUDIO_TOO_LONG",
    "message": "Audio duration exceeds maximum limit",
    "details": {
      "maxDuration": 3600,
      "actualDuration": 4200
    }
  }
}
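
A caller would typically check `success` before touching `result`. The sketch below turns a failed response into a Python exception that carries the error code and details. `TranscriptionError` and `unwrap_result` are hypothetical names, and `AUDIO_TOO_LONG` is the only error code taken from the example above; other codes are not assumed.

```python
class TranscriptionError(Exception):
    """Raised when the actor output reports success == false."""

    def __init__(self, code, message, details=None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.details = details or {}


def unwrap_result(output: dict) -> dict:
    """Return output['result'] on success, otherwise raise TranscriptionError."""
    if output.get("success"):
        return output["result"]
    error = output.get("error", {})
    raise TranscriptionError(
        error.get("code", "UNKNOWN"),
        error.get("message", "no message provided"),
        error.get("details"),
    )


if __name__ == "__main__":
    failed = {
        "success": False,
        "error": {
            "code": "AUDIO_TOO_LONG",
            "message": "Audio duration exceeds maximum limit",
            "details": {"maxDuration": 3600, "actualDuration": 4200},
        },
    }
    try:
        unwrap_result(failed)
    except TranscriptionError as exc:
        over_by = exc.details["actualDuration"] - exc.details["maxDuration"]
        print(f"{exc} (over the limit by {over_by} seconds)")
```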

🚀 Getting Started

1. Basic Usage

{
  "audioSource": {
    "method": "url",
    "fileUrl": "https://example.com/meeting-recording.mp3"
  },
  "transcriptionOptions": {
    "speakerDiarization": true
  }
}

2. Run the Actor

Execute the actor and retrieve your transcription from the dataset.
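
One way to do this programmatically is through the Apify REST API. The sketch below uses only the Python standard library and the `run-sync-get-dataset-items` endpoint, which starts a run and returns the dataset items in a single call. The actor ID shown is a placeholder, not the actor's real identifier, and the token is read from an environment variable chosen for this example. Synchronous endpoints have a time limit, so long recordings may need the asynchronous run endpoint instead.

```python
import json
import os
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_ID = "USERNAME~ACTOR-NAME"  # placeholder: replace with the actor's real ID


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Prepare a POST request that runs the actor and returns its dataset items."""
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(actor_id: str, token: str, run_input: dict, timeout: float = 300):
    """Send the request and decode the JSON list of dataset items."""
    request = build_run_request(actor_id, token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    run_input = {
        "audioSource": {"method": "url",
                        "fileUrl": "https://example.com/meeting-recording.mp3"},
        "transcriptionOptions": {"speakerDiarization": True},
    }
    # APIFY_TOKEN is an environment variable name assumed for this example.
    items = run_actor(ACTOR_ID, os.environ["APIFY_TOKEN"], run_input)
    print(json.dumps(items, indent=2))
```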

💡 Pro Tips

Optimize Audio Quality

  • Clear Audio: Use high-quality recordings with minimal background noise
  • Proper Levels: Ensure consistent audio levels across speakers
  • Format: MP3, WAV, and FLAC formats work best

Speaker Diarization Best Practices

  • Enable for multi-speaker content (meetings, interviews, podcasts)
  • Works best with 2-10 distinct speakers
  • Clearer speaker separation improves accuracy

Cost Management

  • Set maxCostCents to cap what a single run can spend
  • Use the cost estimation feature before processing large files
  • Consider audio length: usage is billed per minute of audio processed

YouTube Processing

  • Supports both public and unlisted videos
  • Automatically extracts best quality audio
  • Handles various video lengths efficiently

💰 Pricing

This actor uses Sonartext's professional STT service with transparent, usage-based pricing:

  • Event-Based Pricing: $0.01 base fee per Actor start
  • Usage Fees: $0.004 per minute of audio processed
  • Volume Discounts: Available for high-usage customers
  • Cost Control: Set maximum spending limits per job
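
As a rough worked example of the rates above, the sketch below estimates the cost of a run and checks it against a `maxCostCents` budget. It assumes audio time is billed pro rata (not rounded up to whole minutes) and ignores any volume discount; both assumptions may differ from actual billing. Under them, a 60-minute recording costs about 1 + 0.4 × 60 = 25 cents.

```python
BASE_FEE_CENTS = 1.0      # $0.01 per Actor start
PER_MINUTE_CENTS = 0.4    # $0.004 per minute of audio


def estimate_cost_cents(audio_seconds: float) -> float:
    """Estimate the cost of one run, assuming pro-rata per-minute billing."""
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    return BASE_FEE_CENTS + PER_MINUTE_CENTS * (audio_seconds / 60)


def fits_budget(audio_seconds: float, max_cost_cents: float) -> bool:
    """Return True if the estimate stays within the maxCostCents limit."""
    return estimate_cost_cents(audio_seconds) <= max_cost_cents


if __name__ == "__main__":
    one_hour = 60 * 60
    print(f"Estimated cost: {estimate_cost_cents(one_hour):.1f} cents")
    print("Within a 500-cent limit:", fits_budget(one_hour, 500))
```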

🔒 Security & Privacy

  • API Security: All requests use secure HTTPS with API key authentication
  • Data Privacy: Audio files are processed securely and not stored permanently
  • Compliance: Built for enterprise use with privacy-first architecture
  • Rate Limiting: Respectful API usage with built-in rate limiting
  • Content Rights: Ensure you have proper rights to transcribe the audio content
  • Privacy Compliance: Follow applicable privacy laws (GDPR, CCPA, etc.) when processing personal data
  • Platform Terms: Respect YouTube's Terms of Service and other platform policies
  • Copyright: Be mindful of copyrighted content when processing media files

🛠️ Advanced Features

Batch Processing

Process multiple files efficiently by chaining multiple actor runs or using the batch processing capabilities.
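
The "multiple actor runs" approach can be sketched as one run input per file, dispatched through a small thread pool. The `run_one` callable is a stand-in for whatever function actually starts a run and returns its output; it and the other names here are hypothetical, and the actor's built-in batch capabilities are not modeled.

```python
from concurrent.futures import ThreadPoolExecutor


def make_run_input(file_url: str) -> dict:
    """Build a minimal run input for one file, using the 'url' method."""
    return {"audioSource": {"method": "url", "fileUrl": file_url}}


def run_batch(file_urls, run_one, max_workers: int = 4) -> dict:
    """Start one run per URL in parallel and map each URL to its result."""
    inputs = [make_run_input(url) for url in file_urls]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with file_urls.
        results = list(pool.map(run_one, inputs))
    return dict(zip(file_urls, results))


if __name__ == "__main__":
    def fake_run(run_input):
        # Stand-in for a real actor call; it just echoes the file URL.
        return {"success": True, "source": run_input["audioSource"]["fileUrl"]}

    urls = ["https://example.com/a.mp3", "https://example.com/b.mp3"]
    for url, result in run_batch(urls, fake_run).items():
        print(url, "->", result["success"])
```

Keeping `max_workers` small is a deliberate choice here, since many parallel runs are more likely to hit rate limits.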

Integration Options

  • APIs: Direct REST API integration available
  • Webhooks: Real-time processing notifications
  • Zapier: No-code workflow integration
  • Custom Solutions: Enterprise integrations available

Supported Audio Formats

  • Audio: MP3, WAV, FLAC, AAC, OGG, M4A
  • Video: MP4, AVI, MOV, MKV (audio extracted)
  • Containers: Support for most common audio/video containers

📞 Support & Documentation

Help & Support

  • Documentation: Full API Documentation
  • Support: Contact support through the Apify platform
  • Community: Join our Discord for community support
  • Enterprise: Dedicated support for business customers

Troubleshooting

  • Rate Limits: Wait and retry if you hit API limits
  • Large Files: Consider breaking very large files into segments
  • Quality Issues: Check audio quality and format compatibility
  • Costs: Monitor usage through the actor's cost reporting features
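
The "wait and retry" advice for rate limits is commonly implemented as exponential backoff. The sketch below is a generic helper, not part of the actor: `operation` is any callable that raises when the request fails, and the delay doubles after each failed attempt. Which exception signals a rate limit depends on the client in use, so `retry_on` is left for the caller to set.

```python
import time


def retry_with_backoff(operation, max_attempts: int = 5, base_delay: float = 1.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Call operation(); on failure wait base_delay * 2**n seconds and retry."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky():
        # Simulated operation that fails twice before succeeding.
        calls["count"] += 1
        if calls["count"] < 3:
            raise RuntimeError("rate limited")
        return "ok"

    print(retry_with_backoff(flaky, base_delay=0.1))
```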

Updates & Changelog

This actor is actively maintained with regular updates for:

  • New AI model versions
  • Additional output formats
  • Performance improvements
  • Enhanced error handling

📊 Actor Metrics

  • Response Time: < 30 seconds for typical audio files
  • Accuracy: 95%+ for clear audio with proper formatting
  • Supported Languages: 50+
  • File Size Limit: Up to 500MB per file
  • Duration Limit: Up to 6 hours per audio file

Developed by: Sonartext
Version: 1.0.0
Last Updated: September 2025
License: MIT

For technical issues or feature requests, please use the Apify platform's support system or visit our documentation.