Pricing

Pay per event

Audio Transcriber

Automates audio transcription from multiple sources (files or links). Normalizes input format to ensure optimal processing. Generates word-for-word transcriptions maintaining references to source audio, perfect for datasets requiring traceability and regulatory compliance.

Rating: 5.0 (1)

Developer: ParseForge

Actor stats

Total users: 143

Last modified: 4 days ago

🎤 Audio Transcriber

🚀 Convert speech to text in seconds. Upload audio files and get accurate transcriptions. Supports multiple languages. No coding, no transcription accounts required.

Export structured records from Audio Transcriber as CSV, JSON, JSONL, Excel, or XML, with clean fields ready for downstream pipelines.

Copy to your AI assistant

Copy this block into ChatGPT, Claude, Cursor, or any LLM to start using this actor.

parseforge/audio-transcriber on Apify. Call: ApifyClient("TOKEN").actor("parseforge/audio-transcriber").call(run_input={...}), then client.dataset(run["defaultDatasetId"]).list_items().items for results. Key inputs: audioFileUrl (array, default ["https://upload.wikimedia.org/wikipedia/commons/e/e9/Mia_Lo), language (string, default "en"). Full actor spec: fetch build via GET https://api.apify.com/v2/acts/parseforge~audio-transcriber (Bearer TOKEN). Get token: https://console.apify.com/account/integrations
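The call described in the block above can be sketched in Python roughly as follows. This is a minimal sketch, assuming the `apify-client` package is installed. The helper names `build_run_input` and `transcribe` are hypothetical; the input keys `audioFileUrl` and `language` and the client calls are the ones quoted above.

```python
from __future__ import annotations

from typing import Any, Dict, List

ACTOR_ID = "parseforge/audio-transcriber"


def build_run_input(audio_urls: List[str], language: str = "en") -> Dict[str, Any]:
    """Assemble the Actor input from the two documented keys."""
    if not audio_urls:
        raise ValueError("at least one audio URL is required")
    return {"audioFileUrl": list(audio_urls), "language": language}


def transcribe(token: str, audio_urls: List[str], language: str = "en") -> List[Dict[str, Any]]:
    """Run the Actor and return the records from its default dataset."""
    # Imported lazily so build_run_input works without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=build_run_input(audio_urls, language))
    return client.dataset(run["defaultDatasetId"]).list_items().items
```

Calling `transcribe("TOKEN", ["https://example.com/interview.mp3"], "es")` would start one run and block until it finishes; the URL here is a placeholder, not a real file.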

Convert audio recordings to clean, structured text without juggling transcription tools or paying per-minute fees. The Actor accepts one or more audio file URLs (MP3, WAV, AIFF, AAC, OGG, FLAC, M4A and similar), runs each through an AI transcription pipeline, and returns the full transcript in your dataset. Built for podcasters, journalists, researchers, meeting teams, and any workflow that turns spoken audio into searchable text.

The output is a structured record per file: a back-reference to the input URL, the full transcription, a timestamp, and an error field if something fails. Hand the dataset off to your editor, summarizer, or downstream pipeline. Every run is processed live, so there is no upload cap or vendor lock-in.

👥 Built for	🎯 Primary use cases
Podcasters and creators	Generate episode transcripts and show notes
Journalists and researchers	Convert recorded interviews into searchable text
Meeting and operations teams	Auto-transcribe Zoom and Teams recordings
Content marketing	Repurpose webinars into blog posts and shorts
Accessibility teams	Produce captions and transcripts for compliance
Localization workflows	Get base text ready for translation pipelines

📋 What the Audio Transcriber does

🎧 Audio input. Accepts one or more audio file URLs in common formats (MP3, WAV, AIFF, AAC, OGG, FLAC, M4A).
🌐 Language hint. Pass an ISO 639-1 language code (e.g. en, es, fr, pt) to bias the model toward the right phonetics and vocabulary.
📝 Full transcription. Returns the complete text of each audio file as a single string per record.
🆔 Back-reference. Every record includes the original audio URL so you can rejoin transcripts to source files.
⏱️ Timestamp. Every record carries a timestamp field with the time the transcript was produced.
❗ Per-file error reporting. If a file fails (corrupt, unsupported, unreadable URL) the error appears on its own record without breaking the run.
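
A pre-flight check can catch obvious input mistakes before a run is started. The sketch below is an illustration only: the extension set mirrors the formats named in the list above (the Actor may accept others, so a warning is advisory), and the language check only tests the two-letter shape of an ISO 639-1 code, not whether the code is assigned. The function names are hypothetical.

```python
from __future__ import annotations

import re
from pathlib import PurePosixPath
from typing import List
from urllib.parse import urlparse

# Formats named in the feature list; the Actor may accept others.
KNOWN_EXTENSIONS = {".mp3", ".wav", ".aiff", ".aac", ".ogg", ".flac", ".m4a"}
LANGUAGE_SHAPE = re.compile(r"[a-z]{2}")


def check_language(code: str) -> bool:
    """True if the code has the two-lowercase-letter shape of ISO 639-1."""
    return LANGUAGE_SHAPE.fullmatch(code) is not None


def check_audio_url(url: str) -> List[str]:
    """Return warnings for one URL; an empty list means no concerns."""
    warnings = []
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        warnings.append("not an absolute http(s) URL")
    # Look only at the path, so query strings (e.g. signatures) are ignored.
    suffix = PurePosixPath(parsed.path).suffix.lower()
    if suffix not in KNOWN_EXTENSIONS:
        warnings.append(f"unrecognized extension {suffix!r}")
    return warnings
```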

The Actor processes files in the order you provide them. Records stream into the dataset as transcripts complete, so you can start consuming results before the run has finished. Manual transcription typically takes 4-6 hours per hour of audio; this Actor returns the same text in minutes.
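
Since records arrive while the run is still in progress, a consumer can poll the dataset with a growing offset. The sketch below is written against two injected callables so it does not depend on any particular client; `stream_records`, `fetch_page`, and `run_finished` are hypothetical names. With `apify-client`, `fetch_page` could wrap `client.dataset(dataset_id).list_items(offset=..., limit=...).items` and `run_finished` could inspect the status returned by `client.run(run_id).get()`; treat that wiring as an assumption to verify against the client documentation.

```python
from __future__ import annotations

import time
from typing import Any, Callable, Dict, Iterator, List

Record = Dict[str, Any]


def stream_records(
    fetch_page: Callable[[int, int], List[Record]],
    run_finished: Callable[[], bool],
    page_size: int = 100,
    poll_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Record]:
    """Yield records as they appear, until the run ends and the dataset is drained."""
    offset = 0
    while True:
        # Read the run state BEFORE fetching, so records written just before
        # the run finished are still picked up by this fetch.
        finished = run_finished()
        page = fetch_page(offset, page_size)
        for record in page:
            yield record
        offset += len(page)
        if page:
            continue  # more may be waiting; fetch again without sleeping
        if finished:
            return
        sleep(poll_seconds)
```

The loop stops only after it has seen the run finished and then fetched an empty page, which avoids dropping records that land between the last fetch and the end of the run.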

💡 Why it matters: spoken audio is everywhere (podcasts, interviews, meetings) but most data tooling is text-first. A reliable speech-to-text step unlocks search, summarization, translation, and analytics workflows that would otherwise be impossible.

📊 Data fields

Each record includes: audioReference, timestamp, transcription. All 3 field names come from a real production run, so what you see here is what lands in your dataset. Records for files that failed also carry an error field.
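
Once the dataset is downloaded, the records can be separated into transcripts and failures and keyed by `audioReference` to rejoin them with the source files. This is a minimal sketch: the field name `error` is an assumption (the description mentions an error field without naming it), and `split_records` is a hypothetical helper.

```python
from __future__ import annotations

from typing import Any, Dict, Iterable, Tuple

ERROR_FIELD = "error"  # assumed name; the description does not spell it out


def split_records(
    records: Iterable[Dict[str, Any]],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (transcripts, failures), both keyed by the source audio URL."""
    transcripts: Dict[str, str] = {}
    failures: Dict[str, str] = {}
    for record in records:
        source = record["audioReference"]
        if record.get(ERROR_FIELD):
            failures[source] = str(record[ERROR_FIELD])
        else:
            transcripts[source] = record.get("transcription", "")
    return transcripts, failures
```

Keying both dictionaries by the input URL makes it straightforward to retry only the failed files in a follow-up run.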

⚠️ Good to Know: the audio URL must be publicly reachable. If your file lives in a private bucket, generate a signed URL valid for the run's duration before passing it in.

🚀 How to use

📝 Sign up. Create a free account with $5 credit (takes 2 minutes).
🌐 Open the Actor. Go to the Audio Transcriber page on the Apify Store.
🎯 Add your audio. Paste one or more audio URLs into audioFileUrl and (optionally) set language.
🚀 Run it. Click Start and let the Actor transcribe each file.
📥 Download. Grab your results in the Dataset tab as CSV, Excel, JSON, or XML.

⏱️ Total time from signup to first transcript: 3-5 minutes for a short clip.
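
The download step can also be scripted instead of using the Dataset tab. The sketch below builds the export URL for the Apify dataset items endpoint and streams the result to a file using only the standard library. The helper names are hypothetical, and the dataset ID and token are placeholders you would supply yourself.

```python
from __future__ import annotations

import shutil
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Excel corresponds to "xlsx" in the API's format parameter.
EXPORT_FORMATS = {"json", "jsonl", "csv", "xlsx", "xml"}


def dataset_export_url(dataset_id: str, fmt: str = "csv") -> str:
    """Build the items URL for a dataset in the requested export format."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    query = urlencode({"format": fmt})
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"


def download_dataset(dataset_id: str, token: str, destination: str, fmt: str = "csv") -> None:
    """Stream the exported dataset to a local file."""
    request = Request(
        dataset_export_url(dataset_id, fmt),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urlopen(request) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out)
```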

🔗 Recommended Actors

🎬 YouTube AI Transcriber - Transcribe YouTube videos via URL with full metadata
🖼️ Auto Video Thumbnail Generator - Auto-generate thumbnails from video uploads
📰 Article Extractor - Extract clean article text from any URL
📄 PDF to JSON Parser - Convert PDFs into structured JSON
🔍 RAG Web Browser - Fetch clean text for AI retrieval pipelines

💡 Pro Tip: browse the complete ParseForge collection for more Actors.

⚠️ Disclaimer. This Actor is an independent tool. It accesses only the audio you supply by URL and is intended for legitimate research, productivity, and content workflows. Users are responsible for ensuring they hold the rights to transcribe the audio they submit and for complying with copyright, privacy, and consent laws in their jurisdiction.

🆘 Need Help?

If you hit a bug, have questions about setup, or need a scraper we haven't built yet, open our contact form or write to parseforge@protonmail.com. We also take on paid custom data projects.

For faster answers, join our Discord. It's the best place to get support and suggest new actors.
