Transcribe Zoom Meeting to Text — Bulk Meeting Transcription
Pricing
from $0.15 / 1,000 audio seconds processed
Developer
SIÁN OÜ
Transcribe Zoom recordings to text in bulk. Plug into n8n, Zapier, or Make to auto-transcribe every cloud recording the moment it lands — no Otter subscription, no manual upload. Speaker labels for host and participants, word-level timestamps, SRT/VTT subtitles, 99+ languages.
How to transcribe a Zoom recording in 4 steps
- Drop in Zoom recording URLs — pull URLs from the Zoom Cloud Recording API (/v2/users/{userId}/recordings) and feed them straight into the audioUrls field. Or upload local .mp4/.m4a exports via audioFiles.
- Pick your options — auto-detect language, toggle speaker diarization to separate host from participants, optionally translate non-English meetings to English.
- Run the actor — recordings process 10 at a time in parallel on the paid tier; an entire team's weekly recordings can be transcribed in minutes.
- Download results — every recording lands in the dataset with the transcript, segment + word-level timestamps, speaker labels, and ready-to-use SRT/VTT subtitle strings.
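The first step above can be sketched as a small helper that turns a Zoom recordings listing into actor input. This is a minimal sketch, not the actor's own code: it assumes the listing has the shape `meetings[].recording_files[]` with `file_type` and `download_url` fields, and `pick_media_url` and `build_actor_input` are hypothetical helper names. Only `audioUrls` and `speakerDiarization` come from this page. Zoom download URLs may also require an access token, which is not handled here.

```python
from typing import Any, Optional

# Prefer the audio-only file when both exist, so a meeting is transcribed once.
PREFERRED_TYPES = ("M4A", "MP4")


def pick_media_url(meeting: dict[str, Any]) -> Optional[str]:
    """Return one download URL per meeting, or None if no media file is listed."""
    files = meeting.get("recording_files", [])
    for wanted in PREFERRED_TYPES:
        for rec in files:
            if rec.get("file_type") == wanted and rec.get("download_url"):
                return rec["download_url"]
    return None


def build_actor_input(listing: dict[str, Any], diarize: bool = True) -> dict[str, Any]:
    """Build the actor input from a listing shaped like {"meetings": [...]} (assumed)."""
    urls = [url for m in listing.get("meetings", []) if (url := pick_media_url(m))]
    return {"audioUrls": urls, "speakerDiarization": diarize}


# Illustrative listing; the URLs are placeholders, not real recordings.
sample_listing = {
    "meetings": [
        {
            "recording_files": [
                {"file_type": "MP4", "download_url": "https://example.com/rec/1.mp4"},
                {"file_type": "CHAT", "download_url": "https://example.com/rec/1.txt"},
                {"file_type": "M4A", "download_url": "https://example.com/rec/1.m4a"},
            ]
        }
    ]
}

actor_input = build_actor_input(sample_listing)
print(actor_input["audioUrls"])
```

Picking a single file per meeting avoids paying to transcribe the same audio twice when Zoom lists both a video and an audio-only export.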
Supported formats: MP4, MOV, M4A, MP3, WAV, FLAC, AAC, OPUS, OGG, WebM. Max 1 GB per file on the paid tier.
Example output — Zoom meeting transcript with speaker labels
```json
{
  "transcript": "Thanks for joining everyone — let's start with the launch checklist. So Sarah, where are we on the email sequence?...",
  "detected_language": "en",
  "duration": 1842.5,
  "segments": [
    {
      "id": 0,
      "text": "Thanks for joining everyone — let's start with the launch checklist.",
      "start": 0.18,
      "end": 4.92,
      "speaker": "SPEAKER_00",
      "language": "en",
      "words": [
        { "word": "Thanks", "start": 0.18, "end": 0.62, "speaker": "SPEAKER_00" }
      ]
    },
    {
      "id": 1,
      "text": "So Sarah, where are we on the email sequence?",
      "start": 5.40,
      "end": 8.12,
      "speaker": "SPEAKER_00",
      "language": "en",
      "words": []
    }
  ],
  "srt": "1\n00:00:00,180 --> 00:00:04,920\nThanks for joining everyone — let's start with the launch checklist.\n\n2\n00:00:05,400 --> 00:00:08,120\nSo Sarah, where are we on the email sequence?",
  "vtt": "WEBVTT\n\n00:00:00.180 --> 00:00:04.920\nThanks for joining everyone — let's start with the launch checklist.\n\n00:00:05.400 --> 00:00:08.120\nSo Sarah, where are we on the email sequence?",
  "speakers": ["SPEAKER_00", "SPEAKER_01", "SPEAKER_02"],
  "languages": ["en"],
  "fileSizeMB": 84.3,
  "success": true
}
```
Every result includes the full transcript, segment-level timestamps, word-level timestamps, language detection, recording duration in seconds, file size, ready-to-use srt and vtt subtitle strings, and (when speaker diarization is enabled) speaker labels per segment and per word.
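As an illustration of working with the `segments` array, the sketch below merges consecutive segments from the same speaker into readable turns. It relies only on the `text` and `speaker` fields shown in the example output. `format_by_speaker` is a hypothetical helper name, and the sample segments are shortened, invented lines.

```python
from typing import Any


def format_by_speaker(segments: list[dict[str, Any]]) -> str:
    """Merge consecutive segments from the same speaker into one line per turn."""
    turns: list[tuple[str, list[str]]] = []
    for seg in segments:
        # Segments carry no speaker label when diarization is disabled.
        speaker = seg.get("speaker", "UNKNOWN")
        text = seg["text"].strip()
        if turns and turns[-1][0] == speaker:
            turns[-1][1].append(text)
        else:
            turns.append((speaker, [text]))
    return "\n".join(f"{speaker}: {' '.join(parts)}" for speaker, parts in turns)


# Shortened, invented sample in the shape of the example output.
sample_segments = [
    {"text": "Thanks for joining everyone.", "speaker": "SPEAKER_00"},
    {"text": "Where are we on the email sequence?", "speaker": "SPEAKER_00"},
    {"text": "Drafts are done.", "speaker": "SPEAKER_01"},
]

readable = format_by_speaker(sample_segments)
print(readable)
```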
Built for sales, support, and ops teams
- 📞 Sales call coaching — review every demo, identify objection-handling gaps, share quote-level highlights
- 🎧 Customer support QA — sample escalations, score CSAT-affecting moments, train new agents
- 🗂️ Meeting archives — searchable transcripts of every quarterly review, planning meeting, and stakeholder sync
- ⚖️ Compliance review — tamper-evident transcripts of regulated conversations
- 🤝 Sales hand-off notes — auto-generate transcripts from discovery calls so AEs and CSMs share the same source of truth
The killer combo: n8n / Zapier trigger on Zoom Cloud Recording → this actor → your Notion/Slack/CRM. Zero manual steps.
Speaker diarization for host and participants
Toggle the Speaker Diarization input to separate every speaker in the meeting. Each segment and each word receives a speaker label (SPEAKER_00, SPEAKER_01, …), so you can build clean transcripts where the host's questions and each participant's answers are clearly distinguished. Powered by pyannote-audio. Charged per audio second; only billed when enabled.
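One thing the per-segment labels make possible is a talk-time breakdown, which can be useful for coaching reviews. The sketch below sums segment durations per speaker label. It assumes every segment carries `speaker`, `start`, and `end` as in the example output, and `talk_time` is a hypothetical helper name. The third sample segment is invented for illustration.

```python
from collections import defaultdict
from typing import Any


def talk_time(segments: list[dict[str, Any]]) -> dict[str, float]:
    """Sum segment durations (in seconds) per speaker label."""
    totals: dict[str, float] = defaultdict(float)
    for seg in segments:
        totals[seg["speaker"]] += seg["end"] - seg["start"]
    # Round at the end to hide floating-point noise from the subtraction.
    return {speaker: round(seconds, 2) for speaker, seconds in totals.items()}


sample_segments = [
    {"speaker": "SPEAKER_00", "start": 0.18, "end": 4.92},
    {"speaker": "SPEAKER_00", "start": 5.40, "end": 8.12},
    {"speaker": "SPEAKER_01", "start": 8.50, "end": 12.00},  # invented segment
]

totals = talk_time(sample_segments)
print(totals)
```

The labels are anonymous (SPEAKER_00, SPEAKER_01, …), so mapping a label to the host or a named participant is left to the caller.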
SRT / VTT export for internal video archives
Every transcription returns ready-to-use srt and vtt subtitle strings. Save the field value as a .srt or .vtt file and:
- Caption an internal video library (Notion, Confluence, SharePoint, Drive)
- Build a searchable training-material archive for new hire onboarding
- Add HTML5 <track> accessibility captions to embedded recordings
Set Timestamp Granularities to word for cue precision down to individual words.
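Saving the subtitle strings to disk could look like the sketch below. It assumes a dataset item with `srt` and `vtt` string fields as in the example output. `save_subtitles` and the file stem are hypothetical names chosen for illustration.

```python
import tempfile
from pathlib import Path
from typing import Any


def save_subtitles(item: dict[str, Any], stem: str, out_dir: str) -> list[Path]:
    """Write the item's srt/vtt strings to <out_dir>/<stem>.srt and <stem>.vtt."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    written: list[Path] = []
    for ext in ("srt", "vtt"):
        content = item.get(ext)
        if content:  # skip fields that are missing or empty
            path = target / f"{stem}.{ext}"
            path.write_text(content, encoding="utf-8")
            written.append(path)
    return written


# Shortened sample in the shape of the example output.
sample_item = {
    "srt": "1\n00:00:00,180 --> 00:00:04,920\nThanks for joining everyone.",
    "vtt": "WEBVTT\n\n00:00:00.180 --> 00:00:04.920\nThanks for joining everyone.",
}

paths = save_subtitles(sample_item, "weekly-sync", tempfile.mkdtemp())
print([p.name for p in paths])
```

Writing as UTF-8 keeps non-English transcripts intact, and the resulting .vtt file can be referenced from an HTML5 <track> element.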
Why teams choose this Zoom transcriber
- ✅ API-first integration — built to plug into n8n, Zapier, Make, or your own webhook on the Zoom Cloud Recording event
- 🎤 Host vs participant separation via pyannote-audio diarization — clean attributed transcripts for sales coaching and QA
- ⏱️ Word-level timestamps for every word — find any quote in a 90-minute call in seconds
- 🎬 SRT and VTT included on every successful run — caption internal training videos with no extra step
- 🌍 99+ languages — global teams, multilingual customers, EU offices all supported
- 🇪🇺 EU-region processing for GDPR-aligned workflows
- 💰 Pay per audio second — no Otter subscription, no minimums; only pay for the recordings you actually transcribe
- 🚀 10× parallel on the paid tier — an entire team's weekly recordings done in minutes, not hours
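To make the word-level timestamp point concrete, here is a rough sketch of locating a quote in the `segments[].words[]` data. It assumes each word entry has `word`, `start`, and `end` fields as in the example output. `find_phrase` is a hypothetical helper that does exact matching after lowercasing and stripping punctuation, not fuzzy search. The sample timings after the first word are invented.

```python
import string
from typing import Any, Optional


def _normalize(token: str) -> str:
    """Lowercase a token and strip surrounding punctuation and whitespace."""
    return token.strip(string.punctuation + string.whitespace).lower()


def find_phrase(
    segments: list[dict[str, Any]], phrase: str
) -> list[tuple[float, float, Optional[str]]]:
    """Return (start, end, speaker) for each exact occurrence of the phrase."""
    target = [_normalize(t) for t in phrase.split()]
    if not target:
        return []
    # Flatten all words so a phrase can span a segment boundary.
    words = [w for seg in segments for w in seg.get("words", [])]
    tokens = [_normalize(w["word"]) for w in words]
    n = len(target)
    hits = []
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == target:
            hits.append((words[i]["start"], words[i + n - 1]["end"], words[i].get("speaker")))
    return hits


sample_segments = [
    {
        "words": [
            {"word": "Thanks", "start": 0.18, "end": 0.62, "speaker": "SPEAKER_00"},
            {"word": "for", "start": 0.62, "end": 0.80, "speaker": "SPEAKER_00"},
            {"word": "joining", "start": 0.80, "end": 1.25, "speaker": "SPEAKER_00"},
            {"word": "everyone.", "start": 1.25, "end": 1.90, "speaker": "SPEAKER_00"},
        ]
    }
]

hits = find_phrase(sample_segments, "for joining")
print(hits)
```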
Use cases
- 📞 Sales teams auto-transcribing every demo and discovery call for coaching, deal review, and CRM enrichment
- 🎧 Customer support leaders sampling escalation calls for QA scoring and agent training
- 🗂️ Operations teams archiving every all-hands, planning meeting, and quarterly review as a searchable text record
- ⚖️ Compliance officers maintaining tamper-evident transcripts of regulated conversations (finance, healthcare, legal)
- 🤝 Account managers generating clean handoff notes between SDR → AE → CSM stages
- 🌐 Localization teams translating multilingual customer calls to English for internal review
- 🧪 Product researchers transcribing user interviews and usability sessions for thematic coding
Pricing & tiers
Pay only for the audio seconds you actually transcribe. No subscriptions, no minimums.
| FREE tier | PAID tier |
|---|---|
| Perfect for testing and small jobs | Built for production volume |
| Up to 5 recordings per run | Unlimited recordings per run |
| 50 MB max per file | 1 GB max per file |
| 200 MB / 20 minutes monthly | Unlimited monthly volume |
| 3 concurrent files | 10 concurrent files (10× parallel) |
| No credit card required | $0.0005 per audio second |
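When testing on the free tier, the per-run and per-file limits in the table can be checked before launching a run. The sketch below is one way to do that, assuming file sizes are known in advance and that "MB" is compared as a plain number. `plan_free_tier_runs` is a hypothetical helper, and the monthly 200 MB / 20 minute cap is not modeled.

```python
FREE_MAX_FILES_PER_RUN = 5   # "Up to 5 recordings per run"
FREE_MAX_FILE_MB = 50        # "50 MB max per file"


def plan_free_tier_runs(
    files: list[tuple[str, float]],
) -> tuple[list[list[str]], list[str]]:
    """Split (url, size_mb) pairs into run-sized batches; return (batches, oversized)."""
    eligible = [url for url, size_mb in files if size_mb <= FREE_MAX_FILE_MB]
    oversized = [url for url, size_mb in files if size_mb > FREE_MAX_FILE_MB]
    batches = [
        eligible[i:i + FREE_MAX_FILES_PER_RUN]
        for i in range(0, len(eligible), FREE_MAX_FILES_PER_RUN)
    ]
    return batches, oversized


# Placeholder URLs and sizes for illustration only.
sample_files = [(f"https://example.com/rec/{n}.m4a", 12.0) for n in range(6)]
sample_files.append(("https://example.com/rec/big.mp4", 84.3))

batches, oversized = plan_free_tier_runs(sample_files)
print(len(batches), oversized)
```

Each batch would then be passed as the `audioUrls` value of a separate run. Files reported as oversized need the paid tier.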
Optional add-ons (only billed when enabled):
| Feature | Price |
|---|---|
| Speaker diarization | $0.0001 per audio second |
| Translate to English | $0.0003 per audio second |
| EU-region processing | $0.0007 per audio second (replaces base $0.0005) |
A 60-minute meeting (3,600 audio seconds) with diarization on the paid tier costs approximately $2.16: $1.80 for transcription (3,600 × $0.0005) plus $0.36 for diarization (3,600 × $0.0001). Compare that to per-seat meeting-transcription SaaS pricing.
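The arithmetic can be reproduced with a small estimator built from the rates in the tables above. This is a sketch under one assumption not stated on this page: that the translation add-on stacks on the base rate the same way the diarization add-on does in the worked example. `estimate_cost` is a hypothetical helper, and actual billing may differ.

```python
from decimal import Decimal

# Per-audio-second rates copied from the pricing tables.
BASE_RATE = Decimal("0.0005")
EU_RATE = Decimal("0.0007")           # replaces the base rate when enabled
DIARIZATION_RATE = Decimal("0.0001")
TRANSLATION_RATE = Decimal("0.0003")  # assumed to stack like diarization


def estimate_cost(
    audio_seconds: float,
    diarization: bool = False,
    translation: bool = False,
    eu_region: bool = False,
) -> Decimal:
    """Estimate paid-tier cost in USD, rounded to cents."""
    rate = EU_RATE if eu_region else BASE_RATE
    if diarization:
        rate += DIARIZATION_RATE
    if translation:
        rate += TRANSLATION_RATE
    return (Decimal(str(audio_seconds)) * rate).quantize(Decimal("0.01"))


hour_with_diarization = estimate_cost(3600, diarization=True)
print(hour_with_diarization)
```

Using `Decimal` rather than floats keeps the per-second rates exact, so the result matches the figure quoted above.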
Integration examples
JavaScript / Node.js
```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_APIFY_TOKEN' });

// Start the actor and wait for the run to finish.
const run = await client.actor('sian.agency/transcribe-zoom-meeting-to-text').call({
    audioUrls: ['https://your-org.zoom.us/recording/share/...'],
    speakerDiarization: true,
});

// Each recording becomes one item in the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items[0].transcript);
console.log(items[0].srt);
```
Python
```python
from apify_client import ApifyClient

client = ApifyClient('YOUR_APIFY_TOKEN')

# Start the actor and wait for the run to finish.
run = client.actor('sian.agency/transcribe-zoom-meeting-to-text').call(run_input={
    'audioUrls': ['https://your-org.zoom.us/recording/share/...'],
    'speakerDiarization': True,
})

# Each recording becomes one item in the run's default dataset.
items = client.dataset(run['defaultDatasetId']).list_items().items
print(items[0]['transcript'])
print(items[0]['vtt'])
```
cURL
```bash
curl -X POST 'https://api.apify.com/v2/acts/sian.agency~transcribe-zoom-meeting-to-text/run-sync-get-dataset-items?token=YOUR_APIFY_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{
    "audioUrls": ["https://your-org.zoom.us/recording/share/..."],
    "speakerDiarization": true
  }'
```
n8n / Zapier / Make
Wire this actor onto the Zoom Cloud Recording Completed webhook. Pass the recording URL into audioUrls, capture transcript, segments[].words[], srt, and vtt from the dataset, then route to Notion (meeting notes), Slack (digest), Salesforce/HubSpot (call attached to deal), or Google Sheets (compliance log) — no transformation step needed.
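The dataset record can be routed as-is, but for a destination like a spreadsheet log a flat row is sometimes easier to work with. The sketch below shows one possible mapping using only fields from the example output. `to_log_row` and its column names are hypothetical, not part of the actor.

```python
from typing import Any


def to_log_row(item: dict[str, Any], source_url: str, excerpt_chars: int = 120) -> dict[str, Any]:
    """Flatten one dataset item into a single spreadsheet-style row."""
    return {
        "source_url": source_url,
        "success": bool(item.get("success")),
        "duration_min": round(item.get("duration", 0) / 60, 1),
        "languages": ",".join(item.get("languages", [])),
        "speaker_count": len(item.get("speakers", [])),
        "excerpt": item.get("transcript", "")[:excerpt_chars],
    }


# Values taken from the example output above; the URL is a placeholder.
sample_item = {
    "transcript": "Thanks for joining everyone.",
    "duration": 1842.5,
    "speakers": ["SPEAKER_00", "SPEAKER_01", "SPEAKER_02"],
    "languages": ["en"],
    "success": True,
}

row = to_log_row(sample_item, "https://example.com/rec/1.m4a")
print(row)
```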
FAQ
How accurate is Zoom meeting transcription? Powered by an industrial speech-to-text pipeline tuned for natural conversation. Accuracy is typically 95–99% on Zoom audio with clean microphones, lower on speakerphone or noisy environments. Word-level timestamps are returned even when accuracy is imperfect, so you can verify and correct quote attributions quickly.
What audio and video formats are supported? MP4, MOV, M4A, MP3, WAV, FLAC, AAC, OPUS, OGG, WebM. Zoom Cloud Recordings export as MP4 or M4A, both work directly. Max 50 MB per file on the free tier, 1 GB on the paid tier.
Can I transcribe non-English meetings? Yes — auto-detection across 99+ languages including Spanish, French, German, Mandarin, Japanese, Portuguese, Arabic, Hindi, and many more. Toggle Translate to English to receive an English transcript alongside the timestamps.
Is speaker diarization included? Yes, opt-in via the Speaker Diarization toggle. Each segment and word gets labeled SPEAKER_00, SPEAKER_01, etc. Powered by pyannote-audio. Billed at $0.0001 per audio second only when enabled.
How does pricing work? Pay-per-audio-second. The free tier covers small jobs and testing without a credit card. The paid tier is $0.0005 per second of audio, plus optional add-ons for diarization, translation, and EU processing. No subscriptions, no per-seat fees.
Can I integrate this with n8n, Zapier, or Make? Yes — that's the primary use case. The actor exposes a standard Apify run/dataset API. Trigger on the Zoom Cloud Recording Completed event, run the actor, route the dataset record into your downstream tools.
Does this work with Zoom cloud recordings or only local files? Both. Pass cloud recording URLs from the Zoom API directly, or upload local MP4/M4A exports. The actor downloads and transcribes either source.
What about Microsoft Teams, Google Meet, or Webex recordings? They work too. Despite the Zoom-specific name, this actor accepts a meeting recording URL or file from any platform; Zoom is just the most common source.
Legal disclaimer
Use this actor only on meeting recordings you have rights to transcribe — your own meetings, recordings with all participants' consent, or properly licensed content. Some jurisdictions require all-party consent for recording calls; you are responsible for compliance. The actor does not retain audio or transcripts beyond the run's lifetime. EU-region processing is available via the EU Processing toggle for GDPR-aligned workflows. SIÁN Agency provides this actor as-is.
Support
Join the Telegram support group, email support@sian-agency.online, or open an issue on the SIÁN Agency Apify Store page.
More from SIÁN Agency
Platform-specific scrapers + transcribers:
- Instagram AI Transcript Extractor
- Best TikTok AI Transcript Extractor
- YouTube Shorts AI Transcript Extractor
- Facebook AI Transcript Extractor
Browse the full SIÁN Agency Apify Store for all available actors.