Pricing

from $0.15 / 1,000 audio seconds processed

Transcribe Zoom Meeting to Text — Bulk Meeting Transcription

Transcribe Zoom recordings to text in bulk. Speaker labels for host and participants, word-level timestamps, SRT/VTT export. 99+ languages. Try free.

Developer

SIÁN OÜ

Last modified: 11 days ago

How to transcribe a Zoom recording in 4 steps

1. Drop in Zoom recording URLs: pull URLs from the Zoom Cloud Recording API (/v2/users/{userId}/recordings) and feed them straight into the audioUrls field, or upload local .mp4/.m4a exports via audioFiles.
2. Pick your options: auto-detect the language, toggle speaker diarization to separate host from participants, and optionally translate non-English meetings to English.
3. Run the actor: recordings process 10 at a time in parallel on the paid tier, so an entire team's weekly recordings can be transcribed in minutes.
4. Download results: every recording lands in the dataset with the transcript, segment-level and word-level timestamps, speaker labels, and ready-to-use SRT/VTT subtitle strings.

Supported formats: MP4, MOV, M4A, MP3, WAV, FLAC, AAC, OPUS, OGG, WebM. Max 1 GB per file on the paid tier.
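
Step 1 can be sketched in a few lines of Python. This is only an illustration: it assumes the Zoom recordings response carries a meetings list whose recording_files entries have file_type and download_url fields, and the helper names and sample URLs are made up for the example. Whether the actor can fetch a Zoom download URL without extra authentication is not stated here, so check that against your Zoom account settings.

```python
# Sketch: turn a Zoom "list recordings" response into input for this actor.
# Assumption: the payload has meetings[].recording_files[] with "file_type"
# and "download_url" keys. Helper names are hypothetical.
from typing import Dict, List

ZOOM_MEDIA_TYPES = {"MP4", "M4A"}  # the two Zoom export formats named above


def collect_recording_urls(payload: Dict) -> List[str]:
    """Return download URLs for the MP4/M4A files in a recordings payload."""
    urls = []
    for meeting in payload.get("meetings", []):
        for rec in meeting.get("recording_files", []):
            # Skip chat logs, timelines and entries without a URL.
            if rec.get("file_type") in ZOOM_MEDIA_TYPES and rec.get("download_url"):
                urls.append(rec["download_url"])
    return urls


def build_actor_input(urls: List[str], diarize: bool = True) -> Dict:
    """Shape the actor input using the audioUrls and speakerDiarization fields."""
    return {"audioUrls": urls, "speakerDiarization": diarize}


sample_payload = {
    "meetings": [
        {
            "recording_files": [
                {"file_type": "MP4", "download_url": "https://example.com/rec/1.mp4"},
                {"file_type": "CHAT", "download_url": "https://example.com/rec/1.txt"},
                {"file_type": "M4A", "download_url": "https://example.com/rec/1.m4a"},
            ]
        }
    ]
}

actor_input = build_actor_input(collect_recording_urls(sample_payload))
print(actor_input)
```

The resulting dictionary can be passed as the run input in any of the integration examples further down.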

Example output — Zoom meeting transcript with speaker labels

{
  "transcript": "Thanks for joining everyone — let's start with the launch checklist. So Sarah, where are we on the email sequence?...",
  "detected_language": "en",
  "duration": 1842.5,
  "segments": [
    {
      "id": 0,
      "text": "Thanks for joining everyone — let's start with the launch checklist.",
      "start": 0.18,
      "end": 4.92,
      "speaker": "SPEAKER_00",
      "language": "en",
      "words": [
        { "word": "Thanks", "start": 0.18, "end": 0.62, "speaker": "SPEAKER_00" }
      ]
    },
    {
      "id": 1,
      "text": "So Sarah, where are we on the email sequence?",
      "start": 5.40,
      "end": 8.12,
      "speaker": "SPEAKER_00",
      "language": "en",
      "words": []
    }
  ],
  "srt": "1\n00:00:00,180 --> 00:00:04,920\nThanks for joining everyone — let's start with the launch checklist.\n\n2\n00:00:05,400 --> 00:00:08,120\nSo Sarah, where are we on the email sequence?",
  "vtt": "WEBVTT\n\n00:00:00.180 --> 00:00:04.920\nThanks for joining everyone — let's start with the launch checklist.\n\n00:00:05.400 --> 00:00:08.120\nSo Sarah, where are we on the email sequence?",
  "speakers": ["SPEAKER_00", "SPEAKER_01", "SPEAKER_02"],
  "languages": ["en"],
  "fileSizeMB": 84.3,
  "success": true
}

Every result includes the full transcript, segment-level timestamps, word-level timestamps, language detection, recording duration in seconds, file size, ready-to-use srt and vtt subtitle strings, and (when speaker diarization is enabled) speaker labels per segment and per word.
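
As a rough illustration of working with this structure, the sketch below merges consecutive segments from the same speaker into a readable, attributed transcript. It relies only on the segments, text, and speaker fields shown in the example output; the function name and the "UNKNOWN" fallback (used when diarization was not enabled) are choices made for this example.

```python
# Sketch: build a speaker-attributed transcript from one dataset item.
from itertools import groupby
from typing import Dict


def attributed_transcript(item: Dict) -> str:
    """Join consecutive segments by the same speaker into one line each."""
    lines = []
    segments = item.get("segments", [])
    # groupby only merges adjacent segments, which preserves turn order.
    for speaker, turn in groupby(segments, key=lambda s: s.get("speaker", "UNKNOWN")):
        text = " ".join(seg["text"].strip() for seg in turn)
        lines.append(f"{speaker}: {text}")
    return "\n".join(lines)


sample_item = {
    "segments": [
        {"text": "Thanks for joining everyone.", "speaker": "SPEAKER_00"},
        {"text": "So Sarah, where are we on the email sequence?", "speaker": "SPEAKER_00"},
        {"text": "Drafts are done.", "speaker": "SPEAKER_01"},
    ]
}

print(attributed_transcript(sample_item))
```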

Built for sales, support, and ops teams

📞 Sales call coaching — review every demo, identify objection-handling gaps, share quote-level highlights
🎧 Customer support QA — sample escalations, score CSAT-affecting moments, train new agents
🗂️ Meeting archives — searchable transcripts of every quarterly review, planning meeting, and stakeholder sync
⚖️ Compliance review — tamper-evident transcripts of regulated conversations
🤝 Sales hand-off notes — auto-generate transcripts from discovery calls so AEs and CSMs share the same source of truth

The killer combo: n8n / Zapier trigger on Zoom Cloud Recording → this actor → your Notion/Slack/CRM. Zero manual steps.

Speaker diarization for host and participants

Toggle the Speaker Diarization input to separate every speaker in the meeting. Each segment and each word receives a speaker label (SPEAKER_00, SPEAKER_01, …), so you can build clean transcripts where the host's questions and each participant's answers are clearly distinguished. Powered by pyannote-audio. Charged per audio second; only billed when enabled.
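
One thing the per-segment labels make possible is a talk-time breakdown, which is handy for sales coaching. The sketch below sums segment durations per speaker label. It assumes every segment carries start, end, and speaker as in the example output; the function name is hypothetical, and mapping SPEAKER_00 to a real person is left to you.

```python
# Sketch: total speaking time per speaker label, in seconds.
from collections import defaultdict
from typing import Dict, List


def talk_time(segments: List[Dict]) -> Dict[str, float]:
    """Sum (end - start) for each speaker label, rounded to 2 decimals."""
    totals = defaultdict(float)
    for seg in segments:
        totals[seg["speaker"]] += seg["end"] - seg["start"]
    return {speaker: round(seconds, 2) for speaker, seconds in totals.items()}


sample_segments = [
    {"start": 0.18, "end": 4.92, "speaker": "SPEAKER_00"},
    {"start": 5.40, "end": 8.12, "speaker": "SPEAKER_00"},
    {"start": 8.50, "end": 12.00, "speaker": "SPEAKER_01"},
]

print(talk_time(sample_segments))
```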

SRT / VTT export for internal video archives

Every transcription returns ready-to-use srt and vtt subtitle strings. Save the field value as a .srt or .vtt file and:

Caption an internal video library (Notion, Confluence, SharePoint, Drive)
Build a searchable training-material archive for new hire onboarding
Add HTML5 <track> accessibility captions to embedded recordings

Set Timestamp Granularities to word for cue precision down to individual words.
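
Saving the two fields to disk takes only a few lines. The sketch below writes whichever of srt and vtt is present on a dataset item; the function name, output directory, and file stem are placeholders for this example.

```python
# Sketch: write the srt and vtt strings from a dataset item to files.
import tempfile
from pathlib import Path
from typing import Dict, List


def save_subtitles(item: Dict, out_dir: str, stem: str) -> List[Path]:
    """Write <stem>.srt and <stem>.vtt for the fields present on the item."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for ext in ("srt", "vtt"):
        content = item.get(ext)
        if content:  # skip missing or empty subtitle strings
            path = target / f"{stem}.{ext}"
            path.write_text(content, encoding="utf-8")
            written.append(path)
    return written


sample_item = {
    "srt": "1\n00:00:00,180 --> 00:00:04,920\nThanks for joining everyone.",
    "vtt": "WEBVTT\n\n00:00:00.180 --> 00:00:04.920\nThanks for joining everyone.",
}

with tempfile.TemporaryDirectory() as tmp:
    for path in save_subtitles(sample_item, tmp, "weekly-sync"):
        print(path.name)
```

The .vtt file is the one an HTML5 <track> element expects; the .srt file suits most desktop players and editors.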

Why teams choose this Zoom transcriber

✅ API-first integration — built to plug into n8n, Zapier, Make, or your own webhook on the Zoom Cloud Recording event
🎤 Host vs participant separation via pyannote-audio diarization — clean attributed transcripts for sales coaching and QA
⏱️ Word-level timestamps for every word — find any quote in a 90-minute call in seconds
🎬 SRT and VTT included on every successful run — caption internal training videos with no extra step
🌍 99+ languages — global teams, multilingual customers, EU offices all supported
🇪🇺 EU-region processing for GDPR-aligned workflows
💰 Pay per audio second — no Otter subscription, no minimums; only pay for the recordings you actually transcribe
🚀 10× parallel on the paid tier — an entire team's weekly recordings done in minutes, not hours
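
To make the word-level timestamp point concrete, here is a minimal sketch of locating a quote in a result item. It assumes words are populated under segments[].words with word, start, and end keys as in the example output. The matching rule (lowercase, punctuation stripped, exact word sequence) is a simplification chosen for this example, not something the actor provides.

```python
# Sketch: find where a phrase was spoken using word-level timestamps.
import re
from typing import Dict, Optional, Tuple


def _norm(token: str) -> str:
    """Lowercase a token and drop punctuation so 'Sarah,' matches 'sarah'."""
    return re.sub(r"[^\w']+", "", token.lower())


def find_quote(item: Dict, phrase: str) -> Optional[Tuple[float, float]]:
    """Return (start, end) seconds of the first match, or None if absent."""
    words = [
        w
        for seg in item.get("segments", [])
        for w in seg.get("words", [])
        if _norm(w["word"])  # ignore tokens that are pure punctuation
    ]
    tokens = [_norm(w["word"]) for w in words]
    target = [t for t in (_norm(p) for p in phrase.split()) if t]
    if not target:
        return None
    for i in range(len(tokens) - len(target) + 1):
        if tokens[i : i + len(target)] == target:
            return words[i]["start"], words[i + len(target) - 1]["end"]
    return None


sample_item = {
    "segments": [
        {
            "words": [
                {"word": "So", "start": 5.40, "end": 5.58},
                {"word": "Sarah,", "start": 5.62, "end": 6.01},
                {"word": "where", "start": 6.05, "end": 6.30},
            ]
        }
    ]
}

print(find_quote(sample_item, "Sarah where"))
```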

Use cases

📞 Sales teams auto-transcribing every demo and discovery call for coaching, deal review, and CRM enrichment
🎧 Customer support leaders sampling escalation calls for QA scoring and agent training
🗂️ Operations teams archiving every all-hands, planning meeting, and quarterly review as a searchable text record
⚖️ Compliance officers maintaining tamper-evident transcripts of regulated conversations (finance, healthcare, legal)
🤝 Account managers generating clean handoff notes between SDR → AE → CSM stages
🌐 Localization teams translating multilingual customer calls to English for internal review
🧪 Product researchers transcribing user interviews and usability sessions for thematic coding

Pricing & tiers

Pay only for the audio seconds you actually transcribe. No subscriptions, no minimums.

FREE tier	PAID tier
Perfect for testing and small jobs	Built for production volume
Up to 5 recordings per run	Unlimited recordings per run
50 MB max per file	1 GB max per file
200 MB / 20 minutes monthly	Unlimited monthly volume
3 concurrent files	10 concurrent files (10× parallel)
No credit card required	$0.0005 per audio second
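
If you are testing on the free tier, the 5-recordings-per-run limit means a longer list of URLs has to be split across several runs. A minimal sketch of that batching follows; the function name is hypothetical, and it does not model the 50 MB per-file or the monthly caps.

```python
# Sketch: split a list of recording URLs into free-tier-sized runs.
from typing import List

FREE_TIER_MAX_PER_RUN = 5  # "Up to 5 recordings per run" from the table above


def plan_runs(urls: List[str], max_per_run: int = FREE_TIER_MAX_PER_RUN) -> List[List[str]]:
    """Chunk URLs into consecutive batches of at most max_per_run items."""
    if max_per_run < 1:
        raise ValueError("max_per_run must be at least 1")
    return [urls[i : i + max_per_run] for i in range(0, len(urls), max_per_run)]


# Placeholder URLs for illustration only.
urls = [f"https://example.com/rec/{n}.mp4" for n in range(12)]
batches = plan_runs(urls)
print([len(batch) for batch in batches])
```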

Optional add-ons (only billed when enabled):

Feature	Price
Speaker diarization	$0.0001 per audio second
Translate to English	$0.0003 per audio second
EU-region processing	$0.0007 per audio second (replaces base $0.0005)

A 60-minute meeting with diarization on the paid tier costs approximately $2.16 ($1.80 transcription + $0.36 diarization). Compare to per-seat meeting-transcription SaaS pricing.
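
The arithmetic behind that figure can be reproduced with a small estimator built from the per-second rates listed above. This is a sketch for budgeting only: the function name is made up, rounding to whole cents is an assumption, and the amount actually billed may differ.

```python
# Sketch: estimate paid-tier cost from the per-second rates in this section.
from decimal import Decimal

BASE_RATE = Decimal("0.0005")         # standard transcription
EU_RATE = Decimal("0.0007")           # replaces the base rate when enabled
DIARIZATION_RATE = Decimal("0.0001")  # add-on
TRANSLATION_RATE = Decimal("0.0003")  # add-on


def estimate_cost(seconds, diarization=False, translate=False, eu=False) -> Decimal:
    """Return the estimated cost in USD, rounded to cents."""
    rate = EU_RATE if eu else BASE_RATE
    if diarization:
        rate += DIARIZATION_RATE
    if translate:
        rate += TRANSLATION_RATE
    # Decimal(str(...)) avoids binary float noise for inputs like 1842.5.
    return (rate * Decimal(str(seconds))).quantize(Decimal("0.01"))


# 60-minute meeting with diarization: 3600 * (0.0005 + 0.0001)
print(estimate_cost(3600, diarization=True))
```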

Integration examples

JavaScript / Node.js

import { ApifyClient } from 'apify-client';

// Replace with your own Apify API token.
const client = new ApifyClient({ token: 'YOUR_APIFY_TOKEN' });

// Start the actor and wait for the run to finish.
const run = await client.actor('sian.agency/transcribe-zoom-meeting-to-text').call({
    audioUrls: ['https://your-org.zoom.us/recording/share/...'],
    speakerDiarization: true,
});

// Each recording becomes one item in the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items[0].transcript);
console.log(items[0].srt);

Python

from apify_client import ApifyClient

# Replace with your own Apify API token.
client = ApifyClient('YOUR_APIFY_TOKEN')

# Start the actor and wait for the run to finish.
run = client.actor('sian.agency/transcribe-zoom-meeting-to-text').call(run_input={
    'audioUrls': ['https://your-org.zoom.us/recording/share/...'],
    'speakerDiarization': True,
})

# Each recording becomes one item in the run's default dataset.
items = client.dataset(run['defaultDatasetId']).list_items().items
print(items[0]['transcript'])
print(items[0]['vtt'])

cURL

curl -X POST 'https://api.apify.com/v2/acts/sian.agency~transcribe-zoom-meeting-to-text/run-sync-get-dataset-items?token=YOUR_APIFY_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{
    "audioUrls": ["https://your-org.zoom.us/recording/share/..."],
    "speakerDiarization": true
  }'

n8n / Zapier / Make

Wire this actor onto the Zoom Cloud Recording Completed webhook. Pass the recording URL into audioUrls, capture transcript, segments[].words[], srt, and vtt from the dataset, then route to Notion (meeting notes), Slack (digest), Salesforce/HubSpot (call attached to deal), or Google Sheets (compliance log) — no transformation step needed.
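
If you handle the webhook in your own code rather than a no-code tool, the mapping step might look like the sketch below. It assumes the event body has an event name of recording.completed and a payload.object.recording_files list with file_type and download_url fields; verify those names against Zoom's webhook reference before relying on them. The function name is hypothetical.

```python
# Sketch: map a Zoom "recording completed" webhook body to actor input.
# Assumption: body looks like
#   {"event": "recording.completed",
#    "payload": {"object": {"recording_files": [...]}}}
from typing import Dict, Optional


def input_from_webhook(body: Dict) -> Optional[Dict]:
    """Return actor input for a completed recording, or None to ignore."""
    if body.get("event") != "recording.completed":
        return None  # unrelated event, nothing to transcribe
    files = body.get("payload", {}).get("object", {}).get("recording_files", [])
    urls = [
        f["download_url"]
        for f in files
        if f.get("file_type") in {"MP4", "M4A"} and f.get("download_url")
    ]
    if not urls:
        return None
    return {"audioUrls": urls, "speakerDiarization": True}


sample_body = {
    "event": "recording.completed",
    "payload": {
        "object": {
            "recording_files": [
                {"file_type": "MP4", "download_url": "https://example.com/rec/9.mp4"},
                {"file_type": "TIMELINE", "download_url": "https://example.com/rec/9.json"},
            ]
        }
    },
}

print(input_from_webhook(sample_body))
```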

FAQ

How accurate is Zoom meeting transcription? The actor runs on an industrial-grade speech-to-text pipeline tuned for natural conversation. Accuracy is typically 95–99% on Zoom audio recorded with clean microphones, and lower on speakerphone or in noisy environments. Word-level timestamps are returned even when accuracy is imperfect, so you can verify and correct quote attributions quickly.

What audio and video formats are supported? MP4, MOV, M4A, MP3, WAV, FLAC, AAC, OPUS, OGG, WebM. Zoom Cloud Recordings export as MP4 or M4A, both work directly. Max 50 MB per file on the free tier, 1 GB on the paid tier.

Can I transcribe non-English meetings? Yes — auto-detection across 99+ languages including Spanish, French, German, Mandarin, Japanese, Portuguese, Arabic, Hindi, and many more. Toggle Translate to English to receive an English transcript alongside the timestamps.

Is speaker diarization included? Yes, opt-in via the Speaker Diarization toggle. Each segment and word gets labeled SPEAKER_00, SPEAKER_01, etc. Powered by pyannote-audio. Billed at $0.0001 per audio second only when enabled.

How does pricing work? Pay-per-audio-second. The free tier covers small jobs and testing without a credit card. The paid tier is $0.0005 per second of audio, plus optional add-ons for diarization, translation, and EU processing. No subscriptions, no per-seat fees.

Can I integrate this with n8n, Zapier, or Make? Yes — that's the primary use case. The actor exposes a standard Apify run/dataset API. Trigger on the Zoom Cloud Recording Completed event, run the actor, route the dataset record into your downstream tools.

Does this work with Zoom cloud recordings or only local files? Both. Pass cloud recording URLs from the Zoom API directly, or upload local MP4/M4A exports. The actor downloads and transcribes either source.

Does it work with Microsoft Teams, Google Meet, or Webex recordings? Yes. Despite the Zoom-specific name, this actor accepts a meeting recording URL or file from any platform. Zoom is just the most common source.

Legal disclaimer

Use this actor only on meeting recordings you have rights to transcribe — your own meetings, recordings with all participants' consent, or properly licensed content. Some jurisdictions require all-party consent for recording calls; you are responsible for compliance. The actor does not retain audio or transcripts beyond the run's lifetime. EU-region processing is available via the EU Processing toggle for GDPR-aligned workflows. SIÁN Agency provides this actor as-is.

Support

Join the Telegram support group, email apify@sian-agency.online, or open an issue on the SIÁN Agency Apify Store page.

More from SIÁN Agency

Platform-specific scrapers + transcribers:

Browse the full SIÁN Agency Apify Store for all available actors.

Transcribe Video to Text & Audio to Text — 99+ Languages

sian.agency/INCREDIBLY-FAST-audio-transcriber

Transcribe video to text and audio to text in bulk on Apify. 99+ languages, word-level timestamps, speaker diarization, SRT/VTT export. Try free.

SIÁN OÜ

Transcribe Voice Memo to Text — Speaker Labels & Timestamps

sian.agency/transcribe-voice-memo-to-text

Transcribe iPhone and Android voice memos to text. Speaker labels, word-level timestamps, SRT/VTT. Bulk upload, 99+ languages. Try free.

SIÁN OÜ

Transcribe Podcast to Text — Show Notes, SRT & Timestamps

sian.agency/transcribe-podcast-to-text

Transcribe podcast episodes to text in bulk. Speaker labels for hosts and guests, word-level timestamps, SRT/VTT for show notes. 99+ languages.

SIÁN OÜ

Transcribe Interview to Text — for Journalists & Researchers

sian.agency/transcribe-interview-to-text

Transcribe interviews and recorded conversations to text. Speaker labels for interviewer and guest, word-level timestamps, SRT/VTT. Try free.

SIÁN OÜ
