YouTube Transcript Scraper + Whisper AI Fallback

Pricing: from $0.70 / 1,000 transcripts extracted
Extract YouTube transcripts from any video — even without captions. Whisper AI fallback, LLM-ready output, SRT/VTT export. No API key. $0.001/video.

Developer: CodePoetry (Maintained by Community)

YouTube Transcript Scraper — Captions + AI Speech-to-Text

Extract transcripts from any YouTube video — even when captions don't exist.

Most transcript tools stop working when a video has no captions. This actor doesn't. It pulls native captions when YouTube has them, and transcribes the audio with built-in speech-to-text AI when it doesn't. No external API key required.

Give it a single video, a full playlist, or an entire channel. Get transcripts in JSON, plain text, SRT, VTT, or an LLM-ready format — ready to download or feed into a pipeline. Built for bulk runs: concurrent processing, pay-per-result pricing, and no wasted resources on requests that don't need them.

New to Apify? Every new account gets $5 in free credits — no credit card needed. That's enough to transcribe an entire YouTube channel (~4,900 native transcripts).


How to scrape YouTube transcripts

  1. Click Try for free on this actor's page.
  2. Paste one or more YouTube URLs into the YouTube URLs field:
    • Individual videos (youtube.com/watch?v=...)
    • Playlists (youtube.com/playlist?list=...)
    • Channels (youtube.com/@channelname)
  3. Choose your Output Formats. Not sure? Start with Plain Text — it's the words as one block of text.
  4. Set Caption Languages if you need something other than English (default: en). Use two-letter codes: es = Spanish, fr = French, de = German.
  5. Videos without captions are automatically transcribed by the built-in AI model. Set a Max AI Minutes cap to control spend (default: 30 minutes).
  6. Click Start. A single video with captions finishes in under 30 seconds. A 100-video playlist typically finishes in 2–3 minutes.
  7. Download results as JSON, CSV, or Excel — or consume them via the Apify API.

A single video costs ~$0.006. No subscription, no commitment — pay only for what you use.


How it works

Step 1 — Expand. Paste one or more URLs — single videos, playlists, or channel URLs. The actor resolves them into individual video URLs automatically.

Step 2 — Extract. For each video, the actor checks for native captions (manual or auto-generated) in your requested languages. If captions exist, they are fetched and formatted immediately — no audio download needed.

Step 3 — Transcribe (when needed). If no captions are found, the actor automatically downloads the audio and transcribes it using a bundled faster-whisper model running on Apify's compute — no external transcription API needed. The output has the same structure as native caption output. Use Max AI minutes per run and Skip AI for long videos to control AI spend.

One failed video never stops the batch. Every item in the output dataset has an error_code field so you can filter results programmatically.
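Because error items carry an error_code field and successful items do not, a run's dataset can be partitioned in a few lines. A minimal sketch — the sample items are illustrative, not real output:

```python
# Split a run's dataset items into successes and failures by error_code.
# Field names follow this actor's documented output schema.

def split_results(items):
    """Return (transcripts, errors) from a list of dataset items."""
    transcripts = [it for it in items if "error_code" not in it]
    errors = [it for it in items if "error_code" in it]
    return transcripts, errors

items = [
    {"metadata": {"id": "abc"}, "transcript_text": "hello world"},
    {"url": "https://www.youtube.com/watch?v=xyz", "error_code": "LANGUAGE_NOT_FOUND"},
]
ok, failed = split_results(items)
print(len(ok), len(failed))  # 1 1
```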


What you get

Every output item contains full video metadata and your transcript in the formats you requested.

Video metadata

Field | Type | Description
metadata.id | string | YouTube video ID
metadata.title | string | Video title
metadata.url | string | Canonical watch URL
metadata.channel | string | Channel display name
metadata.channel_id | string | Channel ID (UC-prefixed)
metadata.channel_url | string | Channel URL
metadata.description | string | Full video description
metadata.duration | integer | Duration in seconds
metadata.view_count | integer | Total views
metadata.like_count | integer | Total likes
metadata.upload_date | string | Upload date (YYYYMMDD)
metadata.thumbnail | string | Highest-resolution thumbnail URL
metadata.tags | array | Creator-set tags
metadata.categories | array | YouTube categories

Transcript fields

Field | Type | Description
language | string | Language code of the transcript (e.g. en, zh-TW)
is_auto_generated | boolean | true if YouTube auto-generated the captions
is_ai_generated | boolean | true if transcribed by the built-in AI model
transcript_json | array | Timestamped segments [{start, end, text}]. When wordLevel: true, each segment also has a words array: [{start, text}] for native captions, [{start, end, text}] for AI transcriptions.
transcript_text | string | Plain text transcript
transcript_llm | string | Text with [Music], (laughter), and filler tokens stripped — ready for AI pipelines
transcript_srt | string | SRT subtitle format. Always present for AI-transcribed items even if not in outputFormats.
transcript_vtt | string | WebVTT format
language_probability | number | AI model's confidence in the detected language (0–1). AI transcription only.
language_was_forced | boolean | true when forceWhisperLanguage was set. AI transcription only.
ai_duration_charged_min | integer | Minutes of AI time charged for this video. AI transcription only.
ai_speech_duration_sec | number | Actual speech duration detected by the model, in seconds (informational). AI transcription only.
available_languages | array | Caption language codes YouTube provides on this video. Only present on NO_CAPTIONS_AVAILABLE and LANGUAGE_NOT_FOUND error items — use them to refine your languages input.
error_code | string | Structured error code when extraction failed. See Error codes for the full reference table.
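The transcript_json segments are plain {start, end, text} objects, so post-processing needs no special tooling. A small sketch that renders segments as timestamped lines — the sample data mirrors the native caption example in the Output examples section:

```python
# Render transcript_json segments as one timestamped line per segment.

def format_segments(segments):
    lines = []
    for seg in segments:
        # zero-padded start time, one decimal place (e.g. 0018.5)
        lines.append(f"[{seg['start']:06.1f}] {seg['text']}")
    return "\n".join(lines)

segments = [
    {"start": 18.5, "end": 21.0, "text": "We're no strangers to love"},
    {"start": 21.0, "end": 24.5, "text": "You know the rules and so do I"},
]
print(format_segments(segments))
```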

Use cases

Here are ten real workflows built on this actor — from quick one-off summaries to scheduled competitive intelligence pipelines.

1. Claude Desktop / Claude.ai MCP integration

Connect this actor as an MCP server so Claude Desktop, Claude.ai Projects, Cursor, or any other MCP-compatible AI client can fetch a transcript just by being handed a YouTube URL. Ask Claude to "summarise this video" or "extract the key points from this lecture" — no copy-pasting required.

Recommended settings: outputFormats: ["llm"]


2. YouTube Shorts — per-word karaoke captions

YouTube Shorts often have auto-generated captions. Enable wordLevel: true to get per-word start times from the transcript_json field. Feed the result into a caption editor (CapCut, DaVinci Resolve, Adobe Premiere) to produce word-by-word highlighted captions — the "karaoke" style popular on short-form video.

Recommended settings: wordLevel: true, outputFormats: ["json", "srt"], subType: "auto"
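Turning the words array into karaoke cues takes one pass. A sketch, assuming native-caption word entries that carry only a start time (per the transcript fields table), so each word's end is inferred from the next word's start or the segment end:

```python
# Derive per-word display cues from a wordLevel transcript_json segment.
# Native caption words have only a start time; a word "ends" when the
# next word begins (or when the segment ends).

def word_cues(segment):
    words = segment["words"]
    cues = []
    for i, w in enumerate(words):
        end = words[i + 1]["start"] if i + 1 < len(words) else segment["end"]
        cues.append({"start": w["start"], "end": end, "text": w["text"]})
    return cues

seg = {"start": 0.0, "end": 2.0,
       "words": [{"start": 0.0, "text": "Never"}, {"start": 0.6, "text": "gonna"}]}
print(word_cues(seg))  # [{'start': 0.0, 'end': 0.6, 'text': 'Never'}, {'start': 0.6, 'end': 2.0, 'text': 'gonna'}]
```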


3. Build a searchable knowledge base (RAG)

Bulk-extract every video from a company channel, educational YouTube account, or podcast series. Store the transcript_llm text in a vector database (Pinecone, Weaviate, pgvector) indexed by metadata.id and metadata.title. Use it as a retrieval-augmented generation (RAG) corpus so your chatbot can answer questions grounded in the exact video content.

Recommended settings: outputFormats: ["llm"], maxResults: 500. AI fallback is always active — videos without captions are transcribed automatically.


4. NLP / sentiment analysis pipeline

Extract transcripts from a brand's channel, a competitor's channel, or a set of product-review videos. Pipe transcript_text into an NLP pipeline (spaCy, HuggingFace Transformers, OpenAI) for sentiment scoring, named entity extraction, topic modeling, or keyword frequency. Useful for brand monitoring and competitive intelligence.

Recommended settings: outputFormats: ["text", "llm"], subType: "both"


5. LLM training data collection

Curate domain-specific transcripts from niche YouTube channels (medical lectures, legal explainers, coding tutorials, scientific talks) to build fine-tuning datasets. The transcript_llm format strips filler tokens cleanly. Use metadata.tags and metadata.categories to filter and label the data.

Recommended settings: outputFormats: ["llm"], maxAiMinutes cap per run to control cost.


6. SEO content repurposing

Turn a library of tutorials or vlogs into written content. Pass the transcript_llm field to an LLM prompt asking it to rewrite the transcript as a blog post, Twitter/X thread, newsletter section, or LinkedIn article. Combine with metadata.title, metadata.tags, and metadata.description for context.

Recommended settings: outputFormats: ["llm"], languages: ["en"]


7. Podcast / lecture transcription (no captions)

Podcasters who upload to YouTube and educators who post lecture recordings rarely add manual captions. The actor automatically transcribes them with faster-whisper. Use forceWhisperLanguage if you know the channel's language to skip the auto-detection window and reduce cost.

Recommended settings: forceWhisperLanguage: "en", skipAiFallbackIfLongerThan: 120 to skip anything over 2 hours.


8. Accessibility and caption quality audit

Compare YouTube's auto-generated captions (subType: "auto", is_auto_generated: true) against an AI transcription of the same video. Differences surface errors in the auto-generated track. Useful for accessibility compliance reviews or for creators who want to improve their caption quality before publishing.

Recommended settings: Two runs — one with subType: "auto" only, one with subType: "manual" to force AI fallback (since no manual captions exist, the actor will auto-transcribe).


9. Academic research and citation analysis

Download a researcher's full lecture series, a conference talk archive, or all videos from an academic YouTube channel. Index the transcripts by speaker, date (metadata.upload_date), and topic. Use to find when specific terminology first appeared, how arguments evolved over time, or to build a citation graph for a literature review.

Recommended settings: outputFormats: ["json", "text"], maxResults: 1000, languages set to the channel's primary language.


10. Competitive intelligence monitoring (scheduled runs)

Schedule the actor to run weekly on a competitor's channel URL. Set maxResults: 5 to pull only the latest videos. Use an Apify webhook to POST the new transcripts to Slack, a CRM, or an internal dashboard. Get an automatic digest of every new product announcement, feature mention, or pricing discussion your competitor publishes on YouTube.

Recommended settings: maxResults: 5, outputFormats: ["llm"], paired with an Apify schedule and webhook.


Output examples

Native caption output

{
  "metadata": {
    "id": "dQw4w9WgXcQ",
    "title": "Rick Astley - Never Gonna Give You Up",
    "channel": "Rick Astley",
    "duration": 213,
    "view_count": 1757728410,
    "upload_date": "20091025"
  },
  "language": "en",
  "is_auto_generated": false,
  "is_ai_generated": false,
  "transcript_json": [
    { "start": 18.5, "end": 21.0, "text": "We're no strangers to love" },
    { "start": 21.0, "end": 24.5, "text": "You know the rules and so do I" }
  ],
  "transcript_text": "We're no strangers to love You know the rules and so do I ...",
  "transcript_llm": "We're no strangers to love You know the rules and so do I ..."
}

AI transcription output

When a video has no captions, AI transcription runs automatically:

{
  "metadata": { "title": "...", "duration": 240 },
  "is_ai_generated": true,
  "language": "en",
  "language_probability": 0.9987,
  "ai_duration_charged_min": 4,
  "transcript_json": [
    { "start": 0.0, "end": 3.2, "text": "Welcome to today's episode." }
  ],
  "transcript_text": "Welcome to today's episode. ...",
  "transcript_llm": "Welcome to today's episode. ..."
}

Error item

{
  "url": "https://www.youtube.com/watch?v=...",
  "metadata": { "title": "...", "duration": 720 },
  "error": "No subtitles found in requested languages.",
  "error_code": "LANGUAGE_NOT_FOUND"
}

Error codes

Error items are never billed: Actor.charge() is only called on successful transcript results. The tables below group errors by cause so you know whether the issue is in your input or something outside your control.

Input errors — caused by the URLs or settings you provided:

Error code | Meaning | What to do
AGE_RESTRICTED | YouTube requires sign-in / age verification to access this video. | Remove the URL — cannot be bypassed.
PRIVATE_OR_UNAVAILABLE | The video is private, deleted, or blocked in the runner's region. | Remove the URL or check if the video is public.
LIVE_VIDEO | Live streams have no static captions to extract. | Wait until the stream ends, then retry.
LANGUAGE_NOT_FOUND | Captions exist but not in the requested language. available_languages shows what's available. | Change your languages input.

Budget / limit errors — the video could be transcribed, but a budget gate prevented it:

Error code | Meaning | What to do
NO_CAPTIONS_AVAILABLE | The video has zero caption tracks. AI fallback is attempted if budget allows. | Ensure AI fallback is not blocked by the limits below.
AI_MINUTES_LIMIT_REACHED | The maxAiMinutes budget for this run is exhausted. | Increase maxAiMinutes and retry.
AI_FALLBACK_SKIPPED_TOO_LONG | The video exceeds the skipAiFallbackIfLongerThan duration limit. | Increase or remove the limit.
SPENDING_LIMIT_REACHED | The Apify account spending limit was hit — no further AI charges possible. | Adjust your Apify billing settings.

Infrastructure / actor errors — not caused by your input; no charge is made:

Error code | Meaning | What to do
BOT_DETECTION | YouTube challenged the request. The actor retried through proxy tiers automatically. | Usually self-resolving. Switch proxy group if persistent.
EXTRACTION_ERROR | Generic yt-dlp failure — the video may be temporarily unavailable on YouTube's side. | Retry later.
AI_TRANSCRIPTION_FAILED | The Whisper model or audio download failed for this video. | Check run logs; retry.
UNEXPECTED_ERROR | An unhandled exception in the actor code. The video gets an error item; other videos continue. | Open an issue if persistent.

Pricing

This actor uses Pay-Per-Event pricing — you pay for results, not compute time or monthly fees.

In plain terms: a single native transcript costs $0.001. There is also a $0.005 one-time startup fee per run. Scraping one video costs around $0.006 total. From the second video onwards, this actor is cheaper than competitors charging $0.005 flat per transcript.
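The run arithmetic is easy to sanity-check. A quick sketch reproducing the Free-plan figures quoted above ($0.005 startup per run plus $0.001 per native transcript):

```python
# Back-of-envelope cost for native transcripts on the Free plan.
# Rates taken from this README's pricing section.

def run_cost(videos, per_transcript=0.001, startup=0.005):
    """Total Pay-Per-Event cost for one run with `videos` native transcripts."""
    return startup + videos * per_transcript

for n in (1, 2, 10, 100, 1000):
    print(n, f"${run_cost(n):.3f}")
```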

How much does a run cost?

Videos | This actor | Typical competitor ($0.005/transcript) | You save
1 | $0.006 | $0.005 | n/a
2 | $0.007 | $0.010 | 30%
10 | $0.015 | $0.050 | 70%
100 | $0.105 | $0.500 | 79%
1,000 | $1.005 | $5.000 | 80%

Native transcript pricing

Plan | Per transcript | 10 videos | 100 videos | 1,000 videos
Free | $0.001 | $0.015 | $0.105 | $1.005
Bronze ($49/mo) | $0.0009 | $0.014 | $0.095 | $0.905
Silver ($199/mo) | $0.0008 | $0.013 | $0.085 | $0.805
Gold ($999/mo) | $0.0007 | $0.012 | $0.075 | $0.705

AI transcription pricing (when captions are unavailable)

AI is only charged for videos that actually need it — native captions are always checked first. Billed minutes are based on the published video duration (rounded up to the nearest minute, minimum 1 minute per video), not on detected speech length. The ai_speech_duration_sec field in the output is informational.
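The billing rule above (published duration, rounded up to the whole minute, minimum one minute) is a one-liner. A sketch of the arithmetic at the Free-plan rate:

```python
import math

# AI minutes are billed on published video duration, rounded UP to the
# nearest whole minute, minimum 1 minute per video (Free plan: $0.012/min).

def ai_cost(duration_sec, per_minute=0.012):
    """Return (billed_minutes, cost_usd) for one AI-transcribed video."""
    minutes = max(1, math.ceil(duration_sec / 60))
    return minutes, round(minutes * per_minute, 3)

print(ai_cost(2700))  # 45-minute podcast -> (45, 0.54)
print(ai_cost(30))    # 30-second clip still bills 1 minute -> (1, 0.012)
```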

Plan | Per AI minute | 10-min video | 60-min video
Free | $0.012 | $0.12 | $0.72
Bronze | $0.011 | $0.11 | $0.66
Silver | $0.010 | $0.10 | $0.60
Gold | $0.009 | $0.09 | $0.54

Real-world examples

Task | Videos | AI? | Estimated cost (Free plan)
Single video | 1 | No | ~$0.006
YouTube playlist (20 videos) | 20 | No | ~$0.025
Channel analysis (100 videos) | 100 | No | ~$0.105
Podcast batch (20 × 45 min, no captions) | 20 | Yes — 900 AI min | ~$10.81
Research corpus (500 videos, 20% no captions, 10 min avg) | 500 | Mixed | ~$12.41

On the free $5 credit: approximately 4,900 native transcripts (enough for an entire YouTube channel), or around 400 minutes of AI transcription.

The prices above are Pay-Per-Event charges only and do not include proxy costs. The default datacenter proxy costs nothing on clean runs — it is only used as a fallback when YouTube challenges a request. If the datacenter tier is also challenged, the actor auto-escalates to residential (~$0.40/GB), though this is rare. See Proxy configuration for details.

Built for bulk runs

Every part of this actor is designed to keep costs and resource use as low as possible, especially at scale:

  • Pay per result, not per run. The $0.005 startup fee is charged once regardless of batch size — so a 1,000-video run costs nearly the same overhead as a 10-video run.
  • No proxy cost on clean runs. Every request goes direct first. The proxy is only used as a silent fallback if YouTube challenges a specific request — and that happens rarely. Most runs pay $0 in proxy fees.
  • AI model loaded on demand. The transcription model is only initialised when a video actually needs AI transcription. Runs that rely entirely on native captions start faster and use less memory.
  • Concurrent processing. Up to 5 videos are processed in parallel, significantly reducing wall-clock time for large playlists or channels.
  • Built-in spend controls. Max AI minutes per run and Skip AI for long videos let you set hard caps on AI spend before a run starts — no surprises from unexpectedly long videos.
  • One failed video never slows the batch. Errors are logged and skipped immediately; the rest of the batch continues at full speed.

How it compares

Feature | This actor | Typical alternatives
Transcribes videos with no captions | Yes — built-in AI, no external API key | No — returns an error
LLM-optimised output (filler stripped) | Yes — transcript_llm field | No
Spend safeguards (AI minute cap, skip long videos) | Yes | No
Native transcript price | $0.001 per transcript | Up to $0.005 — 5× more
No monthly subscription | Yes — pay only for what you run | Flat monthly fee
Batch: playlists and channels | Yes | Most
Output formats | JSON, Text, SRT, VTT, LLM | Usually JSON only
Word-level timestamps | Yes | Rare
YouTube Data API key required | No | No
Automatic access challenge bypass | Yes — retries via proxy when needed, direct otherwise | Varies
MCP-compatible (Claude Desktop, Cursor, etc.) | Yes — via Apify MCP integration | Rare

Who uses it

AI and LLM developers

Feed transcripts into RAG pipelines, summarisation chains, or fine-tuning datasets. The transcript_llm field strips [Music], (laughter), and other filler tokens that bloat context windows. Compatible with LangChain, LlamaIndex, and other Python AI frameworks.

Content creators and marketers

Turn any YouTube video into a blog post or newsletter draft without manual transcription. Extract pull quotes from interviews. Run an entire channel archive in one batch.

SEO professionals and researchers

Extract keyword data from video transcripts at scale. Build text content from videos to rank alongside YouTube results on Google. Analyse a competitor's spoken messaging for topic and positioning gaps.

Data scientists and academics

Build NLP corpora from lectures, conference talks, and documentary interviews. Process multilingual transcripts for cross-language analysis. Run large dataset collection jobs overnight via the API.

Developers building MCP-integrated AI tools

Connect this actor as an MCP server so Claude Desktop, Claude.ai Projects, Cursor, or any MCP-compatible client can fetch and process YouTube transcripts in a single tool call. No copy-pasting, no API wiring — just hand the model a URL.


Integration examples

Python — Apify client

Get your API token from the Apify Console under Settings → Integrations. Keep it secret — treat it like a password.

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("codepoetry/youtube-transcript-ai-scraper").call(
    run_input={
        "startUrls": [{"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"}],
        "languages": ["en"],
        "outputFormats": ["json", "llm"],
    }
)

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item["metadata"]["title"])
    print(item["transcript_text"][:200])

JavaScript / Node.js

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('codepoetry/youtube-transcript-ai-scraper').call({
    startUrls: [{ url: 'https://www.youtube.com/watch?v=dQw4w9WgXcQ' }],
    languages: ['en'],
    outputFormats: ['json', 'llm'],
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach(item => console.log(item.metadata.title, item.transcript_text.slice(0, 200)));

LangChain / RAG pipeline

from apify_client import ApifyClient
from langchain.docstore.document import Document

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("codepoetry/youtube-transcript-ai-scraper").call(
    run_input={
        "startUrls": [{"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"}],
        "outputFormats": ["llm"],
        "maxAiMinutes": 60,
    }
)

docs = [
    Document(
        page_content=item["transcript_llm"],
        metadata={"source": item["metadata"]["url"], "title": item["metadata"]["title"]},
    )
    for item in client.dataset(run["defaultDatasetId"]).iterate_items()
    if "transcript_llm" in item
]
# docs is ready for any LangChain vector store or retriever

Run on a schedule or trigger a webhook

To run this actor on a schedule or receive a webhook notification when a run finishes, use the Schedules and Integrations tabs on the actor's page in the Apify Console. See the Apify scheduling docs and webhook docs for setup instructions.


Advanced options

All options can be set in the Input form or passed as JSON when calling via the API.

Option | UI label | Default | When to use
maxResults | Max videos | 10 | Cap how many videos are fetched from a playlist or channel. Single video URLs ignore this.
languages | Caption languages | ["en"] | Preferred caption languages in order of priority. First match on the video is used. Codes: en English, es Spanish, fr French, de German.
subType | Caption source | "both" | "manual" = human captions only · "auto" = auto-generated only · "both" = prefer manual, fall back to auto
outputFormats | Output formats | json, text, llm | Which transcript formats to write to the dataset.
wordLevel | Word-level timestamps | false | Add per-word timestamps to JSON segments. Not available for manual captions.
maxAiMinutes | Max AI minutes | 30 | Hard cap on AI transcription minutes per run. Set to 0 for unlimited. Recommended when processing unknown playlists.
skipAiFallbackIfLongerThan | Skip AI for videos longer than | 0 (off) | Skip AI for videos exceeding N minutes. Avoids unexpected costs from long videos.
forceWhisperLanguage | AI transcription language | auto-detect | Force AI to a specific language (ISO code, e.g. "es"). Skips the 30-second detection window, saves ~20% per video.
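When calling via the API, the options above combine into a single JSON input object. A representative (not exhaustive) sketch — the playlist URL is a placeholder:

```json
{
  "startUrls": [{ "url": "https://www.youtube.com/playlist?list=..." }],
  "maxResults": 100,
  "languages": ["en", "es"],
  "subType": "both",
  "outputFormats": ["json", "llm", "srt"],
  "wordLevel": false,
  "maxAiMinutes": 60,
  "skipAiFallbackIfLongerThan": 120,
  "forceWhisperLanguage": "en"
}
```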

Proxy configuration

Proxy is always active and fully automatic — no configuration needed. Every request goes direct first, and the proxy is only used if YouTube challenges the request. This costs nothing on a clean run.

How it works

Occasionally YouTube asks automated requests to verify they are not bots. When this happens the actor automatically escalates through progressively stronger proxy tiers until the request succeeds:

  1. Direct request (no proxy) — used first for every video. Zero cost.
  2. Datacenter proxy — fast and free on most plans. Handles the vast majority of challenges.
  3. Residential proxy — highest trust with YouTube. Used only if the datacenter tier is also challenged.

The escalation is fully automatic — you do not need to configure anything. If all tiers are exhausted, the affected video is marked with a BOT_DETECTION error code and the actor continues with the remaining videos.

Proxy costs

Type | Cost | Notes
Datacenter (Apify) | Free on most plans | Default first tier. Zero bandwidth consumed on clean runs.
Residential (Apify) | ~$0.40 / GB | Auto-escalation tier. Only consumed if datacenter proxy is also challenged — rare.

Proxy costs are billed from your Apify account balance as a separate line item, alongside this actor's Pay-Per-Event charges. On a typical run with no bot challenges: $0 proxy cost. If datacenter retry is needed: approximately 0.5 MB per affected video. Residential is only consumed if datacenter also fails — this is rare and keeps costs minimal even in bulk runs.


Limitations

All non-recoverable failures produce a dataset item with an error_code field. See the Error codes table for the full reference.

Constraints:

  • No translation — the actor returns the original spoken language only.
  • YouTube may rate-limit very large batches (100+ videos). The automatic proxy escalation handles most cases transparently.
  • maxResults default of 10 is intentionally conservative — increase it for large playlists or full channel archives.

Memory

This actor runs with a fixed 4 GB allocation. No configuration needed — the same setting works for both native caption extraction (lightweight) and AI transcription (which loads the faster-whisper speech model into memory).


Frequently asked questions

How much does one video cost?

A single video with captions costs approximately $0.006 on the Free plan — $0.001 for the transcript plus a $0.005 one-time startup fee per run. A second video in the same run adds just $0.001. AI transcription adds $0.012 per minute of audio on the Free plan.

What happens if a video has no captions?

The actor automatically downloads the audio and transcribes it using the built-in AI model — the output has the same structure as a native caption result, with is_ai_generated: true. If the maxAiMinutes cap is reached, remaining caption-free videos receive an AI_MINUTES_LIMIT_REACHED error item and the run continues. The available_languages field lists the caption language codes YouTube does provide on the video.

Does it work for playlists and channels?

Yes. Paste a playlist or channel URL and the actor expands it into individual videos automatically. Use maxResults to cap how many are fetched. If one video is private, age-restricted, or unavailable, it gets an error item while the rest continue.

What languages are supported?

Native captions: Any language YouTube provides captions for — typically 100+ languages for auto-generated captions. Pass multiple language codes (e.g. ["en", "es"]) to fall back automatically when your first choice is unavailable.

AI transcription: 99 languages, including English, Spanish, French, German, Portuguese, Japanese, Chinese, Arabic, and Hindi.

What output formats are available?

  • JSON — timestamped segments as an array [{start, end, text}, ...]
  • Text — plain text joined from all segments
  • LLM — text with [Music], (laughter), and other filler tokens stripped, ready for AI pipelines
  • SRT — standard subtitle format for video players and editing software
  • VTT — WebVTT format for HTML5 <video> elements

Multiple formats can be requested in a single run.
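Because each format is just a string field on the dataset item, exporting subtitle files is a short loop. A sketch using the documented metadata.id and transcript_srt fields — the sample item is illustrative:

```python
from pathlib import Path

# Save each item's SRT track to disk, named by video ID.
# Items with an error (or without a requested SRT) are skipped.

def save_srt(items, out_dir="subtitles"):
    out = Path(out_dir)
    out.mkdir(exist_ok=True)
    written = []
    for item in items:
        srt = item.get("transcript_srt")
        if not srt:
            continue  # error item, or SRT was not in outputFormats
        path = out / f"{item['metadata']['id']}.srt"
        path.write_text(srt, encoding="utf-8")
        written.append(path.name)
    return written

items = [{
    "metadata": {"id": "dQw4w9WgXcQ"},
    "transcript_srt": "1\n00:00:18,500 --> 00:00:21,000\nWe're no strangers to love\n",
}]
print(save_srt(items))  # ['dQw4w9WgXcQ.srt']
```

The same pattern works for transcript_vtt — swap the field name and the .vtt extension.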

Can I set a spending limit?

Yes. Max AI minutes per run caps total AI-transcribed minutes per run (default: 30). Skip AI for long videos skips videos exceeding a duration threshold automatically. Set the AI minutes cap to 0 for unlimited AI transcription.

How accurate is AI transcription?

Accurate for clear speech in widely spoken languages. Accuracy degrades with heavy accents, domain-specific jargon, or poor audio quality. The language_probability field indicates the model's confidence in the detected language. For quality-critical work, treat AI transcripts as a first draft and review them.

What is a YouTube transcript scraper?

A YouTube transcript scraper extracts the spoken text from YouTube videos. This actor retrieves captions when YouTube provides them, or generates a transcript from the audio when captions are unavailable.

Does this translate transcripts?

No. The actor returns the original spoken language. Use a separate translation service for translation.

Does it work for YouTube Shorts?

Yes. Shorts use the same caption infrastructure as regular videos.

Do I need a YouTube Data API key?

No. The actor accesses publicly available caption data without any YouTube API credentials.

How does this compare to the YouTube Data API?

The YouTube Data API v3 does not provide transcript data. It requires a Google Cloud project, OAuth credentials, and per-day quotas. This actor requires none of that.

How does this compare to the youtube-transcript-api Python library?

The youtube-transcript-api library is fine for a handful of videos in your own Python script. This actor adds cloud infrastructure, batch processing across playlists and channels, AI transcription for caption-free videos, multiple output formats, scheduling, and Apify platform integrations (webhooks, REST API, n8n, Make, Zapier).

What do the run log messages mean?

Open the Log tab on any completed run to see what the actor did. Here are the messages you may encounter:

Message | What it means | Action needed?
Processing video: https://... | Normal progress — one line per video | None
Expanding URL: https://... | Resolving a playlist or channel to individual videos | None
Total unique videos to process: N | How many videos were found after deduplication | None
Loading AI transcription model... | AI model is loading — only happens once per run | None
AI transcription model ready. | Model loaded, ready to transcribe | None
AI transcription language: set to 'en' | Language was set by your forceWhisperLanguage input | None
AI transcription language: auto-detecting | Language will be detected from the audio | None
Downloading audio for AI transcription: https://... | Audio is being downloaded for a caption-free video — normal progress | None
Running AI transcription... | AI model is actively processing the audio | None
No subtitles found. Running AI fallback for ... (N min estimated) | AI transcription is starting for a caption-free video | None
AI transcription complete — language: en (confidence: 99%) | AI transcription finished for one video | None
YouTube access challenge for ... — retrying via proxy tier 1/2... | YouTube challenged the request; escalating through proxy tiers | None — handled automatically
YouTube access challenge for ... — no proxy tiers available | Same challenge but no proxy could be created | Check Apify proxy service status
Subtitle fetch failed for ... — retrying via proxy tier 1/2... | Subtitle download failed; escalating through proxy tiers | None — handled automatically
Subtitle fetch failed for ... (lang): HTTP 429 | Subtitle download failed after all retries (rate-limited) | Try again later or reduce batch size. The actor continues with other videos.
YouTube access challenge on audio download ... — retrying via proxy tier 1/2... | Audio download challenged; escalating through proxy tiers | None — handled automatically
Audio download failed ... after exhausting all N proxy tiers. | All proxy tiers failed for audio download | Try again later. The video gets an AI_TRANSCRIPTION_FAILED item.
Skipping AI fallback for ...: needs N min but only Y remain | The maxAiMinutes cap was reached for this run | Raise maxAiMinutes if you want to transcribe more
Apify spending limit reached. No further AI charges will be made. | Your Apify account spending limit was hit | Check your Apify billing settings
Audio download failed for ...: <error> | Could not download the audio for AI transcription | Check the error detail; that video gets an AI_TRANSCRIPTION_FAILED item in the dataset
AI fallback failed for ...: <error> | AI transcription error for this video | Check the error detail; the video gets an error item in the dataset
Unhandled error for ...: <error> | Unexpected failure — the video gets an UNEXPECTED_ERROR item | Open an issue if this happens repeatedly

Error items written to the dataset always have an error_code field — use that for programmatic filtering rather than parsing log text.

YouTube's Terms of Service prohibit automated scraping, and you are responsible for complying with their Terms and applicable law in your jurisdiction. This actor accesses only publicly available caption data — the same data visible when you click "Open transcript" in the YouTube player. It does not bypass any authentication, access private content, or collect personal user data. See Apify's web scraping legality guide for a broader overview.


Language Reference

YouTube caption languages (130+ codes)

Use these codes in the Caption languages (languages) input. Regional variants such as zh-TW and zh-CN are also accepted where YouTube differentiates them.

af Afrikaans · ak Akan · sq Albanian
am Amharic · ar Arabic · hy Armenian
as Assamese · ay Aymara · az Azerbaijani
bn Bangla · eu Basque · be Belarusian
bho Bhojpuri · bs Bosnian · bg Bulgarian
my Burmese · ca Catalan · ceb Cebuano
zh Chinese · zh-CN Chinese (China) · zh-HK Chinese (Hong Kong)
zh-SG Chinese (Singapore) · zh-TW Chinese (Taiwan) · zh-Hans Chinese (Simplified)
zh-Hant Chinese (Traditional) · co Corsican · hr Croatian
cs Czech · da Danish · dv Divehi
nl Dutch · en English · en-US English (United States)
eo Esperanto · et Estonian · ee Ewe
fil Filipino · fi Finnish · fr French
gl Galician · lg Ganda · ka Georgian
de German · el Greek · gn Guarani
gu Gujarati · ht Haitian Creole · ha Hausa
haw Hawaiian · iw Hebrew · hi Hindi
hmn Hmong · hu Hungarian · is Icelandic
ig Igbo · id Indonesian · ga Irish
it Italian · ja Japanese · jv Javanese
kn Kannada · kk Kazakh · km Khmer
rw Kinyarwanda · ko Korean · kri Krio
ku Kurdish · ky Kyrgyz · lo Lao
la Latin · lv Latvian · ln Lingala
lt Lithuanian · lb Luxembourgish · mk Macedonian
mg Malagasy · ms Malay · ml Malayalam
mt Maltese · mi Māori · mr Marathi
mn Mongolian · ne Nepali · nso Northern Sotho
no Norwegian · ny Nyanja · or Odia
om Oromo · ps Pashto · fa Persian
pl Polish · pt Portuguese · pa Punjabi
qu Quechua · ro Romanian · ru Russian
sm Samoan · sa Sanskrit · gd Scottish Gaelic
sr Serbian · sn Shona · sd Sindhi
si Sinhala · sk Slovak · sl Slovenian
so Somali · st Southern Sotho · es Spanish
su Sundanese · sw Swahili · sv Swedish
tg Tajik · ta Tamil · tt Tatar
te Telugu · th Thai · ti Tigrinya
ts Tsonga · tr Turkish · tk Turkmen
uk Ukrainian · ur Urdu · ug Uyghur
uz Uzbek · vi Vietnamese · cy Welsh
fy Western Frisian · xh Xhosa · yi Yiddish
yo Yoruba · zu Zulu

Not all codes will have captions on every video. When a requested code is not available, the actor returns a LANGUAGE_NOT_FOUND or NO_CAPTIONS_AVAILABLE error item with an available_languages field listing the codes that are actually present on that video.
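A practical pattern: harvest available_languages from the failed items of one run to widen the languages input for a retry. A minimal sketch — the sample item is illustrative:

```python
# Build a broader `languages` input from LANGUAGE_NOT_FOUND /
# NO_CAPTIONS_AVAILABLE error items of a previous run.

def refine_languages(items, requested):
    found = set()
    for item in items:
        if item.get("error_code") in ("LANGUAGE_NOT_FOUND", "NO_CAPTIONS_AVAILABLE"):
            found.update(item.get("available_languages", []))
    # keep the original preference order, then append newly discovered codes
    return requested + sorted(found - set(requested))

items = [{"error_code": "LANGUAGE_NOT_FOUND", "available_languages": ["de", "fr"]}]
print(refine_languages(items, ["en"]))  # ['en', 'de', 'fr']
```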

AI transcription languages (99 codes)

Use one of these codes in the AI transcription language (forceWhisperLanguage) input to skip auto-detection. If a language is not in this list, the AI model cannot transcribe it — use auto-detect instead.

af Afrikaans · am Amharic · ar Arabic
as Assamese · az Azerbaijani · ba Bashkir
be Belarusian · bg Bulgarian · bn Bengali
bo Tibetan · br Breton · bs Bosnian
ca Catalan · cs Czech · cy Welsh
da Danish · de German · el Greek
en English · es Spanish · et Estonian
eu Basque · fa Persian · fi Finnish
fo Faroese · fr French · gl Galician
gu Gujarati · ha Hausa · haw Hawaiian
he Hebrew · hi Hindi · hr Croatian
ht Haitian Creole · hu Hungarian · hy Armenian
id Indonesian · is Icelandic · it Italian
ja Japanese · jw Javanese · ka Georgian
kk Kazakh · km Khmer · kn Kannada
ko Korean · la Latin · lb Luxembourgish
ln Lingala · lo Lao · lt Lithuanian
lv Latvian · mg Malagasy · mi Māori
mk Macedonian · ml Malayalam · mn Mongolian
mr Marathi · ms Malay · mt Maltese
my Burmese · ne Nepali · nl Dutch
nn Nynorsk · no Norwegian · oc Occitan
pa Punjabi · pl Polish · ps Pashto
pt Portuguese · ro Romanian · ru Russian
sa Sanskrit · sd Sindhi · si Sinhala
sk Slovak · sl Slovenian · sn Shona
so Somali · sq Albanian · sr Serbian
su Sundanese · sv Swedish · sw Swahili
ta Tamil · te Telugu · tg Tajik
th Thai · tl Filipino · tr Turkish
tt Tatar · uk Ukrainian · ur Urdu
uz Uzbek · vi Vietnamese · yi Yiddish
yo Yoruba · yue Cantonese · zh Chinese

The AI model supports 99 languages. Accuracy varies by language — it is highest for widely spoken languages (English, Spanish, French, German, etc.) and may degrade for low-resource languages. The language_probability field in the output indicates the model's confidence in the detected or forced language.


About this actor

This actor runs on the Apify platform. AI transcription uses faster-whisper (MIT license), bundled into the Docker image so there is no model download delay on first run.

Found a bug or have a feature request? Use the Issues tab on this actor's page.