YouTube Transcript Scraper

Pricing

from $10.00 / 1,000 transcripts

Extract transcripts and subtitles from YouTube videos that have captions. Returns clean full text plus timestamped segments, optional SRT/WebVTT subtitles, any caption language the video provides, and video details. No YouTube API key required.

Developer

Veronica

Maintained by Community

YouTube Transcript Scraper extracts the full transcript, captions, and subtitles of any YouTube video — including Shorts and live replays — and returns them as clean, readable text, timestamped segments, and ready-to-use SRT/WebVTT subtitle files. Get the transcript in the video's original language automatically, or request a specific one from the video's available caption languages, together with video details (title, channel, thumbnail, views, duration). It runs in the cloud, needs no API key and no proxy setup, and exports to JSON, CSV, Excel, or XML.

Paste a YouTube link, click Start, and get structured transcript data back in seconds — or call the built-in real-time transcript API (Standby mode) to fetch captions on demand for your app, AI agent, or RAG / LLM pipeline.

What data does it extract?

| Data | Details |
| --- | --- |
| 📝 Full text | The whole transcript as one clean, continuous block of text, ready for summarizing, search, or NLP |
| ⏱️ Timestamped segments | Every caption line with its start time and duration in seconds |
| 🎬 Subtitle files | The transcript as ready-to-use SRT and/or WebVTT subtitles |
| 🈳 Languages | The transcript's language, whether it's auto-generated, and every other caption language available for the video |
| 📺 Video details | Title, channel name, channel URL, thumbnail, view count, duration, description, tags, and live-stream status |
| 📊 Stats | Word count, character count, snippet count, and total duration |

How to scrape YouTube transcripts

  1. Add a video URL — paste a full YouTube link (watch, youtu.be, shorts, embed, or live) or a bare 11-character video ID. One transcript is produced per run.
  2. Pick a language (optional) — leave it on Auto-detect to get the video's original language, or choose a specific language the video has captions in. If the video has no captions in the language you pick, the run reports an error listing the languages it does offer.
  3. Choose the output (optional) — include timestamped segments, add SRT/WebVTT subtitle files, and fetch video details.
  4. Click Start — then download the results from the Storage tab in any format, or push them anywhere with integrations.
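
The accepted URL shapes in step 1 can be normalized to a video ID before you submit them. The sketch below is illustrative only: `extract_video_id` is a hypothetical helper, not the Actor's own parser, and it assumes an ID is exactly 11 characters drawn from letters, digits, `_`, and `-`.

```python
from __future__ import annotations

import re
from urllib.parse import parse_qs, urlparse

# Assumption: a video ID is 11 characters from [A-Za-z0-9_-].
_VIDEO_ID = re.compile(r"[A-Za-z0-9_-]{11}")
# Path prefixes where the ID is the next path segment.
_PATH_PREFIXES = ("shorts", "embed", "live")


def extract_video_id(value: str) -> str | None:
    """Return the 11-character video ID from a URL or bare ID, else None."""
    value = value.strip()
    if _VIDEO_ID.fullmatch(value):
        return value  # already a bare ID

    parsed = urlparse(value if "://" in value else "https://" + value)
    host = (parsed.hostname or "").lower()
    parts = [p for p in parsed.path.split("/") if p]

    candidate = None
    if host == "youtu.be" and parts:
        candidate = parts[0]
    elif host == "youtube.com" or host.endswith(".youtube.com"):
        if parts and parts[0] == "watch":
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        elif len(parts) >= 2 and parts[0] in _PATH_PREFIXES:
            candidate = parts[1]

    if candidate and _VIDEO_ID.fullmatch(candidate):
        return candidate
    return None
```

Validating locally like this lets you reject malformed links before spending a run on them. The Actor itself may accept forms this sketch does not cover.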

Input example

{
  "videoUrl": "https://www.youtube.com/watch?v=IELMSD2kdmk",
  "transcriptLanguage": "auto",
  "includeTimestamps": true,
  "additionalFormats": ["srt"],
  "includeVideoDetails": true
}

Output example (abridged)

{
  "url": "https://www.youtube.com/watch?v=IELMSD2kdmk",
  "videoId": "IELMSD2kdmk",
  "title": "Apache Spark in 100 Seconds",
  "channelName": "Fireship",
  "channelUrl": "https://www.youtube.com/channel/UCsBjURrPoezykLs9EqgamOA",
  "channelId": "UCsBjURrPoezykLs9EqgamOA",
  "thumbnailUrl": "https://i.ytimg.com/vi_webp/IELMSD2kdmk/maxresdefault.webp",
  "viewCount": 569973,
  "language": "English (auto-generated)",
  "languageCode": "en",
  "isGenerated": true,
  "durationSeconds": 199,
  "snippetCount": 101,
  "wordCount": 673,
  "characterCount": 3826,
  "text": "Apache spark an open- Source data analytics engine that can process massive streams of data from multiple sources…",
  "segments": [
    { "text": "Apache spark an open- Source data", "start": 0.32, "duration": 4.08 },
    { "text": "analytics engine that can process", "start": 2.48, "duration": 3.839 }
  ],
  "srt": "1\n00:00:00,320 --> 00:00:04,400\nApache spark an open- Source data\n\n2\n…",
  "availableLanguages": [
    { "language": "English (auto-generated)", "languageCode": "en", "isGenerated": true }
  ],
  "retrievedAt": "2026-06-16T12:00:00.000Z"
}
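
The `srt` string in the example lines up with the `segments` array: each cue runs from `start` to `start + duration`. As a rough sketch of that relationship (not the Actor's actual implementation; `srt_timestamp` and `segments_to_srt` are hypothetical names), the conversion could look like this:

```python
from __future__ import annotations


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)  # work in whole milliseconds
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Build SRT text: numbered cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["start"] + seg["duration"])
        cues.append(f"{index}\n{start} --> {end}\n{seg['text']}")
    return "\n\n".join(cues) + "\n"
```

For the first example segment (start 0.32, duration 4.08) this yields `00:00:00,320 --> 00:00:04,400`, matching the `srt` field shown above. Cues may overlap in time, as the two example segments do.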

The transcript is always a real caption track from the video — language/languageCode show what you got and availableLanguages lists every option. If you request a language the video doesn't have, you get a NoTranscriptInLanguage error listing the languages it does offer.

If a transcript can't be retrieved, the result contains an error and a human-readable errorMessage instead (e.g. TranscriptsDisabled, VideoUnavailable, NoTranscriptFound, NoTranscriptInLanguage).
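
When consuming results, it helps to separate error items from transcripts early. The following is a minimal sketch, assuming error items carry the `error` and `errorMessage` keys described above and that successful items have no truthy `error` value; `TranscriptError` and `transcript_text` are hypothetical names.

```python
from __future__ import annotations


class TranscriptError(Exception):
    """Raised when a result item reports a failure instead of a transcript."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def transcript_text(item: dict) -> str:
    """Return the transcript text, or raise TranscriptError for an error item."""
    code = item.get("error")
    if code:
        # Assumption: errorMessage accompanies every error code.
        raise TranscriptError(code, item.get("errorMessage", ""))
    return item["text"]
```

Callers can then branch on `exc.code`, for example retrying with another language on `NoTranscriptInLanguage` and skipping the video on `TranscriptsDisabled`.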

Real-time API (Standby mode)

Don't want to start a run and wait for it to finish? This Actor also runs in Standby mode — an always-on HTTP server that returns a transcript in a single request, like a normal REST API. It's ideal for apps and integrations that need transcripts on demand.

Send a GET request with the video and options as query parameters; the response body is the transcript as JSON (the same shape as a dataset item). Find your Standby base URL in the Actor's Standby tab and authenticate with your Apify API token (pass it as a token query parameter or an Authorization: Bearer <token> header).

curl "https://<your-standby-url>/?token=<APIFY_TOKEN>&videoUrl=https://www.youtube.com/watch?v=IELMSD2kdmk&transcriptLanguage=en&additionalFormats=srt"

Query parameters

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| videoUrl | yes | | A YouTube video URL or a bare 11-character video ID. |
| transcriptLanguage | no | auto | Caption language to return, or auto for the video's original language. Returns 404 NoTranscriptInLanguage if the video has no track in the chosen language. |
| includeTimestamps | no | true | Include the timestamped segments array. |
| additionalFormats | no | | Comma-separated subtitle formats to also include: srt, vtt. |
| includeVideoDetails | no | true | Include video metadata (title, channel, thumbnail, views, duration). |

You can also send the same parameters as a JSON body with POST. On success the response is 200 with the transcript; on failure it's a 4xx/5xx with an { "error", "errorMessage" } body (e.g. 404 NoTranscriptFound, 404 NoTranscriptInLanguage, 429 RateLimited).
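
The same GET request can be made from Python with only the standard library. This is a sketch under stated assumptions: the base URL is whatever your Standby tab shows, the helper names `build_standby_url` and `fetch_transcript` are hypothetical, and the server is assumed to decode a percent-encoded `videoUrl` the way standard query-string parsers do.

```python
from __future__ import annotations

import json
import urllib.error
import urllib.request
from urllib.parse import urlencode


def build_standby_url(base_url: str, video: str, language: str = "auto",
                      formats: tuple[str, ...] = ()) -> str:
    """Build the GET URL; urlencode percent-encodes the nested video URL."""
    params = {"videoUrl": video, "transcriptLanguage": language}
    if formats:
        params["additionalFormats"] = ",".join(formats)
    return base_url.rstrip("/") + "/?" + urlencode(params)


def fetch_transcript(base_url: str, token: str, video: str, **options) -> dict:
    """Call the Standby endpoint and return the parsed JSON body."""
    request = urllib.request.Request(
        build_standby_url(base_url, video, **options),
        headers={"Authorization": f"Bearer {token}"},
    )
    try:
        with urllib.request.urlopen(request, timeout=60) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are documented to carry {"error", "errorMessage"}.
        body = err.read().decode("utf-8", errors="replace")
        try:
            return json.loads(body)
        except json.JSONDecodeError:
            raise RuntimeError(f"HTTP {err.code}: {body[:200]}") from err
```

Sending the token in the `Authorization` header keeps it out of URLs that may end up in logs. A typical call would be `fetch_transcript(base_url, token, "IELMSD2kdmk", language="en", formats=("srt",))`.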

Why scrape YouTube transcripts?

  • AI, RAG & LLMs — feed clean transcripts to an LLM to summarize, answer questions, generate show notes, or build retrieval-augmented generation (RAG) pipelines, embeddings, and semantic search over video content.
  • Content repurposing — turn videos into blog posts, newsletters, or social media threads.
  • SEO — publish transcripts to make video content indexable and searchable.
  • Research & analysis — collect and text-mine spoken content at scale.
  • Accessibility & localization — generate subtitles in any caption language the video provides.
  • Search archives — build a searchable database of everything said across a channel.

FAQ

Do I need an API key or login?

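No. The Actor reads publicly available transcripts, so no YouTube API key or login is needed; just provide the video URL. Calling the Standby endpoint does require your Apify API token.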

Do I need to configure a proxy?

No. Access to YouTube is handled for you automatically, so there is nothing to set up or configure. Provide a video and get the transcript back.

Which languages can I get?

Leave transcriptLanguage on Auto-detect ("auto") to get the video's original language — the most reliable option, since it always returns a track the video actually has. To request a specific language, set transcriptLanguage to its code (e.g. "de"); a base language like en also matches regional tracks such as en-US, and human-made captions are preferred over auto-generated ones. The transcript is always a genuine YouTube caption track — the Actor doesn't machine-translate. If the video has no captions in the language you ask for, you get a NoTranscriptInLanguage error that lists the languages it does offer (also available as availableLanguages on every successful result).
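
To make the matching rules concrete, here is one way a client could pick a track from `availableLanguages` before making a request. It is an illustration, not the Actor's real selection code: `pick_track` is a hypothetical helper, and ranking human-made captions ahead of an exact code match is an assumption, since the text above states both rules without saying which wins.

```python
from __future__ import annotations


def pick_track(available: list[dict], requested: str) -> dict | None:
    """Pick the best track for a requested code, or None if nothing matches."""
    wanted = requested.lower()

    def matches(track: dict) -> bool:
        code = track["languageCode"].lower()
        # A base language such as "en" also matches regional codes like "en-US".
        return code == wanted or code.startswith(wanted + "-")

    candidates = [t for t in available if matches(t)]
    if not candidates:
        return None
    # Lower key wins: human-made first (assumed), then an exact code match.
    return min(
        candidates,
        key=lambda t: (t["isGenerated"], t["languageCode"].lower() != wanted),
    )
```

A `None` result corresponds to the case where the Actor would report `NoTranscriptInLanguage`.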

Can I get subtitle files?

Yes. Add srt and/or vtt to additionalFormats and the item will include ready-to-use subtitle strings you can save as .srt/.vtt files.
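
Saving those strings to disk takes only a few lines. In this sketch, `save_subtitles` is a hypothetical helper. The `srt` key appears in the output example, while the `vtt` key is an assumption inferred from the format name.

```python
from __future__ import annotations

from pathlib import Path

# "srt" is shown in the output example; "vtt" as a key name is an assumption.
_SUBTITLE_KEYS = {"srt": ".srt", "vtt": ".vtt"}


def save_subtitles(item: dict, directory: str | Path = ".") -> list[Path]:
    """Write each subtitle string in the item to <videoId><ext>; return the paths."""
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for key, extension in _SUBTITLE_KEYS.items():
        content = item.get(key)
        if content:  # skip formats that were not requested
            path = out_dir / f"{item['videoId']}{extension}"
            path.write_text(content, encoding="utf-8")
            written.append(path)
    return written
```

Writing as UTF-8 keeps non-Latin caption text intact in the saved files.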

Which URL formats are supported?

Standard watch URLs, youtu.be short links, Shorts, embed and live URLs, mobile/music URLs, and bare 11-character video IDs.

Can I run it on a schedule?

Yes — use the Schedule option to run it on any cron expression, and pair it with webhooks or integrations to push fresh transcripts wherever you need them.

Can I call it like an API?

Yes. The Actor supports Standby mode, an always-on HTTP endpoint that returns a transcript per request with no run to wait for — see Real-time API (Standby mode) above.

Integrations

Connect YouTube Transcript Scraper to Google Sheets, Make, Zapier, n8n, Slack, Airbyte, or any tool via webhooks. Developers can start runs and fetch results programmatically with the Apify API using the JavaScript or Python clients — see the API tab for ready-made code.
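
For a sense of what a programmatic run looks like, the sketch below calls the Apify REST API's synchronous run endpoint with the standard library, rather than the JavaScript or Python client packages mentioned above. The Actor ID is not given on this page, so `actor_id` is a placeholder you must fill in from the API tab; the function names are hypothetical.

```python
from __future__ import annotations

import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Prepare a POST that runs the Actor and returns its dataset items."""
    return urllib.request.Request(
        f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items",
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_actor(actor_id: str, token: str, run_input: dict) -> list[dict]:
    """Run the Actor synchronously and return its dataset items."""
    request = build_run_request(actor_id, token, run_input)
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.load(response)
```

The `run_input` dictionary takes the same fields as the input example, such as `videoUrl` and `transcriptLanguage`. For long-running or high-volume use, the official client libraries handle retries and pagination that this sketch omits.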

Feedback and feature requests

Found a bug or missing a field you need? Open an issue in the Issues tab — feedback directly shapes what gets added next.