YouTube Transcript Scraper

Under maintenance

Pricing

Pay per usage

Extract transcripts (captions) from YouTube videos with timestamps. Supports manual and auto-generated captions in 50+ languages. Outputs JSON, plain text, or SRT format.

Developer

Alex Kim

Maintained by Community

Extract full transcripts from YouTube videos — no API key, no login, no browser automation required.

Give it a list of YouTube URLs and get back structured transcripts with timestamps, plain text, or SRT subtitle files. Works with both manually uploaded captions and YouTube's auto-generated speech-to-text captions in 50+ languages.


Features

  • No credentials needed — uses YouTube's internal player API directly
  • 50+ languages — fetch transcripts in any available caption language with automatic fallback
  • 3 output formats: json (timestamped segments), text (plain full transcript), srt (subtitle file)
  • Video metadata — title, channel name, duration, and list of available languages included in every result
  • Resilient — per-video error handling means one failed video never stops the rest
  • Fast — lightweight HTTP-only approach, no headless browser overhead

Use Cases

  • AI / LLM pipelines — feed video transcripts into summarizers, Q&A systems, or RAG pipelines
  • Content research — bulk-extract transcripts for keyword analysis, topic modeling, or competitive research
  • Subtitle generation — download SRT files for any video that has captions
  • Accessibility tooling — programmatic access to captions for downstream processing
  • Academic research — collect spoken-word data from YouTube at scale

How to Use

1. Prepare your input

Provide a list of YouTube video URLs. All common URL formats work:

{
  "videoUrls": [
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "https://youtu.be/Ks-_Mh1QhMc",
    "https://www.youtube.com/shorts/abc123xyz01"
  ],
  "language": "en",
  "outputFormat": "json"
}

2. Run the actor

Click Start in Apify Console, or use the API / CLI to trigger a run programmatically.

3. Collect results

Each video produces one dataset record. Download the results as JSON or CSV, or forward them to external storage via integrations.
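
The three steps above can also be done from a script. The following is a minimal sketch, assuming Apify's public REST endpoint for running an actor synchronously and returning its dataset items. The actor ID `username~actor-name` and the token are placeholders (this page does not state the actor's ID), and the helper names `build_run_request` and `fetch_transcripts` are made up for illustration.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Build a POST request that starts a run and returns its dataset items."""
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def fetch_transcripts(actor_id: str, token: str, run_input: dict, timeout: float = 300.0) -> list:
    """Send the request and decode the JSON array of dataset records."""
    request = build_run_request(actor_id, token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Same shape as the input shown in step 1.
run_input = {
    "videoUrls": ["https://www.youtube.com/watch?v=dQw4w9WgXcQ"],
    "language": "en",
    "outputFormat": "json",
}

# Usage (placeholders, replace with your own values before running):
# records = fetch_transcripts("username~actor-name", "YOUR_APIFY_TOKEN", run_input)
# for record in records:
#     print(record["videoId"], record["error"])
```

A synchronous call like this suits small batches. For long runs, starting the run and reading the dataset afterwards is likely the better fit.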


Input Reference

Field         Type                      Required  Default  Description
videoUrls     string[]                  Yes       (none)   List of YouTube video URLs to process
language      string                    No        "en"     Preferred caption language code (e.g. "es", "zh-CN", "fr")
outputFormat  "json" | "text" | "srt"   No        "json"   Format for the transcript content

Supported URL formats:

  • https://www.youtube.com/watch?v=VIDEO_ID
  • https://youtu.be/VIDEO_ID
  • https://www.youtube.com/shorts/VIDEO_ID
  • Raw 11-character video ID
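
To pre-validate inputs before a run, the formats listed above can be recognized with a few lines of code. This is an illustrative sketch, not the actor's own parser: the function name `extract_video_id` and the exact matching rules (11 characters drawn from letters, digits, `-` and `_`) are assumptions.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumed shape of a video ID: exactly 11 URL-safe characters.
VIDEO_ID_RE = re.compile(r"[A-Za-z0-9_-]{11}")


def extract_video_id(value: str) -> Optional[str]:
    """Return the video ID for a supported URL or raw ID, else None."""
    value = value.strip()
    if VIDEO_ID_RE.fullmatch(value):
        return value  # raw 11-character ID

    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    candidate = None
    if host == "youtu.be":
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host == "youtube.com":
        if parsed.path == "/watch":
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        elif parsed.path.startswith("/shorts/"):
            candidate = parsed.path[len("/shorts/"):].split("/")[0]

    if candidate and VIDEO_ID_RE.fullmatch(candidate):
        return candidate
    return None
```

Inputs for which this returns `None` are the ones most likely to come back with the "Invalid YouTube URL" error described under Error Handling.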

Output Reference

One record per video is written to the default dataset.

Field               Type                      Description
url                 string                    The original input URL
videoId             string | null             YouTube video ID
title               string | null             Video title
channelTitle        string | null             Channel name
channelId           string | null             YouTube channel ID
lengthSeconds       number | null             Video duration in seconds
language            string | null             Language code of the returned transcript
languageKind        "manual" | "asr" | null   Whether captions are manually uploaded or auto-generated
availableLanguages  array                     All caption languages available for this video
segments            array | null              Timestamped segments (json format only)
fullText            string | null             Complete transcript as a single string
srt                 string | null             SRT-formatted subtitle file content (srt format only)
error               string | null             Error message if the video failed; null on success

Example output record (outputFormat: "json")

{
  "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "videoId": "dQw4w9WgXcQ",
  "title": "Rick Astley - Never Gonna Give You Up (Official Music Video)",
  "channelTitle": "Rick Astley",
  "channelId": "UCuAXFkgsw1L7xaCfnd5JJOw",
  "lengthSeconds": 212,
  "language": "en",
  "languageKind": "manual",
  "availableLanguages": [
    { "code": "en", "name": "English", "kind": "manual" }
  ],
  "segments": [
    { "offset": 1360, "duration": 1680, "text": "We're no strangers to love" },
    { "offset": 3040, "duration": 1600, "text": "You know the rules and so do I" }
  ],
  "fullText": "We're no strangers to love You know the rules and so do I ...",
  "srt": null,
  "error": null
}
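
If you request the json format but later need subtitles, the segments can be converted locally. The sketch below assumes `offset` and `duration` are in milliseconds, which this page does not state explicitly (the example values are consistent with it). The function names are illustrative, and the actor's own srt output may differ in detail.

```python
def format_srt_timestamp(milliseconds: int) -> str:
    """Format a millisecond offset as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, remainder = divmod(milliseconds, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    seconds, millis = divmod(remainder, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def segments_to_srt(segments: list) -> str:
    """Turn a list of {offset, duration, text} segments into SRT text."""
    blocks = []
    for index, segment in enumerate(segments, start=1):
        start = segment["offset"]              # assumed: milliseconds
        end = start + segment["duration"]      # assumed: milliseconds
        blocks.append(
            f"{index}\n"
            f"{format_srt_timestamp(start)} --> {format_srt_timestamp(end)}\n"
            f"{segment['text']}"
        )
    # SRT cues are separated by a blank line.
    return "\n\n".join(blocks) + "\n" if blocks else ""


segments = [
    {"offset": 1360, "duration": 1680, "text": "We're no strangers to love"},
    {"offset": 3040, "duration": 1600, "text": "You know the rules and so do I"},
]
print(segments_to_srt(segments))
```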

Language Selection & Fallback

The actor returns the first caption track it finds, checking in this order of preference:

  1. Requested language — manual captions
  2. Requested language — auto-generated (ASR) captions
  3. English — manual captions
  4. English — auto-generated captions
  5. First available caption track

The language and languageKind fields in the output always reflect what was actually returned, not what was requested.
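
The fallback order above can be sketched as a simple preference search. This is a hedged illustration, not the actor's code: it assumes tracks have the same shape as the entries in `availableLanguages` (`code`, `name`, `kind`) and that language codes are compared by exact match. The name `pick_caption_track` is hypothetical.

```python
from typing import Optional


def pick_caption_track(tracks: list, requested: str) -> Optional[dict]:
    """Pick a caption track following the documented fallback order."""
    preference = [
        (requested, "manual"),  # 1. requested language, manual
        (requested, "asr"),     # 2. requested language, auto-generated
        ("en", "manual"),       # 3. English, manual
        ("en", "asr"),          # 4. English, auto-generated
    ]
    for code, kind in preference:
        for track in tracks:
            if track["code"] == code and track["kind"] == kind:
                return track
    # 5. first available track, or None when the video has no captions
    return tracks[0] if tracks else None


tracks = [
    {"code": "de", "name": "German", "kind": "manual"},
    {"code": "en", "name": "English", "kind": "asr"},
    {"code": "en", "name": "English", "kind": "manual"},
]
chosen = pick_caption_track(tracks, "es")
print(chosen["code"], chosen["kind"])
```

In this example Spanish is not available, so the search falls through to the manual English track even though the auto-generated English track is listed first.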


Error Handling

The actor never crashes due to a single video failure. Each video is processed independently:

Situation                    Behavior
Invalid or unrecognized URL  error: "Invalid YouTube URL — cannot extract video ID"
Video is private or deleted  error: "Video unavailable: ..."
No captions available        error: "No captions available for this video"
Network error after retries  error: "<error message>"

Failed records have all content fields set to null while error contains the reason.
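
Because failures are reported per record rather than by aborting the run, downstream code usually needs to separate the two groups. A minimal sketch, relying only on the documented `error` field (the helper name `split_results` is illustrative):

```python
def split_results(records: list) -> tuple:
    """Split dataset records into (succeeded, failed) using the error field."""
    succeeded = [r for r in records if r.get("error") is None]
    failed = [r for r in records if r.get("error") is not None]
    return succeeded, failed


records = [
    {"url": "https://youtu.be/Ks-_Mh1QhMc", "fullText": "...", "error": None},
    {"url": "not-a-video", "fullText": None,
     "error": "No captions available for this video"},
]
succeeded, failed = split_results(records)
for record in failed:
    print(f"{record['url']}: {record['error']}")
```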


Compute Units & Pricing

This actor is lightweight — it makes 2 HTTP requests per video (player API + caption XML). There is no browser overhead.

Typical usage: ~0.001–0.005 compute units per video, depending on transcript length and proxy usage.

For a batch of 1,000 videos, expect to spend roughly $0.05–0.25 in platform costs.
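
The two figures above line up if one compute unit costs about $0.05. That rate is inferred from the numbers on this page, not taken from Apify's price list, so treat it as an assumption and substitute the rate for your own plan.

```python
def estimate_cost(video_count: int, cu_per_video: float, usd_per_cu: float) -> float:
    """Estimated platform cost in USD: videos x CU per video x USD per CU."""
    return video_count * cu_per_video * usd_per_cu


# Assumed rate, implied by the ranges quoted above (not an official price).
USD_PER_CU = 0.05

low = estimate_cost(1000, 0.001, USD_PER_CU)   # 1,000 videos at the low end
high = estimate_cost(1000, 0.005, USD_PER_CU)  # 1,000 videos at the high end
print(f"Estimated cost: ${low:.2f} to ${high:.2f}")
```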


FAQ

Does it work on private or age-restricted videos? No. The actor accesses YouTube's public player API without authentication. Private, members-only, or login-required videos will return an error.

What if a video has no captions at all? The output record will have error: "No captions available for this video" and all transcript fields will be null. The actor continues processing remaining videos normally.

Can I use this with the Apify API or SDK? Yes. You can trigger runs via the Apify API, the JavaScript SDK, or the Python SDK.

Is there a limit on the number of videos per run? No hard limit is enforced by the actor. Practical limits depend on your Apify plan's memory and timeout settings. For very large batches (10,000+), consider splitting across multiple runs.

Does it support YouTube playlists? Not currently. You need to provide individual video URLs. Playlist support may be added in a future version.
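
For the large-batch case mentioned above, one way to split a long URL list into several run inputs is sketched below. The batch size of 5,000 is an arbitrary illustration, not a recommendation from the actor, and the helper names are hypothetical.

```python
from typing import Iterator


def chunk_urls(urls: list, batch_size: int) -> Iterator[list]:
    """Yield consecutive slices of urls, each at most batch_size long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(urls), batch_size):
        yield urls[start:start + batch_size]


def build_run_inputs(urls: list, batch_size: int = 5000,
                     language: str = "en", output_format: str = "json") -> list:
    """Build one actor input object per batch, using the documented fields."""
    return [
        {"videoUrls": batch, "language": language, "outputFormat": output_format}
        for batch in chunk_urls(urls, batch_size)
    ]


urls = [f"https://youtu.be/{i:011d}" for i in range(12)]  # dummy IDs for the demo
inputs = build_run_inputs(urls, batch_size=5)
print([len(item["videoUrls"]) for item in inputs])
```

Each resulting object can be submitted as the input of a separate run.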


Looking for more YouTube data?

  • YouTube Search Scraper — search YouTube by keyword and extract video metadata
  • YouTube Channel Scraper — extract all videos from a YouTube channel