YouTube Transcript Scraper
Pricing
Pay per usage
Extract transcripts (captions) from YouTube videos with timestamps. Supports manual and auto-generated captions in 50+ languages. Outputs JSON, plain text, or SRT format.
Developer
Alex Kim
2 days ago
Last modified
Extract full transcripts from YouTube videos — no API key, no login, no browser automation required.
Give it a list of YouTube URLs and get back structured transcripts with timestamps, plain text, or SRT subtitle files. Works with both manually uploaded captions and YouTube's auto-generated speech-to-text captions in 50+ languages.
Features
- No credentials needed: uses YouTube's internal player API directly
- 50+ languages: fetch transcripts in any available caption language with automatic fallback
- 3 output formats: json (timestamped segments), text (plain full transcript), srt (subtitle file)
- Video metadata: title, channel name, duration, and list of available languages included in every result
- Resilient: per-video error handling means one failed video never stops the rest
- Fast: lightweight HTTP-only approach, no headless browser overhead
Use Cases
- AI / LLM pipelines — feed video transcripts into summarizers, Q&A systems, or RAG pipelines
- Content research — bulk-extract transcripts for keyword analysis, topic modeling, or competitive research
- Subtitle generation — download SRT files for any video that has captions
- Accessibility tooling — programmatic access to captions for downstream processing
- Academic research — collect spoken-word data from YouTube at scale
How to Use
1. Prepare your input
Provide a list of YouTube video URLs. All common URL formats work:
{
  "videoUrls": [
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "https://youtu.be/Ks-_Mh1QhMc",
    "https://www.youtube.com/shorts/abc123xyz01"
  ],
  "language": "en",
  "outputFormat": "json"
}
2. Run the actor
Click Start in Apify Console, or use the API / CLI to trigger a run programmatically.
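As a rough illustration of the programmatic route, the sketch below builds a request for Apify's synchronous `run-sync-get-dataset-items` REST endpoint using only the Python standard library. The actor ID `username~youtube-transcript-scraper` and the token are placeholders, not values taken from this page; substitute the ID shown in your Apify Console.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Build a POST request that runs the actor and returns its dataset items."""
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(actor_id: str, token: str, run_input: dict, timeout: float = 300) -> list:
    """Send the request and decode the JSON array of dataset records."""
    request = build_run_request(actor_id, token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder actor ID and token: replace both before calling run_actor().
EXAMPLE_INPUT = {
    "videoUrls": ["https://youtu.be/Ks-_Mh1QhMc"],
    "language": "en",
    "outputFormat": "json",
}
```

Calling `run_actor("username~youtube-transcript-scraper", "<YOUR_TOKEN>", EXAMPLE_INPUT)` would block until the run finishes. For long batches an asynchronous run followed by a dataset download is likely the better fit.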
3. Collect results
Each video produces one dataset record. Download as JSON, CSV, or forward to external storage via integrations.
Input Reference
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| videoUrls | string[] | ✅ | (none) | List of YouTube video URLs to process |
| language | string | ❌ | "en" | Preferred caption language code (e.g. "es", "zh-CN", "fr") |
| outputFormat | "json" \| "text" \| "srt" | ❌ | "json" | Format for the transcript content |
Supported URL formats:
- https://www.youtube.com/watch?v=VIDEO_ID
- https://youtu.be/VIDEO_ID
- https://www.youtube.com/shorts/VIDEO_ID
- Raw 11-character video ID
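To pre-validate inputs before starting a run, the listed formats can be recognized with a small helper. This is a sketch of one possible approach, not the actor's own parser; the assumption that IDs consist of exactly 11 letters, digits, hyphens, or underscores is inferred from the examples on this page.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumed ID shape: 11 characters from A-Z, a-z, 0-9, "-" and "_".
VIDEO_ID_RE = re.compile(r"[A-Za-z0-9_-]{11}")


def extract_video_id(value: str) -> Optional[str]:
    """Return the video ID for a supported URL or raw ID, else None."""
    value = value.strip()
    if VIDEO_ID_RE.fullmatch(value):
        return value  # raw 11-character ID

    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    candidate = None
    if host == "youtu.be":
        # https://youtu.be/VIDEO_ID
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host == "youtube.com":
        if parsed.path == "/watch":
            # https://www.youtube.com/watch?v=VIDEO_ID
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        elif parsed.path.startswith("/shorts/"):
            # https://www.youtube.com/shorts/VIDEO_ID
            candidate = parsed.path.split("/")[2]

    if candidate and VIDEO_ID_RE.fullmatch(candidate):
        return candidate
    return None
```

Inputs for which the helper returns `None` can be dropped or corrected before the run, rather than surfacing later as per-video error records.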
Output Reference
One record per video is written to the default dataset.
| Field | Type | Description |
|---|---|---|
| url | string | The original input URL |
| videoId | string \| null | YouTube video ID |
| title | string \| null | Video title |
| channelTitle | string \| null | Channel name |
| channelId | string \| null | YouTube channel ID |
| lengthSeconds | number \| null | Video duration in seconds |
| language | string \| null | Language code of the returned transcript |
| languageKind | "manual" \| "asr" \| null | Whether captions are manually uploaded or auto-generated |
| availableLanguages | array | All caption languages available for this video |
| segments | array \| null | Timestamped segments (json format only) |
| fullText | string \| null | Complete transcript as a single string |
| srt | string \| null | SRT-formatted subtitle file content (srt format only) |
| error | string \| null | Error message if the video failed; null on success |
Example output record (outputFormat: "json")
{
  "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "videoId": "dQw4w9WgXcQ",
  "title": "Rick Astley - Never Gonna Give You Up (Official Music Video)",
  "channelTitle": "Rick Astley",
  "channelId": "UCuAXFkgsw1L7xaCfnd5JJOw",
  "lengthSeconds": 212,
  "language": "en",
  "languageKind": "manual",
  "availableLanguages": [
    { "code": "en", "name": "English", "kind": "manual" }
  ],
  "segments": [
    { "offset": 1360, "duration": 1680, "text": "We're no strangers to love" },
    { "offset": 3040, "duration": 1600, "text": "You know the rules and so do I" }
  ],
  "fullText": "We're no strangers to love You know the rules and so do I ...",
  "srt": null,
  "error": null
}
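If a run was made with outputFormat "json" and subtitles are needed afterwards, the segments can be converted locally. The sketch below assumes that `offset` and `duration` are in milliseconds, which the example values suggest but this page does not state. It illustrates the SRT layout and is not the actor's own formatter.

```python
def format_timestamp(ms: int) -> str:
    """Render a millisecond count as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def segments_to_srt(segments: list) -> str:
    """Convert segments with offset/duration/text keys into SRT text.

    Assumption: offset and duration are integers in milliseconds.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["offset"])
        end = format_timestamp(seg["offset"] + seg["duration"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text']}")
    # SRT cues are separated by a blank line.
    return "\n\n".join(blocks) + "\n" if blocks else ""


example_segments = [
    {"offset": 1360, "duration": 1680, "text": "We're no strangers to love"},
    {"offset": 3040, "duration": 1600, "text": "You know the rules and so do I"},
]
srt_text = segments_to_srt(example_segments)
```

With the two example segments, the first cue runs from `00:00:01,360` to `00:00:03,040`.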
Language Selection & Fallback
The actor picks a caption track in the following priority order, moving to the next option whenever the previous one is unavailable for a video:
1. Requested language, manual captions
2. Requested language, auto-generated (ASR) captions
3. English, manual captions
4. English, auto-generated captions
5. First available caption track
The language and languageKind fields in the output always reflect what was actually returned, not what was requested.
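The fallback order can be sketched as a simple priority search over the entries in availableLanguages. This illustrates the documented order only; the exact-match comparison on `code` is an assumption, and the actor may treat regional variants such as "zh-CN" differently.

```python
from typing import Optional


def pick_track(available: list, requested: str) -> Optional[dict]:
    """Choose a caption track following the documented fallback order.

    Each track is a dict with "code" and "kind" keys, as in the
    availableLanguages output field. Codes are compared exactly (assumption).
    """
    preferences = [
        (requested, "manual"),  # requested language, manual captions
        (requested, "asr"),     # requested language, auto-generated
        ("en", "manual"),       # English, manual captions
        ("en", "asr"),          # English, auto-generated
    ]
    for code, kind in preferences:
        for track in available:
            if track["code"] == code and track["kind"] == kind:
                return track
    # Last resort: the first available track, or None if there are none.
    return available[0] if available else None


tracks = [
    {"code": "de", "name": "German", "kind": "manual"},
    {"code": "en", "name": "English", "kind": "asr"},
    {"code": "en", "name": "English", "kind": "manual"},
]
```

For example, requesting "es" against `tracks` skips the first two preferences and returns the manual English track.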
Error Handling
The actor never crashes due to a single video failure. Each video is processed independently:
| Situation | Behavior |
|---|---|
| Invalid or unrecognized URL | error: "Invalid YouTube URL — cannot extract video ID" |
| Video is private or deleted | error: "Video unavailable: ..." |
| No captions available | error: "No captions available for this video" |
| Network error after retries | error: "<error message>" |
Failed records have all content fields set to null while error contains the reason.
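Because failures arrive as ordinary dataset records, downstream code can separate them by checking the error field. A minimal sketch, assuming `records` is the list of dataset items already loaded into memory:

```python
def split_results(records: list) -> tuple:
    """Partition dataset records into (succeeded, failed) by the error field."""
    succeeded = [r for r in records if r.get("error") is None]
    failed = [r for r in records if r.get("error") is not None]
    return succeeded, failed


sample_records = [
    {"url": "https://youtu.be/Ks-_Mh1QhMc", "fullText": "...", "error": None},
    {
        "url": "https://example.com/nope",
        "fullText": None,
        "error": "Invalid YouTube URL — cannot extract video ID",
    },
]
ok, bad = split_results(sample_records)
retry_urls = [r["url"] for r in bad]  # candidates for inspection or a retry run
```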
Compute Units & Pricing
This actor is lightweight — it makes 2 HTTP requests per video (player API + caption XML). There is no browser overhead.
Typical usage: ~0.001–0.005 compute units per video, depending on transcript length and proxy usage.
For a batch of 1,000 videos, expect to spend roughly $0.05–0.25 in platform costs.
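The estimate is plain multiplication, sketched below so it can be rerun for other batch sizes. The `price_per_cu` parameter is hypothetical: the dollar range quoted above corresponds to about $0.05 per compute unit, so substitute the rate of your own Apify plan.

```python
def estimate_compute_units(video_count: int,
                           cu_low: float = 0.001,
                           cu_high: float = 0.005) -> tuple:
    """Return the (low, high) compute-unit range for a batch of videos."""
    return video_count * cu_low, video_count * cu_high


def estimate_cost(video_count: int, price_per_cu: float) -> tuple:
    """Return the (low, high) cost range in dollars.

    price_per_cu is an assumption supplied by the caller, not a value
    taken from Apify's price list.
    """
    low, high = estimate_compute_units(video_count)
    return low * price_per_cu, high * price_per_cu


cu_range = estimate_compute_units(1000)              # roughly 1 to 5 compute units
cost_range = estimate_cost(1000, price_per_cu=0.05)  # roughly $0.05 to $0.25
```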
FAQ
Does it work on private or age-restricted videos?
No. The actor accesses YouTube's public player API without authentication. Private, members-only, age-restricted, or otherwise login-required videos will return an error.
What if a video has no captions at all?
The output record will have error: "No captions available for this video" and all transcript fields will be null. The actor continues processing remaining videos normally.
Can I use this with the Apify API or SDK?
Yes. You can trigger runs via the Apify API, the JavaScript SDK, or the Python SDK.
Is there a limit on the number of videos per run?
No hard limit is enforced by the actor. Practical limits depend on your Apify plan's memory and timeout settings. For very large batches (10,000+), consider splitting across multiple runs.
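One way to split a large URL list across runs is to chunk it into fixed-size batches and build one input object per batch. The batch size below is an arbitrary illustration, not a recommendation from the actor's developer.

```python
from typing import Iterator, List


def chunk_urls(urls: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size URLs."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(urls), batch_size):
        yield urls[start:start + batch_size]


def build_inputs(urls: List[str], batch_size: int,
                 language: str = "en", output_format: str = "json") -> List[dict]:
    """Build one actor input per batch, using the documented input fields."""
    return [
        {"videoUrls": batch, "language": language, "outputFormat": output_format}
        for batch in chunk_urls(urls, batch_size)
    ]


# 25 placeholder URLs split into batches of 10 gives runs of 10, 10, and 5.
all_urls = [f"https://youtu.be/VIDEO_{i:05d}" for i in range(25)]
inputs = build_inputs(all_urls, batch_size=10)
```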
Does it support YouTube playlists?
Not currently. You need to provide individual video URLs. Playlist support may be added in a future version.
Related Actors
Looking for more YouTube data?
- YouTube Search Scraper — search YouTube by keyword and extract video metadata
- YouTube Channel Scraper — extract all videos from a YouTube channel