Youtube Transcript Scraper
Pricing
from $40.00 / 1,000 results
Extract transcripts and captions from YouTube videos with language selection support. Returns timestamped segments, full concatenated text, and basic video metadata.
Rating: 5.0 (8)
Developer: Crawler Bros
Actor stats
- Bookmarked: 9
- Total users: 2
- Monthly active users: 1
- Last modified: 5 days ago
What does it do?
This scraper extracts transcripts (subtitles/captions) from YouTube videos. It supports both manually created and auto-generated captions, with optional language selection and translation.
For each video, you get:
- Timestamped transcript segments (start time, duration, text)
- Full concatenated transcript as plain text
- Transcript language and type (manual vs auto-generated)
- List of all available transcript languages
- Basic video metadata (title, channel, views, duration, thumbnail)
Input
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `videoUrls` | array | Yes | (none) | YouTube video URLs, short links, or plain video IDs |
| `language` | string | No | `""` (auto) | Preferred transcript language code (e.g., `en`, `es`, `fr`) |
| `includeAutoGenerated` | boolean | No | `true` | Include auto-generated captions when manual ones aren't available |
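As a minimal sketch, the input object can be assembled and checked in Python before it is submitted. The helper name `build_input` and its validation rules are illustrative assumptions, not part of the Actor; only the three parameter names and their defaults come from the table above.

```python
import json


def build_input(video_urls, language="", include_auto_generated=True):
    """Build an input dict using the parameter names from the Input table.

    Hypothetical helper: the checks below are assumptions about sensible
    input, not rules documented by the Actor.
    """
    if isinstance(video_urls, str):
        raise TypeError("video_urls must be a list of strings, not a single string")
    urls = list(video_urls)
    if not urls:
        raise ValueError("videoUrls is required and must not be empty")
    if not isinstance(include_auto_generated, bool):
        raise TypeError("include_auto_generated must be a boolean")
    return {
        "videoUrls": urls,
        "language": language,  # "" means automatic selection
        "includeAutoGenerated": include_auto_generated,
    }


if __name__ == "__main__":
    payload = build_input(["dQw4w9WgXcQ"], language="es")
    print(json.dumps(payload, indent=2))
```

Validating locally like this catches a missing `videoUrls` list before a run is started.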
Supported URL formats
- `https://www.youtube.com/watch?v=VIDEO_ID`
- `https://youtu.be/VIDEO_ID`
- `https://www.youtube.com/shorts/VIDEO_ID`
- `https://www.youtube.com/embed/VIDEO_ID`
- Plain video ID (11 characters, e.g., `dQw4w9WgXcQ`)
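The formats above can be normalized to a video ID with the standard library alone. This is a sketch of one way to do it, not the Actor's own parser; the function name `extract_video_id` is hypothetical, and the ID character set (letters, digits, `-`, `_`) is an assumption.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: IDs are 11 characters drawn from letters, digits, "-" and "_".
VIDEO_ID_RE = re.compile(r"^[A-Za-z0-9_-]{11}$")


def extract_video_id(value: str) -> Optional[str]:
    """Return the video ID for a supported URL or plain ID, else None."""
    value = value.strip()
    if VIDEO_ID_RE.match(value):
        return value  # already a plain video ID

    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    candidate = None
    if host == "youtu.be":
        # Short link: the ID is the first path component.
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host == "youtube.com":
        if parsed.path == "/watch":
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        else:
            parts = parsed.path.strip("/").split("/")
            if len(parts) >= 2 and parts[0] in ("shorts", "embed"):
                candidate = parts[1]

    if candidate and VIDEO_ID_RE.match(candidate):
        return candidate
    return None


if __name__ == "__main__":
    for sample in (
        "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
        "https://youtu.be/9bZkp7q19f0",
        "dQw4w9WgXcQ",
    ):
        print(sample, "->", extract_video_id(sample))
```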
Output
Each video produces one dataset item:
| Field | Type | Description |
|---|---|---|
| `video_id` | string | YouTube video ID |
| `title` | string | Video title |
| `channel_name` | string | Channel name |
| `channel_id` | string | Channel ID |
| `duration_seconds` | integer | Video duration in seconds |
| `views` | integer | View count |
| `published_date` | string | Publish date (YYYY-MM-DD) |
| `thumbnail` | string | Thumbnail URL |
| `transcript_language` | string | Language code of the transcript |
| `transcript_language_name` | string | Language name |
| `is_auto_generated` | boolean | Whether the transcript is auto-generated |
| `available_languages` | array | All available transcript languages |
| `segments` | array | Timestamped transcript segments |
| `segment_count` | integer | Number of segments |
| `full_text` | string | Full transcript as plain text |
| `success` | boolean | Whether the scrape succeeded |
| `error` | string | Error message (null if successful) |
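Because every item carries `success` and `error`, downstream code can separate usable transcripts from failures before reading the other fields. The sketch below assumes the items are already loaded as Python dicts; the helper names and the sample values (including the error text) are made up for illustration.

```python
def split_results(items):
    """Partition dataset items into (succeeded, failed) using the success flag."""
    succeeded = [item for item in items if item.get("success")]
    failed = [item for item in items if not item.get("success")]
    return succeeded, failed


def summarize(item):
    """One-line summary built from fields in the output table."""
    kind = "auto-generated" if item.get("is_auto_generated") else "manual"
    return (
        f"{item['video_id']}: {item.get('segment_count', 0)} segments, "
        f"{item.get('transcript_language')} ({kind})"
    )


if __name__ == "__main__":
    # Made-up sample items; real values come from the Actor's dataset.
    items = [
        {"video_id": "dQw4w9WgXcQ", "success": True, "error": None,
         "segment_count": 61, "transcript_language": "en",
         "is_auto_generated": False},
        {"video_id": "9bZkp7q19f0", "success": False,
         "error": "sample error message"},
    ]
    ok, failed = split_results(items)
    for item in ok:
        print(summarize(item))
    for item in failed:
        print(f"{item['video_id']} failed: {item['error']}")
```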
Segment format
{"start": "0.000", "dur": "3.500", "text": "We're no strangers to love"}
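Note that `start` and `dur` are strings in the sample above, so they need converting to numbers before any arithmetic. As an illustration of working with the segment format, the sketch below turns segments into SubRip (SRT) subtitle text. SRT export is not a feature of the Actor, and both function names are hypothetical.

```python
def to_srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm, the timestamp style used in SRT files."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Convert a list of {"start", "dur", "text"} segments to SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = float(seg["start"])      # string in the dataset, so convert
        end = start + float(seg["dur"])  # end time = start + duration
        blocks.append(
            f"{index}\n"
            f"{to_srt_timestamp(start)} --> {to_srt_timestamp(end)}\n"
            f"{seg['text']}"
        )
    return "\n\n".join(blocks) + "\n" if blocks else ""


if __name__ == "__main__":
    sample = [{"start": "0.000", "dur": "3.500",
               "text": "We're no strangers to love"}]
    print(segments_to_srt(sample))
```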
Available languages format
{"code": "en", "name": "English", "is_auto_generated": true}
Input examples
Basic usage
{"videoUrls": ["https://www.youtube.com/watch?v=dQw4w9WgXcQ"]}
Multiple videos with language selection
{"videoUrls": ["https://www.youtube.com/watch?v=dQw4w9WgXcQ", "https://youtu.be/9bZkp7q19f0"], "language": "en"}
Spanish transcript
{"videoUrls": ["dQw4w9WgXcQ"], "language": "es", "includeAutoGenerated": true}
How it works
- For each video URL, extracts the video ID
- Fetches the transcript using YouTube's internal transcript API
- If a specific language is requested, tries to find it or translate to it
- Falls back to auto-generated captions if manual ones aren't available
- Fetches basic video metadata from the video page
- Returns everything as a structured dataset item
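The language-selection and fallback steps above can be approximated as a small function over entries in the "available languages" format. This is a rough sketch of the described behaviour, not the Actor's code: the name `pick_transcript` is hypothetical, the preference for the first listed manual track is an assumption, and the translation step is not modelled (the function returns `None` where the Actor might translate instead).

```python
def pick_transcript(available, language="", include_auto_generated=True):
    """Choose one entry from a list of {"code", "name", "is_auto_generated"}.

    Assumed order of preference:
      1. a manual transcript in the requested language (or any, if none requested)
      2. an auto-generated one, if include_auto_generated is True
    Returns None when nothing matches.
    """
    pool = [
        track for track in available
        if include_auto_generated or not track["is_auto_generated"]
    ]
    if language:
        pool = [track for track in pool if track["code"] == language]

    manual = [track for track in pool if not track["is_auto_generated"]]
    if manual:
        return manual[0]
    return pool[0] if pool else None


if __name__ == "__main__":
    available = [
        {"code": "en", "name": "English", "is_auto_generated": True},
        {"code": "es", "name": "Spanish", "is_auto_generated": False},
    ]
    print(pick_transcript(available, language="en"))
    print(pick_transcript(available, language="en", include_auto_generated=False))
```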
Limitations
- Some videos have transcripts/captions disabled by the creator
- Age-restricted videos may not be accessible
- Private or deleted videos cannot be scraped
- Auto-generated captions may contain errors
- Translation quality depends on YouTube's translation engine