YouTube Transcript Scraper
Pricing
from $3.00 / 1,000 transcripts
Extract timestamped YouTube transcripts as JSON segments. Single video or batch of up to 50 IDs per run. Returns the transcript plus video metadata (title, channel, duration). Pay only for successful results; failed videos go to a separate errors dataset with machine-readable error codes.
Developer
apihq
Extract time-coded transcripts from any public YouTube video. No Google API key, no quota, no browser automation. Send one video ID or a batch of up to 50, get clean JSON segments with metadata back. Pay-per-result at $3 per 1,000 successful transcripts. Failed videos cost nothing.
Designed for developers wiring transcripts into AI training data, search indexing, content analysis, accessibility tools, and moderation pipelines.
Why this Actor
Three properties competing transcript scrapers usually skip:
- You pay only for successful extractions. Failed videos go to a separate dataset called `errors` and never trigger a charge.
- Every failure carries a machine-readable code (`NO_CAPTIONS`, `PLAYABILITY`, `EMPTY_TRANSCRIPT`, `VALIDATION_FAILED`, `DEADLINE_EXCEEDED`) so your downstream pipeline can branch on the error type without parsing English text.
- A 30-second deadline applies to every request. If the upstream is slow, you get a clean 504 and no charge, not a 60-minute hang that drains your budget.
Who uses this
- AI engineers feed transcripts into RAG pipelines so LLMs can answer questions about video content.
- Search teams index spoken words to surface results that title-and-description search misses.
- Podcast and creator tools pull quotes, generate chapters, and draft summaries from interview audio.
- Analytics and brand-safety teams run sentiment, topic, and keyword extraction across thousands of videos.
- Accessibility and edtech builders drop subtitle overlays into learning platforms and media players.
- Researchers build NLP corpora from lectures, conference talks, and long-form interviews.
Input example
A single video with metadata included:
{"videoId": "jNQXAC9IVRw","metadata": true}
A batch of three videos in a specific language:
{"videoIds": ["jNQXAC9IVRw","dQw4w9WgXcQ","kJQP7kiw5Fk"],"language": "en","metadata": true}
The Actor caps batch input at 50 video IDs per run. For larger workloads, launch parallel runs.
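For workloads above the 50-ID cap, the list has to be split into one input per run. The following is a minimal sketch of that split, not part of the Actor itself. The helper names `build_run_inputs` and `_as_input` are hypothetical, and the ID pattern (11 characters from the URL-safe base64 alphabet) is an assumption about what counts as a well-formed video ID.

```python
import re
from typing import Iterable, Iterator, List, Optional

MAX_IDS_PER_RUN = 50  # batch cap stated in this README

# Assumption: a video ID is 11 characters from the URL-safe base64 alphabet.
VIDEO_ID_RE = re.compile(r"[A-Za-z0-9_-]{11}")


def _as_input(batch: List[str], language: Optional[str], metadata: bool) -> dict:
    """Build one Actor input object using the field names from the examples above."""
    run_input = {"videoIds": list(batch), "metadata": metadata}
    if language is not None:
        run_input["language"] = language
    return run_input


def build_run_inputs(video_ids: Iterable[str],
                     language: Optional[str] = None,
                     metadata: bool = True) -> Iterator[dict]:
    """Yield one input per run, each holding at most 50 unique, well-formed IDs."""
    seen = set()
    batch: List[str] = []
    for vid in video_ids:
        if not VIDEO_ID_RE.fullmatch(vid):
            raise ValueError(f"not a valid video ID: {vid!r}")
        if vid in seen:
            continue  # skip duplicates so the same video is not requested twice
        seen.add(vid)
        batch.append(vid)
        if len(batch) == MAX_IDS_PER_RUN:
            yield _as_input(batch, language, metadata)
            batch = []
    if batch:
        yield _as_input(batch, language, metadata)
```

Each yielded dictionary can be passed as the input of a separate run, so 120 IDs would produce three inputs of 50, 50, and 20 IDs.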
Output example
A successful extraction in the default dataset (this is the actual response for "Me at the zoo", the first ever YouTube upload):
{
  "video_id": "jNQXAC9IVRw",
  "language": "en",
  "transcript": [
    {"text": "All right, so here we are, in front of the elephants", "start": 1.2, "duration": 2.16},
    {"text": "the cool thing about these guys is that they have really really really long trunks", "start": 5.318, "duration": 7.298},
    {"text": "and that's pretty much all there is to say", "start": 16.881, "duration": 2}
  ],
  "is_auto_generated": false,
  "available_tracks": [
    {"language_code": "en", "language_name": "English", "is_auto_generated": false},
    {"language_code": "de", "language_name": "German", "is_auto_generated": false}
  ],
  "metadata": {
    "title": "Me at the zoo",
    "author_name": "jawed",
    "author_url": "https://www.youtube.com/channel/UC4QobU6STFB0P71PMvOGN5A",
    "thumbnail_url": "https://i.ytimg.com/vi/jNQXAC9IVRw/hqdefault.jpg",
    "channel_id": "UC4QobU6STFB0P71PMvOGN5A",
    "duration_seconds": 19
  }
}
A failed extraction lands in the errors dataset:
{"video_id": "00000000000","code": "PLAYABILITY","error": "Video 00000000000 is not playable: This video is unavailable","request_id": "req_911f37d2e55644ff9d9e4a3f","status_code": 403}
Quote the request_id when filing a support issue so we can trace the exact request in our logs.
What you get
For each successfully extracted video, one record lands in the default dataset with this shape:
| Field | Type | What it is |
|---|---|---|
| `video_id` | string | The 11-character YouTube video ID. |
| `language` | string | Language code of the selected caption track. |
| `transcript` | array | Time-coded caption segments. Each has `text`, `start`, and `duration`. |
| `is_auto_generated` | boolean | True if the selected track is YouTube's ASR. |
| `available_tracks` | array | Every caption track the video exposes, with language code and name. |
| `metadata` | object | Video title, channel name, channel URL, thumbnail URL, channel ID, duration in seconds. Present when `metadata: true` (default). |
Failed videos do not pollute the default dataset. They go to a per-run dataset called errors with video_id, the machine-readable code, a human-readable error message, the backing service's request_id for support tickets, and the HTTP status_code.
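As an illustration of working with the record shape described above, here is a small sketch that turns the `transcript` segments into timestamped lines and totals the captioned time. The function names are hypothetical, and the code assumes `start` and `duration` are numbers in seconds, as in the output example.

```python
def format_timestamp(seconds: float) -> str:
    """Render a start offset in seconds as HH:MM:SS (fractions are truncated)."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def transcript_to_lines(record: dict) -> list:
    """Return one '[HH:MM:SS] text' line per caption segment."""
    return [
        f"[{format_timestamp(seg['start'])}] {seg['text']}"
        for seg in record["transcript"]
    ]


def captioned_seconds(record: dict) -> float:
    """Sum the duration of all segments (overlapping segments count twice)."""
    return sum(seg["duration"] for seg in record["transcript"])
```

Joining the result of `transcript_to_lines` with newlines gives a plain-text transcript suitable for indexing or prompting.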
How to use this Actor
- Open the Actor in the Apify Console.
- Set `videoId` (single) or `videoIds` (up to 50 per run). Optionally pass `language` (e.g. `en`, `de`, `en-US`) and `metadata` (defaults to true).
- Click Start. Successful transcripts land in the default dataset; failures land in a separate `errors` dataset.
The Actor is also callable from the Apify API and every official integration (Make, Zapier, n8n, Slack, webhooks). The API tab in the Console has ready-to-paste JavaScript, Python, and curl snippets.
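For orientation, a call through the Python `apify-client` package might look roughly like the sketch below. The `ACTOR_ID` value is an assumption and should be replaced with the identifier shown on the API tab. The sketch reads only the default dataset, because this README does not specify how to address the per-run `errors` dataset from the client.

```python
import os

# Assumption: placeholder identifier; copy the real one from the Console's API tab.
ACTOR_ID = "apihq/youtube-transcript-scraper"


def fetch_transcripts(client, run_input: dict) -> list:
    """Start a run, wait for it to finish, and return the default dataset items.

    `client` is expected to behave like apify_client.ApifyClient.
    """
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("the Actor run did not return run details")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def main() -> None:
    # Imported here so the helper above can be tested without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    items = fetch_transcripts(client, {"videoId": "jNQXAC9IVRw", "metadata": True})
    for item in items:
        print(item["video_id"], len(item["transcript"]), "segments")
```

Running `main()` requires an `APIFY_TOKEN` environment variable. Passing the client in as an argument keeps `fetch_transcripts` easy to exercise with a stub in unit tests.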
Pricing
Pay-per-result. Each successful transcript that lands in the default dataset costs $0.003, which works out to $3.00 per 1,000 transcripts. Failed videos cost nothing.
Apify subscriber tier discounts apply automatically. Customers on the Scale plan pay roughly $2.40 per 1,000, and Business plan customers pay roughly $1.80 per 1,000.
Platform compute is included in the per-event price. There is no separate compute bill or storage surprise.
You can cap the maximum charge of a single run from Apify's Run Limits panel. The Actor honors that cap. If your budget runs out mid-batch, the run stops cleanly and reports how many items completed.
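The pricing figures above can be turned into a quick estimate before launching a large job. This is a sketch only: the Scale and Business rates are the approximate values quoted above, the plan keys are hypothetical labels, and actual billing is determined by Apify.

```python
from decimal import Decimal

# Price per 1,000 successful transcripts, taken from the figures in this section.
# The "scale" and "business" values are the approximate rates quoted above.
PRICE_PER_1000 = {
    "default": Decimal("3.00"),
    "scale": Decimal("2.40"),
    "business": Decimal("1.80"),
}


def estimate_cost(successful: int, plan: str = "default") -> Decimal:
    """Estimated charge in USD for a number of successful transcripts."""
    return (PRICE_PER_1000[plan] * successful / 1000).quantize(Decimal("0.001"))


def max_transcripts_for_budget(budget_usd: str, plan: str = "default") -> int:
    """How many successful transcripts fit under a maximum-charge cap."""
    return int(Decimal(budget_usd) * 1000 // PRICE_PER_1000[plan])
```

For example, `estimate_cost(2500)` evaluates to 7.500 USD at the default rate, and a 5 USD cap covers 1,666 successful transcripts. Failed videos are excluded from both numbers because they are not billed.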
Reliability
Three guarantees worth knowing about before you wire this into a production pipeline:
Hard 30-second deadline per video. If YouTube or the proxy network slows down, you get a 504 with code: DEADLINE_EXCEEDED for that video. You never wait minutes for a single result.
Empty transcripts are treated as failures. If YouTube returns a caption track that parses to zero segments, the record goes to the errors dataset and is not billed. You never pay for a blank transcript.
Stable error taxonomy. Every failure has a code your code can branch on. The codes are stable across releases. Common codes you might see:
- `NO_CAPTIONS`: the video has no caption tracks at all.
- `PLAYABILITY`: the video is unavailable, deleted, or geo-blocked.
- `EMPTY_TRANSCRIPT`: tracks exist but parsed to zero usable segments.
- `VALIDATION_FAILED`: the video ID format was wrong.
- `DEADLINE_EXCEEDED`: the request hit the 30-second deadline.
- `POTOKEN_REQUIRED`: YouTube has enabled PoToken enforcement for this track.
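A downstream pipeline could branch on these codes along the following lines. The mapping from code to action is an illustrative assumption, not guidance from the Actor: here only `DEADLINE_EXCEEDED` is treated as worth retrying, and any code not listed (including `POTOKEN_REQUIRED`) is set aside for manual review.

```python
from collections import defaultdict

# Assumption: an example policy; adjust it to your own pipeline.
ACTION_BY_CODE = {
    "DEADLINE_EXCEEDED": "retry",      # transient slowness, may succeed later
    "NO_CAPTIONS": "skip",             # nothing to extract
    "PLAYABILITY": "skip",             # unavailable, deleted, or geo-blocked
    "EMPTY_TRANSCRIPT": "skip",        # tracks exist but hold no segments
    "VALIDATION_FAILED": "fix_input",  # malformed video ID on our side
}


def triage(error_records) -> dict:
    """Group video IDs from the errors dataset by the action to take next."""
    buckets = defaultdict(list)
    for record in error_records:
        action = ACTION_BY_CODE.get(record.get("code"), "review")
        buckets[action].append(record["video_id"])
    return dict(buckets)
```

The IDs in the `retry` bucket can be resubmitted in a later run, while the `review` bucket catches codes the policy does not recognise.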
Every error also carries a request_id that you can quote when filing a support issue.
FAQ
Do I need a YouTube Data API key?
No. The Actor calls YouTube's public InnerTube endpoint directly and does not consume Google API quota.
Which videos work?
Public videos with caption tracks. Manual captions and auto-generated (ASR) captions both work. Private, age-gated, or members-only videos do not, since the Actor only reads public data.
What if a video has no captions?
The video goes to the errors dataset with code: NO_CAPTIONS. You are not charged.
Which languages are supported?
Every language the video creator uploaded. If you pass a language code (such as en, de, or en-US), the Actor picks the best matching track. If you omit language, the Actor prefers manual English first, then auto-generated English, then the first available track.
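The fallback order described above can be sketched against the `available_tracks` array. This is not the Actor's code. The default order (manual English, auto-generated English, first track) follows the answer above, while the matching rule for an explicit `language` (exact code first, then same base language, manual before auto-generated, otherwise no track) is an assumption, since "best matching" is not defined here.

```python
from typing import Optional


def _base(code: str) -> str:
    """Reduce a code such as 'en-US' to its base language, 'en'."""
    return code.lower().split("-")[0]


def pick_track(tracks: list, language: Optional[str] = None) -> Optional[dict]:
    """Choose a caption track from a list shaped like `available_tracks`."""
    if not tracks:
        return None
    if language is None:
        # Documented default order: manual English, ASR English, first track.
        rules = [
            lambda t: _base(t["language_code"]) == "en" and not t["is_auto_generated"],
            lambda t: _base(t["language_code"]) == "en",
            lambda t: True,
        ]
    else:
        wanted = language.lower()
        # Assumed matching order for an explicit language code.
        rules = [
            lambda t: t["language_code"].lower() == wanted and not t["is_auto_generated"],
            lambda t: t["language_code"].lower() == wanted,
            lambda t: _base(t["language_code"]) == _base(wanted) and not t["is_auto_generated"],
            lambda t: _base(t["language_code"]) == _base(wanted),
        ]
    for rule in rules:
        match = next((t for t in tracks if rule(t)), None)
        if match is not None:
            return match
    return None
```

A helper like this can be useful for checking, from a previous record's `available_tracks`, which track a given `language` value is likely to select before starting a bulk run.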
Can I scrape a whole channel or playlist?
Not with this Actor. This one takes single video IDs or short batches. A separate Actor for full channel and playlist transcript extraction is on the roadmap. Watch this Actor's changelog for updates.
Can I use this Actor from my own code?
Yes. Use the Apify API or one of the official SDKs (Node.js: apify-client, Python: apify-client). The Apify Console shows ready-to-paste code samples on the API tab.
Why do failures go to a separate dataset?
Apify's pay-per-event billing fires on writes to the default dataset. Putting failures in a named dataset means the Actor cannot accidentally bill you for a failure. You still see exactly what failed and why, with a request_id for support correlation.
What happens if YouTube rate-limits the request?
The request retries up to two times with exponential backoff. If it still cannot fetch the transcript within the 30-second deadline, the video goes to the errors dataset with code: DEADLINE_EXCEEDED. You are not charged.
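To make the retry behaviour concrete, here is a generic sketch of "up to two retries with exponential backoff under an overall deadline". It is not the Actor's implementation: `RateLimited`, `call_with_retries`, and the 0.5-second base delay are hypothetical, and the deadline is only checked between attempts, so the sketch does not interrupt a request already in flight.

```python
import time


class RateLimited(Exception):
    """Hypothetical error raised by `fetch` when the upstream throttles us."""


def call_with_retries(fetch, max_retries=2, base_delay=0.5, deadline_s=30.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Call `fetch`, retrying on RateLimited with delays of 0.5 s, 1 s, ...

    Raises TimeoutError if waiting for the next retry would pass the deadline,
    and re-raises RateLimited once the retries are used up.
    """
    start = clock()
    attempt = 0
    while True:
        try:
            return fetch()
        except RateLimited as exc:
            if attempt >= max_retries:
                raise
            delay = base_delay * (2 ** attempt)  # exponential backoff
            if clock() - start + delay >= deadline_s:
                raise TimeoutError("DEADLINE_EXCEEDED") from exc
            sleep(delay)
            attempt += 1
```

With `max_retries=2` the function makes at most three attempts in total. The `sleep` and `clock` parameters exist only so the timing can be replaced in tests.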
Roadmap
These features are planned for future Actors in the same family:
- Bulk channel transcript extractor: paste a channel URL, get every transcript in the channel.
- Bulk playlist transcript extractor: paste a playlist URL, get every transcript on the playlist.
- Single-video metadata Actor: title, channel, duration, and thumbnail only, priced cheaper than the full transcript.
- Caption track listing Actor: list available tracks without downloading them, useful for filtering before bulk runs.
Each will be a separate Actor, listed under the same publisher.
Found a bug or want a feature?
Open an issue on this Actor's Issues tab and include the request_id from any error record you saw. We respond within one business day.