YouTube Transcript Scraper
Pricing
from $2.50 / 1,000 results
Extract YouTube captions, timestamps, SRT, VTT, and plain text from public videos in bulk without browser automation.
Developer
太郎 山田
2 days ago
Last modified
YouTube Transcript Bulk API
Extract transcripts from public YouTube videos in bulk. The actor is built for AI pipelines, content repurposing, subtitle export, research, and searchable video archives.
What It Does
You provide YouTube video URLs or direct video IDs. The actor fetches the public YouTube watch page, reads available caption tracks, selects the best matching language, downloads the timed transcript XML, and returns one dataset row per video.
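The track selection step can be sketched as follows. This is a minimal illustration, not the actor's actual code: the track shape ({ languageCode, kind }), the convention that kind equal to 'asr' marks an auto-generated track, and the fallback to another language when nothing matches are all assumptions.

```javascript
// Hypothetical track shape: { languageCode: 'en', kind: 'asr' | undefined }.
// Treating kind === 'asr' as "auto-generated" is an assumption for this sketch.
function pickCaptionTrack(tracks, language, includeAutoGenerated) {
  const isAuto = (track) => track.kind === 'asr';
  // A track matches if it is the exact language or a regional variant (en-GB for en).
  const matches = (track) =>
    track.languageCode === language || track.languageCode.startsWith(`${language}-`);

  // Drop auto-generated tracks unless the caller allows them.
  const candidates = tracks.filter((t) => includeAutoGenerated || !isAuto(t));

  // Lower score wins: matching language first, manual before auto-generated.
  const score = (t) => (matches(t) ? 0 : 2) + (isAuto(t) ? 1 : 0);
  const sorted = [...candidates].sort((a, b) => score(a) - score(b));
  return sorted[0] ?? null;
}
```

With this ranking, a manual track in the preferred language beats an auto-generated one, and null is returned when no usable track remains.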
The launch implementation is HTTP-first and does not use browser automation. That keeps Apify hosting cost low and makes the pricing predictable.
Input
- videoUrls: YouTube watch, Shorts, embed, live, or youtu.be URLs.
- videoIds: Direct 11-character YouTube video IDs.
- language: Preferred caption language, such as en or ja.
- includeAutoGenerated: Allows auto-generated captions when manual captions are not available.
- translationLanguage: Optional YouTube transcript translation target.
- outputFormat: json, text, srt, or vtt.
- maxVideos: Maximum videos to process.
- dryRun: Validate input and emit preview rows without fetching YouTube.
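To illustrate how the URL forms accepted by videoUrls could be reduced to the 11-character IDs accepted by videoIds, here is a minimal sketch. The function name extractVideoId and the list of accepted hosts are assumptions for illustration. The actor's real normalization may differ.

```javascript
// An 11-character ID made of letters, digits, underscore, and hyphen.
const VIDEO_ID = /^[A-Za-z0-9_-]{11}$/;

// Hypothetical helper: returns the video ID, or null if none can be found.
function extractVideoId(input) {
  const value = String(input).trim();
  if (VIDEO_ID.test(value)) return value; // already a bare ID

  let url;
  try {
    url = new URL(value);
  } catch {
    return null; // neither a bare ID nor a parseable URL
  }

  const host = url.hostname.replace(/^(www\.|m\.)/, '');
  const parts = url.pathname.split('/').filter(Boolean);
  let candidate = null;

  if (host === 'youtu.be') {
    candidate = parts[0]; // https://youtu.be/<id>
  } else if (host === 'youtube.com' || host === 'youtube-nocookie.com') {
    if (parts[0] === 'watch') {
      candidate = url.searchParams.get('v'); // /watch?v=<id>
    } else if (['shorts', 'embed', 'live'].includes(parts[0])) {
      candidate = parts[1]; // /shorts/<id>, /embed/<id>, /live/<id>
    }
  }
  return candidate && VIDEO_ID.test(candidate) ? candidate : null;
}
```

Inputs that do not resolve to a valid ID return null, so a caller can report them as invalid input rather than request them from YouTube.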
Output
Each video produces one row:
- videoId
- videoUrl
- status
- language
- sourceLanguage
- isAutoGenerated
- segmentCount
- fullText
- segments
- formattedTranscript
- errorCode
- errorMessage
- scrapedAt
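As an illustration of how segments could be turned into formattedTranscript for the srt output format, here is a minimal sketch. The segment shape ({ start, duration, text }, with times in seconds) is an assumption, since the README does not document it.

```javascript
// Assumed segment shape: { start: seconds, duration: seconds, text: string }.
function toSrtTime(seconds) {
  const totalMs = Math.max(0, Math.round(seconds * 1000));
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  const pad = (n, width) => String(n).padStart(width, '0');
  // SRT separates milliseconds with a comma (WebVTT uses a dot instead).
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

function segmentsToSrt(segments) {
  // Each cue: 1-based index, time range, text; cues are separated by a blank line.
  const cues = segments.map((seg, i) => {
    const end = seg.start + seg.duration;
    return `${i + 1}\n${toSrtTime(seg.start)} --> ${toSrtTime(end)}\n${seg.text}`;
  });
  return `${cues.join('\n\n')}\n`;
}
```

Converting to whole milliseconds before splitting into hours, minutes, and seconds avoids rounding a value such as 59.9996 seconds into an invalid 60-second field.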
Unavailable captions, deleted videos, private videos, and request failures are returned as error rows instead of failing the full run. This follows Apify pay-per-event (PPE) best practice, because the actor still performed work for that input.
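The error-row behavior can be sketched as a per-video try/catch that always returns a row. This is an illustration under stated assumptions: fetchTranscript is a hypothetical caller-supplied function, and the status values ('success', 'error') and the fallback error code REQUEST_FAILED are placeholders, not the actor's documented values.

```javascript
// fetchTranscript is a hypothetical async function: (videoId) => transcript fields.
// Status strings and error codes below are illustrative placeholders.
async function processVideo(videoId, fetchTranscript) {
  const base = {
    videoId,
    videoUrl: `https://www.youtube.com/watch?v=${videoId}`,
    scrapedAt: new Date().toISOString(),
  };
  try {
    const transcript = await fetchTranscript(videoId);
    return { ...base, status: 'success', ...transcript, errorCode: null, errorMessage: null };
  } catch (err) {
    // A failure for one video becomes a row; it never aborts the run.
    return {
      ...base,
      status: 'error',
      errorCode: err?.code ?? 'REQUEST_FAILED',
      errorMessage: String(err?.message ?? err),
    };
  }
}

async function processAll(videoIds, fetchTranscript) {
  const rows = [];
  for (const id of videoIds) {
    rows.push(await processVideo(id, fetchTranscript));
  }
  return rows; // exactly one row per input video
}
```

Because every input yields a row, the number of dataset items always equals the number of videos processed, which keeps per-row billing easy to reason about.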
Pricing
Recommended PPE launch target:
- apify-actor-start: keep Apify default $0.00005.
- apify-default-dataset-item: $0.0025 per transcript row.
- Optional future enriched/translated event: $0.008 per enriched row.
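As a worked example of these event prices, the sketch below estimates what a user is charged for one run. It assumes the start event is charged once per run and that every row, including error rows, is billed as a dataset item. The function name is hypothetical.

```javascript
// Event prices in USD, taken from the list above.
const START_EVENT_USD = 0.00005; // apify-actor-start, assumed to be charged once per run
const DATASET_ITEM_USD = 0.0025; // apify-default-dataset-item, charged per transcript row

// Hypothetical helper: estimated charge for a run that emits rowCount rows.
function estimateRunCostUsd(rowCount) {
  return START_EVENT_USD + rowCount * DATASET_ITEM_USD;
}

// 1,000 rows come to about $2.50, which matches the advertised
// "from $2.50 / 1,000 results" once the start event is rounded away.
console.log(estimateRunCostUsd(1000).toFixed(2));
```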
The current cost model assumes HTTP requests, no browser, and no residential proxy. Publication should remain blocked if live cost probes show that residential proxy is required.
Limits
- Only public videos with public caption tracks are supported.
- Age-restricted, private, deleted, or captionless videos return an error row.
- YouTube may change its watch page payload shape. The canary should run daily against a known captioned video.
- Channel and playlist expansion is intentionally not part of v1. Add it only after transcript extraction shows a 30-day revenue signal.
Local Run
npm test
npm start
The default input.json uses dryRun: true so local startup does not depend on live YouTube access.