🎥 YouTube Transcript Scraper

Pricing

$3.00 / 1,000 results

Extract YouTube transcript data: caption tracks, languages, video metadata, and more. Scrape by video URL or ID. Export to JSON, CSV & Excel, use the API, schedule runs and integrate. No code required.

Rating

0.0

(0)

Developer

Jackie Chen

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 2 days ago

YouTube Transcript & Subtitle Scraper

youtube-transcript-scraper

Scrape YouTube transcripts and subtitles for any public video. Give one or more video IDs or URLs and get every available caption track — language, whether it is auto-generated (ASR) or human-authored, whether it is translatable, and the downloadable timedtext URL — together with the video's title, channel, length and view count.

Unofficial. This Actor is not affiliated with, authorized, or endorsed by YouTube or Google LLC. It is an independent tool that retrieves publicly available data via a third-party API. Use it in compliance with YouTube's Terms of Service and all applicable laws; you are responsible for how you use the retrieved data.

What it does

  • Caption discovery — for each video, lists all caption tracks YouTube exposes (e.g. English, Spanish, auto-generated English), with the language code, a human-readable name, the kind (asr = auto-generated), isTranslatable, and the transcriptUrl (a YouTube timedtext URL).
  • Video metadata — every item also carries the parent video's videoTitle, channel, channelId, lengthSeconds, viewCount and shortDescription.
  • Filtering — keep only certain languages, or only auto-generated tracks.
  • Transcript text (best effort) — optionally tries to download and flatten the caption file into plain text. See the note below.

Input

Field                Type      Default          Description
videoIds             string[]  ["dQw4w9WgXcQ"]  Video IDs or full watch / youtu.be / shorts URLs.
languageCodes        string[]  []               Keep only tracks whose language code matches (e.g. en, es). Empty = all.
autoGeneratedOnly    boolean   false            Keep only ASR (auto-generated) tracks.
fetchTranscriptText  boolean   false            Attempt to download the transcript text (best effort, see note).
maxItems             integer   50               Max total caption tracks across all videos.
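
Since videoIds accepts bare IDs as well as watch, youtu.be and shorts URLs, it can help to normalize a mixed list before submitting it. The following is a minimal sketch of one way to do that, not the Actor's own parsing logic. The helper names extract_video_id and normalize_video_ids are hypothetical, and the 11-character ID pattern is an assumption based on how YouTube IDs commonly look.

```python
import re
from typing import Iterable, List, Optional
from urllib.parse import parse_qs, urlparse

# Assumption: video IDs are 11 characters from [A-Za-z0-9_-].
_ID_RE = re.compile(r"^[A-Za-z0-9_-]{11}$")


def extract_video_id(value: str) -> Optional[str]:
    """Return the video ID from a bare ID or a watch / youtu.be / shorts URL."""
    value = value.strip()
    if _ID_RE.match(value):
        return value

    # Tolerate URLs written without a scheme, e.g. "youtu.be/<id>".
    parsed = urlparse(value if "://" in value else "https://" + value)
    host = (parsed.hostname or "").lower()
    for prefix in ("www.", "m."):
        if host.startswith(prefix):
            host = host[len(prefix):]

    parts = [p for p in parsed.path.split("/") if p]
    candidate = None
    if host == "youtu.be" and parts:
        candidate = parts[0]
    elif host == "youtube.com":
        if parts and parts[0] == "watch":
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        elif len(parts) >= 2 and parts[0] == "shorts":
            candidate = parts[1]

    return candidate if candidate and _ID_RE.match(candidate) else None


def normalize_video_ids(values: Iterable[str]) -> List[str]:
    """Extract IDs, drop unparseable entries, and de-duplicate in input order."""
    seen = {}
    for value in values:
        video_id = extract_video_id(value)
        if video_id is not None:
            seen.setdefault(video_id, None)
    return list(seen)


if __name__ == "__main__":
    mixed = [
        "dQw4w9WgXcQ",
        "https://www.youtube.com/watch?v=jNQXAC9IVRw",
        "https://youtu.be/dQw4w9WgXcQ",
    ]
    print(normalize_video_ids(mixed))
```

Entries that cannot be parsed are dropped silently in this sketch; a stricter variant could raise an error instead.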

Example input

{
  "videoIds": ["dQw4w9WgXcQ", "https://www.youtube.com/watch?v=jNQXAC9IVRw"],
  "languageCodes": ["en"],
  "autoGeneratedOnly": false,
  "fetchTranscriptText": true,
  "maxItems": 100
}

Output

One dataset item per caption track:

{
  "videoId": "dQw4w9WgXcQ",
  "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "videoTitle": "Rick Astley - Never Gonna Give You Up (Official Video) (4K Remaster)",
  "channel": "Rick Astley",
  "channelId": "UCuAXFkgsw1L7xaCfnd5JJOw",
  "lengthSeconds": 213,
  "viewCount": 1779355962,
  "languageCode": "en",
  "language": "English",
  "kind": "asr",
  "isAutoGenerated": true,
  "isTranslatable": true,
  "vssId": ".en",
  "transcriptUrl": "https://www.youtube.com/api/timedtext?v=dQw4w9WgXcQ&...",
  "source": "video:dQw4w9WgXcQ"
}
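
Items of this shape can also be filtered after export, for example when a run was made without languageCodes and only some tracks are needed later. The sketch below mirrors the languageCodes, autoGeneratedOnly and maxItems inputs on the client side. The function filter_tracks is a hypothetical helper, and the exact, case-insensitive language match is an assumption; the Actor's own matching rules may differ.

```python
from typing import Dict, Iterable, List, Optional


def filter_tracks(
    items: Iterable[Dict],
    language_codes: Iterable[str] = (),
    auto_generated_only: bool = False,
    max_items: Optional[int] = None,
) -> List[Dict]:
    """Keep caption-track items that match the given filters."""
    wanted = {code.lower() for code in language_codes}
    kept: List[Dict] = []
    for item in items:
        # Stop once the cap is reached (None means no cap).
        if max_items is not None and len(kept) >= max_items:
            break
        # An empty language set keeps every language.
        if wanted and str(item.get("languageCode", "")).lower() not in wanted:
            continue
        # "asr" marks an auto-generated track in the "kind" field.
        if auto_generated_only and item.get("kind") != "asr":
            continue
        kept.append(item)
    return kept


if __name__ == "__main__":
    # Illustrative items; only the fields used by the filter are included.
    sample = [
        {"videoId": "dQw4w9WgXcQ", "languageCode": "en", "kind": "asr"},
        {"videoId": "dQw4w9WgXcQ", "languageCode": "es", "kind": ""},
        {"videoId": "dQw4w9WgXcQ", "languageCode": "en", "kind": ""},
    ]
    for track in filter_tracks(sample, language_codes=["en"]):
        print(track["languageCode"], track["kind"] or "manual")
```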

Notes

  • Transcript text is best-effort. YouTube signs each timedtext URL against the IP that requested it, so a server-side download frequently returns an error. When fetchTranscriptText is enabled the Actor still tries, but transcriptText may come back empty. The transcriptUrl is always provided so you can fetch the caption file yourself (append &fmt=json3, &fmt=srv3, or &fmt=vtt) from the appropriate IP.
  • Data is sourced live; YouTube / the upstream edge occasionally rate-limits, so the Actor retries transient blocks with exponential backoff.
  • Video IDs are de-duplicated within a run.
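
For fetching the caption file yourself, the following sketch shows how a format parameter might be appended to a transcriptUrl and how a json3 payload might be flattened to plain text. It is an illustration under stated assumptions, not the Actor's implementation. The helpers with_format and flatten_json3 are hypothetical, the example URL is a placeholder, and the json3 layout assumed here (an "events" list whose entries carry "segs" with "utf8" strings) reflects commonly observed responses rather than a documented contract.

```python
import json


def with_format(transcript_url: str, fmt: str = "json3") -> str:
    """Append a fmt parameter (json3, srv3 or vtt) to a timedtext URL.

    The URL is left otherwise untouched so its existing query parameters
    are passed through unchanged.
    """
    separator = "&" if "?" in transcript_url else "?"
    return f"{transcript_url}{separator}fmt={fmt}"


def flatten_json3(payload: str) -> str:
    """Join the text segments of a json3 caption payload into one string.

    Assumed layout: {"events": [{"segs": [{"utf8": "..."}]}]}.
    Events without segments are skipped.
    """
    data = json.loads(payload)
    chunks = []
    for event in data.get("events", []):
        text = "".join(seg.get("utf8", "") for seg in event.get("segs", []))
        if text.strip():
            chunks.append(text)
    # Collapse newlines and repeated spaces into single spaces.
    return " ".join(" ".join(chunks).split())


if __name__ == "__main__":
    # Placeholder URL; a real transcriptUrl carries additional parameters.
    url = "https://www.youtube.com/api/timedtext?v=VIDEO_ID&lang=en"
    print(with_format(url, "vtt"))

    # Hand-written payload in the assumed json3 layout.
    payload = json.dumps({
        "events": [
            {"tStartMs": 0, "segs": [{"utf8": "first line"}]},
            {"tStartMs": 900, "segs": [{"utf8": "\n"}]},
            {"tStartMs": 1200, "segs": [{"utf8": "second"}, {"utf8": " line"}]},
        ]
    })
    print(flatten_json3(payload))
```

The download step itself is left out on purpose: as noted above, whether a request succeeds depends on the IP it is made from.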