Pricing

$10.00 / 1,000 results

🎥 YouTube Transcript Scraper

Extract YouTube transcript data: caption tracks, languages, timedtext URLs, and video metadata. Scrape by video URL or ID. Export to JSON, CSV & Excel, use the API, schedule runs and integrate. No code required.

Developer

Jackie Chen

Last modified

16 days ago

YouTube Transcript & Subtitle Scraper

youtube-transcript-scraper

Scrape YouTube transcripts and subtitles for any public video. Give one or more video IDs or URLs and get every available caption track — language, whether it is auto-generated (ASR) or human-authored, whether it is translatable, and the downloadable timedtext URL — together with the video's title, channel, length and view count.

Unofficial. This Actor is not affiliated with, authorized, or endorsed by YouTube or Google LLC. It is an independent tool that retrieves publicly available data via a third-party API. Use it in compliance with YouTube's Terms of Service and all applicable laws; you are responsible for how you use the retrieved data.

What it does

Caption discovery — for each video, lists all caption tracks YouTube exposes (e.g. English, Spanish, auto-generated English), with the language code, a human-readable name, the kind (asr = auto-generated), isTranslatable, and the transcriptUrl (a YouTube timedtext URL).
Video metadata — every item also carries the parent video's videoTitle, channel, channelId, lengthSeconds, viewCount and shortDescription.
Filtering — keep only certain languages, or only auto-generated tracks.
Transcript text (best effort) — optionally tries to download and flatten the caption file into plain text. See the note below.

Input

Field	Type	Default	Description
`videoIds`	string[]	`["dQw4w9WgXcQ"]`	Video IDs or full watch / `youtu.be` / `shorts` URLs.
`languageCodes`	string[]	`[]`	Keep only tracks whose language code matches (e.g. `en`, `es`). Empty = all.
`autoGeneratedOnly`	boolean	`false`	Keep only ASR (auto-generated) tracks.
`fetchTranscriptText`	boolean	`false`	Attempt to download the transcript text (best effort, see note).
`maxItems`	integer	`50`	Max total caption tracks across all videos.
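
The filter fields can be read as a simple predicate over caption tracks. The sketch below is an illustration of that reading, not the Actor's actual code: the function name `filter_tracks` is hypothetical, and it assumes that "matches" means a case-insensitive exact match on the language code.

```python
def filter_tracks(tracks, language_codes=(), auto_generated_only=False, max_items=50):
    """Apply languageCodes, autoGeneratedOnly and maxItems to a list of track dicts.

    Assumption: a language code "matches" when it is equal, ignoring case.
    """
    wanted = {code.lower() for code in language_codes}
    kept = []
    for track in tracks:
        # An empty languageCodes list keeps every language.
        if wanted and track.get("languageCode", "").lower() not in wanted:
            continue
        # kind == "asr" marks an auto-generated track.
        if auto_generated_only and track.get("kind") != "asr":
            continue
        kept.append(track)
    # maxItems caps the total number of tracks returned.
    return kept[:max_items]


sample_tracks = [
    {"languageCode": "en", "kind": "asr"},
    {"languageCode": "en", "kind": ""},
    {"languageCode": "es", "kind": ""},
]
print(filter_tracks(sample_tracks, ["en"], auto_generated_only=True))
```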

Example input

{
  "videoIds": ["dQw4w9WgXcQ", "https://www.youtube.com/watch?v=jNQXAC9IVRw"],
  "languageCodes": ["en"],
  "autoGeneratedOnly": false,
  "fetchTranscriptText": true,
  "maxItems": 100
}
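
Because `videoIds` accepts bare IDs as well as watch, `youtu.be` and `shorts` URLs, it can help to normalize a mixed list before submitting it. The following is a minimal sketch under stated assumptions: `extract_video_id` and `normalize_video_ids` are hypothetical helpers, and the Actor's own parsing may differ.

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(value):
    """Return the video ID from a bare ID or a watch / youtu.be / shorts URL."""
    value = value.strip()
    if "://" not in value:
        return value  # assumed to be a bare video ID already
    parsed = urlparse(value)
    host = parsed.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    parts = [part for part in parsed.path.split("/") if part]
    if host == "youtu.be" and parts:
        return parts[0]
    if len(parts) > 1 and parts[0] == "shorts":
        return parts[1]
    query_ids = parse_qs(parsed.query).get("v")
    if query_ids:
        return query_ids[0]
    raise ValueError(f"No video ID found in {value!r}")


def normalize_video_ids(values):
    """Extract IDs and drop duplicates while keeping the first-seen order."""
    return list(dict.fromkeys(extract_video_id(value) for value in values))


print(normalize_video_ids([
    "dQw4w9WgXcQ",
    "https://www.youtube.com/watch?v=jNQXAC9IVRw",
    "https://youtu.be/dQw4w9WgXcQ",
]))
```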

Output

One dataset item per caption track:

{
  "videoId": "dQw4w9WgXcQ",
  "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "videoTitle": "Rick Astley - Never Gonna Give You Up (Official Video) (4K Remaster)",
  "channel": "Rick Astley",
  "channelId": "UCuAXFkgsw1L7xaCfnd5JJOw",
  "lengthSeconds": 213,
  "viewCount": 1779355962,
  "languageCode": "en",
  "language": "English",
  "kind": "asr",
  "isAutoGenerated": true,
  "isTranslatable": true,
  "vssId": ".en",
  "transcriptUrl": "https://www.youtube.com/api/timedtext?v=dQw4w9WgXcQ&...",
  "source": "video:dQw4w9WgXcQ"
}
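
Since the dataset holds one item per caption track, a consumer that wants a single track per video has to choose one. The sketch below shows one possible policy, preferring a human-authored track over an auto-generated one in a given language. The helper name `pick_preferred_tracks` and the preference rule are assumptions for illustration; only the field names come from the output above.

```python
def pick_preferred_tracks(items, language="en"):
    """Map each videoId to one track in `language`, preferring human-authored ones."""
    best = {}
    for item in items:
        if item.get("languageCode") != language:
            continue
        current = best.get(item["videoId"])
        # Keep the first track seen, but replace an ASR track with a human-authored one.
        if current is None or (current.get("isAutoGenerated") and not item.get("isAutoGenerated")):
            best[item["videoId"]] = item
    return best


sample_items = [
    {"videoId": "a", "languageCode": "en", "isAutoGenerated": True},
    {"videoId": "a", "languageCode": "en", "isAutoGenerated": False},
    {"videoId": "b", "languageCode": "en", "isAutoGenerated": True},
    {"videoId": "b", "languageCode": "es", "isAutoGenerated": False},
]
for video_id, track in pick_preferred_tracks(sample_items).items():
    print(video_id, "auto-generated" if track["isAutoGenerated"] else "human-authored")
```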

Notes

Transcript text is best-effort. YouTube signs each timedtext URL against the IP that requested it, so a server-side download frequently returns an error. When fetchTranscriptText is enabled the Actor still tries, but transcriptText may come back empty. The transcriptUrl is always provided so you can fetch the caption file yourself (append &fmt=json3, &fmt=srv3, or &fmt=vtt) from the appropriate IP.
Data is sourced live; YouTube / the upstream edge occasionally rate-limits, so the Actor retries transient blocks with exponential backoff.
Video IDs are de-duplicated within a run.
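
If you fetch the caption file yourself, the steps are to append a format parameter to transcriptUrl and flatten the response. The sketch below covers both without making a network call. The helper names are hypothetical, and the json3 layout it assumes (an `events` list whose entries hold `segs` with `utf8` text) reflects the commonly observed shape of that format rather than anything this README specifies, so verify it against a real response.

```python
def with_format(transcript_url, fmt="json3"):
    """Append the fmt parameter (json3, srv3 or vtt) to a timedtext URL."""
    separator = "&" if "?" in transcript_url else "?"
    return f"{transcript_url}{separator}fmt={fmt}"


def flatten_json3(payload):
    """Join the text of all caption events into one plain-text string.

    Assumed layout: {"events": [{"segs": [{"utf8": "..."}]}]}.
    """
    pieces = []
    for event in payload.get("events", []):
        text = "".join(seg.get("utf8", "") for seg in event.get("segs", []))
        text = " ".join(text.split())  # collapse newlines and repeated spaces
        if text:
            pieces.append(text)
    return " ".join(pieces)


sample_payload = {
    "events": [
        {"tStartMs": 0, "dDurationMs": 1200, "segs": [{"utf8": "hello"}, {"utf8": " world"}]},
        {"tStartMs": 1200, "segs": [{"utf8": "\n"}]},
        {"tStartMs": 1300, "dDurationMs": 900, "segs": [{"utf8": "this is\na test"}]},
    ]
}
print(with_format("https://www.youtube.com/api/timedtext?v=VIDEO_ID"))
print(flatten_json3(sample_payload))
```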

Quick start

Open the Actor and press Run — the default input works out of the box.
Adjust the input fields above to your target (video IDs or URLs) and set maxItems to cap spend.
Grab results from the Dataset tab as JSON / CSV / Excel, or pull them via the Apify API and MCP from your own code.

No proxies to configure, no cookies to paste, no login — the Actor handles everything server-side.
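
For pulling results from your own code, one option is the Apify REST API's synchronous run endpoint, which starts a run and returns its dataset items in a single call. The sketch below uses only the standard library. The Actor ID `ethereal_wool~youtube-transcript-scraper` is an assumption pieced together from the slug and the account URL on this page, and the token is read from an `APIFY_TOKEN` environment variable chosen here for illustration. The synchronous endpoint suits short runs; longer jobs are better started asynchronously.

```python
import json
import os
import urllib.request

API_BASE = "https://api.apify.com/v2"
# Assumed Actor ID (username~actor-name); check the API tab for the real value.
ACTOR_ID = "ethereal_wool~youtube-transcript-scraper"


def build_run_request(token, run_input):
    """Build a POST request that runs the Actor and returns its dataset items."""
    url = f"{API_BASE}/acts/{ACTOR_ID}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(token, run_input, timeout=300):
    """Send the request and decode the JSON array of caption-track items."""
    request = build_run_request(token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    api_token = os.environ.get("APIFY_TOKEN")
    if api_token:  # only call the API when a token is configured
        tracks = run_actor(api_token, {"videoIds": ["dQw4w9WgXcQ"], "maxItems": 10})
        for track in tracks:
            print(track["videoId"], track["languageCode"], track["kind"])
```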

Why developers pick this transcript scraper

Transcript actors are the picks-and-shovels of the AI boom, and many quietly fail on half their runs. This Actor fetches YouTube caption tracks via a direct HTTP API and is billed per result at the rate shown under Pricing. Each item carries the track's transcriptUrl and, when fetchTranscriptText succeeds, a plain-text transcriptText field. It's built for piping into LLMs: no HTML to clean and no browser.

What people build with it

RAG knowledge bases — index transcripts of conference talks, tutorials and reviews so your assistant can cite video content like documents.
Content repurposing — turn long-form videos into newsletters, blog posts and social threads with one LLM step on top of the transcript.
Competitor channel analysis — what topics, hooks and phrases do the top channels in your niche actually use? Transcripts answer at scale.
Compliance & moderation — audit what's being said in sponsored or branded videos without watching hours of footage.
Subtitle workflows — timestamped segments drop straight into translation and dubbing pipelines.
Research corpora — build searchable text datasets from playlists or whole channels.

Tips for better results

Works with standard video URLs, Shorts URLs, or bare video IDs.
Combine with YouTube Search or YouTube Channel Videos to discover videos first, then fetch their transcripts in bulk. This two-actor pipeline turns any topic into a text corpus.
Each segment of a downloaded caption file carries a start time and a duration, so you can deep-link to the exact second a phrase is spoken (youtu.be/ID?t=123).
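
The deep-link tip can be sketched as follows. The segment shape used here (dicts with `start` in seconds and `text`) is an assumption for illustration, and `deep_link` and `first_mention` are hypothetical helper names. Adapt the keys to whatever caption format you parse.

```python
def deep_link(video_id, start_seconds):
    """Build a youtu.be link that starts playback at the given second."""
    return f"https://youtu.be/{video_id}?t={int(start_seconds)}"


def first_mention(video_id, segments, phrase):
    """Return a deep link to the first segment containing `phrase`, or None."""
    needle = phrase.lower()
    for segment in segments:
        if needle in segment["text"].lower():
            return deep_link(video_id, segment["start"])
    return None


sample_segments = [
    {"start": 0.0, "duration": 4.2, "text": "Welcome to the talk"},
    {"start": 123.7, "duration": 3.1, "text": "Now let's look at caching"},
]
print(first_mention("VIDEO_ID", sample_segments, "caching"))
```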

Why this Actor

Direct API, no headless browser — fast, stable runs with nothing to babysit.
No login, no cookies — we never touch your accounts, so there's no ban risk.
Fresh, real-time data — every run reads the source live, not a stale cache.
Pay per result — you're billed only for the rows actually delivered.
Structured JSON — export to CSV, Excel, or JSON, or pull straight from the API / MCP.

Use cases

Build clean text corpora for LLM fine-tuning and RAG.
Repurpose long video into blogs, summaries, and clips.
Make video searchable and translatable at scale.
Feed transcripts into topic modeling and keyword research.

FAQ

Do I need an account, cookies, or to log in anywhere? No. The Actor talks to a fast, direct HTTP API server-side — you just provide inputs and run it.

How am I billed? Pay-per-result: a fixed price per row returned, with no separate platform/compute charge. Caps like maxItems keep spend predictable.
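
As a rough worked example of how maxItems bounds spend, the sketch below multiplies the cap by the listed price of $10.00 per 1,000 results. It assumes that one billed result equals one caption-track dataset item and that the listed price applies unchanged. Check the Pricing section for the current rate.

```python
# Listed price on this page; treat as an assumption that may change.
PRICE_PER_1000_RESULTS_USD = 10.00


def max_cost_usd(max_items):
    """Upper bound on spend for one run, assuming one billed result per item."""
    return max_items * PRICE_PER_1000_RESULTS_USD / 1000


print(f"${max_cost_usd(50):.2f}")   # default maxItems of 50 -> $0.50
print(f"${max_cost_usd(100):.2f}")  # maxItems of 100 -> $1.00
```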

Can I run it on a schedule or call it from my app? Yes — use Apify Schedules, the REST API, the JavaScript / Python clients, or the MCP server. See the API tab.

Is this affiliated with YouTube? No. It's an independent tool that collects publicly available data. Use it in line with the platform's terms and applicable law.

More YouTube scrapers by us

YouTube Search — Keyword video search · stats · channels
YouTube Channel Videos — All videos for a channel · stats
YouTube Channel Info — Channel profile · subs · about
YouTube Comments — Video comments + replies

Browse the full fleet → https://apify.com/ethereal_wool
