Youtube Transcript Scraper

Pricing: $19.99/month + usage

🎥 YouTube Transcript Scraper (youtube-transcript-scraper) extracts clean video transcripts and captions with timestamps and language labels. ⚡ Bulk-scrape the videos of a playlist or channel by supplying their individual URLs, then export JSON/CSV for SEO, research, summarization & AI. 🔎 Well suited to repurposing and indexing.

Rating: 0.0 (0)
Developer: ScrapAPI (maintained by Community)
Actor stats: 0 bookmarks, 2 total users, 1 monthly active user, last modified 2 days ago

Youtube Transcript Scraper

The Youtube Transcript Scraper is a fast, reliable YouTube subtitles extractor that pulls clean transcript text and timestamped captions from public videos, with no manual copy-paste. Paste multiple video URLs and the actor downloads their transcripts into structured results, which removes the hassle of gathering captions at scale. Built for marketers, developers, data analysts, and researchers, it extracts transcript text for SEO, research, summarization, and AI pipelines. To cover a playlist or channel, supply the URLs of its individual videos.

What data / output can you get?

This actor streams structured items to the Apify dataset as each URL finishes. Depending on the outputFormat you choose, you can get plain text or detailed captions with timestamps.

| Field | Description | Example value |
| --- | --- | --- |
| id | YouTube video ID extracted from the URL | dQw4w9WgXcQ |
| url | Canonical YouTube watch URL built from the ID | https://www.youtube.com/watch?v=dQw4w9WgXcQ |
| input | The original input URL you provided (watch or youtu.be) | https://youtu.be/dQw4w9WgXcQ |
| transcripts[].language | Caption language label for the transcript variant | English (auto-generated) |
| transcripts[].content (text mode) | Full transcript concatenated into a single string | Never gonna give you up, never gonna let you down... |
| transcripts[].content[].startMs (timestamp mode) | Caption segment start time in milliseconds | 1250 |
| transcripts[].content[].endMs (timestamp mode) | Caption segment end time in milliseconds | 3620 |
| transcripts[].content[].startTime (timestamp mode) | Human-readable start time (mm:ss) | 0:01 |
| transcripts[].content[].text (timestamp mode) | Caption text for the segment | Welcome to the channel! |

Notes:

  • Choose outputFormat "text" to get a single string per language, or "timestamp" to export an array of caption segments with precise timing.
  • Results are saved to the Apify dataset so you can export to JSON, CSV, or Excel for downstream use.

Key features

  • ⚡ Batch processing & live streaming: Paste multiple video URLs and get results pushed to the dataset as each URL completes, which suits bulk transcript downloads.
  • 🧭 Flexible transcript formats: Select outputFormat "text" to get a single readable string or "timestamp" to export granular caption segments with start/end times.
  • 🌍 Multilingual & auto-generated control: Use includeNonEnglish to include non-English transcripts and includeEnglishAG to include English auto-generated captions.
  • 🛡️ Smart proxy defaults: Uses Apify RESIDENTIAL proxy by default to reduce YouTube IP blocking, improving stability for large jobs.
  • 🔁 Clean URL handling: Accepts both watch and youtu.be links; the actor extracts the video ID and normalizes the URL automatically.
  • 👩‍💻 Developer-friendly & API-ready: Run via the Apify platform and fetch dataset items programmatically to build your own YouTube transcript API workflows.
  • 📦 Structured exports: Export datasets to JSON/CSV/Excel and feed them into NLP pipelines, search indexes, or BI dashboards.
  • 🧱 Production-grade reliability: Built on the Apify Actor runtime with robust logging and per-URL progress, which suits automated pipelines.
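
The URL handling described above can be approximated locally, for example to deduplicate links before a run. The following is a minimal sketch, not the actor's actual code: the function names extract_video_id and canonical_url are hypothetical, and it assumes only the two link shapes the listing mentions (watch and youtu.be).

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Hosts treated as YouTube watch-page hosts in this sketch (an assumption).
WATCH_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a watch or youtu.be URL, or None if unrecognized."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        video_id = parsed.path.lstrip("/").split("/")[0]
        return video_id or None
    if host in WATCH_HOSTS and parsed.path == "/watch":
        # Watch links carry the ID in the "v" query parameter.
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None


def canonical_url(video_id: str) -> str:
    """Build the canonical watch URL, matching the shape of the `url` output field."""
    return f"https://www.youtube.com/watch?v={video_id}"
```

Both input shapes from the examples in this listing resolve to the same canonical form, so a set of canonical URLs is enough to drop duplicates.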

How to use Youtube Transcript Scraper - step by step

  1. Create or log in to your Apify account.
  2. Open the “youtube-transcript-scraper” actor on Apify.
  3. Add input data:
    • urls: Paste one or more YouTube video URLs (watch or youtu.be links).
  4. Choose transcript options:
    • includeEnglishAG: Toggle whether to include English auto-generated subtitles.
    • includeNonEnglish: Toggle whether to include non-English transcripts.
    • outputFormat: Select "text" for a single-string transcript or "timestamp" for per-segment captions.
  5. Configure proxy (optional):
    • proxyConfiguration: Leave empty to use Apify RESIDENTIAL by default, or customize if needed to improve stability.
  6. Start the run:
    • Click Start. Each processed URL is appended immediately to the dataset.
  7. Download results:
    • Go to the run’s Dataset tab and export to JSON, CSV, or Excel. Use the plain text output for quick reading or the timestamped output to drive captioning workflows.

Pro Tip: Use "timestamp" output and transform it in your own code to convert a YouTube transcript to SRT for subtitle workflows.
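
As an illustration of that conversion, here is a minimal sketch that turns "timestamp" segments into SRT text. It assumes each segment carries startMs, endMs, and text as described in the output table; the helper names ms_to_srt_time and segments_to_srt are hypothetical, and the int() calls are a defensive assumption in case the values arrive as strings.

```python
from typing import Dict, Iterable


def ms_to_srt_time(ms: int) -> str:
    """Format milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def segments_to_srt(segments: Iterable[Dict]) -> str:
    """Render caption segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        start = ms_to_srt_time(int(seg["startMs"]))
        end = ms_to_srt_time(int(seg["endMs"]))
        text = seg["text"].strip()
        cues.append(f"{index}\n{start} --> {end}\n{text}")
    return "\n\n".join(cues) + "\n"
```

Pass one transcripts[].content array per call and write the returned string to a .srt file, one file per language.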

Use cases

| Use case name | Description |
| --- | --- |
| Content teams – repurpose video to SEO content | Extract YouTube transcript text and turn long-form videos into blogs, newsletters, and social captions to boost search visibility. |
| Researchers & analysts – qualitative analysis | Download YouTube transcripts to run topic modeling, summarization, and sentiment analysis across interviews and lectures. |
| Accessibility & caption QA | Export YouTube captions to review auto-generated accuracy and prepare compliant subtitles. |
| Social media teams – highlight extraction | Copy YouTube transcript text to identify quotes and timestamps for shorts, reels, and teasers. |
| Academic workflows – lecture indexing | Build searchable course archives by exporting transcript text and timestamps for fast reference. |
| Developer pipelines – API ingestion | Feed timestamped captions into search, LLM, or RAG systems for retrieval over large video libraries. |
| Localization prep – translation handoff | Gather source text for translators by exporting YouTube captions in original languages. |
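
For the search and RAG use case, timestamped segments are usually grouped into larger passages before indexing. The sketch below shows one simple way to do that, assuming the "timestamp" segment shape from the output table. The function name chunk_segments and the max_chars budget are illustrative choices, not part of the actor.

```python
from typing import Dict, List


def _merge(segments: List[Dict]) -> Dict:
    """Collapse consecutive segments into one chunk that keeps the outer timing."""
    return {
        "startMs": segments[0]["startMs"],
        "endMs": segments[-1]["endMs"],
        "text": " ".join(seg["text"] for seg in segments),
    }


def chunk_segments(segments: List[Dict], max_chars: int = 500) -> List[Dict]:
    """Group caption segments into chunks of at most roughly max_chars characters."""
    chunks: List[Dict] = []
    current: List[Dict] = []
    length = 0
    for seg in segments:
        text = seg["text"].strip()
        # Start a new chunk when adding this segment would exceed the budget.
        if current and length + len(text) + 1 > max_chars:
            chunks.append(_merge(current))
            current, length = [], 0
        current.append({**seg, "text": text})
        length += len(text) + 1
    if current:
        chunks.append(_merge(current))
    return chunks
```

Each chunk keeps the start of its first segment and the end of its last, so a search hit can link back to the right moment in the video.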

Why choose Youtube Transcript Scraper?

This actor prioritizes precision, scalability, and reliability for transcript extraction from public YouTube videos.

  • ✅ Accurate formats: Choose between readable text or detailed timestamps to match your workflow needs.
  • 🌐 Language control: Include non-English tracks and optionally include English auto-generated captions for broader coverage.
  • 📈 Built for scale: Paste many URLs and stream results to the dataset as each finishes, which suits playlist- or channel-sized batches of individual video URLs.
  • 🧩 Developer access: Use Apify’s platform and dataset APIs to automate end-to-end pipelines for a YouTube transcript API experience.
  • 🔒 Safer than extensions: Runs server-side with proxy support; avoids brittle browser-based tools and manual copy-paste.
  • 💾 Easy exports: Pull results as JSON, CSV, or Excel for downstream analysis and integration.
  • 🛡️ Stable infrastructure: Defaults to RESIDENTIAL proxies to reduce blocking and keep bulk runs consistent.

Bottom line: a dependable YouTube caption scraper that replaces extensions and ad-hoc scripts with structured, repeatable results.

Is it legal to scrape YouTube transcripts?

Yes, when done responsibly. This tool automates access to transcripts that are publicly available on YouTube. It does not log in or access private videos.

Guidelines:

  • Only scrape public videos and captions that are available without authentication.
  • Respect YouTube’s Terms of Service and applicable laws (e.g., GDPR/CCPA) in your region.
  • Use results for analysis, research, or accessibility; obtain permission if republishing content.
  • Do not attempt to access private or region-restricted videos.

For edge cases or commercial redistribution, consult your legal team.

Input parameters & output format

Example JSON input

{
  "urls": [
    "https://youtu.be/dQw4w9WgXcQ",
    "https://www.youtube.com/watch?v=4KbrxIpQgkM"
  ],
  "includeEnglishAG": true,
  "includeNonEnglish": false,
  "outputFormat": "text",
  "proxyConfiguration": {}
}

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| urls | array | Yes | [] | One or more YouTube video URLs to process. Each completed URL is appended immediately to the dataset. |
| includeEnglishAG | boolean | No | true | Whether to include English auto-generated transcripts. |
| includeNonEnglish | boolean | No | false | Whether to include non-English transcripts. |
| outputFormat | string ("timestamp" or "text") | No | "text" | Format of transcript output: "timestamp" returns detailed timestamps, "text" returns plain text. |
| proxyConfiguration | object | No | {} | Proxy configuration. Uses Apify RESIDENTIAL proxy by default to bypass YouTube IP blocking. If not configured, will try to use Apify proxy automatically. |
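
When the input is generated by a script, it can help to validate it before starting a run. The following is a small sketch that builds the input object using the field names and defaults from the table above. The function build_input and its error messages are hypothetical, and the checks are local conveniences rather than the actor's own validation.

```python
from typing import Dict, Iterable

ALLOWED_FORMATS = {"text", "timestamp"}


def build_input(
    urls: Iterable[str],
    include_english_ag: bool = True,
    include_non_english: bool = False,
    output_format: str = "text",
) -> Dict:
    """Build an actor input dict; defaults mirror the documented ones."""
    url_list = [u.strip() for u in urls if u and u.strip()]
    if not url_list:
        raise ValueError("urls must contain at least one YouTube video URL")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"outputFormat must be one of {sorted(ALLOWED_FORMATS)}")
    return {
        "urls": url_list,
        "includeEnglishAG": include_english_ag,
        "includeNonEnglish": include_non_english,
        "outputFormat": output_format,
        # An empty object leaves proxy selection to the actor's default.
        "proxyConfiguration": {},
    }
```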

Example JSON output

Below is a single dataset item when outputFormat is "text":

{
  "id": "dQw4w9WgXcQ",
  "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "input": "https://youtu.be/dQw4w9WgXcQ",
  "transcripts": [
    {
      "language": "English (auto-generated)",
      "content": "We're no strangers to love You know the rules and so do I ..."
    }
  ]
}

Notes:

  • When outputFormat is "timestamp", transcripts[].content is an array of objects with startMs, endMs, startTime, and text.
  • If captions are unavailable or filtered out by your settings, transcripts may be an empty array.
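
Because content is a string in one mode and an array in the other, downstream code often needs to handle both. This is a minimal sketch, assuming the item shape shown in this section; the helper names transcript_text and item_texts are hypothetical.

```python
from typing import Dict


def transcript_text(transcript: Dict) -> str:
    """Return plain text for one transcript, whichever outputFormat produced it."""
    content = transcript.get("content", "")
    if isinstance(content, str):
        # "text" mode: content is already a single string.
        return content
    # "timestamp" mode: content is a list of segments with a "text" field.
    return " ".join(seg["text"].strip() for seg in content)


def item_texts(item: Dict) -> Dict[str, str]:
    """Map each language label to its plain text for one dataset item."""
    return {t["language"]: transcript_text(t) for t in item.get("transcripts", [])}
```

An item whose transcripts array is empty yields an empty mapping, which gives a simple check for videos that had no usable captions.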

FAQ

Do I need to log in to use this YouTube caption scraper?

No. The actor does not require login or cookies. It uses public endpoints via the underlying youtube-transcript-api to extract available captions.

Can it download auto-generated YouTube captions?

Yes. Set includeEnglishAG to true (the default) to include English auto-generated captions when available. Set it to false to exclude them.

Does it support non-English subtitles?

Yes. Enable includeNonEnglish to include non-English transcripts provided by YouTube. If disabled, non-English tracks will be skipped.

Can I get YouTube subtitles as text or with timestamps?

Yes. Choose outputFormat "text" to get a single string, or "timestamp" to export per-segment captions with start/end timing.

Is there an API to automate runs or integrate results?

Yes. As an Apify actor, you can trigger runs via the Apify API and fetch dataset items programmatically to build a YouTube transcript API workflow.
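
A rough sketch of that workflow with the apify-client Python package follows. The actor ID below is a placeholder assumption (check the actor's API tab in the Apify Console for the real one), the APIFY_TOKEN environment variable name is a convention rather than a requirement, and run_actor and main are hypothetical names.

```python
import os
from typing import Dict, List

# Placeholder ID: an assumption, not confirmed by this listing.
ACTOR_ID = "scrapapi/youtube-transcript-scraper"


def run_actor(client, run_input: Dict, actor_id: str = ACTOR_ID) -> List[Dict]:
    """Start a run, wait for it to finish, and return all dataset items."""
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("The actor run returned no run details")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def main() -> None:
    # Imported here so the helper above can be used without the package installed.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    items = run_actor(
        client,
        {"urls": ["https://youtu.be/dQw4w9WgXcQ"], "outputFormat": "timestamp"},
    )
    for item in items:
        print(item["id"], len(item["transcripts"]))
```

Calling main() with a valid token in the environment runs one video and prints how many transcript variants came back. Passing the client into run_actor keeps the helper easy to exercise with a stub in tests.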

How many videos can I process in one run?

You can pass multiple URLs in urls. The actor processes them and pushes each result as it completes. Scale primarily depends on your Apify plan and proxy stability.

Can it download SRT from YouTube directly?

No. The actor does not output SRT files. However, if you select "timestamp", you can convert the structured segments (startMs, endMs, text) to SRT in your own script.

Does this replace a YouTube transcript Chrome extension?

It serves the same purpose without a browser. Because it runs server-side, it is better suited than a YouTube transcript Chrome extension to bulk tasks and automation.

What happens if a video has no captions?

If a video has no available transcripts (or your filters exclude them), the transcripts array will be empty for that item.

Are playlists or channels supported?

The actor takes individual video URLs in urls; playlist and channel URLs are not listed as accepted inputs. To cover a playlist or channel, queue the URLs of its videos, as many as you need.

Final thoughts

The Youtube Transcript Scraper is built to extract clean, structured YouTube transcripts at scale. With flexible formats, language controls, and stable proxy defaults, it’s ideal for marketers, developers, analysts, and researchers who need reliable caption data fast. Export your dataset to JSON/CSV/Excel, automate via the Apify API, and integrate into NLP, search, or analytics pipelines. Start extracting smarter — and turn unstructured video into actionable, searchable text.