YouTube Transcript

Pricing

from $0.39 / transcript

YouTube transcript API — pass any YouTube URL (regular video, Shorts, or live replay) and get back the full transcript as timestamped segments, using the official caption track when present and ASR when not. Optional one-shot translation into any of 100+ languages.

Rating: 4.7 (4)

Developer: AgentX (Maintained by Community)

Actor stats: 5 bookmarked, 285 total users, 20 monthly active users, last modified 7 days ago

YouTube Transcript - YouTube Video Transcript Intelligence API

YouTube Transcript is a YouTube-native video transcript intelligence API. From any YouTube video URL, a single run extracts frame-accurate timestamped speech segments, optional translation into 100+ languages, and 22 rich video-metadata fields.

YouTube Transcript returns one structured record per video, including video URL, video ID, video title, video description, channel name, channel ID, channel URL, video duration in seconds, view count, like count, comment count, video categories, video tags array, thumbnail image URL, upload date, source caption language code, target language code (when translation is enabled), full transcript text, timestamped segment array (start, end, text), transcript source flag (native captions vs ASR), word count, and is-live flag.

Coverage spans the full global YouTube video catalog, with native-caption priority and ASR fallback. It is built for SEO content audits, AI training corpora construction, RAG indexing pipelines, search-augmentation engines, video-summarization toolchains, accessibility compliance, podcast republishing, and multilingual-content workflows.

Pricing is pay-per-result at $0.42 per video, with no monthly minimum.

YouTube-Native | 100+ Languages | Pay Per Result


Why Choose This API

YouTube-Native Video Transcript for AI & SEO Pipelines

⏱️ Frame-Accurate Timestamped Segments: The transcript object delivers frame-accurate start/end/text segments, enabling precise speech-text alignment for RAG vector indexing, YouTube chapter generation, content search, and AI training corpus construction.

📺 Native Caption Priority with ASR Fallback: YouTube's native captions are used as the primary transcript source when available, maximizing accuracy and minimizing processing cost. ASR (Automatic Speech Recognition) activates automatically as a fallback for videos without captions.

🌍 100+ Language Translation: The optional translate parameter triggers AI-powered translation into 100+ languages, enabling multilingual content monitoring, cross-language knowledge extraction, and international SEO content audits from a single YouTube URL.

📊 Rich YouTube Metadata Output: Each transcript record includes view_count, like_count, comment_count, categories, tags, duration, published_at, and author details, enabling combined transcript and engagement analytics without a separate YouTube data request.

🔍 SEO & Content Intelligence Integration: Full-text transcript output combined with YouTube metadata provides complete content-signal datasets for SEO keyword density analysis, competitive content audits, and YouTube search ranking intelligence pipelines.


Quick Start Guide

How to Extract YouTube Transcripts in 3 Steps

Step 1: Enter the YouTube Video URL

Paste any public YouTube video URL (e.g., https://www.youtube.com/watch?v=4rzeW4dbvlQ).

Step 2: Optionally Select Translation Language

Leave translate empty for original transcript only, or select a target language for AI translation (100+ languages supported).

Step 3: Download Structured Transcript Data

The output includes the full transcript with timestamped segments, optional translation, and all YouTube video metadata.


Input Parameters

Configuration Fields

| Parameter | Type | Required | Description | Example Values |
| --- | --- | --- | --- | --- |
| video_url | string | Yes | YouTube video URL to transcribe | "https://www.youtube.com/watch?v=4rzeW4dbvlQ" |
| translate | string | No | Target language for AI translation (100+ languages, optional) | "spanish", "chinese (simplified)", "arabic", "japanese" |

Example Input Configuration

{
  "video_url": "https://www.youtube.com/watch?v=4rzeW4dbvlQ",
  "translate": "spanish"
}
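
The input above can also be assembled programmatically. The following is a minimal sketch, not part of the actor itself: `build_run_input` and `YOUTUBE_HOSTS` are hypothetical names, and the host check and lowercasing of the language name are assumptions based on the documented examples rather than the actor's own validation rules.

```python
from typing import Optional
from urllib.parse import urlparse

# Hosts accepted by this sketch (an assumption, not the actor's validation rules).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def build_run_input(video_url: str, translate: Optional[str] = None) -> dict:
    """Build the actor input dict, omitting `translate` when no language is given."""
    url = video_url.strip()
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or parsed.hostname not in YOUTUBE_HOSTS:
        raise ValueError(f"Not a YouTube URL: {video_url!r}")
    run_input = {"video_url": url}
    if translate and translate.strip():
        # The documented example values are lowercase names such as "spanish".
        run_input["translate"] = translate.strip().lower()
    return run_input
```

Calling `build_run_input("https://www.youtube.com/watch?v=4rzeW4dbvlQ", "spanish")` yields the same dictionary as the example configuration; leaving out the second argument requests the original transcript only.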

Output Data Schema

Complete Transcript Record Structure

Each YouTube video produces one record:

YouTube Transcript & Video Intelligence Fields

| Field | Type | Description |
| --- | --- | --- |
| processor | string | Apify actor URL that processed this record |
| processed_at | string | ISO 8601 timestamp (UTC) when processed |
| platform | string | Source platform ("Youtube") |
| title | string | Video title |
| description | string | Video description text |
| author | string | Channel username or name |
| author_id | string | YouTube channel ID |
| author_url | string | Channel URL |
| duration | number | Video duration in seconds |
| view_count | integer | Total view count |
| like_count | integer | Total like count |
| shares_count | integer | Total shares count |
| dislike_count | integer | Dislike count when available |
| comment_count | integer | Total comment count |
| categories | array | YouTube video categories |
| tags | array | Video tags |
| published_at | string | Video publication timestamp (ISO) |
| thumbnail | string | Video thumbnail image URL |
| audio_title | string | Music track name (if applicable) |
| audio_artist | string | Music artist name (if applicable) |
| transcript | object | Timestamped transcript: language, text, segments (with start/end/text) |
| translation | object | AI-translated transcript: language, text, segments (with start/end/text) |

Example JSON Output

{
  "processor": "https://apify.com/agentx/youtube-transcript?fpr=aiagentapi",
  "processed_at": "2026-05-01T10:30:00.000Z",
  "platform": "Youtube",
  "title": "How to Build an AI Agent in 10 Minutes",
  "author": "TechChannel",
  "duration": 623,
  "view_count": 152000,
  "like_count": 8500,
  "comment_count": 340,
  "categories": ["Education", "Technology"],
  "tags": ["AI", "machine learning", "tutorial"],
  "transcript": {
    "language": "English",
    "text": "Hello and welcome to this tutorial on building AI agents.",
    "segments": [
      {
        "start": "00:00:00.000",
        "end": "00:00:03.500",
        "text": "Hello and welcome to this tutorial."
      }
    ]
  },
  "translation": {
    "language": "Spanish",
    "text": "Hola y bienvenido a este tutorial.",
    "segments": [
      {
        "start": "00:00:00.000",
        "end": "00:00:03.500",
        "text": "Hola y bienvenido a este tutorial."
      }
    ]
  }
}
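
As an illustration of working with a record shaped like the example above, the sketch below derives a few summary values from it. `summarize_record` is a hypothetical helper, and the word count is computed locally by splitting `transcript.text` on whitespace; it is not a field taken from the documented schema. Missing `transcript` or `translation` objects are treated as empty, which is an assumption about how absent data appears.

```python
def summarize_record(record: dict) -> dict:
    """Summarize one transcript record (hypothetical helper, not part of the actor)."""
    transcript = record.get("transcript") or {}
    translation = record.get("translation") or {}
    text = transcript.get("text") or ""
    return {
        "title": record.get("title"),
        "language": transcript.get("language"),
        "segment_count": len(transcript.get("segments") or []),
        # Whitespace-delimited word count, computed locally.
        "word_count": len(text.split()),
        "has_translation": bool(translation.get("text")),
    }
```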

Export Formats

  • JSON - Complete structured transcript data with segments
  • CSV - Transcript metadata for SEO and content analysis
  • API Access - Programmatic access via Apify Client SDK
  • Cloud Storage - Automatic upload to Apify Dataset
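
For cases where a per-segment CSV is needed rather than one row per video, a record can be flattened locally. This is a minimal sketch under the assumption that segments carry the `start`, `end`, and `text` keys shown in the example output; `segments_to_csv` is a hypothetical helper and not one of the platform's export formats.

```python
import csv
import io


def segments_to_csv(record: dict) -> str:
    """Render a record's transcript segments as CSV text, one row per segment."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["start", "end", "text"])
    segments = (record.get("transcript") or {}).get("segments") or []
    for seg in segments:
        # csv.writer quotes any field that contains a comma or a quote character.
        writer.writerow([seg.get("start", ""), seg.get("end", ""), seg.get("text", "")])
    return buffer.getvalue()
```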

Integration Examples

Actor ID for Platform Integration

XfzZmSAG84ODgmr0z

Ⓜ️ Make.com Setup:

  1. Login to Make.com (Get 1000 Free Credits)
  2. Add module "Run an Actor"
  3. Turn 'Map' on - right side of the 'Actor*'
  4. Paste Actor ID - from above
  5. Click the '⟳ Refresh' - left side of Map
  6. Input JSON* - Modify the parameters as needed
  7. Set "Run synchronously" to YES
  8. Add module "Get Dataset Items" - receive the result
  9. In Dataset ID* select defaultDatasetId

🎱 N8N.io Setup:

  1. Add 'Run an Actor and get dataset' - from the apify node
  2. Actor: select By ID, then paste the Actor ID from above
  3. Input JSON - Modify the parameters as needed

Python Integration Example

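from apify_client import ApifyClient

# Authenticate with your Apify API token.
client = ApifyClient("YOUR_API_TOKEN")

run_input = {
    "video_url": "https://www.youtube.com/watch?v=4rzeW4dbvlQ",
    "translate": "spanish",  # optional; omit for the original transcript only
}

# Start the actor and wait for the run to finish.
run = client.actor("XfzZmSAG84ODgmr0z").call(run_input=run_input)

# Each dataset item is one transcript record.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)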
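
Because the input takes a single `video_url`, processing several videos presumably means one run per URL. The sketch below shows one way to loop over a list while collecting failures instead of stopping at the first error. `transcribe_many` and `apify_runner` are hypothetical helpers; the client calls inside `apify_runner` are the same ones used in the example above, and whether the actor offers a batch input is not stated here.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Runner = Callable[[dict], List[dict]]


def transcribe_many(
    urls: Iterable[str], run_actor: Runner, translate: Optional[str] = None
) -> Tuple[List[dict], Dict[str, str]]:
    """Run the actor once per URL; return (records, failures keyed by URL)."""
    records: List[dict] = []
    failures: Dict[str, str] = {}
    for url in urls:
        run_input = {"video_url": url}
        if translate:
            run_input["translate"] = translate
        try:
            records.extend(run_actor(run_input))
        except Exception as exc:  # keep going; report the failure per URL
            failures[url] = str(exc)
    return records, failures


def apify_runner(token: str, actor_id: str = "XfzZmSAG84ODgmr0z") -> Runner:
    """Wrap the Apify client so it can be passed to transcribe_many."""
    from apify_client import ApifyClient  # imported lazily; needs `apify-client` installed

    client = ApifyClient(token)

    def run(run_input: dict) -> List[dict]:
        finished = client.actor(actor_id).call(run_input=run_input)
        return list(client.dataset(finished["defaultDatasetId"]).iterate_items())

    return run
```

A call such as `transcribe_many(urls, apify_runner("YOUR_API_TOKEN"))` would then trigger one billed run per URL. Passing the runner as an argument keeps the loop testable without network access.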

JavaScript/Node.js Integration

import { ApifyClient } from "apify-client";

// Authenticate with your Apify API token.
const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

const input = {
  video_url: "https://www.youtube.com/watch?v=4rzeW4dbvlQ",
};

// Start the actor and wait for the run to finish (top-level await requires an ES module).
const run = await client.actor("XfzZmSAG84ODgmr0z").call(input);

// Each dataset item is one transcript record.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => console.log(item));

JSON-LD Metadata

{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "SoftwareApplication",
      "@id": "https://apify.com/agentx/youtube-transcript#software",
      "name": "YouTube Transcript",
      "description": "YouTube Transcript is a YouTube-native video transcript API with frame-accurate timestamps, native captions and ASR fallback, 100+ language translation, and rich video metadata for SEO content audits, AI training corpora, and search-augmentation indexes.",
      "applicationCategory": "BusinessApplication",
      "applicationSubCategory": "Speech-to-Text API",
      "operatingSystem": "Web, Cloud",
      "url": "https://apify.com/agentx/youtube-transcript?fpr=aiagentapi",
      "softwareVersion": "1.0.0",
      "datePublished": "2024-08-01",
      "dateModified": "2026-05-01",
      "featureList": [
        "Native YouTube caption extraction with ASR fallback",
        "Frame-accurate timestamped segments",
        "100+ language translation",
        "22 rich video metadata fields",
        "Channel attribution and engagement metrics",
        "Word count and full transcript text",
        "Per-video pay-per-result at $0.42",
        "Native integrations with Make.com, n8n, LangChain, and CrewAI"
      ],
      "offers": {
        "@type": "Offer",
        "price": "0.42",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock"
      },
      "author": { "@id": "https://apify.com/agentx#person" },
      "publisher": { "@id": "https://apify.com#organization" }
    },
    {
      "@type": "Person",
      "@id": "https://apify.com/agentx#person",
      "name": "AgentX",
      "url": "https://apify.com/agentx",
      "sameAs": [
        "https://apify.com/agentx",
        "https://t.me/AiAgentApi",
        "https://t.me/Apify_Actor"
      ],
      "knowsAbout": [
        "YouTube",
        "video transcription",
        "speech to text",
        "SEO",
        "RAG pipelines"
      ]
    },
    {
      "@type": "BreadcrumbList",
      "itemListElement": [
        {
          "@type": "ListItem",
          "position": 1,
          "name": "Apify",
          "item": "https://apify.com"
        },
        {
          "@type": "ListItem",
          "position": 2,
          "name": "AgentX",
          "item": "https://apify.com/agentx"
        },
        {
          "@type": "ListItem",
          "position": 3,
          "name": "YouTube Transcript",
          "item": "https://apify.com/agentx/youtube-transcript"
        }
      ]
    }
  ]
}

Pricing & Cost Calculator

PAY_PER_EVENT Pricing

| Event | BRONZE Price |
| --- | --- |
| Actor Start | $0.001 per start (per GB memory) |
| Actor Usage | $0.00001 per usage unit |
| Transcript | $0.42 per video |
| Translation | $0.15 per video (optional) |

Cost Calculator Examples

| Scenario | Transcript | Translation | Total |
| --- | --- | --- | --- |
| 1 video, original only | $0.42 | - | ~$0.42 |
| 1 video, with translation | $0.42 | $0.15 | ~$0.57 |
| 10 videos, original only | $4.20 | - | ~$4.20 |
| 10 videos, with translation | $4.20 | $1.50 | ~$5.70 |
| 100 videos, original only | $42.00 | - | ~$42.00 |

Costs shown at BRONZE tier. Higher tiers (SILVER, GOLD, PLATINUM, DIAMOND) offer reduced rates down to $0.39/transcript.
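
The calculator arithmetic can be reproduced with a short function. This is a sketch using the BRONZE per-video prices listed above; `estimate_cost` is a hypothetical helper, and it deliberately leaves out the Actor Start and Actor Usage events, which is presumably why the totals in the table are marked as approximate.

```python
from decimal import Decimal

# BRONZE-tier per-video prices from the pricing table.
TRANSCRIPT_PRICE = Decimal("0.42")
TRANSLATION_PRICE = Decimal("0.15")


def estimate_cost(videos: int, translated: int = 0) -> Decimal:
    """Estimate the transcript plus translation charges in USD.

    Excludes the small Actor Start and Actor Usage event fees.
    """
    if videos < 0 or not 0 <= translated <= videos:
        raise ValueError("translated must be between 0 and the number of videos")
    return videos * TRANSCRIPT_PRICE + translated * TRANSLATION_PRICE
```

For example, `estimate_cost(10, translated=10)` gives `Decimal("5.70")`, matching the "10 videos, with translation" row. `Decimal` is used instead of `float` to avoid rounding drift when summing currency amounts.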


Use Cases & Applications

SEO Content Audits & Keyword Intelligence

Competitor Content Analysis: Extract full transcripts from competitor YouTube channels and analyze keyword density, topic coverage, and content depth across high-ranking videos to inform SEO strategy and content gap identification.

YouTube Content Indexing: Feed transcript segments with timestamps into search systems to enable full-text keyword search within video content, subtitle indexing, and deep-link navigation to specific video moments.

Topic Cluster Mapping: Combine tags, categories, and full transcript text across a channel's video library to build topic cluster maps for content strategy, semantic SEO planning, and knowledge graph construction.
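
As a rough illustration of the keyword density analysis mentioned above, the sketch below counts term frequencies in a transcript's `text` field. `keyword_density` and the small `STOPWORDS` set are hypothetical and English-only; a real audit would likely use a fuller stopword list and stemming.

```python
import re
from collections import Counter
from typing import List, Tuple

# Tiny illustrative stopword list (an assumption; extend it for real use).
STOPWORDS = {"a", "an", "and", "the", "to", "of", "in", "on", "this", "is", "for"}


def keyword_density(text: str, top_n: int = 5) -> List[Tuple[str, float]]:
    """Return the top_n terms with their share of all non-stopword terms."""
    words = [w for w in re.findall(r"[a-z0-9']+", text.lower()) if w not in STOPWORDS]
    if not words:
        return []
    counts = Counter(words)
    return [(word, count / len(words)) for word, count in counts.most_common(top_n)]
```

Running it over `record["transcript"]["text"]` for each video in a channel gives comparable per-video term shares that can be aggregated alongside `tags` and `categories`.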

AI Training Corpus & RAG Pipeline Integration

AI Training Corpus Construction: Extract frame-accurate transcript segments with start/end timestamps from YouTube's largest educational and informational content categories to build high-quality speech-text alignment datasets for ASR model fine-tuning and LLM training.

RAG Knowledge Base Ingestion: Convert YouTube video transcripts to vector-embedded segments for retrieval-augmented generation pipelines, so that YouTube content can serve as a queryable knowledge base within AI agent workflows.

Multilingual Content Monitoring: Use the translate parameter across 100+ languages to monitor competitor YouTube content in international markets, detecting messaging, claims, and keyword targeting across language barriers.
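
Before embedding, short caption segments are often merged into larger chunks so each vector carries enough context. The sketch below is one possible approach, not the actor's behavior: `chunk_segments` is a hypothetical helper, the character budget is an arbitrary choice, and the embedding step itself is left out. Each chunk keeps the `start` of its first segment and the `end` of its last, so retrieved text can still be traced back to a position in the video.

```python
from typing import Dict, List


def _merge(group: List[Dict[str, str]]) -> Dict[str, str]:
    """Combine consecutive segments into one chunk spanning their time range."""
    return {
        "start": group[0]["start"],
        "end": group[-1]["end"],
        "text": " ".join(seg["text"] for seg in group),
    }


def chunk_segments(segments: List[Dict[str, str]], max_chars: int = 500) -> List[Dict[str, str]]:
    """Greedily pack segments into chunks of at most max_chars characters.

    A single segment longer than max_chars becomes its own oversized chunk.
    """
    chunks: List[Dict[str, str]] = []
    current: List[Dict[str, str]] = []
    length = 0
    for seg in segments:
        text = seg["text"].strip()
        # Close the current chunk if adding this segment would exceed the budget.
        if current and length + 1 + len(text) > max_chars:
            chunks.append(_merge(current))
            current, length = [], 0
        current.append({**seg, "text": text})
        length += len(text) + (1 if len(current) > 1 else 0)
    if current:
        chunks.append(_merge(current))
    return chunks
```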


FAQ

Does this extract YouTube's native auto-generated captions?

Yes — native YouTube captions (including auto-generated ones) are used as the primary source when available. ASR processes the audio directly as fallback for videos without any caption track.

What are frame-accurate timestamps?

Each segment in the transcript object contains start and end fields in HH:MM:SS.mmm format — providing millisecond-precision alignment between transcript text and video position.
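
A small sketch of how these HH:MM:SS.mmm values can be used: converting a timestamp to seconds and building a link that opens the video at that moment. `timestamp_to_seconds` and `deep_link` are hypothetical helpers; the `t=<seconds>s` query parameter is YouTube's start-time parameter, which takes whole seconds, so the millisecond part is truncated.

```python
def timestamp_to_seconds(ts: str) -> float:
    """Convert an HH:MM:SS.mmm timestamp string to seconds."""
    hours, minutes, seconds = ts.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def deep_link(video_url: str, ts: str) -> str:
    """Append a start-time parameter (whole seconds) to a YouTube URL."""
    separator = "&" if "?" in video_url else "?"
    return f"{video_url}{separator}t={int(timestamp_to_seconds(ts))}s"
```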

How many languages are supported for translation?

100+ languages, including Arabic, Chinese (Simplified/Traditional), Hindi, Spanish, French, German, Japanese, Korean, Russian, Portuguese, Turkish, and many more.

Can I extract transcripts from YouTube Shorts?

Yes — YouTube Shorts are supported as they are standard YouTube video URLs.


SEO Keywords & Search Terms

Primary Keywords

YouTube transcript API, YouTube video transcript extractor, YouTube speech-to-text API, YouTube captioning API, YouTube transcript pipeline, YouTube video text extraction API, YouTube SEO content audit API, YouTube transcript RAG pipeline, YouTube captions API developer, YouTube AI transcript tool

Long-Tail Keywords

how to extract YouTube transcript programmatically, YouTube video speech recognition API integration, YouTube auto-caption extraction pipeline, YouTube transcript 100 language translation API, YouTube content competitor analysis transcript tool, YouTube video transcript for RAG indexing, YouTube transcript timestamped segments JSON, YouTube transcript SEO keyword density analysis, YouTube training corpus extraction API, YouTube subtitle extraction API developer

Industry Terms

YouTube-native transcript API, frame-accurate timestamp extraction, native caption priority pipeline, YouTube ASR fallback transcript, YouTube content intelligence API, video speech-text alignment dataset, multilingual YouTube transcript pipeline, YouTube RAG vector ingestion, YouTube SEO content analysis pipeline, YouTube knowledge base extraction tool


Trust & Certifications

  • Production-Grade Infrastructure — runs on the Apify cloud platform with managed proxy rotation and automatic retries
  • GDPR & CCPA-Region Aligned — processes only publicly available YouTube video content; no personal contact data retained beyond the run session
  • Pay-Per-Result Billing — transparent $0.42 per video with no monthly minimum or seat fees
  • Continuously Maintained — caption extractors, ASR models, and translation engines updated as YouTube evolves

Data Rights & Usage

All data extracted by this actor originates from publicly accessible YouTube video content. Users are responsible for ensuring their use of extracted data complies with applicable laws, data protection regulations, and YouTube's terms of service.

Privacy Compliance

  • GDPR: Compliant with EU GDPR for data processing workflows.
  • CCPA: Compliant with California Consumer Privacy Act requirements.

Platform Terms of Service

Users must review and comply with YouTube's Terms of Service when using extracted transcript data.

Enterprise Support

For enterprise licensing, custom integrations, or compliance inquiries:


Support & Community


Last Updated: May 01, 2026