YouTube Transcript Scraper & Captions
Pricing
from $0.003 / video scraped
YouTube transcript scraper and transcript API alternative for public videos, channels, and searches. Export captions, subtitles, timestamped segments, metadata, comments, and optional AI chapters. No YouTube API key required.
Developer: Nick (maintained by Community)
YouTube Transcript Scraper - Captions & Subtitles
Use this YouTube transcript scraper to extract captions, subtitles, transcript text, timestamped transcript segments, video metadata, comments, and optional AI chapters from public YouTube videos. Paste video URLs for the fastest path, or use channel and search modes when you need transcripts across many videos. No YouTube API key is required.
Best first run
{"mode": "transcript","videoUrls": ["https://www.youtube.com/watch?v=dQw4w9WgXcQ"],"transcriptLanguages": ["en"]}
Use this actor for YouTube transcript scraping, YouTube captions scraping, subtitle exports, content research, creator analytics, and video monitoring. Start with transcript mode, confirm captions are available for your target videos, then add comments, channel analysis, or AI chapters.
What you get back
- One video row in the Videos Dataset with normalized url, video_id, title, channel, views, likes, duration, publish date, thumbnail, tags, and engagement rate.
- One transcript row with transcript_text, transcript_segments, transcript_language, transcript_kind, and transcript_available.
- Good starter run: transcript mode, one URL, comments off, AI off. Add comments, channel analysis, or AI chapters after the transcript path works for your target content.
Generate YouTube transcripts, subtitles, AI chapter markers, and channel analytics without an API key - a pay-per-video alternative to vidIQ ($39-$415/mo), TubeBuddy ($9-$50/mo), and manual transcription services ($1.00-$1.50/audio-min via Rev). At $0.003 per video, $0.005 per transcript, and $0.01 per chapter set, a 100-video transcript + chapter audit costs about $1.80 instead of a monthly SaaS seat, and a 60-minute podcast transcript/chapter pass costs cents instead of manual transcription rates.
Whether you are a marketing agency benchmarking influencer channels, a content creator optimizing your upload strategy, or a brand manager evaluating sponsorship opportunities, this actor delivers structured YouTube data plus transcript-aware AI content intelligence.
YouTube Transcript Scraper
Extract YouTube transcript text from one video, a list of URLs, a channel, or a search query. Transcript mode is the fastest path when you only need spoken text and timestamps for summaries, quotes, clips, search indexes, support workflows, or content analysis.
This actor scrapes publicly accessible YouTube data across four modes:
- Transcript mode - fastest first-run path. Paste video URLs and get transcript text plus timestamped segments, with comments and AI disabled for speed.
- Channel mode - analyze any YouTube channel's recent or most popular videos. Retrieves channel metadata (subscriber count, total views, verification status, country, handle) plus per-video data (title, view count, like count, comment count, duration, tags, engagement rate, and Shorts detection). Optionally downloads full transcripts per video.
- Search mode - run a keyword search on YouTube and retrieve the top matching videos with full metadata. Useful for discovering which creators dominate a niche or tracking what content is trending around a topic.
- Video mode - fetch detailed data for specific video URLs you supply. Returns full descriptions, tags, likes, comments, and optional transcript - all in one structured output item per video.
On top of raw scraping, the actor optionally:
- Downloads full video subtitles (human-uploaded captions preferred, auto-generated ASR fallback) in your chosen language.
- Auto-segments each transcript into 3-8 AI-generated chapters with start/end timestamps, a short title, and a one-sentence summary - replacing manual chapter authoring for podcasters, long-form creators, and clip agencies.
- Runs an AI-powered channel analysis that synthesizes video metrics, engagement patterns, and transcript content themes into a strategic report with upload consistency scores, audience engagement scores, and growth recommendations.
No YouTube API key is required. The actor returns public video and channel data as structured JSON.
YouTube Captions Scraper
The YouTube captions scraper workflow returns the spoken content of a video as both plain text and timestamped segments. Human-uploaded captions are preferred when available; auto-generated captions are used as a fallback. You can pass preferred language codes such as ["en", "es", "fr"], and the actor records which transcript language and kind was returned.
Use the flat transcript_text field for search, summarization, and dataset exports. Use transcript_segments when you need timestamps for clips, chapters, quote lookup, or "jump to moment" interfaces.
YouTube Subtitles Scraper
Export YouTube subtitles and caption tracks into structured records that can be searched, summarized, translated, analyzed, or loaded into downstream datasets. Each transcript row includes the video URL, language, caption kind, full transcript text, and timestamped segments.
Features
- Fast transcript-first runs - transcript mode keeps the first run lean, then you can add comments, channel analysis, or AI chapters after the transcript path works
- 4 scraping modes - transcript export, channel analysis, video search, or single video details to match your research workflow
- Full channel metadata - subscriber count, total views, video count, channel description, verification status, country, and handle
- Rich video data - title, view count, like count, comment count, duration, tags, publish date, engagement rate, and Shorts detection
- Engagement rate calculation - automatically computes engagement rate, (likes + comments) / views expressed as a percentage, for every video
- Comment extraction - scrape top comments with author name, comment text, like count, and publish date
- Video transcripts and captions - download full video subtitles, preferring human-uploaded captions and falling back to auto-generated ASR. Includes both plain-text and per-segment timestamped output. Language preference is configurable
- AI chapter auto-segmentation - generate 3-8 chapter markers per video with start, end, title, and summary. Replaces manual chapter markers for podcasters and long-form creators, and produces searchable timestamps for clip extraction, content search, and video navigation. Only charged when chapters are successfully generated
- Sort flexibility - sort channel videos by most popular, newest, or oldest to focus your analysis
- Transcript-aware AI analysis - when transcripts are enabled alongside AI analysis, the report extracts themes, key phrases, and voice-style from actual spoken content instead of only titles
- AI channel analysis - optional AI-powered insights covering content strategy, upload consistency scoring, audience engagement scoring, top-performing themes, and growth recommendations
- Multiple AI providers - choose OpenRouter (recommended - 300+ models), Anthropic (Claude), Google AI (Gemini), OpenAI (GPT), or Ollama (self-hosted) for AI analysis
- Pay-per-event pricing - x402-ready PPE charges for videos, transcripts, comments, chapters, and AI analysis, plus a $5 Skyfire bundle for agent-paid bulk extraction
Use Cases
- Marketing agencies - research competitor channels, benchmark engagement rates across your client's vertical, and build data-driven content strategies. Compare multiple channels side-by-side with AI-generated positioning insights.
- Content creators - analyze your own channel performance to identify top-performing content themes. Understand which video formats, topics, and lengths drive the most engagement. Optimize upload timing and strategy.
- Brand managers - evaluate influencer channels for sponsorship fit. Track engagement rates, audience sentiment through comments, and content consistency before committing marketing budgets.
- Influencer marketing platforms - build and maintain influencer databases with up-to-date metrics. Score channels by engagement quality, not just subscriber count. Detect fake engagement through metric analysis.
- Competitive intelligence teams - monitor competitor YouTube channels for new content, messaging changes, product announcements, and audience reactions. Track engagement trends over time.
- PR and communications professionals - monitor brand mentions and sentiment across YouTube comments. Track how product launches and announcements are received by video audiences.
- Academic researchers - collect structured YouTube data for media studies, audience behavior research, content virality analysis, and platform ecosystem studies. Transcript mode produces a searchable corpus of spoken content across channels for qualitative and NLP analysis.
- AI training and content analysts - use transcript mode to build captioned video datasets for fine-tuning, retrieval workflows, or semantic search. Combine with AI analysis for theme tagging and voice-style classification at scale.
- SEO and content strategists - analyze video tags, titles, and descriptions to understand keyword strategies. Identify content gaps and high-engagement topics in your niche.
- Podcasters and long-form creators - auto-generate chapter markers for every episode when you forgot (or didn't want) to write them manually. Upload the chapters to YouTube's description, surface them in your podcast RSS, or drive a "jump to section" UI on your show page.
- Video clip agencies and editors - use AI chapter output to slice long-form content into topic-coherent clips for shorts, reels, or social distribution without watching the full video first.
Input
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| mode | string | transcript | Scraping mode: transcript, video, channel, or search |
| channelUrls | array | -- | YouTube channel URLs (required for channel mode) |
| searchQuery | string | -- | Search query (required for search mode) |
| videoUrls | array | -- | Video URLs (required for transcript and video modes) |
| maxVideos | integer | 10 | Maximum videos per channel or search (1-100) |
| includeComments | boolean | false | Scrape top comments for each video |
| maxCommentsPerVideo | integer | 10 | Comments per video (1-50) |
| includeTranscripts | boolean | false | Download video subtitles (human-uploaded preferred, ASR fallback) |
| transcriptLanguages | array | ["en"] | Preferred transcript language codes (first-match priority) |
| generateChapters | boolean | false | Auto-segment each transcript into 3-8 AI-generated chapters (requires includeTranscripts + AI provider key) |
| sortVideosBy | string | popular | Sort channel videos: popular, newest, oldest |
| enableAiAnalysis | boolean | false | Enable AI channel analysis |
| llmProvider | string | openrouter | AI provider: openrouter, anthropic, google, openai, or ollama |
| llmModel | string | -- | Override default model (leave empty for recommended default) |
| openrouterApiKey | string | -- | OpenRouter API key (required if using OpenRouter) |
| anthropicApiKey | string | -- | Anthropic API key (required if using Anthropic) |
| googleApiKey | string | -- | Google AI (Gemini) API key (required if using Google) |
| openaiApiKey | string | -- | OpenAI API key (required if using OpenAI) |
| ollamaBaseUrl | string | http://localhost:11434 | Ollama API base URL (for self-hosted AI) |
| proxyConfiguration | object | {"useApifyProxy": true, "apifyProxyGroups": ["RESIDENTIAL"]} | Proxy settings (RESIDENTIAL strongly recommended) |
Hidden API/CLI aliases are also accepted for agent workflows: channelUrl for one channel URL; videoUrl, url, urls, or links for video URLs; query, q, search, keyword, or searchTerm for searchQuery; and maxItems for maxVideos.
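If you build inputs programmatically, it can help to fold these aliases into the canonical names before sending a run. The sketch below is one way to do that on the client side; `normalize_input` and the rule that an explicit canonical key wins over an alias are assumptions for illustration, not a description of how the actor resolves aliases internally.

```python
# Canonical parameter -> accepted aliases, copied from the paragraph above.
ALIASES = {
    "channelUrls": ["channelUrl"],
    "videoUrls": ["videoUrl", "url", "urls", "links"],
    "searchQuery": ["query", "q", "search", "keyword", "searchTerm"],
    "maxVideos": ["maxItems"],
}
# Parameters whose canonical form is an array.
LIST_PARAMS = {"channelUrls", "videoUrls"}


def normalize_input(raw: dict) -> dict:
    """Hypothetical helper: rewrite alias keys to their canonical names."""
    out = dict(raw)
    for canonical, aliases in ALIASES.items():
        for alias in aliases:
            if alias not in out:
                continue
            value = out.pop(alias)
            if canonical in out:
                continue  # assumption: an explicit canonical key wins
            if canonical in LIST_PARAMS and not isinstance(value, list):
                value = [value]  # wrap a single URL into an array
            out[canonical] = value
    return out
```

For example, `normalize_input({"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ", "maxItems": 5})` yields a `videoUrls` array with one entry and `maxVideos` set to 5.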
Pricing
This actor uses Apify's pay-per-event pricing model. You only pay for what you scrape. The individual events are x402-ready; the Skyfire route uses the $5 bulk bundle because Skyfire enforces a minimum charge per actor invocation.
| Event | Price | Description |
|---|---|---|
| video-scraped | $0.003 | Charged per video extracted |
| transcript-scraped | $0.005 | Charged per transcript successfully downloaded (only when includeTranscripts is enabled) |
| comments-scraped | $0.002 | Charged per video whose comments were successfully scraped (only when includeComments is enabled and comments were returned) |
| chapter-generated | $0.01 | Charged per video chapter-set successfully generated (only when generateChapters is enabled and chapters were produced) |
| ai-analysis-completed | $0.05 | Charged per AI channel analysis report |
Skyfire bulk bundle (AI-agent payment rail)
A skyfire-bundle-500-videos event ships at $5.00 per 500 videos for AI agents paying via the Skyfire JWT rail. Effective rate: $0.01/video - a 3.3x premium over the raw video-scraped baseline ($0.003) - but the bundle covers the full extractor stack (transcript + AI chapters + AI channel analysis) under one prepaid call, displacing manual Rev.com transcription at $90/hr. Skyfire requires a $5 minimum charge per actor invocation, so the bundle is the canonical agent-payment-rail-compatible option. Pay-as-you-go users via Apify's standard PPE rail still get the cheaper individual-event pricing.
Cost Examples
| Scenario | Videos | Transcripts | Chapters | AI Analysis | Total Cost |
|---|---|---|---|---|---|
| Quick channel check | 10 | No | No | No | $0.03 |
| Channel deep dive | 20 | No | No | Yes | $0.11 |
| Channel deep dive + transcripts | 20 | Yes (20) | No | Yes | $0.21 |
| Podcast chapter generation | 20 | Yes (20) | Yes (20) | No | $0.36 |
| Full content intelligence pass | 20 | Yes (20) | Yes (20) | Yes | $0.41 |
| Multi-channel comparison | 50 | No | No | Yes | $0.20 |
| Multi-channel + transcripts | 50 | Yes (50) | No | Yes | $0.45 |
| Large content audit | 100 | No | No | Yes | $0.35 |
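The totals in this table follow directly from the event prices. As a rough sketch, the helper below reproduces them; `estimate_cost` is a hypothetical name, and it gives an upper bound because transcript, comment, and chapter events are only charged when they succeed.

```python
from decimal import Decimal

# Event prices in USD, copied from the pricing table above.
PRICES = {
    "video-scraped": Decimal("0.003"),
    "transcript-scraped": Decimal("0.005"),
    "comments-scraped": Decimal("0.002"),
    "chapter-generated": Decimal("0.01"),
    "ai-analysis-completed": Decimal("0.05"),
}


def estimate_cost(videos, transcripts=0, comments=0, chapters=0, analyses=0):
    """Hypothetical helper: each argument is the number of events charged."""
    counts = {
        "video-scraped": videos,
        "transcript-scraped": transcripts,
        "comments-scraped": comments,
        "chapter-generated": chapters,
        "ai-analysis-completed": analyses,
    }
    # Decimal avoids float rounding noise on sub-cent prices.
    return sum(PRICES[event] * n for event, n in counts.items())


# "Full content intelligence pass": 20 videos, 20 transcripts, 20 chapter sets, 1 report.
print(estimate_cost(20, transcripts=20, chapters=20, analyses=1))
```

The function counts AI reports separately from videos because the report event is charged per analysis, not per video.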
vs. commercial alternatives: vidIQ Pro charges $49+/mo and TubeBuddy $19+/mo for YouTube analytics, while the YouTube Data API imposes strict rate limits and requires authentication. This actor uses pay-per-event with no subscription: $0.003/video and zero monthly fees.
Typical Runtime
Typical run times:
- 10 videos without comments: ~30-60 seconds
- 20 videos without comments: ~1-2 minutes
- 20 videos with comments: ~2-4 minutes
- 50 videos with comments: ~5-8 minutes
- AI analysis adds ~15-30 seconds to any run
Output
The actor writes to multiple datasets so transcript, comment, and analyst workflows can consume the clean table they need:
- Videos Dataset - default video and channel records with metadata, metrics, optional embedded transcripts, comments, and chapters.
- Transcripts Dataset - one transcript row per video with transcript_text and timestamped transcript_segments.
- Comments Dataset - one row per scraped comment with video context attached.
- AI Analysis Dataset - optional channel, search, or content-analysis records.
- Diagnostics Dataset - invalid input, blocked request, target-error, and no-result records.
Channel Mode Output
In channel mode, each result contains channel metadata, video list, and optional AI analysis:
{
  "channel": {
    "channel_id": "UCBcRF18a7Qf58cCRy5xuWwQ",
    "channel_name": "MKBHD",
    "handle": "@mkbhd",
    "subscriber_count": 19200000,
    "total_videos": 20,
    "total_views": 450000000,
    "description": "...",
    "verified": true,
    "country": "US"
  },
  "videos": [
    {
      "video_id": "abc123",
      "title": "Video Title",
      "view_count": 5000000,
      "like_count": 200000,
      "comment_count": 15000,
      "duration_seconds": 720,
      "engagement_rate": 4.3,
      "is_short": false,
      "tags": ["tech", "review"],
      "comments": [
        {
          "comment_id": "Ugy...AaABAg",
          "author": "User",
          "author_channel_id": "UCxxx",
          "text": "Great video!",
          "like_count": 500,
          "likes": 500,
          "published_date": "2 weeks ago",
          "url": "https://www.youtube.com/watch?v=abc123&lc=Ugy...AaABAg"
        }
      ]
    }
  ],
  "ai_analysis": {
    "channel_overview": "...",
    "upload_consistency_score": 8,
    "audience_engagement_score": 9,
    "top_performing_themes": ["tech reviews", "smartphones"],
    "recommendations": ["..."]
  },
  "scraped_at": "2026-04-10T10:30:00Z"
}
Search and Video Mode
In search and video mode, each result is a flat video object with the following fields:
{
  "videoId": "dQw4w9WgXcQ",
  "title": "Rick Astley - Never Gonna Give You Up (Official Video)",
  "description": "The official video for \"Never Gonna Give You Up\" by Rick Astley...",
  "channelId": "UCuAXFkgsw1L7xaCfnd5JJOw",
  "channelTitle": "Rick Astley",
  "viewCount": 1500000000,
  "likeCount": 16000000,
  "commentCount": 2100000,
  "publishedAt": "2009-10-25T06:57:33Z",
  "duration_seconds": 212,
  "engagement_rate": 1.21,
  "is_short": false,
  "tags": ["rick astley", "never gonna give you up", "pop"],
  "transcript": [
    {"start": 0.0, "duration": 3.5, "text": "We're no strangers to love..."},
    {"start": 3.5, "duration": 4.0, "text": "You know the rules and so do I..."}
  ],
  "chapters": [
    {"start": 0, "end": 43.0, "title": "Opening verse", "summary": "Artist introduces the emotional stakes of commitment."},
    {"start": 43.0, "end": 212.0, "title": "Chorus and bridge", "summary": "Repeated declaration of unconditional loyalty."}
  ]
}
The transcript field is an array of timed segments (only present when includeTranscripts is enabled). The chapters array is only present when generateChapters is enabled and chapter generation succeeds.
Transcript Output
In transcript mode, transcript retrieval is enabled automatically and comments/AI are skipped for speed. When includeTranscripts is enabled in other modes, each video record also includes these fields, and the same data is written as a clean row in the Transcripts Dataset:
{
  "transcript_available": true,
  "transcript_language": "en",
  "transcript_kind": "",
  "transcript_text": "Welcome back to the channel. Today we're looking at...",
  "transcript_segments": [
    {"start": 0.0, "duration": 4.12, "text": "Welcome back to the channel."},
    {"start": 4.12, "duration": 3.8, "text": "Today we're looking at..."}
  ]
}
- transcript_kind is "" for human-uploaded captions and "asr" for auto-generated (machine-transcribed) captions. The scraper prefers uploaded over ASR and matches your preferred-language list first.
- transcript_segments preserves timestamps for use cases like chaptering, search-within-video, or clipping.
- transcript_text is the flat concatenation for full-text search or AI input.
- When no captions are available on a video, transcript_available is false and the other fields are empty.
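As an illustration of the search-within-video use case, the sketch below scans transcript_segments for a phrase and builds a link that opens the video at that moment. `find_moments` is a hypothetical helper, and the video URL is passed in separately because the example record above does not include it.

```python
def find_moments(video_url: str, segments: list, phrase: str) -> list:
    """Hypothetical helper: return segments containing `phrase`, each with a timestamped link."""
    needle = phrase.lower()
    # Use "&" when the URL already has a query string (watch?v=...), "?" otherwise.
    separator = "&" if "?" in video_url else "?"
    hits = []
    for segment in segments:
        if needle in segment["text"].lower():
            seconds = int(segment["start"])  # drop the fractional part for the t parameter
            hits.append({
                "start": segment["start"],
                "text": segment["text"],
                "link": f"{video_url}{separator}t={seconds}s",
            })
    return hits


segments = [
    {"start": 0.0, "duration": 4.12, "text": "Welcome back to the channel."},
    {"start": 4.12, "duration": 3.8, "text": "Today we're looking at..."},
]
for hit in find_moments("https://www.youtube.com/watch?v=dQw4w9WgXcQ", segments, "looking"):
    print(hit["link"])
```

Matching is a plain case-insensitive substring test, so a phrase that spans two segments is not found; joining neighbouring segments before searching would be one way to handle that.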
AI Chapter Output
When generateChapters is enabled, an AI provider key is set, and the transcript was retrieved, each video record gains a chapters array:
{
  "chapters": [
    {"start": 0, "end": 42.5, "title": "Intro and new studio setup", "summary": "Creator welcomes viewers and introduces the revamped studio."},
    {"start": 42.5, "end": 310.0, "title": "iPhone 16 Pro camera test", "summary": "Walk-through of the new 48MP main sensor with outdoor comparison shots."},
    {"start": 310.0, "end": 720.0, "title": "Battery life and benchmarks", "summary": "Real-world battery test plus synthetic CPU/GPU benchmarks vs. last year's model."}
  ]
}
- Timestamps are in seconds (float). Rounded down to whole seconds, they map directly to YouTube's URL time parameter (&t=310s).
- Typical output is 3-8 chapters per video, based on topic shifts in the transcript.
- Chapters are chronological, non-overlapping, and cover the whole video from 0 to duration.
- If the video has no transcript (transcript_available: false), no chapters are generated and no charge is emitted.
- On rare chapter-generation failures, the chapters field is simply absent and no charge is emitted - your metrics + transcripts are unaffected.
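To show how the chapters array might be consumed, the sketch below checks the coverage property described above and turns chapters into "M:SS Title" lines of the kind commonly pasted into a video description. All three function names are hypothetical, and the small tolerance used when comparing float timestamps is an assumption.

```python
def format_timestamp(seconds: float) -> str:
    """Render seconds as M:SS, or H:MM:SS for videos over an hour."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def chapters_cover_video(chapters: list, duration: float, tol: float = 0.01) -> bool:
    """True if chapters start at 0, end at `duration`, and leave no gaps or overlaps."""
    if not chapters:
        return False
    if abs(chapters[0]["start"]) > tol or abs(chapters[-1]["end"] - duration) > tol:
        return False
    pairs = zip(chapters, chapters[1:])
    return all(abs(prev["end"] - nxt["start"]) <= tol for prev, nxt in pairs)


def description_lines(chapters: list) -> list:
    """One 'timestamp title' line per chapter."""
    return [f"{format_timestamp(c['start'])} {c['title']}" for c in chapters]


chapters = [
    {"start": 0, "end": 42.5, "title": "Intro and new studio setup"},
    {"start": 42.5, "end": 310.0, "title": "iPhone 16 Pro camera test"},
    {"start": 310.0, "end": 720.0, "title": "Battery life and benchmarks"},
]
if chapters_cover_video(chapters, duration=720.0):
    print("\n".join(description_lines(chapters)))
```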
Quick Start
Example: Auto-chapter a long-form podcast (highest-value mode)
Pulls the timed transcript, generates AI chapter timestamps with summaries, and runs the optional AI analysis. Best for podcasts, interviews, and lectures (videos longer than 10 minutes).
{"mode": "video","videoUrls": ["https://www.youtube.com/watch?v=YOUR_PODCAST_VIDEO_ID"],"includeTranscripts": true,"generateChapters": true,"enableAiAnalysis": true,"llmProvider": "openrouter"}
Per video: $0.003 (video) + $0.005 (transcript) + $0.01 (chapters) + $0.05 (AI) = ~$0.068/video. 30-min weekly podcast season (10 episodes): $0.68.
Example: Quick Channel Check
The simplest way to get started - analyze a YouTube channel's top videos:
{"mode": "channel","channelUrls": ["https://www.youtube.com/@mkbhd"],"maxVideos": 10}
This scrapes the 10 most popular videos from MKBHD's channel with full metrics.
Example: Scrape Latest Channel Videos with Transcripts
Fetch the 15 newest uploads from a channel and download the English transcript for each video - useful for content monitoring pipelines that need the spoken text, not just titles.
{"mode": "channel","channelUrls": ["https://www.youtube.com/@lexfridman"],"maxVideos": 15,"sortVideosBy": "newest","includeTranscripts": true,"transcriptLanguages": ["en"]}
Each video item includes transcript_text (flat string) and transcript_segments (timestamped array) alongside standard metadata such as view count, like count, and duration.
Example: Search YouTube for a Keyword
Search for a topic and retrieve the top 20 matching videos with full metadata - ideal for market research or tracking which creators dominate a niche.
{"mode": "search","searchQuery": "electric vehicle review 2026","maxVideos": 20}
Returns a list of 20 video objects, each containing title, channel name, view count, like count, engagement rate, duration, and publish date.
Example: Weekly Competitor Channel Analysis
Analyze a competitor's 50 most recent videos and generate an AI report on their content strategy - schedule this weekly to track messaging shifts and upload cadence over time.
{"mode": "channel","channelUrls": ["https://www.youtube.com/@mkbhd"],"maxVideos": 50,"sortVideosBy": "newest","enableAiAnalysis": true,"llmProvider": "openrouter","openrouterApiKey": "sk-or-..."}
Produces a structured AI report covering upload consistency score, audience engagement score, top content themes, and growth recommendations alongside full video metrics for all 50 videos.
Tips for Best Results
- Sort by popular for content strategy. Analyzing a channel's most popular videos reveals what content themes and formats resonate most with their audience.
- Sort by newest for competitive monitoring. Track what competitors are publishing right now and how their recent content performs.
- Use comments for sentiment analysis. Top comments provide qualitative audience feedback that complements quantitative engagement metrics.
- Enable AI analysis for strategic insights. The AI report synthesizes video metrics, engagement patterns, and content themes into actionable content strategy recommendations.
- Combine transcripts with AI analysis. When both are enabled, the AI report adds transcript-derived themes, key phrases, and creator voice-style - surfacing what the creator actually talks about, not just title keywords. This is the most actionable config for sponsorship research, brand-fit evaluation, and content gap analysis.
- Use generateChapters for long-form content. Auto-chaptering shines on podcasts, interviews, tutorials, and explainers where topic shifts matter. For short videos (<2 min) or single-topic content, chapters add little value - skip the feature and save the $0.01 per video.
- Compare engagement rates, not just views. A channel with 100K views and 5% engagement is often more valuable for sponsorships than one with 1M views and 0.5% engagement.
- Schedule weekly runs for ongoing monitoring. Track how channels evolve their content strategy and audience engagement over time.
MCP Quickstart - call this actor from Claude / Cursor / ChatGPT
Open Apify's hosted MCP configurator at mcp.apify.com, or install the Apify MCP server in your AI agent of choice:
# Claude Code
claude mcp add apify -- npx -y @apify/actors-mcp-server --token YOUR_APIFY_TOKEN
# Claude Desktop / Cursor (add to mcp.json):
{"mcpServers":{"apify":{"command":"npx","args":["-y","@apify/actors-mcp-server","--token","YOUR_APIFY_TOKEN"]}}}
Then prompt the agent:
"Use the harvestlab/youtube-scraper actor on Apify to fetch the 30 most recent videos from the @LexClips channel with timed transcripts and AI-generated chapters. Push the results back as JSON."
Through Apify MCP, the agent can generate the right input, run the actor, and pipe the typed output back into your conversation.
Troubleshooting
No transcripts returned for a video
YouTube restricts transcripts on some videos (auto-captions disabled by the creator, music-industry content, age-gated, member-only, or live streams in progress). The video item is returned with transcript_available: false and empty transcript_text / transcript_segments - and the transcript-scraped event is not charged in this case. Workaround: set includeTranscripts: false to skip the transcript fetch entirely for faster runs, or target channels known to have captions enabled.
403/429 errors or empty results
This actor does not use the YouTube Data API. 403 / 429 responses come from YouTube rate-limiting your exit IP, not from an API quota. Datacenter proxies are flagged most aggressively. Add proxyConfiguration with useApifyProxy: true and apifyProxyGroups: ["RESIDENTIAL"] to rotate through residential exits. If you still see 429s, reduce maxVideos per run and split large jobs across multiple scheduled runs to spread the load over time.
AI analysis fails with "API key missing"
Set openrouterApiKey (or your chosen provider's key) in the actor input, or pass it via the matching env var (OPENROUTER_API_KEY, ANTHROPIC_API_KEY, GOOGLE_API_KEY, OPENAI_API_KEY). AI analysis is optional - leave enableAiAnalysis: false (the default) to skip it without losing other data. Ollama uses ollamaBaseUrl and needs a reachable local server instead.
Channel scraping returns fewer videos than expected
YouTube paginates channel videos in batches of ~30. The actor follows continuation tokens up to maxVideos (max 100). For channels with fewer total uploads than maxVideos, you simply get all available videos - this is expected, not a failure. Set sortVideosBy to newest or oldest if you need a specific slice instead of the popularity ranking. Very large channels (10,000+ videos) hit rate limits faster - combine residential proxies with multiple smaller scheduled runs.
Videos returning with no or incomplete metadata
Private, age-restricted, or member-only videos expose limited data even when visible in search or channel listings. A residential proxy group materially improves coverage for public videos that return incomplete metadata. If a specific video consistently returns empty fields, it is likely restricted at the source and cannot be fully scraped regardless of proxy choice.
Transcript not available for this video
The video may have auto-captions disabled, or the language you requested is not available. Check the transcriptLanguages parameter (note: plural) - if you set ["en"] but the channel primarily publishes in another language, add fallback codes (e.g. ["en", "es", "fr"]). The first matching track in priority order wins; if none match, the first available track is returned. Music videos, live streams in progress, and member-only content routinely have no accessible captions; the actor returns transcript_available: false for these and does not charge the transcript-scraped event.
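The selection rule described here can be sketched as follows. This is an illustration only: the track dictionaries with language and kind keys are a hypothetical shape, and the source does not spell out how language priority and the uploaded-over-ASR preference interact, so the sketch assumes language order is checked first and uploaded captions win within a language.

```python
from typing import Optional


def pick_track(tracks: list, preferred: list) -> Optional[dict]:
    """Hypothetical sketch: choose a caption track by language priority, uploaded before ASR."""
    for language in preferred:
        matches = [t for t in tracks if t["language"] == language]
        if matches:
            # kind "" means uploaded, "asr" means auto-generated.
            # False sorts before True, so an uploaded track is chosen when one exists.
            return min(matches, key=lambda t: t["kind"] == "asr")
    # No preferred language matched: fall back to the first available track.
    return tracks[0] if tracks else None


tracks = [
    {"language": "es", "kind": ""},
    {"language": "en", "kind": "asr"},
    {"language": "en", "kind": ""},
]
print(pick_track(tracks, ["en", "es"]))
```

When the result is None, the corresponding actor output is the transcript_available: false case described above.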
AI chapters are missing even though transcripts succeeded
Chapter generation requires includeTranscripts: true, generateChapters: true, and a working AI provider key for the chosen llmProvider. If chapter generation fails, the chapters field is silently omitted and no chapter-generated charge is emitted - your video metrics and transcripts are unaffected. Check the run log for chapter generation failed warnings, then verify your API key and credit balance with the provider.
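A quick client-side check before starting a run can catch the most common causes. The sketch below is a hypothetical preflight, not part of the actor: it tests the three requirements listed above, using the input field names from the parameter table and the environment variable names from the "API key missing" entry. The `env` argument is passed explicitly so the check stays deterministic.

```python
# Provider -> (input field, environment variable), as named elsewhere in this README.
PROVIDER_KEYS = {
    "openrouter": ("openrouterApiKey", "OPENROUTER_API_KEY"),
    "anthropic": ("anthropicApiKey", "ANTHROPIC_API_KEY"),
    "google": ("googleApiKey", "GOOGLE_API_KEY"),
    "openai": ("openaiApiKey", "OPENAI_API_KEY"),
}


def chapter_preflight(run_input: dict, env: dict = None) -> list:
    """Hypothetical check: return a list of reasons chapters would not be generated."""
    env = env or {}
    problems = []
    if not run_input.get("includeTranscripts"):
        problems.append("includeTranscripts must be true")
    if not run_input.get("generateChapters"):
        problems.append("generateChapters must be true")
    provider = run_input.get("llmProvider", "openrouter")  # documented default
    if provider == "ollama":
        return problems  # Ollama uses ollamaBaseUrl rather than an API key
    if provider not in PROVIDER_KEYS:
        problems.append(f"unknown llmProvider: {provider}")
        return problems
    field, env_var = PROVIDER_KEYS[provider]
    if not (run_input.get(field) or env.get(env_var)):
        problems.append(f"no key for {provider}: set {field} or {env_var}")
    return problems
```

An empty list means the input satisfies the documented requirements; it does not confirm that the key is valid or that the provider account has credit, which still has to be checked with the provider.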
Rate limiting or quota exceeded errors
YouTube applies rate limits at both the IP and session level. Datacenter IPs hit these limits faster than residential ones. Add a proxyConfiguration block with useApifyProxy: true and apifyProxyGroups: ["RESIDENTIAL"] to route requests through rotating residential exits. If you are already using residential proxies and still seeing 429 responses, reduce maxVideos per run and split large jobs across multiple scheduled runs to spread the load over time.
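Splitting a large job can be as simple as slicing the URL list and creating one input per slice, each of which is then scheduled as its own run. A minimal sketch, with `split_into_runs` as a hypothetical helper and the batch size left for you to choose:

```python
def split_into_runs(video_urls: list, batch_size: int, base_input: dict) -> list:
    """Hypothetical helper: build one run input per batch of at most `batch_size` URLs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        # Copy the shared settings and attach this batch's slice of URLs.
        {**base_input, "videoUrls": video_urls[i:i + batch_size]}
        for i in range(0, len(video_urls), batch_size)
    ]


urls = [f"https://www.youtube.com/watch?v=VIDEO_ID_{n}" for n in range(5)]
base = {
    "mode": "transcript",
    "transcriptLanguages": ["en"],
    "proxyConfiguration": {"useApifyProxy": True, "apifyProxyGroups": ["RESIDENTIAL"]},
}
for run_input in split_into_runs(urls, 2, base):
    print(len(run_input["videoUrls"]), "videos in this run")
```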
Known Limitations
- YouTube may rate-limit requests for large scraping jobs; the actor retries up to 3 times per page
- Comment scraping retrieves top comments only (sorted by YouTube's relevance algorithm), not all comments
- Subscriber counts and view counts come from YouTube's abbreviated format (e.g., "1.2M"), so the returned values are approximate
- YouTube Shorts metrics may be less complete than long-form video metrics
- YouTube's page structure may change; the actor handles multiple layout variations but temporary disruptions are possible
- AI analysis quality depends on the chosen model and the number of videos analyzed (15+ recommended)
- Maximum 100 videos per channel per run
- Private or age-restricted videos cannot be scraped
- Tags coverage is partial on datacenter proxies. YouTube selectively strips the keywords field from video pages returned to IPs it flags as suspicious. Some videos may still return empty tags: []. For maximum tag coverage, use a residential proxy group. Note that many large channels (e.g. MrBeast) genuinely set no tags on their videos - an empty tags array in those cases is the correct result, not a scraping failure.
- Transcripts are not available on every video. Private, age-restricted, member-only, and music-industry videos typically disable captions. Music videos often have ASR-only captions that transcribe background lyrics. Live streams usually have no captions while live and limited ASR after the stream ends. The actor returns transcript_available: false and empty strings/arrays when no captions are accessible - this is not a failure.
- AI chapters depend on transcript quality. When the transcript is auto-generated, segmentation follows whatever the caption track captured - filler-heavy, stream-of-consciousness videos may produce fewer or less-precise chapters than tightly-scripted content. Very short videos (<2 minutes) may collapse into a single chapter, which is suppressed (min 3 chapters). On rare chapter-generation failures the chapters field is absent and no charge is emitted.
Frequently Asked Questions
Can I scrape YouTube transcripts without an API key?
Yes. This actor does not require a YouTube Data API key. Provide one or more public video URLs in transcript mode and it returns the available transcript text and timestamped segments.
Does it return timestamped transcript segments?
Yes. Each transcript row includes transcript_text for full-text use and transcript_segments for timestamped use cases such as clipping, chaptering, semantic search, quote lookup, and "jump to moment" interfaces.
Can it scrape auto-generated YouTube captions?
Yes. The actor prefers human-uploaded captions when available and falls back to auto-generated ASR captions. The transcript_kind field shows whether the returned captions are uploaded or ASR-generated.
Why are some YouTube transcripts unavailable?
Some videos do not expose accessible captions: private videos, age-restricted videos, member-only videos, live streams in progress, many music videos, and videos where the creator disabled captions. These return transcript_available: false, and the transcript event is not charged.
What transcripts does this return?
YouTube videos typically have one or both of: human-uploaded captions (manually created by the uploader, highest accuracy) and auto-generated ASR captions (machine transcribed, available on most public videos in many languages). The scraper returns the best match for your transcriptLanguages preference, preferring uploaded over ASR. The transcript_kind field tells you which was returned. Paid/private captions and closed captions locked behind DRM are not accessible.
How do AI chapters work?
When generateChapters is enabled, the actor analyzes each video's timestamped transcript and returns 3-8 chapters with start / end seconds, a concise title, and a 1-sentence summary. This is automatic chapter marking for videos where the creator did not write chapters - useful for podcasters publishing to YouTube, agencies clipping long-form content, or anyone who wants searchable timestamps without manual authoring. Requires both includeTranscripts=true and an AI provider key; you are charged $0.01 per video only when chapters are successfully produced.
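One way to use the returned chapters is to render them as timestamp lines for a video description. The sketch below assumes each chapter is a dict with "start" (seconds) and "title" keys; the exact key names in the chapters field are an assumption here, and both helper functions are hypothetical.

```python
def format_timestamp(seconds):
    """Format a second offset as M:SS, or H:MM:SS once it reaches an hour."""
    seconds = int(seconds)
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def chapters_to_description(chapters):
    """Render one 'timestamp title' line per chapter."""
    return "\n".join(
        f"{format_timestamp(c['start'])} {c['title']}" for c in chapters
    )


# Hand-written sample chapters, not real actor output.
sample_chapters = [
    {"start": 0, "end": 95, "title": "Intro"},
    {"start": 95, "end": 3725, "title": "Main discussion"},
    {"start": 3725, "end": 4000, "title": "Wrap-up"},
]
description = chapters_to_description(sample_chapters)
print(description)
```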
Why is this scraper faster and cheaper than others?
Transcript-first runs are intentionally lean, which keeps the common URL-to-transcript workflow fast and inexpensive.
Can I scrape multiple channels in one run?
Yes. In channel mode, provide multiple URLs in the channelUrls array. Each channel will be scraped sequentially with up to maxVideos videos per channel. AI analysis is generated per channel.
How is engagement rate calculated?
Engagement rate is calculated as (like count + comment count) / view count, expressed as a percentage. This metric normalizes for audience size, allowing fair comparison between channels of different scales.
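The formula can be reproduced in a few lines if you want to recompute or sanity-check the value yourself. This is a sketch of the stated formula, not the actor's own code; returning 0.0 for videos with no views is an assumption made here to avoid division by zero.

```python
def engagement_rate(likes, comments, views):
    """(likes + comments) / views, expressed as a percentage."""
    if views <= 0:
        return 0.0  # assumption: treat zero-view videos as 0% engagement
    return (likes + comments) * 100 / views


# 450 likes + 50 comments on 10,000 views
rate = engagement_rate(450, 50, 10_000)
print(f"{rate:.2f}%")
```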
Does this scraper detect YouTube Shorts?
Yes. Each video includes an is_short boolean field that identifies whether the content is a YouTube Short. This allows you to filter or analyze Shorts separately from long-form content.
Can I search for videos by keyword?
Yes. Use search mode with a searchQuery parameter. For example, search for "product review 2026" to find recent review content. Search mode returns videos from across YouTube, not limited to specific channels.
How do I track channel performance over time?
Schedule regular runs on Apify (weekly or monthly) for the same channels. Over time, you build a dataset showing subscriber growth, engagement trends, content theme shifts, and upload frequency changes.
Use with AI agents (LangChain & LangGraph)
Outputs from this actor are structured JSON: video metadata, timed transcripts, and AI chapters can be used directly in a LangChain Tool, LangGraph node, search index, spreadsheet, or internal workflow.
LangChain - wrap the actor as a Tool
from apify_client import ApifyClient
from langchain_core.tools import Tool  # Tool is defined in langchain_core

client = ApifyClient("YOUR_APIFY_TOKEN")

def youtube_scraper(channel_handle: str) -> list[dict]:
    # Start the actor in channel mode and wait for the run to finish.
    run = client.actor("harvestlab/youtube-scraper").call(run_input={
        "mode": "channel",
        "channelUrls": [f"https://www.youtube.com/{channel_handle}"],
        "maxVideos": 20,
        "includeTranscripts": True,
    })
    # Read every item from the run's default dataset.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())

youtube_tool = Tool(
    name="youtube_scraper",
    description="Fetch YouTube channel videos + transcripts. Input: a channel handle like '@mkbhd'.",
    func=youtube_scraper,
)

# Agent calls it with a single string input: youtube_tool.invoke("@mkbhd")
LangGraph - call the actor inside a StateGraph node
from typing import TypedDict

from apify_client import ApifyClient
from langgraph.graph import StateGraph, END

client = ApifyClient("YOUR_APIFY_TOKEN")

class State(TypedDict):
    channelHandle: str
    transcripts: list[dict]

def fetch_youtube(state: State) -> State:
    # Run the actor for the channel named in the graph state.
    run = client.actor("harvestlab/youtube-scraper").call(run_input={
        "mode": "channel",
        "channelUrls": [f"https://www.youtube.com/{state['channelHandle']}"],
        "maxVideos": 10,
        "includeTranscripts": True,
    })
    items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
    return {**state, "transcripts": items}

graph = StateGraph(State)
graph.add_node("fetch_youtube", fetch_youtube)
graph.set_entry_point("fetch_youtube")
graph.add_edge("fetch_youtube", END)
app = graph.compile()
See also apify/actor-templates/js-langchain and js-langgraph-agent for full template scaffolds in JavaScript.
Scheduling and webhooks
Schedule daily or weekly YouTube runs in Apify Console to keep a live feed of channel uploads or keyword results. Wire a webhookUrl in n8n or Make to push each new video into a Notion content calendar, Slack editorial alert, or CMS queue the moment a run completes.
Legal and Compliance
This actor scrapes publicly available data. By using this actor, you agree to the following:
- Your responsibility: You are solely responsible for ensuring your use complies with all applicable laws, regulations, and the target website's terms of service. This includes but is not limited to GDPR (EU), CCPA (California), and other data protection laws in your jurisdiction.
- No legal advice: This actor does not constitute legal advice. Consult a qualified attorney if you have questions about the legality of your specific use case.
- Intended use: This actor is designed for legitimate business purposes such as market research, competitive analysis, and academic research using publicly accessible data.
- Data handling: You are responsible for how you store, process, and share any data collected. Ensure you have a lawful basis for processing any personal data under applicable privacy laws.
- Rate limiting: This actor implements polite crawling practices including request delays and retry backoff to minimize impact on target servers.
- No warranty: This actor is provided "as is" without warranty. Data accuracy depends on the target website's content and structure.
- YouTube data: YouTube's terms of service restrict automated data collection. Consider using the official YouTube Data API for production use cases. This actor is intended for analytics and research purposes.
- Personal data notice: Channel and video data may include creator names, profile images, and commenter usernames. Under GDPR and similar regulations, this constitutes personal data subject to data protection requirements. Ensure you have a lawful basis for processing. Do not use extracted data for unsolicited contact or harassment.
Related Actors
- Google News Monitor - Pair YouTube creator coverage with Google News article tracking for cross-channel media monitoring on the same topic, brand, or industry event.
- Reddit Scraper - Capture community discussion of the videos you're tracking; cross-reference YouTube engagement with Reddit thread sentiment for a fuller audience-reaction picture.
- ProductHunt Scraper - Track creator-economy product launches and creator tools alongside YouTube channel analytics for end-to-end creator and launch intelligence.