YouTube MCP Transcript & AI Chapter — Analytics avatar

Skyfire/x402 payment-ready YouTube transcripts + AI auto-chapters + channel analytics at $0.003/video — 99%+ run success. RAG-ready, MCP-native. Replaces Rev.com $90/hr + vidIQ + Otter.ai/Sonix/Descript. 5 LLMs, 50+ langs.

Developer: Nick · Maintained by Community

YouTube Transcript & AI Chapter Generator — Analytics

🤖 MCP-Ready — works as a tool for Claude, Cursor, ChatGPT, and other AI agents via the Apify MCP Server, with typed inputs and structured outputs (dataset_schema).

🧰 Works with: LangChain · LangGraph · CrewAI · LlamaIndex · PydanticAI · Mastra AI · Bee AI — call this actor as a tool from any Apify MCP or actor-templates-backed AI framework.

💰 Pay-per-result, not per compute unit — you know your cost BEFORE pressing Run. No surprise compute charges, no monthly minimums, no SaaS subscription. Compare to Apify's default $0.20/CU model where cost depends on actor runtime, memory, and data volume.

🔁 Apify rental alternative — Apify is sunsetting the rental pricing model in October 2026. This Actor is already 100% pay-per-event: drop in as a 1:1 replacement for any rental-based competitor without a contract or monthly minimum.

Generate YouTube transcripts, AI chapter markers, and channel analytics without an API key or browser — a pay-per-video alternative to vidIQ ($39-$415/mo), TubeBuddy ($9-$50/mo), and manual transcription services ($1.00-$1.50/audio-min via Rev). At $0.003 per video, $0.005 per transcript, and $0.01 per AI chapter set, a 100-video competitor audit with an AI report costs ~$0.35 instead of $415/mo, and clip agencies replace ~$90 of Rev transcription on a 60-min podcast with ~$0.018 here (video + transcript + chapters). Lightweight HTTP requests parse ytInitialData directly — significantly faster and cheaper than browser-based scrapers.

Whether you are a marketing agency benchmarking influencer channels, a content creator optimizing your upload strategy, or a brand manager evaluating sponsorship opportunities, this actor delivers structured YouTube data plus transcript-aware AI content intelligence.

What it does

This actor scrapes publicly accessible YouTube data across three modes:

  • Channel mode — analyze any YouTube channel's recent or most popular videos. Retrieves channel metadata (subscriber count, total views, verification status, country, handle) plus per-video data (title, view count, like count, comment count, duration, tags, engagement rate, and Shorts detection). Optionally downloads full transcripts per video.
  • Search mode — run a keyword search on YouTube and retrieve the top matching videos with full metadata. Useful for discovering which creators dominate a niche or tracking what content is trending around a topic.
  • Video mode — fetch detailed data for specific video URLs you supply. Returns full descriptions, tags, likes, comments, and optional transcript — all in one structured output item per video.

On top of raw scraping, the actor optionally:

  • Downloads full video subtitles (human-uploaded captions preferred, auto-generated ASR fallback) in your chosen language.
  • Auto-segments each transcript into 3–8 AI-generated chapters with start/end timestamps, a short title, and a one-sentence summary — replacing manual chapter authoring for podcasters, long-form creators, and clip agencies.
  • Runs an AI-powered channel analysis that synthesizes video metrics, engagement patterns, and transcript content themes into a strategic report with upload consistency scores, audience engagement scores, and growth recommendations.

No YouTube Data API key is required. No browser is launched. All data is fetched via direct HTTP requests, parsed from YouTube's server-rendered ytInitialData JSON payload.
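
The fetch-and-parse approach can be sketched in a few lines. This is an illustrative reconstruction, not the actor's internal code; the `extract_yt_initial_data` helper and the stubbed HTML are hypothetical, and a production extractor would also need to handle `};` sequences inside JSON strings:

```python
import json
import re

# YouTube embeds a server-rendered JSON blob in each watch/channel page as
# `var ytInitialData = {...};` — a plain HTTP client can extract it without
# launching a browser.
YT_INITIAL_DATA_RE = re.compile(r"var ytInitialData\s*=\s*(\{.*?\});", re.DOTALL)

def extract_yt_initial_data(html: str) -> dict:
    """Pull the ytInitialData JSON payload out of a YouTube page's HTML."""
    match = YT_INITIAL_DATA_RE.search(html)
    if not match:
        raise ValueError("ytInitialData not found in page")
    return json.loads(match.group(1))

# Stubbed page fragment standing in for a real YouTube response:
html = '<script>var ytInitialData = {"header": {"title": "MKBHD"}};</script>'
data = extract_yt_initial_data(html)
```

Skipping the browser is what makes the runtime numbers below possible: each page is one HTTP round-trip plus JSON parsing, with no Chromium startup or rendering cost.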

Features

  • No browser required — uses direct HTTP requests to fetch YouTube pages, making it significantly faster and cheaper than browser-based scrapers
  • 3 scraping modes — channel analysis, video search, or single video details to match your research workflow
  • Full channel metadata — subscriber count, total views, video count, channel description, verification status, country, and handle
  • Rich video data — title, view count, like count, comment count, duration, tags, publish date, engagement rate, and Shorts detection
  • Engagement rate calculation — automatically computes engagement rate ((likes + comments) / views, as a percentage) for every video
  • Comment extraction — scrape top comments with author name, comment text, like count, and publish date
  • Video transcripts — download full video subtitles via YouTube's timedtext endpoint (prefers human-uploaded, falls back to auto-generated ASR). Includes both plain-text and per-segment timestamped output. Language preference is configurable
  • AI chapter auto-segmentation — feed transcripts to an LLM to synthesize 3-8 chapter markers per video with start, end, title, and summary. Replaces manual chapter markers for podcasters and long-form creators, and produces searchable timestamps for clip extraction, content search, and video navigation. Only charged when chapters are successfully generated
  • Sort flexibility — sort channel videos by most popular, newest, or oldest to focus your analysis
  • Transcript-aware AI analysis — when transcripts are enabled alongside AI analysis, the report extracts themes, key phrases, and voice-style from actual spoken content instead of only titles
  • AI channel analysis — optional LLM-powered insights covering content strategy, upload consistency scoring, audience engagement scoring, top-performing themes, and growth recommendations
  • Multi-LLM support — choose OpenRouter (recommended — 300+ models), Anthropic (Claude), Google AI (Gemini), OpenAI (GPT), or Ollama (self-hosted) for AI analysis
  • Pay-per-event pricing — only pay for videos and transcripts you scrape, no monthly fees

Use Cases

  • Marketing agencies — research competitor channels, benchmark engagement rates across your client's vertical, and build data-driven content strategies. Compare multiple channels side-by-side with AI-generated positioning insights.
  • Content creators — analyze your own channel performance to identify top-performing content themes. Understand which video formats, topics, and lengths drive the most engagement. Optimize upload timing and strategy.
  • Brand managers — evaluate influencer channels for sponsorship fit. Track engagement rates, audience sentiment through comments, and content consistency before committing marketing budgets.
  • Influencer marketing platforms — build and maintain influencer databases with up-to-date metrics. Score channels by engagement quality, not just subscriber count. Detect fake engagement through metric analysis.
  • Competitive intelligence teams — monitor competitor YouTube channels for new content, messaging changes, product announcements, and audience reactions. Track engagement trends over time.
  • PR and communications professionals — monitor brand mentions and sentiment across YouTube comments. Track how product launches and announcements are received by video audiences.
  • Academic researchers — collect structured YouTube data for media studies, audience behavior research, content virality analysis, and platform ecosystem studies. Transcript mode produces a searchable corpus of spoken content across channels for qualitative and NLP analysis.
  • AI training and content analysts — use transcript mode to build captioned video datasets for fine-tuning, RAG pipelines, or semantic search. Combine with AI analysis for theme tagging and voice-style classification at scale.
  • SEO and content strategists — analyze video tags, titles, and descriptions to understand keyword strategies. Identify content gaps and high-engagement topics in your niche.
  • Podcasters and long-form creators — auto-generate chapter markers for every episode when you forgot (or didn't want) to write them manually. Upload the chapters to YouTube's description, surface them in your podcast RSS, or drive a "jump to section" UI on your show page.
  • Video clip agencies and editors — use AI chapter output to slice long-form content into topic-coherent clips for shorts, reels, or social distribution without watching the full video first.

Input

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| mode | string | channel | Scraping mode: channel, search, or video |
| channelUrls | array | -- | YouTube channel URLs (required for channel mode) |
| searchQuery | string | -- | Search query (required for search mode) |
| videoUrls | array | -- | Video URLs (required for video mode) |
| maxVideos | integer | 20 | Maximum videos per channel or search (1-100) |
| includeComments | boolean | false | Scrape top comments for each video |
| maxCommentsPerVideo | integer | 10 | Comments per video (1-50) |
| includeTranscripts | boolean | false | Download video subtitles (human-uploaded preferred, ASR fallback) |
| transcriptLanguages | array | ["en"] | Preferred transcript language codes (first-match priority) |
| generateChapters | boolean | false | Auto-segment each transcript into 3-8 AI-generated chapters (requires includeTranscripts + LLM key) |
| sortVideosBy | string | popular | Sort channel videos: popular, newest, oldest |
| enableAiAnalysis | boolean | false | Enable AI channel analysis |
| llmProvider | string | openrouter | AI provider: openrouter, anthropic, google, openai, or ollama |
| llmModel | string | -- | Override default model (leave empty for recommended default) |
| openrouterApiKey | string | -- | OpenRouter API key (required if using OpenRouter) |
| anthropicApiKey | string | -- | Anthropic API key (required if using Anthropic) |
| googleApiKey | string | -- | Google AI (Gemini) API key (required if using Google) |
| openaiApiKey | string | -- | OpenAI API key (required if using OpenAI) |
| ollamaBaseUrl | string | http://localhost:11434 | Ollama API base URL (for self-hosted LLM) |
| proxyConfiguration | object | {useApifyProxy: true, groups: [RESIDENTIAL]} | Proxy settings (RESIDENTIAL strongly recommended) |

Pricing

This actor uses Apify's pay-per-event pricing model. You only pay for what you scrape.

| Event | Price | Description |
| --- | --- | --- |
| video-scraped | $0.003 | Charged per video extracted |
| transcript-scraped | $0.005 | Charged per transcript successfully downloaded (only when includeTranscripts is enabled) |
| chapter-generated | $0.01 | Charged per video chapter-set successfully generated (only when generateChapters is enabled and chapters were produced) |
| ai-analysis-completed | $0.05 | Charged per AI channel analysis report |

Cost Examples

| Scenario | Videos | Transcripts | Chapters | AI Analysis | Total Cost |
| --- | --- | --- | --- | --- | --- |
| Quick channel check | 10 | No | No | No | $0.03 |
| Channel deep dive | 20 | No | No | Yes | $0.11 |
| Channel deep dive + transcripts | 20 | Yes (20) | No | Yes | $0.21 |
| Podcast chapter generation | 20 | Yes (20) | Yes (20) | No | $0.36 |
| Full content intelligence pass | 20 | Yes (20) | Yes (20) | Yes | $0.41 |
| Multi-channel comparison | 50 | No | No | Yes | $0.20 |
| Multi-channel + transcripts | 50 | Yes (50) | No | Yes | $0.45 |
| Large content audit | 100 | No | No | Yes | $0.35 |
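
The event prices above can be wrapped in a small estimator so you know the total before pressing Run. A sketch (the `estimate_cost` helper is hypothetical, but the prices match the event table):

```python
# Per-event prices from the pricing table above (USD).
PRICES = {
    "video-scraped": 0.003,
    "transcript-scraped": 0.005,
    "chapter-generated": 0.01,
    "ai-analysis-completed": 0.05,
}

def estimate_cost(videos: int, transcripts: int = 0, chapters: int = 0,
                  ai_reports: int = 0) -> float:
    """Estimate a run's total cost from expected event counts."""
    total = (videos * PRICES["video-scraped"]
             + transcripts * PRICES["transcript-scraped"]
             + chapters * PRICES["chapter-generated"]
             + ai_reports * PRICES["ai-analysis-completed"])
    return round(total, 2)

# "Full content intelligence pass" row: 20 videos, 20 transcripts,
# 20 chapter sets, 1 AI report -> $0.41
cost = estimate_cost(videos=20, transcripts=20, chapters=20, ai_reports=1)
```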

Typical Runtime

Because this actor uses HTTP requests instead of a browser, it runs significantly faster than browser-based YouTube scrapers:

  • 10 videos without comments: ~30-60 seconds
  • 20 videos without comments: ~1-2 minutes
  • 20 videos with comments: ~2-4 minutes
  • 50 videos with comments: ~5-8 minutes
  • AI analysis adds ~15-30 seconds to any run

Output

Channel Mode Output

In channel mode, each result contains channel metadata, video list, and optional AI analysis:

{
  "channel": {
    "channel_id": "UCBcRF18a7Qf58cCRy5xuWwQ",
    "channel_name": "MKBHD",
    "handle": "@mkbhd",
    "subscriber_count": 19200000,
    "total_videos": 20,
    "total_views": 450000000,
    "description": "...",
    "verified": true,
    "country": "US"
  },
  "videos": [
    {
      "video_id": "abc123",
      "title": "Video Title",
      "view_count": 5000000,
      "like_count": 200000,
      "comment_count": 15000,
      "duration_seconds": 720,
      "engagement_rate": 4.3,
      "is_short": false,
      "tags": ["tech", "review"],
      "comments": [
        {
          "author": "User",
          "text": "Great video!",
          "likes": 500,
          "published_date": "2 weeks ago"
        }
      ]
    }
  ],
  "ai_analysis": {
    "channel_overview": "...",
    "upload_consistency_score": 8,
    "audience_engagement_score": 9,
    "top_performing_themes": ["tech reviews", "smartphones"],
    "recommendations": ["..."]
  },
  "scraped_at": "2026-04-10T10:30:00Z"
}

Search and Video Mode

In search and video mode, each result is a flat video object with the following fields:

{
  "videoId": "dQw4w9WgXcQ",
  "title": "Rick Astley - Never Gonna Give You Up (Official Video)",
  "description": "The official video for \"Never Gonna Give You Up\" by Rick Astley...",
  "channelId": "UCuAXFkgsw1L7xaCfnd5JJOw",
  "channelTitle": "Rick Astley",
  "viewCount": 1500000000,
  "likeCount": 16000000,
  "commentCount": 2100000,
  "publishedAt": "2009-10-25T06:57:33Z",
  "duration_seconds": 212,
  "engagement_rate": 1.21,
  "is_short": false,
  "tags": ["rick astley", "never gonna give you up", "pop"],
  "transcript": [
    {"start": 0.0, "duration": 3.5, "text": "We're no strangers to love..."},
    {"start": 3.5, "duration": 4.0, "text": "You know the rules and so do I..."}
  ],
  "chapters": [
    {"start": 0, "end": 43.0, "title": "Opening verse", "summary": "Artist introduces the emotional stakes of commitment."},
    {"start": 43.0, "end": 212.0, "title": "Chorus and bridge", "summary": "Repeated declaration of unconditional loyalty."}
  ]
}

The transcript field is an array of timed segments (only present when includeTranscripts is enabled). The chapters array is only present when generateChapters is enabled and the LLM successfully segmented the transcript.

Transcript Output

When includeTranscripts is enabled, each video record also includes:

{
  "transcript_available": true,
  "transcript_language": "en",
  "transcript_kind": "",
  "transcript_text": "Welcome back to the channel. Today we're looking at...",
  "transcript_segments": [
    {"start": 0.0, "duration": 4.12, "text": "Welcome back to the channel."},
    {"start": 4.12, "duration": 3.8, "text": "Today we're looking at..."}
  ]
}
  • transcript_kind is "" for human-uploaded captions and "asr" for auto-generated (machine-transcribed) captions. The scraper prefers uploaded over ASR and matches your preferred-language list first.
  • transcript_segments preserves timestamps for use cases like chaptering, search-within-video, or clipping. transcript_text is the flat concatenation for full-text search or LLM input.
  • When no captions are available on a video, transcript_available is false and the other fields are empty.
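
The timestamped segments make search-within-video straightforward. A sketch (the `find_in_transcript` helper is hypothetical) that turns matching segments into YouTube `&t=` deep links:

```python
def find_in_transcript(segments: list[dict], query: str, video_id: str) -> list[str]:
    """Return a YouTube deep link for every transcript segment matching query."""
    return [
        f"https://www.youtube.com/watch?v={video_id}&t={int(seg['start'])}s"
        for seg in segments
        if query.lower() in seg["text"].lower()
    ]

# Using the transcript_segments example above:
segments = [
    {"start": 0.0, "duration": 4.12, "text": "Welcome back to the channel."},
    {"start": 4.12, "duration": 3.8, "text": "Today we're looking at..."},
]
links = find_in_transcript(segments, "looking", "abc123")
```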

AI Chapter Output

When generateChapters is enabled (and an LLM key is set and the transcript was retrieved), each video record gains a chapters array:

{
  "chapters": [
    {"start": 0, "end": 42.5, "title": "Intro and new studio setup", "summary": "Creator welcomes viewers and introduces the revamped studio."},
    {"start": 42.5, "end": 310.0, "title": "iPhone 16 Pro camera test", "summary": "Walk-through of the new 48MP main sensor with outdoor comparison shots."},
    {"start": 310.0, "end": 720.0, "title": "Battery life and benchmarks", "summary": "Real-world battery test plus synthetic CPU/GPU benchmarks vs. last year's model."}
  ]
}
  • Timestamps are in seconds (float), so they map directly to YouTube's URL fragment format (&t=310s).
  • Typical output is 3-8 chapters per video — the LLM decides the optimal count based on topic shifts in the transcript.
  • Chapters are chronological, non-overlapping, and cover the whole video from 0 to duration.
  • If the video has no transcript (transcript_available: false), no chapters are generated and no charge is emitted.
  • On rare LLM parse failures, the chapters field is simply absent and no charge is emitted — your metrics + transcripts are unaffected.
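
Because chapter timestamps are plain seconds, converting them into YouTube's description-chapter format ("0:00 Title" lines) takes only a couple of helpers. A sketch (`seconds_to_stamp` and `chapters_to_description` are hypothetical names):

```python
def seconds_to_stamp(seconds: float) -> str:
    """Format seconds as M:SS (or H:MM:SS for videos over an hour)."""
    total = int(seconds)
    h, rem = divmod(total, 3600)
    m, s = divmod(rem, 60)
    return f"{h}:{m:02d}:{s:02d}" if h else f"{m}:{s:02d}"

def chapters_to_description(chapters: list[dict]) -> str:
    """Render a chapters array as YouTube-description chapter lines."""
    return "\n".join(f"{seconds_to_stamp(c['start'])} {c['title']}" for c in chapters)

chapters = [
    {"start": 0, "end": 42.5, "title": "Intro and new studio setup"},
    {"start": 42.5, "end": 310.0, "title": "iPhone 16 Pro camera test"},
    {"start": 310.0, "end": 720.0, "title": "Battery life and benchmarks"},
]
description = chapters_to_description(chapters)
# First line: "0:00 Intro and new studio setup"
```

Pasting the result into a video description gives viewers native chapter navigation on YouTube, which requires the first chapter to start at 0:00.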

Quick Start

Example: Auto-chapter a long-form podcast (highest-value mode)

Pulls the timed transcript, generates AI chapter timestamps with summaries, and runs AI analysis on top. Best for podcasts, interviews, and lectures (videos >10 min).

{
  "mode": "video",
  "videoUrls": ["https://www.youtube.com/watch?v=YOUR_PODCAST_VIDEO_ID"],
  "includeTranscripts": true,
  "generateChapters": true,
  "enableAiAnalysis": true,
  "llmProvider": "openrouter",
  "openrouterApiKey": "sk-or-..."
}

Per video: $0.003 (video) + $0.005 (transcript) + $0.01 (chapters) + $0.05 (AI) = ~$0.068/video. 30-min weekly podcast season (10 episodes): $0.68.

Example: Quick Channel Check

The simplest way to get started — analyze a YouTube channel's top videos:

{
  "mode": "channel",
  "channelUrls": ["https://www.youtube.com/@mkbhd"],
  "maxVideos": 10
}

This scrapes the 10 most popular videos from MKBHD's channel with full metrics.

Example: Scrape Latest Channel Videos with Transcripts

Fetch the 15 newest uploads from a channel and download the English transcript for each video — useful for content monitoring pipelines that need the spoken text, not just titles.

{
  "mode": "channel",
  "channelUrls": ["https://www.youtube.com/@lexfridman"],
  "maxVideos": 15,
  "sortVideosBy": "newest",
  "includeTranscripts": true,
  "transcriptLanguages": ["en"]
}

Each video item includes transcript_text (flat string) and transcript_segments (timestamped array) alongside standard metadata such as view count, like count, and duration.

Example: Search YouTube for a Keyword

Search for a topic and retrieve the top 20 matching videos with full metadata — ideal for market research or tracking which creators dominate a niche.

{
  "mode": "search",
  "searchQuery": "electric vehicle review 2026",
  "maxVideos": 20
}

Returns a list of 20 video objects, each containing title, channel name, view count, like count, engagement rate, duration, and publish date.

Example: Weekly Competitor Channel Analysis

Analyze a competitor's 50 most recent videos and generate an AI report on their content strategy — schedule this weekly to track messaging shifts and upload cadence over time.

{
  "mode": "channel",
  "channelUrls": ["https://www.youtube.com/@mkbhd"],
  "maxVideos": 50,
  "sortVideosBy": "newest",
  "enableAiAnalysis": true,
  "llmProvider": "openrouter",
  "openrouterApiKey": "sk-or-..."
}

Produces a structured AI report covering upload consistency score, audience engagement score, top content themes, and growth recommendations alongside full video metrics for all 50 videos.

Tips for Best Results

  • Sort by popular for content strategy. Analyzing a channel's most popular videos reveals what content themes and formats resonate most with their audience.
  • Sort by newest for competitive monitoring. Track what competitors are publishing right now and how their recent content performs.
  • Use comments for sentiment analysis. Top comments provide qualitative audience feedback that complements quantitative engagement metrics.
  • Enable AI analysis for strategic insights. The AI report synthesizes video metrics, engagement patterns, and content themes into actionable content strategy recommendations.
  • Combine transcripts with AI analysis. When both are enabled, the AI report adds transcript-derived themes, key phrases, and creator voice-style — surfacing what the creator actually talks about, not just title keywords. This is the most actionable config for sponsorship research, brand-fit evaluation, and content gap analysis.
  • Use generateChapters for long-form content. Auto-chaptering shines on podcasts, interviews, tutorials, and explainers where topic shifts matter. For short videos (<2 min) or single-topic content, chapters add little value — skip the feature and save the $0.01 per video.
  • Compare engagement rates, not just views. A channel with 100K views and 5% engagement is often more valuable for sponsorships than one with 1M views and 0.5% engagement.
  • Schedule weekly runs for ongoing monitoring. Track how channels evolve their content strategy and audience engagement over time.

MCP Quickstart — call this actor from Claude / Cursor / ChatGPT

Install the Apify MCP server in your AI agent of choice:

# Claude Code
claude mcp add apify -- npx -y @apify/actors-mcp-server --token YOUR_APIFY_TOKEN
# Claude Desktop / Cursor (add to mcp.json):
{"mcpServers":{"apify":{"command":"npx","args":["-y","@apify/actors-mcp-server","--token","YOUR_APIFY_TOKEN"]}}}

Then prompt the agent:

"Use the harvestlab/youtube-scraper actor on Apify to fetch the 30 most recent videos from the @LexClips channel with timed transcripts and AI-generated chapters. Push the results back as JSON."

The agent will discover the actor's dataset_schema.json, generate the right input, run it, and pipe the typed output back into your conversation.

Troubleshooting

No transcripts returned for a video

YouTube restricts transcripts on some videos (auto-captions disabled by the creator, music-industry content, age-gated, member-only, or live streams in progress). The video item is returned with transcript_available: false and empty transcript_text / transcript_segments — and the transcript-scraped event is not charged in this case. Workaround: set includeTranscripts: false to skip the transcript fetch entirely for faster runs, or target channels known to have captions enabled.

403/429 errors or empty results

This actor does not use the YouTube Data API — it scrapes YouTube's public web pages (ytInitialData JSON) directly. 403 / 429 responses come from YouTube rate-limiting your exit IP, not from an API quota. Datacenter proxies are flagged most aggressively. Add proxyConfiguration with useApifyProxy: true and apifyProxyGroups: ["RESIDENTIAL"] to rotate through residential exits. If you still see 429s, reduce maxVideos per run and split large jobs across multiple scheduled runs to spread the load over time.

AI analysis fails with "API key missing"

Set openrouterApiKey (or your chosen provider's key) in the actor input, or pass it via the matching env var (OPENROUTER_API_KEY, ANTHROPIC_API_KEY, GOOGLE_API_KEY, OPENAI_API_KEY). AI analysis is optional — leave enableAiAnalysis: false (the default) to skip it without losing other data. Ollama uses ollamaBaseUrl and needs a reachable local server instead.

Channel scraping returns fewer videos than expected

YouTube paginates channel videos in batches of ~30. The actor follows continuation tokens up to maxVideos (max 100). For channels with fewer total uploads than maxVideos, you simply get all available videos — this is expected, not a failure. Set sortVideosBy to newest or oldest if you need a specific slice instead of the popularity ranking. Very large channels (10,000+ videos) hit rate limits faster — combine residential proxies with multiple smaller scheduled runs.

Videos returning with no or incomplete metadata

Private, age-restricted, or member-only videos expose limited data even when visible in search or channel listings. Some videos surface tags only via YouTube's InnerTube API rather than the page payload — the actor falls back to InnerTube automatically with per-attempt IP rotation, but a residential proxy group materially improves coverage. If a specific video consistently returns empty fields, it is likely restricted at the source and cannot be fully scraped regardless of proxy choice.

Transcript not available for this video

The video may have auto-captions disabled, or the language you requested is not available. Check the transcriptLanguages parameter (note: plural) — if you set ["en"] but the channel primarily publishes in another language, add fallback codes (e.g. ["en", "es", "fr"]). The first matching track in priority order wins; if none match, the first available track is returned. Music videos, live streams in progress, and member-only content routinely have no accessible captions; the actor returns transcript_available: false for these and does not charge the transcript-scraped event.

AI chapters are missing even though transcripts succeeded

Chapter generation requires includeTranscripts: true, generateChapters: true, and a working LLM key for the chosen llmProvider. If the LLM call fails or returns invalid JSON, the chapters field is silently omitted and no chapter-generated charge is emitted — your video metrics and transcripts are unaffected. Check the run log for chapter generation failed warnings, then verify your API key and credit balance with the provider.

Rate limiting or quota exceeded errors

YouTube applies rate limits at both the IP and session level. Datacenter IPs hit these limits faster than residential ones. Add a proxyConfiguration block with useApifyProxy: true and groups: ["RESIDENTIAL"] to route requests through rotating residential exits. If you are already using residential proxies and still seeing 429 responses, reduce maxVideos per run and split large jobs across multiple scheduled runs to spread the load over time.

Known Limitations

  • YouTube may rate-limit requests for large scraping jobs; the actor retries up to 3 times per page
  • Comment scraping retrieves top comments only (sorted by YouTube's relevance algorithm), not all comments
  • Subscriber counts and view counts use YouTube's abbreviated format (e.g., "1.2M") which provides approximate values
  • YouTube Shorts metrics may be less complete than long-form video metrics
  • YouTube's page structure may change; the actor handles multiple layout variations but temporary disruptions are possible
  • AI analysis quality depends on the chosen model and the number of videos analyzed (15+ recommended)
  • Maximum 100 videos per channel per run
  • Private or age-restricted videos cannot be scraped
  • Tags coverage is partial on datacenter proxies. YouTube selectively strips the keywords field from video pages returned to IPs it flags as suspicious. The actor falls back to YouTube's InnerTube API (with per-attempt IP rotation) to recover tags, but some videos may still return empty tags: []. For maximum tag coverage, use a residential proxy group. Note that many large channels (e.g. MrBeast) genuinely set no tags on their videos — an empty tags array in those cases is the correct result, not a scraping failure.
  • Transcripts are not available on every video. Private, age-restricted, member-only, and music-industry videos typically disable captions. Music videos often have ASR-only captions that transcribe background lyrics. Live streams usually have no captions while live and limited ASR after the stream ends. The actor returns transcript_available: false and empty strings/arrays when no captions are accessible — this is not a failure.
  • AI chapters depend on transcript quality. When the transcript is ASR auto-generated, segmentation follows whatever the ASR heard — filler-heavy, stream-of-consciousness videos may produce fewer or less precise chapters than tightly scripted content. Very short videos (<2 minutes) may not yield the minimum of 3 chapters, in which case chapter output is suppressed. On rare LLM parse failures the chapters field is absent and no charge is emitted.

Frequently Asked Questions

What transcripts does this return?

YouTube videos typically have one or both of: human-uploaded captions (manually created by the uploader, highest accuracy) and auto-generated ASR captions (machine transcribed, available on most public videos in many languages). The scraper returns the best match for your transcriptLanguages preference, preferring uploaded over ASR. The transcript_kind field tells you which was returned. Paid/private captions and closed captions locked behind DRM are not accessible.

How do AI chapters work?

When generateChapters is enabled, the actor passes each video's timestamped transcript to your chosen LLM with a strict JSON schema prompt. The model identifies topic-shift boundaries and returns 3-8 chapters with start / end seconds, a concise title, and a 1-sentence summary. This is essentially automatic chapter markers for videos where the creator didn't write them — useful for podcasters publishing to YouTube, agencies clipping long-form content, or anyone who wants searchable timestamps without manual authoring. Requires both includeTranscripts=true and an LLM API key; you are charged $0.01 per video only when chapters are successfully produced.
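
The structural guarantees stated earlier (3-8 chapters, chronological, non-overlapping, covering 0 to duration) are easy to re-check on your side before trusting LLM output. A sketch with a hypothetical `valid_chapters` validator:

```python
def valid_chapters(chapters: list[dict], duration: float, tol: float = 1.0) -> bool:
    """Check for 3-8 chapters that are contiguous, ordered, and span 0..duration."""
    if not 3 <= len(chapters) <= 8:
        return False
    # First chapter must start at ~0, last must end at ~duration.
    if abs(chapters[0]["start"]) > tol or abs(chapters[-1]["end"] - duration) > tol:
        return False
    for prev, cur in zip(chapters, chapters[1:]):
        if cur["start"] != prev["end"]:   # gaps and overlaps are both rejected
            return False
    return all(c["end"] > c["start"] for c in chapters)

chapters = [
    {"start": 0, "end": 42.5, "title": "Intro"},
    {"start": 42.5, "end": 310.0, "title": "Camera test"},
    {"start": 310.0, "end": 720.0, "title": "Battery"},
]
ok = valid_chapters(chapters, duration=720.0)   # True for this example
```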

Why is this scraper faster and cheaper than others?

This actor uses direct HTTP requests (httpx) to fetch YouTube pages instead of launching a browser. It parses YouTube's server-rendered ytInitialData JSON payload, which contains all the structured data needed. No Playwright or Chromium overhead means faster runs and lower compute costs.

Can I scrape multiple channels in one run?

Yes. In channel mode, provide multiple URLs in the channelUrls array. Each channel will be scraped sequentially with up to maxVideos videos per channel. AI analysis is generated per channel.

How is engagement rate calculated?

Engagement rate is calculated as (like count + comment count) / view count, expressed as a percentage. This metric normalizes for audience size, allowing fair comparison between channels of different scales.
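
As a function (a sketch; the numbers match the sample video in the channel-mode output above):

```python
def engagement_rate(likes: int, comments: int, views: int) -> float:
    """(likes + comments) / views, expressed as a percentage."""
    if views == 0:
        return 0.0
    return round((likes + comments) / views * 100, 2)

# The sample video from the channel-mode output: 200,000 likes,
# 15,000 comments, 5,000,000 views -> 4.3%
rate = engagement_rate(200_000, 15_000, 5_000_000)
```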

Does this scraper detect YouTube Shorts?

Yes. Each video includes an is_short boolean field that identifies whether the content is a YouTube Short. This allows you to filter or analyze Shorts separately from long-form content.

Can I search for videos by keyword?

Yes. Use search mode with a searchQuery parameter. For example, search for "product review 2026" to find recent review content. Search mode returns videos from across YouTube, not limited to specific channels.

How do I track channel performance over time?

Schedule regular runs on Apify (weekly or monthly) for the same channels. Over time, you build a dataset showing subscriber growth, engagement trends, content theme shifts, and upload frequency changes.

Use with AI agents (LangChain & LangGraph)

Outputs from this actor are agent-ready: video metadata, timed transcripts, and AI chapters are returned as structured JSON, so you can plug a run directly into a LangChain Tool or a LangGraph StateGraph node without post-processing.

LangChain — wrap the actor as a Tool

from apify_client import ApifyClient
from langchain.tools import Tool

client = ApifyClient("YOUR_APIFY_TOKEN")

def youtube_scraper(channel_handle: str) -> list[dict]:
    run = client.actor("harvestlab/youtube-scraper").call(run_input={
        "mode": "channel",
        "channelUrls": [f"https://www.youtube.com/{channel_handle}"],
        "maxVideos": 20,
        "includeTranscripts": True,
    })
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())

youtube_tool = Tool(
    name="youtube_scraper",
    description="Fetch YouTube channel videos + transcripts. Input: a channel handle like '@mkbhd'.",
    func=youtube_scraper,
)

# The agent calls it with the handle string: youtube_tool.invoke("@mkbhd")

LangGraph — call the actor inside a StateGraph node

from typing import TypedDict
from apify_client import ApifyClient
from langgraph.graph import StateGraph, END

client = ApifyClient("YOUR_APIFY_TOKEN")

class State(TypedDict):
    channelHandle: str
    transcripts: list[dict]

def fetch_youtube(state: State) -> State:
    run = client.actor("harvestlab/youtube-scraper").call(run_input={
        "mode": "channel",
        "channelUrls": [f"https://www.youtube.com/{state['channelHandle']}"],
        "maxVideos": 10,
        "includeTranscripts": True,
    })
    items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
    return {**state, "transcripts": items}

graph = StateGraph(State)
graph.add_node("fetch_youtube", fetch_youtube)
graph.set_entry_point("fetch_youtube")
graph.add_edge("fetch_youtube", END)
app = graph.compile()

See also apify/actor-templates/js-langchain and js-langgraph-agent for full template scaffolds in JavaScript.

Legal and Compliance

This actor scrapes publicly available data. By using this actor, you agree to the following:

  • Your responsibility: You are solely responsible for ensuring your use complies with all applicable laws, regulations, and the target website's terms of service. This includes but is not limited to GDPR (EU), CCPA (California), and other data protection laws in your jurisdiction.
  • No legal advice: This actor does not constitute legal advice. Consult a qualified attorney if you have questions about the legality of your specific use case.
  • Intended use: This actor is designed for legitimate business purposes such as market research, competitive analysis, and academic research using publicly accessible data.
  • Data handling: You are responsible for how you store, process, and share any data collected. Ensure you have a lawful basis for processing any personal data under applicable privacy laws.
  • Rate limiting: This actor implements polite crawling practices including request delays and retry backoff to minimize impact on target servers.
  • No warranty: This actor is provided "as is" without warranty. Data accuracy depends on the target website's content and structure.
  • YouTube data: YouTube's terms of service restrict automated data collection. Consider using the official YouTube Data API for production use cases. This actor is intended for analytics and research purposes.
  • Personal data notice: Channel and video data may include creator names, profile images, and commenter usernames. Under GDPR and similar regulations, this constitutes personal data subject to data protection requirements. Ensure you have a lawful basis for processing. Do not use extracted data for unsolicited contact or harassment.
Related Actors

  • Google News Monitor — Pair YouTube creator coverage with Google News article tracking for cross-channel media monitoring on the same topic, brand, or industry event.
  • Reddit Scraper — Capture community discussion of the videos you're tracking; cross-reference YouTube engagement with Reddit thread sentiment for a fuller audience-reaction picture.
  • ProductHunt Scraper — Track creator-economy product launches and creator tools alongside YouTube channel analytics for end-to-end creator and launch intelligence.