YouTube Transcript Tool Server (MCP-compatible)

Under maintenance

Pricing

from $0.01 / 1,000 results

Fetch YouTube video transcripts + metadata via tool-style invocations using public endpoints (no API key). Tools: video_transcript, video_metadata, channel_videos, playlist_videos.

Rating: 0.0 (0)

Developer: Rara21 (Maintained by Community)

Actor stats

Bookmarked: 0
Total users: 2
Monthly active users: 1
Last modified: 3 days ago


Apify YouTube Transcript Tool Server (MCP-style)

An Apify Actor that exposes YouTube transcripts + metadata as a small set of tools an LLM agent can call — one Actor run = one tool call. Uses only public endpoints: the watch page HTML, the timedtext caption API, the oEmbed endpoint, and the channel/playlist RSS feeds. No API key required.

Tools

| Tool | Purpose |
| --- | --- |
| video_transcript | Fetch a video's caption track as timestamped segments + merged full text |
| video_metadata | Title, author, thumbnail (via oEmbed) |
| channel_videos | Latest videos from a channel (RSS feed) |
| playlist_videos | Videos from a public playlist (RSS feed) |
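
As an illustration of the oEmbed path behind video_metadata, here is a minimal sketch. The helper names (`oEmbedUrl`, `toVideoMetadata`) and the output field names are assumptions made for this example, not the Actor's actual code. Only the oEmbed endpoint and its `title`, `author_name`, and `thumbnail_url` response fields are taken as given.

```typescript
// Subset of the oEmbed JSON payload that this sketch relies on.
interface OEmbedResponse {
  title: string;
  author_name: string;
  thumbnail_url: string;
}

// Assumed output shape for video_metadata (not confirmed by the README).
interface VideoMetadata {
  video_id: string;
  title: string;
  author: string;
  thumbnail_url: string;
}

// Build the oEmbed request URL for a video ID.
function oEmbedUrl(videoId: string): string {
  const watchUrl = `https://www.youtube.com/watch?v=${videoId}`;
  return `https://www.youtube.com/oembed?url=${encodeURIComponent(watchUrl)}&format=json`;
}

// Map the raw oEmbed payload to the assumed tool output.
function toVideoMetadata(videoId: string, raw: OEmbedResponse): VideoMetadata {
  return {
    video_id: videoId,
    title: raw.title,
    author: raw.author_name,
    thumbnail_url: raw.thumbnail_url,
  };
}
```

Keeping URL construction and response mapping as pure functions means both can be tested without touching the network.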

Input

{
  "tool": "video_transcript",
  "args": { "video_id": "dQw4w9WgXcQ", "language": "en", "allow_auto": true }
}
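
The input envelope can be checked before any tool runs. The sketch below is a dependency-free stand-in: the project itself lists Zod schemas in types.ts, and the `parseInput` name and error messages here are hypothetical.

```typescript
// The four tool names listed in the Tools table.
const TOOL_NAMES = ["video_transcript", "video_metadata", "channel_videos", "playlist_videos"] as const;
type ToolName = (typeof TOOL_NAMES)[number];

interface ActorInput {
  tool: ToolName;
  args: Record<string, unknown>;
}

// Validate the { tool, args } envelope; per-tool argument checks are left out of this sketch.
function parseInput(raw: unknown): ActorInput {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("input must be a JSON object");
  }
  const { tool, args } = raw as { tool?: unknown; args?: unknown };
  if (typeof tool !== "string" || !(TOOL_NAMES as readonly string[]).includes(tool)) {
    throw new Error(`unknown tool: ${String(tool)}`);
  }
  if (typeof args !== "object" || args === null || Array.isArray(args)) {
    throw new Error("args must be an object");
  }
  return { tool: tool as ToolName, args: args as Record<string, unknown> };
}
```

For example, `parseInput({ tool: "video_transcript", args: { video_id: "dQw4w9WgXcQ" } })` returns the typed input, while an unrecognized tool name throws.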

Example output: video_transcript

{
  "video_id": "dQw4w9WgXcQ",
  "language": "en",
  "language_name": "English",
  "is_auto_generated": false,
  "segments": [
    { "start_s": 0.0, "duration_s": 2.0, "text": "Hello world." },
    { "start_s": 2.0, "duration_s": 1.5, "text": "Welcome to the talk." }
  ],
  "full_text": "Hello world. Welcome to the talk."
}

If no captions are available, the response is { video_id, not_found: true, reason: "no_transcript_available" } instead.
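
A caller therefore has to handle two result shapes. The following sketch models them as a TypeScript union with a type guard. The interface names and the `summarize` helper are illustrative assumptions; the field names come from the example outputs in this README.

```typescript
interface Segment {
  start_s: number;
  duration_s: number;
  text: string;
}

interface TranscriptFound {
  video_id: string;
  language: string;
  language_name: string;
  is_auto_generated: boolean;
  segments: Segment[];
  full_text: string;
}

interface TranscriptNotFound {
  video_id: string;
  not_found: true;
  reason: string;
}

type TranscriptResult = TranscriptFound | TranscriptNotFound;

// Type guard: true when the Actor reported that no transcript could be returned.
function isNotFound(result: TranscriptResult): result is TranscriptNotFound {
  return "not_found" in result && result.not_found === true;
}

// Produce a one-line summary for either result shape.
function summarize(result: TranscriptResult): string {
  if (isNotFound(result)) {
    return `No transcript for ${result.video_id}: ${result.reason}`;
  }
  return `${result.video_id} [${result.language}] ${result.segments.length} segments`;
}
```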

Example output: channel_videos

{
  "channel_id": "UCsomechannel",
  "returned": 15,
  "videos": [
    {
      "video_id": "vidaaa11111",
      "title": "First Video",
      "author": "Sample Channel",
      "published_at": "2026-05-01T00:00:00+00:00",
      "watch_url": "https://www.youtube.com/watch?v=vidaaa11111"
    }
  ]
}
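
The RSS-backed tools can be pictured with a small parsing sketch. It assumes the public feed URLs of the form `https://www.youtube.com/feeds/videos.xml?channel_id=...` (or `?playlist_id=...`) and uses regular expressions instead of a real XML parser, so it does not decode XML entities. The function names are hypothetical and this is not the Actor's actual parser.

```typescript
interface FeedVideo {
  video_id: string;
  title: string;
  author: string;
  published_at: string;
  watch_url: string;
}

// Feed URLs for a channel or a public playlist.
function channelFeedUrl(channelId: string): string {
  return `https://www.youtube.com/feeds/videos.xml?channel_id=${encodeURIComponent(channelId)}`;
}

function playlistFeedUrl(playlistId: string): string {
  return `https://www.youtube.com/feeds/videos.xml?playlist_id=${encodeURIComponent(playlistId)}`;
}

// Return the trimmed text of the first <tag>...</tag> in xml, or "" if absent.
function firstTag(xml: string, tag: string): string {
  const match = xml.match(new RegExp(`<${tag}>([\\s\\S]*?)</${tag}>`));
  return match?.[1]?.trim() ?? "";
}

// Turn each <entry> element of the Atom feed into one video record.
function parseFeed(xml: string): FeedVideo[] {
  const entries = xml.match(/<entry>[\s\S]*?<\/entry>/g) ?? [];
  return entries.map((entry) => {
    const videoId = firstTag(entry, "yt:videoId");
    return {
      video_id: videoId,
      title: firstTag(entry, "title"),
      author: firstTag(firstTag(entry, "author"), "name"),
      published_at: firstTag(entry, "published"),
      watch_url: `https://www.youtube.com/watch?v=${videoId}`,
    };
  });
}
```

A production parser would more likely use a proper XML library; the regex approach only keeps the sketch self-contained.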

Run locally

npm install
npm run build
apify run --input-file=./examples/transcript.json

Tests

npm test

The suite has 22 tests: 16 in test/client.test.ts (watch-page parser, caption picker, json3 parser, RSS parser, oEmbed, retry) and 6 in test/tools.test.ts (the four tools end-to-end with a mocked fetch).

How transcript extraction works

  1. Fetch https://www.youtube.com/watch?v=<id> with a normal browser UA.
  2. Parse the ytInitialPlayerResponse JSON embedded in the page.
  3. Walk captions.playerCaptionsTracklistRenderer.captionTracks[] and pick the best match for the requested language (or fall back to the first available, optionally skipping auto-generated tracks).
  4. Fetch the picked track's baseUrl with &fmt=json3 to get structured {events: [{tStartMs, dDurationMs, segs}]} data.
  5. Flatten + clean into [{ start_s, duration_s, text }].
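
Steps 3 to 5 above can be sketched as pure functions. This is an approximation under stated assumptions, not the Actor's code. The track fields (`baseUrl`, `languageCode`, and `kind === "asr"` for auto-generated tracks) and the `utf8` field inside `segs` reflect how the player response and json3 payload are commonly structured. The function names and the exact preference order in `pickTrack` are invented for the example.

```typescript
// Assumed subset of a captionTracks[] entry; kind "asr" marks an auto-generated track.
interface CaptionTrack {
  baseUrl: string;
  languageCode: string;
  kind?: string;
}

interface Json3Event {
  tStartMs?: number;
  dDurationMs?: number;
  segs?: { utf8?: string }[];
}

interface Segment {
  start_s: number;
  duration_s: number;
  text: string;
}

// Step 3: prefer a manual track in the requested language, then any track in
// that language, then the first usable track. Auto tracks are dropped when
// allowAuto is false.
function pickTrack(tracks: CaptionTrack[], language: string, allowAuto: boolean): CaptionTrack | undefined {
  const usable = allowAuto ? tracks : tracks.filter((t) => t.kind !== "asr");
  const inLanguage = usable.filter((t) => t.languageCode === language);
  return inLanguage.find((t) => t.kind !== "asr") ?? inLanguage[0] ?? usable[0];
}

// Step 4: request the structured json3 variant of the chosen track.
function json3Url(track: CaptionTrack): string {
  return `${track.baseUrl}&fmt=json3`;
}

// Step 5: join each event's segs, collapse whitespace, and convert ms to seconds.
function flattenJson3(payload: { events?: Json3Event[] }): Segment[] {
  const segments: Segment[] = [];
  for (const ev of payload.events ?? []) {
    const text = (ev.segs ?? []).map((s) => s.utf8 ?? "").join("").replace(/\s+/g, " ").trim();
    if (text === "") continue; // skip events that carry no visible text
    segments.push({
      start_s: (ev.tStartMs ?? 0) / 1000,
      duration_s: (ev.dDurationMs ?? 0) / 1000,
      text,
    });
  }
  return segments;
}

// Merge the cleaned segments into a single string.
function mergeText(segments: Segment[]): string {
  return segments.map((s) => s.text).join(" ");
}
```

Steps 1 and 2 (fetching the watch page and extracting the embedded JSON) are left out because they depend on page markup that this README does not describe in detail.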

Because the Actor scrapes public surfaces rather than an official API, a change on YouTube's side can break it. When that happens, the Actor logs the failure mode and returns not_found with a reason rather than crashing.
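
One way to get that behavior is to wrap each scrape in a helper that converts exceptions into a not_found result. The sketch below is a guess at the pattern, not the Actor's implementation. The `orNotFound` name is hypothetical, and using the error message as the `reason` is an assumption (the Actor may map failures to fixed reason codes such as "no_transcript_available").

```typescript
interface NotFound {
  video_id: string;
  not_found: true;
  reason: string;
}

// Run a scrape step; on failure, log the cause and return a not_found result
// instead of letting the exception end the Actor run.
async function orNotFound<T>(
  videoId: string,
  run: () => Promise<T>,
  log: (message: string) => void = console.warn,
): Promise<T | NotFound> {
  try {
    return await run();
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    log(`video ${videoId}: ${reason}`);
    return { video_id: videoId, not_found: true, reason };
  }
}
```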

Architecture

src/
  main.ts              Apify entry: reads input, dispatches to runTool
  tools/index.ts       Single dispatcher; one async runner per tool
  youtube-client.ts    Watch-page scraper + json3 parser + oEmbed + RSS parser
  types.ts             Zod schemas for input + outputs
test/
  fixtures.ts          Realistic watch HTML + transcript + RSS + oEmbed fixtures
  client.test.ts       16 tests on the HTTP/parsing layer
  tools.test.ts        6 tests on the dispatcher
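
The "single dispatcher, one async runner per tool" layout might look roughly like the sketch below. The runners are stubs that echo their input, and everything apart from the `runTool` name and the four tool names is an assumption made for illustration.

```typescript
type ToolName = "video_transcript" | "video_metadata" | "channel_videos" | "playlist_videos";
type ToolArgs = Record<string, unknown>;
type ToolRunner = (args: ToolArgs) => Promise<unknown>;

// Stub runners standing in for the real scraping logic.
const runners: Record<ToolName, ToolRunner> = {
  video_transcript: async (args) => ({ tool: "video_transcript", args }),
  video_metadata: async (args) => ({ tool: "video_metadata", args }),
  channel_videos: async (args) => ({ tool: "channel_videos", args }),
  playlist_videos: async (args) => ({ tool: "playlist_videos", args }),
};

// Look up the runner for the requested tool and invoke it with the given args.
async function runTool(tool: string, args: ToolArgs): Promise<unknown> {
  if (!Object.prototype.hasOwnProperty.call(runners, tool)) {
    throw new Error(`unknown tool: ${tool}`);
  }
  return runners[tool as ToolName](args);
}
```

A lookup table keyed by tool name keeps the entry point free of per-tool branching, and adding a fifth tool becomes a one-line change to `runners`.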

License

MIT