YouTube & Podcast Transcript Extractor
Pricing
from $3.00 / 1,000 youtube transcripts
Extracts existing transcripts from YouTube videos, playlists and channels, plus podcast feeds (Podcasting 2.0). Returns LLM-ready text and timestamped segments.
Developer: Prooflio AI (maintained by Community)
Extracts existing transcripts and returns them as clean, LLM-ready text:
- YouTube: caption tracks (auto-generated or uploaded) from individual videos, whole playlists, or entire channels
- Podcasts: transcripts declared in the RSS feed via the Podcasting 2.0 `<podcast:transcript>` tag (SRT, VTT, JSON, or plain text, normalized to plain text)
Each transcript becomes one dataset item with both a full plain-text transcript and optional timestamped segments. Failures are recorded per item, so one bad video never fails the whole run.
This Actor reads transcripts that already exist. It does not transcribe audio. For videos/episodes with no captions, see "Extending" below.
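To check whether a feed declares any transcripts before adding it to `podcastFeeds`, you can look for `<podcast:transcript>` elements yourself. The snippet below is a minimal sketch using only the Python standard library; it is not the Actor's own code. `SAMPLE_FEED` and `list_declared_transcripts` are hypothetical names made up for this illustration, and the namespace URI is the one published for the Podcasting 2.0 namespace.

```python
import xml.etree.ElementTree as ET

# Namespace URI used by Podcasting 2.0 tags such as <podcast:transcript>.
PODCAST_NS = {"podcast": "https://podcastindex.org/namespace/1.0"}

# Hypothetical feed, trimmed to the parts that matter for this example.
SAMPLE_FEED = """<rss version="2.0" xmlns:podcast="https://podcastindex.org/namespace/1.0">
  <channel>
    <title>Example Show</title>
    <item>
      <title>Episode 1</title>
      <podcast:transcript url="https://example.com/ep1.vtt" type="text/vtt" language="en"/>
      <podcast:transcript url="https://example.com/ep1.json" type="application/json"/>
    </item>
    <item>
      <title>Episode 2</title>
    </item>
  </channel>
</rss>"""


def list_declared_transcripts(feed_xml):
    """Return one entry per episode listing the transcript files it declares."""
    root = ET.fromstring(feed_xml)
    episodes = []
    for item in root.iter("item"):
        transcripts = [
            {
                "url": tag.get("url"),
                "type": tag.get("type"),
                "language": tag.get("language"),  # None when the attribute is absent
            }
            for tag in item.findall("podcast:transcript", PODCAST_NS)
        ]
        episodes.append(
            {"title": item.findtext("title", default=""), "transcripts": transcripts}
        )
    return episodes


if __name__ == "__main__":
    for episode in list_declared_transcripts(SAMPLE_FEED):
        print(episode["title"], len(episode["transcripts"]))
```

An episode with an empty `transcripts` list (like "Episode 2" here) has nothing for the Actor to read.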
Input
| Field | Type | Default | Description |
|---|---|---|---|
| `videos` | string[] | none | YouTube video URLs or 11-char IDs. |
| `playlists` | string[] | none | Playlist URLs or IDs; each is expanded into its videos. |
| `channels` | string[] | none | Channel URLs, @handles, or UC… IDs; uploads are expanded into videos. |
| `podcastFeeds` | string[] | none | Podcast RSS feed URLs. |
| `language` | string | `en` | Preferred language code; falls back to the first available track. |
| `includeSegments` | boolean | `true` | Include timestamped segments alongside the plain text. |
| `maxVideosPerSource` | integer | `50` | Cap on videos pulled from each playlist/channel. |
| `maxEpisodesPerFeed` | integer | `10` | Episodes to process per feed. |
| `proxyConfiguration` | object | Apify Proxy | Proxy settings. Residential is strongly recommended for YouTube. |
Example input:

    {
      "channels": ["https://www.youtube.com/@veritasium"],
      "playlists": ["https://www.youtube.com/playlist?list=PLxxxx"],
      "maxVideosPerSource": 25,
      "language": "en",
      "proxyConfiguration": {
        "useApifyProxy": true,
        "apifyProxyGroups": ["RESIDENTIAL"]
      }
    }

Videos collected from `videos`, `playlists`, and `channels` are de-duplicated before extraction, so the same video is never processed (or charged) twice.
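To illustrate what de-duplication means when the same video arrives in different forms (a watch URL, a short link, a bare ID), here is a minimal sketch. It is an assumption about how such normalization could work, not the Actor's actual logic; `extract_video_id` and `dedupe_video_ids` are hypothetical helpers, and the set of recognized URL shapes is deliberately small.

```python
import re
from urllib.parse import parse_qs, urlparse

# YouTube video IDs are 11 characters drawn from letters, digits, "-" and "_".
VIDEO_ID_RE = re.compile(r"[A-Za-z0-9_-]{11}")
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(value):
    """Return the 11-char video ID for a URL or bare ID, or None if unrecognized."""
    value = value.strip()
    if VIDEO_ID_RE.fullmatch(value):
        return value
    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()
    candidate = None
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host in YOUTUBE_HOSTS:
        if parsed.path == "/watch":
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        else:
            parts = parsed.path.strip("/").split("/")
            if len(parts) >= 2 and parts[0] in {"shorts", "embed", "live"}:
                candidate = parts[1]
    if candidate and VIDEO_ID_RE.fullmatch(candidate):
        return candidate
    return None


def dedupe_video_ids(values):
    """Normalize inputs to video IDs and keep the first occurrence of each."""
    seen = set()
    ordered = []
    for value in values:
        video_id = extract_video_id(value)
        if video_id and video_id not in seen:
            seen.add(video_id)
            ordered.append(video_id)
    return ordered


if __name__ == "__main__":
    print(dedupe_video_ids([
        "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
        "https://youtu.be/dQw4w9WgXcQ?t=10",
        "dQw4w9WgXcQ",
    ]))
```

All three inputs in the usage example refer to the same video, so only one ID survives.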
Output
One item per transcript. The Output tab shows a curated Overview table (source, title, language, URL, transcript); all fields are available in the "All fields" view.
Example item:

    {
      "source": "youtube",
      "videoId": "dQw4w9WgXcQ",
      "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
      "title": "...",
      "language": "en",
      "isAutoGenerated": true,
      "transcript": "full plain text ...",
      "segments": [
        { "start": 0.0, "duration": 3.2, "text": "..." }
      ]
    }

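If you want timestamps inline (for example, so an LLM can cite a moment in the video), the `segments` array can be flattened into one line per segment. The sketch below assumes only the item shape shown above, with `start` in seconds; `format_timestamp` and `render_timestamped` are hypothetical helpers, not part of the Actor.

```python
def format_timestamp(seconds):
    """Format a second offset as MM:SS, or H:MM:SS once it passes an hour."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours:d}:{minutes:02d}:{secs:02d}"
    return f"{minutes:02d}:{secs:02d}"


def render_timestamped(item):
    """Render one dataset item as '[timestamp] text' lines.

    Falls back to the plain transcript when segments are missing,
    e.g. when the run used includeSegments = false.
    """
    segments = item.get("segments") or []
    if not segments:
        return item.get("transcript", "")
    return "\n".join(
        f"[{format_timestamp(seg['start'])}] {seg['text']}" for seg in segments
    )


if __name__ == "__main__":
    example_item = {
        "transcript": "hello world",
        "segments": [
            {"start": 0.0, "duration": 3.2, "text": "hello"},
            {"start": 3725.4, "duration": 2.0, "text": "world"},
        ],
    }
    print(render_timestamped(example_item))
```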
A note on YouTube blocking
YouTube aggressively blocks datacenter IPs and may serve a bot challenge instead of the page. If you see "YouTube likely served a bot challenge", switch proxyConfiguration to a residential group. This is the single biggest reliability factor for this Actor, and it's also the main cost driver — see pricing notes below.
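If you drive runs from a script, the switch to a residential group can be automated. This is a rough sketch under assumptions: it only relies on the error text quoted above and on the `proxyConfiguration` shape from the input example. Where the message surfaces (run log or a per-item failure record) is not specified here, so `looks_like_bot_challenge` simply takes a string; both function names are hypothetical.

```python
import copy

# Error text quoted in this README; matched case-insensitively.
BOT_CHALLENGE_MARKER = "YouTube likely served a bot challenge"


def looks_like_bot_challenge(message):
    """Return True if a log line or error message mentions the bot challenge."""
    return BOT_CHALLENGE_MARKER.lower() in (message or "").lower()


def with_residential_proxy(run_input):
    """Return a copy of the Actor input that uses the residential proxy group."""
    updated = copy.deepcopy(run_input)
    updated["proxyConfiguration"] = {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"],
    }
    return updated


if __name__ == "__main__":
    run_input = {
        "videos": ["dQw4w9WgXcQ"],
        "proxyConfiguration": {"useApifyProxy": True},
    }
    error = "Failed: YouTube likely served a bot challenge instead of the page"
    if looks_like_bot_challenge(error):
        run_input = with_residential_proxy(run_input)
    print(run_input["proxyConfiguration"])
```

The original input is left unmodified, so you can compare the two configurations or retry selectively.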
Legal & acceptable use
This Actor accesses publicly available caption tracks and publicly published podcast RSS transcripts. Automated access to YouTube is restricted by its Terms of Service; you are responsible for ensuring your use complies with the terms of any site you target and with applicable law and copyright. Transcripts are the intellectual property of their creators — use the output accordingly.