![YouTube Transcript Master [EASY] avatar](https://images.apifyusercontent.com/DYUbuv5_7XlxAB1wT43vwl42yHUcYAXA52qOeJnF2P4/rs:fill:250:250/cb:1/aHR0cHM6Ly9hcGlmeS1pbWFnZS11cGxvYWRzLXByb2QuczMudXMtZWFzdC0xLmFtYXpvbmF3cy5jb20veUl2ZXZPMVkzYUY0bmJxZjYtYWN0b3ItY29XV1NEd1N4bUloVEQzTGEtSUlaYW9Wc2NOTi1Hcm91cF8zNy5wbmc.webp)
YouTube Transcript Master [EASY]
Pricing
$37.50/month + usage
YouTube Transcripts in BULK! Easily query via channel, playlist, or video URLs. Built with simplicity & reliability in mind, with expert support. Perfect data to feed your AI or LLM. Output multiple formats: TEXT, JSON, SRV, TTML, VTT (WebVTT). Automatic YouTube captions are available as backup.
Total users
1
Monthly users
1
Last modified
2 days ago
📜 YouTube Transcript Master
This Actor extracts transcripts from YouTube videos using a mix of sources: channel URLs, playlist URLs, and video URLs. It fetches video metadata (title, duration, upload date) and transcripts in the formats and language you specify.
⭐ Paying users receive quick responses, expert support for issues, and can submit feature or Actor requests.
Why the high price?
This is a robust Actor built for high-volume, bulk datasets and long-term use cases. Pricing it this way lets me support long-term users more effectively than a "per result" basis would. If you prefer, you can try the lite version of this Actor here: YouTube Transcript Master [Lite]
🌟 Features
- Retrieves transcripts in bulk from a mix of inputs: YouTube channels, playlists, or individual video links.
- Extracts video title, duration, and upload date.
- Fetches transcripts in multiple formats (Plain Text, JSON, VTT, TTML, etc.).
- Supports specifying the desired transcript language.
- Uses efficient HTML scraping to minimize CPU usage.
- Includes enhanced retry logic for network requests to improve reliability.
- Optionally uses Apify's residential proxies for consistent results.
📊 Usage & Estimated Costs
Understanding the potential costs associated with running this Actor is important. Below is an example based on a test run processing 500 videos to retrieve `plainText` transcripts. Please note that actual costs can vary based on several factors.
Test Run Statistics:
- Videos Processed: 500
- Transcript Format Requested: `plainText`
- Total Combined Video Length: ~380 hours
- Average Video Length: ~45 minutes
- Success Rate: 100% (0 missed results or errors)
Estimated Cost Breakdown (according to test run):
Cost Category | Estimated Cost | Notes |
---|---|---|
Actor Compute Units | $1.358 | Cost for the server resources used to run the Actor's code. |
Proxy (Residential) | $1.194 | Cost for using Apify's residential proxies to avoid blocks by YouTube. |
Other (Platform Fees) | $0.004 | Minimal platform usage fees. |
Total Estimated Cost | $2.56 | Sum of the rows above ($2.556), rounded. |
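To get a rough feel for other run sizes, the test-run figures can be scaled per video. The sketch below is only an illustration: it assumes cost grows linearly with the number of videos, which the test run does not prove, and the function name `estimate_usage_cost` is made up for this example. It covers usage only and does not include the $37.50 monthly fee.

```python
# Figures copied from the test-run table above (500 videos, plainText only).
TEST_RUN_VIDEOS = 500
TEST_RUN_COSTS_USD = {
    "compute": 1.358,
    "proxy": 1.194,
    "platform": 0.004,
}


def estimate_usage_cost(num_videos: int, use_proxy: bool = True) -> float:
    """Scale the test-run costs to `num_videos`, assuming linear growth."""
    if num_videos < 0:
        raise ValueError("num_videos must be non-negative")
    total = 0.0
    for category, cost in TEST_RUN_COSTS_USD.items():
        if category == "proxy" and not use_proxy:
            continue  # proxy cost is dropped when the proxy is disabled
        total += cost / TEST_RUN_VIDEOS * num_videos
    return total


if __name__ == "__main__":
    for n in (500, 2_000, 10_000):
        print(f"{n:>6} videos: ~${estimate_usage_cost(n):.2f} usage")
```

The test run averaged about 45 minutes per video, so runs with much shorter or longer videos may land well away from these numbers. The `use_proxy=False` branch also ignores the extra compute that retries after blocks could add.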
Factors Influencing Costs:
- Number of Videos: The primary driver of cost. More videos mean longer compute time and potentially more proxy usage.
- Transcript Formats: Requesting multiple formats per video might slightly increase compute time and proxy usage if different underlying fetches are needed.
- Proxy Usage: Disabling the proxy (`useApifyProxy: false`) eliminates proxy costs but significantly increases the risk of being blocked by YouTube, leading to failed runs and potentially higher compute costs due to retries or incomplete runs. Using the proxy is highly recommended.
Important Considerations:
- The costs shown are estimates based on a specific run and current Apify pricing. Your actual costs may differ.
- While the Actor includes robust error handling and retries, occasional failures for specific videos (due to missing transcripts, YouTube errors, private videos, or persistent proxy issues) can still occur. Check the output dataset's `error` field and the Actor logs for details on any failed items.
⚙️ Input
The Actor takes a JSON object as input with the following properties:
Field | Type | Description | Required | Default |
---|---|---|---|---|
sources | Array | A list of YouTube URLs. Can include video URLs (`watch?v=`), playlist URLs (`playlist?list=`), or channel URLs (`@handle`, `/c/`, `/user/`). | Yes | `[]` |
formats | Object | Specifies which transcript formats to retrieve. Set the desired format key(s) to `true`. | No | `{ "plainText": true, "json": false, "json3": false, "srv1": false, "srv2": false, "srv3": false, "ttml": false, "vtt": false }` |
language | String | The two-letter language code for the desired transcript (e.g., `en`, `es`, `de`). | No | `"en"` |
useApifyProxy | Boolean | If `true`, uses Apify's residential proxies for fetching. Highly recommended to avoid blocks. If `false`, uses the container's direct IP. | No | `true` |
delayBetweenVideos | Integer | An optional delay in milliseconds to wait between processing each video. Useful for mitigating rate limits on very large lists. | No | 0 |
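Before submitting a large `sources` list, it can help to check that each URL matches one of the three shapes listed in the table. The following is a minimal sketch, not the Actor's own parsing logic: the function name `classify_source` and the accepted host list are assumptions made for this example, and short links such as `youtu.be` are treated as unknown because the table does not mention them.

```python
from urllib.parse import parse_qs, urlparse

# Assumption: only these hosts are treated as YouTube for this check.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def classify_source(url: str) -> str:
    """Return 'video', 'playlist', 'channel', or 'unknown' for a URL."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host not in YOUTUBE_HOSTS:
        return "unknown"
    query = parse_qs(parsed.query)
    if parsed.path == "/watch" and "v" in query:
        return "video"
    if parsed.path == "/playlist" and "list" in query:
        return "playlist"
    if parsed.path.startswith(("/@", "/c/", "/user/")):
        return "channel"
    return "unknown"


if __name__ == "__main__":
    for source in (
        "https://www.youtube.com/watch?v=b_nep8vMnkc",
        "https://www.youtube.com/@Apify",
        "https://example.com/not-youtube",
    ):
        print(classify_source(source), source)
```

Filtering out `unknown` entries up front avoids spending a run on URLs that would only produce error items.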
Input Example
```json
{
  "sources": [
    "https://www.youtube.com/watch?v=b_nep8vMnkc",
    "https://www.youtube.com/@Apify",
    "https://www.youtube.com/playlist?list=PLObrtcm1Kw6PEnu5BpeEFb8XEoQXMw0g7"
  ],
  "formats": {
    "plainText": true,
    "json": true,
    "json3": false,
    "srv1": false,
    "srv2": false,
    "srv3": false,
    "ttml": false,
    "vtt": false
  },
  "language": "en",
  "useApifyProxy": true,
  "delayBetweenVideos": 0
}
```
Set one or more keys under `formats` to `true` to get one or many formats. `language` takes a two-letter code such as `es`, `fr`, or `pl`, and `delayBetweenVideos` is given in milliseconds.
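The same input can be assembled in code and sent with Apify's Python client. This is a hedged sketch: `build_run_input` and `run_actor` are hypothetical helper names, the `actor_id` argument is a placeholder you would copy from this Actor's page, and `run_actor` requires the third-party `apify-client` package plus a valid API token.

```python
FORMAT_KEYS = ("plainText", "json", "json3", "srv1", "srv2", "srv3", "ttml", "vtt")


def build_run_input(sources, formats=("plainText",), language="en",
                    use_apify_proxy=True, delay_between_videos=0):
    """Build an input object with the fields from the Input table."""
    if not sources:
        raise ValueError("sources must contain at least one YouTube URL")
    unknown = set(formats) - set(FORMAT_KEYS)
    if unknown:
        raise ValueError(f"unknown format(s): {sorted(unknown)}")
    return {
        "sources": list(sources),
        # Every known key is present; only the requested ones are True.
        "formats": {key: key in formats for key in FORMAT_KEYS},
        "language": language,
        "useApifyProxy": use_apify_proxy,
        "delayBetweenVideos": delay_between_videos,
    }


def run_actor(token, actor_id, run_input):
    """Run the Actor and return its dataset items as a list."""
    # Imported lazily so build_run_input works without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("the Actor run did not return any run details")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


if __name__ == "__main__":
    example = build_run_input(
        ["https://www.youtube.com/@Apify"], formats=("plainText", "json")
    )
    print(example)
```

Keeping input construction separate from the network call makes the input easy to inspect or log before a paid run starts.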
📥 Output
The Actor outputs data to the default Apify dataset. Each item in the dataset represents a processed video and has the following structure:
Field | Type | Description |
---|---|---|
url | String | The original YouTube video URL processed. |
language | String | The language code requested for the transcript. |
title | String | The title of the video (N/A or 'Processing Error' if extraction failed). |
duration | Integer | The duration of the video in seconds (0 or -1 if extraction failed). |
videoLink | String | Same as `url`, provided for convenience. |
uploadDate | String or null | The publication date of the video (YYYY-MM-DD format), if found. |
text | String or null | The plain text transcript, if requested and found (with improved handling for escaped quotes, `\"`). |
json | Array or null | Structured JSON transcript (`text`, `start`, and `dur` per segment), if requested and found. |
json3 | Object or null | The raw JSON3 format transcript, if requested and found. |
srv1 | String or null | The raw SRV1 (XML) format transcript, if requested and found. |
srv2 | String or null | The raw SRV2 (XML) format transcript, if requested and found. |
srv3 | String or null | The raw SRV3 (XML) format transcript, if requested and found. |
ttml | String or null | The raw TTML format transcript, if requested and found. |
vtt | String or null | The raw VTT format transcript, if requested and found. |
error | String or null | An error message if processing the video or fetching its transcripts failed. |
details | String or null | Additional error details (like HTTP status code or stack trace) if an error occurred. |
Output Example (Success - plainText & json requested)
```json
{
  "url": "https://www.youtube.com/watch?v=b_nep8vMnkc",
  "language": "en",
  "title": "Example Video Title",
  "duration": 123,
  "videoLink": "https://www.youtube.com/watch?v=b_nep8vMnkc",
  "uploadDate": "2023-10-27",
  "text": "This is the extracted plain text transcript...",
  "json": [
    { "text": "Segment 1 text", "start": 0.5, "dur": 3.2 },
    { "text": "Segment 2 text", "start": 3.7, "dur": 2.8 }
  ],
  "json3": null,
  "srv1": null,
  "srv2": null,
  "srv3": null,
  "ttml": null,
  "vtt": null,
  "error": null,
  "details": null
}
```
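The `json` segments are convenient when you want timestamps alongside the text. As a small sketch, assuming each segment carries the `text`, `start`, and `dur` keys shown in the example (with `start` in seconds), the hypothetical helpers below turn a segment list into timestamped lines:

```python
def format_timestamp(seconds: float) -> str:
    """Render seconds as HH:MM:SS; fractions of a second are truncated."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def segments_to_lines(segments):
    """Turn [{'text', 'start', 'dur'}, ...] into '[HH:MM:SS] text' lines."""
    return [
        f"[{format_timestamp(segment['start'])}] {segment['text']}"
        for segment in segments
    ]


if __name__ == "__main__":
    sample = [
        {"text": "Segment 1 text", "start": 0.5, "dur": 3.2},
        {"text": "Segment 2 text", "start": 3.7, "dur": 2.8},
    ]
    print("\n".join(segments_to_lines(sample)))
```

Timestamped lines like these can be handy when feeding transcripts to an LLM that should cite where in the video something was said.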
Output Example (Failure - general processing error)
```json
{
  "url": "https://www.youtube.com/watch?v=invalidVideoId",
  "language": "en",
  "title": "Processing Error",
  "duration": -1,
  "videoLink": "https://www.youtube.com/watch?v=invalidVideoId",
  "uploadDate": null,
  "text": null,
  "json": null,
  "json3": null,
  "srv1": null,
  "srv2": null,
  "srv3": null,
  "ttml": null,
  "vtt": null,
  "error": "Failed to process: Failed to extract valid playerResponse from HTML after 2 attempts.",
  "details": "Error: Failed to extract valid playerResponse from HTML after 2 attempts.\n at processVideos (/actor/main.js:497:23)\n at ..."
}
```
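Because failed videos still produce a dataset item, downstream code usually needs to separate them from successful ones. A minimal sketch, assuming an item counts as failed whenever its `error` field is non-empty (the helper names are made up for this example):

```python
def split_results(items):
    """Separate dataset items into (succeeded, failed) using the `error` field."""
    succeeded, failed = [], []
    for item in items:
        if item.get("error"):
            failed.append(item)
        else:
            succeeded.append(item)
    return succeeded, failed


def failed_urls(items):
    """Collect URLs of failed items, e.g. to retry or look up in the logs."""
    _, failed = split_results(items)
    return [item["url"] for item in failed]


if __name__ == "__main__":
    items = [
        {"url": "https://www.youtube.com/watch?v=b_nep8vMnkc", "error": None},
        {"url": "https://www.youtube.com/watch?v=invalidVideoId",
         "error": "Failed to process: ..."},
    ]
    ok, bad = split_results(items)
    print(len(ok), "succeeded;", len(bad), "failed:", failed_urls(items))
```

The collected URLs can be passed back as `sources` in a later run once the cause (for example, a transcript that was not yet available) has been resolved.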
📝 Usage Notes
- Recent Uploads: Recently uploaded videos may not have transcripts right away.
- Private or Deleted Videos: These videos will result in errors.
- Live Streams: Processing live or upcoming streams may result in errors or incomplete data, as the necessary transcript information might not be available.
- Large Transcripts & Multiple Formats: Apify has a 9.4 MB limit for each item (transcript). Keep this in mind when requesting multiple formats for videos with large transcripts.
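To see how close your items come to the size limit mentioned above, you can measure them after a run. This sketch rests on two assumptions that are not confirmed by this page: that the limit applies to the item's UTF-8 encoded JSON, and that 9.4 MB means 9.4 × 1024 × 1024 bytes. The helper names are illustrative only.

```python
import json

# Assumption: the 9.4 MB limit is interpreted as 9.4 * 1024 * 1024 bytes.
ITEM_LIMIT_BYTES = int(9.4 * 1024 * 1024)


def item_size_bytes(item) -> int:
    """Approximate an item's size as the length of its UTF-8 encoded JSON."""
    return len(json.dumps(item, ensure_ascii=False).encode("utf-8"))


def fits_item_limit(item, limit: int = ITEM_LIMIT_BYTES) -> bool:
    """True if the serialized item is at or below the assumed limit."""
    return item_size_bytes(item) <= limit


if __name__ == "__main__":
    item = {"text": "This is the extracted plain text transcript..."}
    size = item_size_bytes(item)
    print(f"{size} bytes, {size / ITEM_LIMIT_BYTES:.4%} of the assumed limit")
```

If items for long videos approach the limit, requesting fewer formats per run is the simplest way to shrink them.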
💬 Support
If you encounter issues or have questions, contact me on X: https://x.com/t_zerohour.