
Extract Youtube Transcript
Deprecated
Pricing
$5.00 / 1,000 results
Extract YouTube video captions & subtitles with multi-language support. Get JSON, text, SRT, or WebVTT formats. Flexible timestamp options, metadata control, and robust error handling. Perfect for content creators, researchers, and accessibility needs. 🎬✨
Last modified
22 days ago
YouTube Transcript Extractor
A powerful Apify actor that extracts captions and subtitles from YouTube videos with multi-language support and various output formats.
🚀 Live Actor: https://apify.com/jimbiano/extract-youtube-transcript
Features
- ✅ Extract transcripts from YouTube videos using video URL or ID
- ✅ Support for multiple languages with automatic detection
- ✅ Multiple output formats: JSON, plain text, SRT, and WebVTT
- ✅ Handle both manual and auto-generated captions
- ✅ Robust error handling for various edge cases
- ✅ Preserve or strip HTML formatting
- ✅ Detailed metadata and timestamps
- ✅ Translation support (when available)
Input Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| `videoUrl` | string | No* | YouTube video URL (e.g., `https://www.youtube.com/watch?v=dQw4w9WgXcQ`) |
| `videoId` | string | No* | YouTube video ID (e.g., `dQw4w9WgXcQ`) |
| `languages` | array | No | Preferred language codes (e.g., `["en", "es", "fr"]`) |
| `includeGenerated` | boolean | No | Include auto-generated transcripts (default: `true`) |
| `includeManual` | boolean | No | Include manually created transcripts (default: `true`) |
| `outputFormat` | string | No | Output format: `json`, `text`, `srt`, `vtt` (default: `json`) |
| `preserveFormatting` | boolean | No | Preserve HTML formatting (default: `false`) |
| `translateTo` | string | No | Language code to translate to (optional) |
| `includeTimestamps` | boolean | No | Include timing information (default: `true`) |
| `timestampFormat` | string | No | Timestamp format: `start_only`, `start_end`, `all` (default: `start_only`) |
| `includeMetadata` | boolean | No | Include metadata information (default: `false`) |
| `simplifiedOutput` | boolean | No | Return only the transcript array (default: `false`) |
*Either `videoUrl` or `videoId` must be provided.
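The defaults and the either/or rule above can be checked on the client before a run is started. The following is a minimal sketch, not the actor's own validation code: `buildInput`, `DEFAULTS`, `OUTPUT_FORMATS`, and `TIMESTAMP_FORMATS` are hypothetical names, and the values are copied from the table.

```javascript
// Defaults copied from the Input Parameters table.
const DEFAULTS = {
  includeGenerated: true,
  includeManual: true,
  outputFormat: 'json',
  preserveFormatting: false,
  includeTimestamps: true,
  timestampFormat: 'start_only',
  includeMetadata: false,
  simplifiedOutput: false,
};

const OUTPUT_FORMATS = ['json', 'text', 'srt', 'vtt'];
const TIMESTAMP_FORMATS = ['start_only', 'start_end', 'all'];

// Hypothetical helper: merges user input over the documented defaults
// and rejects inputs the table describes as invalid.
function buildInput(userInput) {
  const input = { ...DEFAULTS, ...userInput };
  if (!input.videoUrl && !input.videoId) {
    throw new Error('Either videoUrl or videoId must be provided.');
  }
  if (!OUTPUT_FORMATS.includes(input.outputFormat)) {
    throw new Error(`Unsupported outputFormat: ${input.outputFormat}`);
  }
  if (!TIMESTAMP_FORMATS.includes(input.timestampFormat)) {
    throw new Error(`Unsupported timestampFormat: ${input.timestampFormat}`);
  }
  return input;
}
```

Calling `buildInput({ videoId: 'dQw4w9WgXcQ', outputFormat: 'srt' })` returns an object with every documented default filled in, which can be passed to the actor as-is.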
Supported URL Formats
The actor supports various YouTube URL formats:
- Standard: https://www.youtube.com/watch?v=VIDEO_ID
- Short: https://youtu.be/VIDEO_ID
- Embed: https://www.youtube.com/embed/VIDEO_ID
- Shorts: https://www.youtube.com/shorts/VIDEO_ID
- Video ID only: VIDEO_ID
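To illustrate how these formats map to a single video ID, here is a rough sketch of a parser. It is not the actor's implementation, which this README does not show. `extractVideoId` is a hypothetical helper, and the 11-character ID pattern is an assumption based on the example ID `dQw4w9WgXcQ`.

```javascript
// Assumption: video IDs are 11 characters from A-Z, a-z, 0-9, "_" and "-".
const ID_PATTERN = /^[\w-]{11}$/;

// Hypothetical helper: returns the video ID, or null if none is recognized.
function extractVideoId(input) {
  const value = String(input).trim();
  if (ID_PATTERN.test(value)) return value; // "Video ID only" form

  let url;
  try {
    url = new URL(value);
  } catch {
    return null; // neither a bare ID nor a parseable URL
  }

  const host = url.hostname.replace(/^www\./, '');
  let candidate = null;
  if (host === 'youtu.be') {
    // Short form: https://youtu.be/VIDEO_ID
    candidate = url.pathname.split('/')[1];
  } else if (host === 'youtube.com') {
    const [, first, second] = url.pathname.split('/');
    if (first === 'watch') {
      candidate = url.searchParams.get('v'); // Standard form
    } else if (first === 'embed' || first === 'shorts') {
      candidate = second; // Embed and Shorts forms
    }
  }
  return candidate && ID_PATTERN.test(candidate) ? candidate : null;
}
```

Using the built-in `URL` class rather than one large regular expression keeps extra query parameters (such as `&t=42s`) from affecting the result.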
Output Format
JSON Output Examples
Default Output (start timestamp only, no metadata)
{
  "success": true,
  "videoId": "dQw4w9WgXcQ",
  "videoUrl": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "transcript": [
    { "text": "We're no strangers to love", "start": 0.5 },
    { "text": "You know the rules and so do I", "start": 2.8 }
  ]
}
With All Timestamps and Metadata
{
  "success": true,
  "videoId": "dQw4w9WgXcQ",
  "videoUrl": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "transcript": [
    { "text": "We're no strangers to love", "start": 0.5, "end": 2.8, "duration": 2.3, "offset": 0.5 },
    { "text": "You know the rules and so do I", "start": 2.8, "end": 4.9, "duration": 2.1, "offset": 2.8 }
  ],
  "metadata": {
    "videoId": "dQw4w9WgXcQ",
    "videoUrl": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "extractedAt": "2024-01-15T10:30:00.000Z",
    "totalSegments": 156,
    "totalDuration": 212.5,
    "language": "en",
    "preserveFormatting": false,
    "outputFormat": "json",
    "includeTimestamps": true,
    "timestampFormat": "all",
    "includeMetadata": true
  }
}
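The actor can emit SRT directly via `outputFormat: "srt"`, but the `start` and `end` fields in the JSON above are enough to build subtitles on the client as well. The sketch below shows one way to do that. It is illustrative only and is not how the actor generates its SRT output; `toSrtTime` and `segmentsToSrt` are hypothetical names, and the segments are assumed to carry both `start` and `end` in seconds.

```javascript
// Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm
function toSrtTime(seconds) {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  const pad = (n, width = 2) => String(n).padStart(width, '0');
  return `${pad(h)}:${pad(m)}:${pad(s)},${pad(ms, 3)}`;
}

// Build an SRT document from segments shaped like { text, start, end }.
// Each cue is: index line, time range line, text line; cues are separated
// by a blank line.
function segmentsToSrt(segments) {
  return segments
    .map((seg, i) =>
      `${i + 1}\n${toSrtTime(seg.start)} --> ${toSrtTime(seg.end)}\n${seg.text}\n`)
    .join('\n');
}

// Usage with the first two segments from the example output.
const srt = segmentsToSrt([
  { text: "We're no strangers to love", start: 0.5, end: 2.8 },
  { text: 'You know the rules and so do I', start: 2.8, end: 4.9 },
]);
console.log(srt);
```

Converting to whole milliseconds first, then splitting into fields, avoids the rounding drift that can occur when hours, minutes, and seconds are each derived from a floating-point value.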
Simplified Output (transcript array only)
[
  { "text": "We're no strangers to love", "start": 0.5 },
  { "text": "You know the rules and so do I", "start": 2.8 }
]
Error Output
{
  "success": false,
  "videoId": "invalid123",
  "videoUrl": "https://www.youtube.com/watch?v=invalid123",
  "error": "No transcript found for this video",
  "errorType": "NO_TRANSCRIPT_AVAILABLE",
  "extractedAt": "2024-01-15T10:30:00.000Z"
}
Usage Examples
Basic Usage
{"videoUrl": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"}
Extract Specific Languages
{"videoUrl": "https://www.youtube.com/watch?v=dQw4w9WgXcQ","languages": ["en", "es"],"outputFormat": "json"}
Generate SRT Subtitles
{"videoId": "dQw4w9WgXcQ","outputFormat": "srt","includeGenerated": true}
Timestamp Control Examples
{"videoUrl": "https://www.youtube.com/watch?v=dQw4w9WgXcQ","includeTimestamps": false,"outputFormat": "json"}
Simplified Array Output
{"videoUrl": "https://www.youtube.com/watch?v=dQw4w9WgXcQ","simplifiedOutput": true,"timestampFormat": "start_only"}
Full Information with Metadata
{"videoUrl": "https://www.youtube.com/watch?v=dQw4w9WgXcQ","timestampFormat": "all","includeMetadata": true,"includeTimestamps": true}
Plain Text Output
{"videoUrl": "https://www.youtube.com/watch?v=dQw4w9WgXcQ","outputFormat": "text","preserveFormatting": false}
Error Handling
The actor handles various error scenarios gracefully:
- TRANSCRIPTS_DISABLED: Video has transcripts disabled
- NO_TRANSCRIPT_AVAILABLE: No transcripts found for the video
- VIDEO_UNAVAILABLE: Video is private or unavailable
- AGE_RESTRICTED: Video is age-restricted
- NETWORK_ERROR: Network connectivity issues
- RATE_LIMITED: Too many requests (temporary)
- UNKNOWN_ERROR: Other unexpected errors
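A caller can use `errorType` to decide whether a failed result is worth retrying. The sketch below is one possible policy, not behavior built into the actor. It assumes that only `RATE_LIMITED` (described above as temporary) and `NETWORK_ERROR` are transient; `isRetryable`, `backoffDelay`, and `withRetries` are hypothetical helpers, and `fetchResult` stands for any caller-supplied async function that returns one result item.

```javascript
// Assumption: these two error types are transient; all others are permanent.
const RETRYABLE = new Set(['RATE_LIMITED', 'NETWORK_ERROR']);

// True only for failed results whose errorType is considered transient.
function isRetryable(result) {
  return result.success === false && RETRYABLE.has(result.errorType);
}

// Exponential backoff in milliseconds for a 0-based attempt, capped at maxMs.
function backoffDelay(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls fetchResult up to maxAttempts times, waiting between transient
// failures, and returns the last result either way.
async function withRetries(fetchResult, maxAttempts = 3, sleep = defaultSleep) {
  let result;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    result = await fetchResult();
    if (!isRetryable(result)) return result;
    if (attempt < maxAttempts - 1) await sleep(backoffDelay(attempt));
  }
  return result;
}
```

Errors such as `TRANSCRIPTS_DISABLED` or `VIDEO_UNAVAILABLE` are returned immediately under this policy, since repeating the request is unlikely to change the outcome.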
Local Development
Prerequisites
- Node.js 18 or higher
- npm or yarn
Setup
- Clone the repository
- Install dependencies: `npm install`
- Run tests: `npm test`
- Run the actor locally: `npm start`
Testing
The actor includes a test suite that validates core functionality:
$ node src/test.js
API Usage
Using Apify API
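// apify-client exposes ApifyClient as a named export.
const { ApifyClient } = require('apify-client');

const client = new ApifyClient({
  token: 'YOUR_APIFY_TOKEN',
});

const input = {
  videoUrl: 'https://www.youtube.com/watch?v=dQw4w9WgXcQ',
  languages: ['en'],
  outputFormat: 'json',
  timestampFormat: 'start_only',
  includeMetadata: false,
};

// Top-level await is not available in CommonJS, so wrap the calls in an async function.
(async () => {
  // Start the actor and wait for the run to finish.
  const run = await client.actor('jimbiano/extract-youtube-transcript').call(input);
  // Read the results from the run's default dataset.
  const { items } = await client.dataset(run.defaultDatasetId).listItems();
  console.log(items[0]);
})();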
Using cURL
curl -X POST "https://api.apify.com/v2/acts/jimbiano~extract-youtube-transcript/runs" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"videoUrl": "https://www.youtube.com/watch?v=dQw4w9WgXcQ","outputFormat": "json","timestampFormat": "start_only","includeMetadata": false}'
Using Apify Console
- Visit: https://apify.com/jimbiano/extract-youtube-transcript
- Click "Try for free"
- Enter your input:
{"videoUrl": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"}
- Click "Start" to run the actor
Limitations
- Depends on YouTube's transcript availability
- Some videos may not have transcripts
- Rate limiting may apply for high-volume usage
- Age-restricted videos may require authentication
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests for new functionality
- Submit a pull request
License
MIT License - see LICENSE file for details
Support
For issues and questions:
- Create an issue on GitHub
- Contact support through Apify Console
- Check the Apify documentation
Changelog
v1.1.0
- ✅ NEW: Optional timestamp control (`includeTimestamps`, `timestampFormat`)
- ✅ NEW: Optional metadata inclusion (`includeMetadata`)
- ✅ NEW: Simplified output option (`simplifiedOutput`)
- ✅ NEW: Flexible timestamp formats: `start_only` (default), `start_end`, `all`
- ✅ IMPROVED: More granular control over output structure
- ✅ IMPROVED: Better default settings for cleaner output
v1.0.0
- Initial release
- Basic transcript extraction
- Multi-language support
- Multiple output formats
- Error handling