Youtube Transcript Scraper
Pricing
Pay per usage
Extract transcripts, video details, and metadata from YouTube videos with high accuracy and reliability. This Apify actor supports bulk processing, multiple languages, and intelligent proxy management.
Why Choose Us?
- High Accuracy: Uses yt-dlp for reliable video data extraction
- Fast Processing: Parallel processing with configurable workers
- Multi-Language: Support for transcripts in multiple languages
- Anti-Blocking: Intelligent proxy management with automatic fallback
- Rich Metadata: Extract comprehensive video and channel information
- Retry Logic: Robust error handling with configurable retries
- Live Saving: Results saved immediately to prevent data loss
Key Features
- Bulk Processing: Process multiple YouTube URLs simultaneously
- Language Selection: Choose preferred transcript language
- Proxy Support: Built-in proxy management with residential fallback
- Rich Data: Extract transcripts, video details, channel info, and metadata
- Error Handling: Comprehensive error handling and logging
- Rate Limiting: Configurable delays to avoid rate limiting
- Parallel Execution: Multi-threaded processing for faster results
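The parallel execution, retry, and delay features listed above can be pictured with a minimal Python sketch. This is not the actor's actual code: `fetch_one`, `process_with_retries`, and `process_all` are hypothetical names, and the growing delay between attempts is an assumed policy.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def process_with_retries(fetch_one, url, max_retries=3, request_delay=1.0):
    """Call fetch_one(url), retrying up to max_retries times after a failure."""
    last_error = None
    for attempt in range(max_retries + 1):
        try:
            return {"url": url, "success": True, "data": fetch_one(url)}
        except Exception as exc:  # a real implementation would catch narrower errors
            last_error = exc
            if attempt < max_retries:
                # Assumed policy: wait a little longer after each failed attempt.
                time.sleep(request_delay * (attempt + 1))
    return {"url": url, "success": False, "message": str(last_error)}


def process_all(fetch_one, urls, max_workers=4, **retry_options):
    """Process URLs on a thread pool; results keep the order of the input list."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(
            lambda url: process_with_retries(fetch_one, url, **retry_options),
            urls,
        ))
```

Here `fetch_one` stands in for whatever function downloads a single video's data. A failing URL produces a record with `success` set to `False` instead of stopping the other workers.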
Input
Input Schema
{
  "startUrls": [
    {"url": "https://www.youtube.com/watch?v=Z4hVGCWH1Kc"},
    {"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"}
  ],
  "language": "en",
  "maxWorkers": 4,
  "requestDelay": 1,
  "maxRetries": 3,
  "proxyConfiguration": {"useApifyProxy": false}
}
Input Parameters
Parameter | Type | Required | Default | Description |
---|---|---|---|---|
startUrls | Array | Yes | - | List of YouTube video URLs to process |
language | String | No | "en" | Preferred language code for transcripts |
maxWorkers | Integer | No | 4 | Number of parallel workers (1-10) |
requestDelay | Integer | No | 1 | Base delay between requests in seconds |
maxRetries | Integer | No | 3 | Number of retry attempts for failed requests |
proxyConfiguration | Object | No | {"useApifyProxy": false} | Proxy configuration settings |
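As an illustration of the parameters above, the following Python sketch builds an input object with the documented defaults and checks the documented 1-10 range for `maxWorkers`. The helper name `build_input` is hypothetical, and rejecting negative delays and retry counts is an assumption rather than documented behavior.

```python
import json


def build_input(urls, language="en", max_workers=4, request_delay=1,
                max_retries=3, use_apify_proxy=False):
    """Return an input dict in the shape of the input schema, with basic checks."""
    if not urls:
        raise ValueError("startUrls is required and must not be empty")
    if not 1 <= max_workers <= 10:
        raise ValueError("maxWorkers must be between 1 and 10")
    # Assumption: negative values make no sense for these two parameters.
    if request_delay < 0 or max_retries < 0:
        raise ValueError("requestDelay and maxRetries must not be negative")
    return {
        "startUrls": [{"url": url} for url in urls],
        "language": language,
        "maxWorkers": max_workers,
        "requestDelay": request_delay,
        "maxRetries": max_retries,
        "proxyConfiguration": {"useApifyProxy": use_apify_proxy},
    }


if __name__ == "__main__":
    example = build_input(["https://www.youtube.com/watch?v=dQw4w9WgXcQ"])
    print(json.dumps(example, indent=2))
```

The printed JSON can be pasted into the actor's input editor in the Apify Console.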
Output
Output Schema
{
  "url": "https://www.youtube.com/watch?v=Z4hVGCWH1Kc",
  "title": "Example Video Title",
  "length": "10:30",
  "channel_name": "Example Channel",
  "views": 1000000,
  "transcript": [
    {"start": "0.000", "dur": "2.500", "text": "Hello and welcome to this video."},
    {"start": "2.500", "dur": "3.200", "text": "Today we'll be discussing..."}
  ],
  "message": "Transcript loaded from subtitles:en",
  "success": true,
  "video_id": "Z4hVGCWH1Kc",
  "channel_url": "https://www.youtube.com/channel/UC...",
  "upload_date": "20231201",
  "description": "Video description text...",
  "tags": ["tag1", "tag2", "tag3"],
  "category": "Education",
  "like_count": 5000,
  "dislike_count": 100,
  "comment_count": 250
}
Output Fields
Field | Type | Description |
---|---|---|
url | String | Original YouTube video URL |
title | String | Video title |
length | String | Video duration in MM:SS format |
channel_name | String | Channel name |
views | Number | View count |
transcript | Array | Array of transcript segments with timing |
message | String | Status message about transcript extraction |
success | Boolean | Whether processing was successful |
video_id | String | YouTube video ID |
channel_url | String | Channel URL |
upload_date | String | Upload date (YYYYMMDD format) |
description | String | Video description |
tags | Array | Video tags |
category | String | Video category |
like_count | Number | Like count |
dislike_count | Number | Dislike count |
comment_count | Number | Comment count |
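To show how the output fields might be consumed, here is a small Python sketch that converts `length` to seconds, parses `upload_date`, and joins the transcript segments into plain text. The function names are hypothetical, and support for an `HH:MM:SS` length is an assumption, since only `MM:SS` is documented.

```python
from datetime import datetime


def length_to_seconds(length):
    """Convert "MM:SS" (or "HH:MM:SS", an assumed variant) to seconds."""
    seconds = 0
    for part in length.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def summarize(record):
    """Pick a few fields from one output record and flatten the transcript."""
    return {
        "video_id": record["video_id"],
        "title": record["title"],
        "seconds": length_to_seconds(record["length"]),
        # upload_date is documented as YYYYMMDD.
        "uploaded": datetime.strptime(record["upload_date"], "%Y%m%d").date(),
        # An empty or missing transcript yields an empty string.
        "text": " ".join(seg["text"] for seg in record.get("transcript", [])),
    }
```

Each item exported from the actor's dataset can be passed to `summarize` as a dictionary.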
How to Use the Actor (via Apify Console)
- Log in at https://console.apify.com and go to Actors
- Find the `youtube-transcript-scraper` actor and click it
- Configure inputs:
  - Add YouTube video URLs in the `startUrls` field
  - Set preferred language (optional)
  - Configure proxy settings if needed
  - Adjust processing parameters
- Run the actor and monitor logs in real time
- Access results in the OUTPUT tab
- Export results to JSON or CSV format
Best Use Cases
- Educational Content: Extract transcripts from educational videos for study materials
- Content Analysis: Analyze video content and speech patterns
- Translation: Extract transcripts for translation services
- Documentation: Create written records of video content
- Research: Analyze video content for research purposes
- SEO: Extract keywords and content for SEO analysis
- Accessibility: Create accessible content from video transcripts
Frequently Asked Questions
Q: What types of YouTube URLs are supported?
A: The actor supports standard YouTube watch URLs (`youtube.com/watch?v=...`) and shortened URLs (`youtu.be/...`).
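For readers who want to check their URLs before a run, the two supported URL shapes can be recognized with a short Python sketch. `extract_video_id` is a hypothetical helper, not part of the actor, and accepting the `m.youtube.com` host is an assumption.

```python
from urllib.parse import parse_qs, urlparse


def extract_video_id(url):
    """Return the video ID from a watch or youtu.be URL, or None if unrecognized."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Shortened form: the ID is the first path segment.
        video_id = parsed.path.lstrip("/").split("/")[0]
        return video_id or None
    if host in ("youtube.com", "www.youtube.com", "m.youtube.com") and parsed.path == "/watch":
        # Standard form: the ID is the "v" query parameter.
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None
```

URLs for which this returns `None` are worth reviewing before adding them to `startUrls`.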
Q: Can I extract transcripts in different languages?
A: Yes! Set the `language` parameter to your preferred language code (e.g., "es" for Spanish, "fr" for French). The actor will try to find transcripts in that language first, then fall back to available alternatives.
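One way such a fallback could work is sketched below in Python. The source only states that the actor falls back to available alternatives, so the specific order used here (exact match, then a regional variant of the same language, then the first available track) is an assumption, and `pick_language` is a hypothetical name.

```python
def pick_language(preferred, available):
    """Choose a transcript language from the available codes, or None if there are none."""
    if not available:
        return None
    if preferred in available:
        return preferred
    # Assumed fallback: accept a regional variant such as "es-419" for "es".
    base = preferred.split("-")[0].lower()
    for code in available:
        if code.split("-")[0].lower() == base:
            return code
    # Assumed last resort: whatever track is listed first.
    return available[0]
```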
Q: What happens if a video has no transcript?
A: The actor will return a result with an empty transcript array and a message indicating that no transcript is available for that video.
Q: How does the proxy fallback work?
A: If you enable the proxy and YouTube blocks the initial proxy, the actor automatically switches to a residential proxy and uses it for all remaining URLs.
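The one-way nature of this switch can be modeled with a tiny Python sketch. `ProxySelector` and the proxy labels are hypothetical; the actor's real implementation is not shown in this document.

```python
class ProxySelector:
    """Tracks which proxy to use; once blocked, stays on the residential one."""

    def __init__(self, primary, residential):
        self._primary = primary
        self._residential = residential
        self._blocked = False

    def current(self):
        """Return the proxy that the next request should use."""
        return self._residential if self._blocked else self._primary

    def report_blocked(self):
        """Record a block; later URLs keep using the residential proxy."""
        self._blocked = True


selector = ProxySelector("initial-proxy", "residential-proxy")
selector.report_blocked()
print(selector.current())
```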
Q: Can I process private or age-restricted videos?
A: No, the actor can only process publicly available videos that don't require authentication.
Q: What's the maximum number of URLs I can process?
A: There's no strict limit, but we recommend processing up to 100 URLs per run for optimal performance and to avoid rate limiting.
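A longer URL list can be split into runs of at most 100 URLs with a few lines of Python. `chunk_urls` is a hypothetical helper; the default batch size follows the recommendation above.

```python
def chunk_urls(urls, batch_size=100):
    """Split a URL list into consecutive batches of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [urls[i:i + batch_size] for i in range(0, len(urls), batch_size)]
```

Each batch can then be used as the `startUrls` list of a separate run.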
Q: How accurate are the transcripts?
A: The actor extracts official YouTube subtitles and auto-generated captions. Accuracy depends on the quality of the original subtitles provided by the video creator.
Q: Can I get timestamps with the transcript?
A: Yes! Each transcript segment includes start time and duration information for precise timing.
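As an example of using the timing fields, the Python sketch below renders transcript segments as SubRip (SRT) subtitles. It assumes `start` and `dur` are seconds encoded as strings, as in the sample output; `srt_timestamp` and `to_srt` are hypothetical helper names.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(float(seconds) * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(transcript):
    """Render transcript segments (string "start" and "dur" fields) as SRT text."""
    blocks = []
    for index, seg in enumerate(transcript, start=1):
        start = float(seg["start"])
        end = start + float(seg["dur"])  # end time = start + duration
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{seg['text']}\n"
        )
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)
```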
Support and Feedback
- Email: support@apify.com
- Chat: Available in the Apify Console
- Documentation: https://docs.apify.com
- Bug Reports: Report issues through the Apify Console
Cautions
Important Notes:
- Data is collected only from publicly available YouTube videos
- No data is extracted from private, unlisted, or age-restricted content
- The end user is responsible for ensuring legal compliance
- Respect YouTube's Terms of Service and rate limits
- Use transcripts responsibly and respect copyright laws
- This tool is for educational and research purposes only
Built with ❤️ by the Apify team