Youtube Transcript Scraper

Pricing

Pay per usage

Developed by

Scraper Engine

Maintained by Community

Last modified

18 hours ago

Extract transcripts, video details, and metadata from YouTube videos with high accuracy and reliability. This Apify actor supports bulk processing, multiple languages, and intelligent proxy management.

Why Choose Us?

  • 🔍 High Accuracy: Uses yt-dlp for reliable video data extraction
  • ⚡ Fast Processing: Parallel processing with configurable workers
  • 🌍 Multi-Language: Support for transcripts in multiple languages
  • 🛡️ Anti-Blocking: Intelligent proxy management with automatic fallback
  • 📊 Rich Metadata: Extract comprehensive video and channel information
  • 🔄 Retry Logic: Robust error handling with configurable retries
  • 💾 Live Saving: Results saved immediately to prevent data loss

Key Features

  • Bulk Processing: Process multiple YouTube URLs simultaneously
  • Language Selection: Choose preferred transcript language
  • Proxy Support: Built-in proxy management with residential fallback
  • Rich Data: Extract transcripts, video details, channel info, and metadata
  • Error Handling: Comprehensive error handling and logging
  • Rate Limiting: Configurable delays to avoid rate limiting
  • Parallel Execution: Multi-threaded processing for faster results
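
How the actor combines parallel workers, request delays, and retries internally is not documented on this page. The sketch below only illustrates the general pattern in Python, assuming a hypothetical `fetch` callable that processes one URL and raises on failure. The names `with_retries` and `process_all`, and the doubling of the delay between attempts, are assumptions made for illustration and are not part of the actor.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def with_retries(fetch: Callable[[str], dict], url: str,
                 max_retries: int = 3, request_delay: float = 1.0) -> dict:
    """Call fetch(url); on an exception, wait and retry up to max_retries times."""
    last_error = None
    for attempt in range(max_retries + 1):
        try:
            return fetch(url)
        except Exception as exc:  # a real scraper would catch narrower errors
            last_error = exc
            if attempt < max_retries:
                # Assumed policy: double the base delay after each failure.
                time.sleep(request_delay * (2 ** attempt))
    return {"url": url, "success": False, "message": str(last_error)}


def process_all(urls: List[str], fetch: Callable[[str], dict],
                max_workers: int = 4, max_retries: int = 3,
                request_delay: float = 1.0) -> List[dict]:
    """Process URLs in a thread pool; results are returned in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(
            lambda u: with_retries(fetch, u, max_retries, request_delay), urls))
```

A URL that keeps failing ends up as a result with `success` set to false instead of aborting the whole batch, which matches the per-video `success` and `message` fields described under Output.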

Input

Input Schema

{
  "startUrls": [
    {
      "url": "https://www.youtube.com/watch?v=Z4hVGCWH1Kc"
    },
    {
      "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
    }
  ],
  "language": "en",
  "maxWorkers": 4,
  "requestDelay": 1,
  "maxRetries": 3,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}

Input Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| startUrls | Array | ✅ Yes | - | List of YouTube video URLs to process |
| language | String | ❌ No | "en" | Preferred language code for transcripts |
| maxWorkers | Integer | ❌ No | 4 | Number of parallel workers (1-10) |
| requestDelay | Integer | ❌ No | 1 | Base delay between requests in seconds |
| maxRetries | Integer | ❌ No | 3 | Number of retry attempts for failed requests |
| proxyConfiguration | Object | ❌ No | {"useApifyProxy": false} | Proxy configuration settings |
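
As a convenience, the input object can be assembled and checked before a run. The following Python sketch uses the parameter names, defaults, and the 1-10 worker range listed above. The helper `build_run_input` is a hypothetical name, and the non-negative checks on delay and retries are an assumption, not documented actor behavior.

```python
from typing import Dict, List, Optional


def build_run_input(urls: List[str], language: str = "en", max_workers: int = 4,
                    request_delay: int = 1, max_retries: int = 3,
                    proxy: Optional[Dict] = None) -> Dict:
    """Build an actor input dict using the documented names and defaults."""
    if not urls:
        raise ValueError("startUrls is required and must not be empty")
    if not 1 <= max_workers <= 10:
        raise ValueError("maxWorkers must be between 1 and 10")
    if request_delay < 0 or max_retries < 0:
        # Assumed constraint: negative values make no sense for these fields.
        raise ValueError("requestDelay and maxRetries must not be negative")
    return {
        "startUrls": [{"url": u} for u in urls],
        "language": language,
        "maxWorkers": max_workers,
        "requestDelay": request_delay,
        "maxRetries": max_retries,
        "proxyConfiguration": proxy if proxy is not None else {"useApifyProxy": False},
    }
```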

Output

Output Schema

{
  "url": "https://www.youtube.com/watch?v=Z4hVGCWH1Kc",
  "title": "Example Video Title",
  "length": "10:30",
  "channel_name": "Example Channel",
  "views": 1000000,
  "transcript": [
    {
      "start": "0.000",
      "dur": "2.500",
      "text": "Hello and welcome to this video."
    },
    {
      "start": "2.500",
      "dur": "3.200",
      "text": "Today we'll be discussing..."
    }
  ],
  "message": "Transcript loaded from subtitles:en",
  "success": true,
  "video_id": "Z4hVGCWH1Kc",
  "channel_url": "https://www.youtube.com/channel/UC...",
  "upload_date": "20231201",
  "description": "Video description text...",
  "tags": ["tag1", "tag2", "tag3"],
  "category": "Education",
  "like_count": 5000,
  "dislike_count": 100,
  "comment_count": 250
}

Output Fields

| Field | Type | Description |
|---|---|---|
| url | String | Original YouTube video URL |
| title | String | Video title |
| length | String | Video duration in MM:SS format |
| channel_name | String | Channel name |
| views | Number | View count |
| transcript | Array | Array of transcript segments with timing |
| message | String | Status message about transcript extraction |
| success | Boolean | Whether processing was successful |
| video_id | String | YouTube video ID |
| channel_url | String | Channel URL |
| upload_date | String | Upload date (YYYYMMDD format) |
| description | String | Video description |
| tags | Array | Video tags |
| category | String | Video category |
| like_count | Number | Like count |
| dislike_count | Number | Dislike count |
| comment_count | Number | Comment count |
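
To show how one output item might be consumed, here is a small Python sketch that joins the transcript segments into plain text and derives each segment's end time. It assumes, as in the sample output above, that `start` and `dur` arrive as strings holding seconds. The helper names are illustrative only.

```python
from typing import List, Tuple


def transcript_text(item: dict) -> str:
    """Join all segment texts of one output item into a single string."""
    return " ".join(seg["text"].strip() for seg in item.get("transcript", []))


def segment_times(item: dict) -> List[Tuple[float, float]]:
    """Return (start, end) in seconds for each segment; end = start + dur."""
    times = []
    for seg in item.get("transcript", []):
        start = float(seg["start"])  # strings such as "2.500" in the sample
        times.append((start, start + float(seg["dur"])))
    return times
```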

🚀 How to Use the Actor (via Apify Console)

  1. Log in at https://console.apify.com and go to Actors
  2. Find the youtube-transcript-scraper actor and click it
  3. Configure inputs:
    • Add YouTube video URLs in the startUrls field
    • Set preferred language (optional)
    • Configure proxy settings if needed
    • Adjust processing parameters
  4. Run the actor and monitor logs in real time
  5. Access results in the OUTPUT tab
  6. Export results to JSON or CSV format
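
The same steps can likely be scripted instead of clicked through. The sketch below assumes the `apify-client` Python package (`pip install apify-client`), an API token, and an actor ID that you copy from the actor's page, since this page does not state the exact ID. `run_actor` and `items_to_csv` are hypothetical helper names, and the CSV column selection is an arbitrary choice for illustration.

```python
import csv
import io
from typing import Iterable, List


def run_actor(token: str, actor_id: str, run_input: dict) -> List[dict]:
    """Start a run, wait for it to finish, and return the dataset items."""
    from apify_client import ApifyClient  # imported lazily; third-party package

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def items_to_csv(items: Iterable[dict]) -> str:
    """Flatten items to CSV text, joining each transcript into one column."""
    fields = ["video_id", "title", "channel_name", "success", "transcript_text"]
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for item in items:
        row = {name: item.get(name, "") for name in fields[:-1]}
        row["transcript_text"] = " ".join(
            seg["text"] for seg in item.get("transcript", []))
        writer.writerow(row)
    return buf.getvalue()
```

A call would then look like `items_to_csv(run_actor(token, actor_id, run_input))`, where `run_input` follows the input example shown earlier.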

Best Use Cases

  • 📚 Educational Content: Extract transcripts from educational videos for study materials
  • 🎬 Content Analysis: Analyze video content and speech patterns
  • 🌐 Translation: Extract transcripts for translation services
  • 📝 Documentation: Create written records of video content
  • 🔍 Research: Analyze video content for research purposes
  • 📊 SEO: Extract keywords and content for SEO analysis
  • 🎯 Accessibility: Create accessible content from video transcripts

Frequently Asked Questions

Q: What types of YouTube URLs are supported?

A: The actor supports standard YouTube watch URLs (youtube.com/watch?v=...) and shortened URLs (youtu.be/...).
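
For pre-filtering a URL list, the two supported shapes can be recognized with the standard library. This is a minimal Python sketch, not the actor's own parser; `extract_video_id` is a hypothetical helper, and treating both `youtube.com` and `www.youtube.com` as valid hosts is an assumption.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a watch or youtu.be URL, else None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short form: the ID is the first path segment.
        return parsed.path.lstrip("/").split("/")[0] or None
    if host in ("youtube.com", "www.youtube.com") and parsed.path == "/watch":
        # Standard form: the ID is the "v" query parameter.
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None
```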

Q: Can I extract transcripts in different languages?

A: Yes! Set the language parameter to your preferred language code (e.g., "es" for Spanish, "fr" for French). The actor will try to find transcripts in that language first, then fall back to available alternatives.
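
The exact fallback order is not documented here. The Python sketch below shows one plausible reading of "preferred first, then alternatives": an exact match, then a regional variant of the same base language, then whatever is listed first. `pick_language` and this ordering are assumptions for illustration.

```python
from typing import List, Optional


def pick_language(preferred: str, available: List[str]) -> Optional[str]:
    """Choose a transcript language code from the available ones."""
    if not available:
        return None
    if preferred in available:
        return preferred
    # Assumed second step: accept a variant such as "pt-BR" for "pt".
    base = preferred.split("-")[0].lower()
    for code in available:
        if code.split("-")[0].lower() == base:
            return code
    # Assumed last resort: the first language that exists.
    return available[0]
```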

Q: What happens if a video has no transcript?

A: The actor will return a result with an empty transcript array and a message indicating that no transcript is available for that video.
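
When post-processing results, such items can be separated from usable ones by checking the `transcript` array. A minimal Python sketch, with `split_by_transcript` as a hypothetical helper name:

```python
from typing import List, Tuple


def split_by_transcript(items: List[dict]) -> Tuple[List[dict], List[dict]]:
    """Return (items with a transcript, items with an empty or missing one)."""
    with_transcript, without_transcript = [], []
    for item in items:
        if item.get("transcript"):
            with_transcript.append(item)
        else:
            without_transcript.append(item)
    return with_transcript, without_transcript
```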

Q: How does the proxy fallback work?

A: If you enable a proxy and YouTube blocks the initial proxy, the actor automatically switches to a residential proxy and uses it for all remaining URLs.
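
The "switch once, then stay switched" behavior described in this answer can be sketched in Python as follows. This is not the actor's code: `BlockedError`, `fetch`, and the proxy arguments are hypothetical stand-ins, and the sketch assumes a block is signaled by an exception.

```python
from typing import Callable, List


class BlockedError(Exception):
    """Raised by the hypothetical fetch when YouTube blocks the proxy."""


def process_with_fallback(urls: List[str],
                          fetch: Callable[[str, str], dict],
                          primary: str, residential: str) -> List[dict]:
    """Use the primary proxy until it is blocked, then stay on residential."""
    proxy = primary
    results = []
    for url in urls:
        try:
            results.append(fetch(url, proxy))
            continue
        except BlockedError:
            pass
        if proxy != residential:
            proxy = residential  # sticky: later URLs also use this proxy
            try:
                results.append(fetch(url, proxy))
                continue
            except BlockedError:
                pass
        results.append({"url": url, "success": False, "message": "blocked"})
    return results
```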

Q: Can I process private or age-restricted videos?

A: No, the actor can only process publicly available videos that don't require authentication.

Q: What's the maximum number of URLs I can process?

A: There's no strict limit, but we recommend processing up to 100 URLs per run for optimal performance and to avoid rate limiting.
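
A longer URL list can be split into runs of at most 100 before submitting. A minimal Python sketch, with `batches` as a hypothetical helper name:

```python
from typing import Iterator, List


def batches(urls: List[str], size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive slices of urls, each with at most size entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(urls), size):
        yield urls[start:start + size]
```

Each yielded batch can then be used as the `startUrls` list of one run.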

Q: How accurate are the transcripts?

A: The actor extracts official YouTube subtitles and auto-generated captions. Accuracy depends on the source: creator-provided subtitles are only as accurate as the creator made them, while auto-generated captions depend on YouTube's speech recognition.

Q: Can I get timestamps with the transcript?

A: Yes! Each transcript segment includes start time and duration information for precise timing.
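
For display purposes, the `start` values can be turned into readable timestamps. The Python sketch below assumes `start` is a string holding seconds, as in the sample output. The helper names and the HH:MM:SS.mmm layout are illustrative choices.

```python
from typing import List


def format_timestamp(seconds: float) -> str:
    """Format a number of seconds as HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def timestamped_lines(transcript: List[dict]) -> List[str]:
    """Prefix each segment's text with its formatted start time."""
    return [f"[{format_timestamp(float(seg['start']))}] {seg['text']}"
            for seg in transcript]
```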

Support and Feedback

Cautions

⚠️ Important Notes:

  • Data is collected only from publicly available YouTube videos
  • No data is extracted from private, unlisted, or age-restricted content
  • The end user is responsible for ensuring legal compliance
  • Respect YouTube's Terms of Service and rate limits
  • Use transcripts responsibly and respect copyright laws
  • This tool is for educational and research purposes only

Built with ❤️ by the Apify team