YouTube Transcript Scraper
Pricing
$9.00/month + usage
Meet the Ultimate YouTube Transcript Hunter! This Apify Actor dives deep into YouTube, extracts every word, and even revives lost subtitles like a digital sorcerer. Fast. Smart. Unstoppable. Ready to fuel your next data breakthrough.
Developer: Neuro Scraper
Last modified: 16 days ago
YouTube Transcript Fetcher Actor
Instantly extract accurate YouTube transcripts: fast, secure, and production-ready.
Summary
This actor automatically fetches YouTube video transcripts (including Shorts), returning clean, timestamped text data. It uses a dual-source strategy to ensure transcripts are delivered even when standard captions are unavailable.
Key benefits:
- Get transcripts instantly from multiple YouTube URLs.
- Smart fallback ensures reliable results.
- Normalizes Shorts and youtu.be links automatically.
- Privacy-safe, proxy-compatible, and production-ready.
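The link normalization mentioned above can be sketched as follows. This is a minimal illustration, not the Actor's actual code: the helper names `extract_video_id` and `normalize_url` are hypothetical, and the sketch assumes only the three URL shapes named in this README (watch links, Shorts links, and youtu.be short links).

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a watch, Shorts, or youtu.be URL, else None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    # Treat www. and m. subdomains the same as the bare domain.
    if host.startswith("www."):
        host = host[4:]
    elif host.startswith("m."):
        host = host[2:]

    parts = [part for part in parsed.path.split("/") if part]

    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        return parts[0] if parts else None

    if host == "youtube.com":
        if parts[:1] == ["watch"]:
            # Standard links carry the ID in the "v" query parameter.
            return parse_qs(parsed.query).get("v", [None])[0]
        if len(parts) >= 2 and parts[0] == "shorts":
            # Shorts links carry the ID after the /shorts/ prefix.
            return parts[1]

    return None


def normalize_url(url: str) -> Optional[str]:
    """Rewrite any supported URL to the canonical watch form."""
    video_id = extract_video_id(url)
    if video_id is None:
        return None
    return f"https://www.youtube.com/watch?v={video_id}"
```

Normalizing before a run also makes it easy to de-duplicate inputs that point at the same video through different link styles.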
Use Cases
- Generate blog summaries or subtitles from videos.
- Extract transcripts for research or educational analysis.
- Analyze large-scale YouTube datasets for content insights.
- Auto-generate closed captions for your platform.
- Power AI models or LLM pipelines with real spoken text.
Quick Start (Console, One Click)

- Open the Actor on Apify Console.
- Paste YouTube video URLs into the Input field.
- Click Run. Results appear in your Dataset once the run finishes.
Quick Start (CLI + API)
CLI:

    $ apify call neuro-scraper/youtube-transcript-fetcher --input-file ./input.example.json
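Python (apify-client):

    from apify_client import ApifyClient

    client = ApifyClient('<APIFY_TOKEN>')

    # Start the Actor and wait for the run to finish.
    run = client.actor('neuro-scraper/youtube-transcript-fetcher').call(
        run_input={"startUrls": [{"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"}]}
    )

    # list_items() returns a page object; the records are in its .items attribute.
    for item in client.dataset(run['defaultDatasetId']).list_items().items:
        # Field names as given in this example; the sample output in the
        # Outputs section exposes the text under "segments" instead.
        print(item['Transcript']['plain_text'])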
Inputs
| Name | Type | Required | Default | Example | Notes |
|---|---|---|---|---|---|
| startUrls | array | Yes | [] | [{"url": "https://www.youtube.com/watch?v=abcd1234"}] | List of YouTube video URLs |
| workers | integer | No | 5 | 10 | Max concurrent fetches |
| proxyConfiguration | object | No | {} | {"useApifyProxy": true} | Proxy settings if needed |
Example input (Console JSON):

    {
      "startUrls": [
        {"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"},
        {"url": "https://youtu.be/example123"}
      ],
      "workers": 5,
      "proxyConfiguration": {"useApifyProxy": true}
    }
Outputs
Each item in the Dataset contains:

    {
      "video_id": "AWBsoArakNY",
      "title": "Who Has The Fastest Reaction Time?",
      "url": "https://youtube.com/AWBsoArakNY?si=8ThAJzdEEA1PnZRk",
      "lang": "en",
      "format": "vtt",
      "segments": [
        {
          "start": 3.919,
          "end": 5.99,
          "text": "I [screaming] WON. I WON. OH, I SHOULD",
          "duration": 2.071,
          "duration_seconds": 2,
          "duration_milliseconds": 71,
          "duration_seconds_with_ms": "2.071",
          "duration_minutes": 0.03451666666666667,
          "start_ts": "00:00:03.919",
          "end_ts": "00:00:05.990",
          "display": "Transcripts:\nStart: 00:00:03.919 End: 00:00:05.990\nDuration: 2.071 seconds (0.034517 minutes)\n\nNo transcript added for this duration."
        }
      ]
    }
Results are stored in the default Dataset for easy export (JSON, CSV, Excel).
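Downstream code usually needs either the plain text or a caption file rather than the raw segments. The following sketch shows one way to derive both, assuming each segment has the `start`, `end`, and `text` fields from the sample item above. The helper names are hypothetical, and the timestamp format is inferred from the sample `start_ts` / `end_ts` values.

```python
from typing import Dict, List


def format_ts(seconds: float) -> str:
    """Format seconds as HH:MM:SS.mmm, as in the sample start_ts / end_ts."""
    # Work in whole milliseconds to avoid floating-point drift.
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def segments_to_text(segments: List[Dict]) -> str:
    """Join all segment texts into one plain-text transcript."""
    return " ".join(seg["text"].strip() for seg in segments if seg.get("text"))


def segments_to_vtt(segments: List[Dict]) -> str:
    """Render segments as a minimal WebVTT document."""
    lines = ["WEBVTT", ""]
    for seg in segments:
        lines.append(f"{format_ts(seg['start'])} --> {format_ts(seg['end'])}")
        lines.append(seg["text"].strip())
        lines.append("")  # A blank line separates cues.
    return "\n".join(lines)
```

For the sample segment, `format_ts(3.919)` gives `00:00:03.919`, which matches the `start_ts` value in the item.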
Environment Variables
| Variable | Description |
|---|---|
| APIFY_TOKEN | Required for authentication |
| HTTP_PROXY, HTTPS_PROXY | Optional custom proxies |
| APIFY_PROXY_PASSWORD | Use with Apify Proxy |
Store all credentials securely as secrets, not plaintext.
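In client scripts, one way to keep the token out of source code is to read it from the environment and fail early when it is missing. This is a minimal sketch; `load_apify_token` is a hypothetical helper, not part of apify-client.

```python
import os
from typing import Mapping, Optional


def load_apify_token(env: Optional[Mapping[str, str]] = None) -> str:
    """Read APIFY_TOKEN from the environment, raising if it is absent."""
    # Accept an explicit mapping so the function is easy to test.
    env = os.environ if env is None else env
    token = env.get("APIFY_TOKEN", "").strip()
    if not token:
        raise RuntimeError("APIFY_TOKEN is not set")
    return token
```

The returned value can then be passed to `ApifyClient(...)` in place of a hard-coded string.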
How to Run
- Open Apify Console.
- Navigate to Actors > YouTube Transcript Fetcher.
- Paste input JSON or fill the input form.
- Click Run.
- View results in the Dataset tab.
Scheduling & Webhooks
- Schedule periodic runs (e.g., daily or hourly) from the Schedule tab.
- Configure Webhooks to trigger a custom workflow or send notifications on completion.
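On the receiving side, a webhook handler typically needs the ID of the dataset that the finished run wrote to. The sketch below assumes the webhook uses Apify's default payload template, where the event name arrives in `eventType` and the run object in `resource`. If you define a custom payload template, the field names will differ. `dataset_id_from_webhook` is a hypothetical helper.

```python
import json
from typing import Optional


def dataset_id_from_webhook(body: str) -> Optional[str]:
    """Return the run's default dataset ID for a successful-run event."""
    payload = json.loads(body)
    # Ignore failed, aborted, or timed-out runs.
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    # Assumption: "resource" holds the run object (default payload template).
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")
```

The returned ID can be passed to `client.dataset(...)` to read the transcripts produced by that run.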
Logs & Troubleshooting
- Monitor real-time logs in the Console Run Log panel.
- Common issues:
  - No transcript available: Video may lack captions.
  - Timeout errors: Increase `workers` or adjust proxy settings.
Permissions & Storage
- Uses Dataset for storing transcript results.
- Uses RequestQueue internally for managing URL processing.
- Fully privacy-safe: no personal data stored or shared.
Changelog
| Version | Date | Notes |
|---|---|---|
| 1.0.0 | 2025-11-04 | Initial release, stable and production-ready |
Notes / TODOs
- TODO: Confirm output schema for advanced use-cases.
- TODO: Add demo GIF of console run for better UX.
Proxy Configuration
Enable Apify Proxy directly in the Console for easy network routing.
Custom proxy example:

    {"proxyConfiguration": {"proxyUrls": ["http://<PROXY_USER>:<PASS>@<HOST>:<PORT>"]}}

Or use environment variables:

    export HTTP_PROXY=http://<PROXY_USER>:<PASS>@<HOST>:<PORT>
    export HTTPS_PROXY=http://<PROXY_USER>:<PASS>@<HOST>:<PORT>
Best practice: Store proxy credentials as secrets.
TODO: Consider proxy rotation for large-scale scraping.
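Proxy passwords often contain characters such as `@` or `:` that break a URL if pasted in verbatim. The sketch below percent-encodes the credentials before building the `proxyUrls` entry. Both helper names are hypothetical, and the `proxyConfiguration` shape is taken from the custom proxy example above.

```python
from typing import Any, Dict
from urllib.parse import quote


def build_proxy_url(user: str, password: str, host: str, port: int,
                    scheme: str = "http") -> str:
    """Build a proxy URL with percent-encoded credentials."""
    # safe="" forces encoding of reserved characters such as "@", ":" and "/".
    return f"{scheme}://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}"


def custom_proxy_configuration(proxy_url: str) -> Dict[str, Any]:
    """Wrap one proxy URL in the structure shown in the custom proxy example."""
    return {"proxyConfiguration": {"proxyUrls": [proxy_url]}}
```

In practice the user and password would come from secrets or environment variables rather than literals in the script.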
References
Inferred from main.py
- Fetches data from external YouTube transcript APIs.
- Supports fallback transcript extraction.
- Uses proxy handling and retry logic for stability.
- Exports formatted text with timestamps.
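The combination of retry logic and fallback extraction listed above can be sketched generically. This is not the code from main.py, which is not shown here. `fetch_with_fallback` and the `Fetcher` type are hypothetical, and each source is modeled as a plain callable that returns a list of segments, an empty result when no transcript exists, or raises on a transient error.

```python
import time
from typing import Callable, List, Optional, Sequence

# A source takes a video ID and returns segments, or None / [] if it has none.
Fetcher = Callable[[str], Optional[List[dict]]]


def fetch_with_fallback(video_id: str, sources: Sequence[Fetcher],
                        retries: int = 2,
                        backoff_seconds: float = 0.0) -> Optional[List[dict]]:
    """Try each source in order, retrying errors before falling back."""
    for fetch in sources:
        for attempt in range(retries + 1):
            try:
                segments = fetch(video_id)
            except Exception:
                # Treat any error as transient: back off, then retry this source.
                if attempt < retries:
                    time.sleep(backoff_seconds * (2 ** attempt))
                continue
            if segments:
                return segments
            # The source answered but had no transcript: go to the next source.
            break
    return None
```

With two sources, the second is consulted only after the first has either exhausted its retries or reported that no transcript exists.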
Why this Actor
YouTube Transcript Fetcher is built for professionals who need transcripts quickly, reliably, and at scale. It is aimed at analysts, educators, and developers.
Run this Actor on Apify Console to get transcripts in seconds.