Yt Ai Summarizer
Pricing: Pay per event
Turn any public YouTube video URL into a multilingual transcript, subtitles, timing data, and an AI summary for faster content reuse and analysis. The Actor detects the language automatically and produces cleanly segmented, precisely timed output in one run: plain transcript, SRT, VTT, and TSV.
Developer: Krzysztof
YouTube AI Transcriber Actor
What does YouTube AI Transcriber do?
YouTube AI Transcriber automatically turns any public YouTube video URL into clean transcripts and subtitle files. Provide a single video URL as input and the Actor handles audio retrieval, speech recognition, language detection, segmentation and AI summary generation for you on the Apify platform.
Use it to quickly transcribe long videos, extract study notes, pull quotable lines, prepare timed captions, or obtain an instant AI-generated summary, all without manual effort.
Why use this Actor?
Long-form video takes a long time to watch. This Actor converts the speech in a video into structured text that you can filter, search, and reuse. It accelerates:
- Content research & editorial workflows
- Lesson and lecture preparation
- Journalism & fact extraction
- Study note creation & revision aids
- Competitive analysis & market scanning
- Accessibility (captions / alternative formats)
Running on Apify gives you scheduling, monitoring, API access, proxy management, webhooks, integrations, and scalable storage—so you can automate batch processing, trigger downstream pipelines, and export results in multiple formats reliably.
Key Features
- AI-Powered Summaries (optional): Generate intelligent summaries of transcripts using advanced AI models.
- Automatic language detection (no need to specify language; override supported if desired).
- Clean segmentation with precise start/end timings for each text block.
- Multiple export formats generated in one run: plain text, subtitles (SRT / VTT), tabular timing data, structured JSON.
- Dataset + Key‑Value Store outputs for easy API retrieval and automation.
- Designed for single-URL simplicity—just paste a video link and run.
- Built-in resilience & retry logic to improve success on slower connections.
- Leverages Apify platform advantages: scheduling, webhooks, dataset exports (JSON, CSV, Excel), API, and proxy rotation.
Example results
You can browse sample run results here:
- summaryUrl: https://api.apify.com/v2/key-value-stores/YlZlBDpzfNNXGrv5D/records/summary.txt
- transcriptJsonUrl: https://api.apify.com/v2/key-value-stores/YlZlBDpzfNNXGrv5D/records/transcript.json
- transcriptTxtUrl: https://api.apify.com/v2/key-value-stores/YlZlBDpzfNNXGrv5D/records/transcript.txt
- transcriptSrtUrl: https://api.apify.com/v2/key-value-stores/YlZlBDpzfNNXGrv5D/records/transcript.srt
- transcriptVttUrl: https://api.apify.com/v2/key-value-stores/YlZlBDpzfNNXGrv5D/records/transcript.vtt
- transcriptTsvUrl: https://api.apify.com/v2/key-value-stores/YlZlBDpzfNNXGrv5D/records/transcript.tsv
- transcriptJsonCompactUrl: https://api.apify.com/v2/key-value-stores/YlZlBDpzfNNXGrv5D/records/transcript.writer.json
What data can this Actor extract?
The Actor produces:
- A dataset record (one item per run / video) containing core metadata and convenience fields.
- Multiple transcript artifacts stored in the default Key‑Value Store (structured JSON, writer JSON, TXT, SRT, VTT, TSV, optional summary).
Dataset Record / Output Fields (canonical)
| Field | Description | Example |
|---|---|---|
| url | Original YouTube video URL | https://www.youtube.com/watch?v=CiREU-RAlf8 |
| videoId | YouTube video ID | CiREU-RAlf8 |
| title | Video title | How Does Ageing Impact Muscle Growth and Strength? Ft. Lyle McDonald |
| durationSeconds | Video duration in seconds | 503.037125 |
| language | Detected language code | en |
| segmentCount | Number of transcript segments | 76 |
| generated_formats | List of generated transcript format short codes | ["txt","vtt","srt","tsv","json"] |
| transcriptKey | KV store key for structured transcript JSON | transcript.json |
| resultsDataset | Dataset ID containing transcript segments | 9hP3aeAjoamHrYxFa |
| resultsDatasetUrl | API URL to fetch dataset items as JSON | https://api.apify.com/v2/datasets/<id>/items?format=json |
| transcriptJsonUrl | Structured transcript (transcript.json) | .../transcript.json |
| transcriptJsonCompactUrl | Compact writer JSON (transcript.writer.json) | .../transcript.writer.json |
| transcriptTxtUrl | Plain text transcript (transcript.txt) | .../transcript.txt |
| transcriptSrtUrl | SubRip subtitles (transcript.srt) | .../transcript.srt |
| transcriptVttUrl | WebVTT subtitles (transcript.vtt) | .../transcript.vtt |
| transcriptTsvUrl | Tab-separated timings (transcript.tsv) | .../transcript.tsv |
| summaryUrl | AI-generated summary (summary.txt) | .../summary.txt |
| formatUrls | Map of format code -> direct URL (includes structured, json, summary) | { "txt": "...", "structured": "..." } |
Structured transcript (transcript.json) contains full segment objects (start, end, text). Writer JSON (transcript.writer.json) is a compact variant optimized for certain downstream tooling. Subtitles (SRT/VTT) contain timed caption blocks; TSV contains per-segment timings (start/end ms) and text.
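To illustrate how segment objects relate to the subtitle files, here is a minimal sketch that renders a list of segments as SRT text. It assumes each segment is a dict with `start` and `end` expressed in seconds plus a `text` field; the source names these fields but does not state the time unit in transcript.json, so treat the unit as an assumption. The helper names `srt_timestamp` and `segments_to_srt` are hypothetical and are not part of the Actor.

```python
from typing import Iterable, Mapping


def srt_timestamp(seconds: float) -> str:
    """Format an offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: Iterable[Mapping]) -> str:
    """Render segments as numbered SRT caption blocks.

    Assumption: 'start' and 'end' are in seconds (not confirmed by the source).
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        timing = f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}"
        blocks.append(f"{index}\n{timing}\n{seg['text'].strip()}\n")
    # A blank line separates consecutive caption blocks.
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [
        {"start": 0.0, "end": 2.5, "text": " Hello "},
        {"start": 2.5, "end": 4.0, "text": "World"},
    ]
    print(segments_to_srt(demo))
```

The Actor already writes transcript.srt for you, so a helper like this is mainly useful if you filter or merge segments first and need to regenerate captions afterwards.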
Output Example (Actor Output Object)
{
  "videoId": "CiREU-RAlf8",
  "language": "en",
  "segmentCount": 76,
  "durationSeconds": 503.037125,
  "transcriptKey": "transcript.json",
  "resultsDataset": "9hP3aeAjoamHrYxFa",
  "resultsDatasetUrl": "https://api.apify.com/v2/datasets/9hP3aeAjoamHrYxFa/items?format=json",
  "transcriptJsonUrl": "https://api.apify.com/v2/key-value-stores/<storeId>/records/transcript.json",
  "transcriptTxtUrl": "https://api.apify.com/v2/key-value-stores/<storeId>/records/transcript.txt",
  "transcriptSrtUrl": "https://api.apify.com/v2/key-value-stores/<storeId>/records/transcript.srt",
  "transcriptVttUrl": "https://api.apify.com/v2/key-value-stores/<storeId>/records/transcript.vtt",
  "transcriptTsvUrl": "https://api.apify.com/v2/key-value-stores/<storeId>/records/transcript.tsv",
  "transcriptJsonCompactUrl": "https://api.apify.com/v2/key-value-stores/<storeId>/records/transcript.writer.json",
  "summaryUrl": "https://api.apify.com/v2/key-value-stores/<storeId>/records/summary.txt",
  "formatUrls": {
    "txt": "https://api.apify.com/v2/key-value-stores/<storeId>/records/transcript.txt",
    "vtt": "https://api.apify.com/v2/key-value-stores/<storeId>/records/transcript.vtt",
    "srt": "https://api.apify.com/v2/key-value-stores/<storeId>/records/transcript.srt",
    "tsv": "https://api.apify.com/v2/key-value-stores/<storeId>/records/transcript.tsv",
    "json": "https://api.apify.com/v2/key-value-stores/<storeId>/records/transcript.writer.json",
    "structured": "https://api.apify.com/v2/key-value-stores/<storeId>/records/transcript.json",
    "summary": "https://api.apify.com/v2/key-value-stores/<storeId>/records/summary.txt"
  }
}
Sample Dataset Record
{
  "url": "https://www.youtube.com/watch?v=CiREU-RAlf8",
  "videoId": "CiREU-RAlf8",
  "title": "How Does Ageing Impact Muscle Growth and Strength? Ft. Lyle McDonald",
  "durationSeconds": 503.037125,
  "language": "en",
  "segmentCount": 76,
  "generated_formats": ["txt", "vtt", "srt", "tsv", "json"],
  "transcriptKey": "transcript.json",
  "resultsDataset": "9hP3aeAjoamHrYxFa",
  "resultsDatasetUrl": "https://api.apify.com/v2/datasets/9hP3aeAjoamHrYxFa/items?format=json",
  "transcriptJsonUrl": "https://api.apify.com/v2/key-value-stores/<storeId>/records/transcript.json",
  "transcriptTxtUrl": "https://api.apify.com/v2/key-value-stores/<storeId>/records/transcript.txt",
  "transcriptSrtUrl": "https://api.apify.com/v2/key-value-stores/<storeId>/records/transcript.srt",
  "transcriptVttUrl": "https://api.apify.com/v2/key-value-stores/<storeId>/records/transcript.vtt",
  "transcriptTsvUrl": "https://api.apify.com/v2/key-value-stores/<storeId>/records/transcript.tsv",
  "transcriptJsonCompactUrl": "https://api.apify.com/v2/key-value-stores/<storeId>/records/transcript.writer.json",
  "summaryUrl": "https://api.apify.com/v2/key-value-stores/<storeId>/records/summary.txt",
  "formatUrls": {
    "txt": "https://api.apify.com/v2/key-value-stores/<storeId>/records/transcript.txt",
    "vtt": "https://api.apify.com/v2/key-value-stores/<storeId>/records/transcript.vtt",
    "srt": "https://api.apify.com/v2/key-value-stores/<storeId>/records/transcript.srt",
    "tsv": "https://api.apify.com/v2/key-value-stores/<storeId>/records/transcript.tsv",
    "json": "https://api.apify.com/v2/key-value-stores/<storeId>/records/transcript.writer.json",
    "structured": "https://api.apify.com/v2/key-value-stores/<storeId>/records/transcript.json",
    "summary": "https://api.apify.com/v2/key-value-stores/<storeId>/records/summary.txt"
  }
}
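When consuming a dataset record in code, the `formatUrls` map is probably the most convenient entry point. The sketch below looks up a direct URL by format code; note that, per the record shown above, `json` points at the compact transcript.writer.json while `structured` points at transcript.json. The helper name `url_for_format` is hypothetical, and the sample record is trimmed to the fields the helper touches.

```python
def url_for_format(record: dict, fmt: str) -> str:
    """Return the direct URL for a format code from a dataset record.

    Raises KeyError with the available codes if the format was not generated
    (for example, 'summary' when the AI summary was not enabled).
    """
    urls = record.get("formatUrls") or {}
    if fmt not in urls:
        available = ", ".join(sorted(urls)) or "none"
        raise KeyError(f"format {fmt!r} not in record (available: {available})")
    return urls[fmt]


if __name__ == "__main__":
    # Trimmed sample; <storeId> is the placeholder used in the examples above.
    base = "https://api.apify.com/v2/key-value-stores/<storeId>/records"
    record = {
        "videoId": "CiREU-RAlf8",
        "formatUrls": {
            "srt": f"{base}/transcript.srt",
            "json": f"{base}/transcript.writer.json",
            "structured": f"{base}/transcript.json",
        },
    }
    print(url_for_format(record, "structured"))
```

Failing loudly on a missing format is a deliberate choice here: a silent `None` tends to surface much later as a confusing download error.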
Pricing & Run Expectations
Pricing is event-based so that run costs are predictable and transparent.
If your Apify account uses Pay‑Per‑Event pricing, the following usage events are charged:
| Event name | When it happens | How the count is calculated |
|---|---|---|
| run-start | As soon as a run begins (after input validation) | Always 1 per run |
| transcription-minute | After successful transcription | ceil(video duration in seconds / 60) |
| summary-token | After AI summary generation (only if summary enabled) | Total AI model tokens used for the summary |
Practical examples
| Video length | Minutes charged | Summary enabled? | Total Cost |
|---|---|---|---|
| 62 seconds | 2 | No | $0.001 + $0.10 = $0.101 |
| 8m 23s | 9 | Yes | $0.001 + $0.45 + $0.114 = $0.565 |
| 42m 00s | 42 | No | $0.001 + $2.10 = $2.101 |
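The arithmetic behind these estimates can be sketched as follows. The unit prices are assumptions inferred from the 8m 23s and 42m rows ($0.001 per run start and $0.05 per transcription minute); they are not quoted from an official price list, so check the Actor's pricing tab for current values. The summary charge depends on token usage, so it is passed in as an already known dollar amount.

```python
import math
from decimal import Decimal

# Assumed unit prices, inferred from the worked examples in the table above.
RUN_START_PRICE = Decimal("0.001")
PRICE_PER_MINUTE = Decimal("0.05")


def charged_minutes(duration_seconds: float) -> int:
    """Number of transcription-minute events: ceil(duration in seconds / 60)."""
    return math.ceil(duration_seconds / 60)


def estimate_cost(duration_seconds: float,
                  summary_cost: Decimal = Decimal("0")) -> Decimal:
    """Estimate the total run cost in USD under the assumed unit prices."""
    minutes = charged_minutes(duration_seconds)
    return RUN_START_PRICE + minutes * PRICE_PER_MINUTE + summary_cost


if __name__ == "__main__":
    # 8m 23s video with a summary charge of $0.114, as in the table.
    print(charged_minutes(503), estimate_cost(503, Decimal("0.114")))
```

`Decimal` is used instead of floats so that sums of small dollar amounts stay exact.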
Tips & Best Practices
- Provide clean, single video URLs (avoid playlists unless specifying a single item link).
- If automatic detection picks the wrong language, supply the language code explicitly.
- Use dataset exports for bulk text analytics (e.g., sentiment, topic clustering) downstream.
- Subtitle files (SRT/VTT) can be imported directly into most video editors or LMS platforms.
- Schedule runs for recurring channels once playlist/batch mode becomes available (see future improvements).
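As a pre-processing step for the first tip, the sketch below reduces a YouTube link to a plain single-video URL, dropping playlist and timestamp parameters, and rejects links that do not identify one video. It is a client-side helper you could run before starting the Actor; the function name `single_video_url` and the list of accepted hosts are assumptions, and the Actor may well accept other URL shapes.

```python
from urllib.parse import parse_qs, urlparse

# Hosts treated as YouTube watch pages in this sketch (an assumption).
WATCH_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def single_video_url(url: str) -> str:
    """Return a canonical watch URL containing only the video ID.

    Raises ValueError for playlist-only links or unrecognized URLs.
    """
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    video_id = ""
    if host == "youtu.be":
        # Short links carry the ID as the first path component.
        video_id = parsed.path.lstrip("/").split("/")[0]
    elif host in WATCH_HOSTS and parsed.path == "/watch":
        video_id = parse_qs(parsed.query).get("v", [""])[0]
    if not video_id:
        raise ValueError(f"No single video ID found in URL: {url!r}")
    return f"https://www.youtube.com/watch?v={video_id}"


if __name__ == "__main__":
    print(single_video_url(
        "https://www.youtube.com/watch?v=CiREU-RAlf8&list=PL123&index=2"
    ))
```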
Future Improvements
Planned enhancements (excluding already delivered AI summaries):
- Playlist / batch input (process multiple URLs in one run)
- Speaker diarization (label speakers in transcripts)
- Sentiment & topic tagging enrichment
- Configurable output format selection (generate only needed files)
- More advanced summary styles (bullets, Q&A, multi-length abstraction)
FAQ
Is it legal to transcribe YouTube videos?
Public videos can generally be processed for personal or analytical use. Respect copyright and platform terms. Avoid redistributing transcripts if not permitted.
Can I process private or region-locked videos?
No. Only videos that are publicly accessible and playable from the Actor's run environment can be processed.
Do I need to specify the language?
No—automatic detection runs by default. Provide a language code if consistency is critical.
Are AI summaries supported now?
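Yes. When the optional summary is enabled, the run also produces summary.txt alongside the transcript files.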
Why is my runtime slower on very long videos?
Runtime scales with total audio duration. Network variability and video encoding also affect speed.
Can I integrate this with other tools?
Yes—use the Apify API, webhooks, or Zapier/Make integrations to trigger downstream processing.
Does it summarize live streams?
No. The Actor currently handles completed videos only; live or ongoing streams are not yet supported.
Disclaimer & Ethical Use
Results may include personal statements spoken by creators. Avoid extracting or publishing sensitive information without consent. Ensure usage aligns with local regulations and YouTube Terms of Service. If in doubt, consult legal counsel.
Our solution extracts only publicly available speech content. Personal data protections (e.g. GDPR) may still apply; evaluate lawful basis before large-scale processing.
Support & Feedback
Encountered an issue or need a custom enhancement (e.g., batch playlists, speaker diarization)? Open an issue, leave a comment on the Actor page, or reach out via Apify support. Feedback helps prioritize future improvements.
Use Cases Recap
- Rapid video content triage for editors & journalists
- Study aids and note generation for educational channels
- Caption preparation and adaptation
- Research aggregation across thematic videos
- Accessibility enhancement (text alternatives)
Leverage Apify scheduling + API to build automated video intelligence pipelines with transcript search, preview summaries, and downstream analytics.