Youtube Shorts Transcript Scraper

Pricing

$5.99/month + usage

🚀 Turn YouTube Shorts into clean, timed transcripts ⏱️. Get precise timestamps, durations, and readable output 📄. Bulk processing, proxy support for stability 🌐, and structured data ready for analysis, SEO, or content reuse ✨.

Developer

Neuro Scraper

Maintained by Community

🌟 YouTube Shorts Transcript Fetcher (Shorts-only)

Instant, reliable transcripts for YouTube Shorts, optimized for short-form content.


📖 Overview

This Actor fetches clean, timestamped transcripts specifically from YouTube Shorts (including youtu.be short links and /shorts URLs). It is optimized for short durations, fast responses, and batch processing of multiple Shorts.


💡 Why Shorts-only?

  • Shorts are brief, so timestamps and compact formatting matter.
  • Many Shorts use alternative URL forms (youtu.be and /shorts), and this Actor normalizes them.
  • Faster runs and smaller payloads reduce cost when processing thousands of short clips.
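
The URL normalization mentioned above can be illustrated with a minimal Python sketch. This is not the Actor's actual implementation; the function names `extract_video_id` and `normalize_shorts_url` are hypothetical, and the sketch assumes that video IDs consist only of letters, digits, `_`, and `-`, and that input URLs include a scheme.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: video IDs use only URL-safe characters.
_ID_RE = re.compile(r"^[A-Za-z0-9_-]+$")


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a /shorts/, youtu.be, or watch?v= URL, else None."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    parts = [p for p in parsed.path.split("/") if p]

    candidate = None
    if host == "youtu.be" and parts:
        # Short link: the ID is the first path segment.
        candidate = parts[0]
    elif host in ("youtube.com", "m.youtube.com"):
        if len(parts) >= 2 and parts[0] == "shorts":
            candidate = parts[1]
        elif parts and parts[0] == "watch":
            candidate = parse_qs(parsed.query).get("v", [None])[0]

    return candidate if candidate and _ID_RE.match(candidate) else None


def normalize_shorts_url(url: str) -> Optional[str]:
    """Map any recognized URL form to one canonical /shorts/ URL."""
    video_id = extract_video_id(url)
    return f"https://www.youtube.com/shorts/{video_id}" if video_id else None
```

Normalizing before submission also makes it easy to de-duplicate a batch in which the same Short appears under several URL forms.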

🔧 Key Features

  • ✅ Short-specific URL normalization (/shorts, youtu.be).
  • ✅ Dual-source transcript fetching with smart fallback.
  • ✅ Compact timestamped output tailored for clips under 60 seconds.
  • ✅ Batch input support and configurable concurrency.
  • ✅ Proxy-compatible and production-ready.

⚡ Quick Start: Console

  1. Open the Actor on Apify Console.
  2. Paste one or more Shorts URLs into the input (regular, /shorts/ or youtu.be links accepted).
  3. Click Run β€” view transcript items in the Dataset.

⚙️ Quick Start: CLI & Python

CLI

$ apify call neuro-scraper/youtube-shorts-transcript --input-file ./shorts_input.json

Python (apify-client)

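from apify_client import ApifyClient

# Authenticate with your Apify API token.
client = ApifyClient('<APIFY_TOKEN>')

# Start the Actor and wait for the run to finish.
run = client.actor('neuro-scraper/youtube-shorts-transcript').call(
    run_input={"startUrls": [{"url": "https://www.youtube.com/shorts/EXAMPLE"}]}
)

# Print each Short's transcript by joining the text of its segments.
for item in client.dataset(run['defaultDatasetId']).iterate_items():
    print(' '.join(seg['text'] for seg in item.get('segments', [])))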

📝 Inputs (Shorts-focused)

| Name | Type | Required | Default | Example | Notes |
| --- | --- | --- | --- | --- | --- |
| startUrls | array | Yes | [] | [{"url":"https://www.youtube.com/shorts/abcd1234"}] | List of Shorts URLs |
| workers | integer | Optional | 5 | 10 | Max concurrent fetches |
| proxyConfiguration | object | Optional | {} | {"useApifyProxy": true} | Proxy settings |

Example input (Console JSON):

{
  "startUrls": [
    {"url": "https://www.youtube.com/shorts/abcd1234"},
    {"url": "https://youtu.be/abcd1234"}
  ],
  "workers": 5,
  "proxyConfiguration": {"useApifyProxy": true}
}
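
For larger batches, the input file can be generated rather than written by hand. The following is a minimal sketch that uses only the documented input fields (`startUrls`, `workers`, `proxyConfiguration`); the helper names `build_input` and `write_input` are hypothetical and not part of the Actor.

```python
import json
from pathlib import Path


def build_input(urls, workers=5, use_apify_proxy=True):
    """Build an input payload from an iterable of Shorts URLs."""
    seen = set()
    start_urls = []
    for url in urls:
        url = url.strip()
        # Skip blank lines and exact duplicates while preserving order.
        if url and url not in seen:
            seen.add(url)
            start_urls.append({"url": url})
    return {
        "startUrls": start_urls,
        "workers": workers,
        "proxyConfiguration": {"useApifyProxy": use_apify_proxy},
    }


def write_input(urls, path="shorts_input.json", **kwargs):
    """Write the payload to a JSON file and return it."""
    payload = build_input(urls, **kwargs)
    Path(path).write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return payload
```

The resulting `shorts_input.json` can be passed to the CLI command shown earlier, or the dictionary returned by `build_input` can be used directly as `run_input` in the Python client.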

📄 Outputs (compact)

Each Dataset item returns a compact schema tuned for Shorts:

{
  "video_id": "AWBsoArakNY",
  "title": "Who Has The Fastest Reaction Time?",
  "url": "https://youtube.com/shorts/AWBsoArakNY?si=8ThAJzdEEA1PnZRk",
  "lang": "en",
  "format": "vtt",
  "segments": [
    {
      "start": 3.919,
      "end": 5.99,
      "text": "I [screaming] WON. I WON. OH, I SHOULD",
      "duration": 2.071,
      "duration_seconds": 2,
      "duration_milliseconds": 71,
      "duration_seconds_with_ms": "2.071",
      "duration_minutes": 0.03451666666666667,
      "start_ts": "00:00:03.919",
      "end_ts": "00:00:05.990",
      "display": "Transcripts:\nStart: 00:00:03.919 End: 00:00:05.990\nDuration: 2.071 seconds (0.034517 minutes)\n\nNo transcript added for this duration."
    }
  ]
}

Outputs are easy to export (JSON, CSV, Excel) and optimized to contain only essential text and timestamps.
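
As an illustration of working with this schema, the sketch below flattens one Dataset item into per-segment rows and joins the segment texts into a single transcript. The helper names (`format_ts`, `segments_to_rows`, `plain_text`) are hypothetical; the field names come from the example item above, and the timestamp format is inferred from its `start_ts` and `end_ts` values.

```python
def format_ts(seconds: float) -> str:
    """Format seconds as HH:MM:SS.mmm, matching the start_ts/end_ts style."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def segments_to_rows(item: dict) -> list:
    """Flatten one Dataset item into one row per segment (e.g. for CSV export)."""
    rows = []
    for seg in item.get("segments", []):
        rows.append({
            "video_id": item.get("video_id"),
            # Prefer the provided timestamps; fall back to computing them.
            "start_ts": seg.get("start_ts") or format_ts(seg["start"]),
            "end_ts": seg.get("end_ts") or format_ts(seg["end"]),
            "duration": round(seg["end"] - seg["start"], 3),
            "text": seg["text"].strip(),
        })
    return rows


def plain_text(item: dict) -> str:
    """Join all segment texts into one readable transcript string."""
    return " ".join(seg["text"].strip() for seg in item.get("segments", []))
```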


🔑 Environment Variables

  • APIFY_TOKEN: required.
  • HTTP_PROXY, HTTPS_PROXY: optional.
  • APIFY_PROXY_PASSWORD: set this when using Apify Proxy.

Store credentials securely as secrets, not plaintext.
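
One way to keep the token out of source code is to read it from the environment at startup and fail early if it is missing. This is a minimal sketch; `get_apify_token` is a hypothetical helper, not part of the Actor or of `apify-client`.

```python
import os


def get_apify_token(env=None) -> str:
    """Return APIFY_TOKEN from the environment, or raise if it is missing."""
    env = os.environ if env is None else env
    token = env.get("APIFY_TOKEN", "").strip()
    if not token:
        raise RuntimeError("APIFY_TOKEN is not set; export it or configure it as a secret.")
    return token
```

The returned value can then be passed to `ApifyClient(...)` in place of the `'<APIFY_TOKEN>'` placeholder.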


▢️ How to Run (short checklist)

  1. Open Apify Console → Actors → YouTube Shorts Transcript Fetcher.
  2. Paste Shorts-only input JSON or use CLI/Python.
  3. Run and inspect Dataset.

🛠 Logs & Troubleshooting

  • No transcript available: the Short may not have captions, or its audio may be muted.
  • Timeouts / failures: try reducing concurrency (workers) or enabling a proxy.

Monitor real-time logs in the Console Run Log panel.
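
The advice on timeouts can be applied programmatically before retrying a failed run. The sketch below is one possible approach, assuming the documented `workers` default of 5; the helper name `safer_input` and the halving strategy are illustrative choices, not behavior built into the Actor.

```python
import copy


def safer_input(run_input: dict, min_workers: int = 1) -> dict:
    """Return a copy of the input with halved concurrency and Apify Proxy enabled."""
    adjusted = copy.deepcopy(run_input)
    current = int(adjusted.get("workers", 5))  # 5 is the documented default
    adjusted["workers"] = max(min_workers, current // 2)
    proxy = adjusted.get("proxyConfiguration") or {}
    proxy["useApifyProxy"] = True
    adjusted["proxyConfiguration"] = proxy
    return adjusted
```

Because the original dictionary is left untouched, the same function can be applied repeatedly to back off further on consecutive failures.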


⏱ Scheduling & Webhooks

  • Schedule frequent short-run batches (e.g., every 15 minutes or hourly) for near-real-time ingestion.
  • Use Webhooks to push results to downstream systems after each run.
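
On the receiving side, a webhook handler typically needs the ID of the run's default dataset in order to fetch the transcripts. The sketch below assumes the webhook uses Apify's default payload layout, with an `eventType` field and the run object under `resource`; if a custom payload template is configured, the keys will differ. The function name is hypothetical.

```python
from typing import Optional


def dataset_id_from_webhook(payload: dict) -> Optional[str]:
    """Return the default dataset ID for a succeeded run, else None."""
    # Assumption: default Apify webhook payload layout.
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")
```

The returned ID can be passed to `client.dataset(...)` to read the items, as in the Python quick start.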

🔟 Changelog

  • 1.0.0 (2025-11-04): Initial Shorts-only release.

📝 Notes & TODO

  • TODO: Add demo GIF showing Console run for Shorts.
  • TODO: Add language detection flag for multilingual Shorts.

✅ Final note

This README is tailored for teams that process high volumes of YouTube Shorts and need compact, timestamped transcripts quickly and reliably.