TED Talk Transcript Scraper — TXT, SRT & VTT (No Login)
Pricing
from $1.00 / 1,000 records returned
Extract any TED Talk's transcript via TED's own public API — no login, no ASR. Full text, timestamped segments & SRT/VTT in any available language, plus speaker, views, topics and TED's AI takeaway. Point it at talk URLs or a topic/speaker page. $2 per 1,000 talks.
Developer
Scrapers Delight
Maintained by Community
3 hours ago
Last modified
🎤 TED Talk Transcript Scraper — TXT, SRT, VTT
Get any TED Talk's transcript instantly, with no login and no AI transcription. TED publishes transcripts for its talks, often in dozens of languages, and this actor reads them straight from TED's own API: full text, timestamped segments, and ready-to-use SRT/VTT, plus the speaker, view count, topics, and TED's AI takeaway. Point it at talk URLs or a whole TED topic/speaker page.
Because the transcript already exists, there's no speech-to-text compute — it's fast and cheap.
What does it do?
For each TED talk you give it (by URL or harvested from a TED page), it returns:
- 📝 Full transcript (plain text) — always included
- ⏲️ Timestamped segments: `{start, end, text}`
- 🎬 SRT / VTT subtitles, ready for any editor
- 🎤 Speaker, duration, view count, recorded/published dates
- 🏷️ Topics + 💡 TED's AI takeaway headline
- 🌍 Any available language
No ASR, no API key — it reads TED's published transcript.
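To illustrate how the timestamped segments relate to the SRT output, here is a minimal Python sketch that renders `{start, end, text}` segments as SRT. It assumes `start` and `end` are seconds (the listing does not state the unit), and the helper names are hypothetical. The actor already returns `srt` itself, so this is only useful if you want to rebuild or re-time subtitles on your side.

```python
def format_srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: list) -> str:
    """Render a list of {start, end, text} dicts as an SRT document.

    Assumption: start/end are numbers expressed in seconds.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_srt_time(seg['start'])} --> {format_srt_time(seg['end'])}\n"
            f"{seg['text'].strip()}"
        )
    # SRT cues are separated by a blank line.
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    demo = [
        {"start": 0.0, "end": 2.5, "text": "First line."},
        {"start": 2.5, "end": 5.0, "text": "Second line."},
    ]
    print(segments_to_srt(demo))
```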
What data does it extract?
For every talk:
- 🆔 `talk_id`, `slug`, 🔗 `url`, 🏷️ `title`
- 🎤 `speaker`, ⏱️ `duration_sec`, 👁️ `views`, 📅 `recorded_on`, `published_at`
- 🏷️ `topics[]`, 💡 `takeaway_headline`, 📝 `description`
- 🌍 `language`, 📄 `transcript`, ⏲️ `segments[]`, 🎬 `srt`, `vtt`, `segment_count`
- ✨ `is_new` (monitor), 🕒 `scraped_at`
Who is it for?
- ✍️ Writers & content teams repurposing talks into articles, quotes, and summaries.
- 🤖 AI / RAG dataset builders assembling clean, multilingual speech text.
- 🔎 Researchers & educators searching talk content and citing passages.
- 🌍 Localization teams pulling transcripts across languages.
How to use it (step by step)
- Click Try for free.
- Paste one or more talk URLs (e.g. `https://www.ted.com/talks/{slug}`) or a TED topic/speaker page URL.
- (Optional) set a language and extra formats (`srt`, `vtt`, `segments`).
- Click Start, then open the Dataset tab to view/export.
- (Optional) set monitorMode + a pageUrl + a Schedule to capture new talks automatically.
Quick start
{ "talkUrls": ["https://www.ted.com/talks/bill_gates_the_next_outbreak_we_re_not_ready"], "transcriptFormats": ["txt", "srt"] }
Input
| Field | What it does |
|---|---|
| `talkUrls` | TED talk URLs / slugs |
| `pageUrl` | a TED topic/speaker/playlist page to harvest talk links from |
| `language` | transcript language code (default en) |
| `transcriptFormats` | txt · segments · srt · vtt |
| `includeTakeaways` | add topics, description, and TED's AI takeaway |
| `maxTalks` | hard cap per run (0 = unlimited) |
| `monitorMode`, `alertOnNewTalk` | recurring new-talk watcher + alerts |
| `webhookUrl`, `slackWebhookUrl`, `emailRecipients` | alert channels |
| `proxyConfiguration`, `requestConcurrency` | proxy + parallelism |
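If you assemble the input programmatically, a small helper can catch typos before a run starts. The sketch below is a hypothetical helper, not part of the actor: the field names come from the table above, while the `txt`-only default and the rule that at least one of `talkUrls` or `pageUrl` is required are assumptions.

```python
import json

# Format names as listed for transcriptFormats in the input table.
ALLOWED_FORMATS = {"txt", "segments", "srt", "vtt"}


def build_run_input(talk_urls=None, page_url=None, language="en",
                    formats=("txt",), max_talks=0):
    """Build an actor input dict from the documented fields.

    Assumptions: at least one of talkUrls/pageUrl is needed, and
    "txt" is a sensible default format.
    """
    if not talk_urls and not page_url:
        raise ValueError("provide at least one of talkUrls or pageUrl")
    unknown = set(formats) - ALLOWED_FORMATS
    if unknown:
        raise ValueError(f"unsupported transcriptFormats: {sorted(unknown)}")

    run_input = {
        "language": language,
        "transcriptFormats": list(formats),
        "maxTalks": max_talks,  # 0 = unlimited, per the table
    }
    if talk_urls:
        run_input["talkUrls"] = list(talk_urls)
    if page_url:
        run_input["pageUrl"] = page_url
    return run_input


if __name__ == "__main__":
    example = build_run_input(
        talk_urls=["https://www.ted.com/talks/bill_gates_the_next_outbreak_we_re_not_ready"],
        formats=["txt", "srt"],
    )
    print(json.dumps(example, indent=2))
```

The printed JSON can be pasted into the actor's input editor. With the Apify Python client, a dict like this is typically passed as `run_input` to `client.actor(...).call(...)`; the actor ID is left out here because the listing does not give it.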
Output
Each talk is one dataset record (fields above). Export to JSON, CSV, Excel, HTML, or RSS, or fetch via the Apify API.
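For the API route, a minimal sketch using only the Python standard library might look like the following. It targets Apify's dataset items endpoint (`/v2/datasets/{datasetId}/items`) with the `format` query parameter; the function names are hypothetical, and the dataset ID and token are values you supply from your own run and account.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id: str, fmt: str = "json") -> str:
    """Build the dataset items URL for a given export format (json, csv, ...)."""
    query = urllib.parse.urlencode({"format": fmt})
    safe_id = urllib.parse.quote(dataset_id, safe="")
    return f"{API_BASE}/datasets/{safe_id}/items?{query}"


def fetch_records(dataset_id: str, token: str) -> list:
    """Download all records of a dataset as a list of dicts (JSON format)."""
    request = urllib.request.Request(
        dataset_items_url(dataset_id),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Placeholder ID; replace with the dataset ID of your own run.
    print(dataset_items_url("YOUR_DATASET_ID", "csv"))
```

Each item returned by `fetch_records` would be one talk record with the fields listed earlier, such as `talk_id`, `title`, and `transcript`.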
How much does it cost?
Pay-per-event — and with no transcription compute, it's cheap:
| Event | What it covers | Suggested price |
|---|---|---|
| `lot-scraped` | each talk returned | ~$0.003 / talk |
| `lot-detail-enriched` | each transcript fetched | ~$0.003 / talk |
| `monitor-run-completed` | each scheduled watch run | ~$0.05 / run |
| `new-lot-detected` | each new talk | ~$0.02 / talk |
| `alert-delivered` | each Slack/email/webhook push | ~$0.005 / alert |
(Final per-event prices are set on the actor's pricing page.)
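As a rough worked example, the sketch below multiplies event counts by the suggested prices from the table. The prices are the suggested figures only, and which events fire for a given talk is an assumption here, so treat the result as an estimate rather than a quote.

```python
# Suggested per-event prices in USD, copied from the table above.
SUGGESTED_PRICES = {
    "lot-scraped": 0.003,
    "lot-detail-enriched": 0.003,
    "monitor-run-completed": 0.05,
    "new-lot-detected": 0.02,
    "alert-delivered": 0.005,
}


def estimate_cost(event_counts: dict) -> float:
    """Return the estimated cost in USD for a mapping of event name -> count."""
    total = sum(
        SUGGESTED_PRICES[event] * count
        for event, count in event_counts.items()
    )
    # Round to hide floating-point noise in the cents.
    return round(total, 4)


if __name__ == "__main__":
    # Assumption: each talk triggers one scrape event and one transcript event.
    one_off = {"lot-scraped": 1000, "lot-detail-enriched": 1000}
    print(f"1,000 talks with transcripts: ${estimate_cost(one_off):.2f}")
```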
How does it work without AI transcription?
TED publishes a human/edited transcript for each talk, in many languages, and exposes it through a public API. This actor reads that existing transcript — it does not run speech-to-text, so there's no GPU/compute cost and results are instant.
Is it legal to scrape TED transcripts?
TED talks and their transcripts are published publicly, and TED talks are generally released under a Creative Commons (BY-NC-ND) license. The output is talk content and public stats, not personal data. Scraping public data is generally legal, but you are responsible for how you use it: review TED's Terms of Service and the talks' Creative Commons license, and attribute and limit redistribution accordingly.
FAQ
Which languages?
Whatever TED offers for the talk (often dozens). Set language; talks without that language are flagged.
Is there a Whisper/ASR step?
No, it reads TED's own transcript, so it's fast and cheap.
Can I get subtitles?
Yes — add srt and/or vtt to transcriptFormats.
Can I grab a whole topic or speaker's talks?
Yes — set pageUrl to a TED topic/speaker page and the actor harvests the talk links. Add monitorMode to catch new talks.
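If you would rather track new talks on your side, for example across manual exports, one simple approach is to compare `talk_id` values against a set you persist between runs. The sketch below is a consumer-side illustration with a hypothetical function name; it is not a description of how the actor computes `is_new` internally.

```python
def split_new_talks(records: list, seen_ids: set) -> tuple:
    """Return (new_records, updated_seen_ids) for a batch of talk records.

    A record counts as new if its talk_id is not in seen_ids and has not
    already appeared earlier in the same batch.
    """
    updated = set(seen_ids)  # copy so the caller's set is not mutated
    new_records = []
    for record in records:
        talk_id = record["talk_id"]
        if talk_id not in updated:
            new_records.append(record)
            updated.add(talk_id)
    return new_records, updated


if __name__ == "__main__":
    seen = {"101"}
    batch = [
        {"talk_id": "101", "title": "Already seen"},
        {"talk_id": "202", "title": "Fresh talk"},
    ]
    fresh, seen = split_new_talks(batch, seen)
    print([talk["title"] for talk in fresh])
```

Persisting `seen` (for instance as a JSON file) between scheduled runs is left out for brevity.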
How do I export?
JSON, CSV, Excel, HTML, or RSS from the Dataset tab, or via the Apify API.
Feedback
Want full-playlist crawling, speaker bios, or another language default? Open an issue on the actor.