TED Talks Scraper
Pricing
from $3.00 / 1,000 results
Developer
Crawler Bros
Maintained by Community
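Scrape TED.com talks and get the title, speaker, duration, view count, publish and record dates, topics, language, description, and thumbnail for each one. Six modes are available: fetch specific talks by URL or slug (byUrls), list the talks in a topic (byTopic), run a free-text search (bySearch), list a speaker's talks (bySpeaker), pull a playlist (byPlaylist), or browse the catalog sorted by popularity or date (browse).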
Pure HTTP, no auth, no proxy required. TED.com works from datacenter IPs.
What you get
Talk records (recordType=talk)
| Field | Description |
|---|---|
| `id` | TED talk ID |
| `slug` | URL slug (e.g. `sir_ken_robinson_do_schools_kill_creativity`) |
| `url` | Canonical talk URL |
| `title` | Talk title |
| `speaker` | Presenter name (`presenterDisplayName`) |
| `partnerName` | TEDx / TED Conference / Independent / etc. |
| `description` | Plain-text summary |
| `socialDescription` | Optional alternative summary used on social embeds (only when it differs from `description`) |
| `durationSeconds` | Talk length in seconds |
| `durationFormatted` | Human-readable `MM:SS` or `H:MM:SS` |
| `viewedCount` | All-time view count |
| `publishedAt` | ISO 8601 timestamp the talk was published on TED.com |
| `recordedOn` | ISO date the talk was recorded |
| `language` | ISO 639-1 language code |
| `featured` | `true` for featured talks |
| `curatorApproved` | `true` for curator-approved talks |
| `hasTranslations` | `true` if subtitle translations exist |
| `topics` | Array of topic names (e.g. creativity, business, psychology) |
| `thumbnailUrl` | Widescreen thumbnail URL |
| `relatedTalkSlugs` | Array of related talk slugs |
| `scrapedAt` | ISO 8601 UTC timestamp |
Empty fields are dropped from every record at every depth.
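To illustrate what "dropped at every depth" can mean for downstream code, here is a minimal Python sketch of a recursive pruning step. It is not the actor's own implementation. Which values count as "empty" is an assumption: this sketch treats `None`, empty strings, empty lists, and empty dicts as empty, and keeps `False` and `0`. The name `prune_empty` is hypothetical.

```python
from typing import Any


def _is_empty(value: Any) -> bool:
    # Assumption: only these four values count as "empty"; False and 0 are kept.
    return value is None or value == "" or value == [] or value == {}


def prune_empty(value: Any) -> Any:
    """Recursively remove empty values from nested dicts and lists."""
    if isinstance(value, dict):
        cleaned = {key: prune_empty(item) for key, item in value.items()}
        return {key: item for key, item in cleaned.items() if not _is_empty(item)}
    if isinstance(value, list):
        cleaned = [prune_empty(item) for item in value]
        return [item for item in cleaned if not _is_empty(item)]
    return value
```

Because keys can be absent, consumers should read optional fields with `record.get("socialDescription")` rather than indexing directly.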
Input
| Parameter | Type | Default | Description |
|---|---|---|---|
| `mode` | Enum | `byUrls` | `byUrls` / `byTopic` / `bySearch` / `bySpeaker` / `byPlaylist` / `browse` |
| `talkUrls` | Array | `["sir_ken_robinson_do_schools_kill_creativity"]` | Talk URLs or slugs (`mode=byUrls`) |
| `topic` | Enum | `creativity` | Curated TED topic slug from a 50-item dropdown (`mode=byTopic`) |
| `searchQuery` | String | (none) | Free-text search query (`mode=bySearch`) |
| `speaker` | String | (none) | TED speaker slug or URL (`mode=bySpeaker`) |
| `playlist` | String | (none) | TED playlist numeric ID or full URL (`mode=byPlaylist`) |
| `sort` | Enum | `popular` | `popular` / `newest` / `oldest` (`mode=browse`) |
| `minViews` | Integer | (none) | Drop talks with fewer views |
| `minDurationSeconds` / `maxDurationSeconds` | Integer | (none) | Filter by talk length |
| `language` | Enum | (no filter) | ISO 639-1 dropdown of TED's top-30 languages |
| `maxItems` | Integer | 25 | Hard cap (1-1000) |
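The constraints in the table can be checked before a run is started. The following Python sketch applies the documented defaults for `mode` and `maxItems`, enforces the 1-1000 cap, and requires the parameter that has no default in the three modes that need one. The function name `validate_input` and the error messages are hypothetical, and the actor's own validation may differ.

```python
MODES = {"byUrls", "byTopic", "bySearch", "bySpeaker", "byPlaylist", "browse"}

# Parameters with no documented default, keyed by the mode that uses them.
REQUIRED_BY_MODE = {
    "bySearch": "searchQuery",
    "bySpeaker": "speaker",
    "byPlaylist": "playlist",
}


def validate_input(run_input: dict) -> dict:
    """Return a copy of run_input with defaults applied, or raise ValueError."""
    mode = run_input.get("mode", "byUrls")
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")

    max_items = run_input.get("maxItems", 25)
    if not 1 <= max_items <= 1000:
        raise ValueError("maxItems must be between 1 and 1000")

    required = REQUIRED_BY_MODE.get(mode)
    if required and not run_input.get(required):
        raise ValueError(f"mode={mode} needs a non-empty {required!r}")

    return {**run_input, "mode": mode, "maxItems": max_items}
```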
Example input — single talk
{"mode": "byUrls","talkUrls": ["sir_ken_robinson_do_schools_kill_creativity"]}
Example input — multiple talks
{"mode": "byUrls","talkUrls": ["https://www.ted.com/talks/brene_brown_the_power_of_vulnerability","https://www.ted.com/talks/simon_sinek_how_great_leaders_inspire_action","do_schools_kill_creativity"]}
Example input — browse a topic
{"mode": "byTopic","topic": "creativity","minViews": 1000000,"maxItems": 50}
Example input — short English talks
{"mode": "byTopic","topic": "psychology","language": "en","maxDurationSeconds": 600,"maxItems": 25}
Example input — search
{"mode": "bySearch","searchQuery": "quantum computing","maxItems": 30}
Example input — by speaker
{"mode": "bySpeaker","speaker": "sir_ken_robinson"}
Example input — by playlist (Most Popular Talks)
{"mode": "byPlaylist","playlist": "171","maxItems": 25}
Example input — browse most popular talks
{"mode": "browse","sort": "popular","maxItems": 25}
Example output
{"recordType": "talk","id": "66","slug": "sir_ken_robinson_do_schools_kill_creativity","url": "https://www.ted.com/talks/sir_ken_robinson_do_schools_kill_creativity","title": "Do schools kill creativity?","speaker": "Sir Ken Robinson","description": "Sir Ken Robinson makes an entertaining and profoundly moving case for creating an education system that nurtures (rather than undermines) creativity.","durationSeconds": 1151,"durationFormatted": "19:11","viewedCount": 80129749,"publishedAt": "2006-06-27T00:11:00Z","recordedOn": "2006-02-25","language": "en","curatorApproved": true,"hasTranslations": true,"topics": ["education", "creativity", "psychology"],"thumbnailUrl": "https://pi.tedcdn.com/r/talkstar-photos.s3.amazonaws.com/uploads/...","scrapedAt": "2026-05-06T10:42:18Z"}
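As a small consumer-side illustration, the Python sketch below reproduces `durationFormatted` from `durationSeconds` and computes the gap between `recordedOn` and `publishedAt`. The helper names are hypothetical, and the zero-padding of minutes in the `MM:SS` form is an assumption based on the field description rather than on the actor's code.

```python
from datetime import date, datetime


def format_duration(seconds: int) -> str:
    """Format seconds as MM:SS, or H:MM:SS once the talk reaches an hour."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes:02d}:{secs:02d}"


def publish_lag_days(record: dict) -> int:
    """Days between the recording date and the TED.com publish date."""
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    published = datetime.fromisoformat(record["publishedAt"].replace("Z", "+00:00"))
    recorded = date.fromisoformat(record["recordedOn"])
    return (published.date() - recorded).days


record = {
    "durationSeconds": 1151,
    "durationFormatted": "19:11",
    "publishedAt": "2006-06-27T00:11:00Z",
    "recordedOn": "2006-02-25",
}
```

For the example record above, `format_duration(record["durationSeconds"])` matches the `durationFormatted` value in the output.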
Use cases
- Educational platform content discovery — Build curated talk libraries by topic.
- Corporate training catalogs — Index TED talks by length, speaker, or topic for L&D programs.
- Content recommendation engines — Match TED talks to user interests via topics + view counts.
- Speaker / influencer research — Track TED appearances of public figures.
- Academic research — Snapshot communication / public-speaking dataset.
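For the recommendation use case, one possible approach is to score each talk by topic overlap with a user's interests and weight the result by view count. The Python sketch below is only an illustration: the scoring formula and the names `score` and `recommend` are assumptions, not part of the actor.

```python
import math


def score(talk: dict, interests: set[str]) -> float:
    """Topic overlap, scaled by the order of magnitude of the view count."""
    overlap = len(interests & set(talk.get("topics", [])))
    return overlap * math.log10(talk.get("viewedCount", 0) + 10)


def recommend(talks: list[dict], interests: set[str], limit: int = 5) -> list[str]:
    """Return slugs of the best-matching talks, highest score first."""
    scored = [(score(talk, interests), talk) for talk in talks]
    matching = [pair for pair in scored if pair[0] > 0]
    matching.sort(key=lambda pair: pair[0], reverse=True)
    return [talk["slug"] for _, talk in matching[:limit]]
```

Using `.get()` with defaults keeps the sketch safe for records where `topics` or `viewedCount` was dropped as empty.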
FAQ
Do I need a TED account or API key?
No. TED.com pages embed full talk metadata in their __NEXT_DATA__ JSON blob; the actor reads that directly.
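To make the mechanism concrete, here is a minimal Python sketch of pulling a `__NEXT_DATA__` blob out of a page's HTML. Next.js pages ship this JSON in a `<script id="__NEXT_DATA__">` tag. The regular expression, the function name `extract_next_data`, and the sample payload are assumptions for illustration, and the actual key layout of TED's payload is not documented here.

```python
import json
import re

# Matches the script tag Next.js uses to embed page data.
NEXT_DATA_RE = re.compile(
    r'<script[^>]*\bid="__NEXT_DATA__"[^>]*>(.*?)</script>',
    re.DOTALL,
)


def extract_next_data(html: str) -> dict:
    """Return the parsed __NEXT_DATA__ JSON from an HTML document."""
    match = NEXT_DATA_RE.search(html)
    if match is None:
        raise ValueError("no __NEXT_DATA__ script tag found")
    return json.loads(match.group(1))


# Hypothetical page fragment; the inner keys are made up for the example.
SAMPLE_HTML = (
    '<html><body><script id="__NEXT_DATA__" type="application/json">'
    '{"props": {"pageProps": {"title": "Do schools kill creativity?"}}}'
    "</script></body></html>"
)
```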
How do I find a topic slug?
Browse https://www.ted.com/topics — every topic page URL ends in the slug (e.g. /topics/creativity → creativity). Common slugs: creativity, business, education, psychology, technology, science, health, culture, art.
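The slug rule above is easy to automate. This Python sketch takes either a full URL or a bare slug and returns the slug. The same path convention appears to apply to `/talks/` URLs in the input examples, so the section name is a parameter. The function name `slug_from` is hypothetical.

```python
from urllib.parse import urlparse


def slug_from(value: str, section: str) -> str:
    """Extract the slug from a URL like /<section>/<slug>; pass bare slugs through."""
    if "://" not in value:
        return value.strip("/")
    parts = [part for part in urlparse(value).path.split("/") if part]
    if len(parts) >= 2 and parts[0] == section:
        return parts[1]
    raise ValueError(f"not a /{section}/<slug> URL: {value!r}")
```

For example, `slug_from("https://www.ted.com/topics/creativity", "topics")` yields `creativity`.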
Are transcripts included?
Not in this version. Talk transcripts live on a separate /transcript URL and are translated into many languages; capturing them requires an additional fetch per talk and per language.
Why does mode=byTopic make N+1 requests?
TED's topic listing returns lighter data per talk. To get full metadata (views, duration, language, etc.), the actor visits each talk's individual URL. Use mode=byUrls if you already have the slugs and want fewer round-trips.
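The N+1 pattern can be sketched in a few lines of Python. The two fetch functions are injected as parameters and stubbed out here, so nothing touches the network. The names `scrape_topic`, `list_topic`, and `fetch_talk` are hypothetical and do not reflect the actor's internals.

```python
from typing import Callable


def scrape_topic(
    topic: str,
    list_topic: Callable[[str], list[str]],
    fetch_talk: Callable[[str], dict],
) -> list[dict]:
    """One listing request, then one detail request per talk slug."""
    slugs = list_topic(topic)                     # 1 request
    return [fetch_talk(slug) for slug in slugs]   # N requests


def demo_request_count() -> tuple[int, int]:
    """Run scrape_topic against stubs and count the simulated requests."""
    calls: list[str] = []

    def list_topic(topic: str) -> list[str]:
        calls.append(f"/topics/{topic}")
        return ["talk_a", "talk_b", "talk_c"]

    def fetch_talk(slug: str) -> dict:
        calls.append(f"/talks/{slug}")
        return {"slug": slug}

    talks = scrape_topic("creativity", list_topic, fetch_talk)
    return len(talks), len(calls)
```

With three talks in the stubbed listing, the demo records four simulated requests: one for the listing and one per talk.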
How current is the data?
Live: every run hits TED.com at request time. Schedule the actor for daily or weekly refreshes to track view-count growth.
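Tracking growth between scheduled runs amounts to comparing two snapshots. The Python sketch below joins two result lists on `id` and reports the change in `viewedCount`. The function name `view_growth` is hypothetical, and talks present in only one snapshot are skipped by assumption.

```python
def view_growth(previous: list[dict], current: list[dict]) -> dict[str, int]:
    """Map talk id to the view-count change between two snapshots."""
    before = {talk["id"]: talk.get("viewedCount", 0) for talk in previous}
    return {
        talk["id"]: talk.get("viewedCount", 0) - before[talk["id"]]
        for talk in current
        if talk["id"] in before
    }
```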
Do I need a proxy?
No. TED.com accepts datacenter IPs without restriction.
Limitations
- TED's topic pages return ~16-20 talks per topic; large catalogs need multiple topic queries.
- Transcripts are not yet captured.
- Per-talk comments / reactions are not exposed in the public data.
- Some old talks have sparse metadata (no language, no recordedOn, etc.).
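Since a single topic yields only a small batch of talks, building a larger catalog means combining several topic runs, and the same talk can appear under more than one topic. The Python sketch below merges result lists and de-duplicates on `slug`, keeping the first copy seen. The function name `merge_runs` is hypothetical.

```python
def merge_runs(*runs: list[dict]) -> list[dict]:
    """Merge result lists from several runs, keeping one record per slug."""
    merged: dict[str, dict] = {}
    for run in runs:
        for talk in run:
            # setdefault keeps the first record seen for each slug.
            merged.setdefault(talk["slug"], talk)
    return list(merged.values())
```

The result preserves first-seen order, because Python dicts keep insertion order.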