Lex Fridman Podcast Transcript Scraper — Speakers & Chapters
Pricing
from $1.00 / 1,000 records returned
Extract full, speaker-attributed, timestamped transcripts of the Lex Fridman Podcast — no login, no ASR. By episode URL or crawl every episode: transcript text, {speaker, time, text} segments, chapters, guest & episode number. $2 per 1,000 episodes.
Developer
Scrapers Delight
Maintained by Community
🎙️ Lex Fridman Podcast Transcript Scraper
Get full, speaker-attributed, timestamped transcripts of the Lex Fridman Podcast — no login, no AI transcription. The show publishes a complete transcript for every episode, and this actor reads it: clean text, {speaker, time, text} segments, chapter list, guest, and episode number. Scrape one episode, a list, or crawl the entire archive.
Because the transcript is already published, there's no speech-to-text compute — it's fast and cheap.
What does it do?
For each episode (by URL or via a full-archive crawl) it returns:
- 📝 Full transcript (plain text) — always included
- ⏲️ Timestamped segments: {speaker, start, text} (great for clipping/quoting)
- 📑 Chapters, 🎤 guest, 🔢 episode number
No ASR, no API key — it reads the published transcript.
What data does it extract?
For every episode: url, episode_number, guest, episode_title, chapters[], transcript, segments[], segment_count, is_new (monitor), scraped_at.
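The field list above can be modelled as a typed record. The following is a minimal sketch: the field names come from the list above, but the Python types, the segment keys (`speaker`, `start`, `text`), the timestamp format, and the helper `quotes_by_speaker` are assumptions for illustration, not the actor's documented schema. The sample record holds made-up placeholder values.

```python
from typing import Any, TypedDict


class Segment(TypedDict):
    speaker: str
    start: str  # assumed to be a timestamp string; the real format may differ
    text: str


class EpisodeRecord(TypedDict, total=False):
    url: str
    episode_number: int
    guest: str
    episode_title: str
    chapters: list[Any]  # chapter item shape is not specified in the README
    transcript: str
    segments: list[Segment]
    segment_count: int
    is_new: bool
    scraped_at: str


def quotes_by_speaker(record: EpisodeRecord, speaker: str) -> list[str]:
    """Return '[start] text' lines for every segment spoken by `speaker`."""
    wanted = speaker.casefold()
    return [
        f"[{seg['start']}] {seg['text']}"
        for seg in record.get("segments", [])
        if seg["speaker"].casefold() == wanted
    ]


# Hypothetical record with placeholder values, shaped like the field list above.
example: EpisodeRecord = {
    "url": "https://lexfridman.com/jensen-huang-transcript",
    "guest": "Jensen Huang",
    "segments": [
        {"speaker": "Lex Fridman", "start": "00:00:00", "text": "Welcome."},
        {"speaker": "Jensen Huang", "start": "00:00:05", "text": "Thank you."},
    ],
    "segment_count": 2,
}

print(quotes_by_speaker(example, "jensen huang"))
```

Matching is case-insensitive so that a query like "jensen huang" still finds segments labelled "Jensen Huang".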
Who is it for?
- ✍️ Writers, researchers & students quoting and searching long-form interviews.
- 🤖 AI / RAG builders — these are dense, high-quality conversation transcripts, ideal training/retrieval data.
- ✂️ Clippers & newsletter writers pulling timestamped quotes.
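For the retrieval use case in the list above, one possible approach is to pack consecutive segments into size-limited chunks before embedding them. This is only a sketch: the function `chunk_segments`, the `max_chars` budget, and the segment keys (`speaker`, `start`, `text`) are assumptions, and nothing here is part of the actor itself.

```python
def chunk_segments(segments, max_chars=1000):
    """Greedily pack consecutive segments into chunks of roughly max_chars.

    Each chunk keeps the `start` value of its first segment so a retrieved
    passage can be traced back to a position in the episode. A single segment
    longer than max_chars becomes its own (oversized) chunk.
    """
    chunks = []
    lines, size, first_start = [], 0, None
    for seg in segments:
        line = f"{seg['speaker']}: {seg['text']}"
        # Flush the current chunk if adding this line would exceed the budget.
        if lines and size + len(line) > max_chars:
            chunks.append({"start": first_start, "text": "\n".join(lines)})
            lines, size, first_start = [], 0, None
        if first_start is None:
            first_start = seg["start"]
        lines.append(line)
        size += len(line)
    if lines:
        chunks.append({"start": first_start, "text": "\n".join(lines)})
    return chunks
```

Chunking on segment boundaries keeps speaker turns intact, which tends to matter for interview transcripts where the question gives the answer its context.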
How to use it (step by step)
- Click Try for free.
- Paste an episode transcript URL (https://lexfridman.com/{guest}-transcript), or enable Crawl all for the full archive.
- (Optional) add segments to the formats.
- Click Start, then open the Dataset tab to view/export.
- (Optional) set Crawl all + monitorMode + a Schedule to grab each new episode automatically.
Quick start
{ "episodeUrls": ["https://lexfridman.com/jensen-huang-transcript"], "transcriptFormats": ["txt", "segments"] }
Input
| Field | What it does |
|---|---|
| episodeUrls | transcript URLs or {guest} slugs |
| crawlAll | crawl the whole podcast archive |
| transcriptFormats | txt · segments |
| maxEpisodes | hard cap per run (0 = all) |
| monitorMode, alertOnNewEpisode | recurring new-episode watcher + alerts |
| webhookUrl, slackWebhookUrl, emailRecipients | alert channels |
| proxyConfiguration, requestConcurrency | proxy + parallelism |
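If you build the input programmatically, a small helper can assemble it from the fields in the table above. This is a sketch under assumptions: `build_input` and `to_transcript_url` are hypothetical helpers, the value types (boolean `crawlAll`, integer `maxEpisodes`) are guesses from the descriptions, and normalizing slugs to full URLs is optional since the table says slugs are accepted directly.

```python
TRANSCRIPT_URL = "https://lexfridman.com/{guest}-transcript"
ALLOWED_FORMATS = {"txt", "segments"}


def to_transcript_url(url_or_slug: str) -> str:
    """Turn a {guest} slug into a transcript URL; pass full URLs through."""
    value = url_or_slug.strip()
    if value.startswith(("http://", "https://")):
        return value
    # Tolerate slugs that already carry the "-transcript" suffix.
    slug = value.strip("/").removesuffix("-transcript")
    return TRANSCRIPT_URL.format(guest=slug)


def build_input(episodes=(), crawl_all=False,
                formats=("txt", "segments"), max_episodes=0):
    """Assemble an actor input dict using the field names from the table."""
    unknown = set(formats) - ALLOWED_FORMATS
    if unknown:
        raise ValueError(f"unsupported transcript formats: {sorted(unknown)}")
    if not episodes and not crawl_all:
        raise ValueError("pass at least one episode or set crawl_all=True")
    return {
        "episodeUrls": [to_transcript_url(e) for e in episodes],
        "crawlAll": crawl_all,
        "transcriptFormats": list(formats),
        "maxEpisodes": max_episodes,  # 0 = all, per the table
    }


print(build_input(["jensen-huang"]))
```

Validating the formats locally catches typos before a run is started and billed.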
Output
Each episode is one dataset record (fields above). Export to JSON, CSV, Excel, HTML, or RSS, or fetch via the Apify API.
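Fetching records through the Apify API might look like the sketch below, which uses the `apify-client` Python package. The actor ID is a placeholder (copy the real one from the actor page), the `APIFY_TOKEN` environment variable is an assumed convention, and `run_and_collect` is a hypothetical helper. `main()` is defined but not called, so nothing runs or is billed on import.

```python
import os


def run_and_collect(client, actor_id, run_input):
    """Start the actor, wait for it to finish, and return its dataset items."""
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("actor run did not return a result")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def main():
    # Imported lazily so the helper above works without the package installed.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    actor_id = "<username>/<actor-name>"  # placeholder, not the real actor ID
    run_input = {
        "episodeUrls": ["https://lexfridman.com/jensen-huang-transcript"],
        "transcriptFormats": ["txt", "segments"],
    }
    for episode in run_and_collect(client, actor_id, run_input):
        print(episode.get("episode_number"), episode.get("guest"),
              episode.get("segment_count"))
```

Passing the client in as an argument keeps `run_and_collect` easy to exercise with a stub before spending credits on a real run.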
How much does it cost?
Pay-per-event — and with no transcription compute, it's cheap:
| Event | What it covers | Suggested price |
|---|---|---|
| lot-scraped | each episode returned | ~$0.005 / episode |
| lot-detail-enriched | each transcript fetched | ~$0.005 / episode |
| monitor-run-completed | each scheduled watch run | ~$0.05 / run |
| new-lot-detected | each new episode | ~$0.02 / episode |
| alert-delivered | each Slack/email/webhook push | ~$0.005 / alert |
(Final per-event prices are set on the actor's pricing page.)
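To get a rough budget from the suggested prices in the table, you can multiply each event count by its price. The sketch below assumes the approximate table values are exact and that `lot-scraped` and `lot-detail-enriched` each fire once per episode; both are assumptions, and `estimate_cost` is a hypothetical helper. Actual charges depend on the final prices on the pricing page.

```python
from decimal import Decimal

# Suggested per-event prices copied from the table (approximate values).
SUGGESTED_PRICES = {
    "lot-scraped": Decimal("0.005"),
    "lot-detail-enriched": Decimal("0.005"),
    "monitor-run-completed": Decimal("0.05"),
    "new-lot-detected": Decimal("0.02"),
    "alert-delivered": Decimal("0.005"),
}


def estimate_cost(event_counts):
    """Sum price * count over the given events; reject unknown event names."""
    unknown = set(event_counts) - set(SUGGESTED_PRICES)
    if unknown:
        raise ValueError(f"unknown events: {sorted(unknown)}")
    return sum(
        (SUGGESTED_PRICES[name] * count for name, count in event_counts.items()),
        Decimal("0"),
    )


# One-off scrape of 100 episodes, assuming both per-episode events fire once each.
one_off = estimate_cost({"lot-scraped": 100, "lot-detail-enriched": 100})
print(f"${one_off:.2f}")
```

`Decimal` is used instead of floats so that sums of small prices do not pick up binary rounding noise.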
Is it legal to scrape these transcripts?
This actor reads transcripts that are publicly published on lexfridman.com. The content is copyrighted and belongs to the creator. Scraping public pages is generally legal, but you are responsible for how you use the data: review the site's terms, respect the creator's rights, and don't redistribute transcripts unless you are licensed to do so.
FAQ
Is there a Whisper/ASR step? No — it reads the published transcript, so it's fast and cheap.
Can I get the whole back catalog? Yes — enable Crawl all to scrape every episode linked from the podcast page.
Do I get speaker labels and timestamps? Yes. Add segments to the formats to get {speaker, start, text} per line.
How do I export? JSON, CSV, Excel, HTML, or RSS from the Dataset tab, or via the Apify API.
Feedback
Want chapter-aligned segments or per-guest filtering? Open an issue on the actor.