📻 NPR Scraper: News & Podcast Transcripts
Pricing
from $3.00 / 1,000 results
Extract articles & content from NPR: news stories, podcast episodes & transcripts. Build media monitoring, content analysis & journalism research tools. Pay per article.
Developer
Stephan Corbeil
Maintained by Community
📻 NPR Scraper: News Articles, Podcast Transcripts & Show Metadata
Bulk-extract articles, podcast episodes, transcripts, and show metadata from NPR.org and NPR member stations: headline, byline, dek, full article body, publication date, primary topic, image URLs, audio URLs, transcript text, and show / program affiliation. A pay-per-result alternative to Diffbot's news API, NewsAPI.org Premium, Webhose.io, and the NPR private partner API, built for media-monitoring firms, NLP researchers training news-domain models, content-aggregator startups, and political-comms teams tracking national news coverage.
Why NPR Scraper Beats Diffbot News, NewsAPI.org, Webhose.io & NPR Partner API
| Feature | NexGenData NPR Scraper | Diffbot News API | NewsAPI.org Premium | Webhose.io | NPR Partner API |
|---|---|---|---|---|---|
| Cost | $1 per 1K articles, pay-per-event | $299+ / month | $449+ / month | $$$$ enterprise | Partner contract |
| NPR-specific coverage | Yes: full NPR.org + member stations | Generic news (NPR included) | NPR via aggregator | NPR via aggregator | Yes (partner-only) |
| Full article body | Yes | Yes | Headline + url only | Yes | Yes |
| Podcast audio + transcript | Yes: audio_url + transcript_text | No | No | No | Yes |
| Bulk export | JSON / CSV / Excel | JSON / CSV (plan-gated) | JSON | JSON / CSV | Partner-only |
| Auth | Apify token | API key + plan | API key + plan | Account + plan | Partnership |
| Historical archive | 5+ years | 30 days default | 30 days (free) | Yes (plan-gated) | Yes |
| Monthly minimum | None | $299+ | $449+ | $$$$ | Partnership contract |
For NPR-specific workflows, most media-intel teams pick this actor over Diffbot's news API for three reasons: it is a drop-in alternative that returns NPR transcripts (which Diffbot does not), it is cheaper than NewsAPI Premium for any depth beyond headlines, and it doesn't require an NPR partnership contract.
What You Get Per Story
Each dataset item is a flat record:
- url, npr_story_id
- headline, dek: subheadline / standfirst
- byline[]: author(s) with bio link
- published_at, updated_at: ISO 8601
- body: full article text (HTML or markdown)
- body_paragraphs[]: array of paragraph strings for easy NLP
- topics[]: NPR topic tags (Politics, Business, Music, etc.)
- show: program affiliation (Morning Edition, All Things Considered, etc.)
- audio_url: direct MP3 link for the corresponding podcast segment
- audio_duration_seconds
- transcript_text: full transcript when available
- images[]: {url, caption, credit}
- pull_quotes[]
- related_stories[]: NPR-internal cross-links
- member_station: local NPR station that produced the piece, if any
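As a rough illustration of working with one record, the sketch below reduces a dataset item to a few analysis-friendly values. The field names come from the list above; the `summarize_story` helper and the sample values are hypothetical, and the code assumes `body_paragraphs` is a list of strings and `audio_duration_seconds` is a number.

```python
from typing import Any


def summarize_story(item: dict[str, Any]) -> dict[str, Any]:
    """Reduce one dataset item to a few fields useful for text analysis."""
    paragraphs = item.get("body_paragraphs") or []
    transcript = item.get("transcript_text") or ""
    duration = item.get("audio_duration_seconds")

    # Spoken words per minute, only when both transcript and duration exist.
    words_per_minute = None
    if transcript and duration:
        words_per_minute = round(len(transcript.split()) / (duration / 60), 1)

    return {
        "id": item.get("npr_story_id"),
        "headline": item.get("headline"),
        "paragraph_count": len(paragraphs),
        "has_transcript": bool(transcript),
        "words_per_minute": words_per_minute,
    }


# Made-up record for illustration only; not real NPR data.
sample = {
    "npr_story_id": "example-1",
    "headline": "Example headline",
    "body_paragraphs": ["First paragraph.", "Second paragraph."],
    "transcript_text": "one two three four five six",
    "audio_duration_seconds": 120,
}
print(summarize_story(sample))
```

Using `.get()` throughout keeps the helper tolerant of stories that lack audio or a transcript.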
Use Cases
- Media monitoring firms: daily ingest of all NPR coverage matching a client's keywords + competitors
- NLP / LLM training teams: build a high-quality, professionally edited news corpus with paired audio
- Content aggregators: power topic pages that surface the best NPR coverage on, say, the Fed
- Political comms / PR teams: track how a brand or issue gets framed for NPR's national audience
- Academic researchers: study media-language evolution by analyzing NPR coverage across decades
- Educators: generate classroom materials by pulling story + transcript + audio in one record
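For the media-monitoring use case, client-side keyword matching over returned stories could look roughly like the sketch below. `match_keywords` is a hypothetical helper, not part of the actor; it assumes the record fields listed under "What You Get Per Story" and matches whole words or phrases, ignoring case.

```python
import re
from typing import Any


def match_keywords(item: dict[str, Any], keywords: list[str]) -> list[str]:
    """Return the keywords that appear as whole words/phrases in a story."""
    parts = [
        item.get("headline") or "",
        item.get("dek") or "",
        item.get("transcript_text") or "",
    ] + list(item.get("body_paragraphs") or [])
    text = " ".join(parts)

    hits = []
    for keyword in keywords:
        # \b anchors stop "AI" from matching inside words such as "said".
        pattern = r"\b" + re.escape(keyword) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            hits.append(keyword)
    return hits


# Made-up story for illustration only.
story = {"headline": "Federal Reserve holds rates steady"}
print(match_keywords(story, ["federal reserve", "AI regulation"]))
```

Word-boundary matching is a deliberate simplification; keywords that start or end with punctuation would need a different pattern.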
Quick Start
    from apify_client import ApifyClient

    # Authenticate with your Apify API token.
    client = ApifyClient("YOUR_APIFY_TOKEN")

    # Start the actor and wait for the run to finish.
    run = client.actor("nexgendata/npr-scraper").call(run_input={
        "queries": ["Federal Reserve", "AI regulation"],
        "topics": ["politics", "economy"],
        "since": "2026-04-01",
        "maxStories": 200,
    })

    # Stream the results from the run's default dataset.
    for item in client.dataset(run["defaultDatasetId"]).iterate_items():
        print(item["published_at"], item["headline"])
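When runs are launched from scripts, it can help to assemble the input in one place. The sketch below builds a `run_input` dictionary using only the keys shown in the Quick Start (`queries`, `topics`, `since`, `maxStories`). `build_run_input` is a hypothetical helper, and treating `topics` and `since` as optional is an assumption; check the actor's input schema before relying on it.

```python
from datetime import date
from typing import Any, Optional, Sequence


def build_run_input(
    queries: Sequence[str],
    topics: Sequence[str] = (),
    since: Optional[date] = None,
    max_stories: int = 200,
) -> dict[str, Any]:
    """Assemble an input dict with the keys used in the Quick Start."""
    if not queries:
        raise ValueError("at least one query is required")
    if max_stories < 1:
        raise ValueError("max_stories must be positive")

    run_input: dict[str, Any] = {
        "queries": list(queries),
        "maxStories": max_stories,
    }
    # Assumption: topics and since may be omitted when not needed.
    if topics:
        run_input["topics"] = list(topics)
    if since is not None:
        run_input["since"] = since.isoformat()  # YYYY-MM-DD, as in the Quick Start
    return run_input


print(build_run_input(["Federal Reserve"], ["politics"], date(2026, 4, 1)))
```

The result can be passed as `run_input=` to the `call()` shown in the Quick Start.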
Pricing
Pay-per-event:
- Actor Start: small fixed charge per run (memory-scaled)
- Per story: $1 per 1,000 stories returned
No subscription, no minimum.
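The pay-per-event arithmetic can be sketched as below. The per-story rate defaults to the $1 per 1,000 stories stated here; the Actor Start charge is not given a figure in this document, so `actor_start_fee` is a placeholder you would fill in yourself. The rate is also a parameter because the store header on this page lists "from $3.00 / 1,000 results".

```python
def estimate_run_cost(
    stories: int,
    price_per_1000: float = 1.00,
    actor_start_fee: float = 0.0,
) -> float:
    """Estimate one run's cost in USD under pay-per-event pricing."""
    if stories < 0:
        raise ValueError("stories must be non-negative")
    # actor_start_fee is a placeholder: the document gives no figure for it.
    return round(actor_start_fee + stories / 1000 * price_per_1000, 4)


# The Quick Start's 200 stories at $1 per 1,000, excluding the start charge:
print(estimate_run_cost(200))  # 0.2
```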
Related NexGenData Actors
| Use case | Actor |
|---|---|
| Podcast episode metadata + audio URLs | podcast-episodes-scraper |
| News content + sentiment MCP for AI | news-mcp-server |
| AI sentiment + theme analyzer | ai-sentiment-analyzer |
| Hacker News scraper | hacker-news-scraper |
| Google News / Search SERP | google-search-scraper |
| Reddit subreddit trend tracker | reddit-subreddit-trends |
| Crunchbase news scraper | crunchbase-news-scraper |
| YouTube channel + video metadata MCP | youtube-media-mcp-server |
FAQ
Are NPR member-station stories included?
Yes. Pieces from member stations that publish to npr.org show up with the member_station field populated.
How deep does the archive go?
NPR's web archive is robust back to around 2005; the actor will return any story still online.
Are podcast transcripts always present?
Only where NPR publishes one, which covers most flagship shows (Morning Edition, All Things Considered, NPR Politics, Throughline, etc.).
Output formats?
JSON, CSV, Excel, and the Apify dataset API.
Is this legal?
Yes. NPR publishes all this content for public consumption; the actor is a structured-extraction wrapper.
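On the output-formats answer: results can also be pulled over HTTP from the Apify dataset items endpoint, which takes a `format` query parameter. The sketch below only builds the URL; `dataset_items_url` is a hypothetical helper and the dataset ID is a placeholder. Datasets that are not public also need your API token, for example in an `Authorization: Bearer` header.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode

# Formats named in the FAQ; the endpoint accepts others that are not listed here.
SUPPORTED_FORMATS = {"json", "csv", "xlsx"}


def dataset_items_url(
    dataset_id: str,
    fmt: str = "json",
    fields: Optional[Sequence[str]] = None,
) -> str:
    """Build a download URL for a dataset's items in the given format."""
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    params = {"format": fmt}
    if fields:
        # Restrict the export to selected record fields.
        params["fields"] = ",".join(fields)
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{urlencode(params)}"


# "YOUR_DATASET_ID" is a placeholder, e.g. run["defaultDatasetId"] from the Quick Start.
print(dataset_items_url("YOUR_DATASET_ID", "csv"))
```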
About NexGenData
NexGenData publishes 260+ buyer-intent actors covering SEC filings, YC alumni, lead generation, competitive intelligence, stock fundamentals across 30+ exchanges, and more. All pay-per-result. Browse the full catalog at https://apify.com/nexgendata?fpr=2ayu9b
How NexGenData Pricing Works
Every NexGenData actor uses pay-per-event pricing: you pay only for results that actually land in your dataset. No monthly minimum, no seat fees, no surprise overage bills.
- Actor Start: a single-event charge each time you spin the actor up (scaled to memory size)
- Result / item: charged per item written to the default dataset
- No charge for retries, internal proxy rotation, or failed sub-requests; those are absorbed by the platform
Apify Platform Bonus
New to Apify? Sign up with the NexGenData referral link: you get free platform credits on signup (enough for several thousand free results) and you help fund the maintenance of this actor fleet.
Integration Surface
Every actor in the NexGenData catalog can be triggered from:
- Apify console: point-and-click run
- Apify API: REST + webhooks
- Apify Python / JS SDKs: programmatic batch
- Zapier, Make.com, n8n: official integrations
- MCP: many actors are exposed as MCP tools for Claude / ChatGPT / Cursor agents
- Schedules: built-in cron for daily / weekly / monthly runs
- Webhooks: POST results to any HTTPS endpoint on dataset write
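For the REST option in the list above, a run can be started without the SDK by POSTing the input JSON to the Apify run-actor endpoint. The sketch below only constructs the request and does not send it. `build_run_request` is a hypothetical helper, and the actor ID is assumed to be the Quick Start's `nexgendata/npr-scraper` written with the tilde separator that API paths use.

```python
import json
import urllib.request
from typing import Any


def build_run_request(
    token: str,
    run_input: dict[str, Any],
    actor_id: str = "nexgendata~npr-scraper",  # assumed from the Quick Start name
) -> urllib.request.Request:
    """Build (but do not send) a POST request that starts an actor run."""
    url = f"https://api.apify.com/v2/acts/{actor_id}/runs"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_run_request("YOUR_APIFY_TOKEN", {"queries": ["Federal Reserve"]})
print(request.get_method(), request.full_url)
# To actually start the run: urllib.request.urlopen(request)
```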
Support
NexGenData maintains 260+ Apify actors and ships updates regularly. Bug reports via the Apify console issues tab get a response within 24 hours. Roadmap requests are welcome; high-demand features ship in the next version.
Home: thenextgennexus.com
Full catalog: apify.com/nexgendata