Spotify Podcast & Episode Scraper
Pricing
Pay per usage
Extract Spotify podcast data: show info, episode lists, descriptions, durations, release dates, and publisher details. Scrape any public podcast via URL or search. Uses oEmbed and embed APIs. Export JSON, CSV, Excel. Pay per episode scraped.
Developer: Ricardo Akiyoshi
Scrape Spotify podcasts and episodes at scale. Extract show metadata, full episode lists, descriptions, durations, release dates, and publisher information from any public Spotify podcast.
What does it do?
This actor extracts structured data from Spotify podcasts, including:
- Show metadata: name, publisher, description, total episode count, rating, cover image
- Episode data: title, description, duration, release date, URL, explicit flag
- Search: find podcasts by keyword and scrape their episodes
- Multiple strategies: oEmbed API, embed pages, direct HTML scraping, internal API, and a CheerioCrawler fallback
No Spotify account or API key required — uses only publicly accessible endpoints.
Use Cases
Podcast Research & Analytics
Analyze podcast landscapes in any niche. Track episode frequency, show growth, and content trends across hundreds of podcasts simultaneously.
Competitive Intelligence
Monitor competing podcasts in your space. Compare episode counts, release cadence, topic coverage, and audience engagement.
Content Planning
Research what topics successful podcasts cover. Find gaps in content coverage for your own show planning and editorial calendar.
Media Monitoring
Track brand mentions and industry discussions across podcast descriptions and episode titles. Build a structured database of podcast content.
Market Research
Identify trending podcast categories, popular formats, and emerging topics. Analyze the podcast ecosystem with structured, exportable data.
Academic & Data Analysis
Collect large-scale podcast metadata for research purposes. Study publishing patterns, content trends, and platform dynamics.
SEO & Marketing
Extract podcast metadata for backlink research, guest identification, and cross-promotion opportunities in your niche.
Input
| Field | Type | Default | Description |
|---|---|---|---|
| podcastUrls | array | - | List of Spotify podcast URLs (open.spotify.com/show/...) |
| searchTerms | array | - | Search keywords to find podcasts |
| maxEpisodes | integer | 200 | Max episodes per show (0 = unlimited) |
| proxy | object | - | Optional proxy configuration |
Example: Scrape by URL
{
  "podcastUrls": [
    "https://open.spotify.com/show/4rOoJ6Egrf8K2IrywzwOMk",
    "https://open.spotify.com/show/2mTUnDkuKUkhiueKcVWoP0"
  ],
  "maxEpisodes": 100
}
Example: Search and Scrape
{
  "searchTerms": ["true crime", "tech news", "comedy interviews"],
  "maxEpisodes": 50
}
Example: Combined
{
  "podcastUrls": ["https://open.spotify.com/show/4rOoJ6Egrf8K2IrywzwOMk"],
  "searchTerms": ["AI podcast"],
  "maxEpisodes": 200,
  "proxy": {"useApifyProxy": true}
}
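The input objects above can also be assembled programmatically. The following is a minimal sketch, not part of the actor: `build_run_input` is a hypothetical helper, and the rule that at least one of `podcastUrls` or `searchTerms` must be present is an assumption that the input table does not state.

```python
from typing import Any, Optional


def build_run_input(
    podcast_urls: Optional[list[str]] = None,
    search_terms: Optional[list[str]] = None,
    max_episodes: int = 200,
    use_apify_proxy: bool = False,
) -> dict[str, Any]:
    """Build an input object using the field names from the Input table."""
    # Assumption: a run needs at least one source of shows to be useful.
    if not podcast_urls and not search_terms:
        raise ValueError("provide podcast_urls, search_terms, or both")
    # The table documents 0 as "unlimited"; negative values are rejected here.
    if max_episodes < 0:
        raise ValueError("max_episodes must be 0 (unlimited) or positive")

    run_input: dict[str, Any] = {"maxEpisodes": max_episodes}
    if podcast_urls:
        run_input["podcastUrls"] = list(podcast_urls)
    if search_terms:
        run_input["searchTerms"] = list(search_terms)
    if use_apify_proxy:
        run_input["proxy"] = {"useApifyProxy": True}
    return run_input
```

Calling `build_run_input(search_terms=["AI podcast"], use_apify_proxy=True)` yields an object with the same fields as the combined example, minus the URL list.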
Output
Each podcast show is saved as a structured JSON object containing the show metadata and its episodes:
{
  "showName": "The Joe Rogan Experience",
  "publisher": "Joe Rogan",
  "description": "The official podcast of comedian Joe Rogan...",
  "totalEpisodes": 2150,
  "rating": 4.8,
  "ratingCount": 125000,
  "coverImage": "https://i.scdn.co/image/...",
  "categories": ["Comedy", "Society & Culture"],
  "language": "en",
  "showUrl": "https://open.spotify.com/show/4rOoJ6Egrf8K2IrywzwOMk",
  "showId": "4rOoJ6Egrf8K2IrywzwOMk",
  "episodesScraped": 200,
  "episodes": [
    {
      "title": "#2100 - Elon Musk",
      "description": "Elon Musk is the CEO of Tesla, SpaceX...",
      "duration": "3h 12m 45s",
      "durationMs": 11565000,
      "durationISO": "PT3H12M45S",
      "releaseDate": "2026-02-14",
      "url": "https://open.spotify.com/episode/abc123",
      "episodeId": "abc123",
      "isExplicit": true,
      "isPlayable": true,
      "language": "en",
      "showId": "4rOoJ6Egrf8K2IrywzwOMk",
      "episodeNumber": 1
    },
    {
      "title": "#2099 - Naval Ravikant",
      "description": "Naval Ravikant is an entrepreneur and investor...",
      "duration": "2h 45m 10s",
      "durationMs": 9910000,
      "durationISO": "PT2H45M10S",
      "releaseDate": "2026-02-12",
      "url": "https://open.spotify.com/episode/def456",
      "episodeId": "def456",
      "isExplicit": false,
      "isPlayable": true,
      "language": "en",
      "showId": "4rOoJ6Egrf8K2IrywzwOMk",
      "episodeNumber": 2
    }
  ],
  "scrapedAt": "2026-03-02T10:00:00.000Z"
}
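Because episodes are nested inside each show object, tabular analysis usually needs one row per episode. The sketch below shows one way to flatten a show record and how the three duration fields in the sample relate to each other. `split_duration`, `human_duration`, `iso_duration`, and `flatten_episodes` are hypothetical helpers, and the formatting of durations under one hour is an assumption, since the sample only shows multi-hour episodes.

```python
from typing import Any


def split_duration(duration_ms: int) -> tuple[int, int, int]:
    """Split milliseconds into whole hours, minutes, and seconds."""
    total_seconds = duration_ms // 1000
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return hours, minutes, seconds


def human_duration(duration_ms: int) -> str:
    """Format like the sample `duration` field, e.g. '3h 12m 45s'."""
    hours, minutes, seconds = split_duration(duration_ms)
    return f"{hours}h {minutes}m {seconds}s"


def iso_duration(duration_ms: int) -> str:
    """Format like the sample `durationISO` field, e.g. 'PT3H12M45S'."""
    hours, minutes, seconds = split_duration(duration_ms)
    return f"PT{hours}H{minutes}M{seconds}S"


def flatten_episodes(show: dict[str, Any]) -> list[dict[str, Any]]:
    """Return one flat row per episode, carrying show-level fields along."""
    rows = []
    for episode in show.get("episodes", []):
        rows.append({
            "showName": show.get("showName"),
            "publisher": show.get("publisher"),
            "title": episode.get("title"),
            "releaseDate": episode.get("releaseDate"),
            "durationMs": episode.get("durationMs"),
            "url": episode.get("url"),
        })
    return rows
```

The resulting rows can be passed directly to `csv.DictWriter` or a DataFrame constructor.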
How it Works
The scraper layers several strategies to maximize data coverage:
- Spotify oEmbed API — Lightweight public endpoint for show title and thumbnail
- Embed page scraping — Parses Spotify's embed player pages for structured data in script tags
- Direct page scraping — Crawls open.spotify.com/show pages for meta tags, JSON-LD, and embedded state
- Internal API — Attempts to use Spotify's internal GraphQL API with anonymous tokens for complete episode listings
- CheerioCrawler fallback — Full crawl using Crawlee's CheerioCrawler as a last resort
Data from all strategies is merged, deduplicated, and cleaned for the best possible output.
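The merge and deduplication step can be pictured roughly as follows. This is an illustrative sketch only, not the actor's actual code: `merge_episodes` is a hypothetical function, and both the choice of `episodeId` (falling back to `url`) as the dedup key and the "first non-empty value wins" rule are assumptions.

```python
from typing import Any, Iterable


def merge_episodes(*sources: Iterable[dict[str, Any]]) -> list[dict[str, Any]]:
    """Merge episode lists from several strategies into one deduplicated list.

    Sources are processed in order. For each episode key, the first
    non-empty value seen for a field is kept; later sources only fill gaps.
    """
    merged: dict[str, dict[str, Any]] = {}
    for episodes in sources:
        for episode in episodes:
            # Assumed dedup key: the episode ID, else the episode URL.
            key = episode.get("episodeId") or episode.get("url")
            if key is None:
                continue  # nothing reliable to deduplicate on
            record = merged.setdefault(key, {})
            for field, value in episode.items():
                is_empty = value is None or value == "" or value == []
                if not is_empty and field not in record:
                    record[field] = value
    # Dicts preserve insertion order, so first-seen episodes come first.
    return list(merged.values())
```

For example, an episode seen first with only a title and later with a description and duration ends up as a single record holding all three fields.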
Supported URL Formats
- https://open.spotify.com/show/ABC123
- https://open.spotify.com/show/ABC123?si=xyz
- spotify:show:ABC123
- Raw show ID: ABC123
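If you collect show references from mixed sources, it can help to normalize them to a bare show ID before building the input. The sketch below handles the formats listed above; `extract_show_id` is a hypothetical helper, and the ID pattern is deliberately loose (any run of letters and digits) so that the placeholder `ABC123` matches.

```python
import re
from typing import Optional

# Loose ID pattern (assumption): letters and digits only.
_ID = r"[A-Za-z0-9]+"

_PATTERNS = [
    # https://open.spotify.com/show/<id>, optionally with a query string
    re.compile(rf"^https?://open\.spotify\.com/show/({_ID})(?:[/?#].*)?$"),
    # spotify:show:<id>
    re.compile(rf"^spotify:show:({_ID})$"),
    # raw show ID
    re.compile(rf"^({_ID})$"),
]


def extract_show_id(value: str) -> Optional[str]:
    """Return the show ID for any supported format, or None if unrecognized."""
    value = value.strip()
    for pattern in _PATTERNS:
        match = pattern.match(value)
        if match:
            return match.group(1)
    return None
```

Inputs that match none of the formats, such as a URL on another domain, return `None` so callers can skip or report them.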
Rate Limiting
- Automatic delays between requests (0.8-1.2 seconds)
- Exponential backoff on 429 (Too Many Requests) responses
- Respects Retry-After headers
- Proxy support for higher throughput on large scrapes
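The delay logic described above could look something like the following. This is a sketch under assumptions rather than the actor's implementation: `polite_delay` and `backoff_delay` are hypothetical names, the base delay and cap are illustrative values, and only the numeric (seconds) form of `Retry-After` is handled.

```python
import random
from typing import Optional


def polite_delay() -> float:
    """Random pause between ordinary requests, in the 0.8-1.2 second range."""
    return random.uniform(0.8, 1.2)


def backoff_delay(
    attempt: int,
    retry_after: Optional[str] = None,
    base: float = 1.0,   # assumed starting delay in seconds
    cap: float = 60.0,   # assumed upper bound in seconds
) -> float:
    """Seconds to wait after a 429 response on the given retry attempt (0-based)."""
    if retry_after is not None:
        try:
            # Honor the server's hint when it is given in seconds.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # HTTP-date form is not parsed here; fall back to backoff
    # Exponential backoff: base, 2*base, 4*base, ... up to the cap.
    return min(cap, base * 2 ** attempt)
```

A caller would sleep for `polite_delay()` between normal requests and for `backoff_delay(attempt, response_headers.get("Retry-After"))` after each 429, where `response_headers` stands for whatever header mapping the HTTP client exposes.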
Pay Per Event Pricing
This actor uses Apify's Pay Per Event model:
| Event | Price |
|---|---|
| Episode scraped | $0.004 |
You only pay for successfully scraped episodes. Show metadata is included at no extra charge.
Cost examples:
- 50 episodes from 1 podcast = $0.20
- 200 episodes from each of 5 podcasts (1,000 episodes) = $4.00
- 1,000 episodes from each of 10 podcasts (10,000 episodes) = $40.00
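Since billing is per successfully scraped episode, the cost of a run depends only on the total episode count across all shows. A small estimator, assuming the $0.004 rate from the table above stays unchanged (`estimate_cost` is a hypothetical helper):

```python
from decimal import Decimal

# Price per "Episode scraped" event, taken from the pricing table.
PRICE_PER_EPISODE = Decimal("0.004")


def estimate_cost(total_episodes: int) -> Decimal:
    """Estimated charge in USD for a given number of scraped episodes."""
    if total_episodes < 0:
        raise ValueError("total_episodes must not be negative")
    # Decimal avoids the rounding noise of binary floats for money.
    return PRICE_PER_EPISODE * total_episodes
```

With this rate, `estimate_cost(50)` gives 0.200 (that is, $0.20) and `estimate_cost(1000)` gives 4.000 (that is, $4.00).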
Tips for Best Results
- Use proxy for large scrapes (10+ shows or 500+ episodes total)
- Start with a small maxEpisodes (e.g., 10) to test, then scale up
- Provide direct URLs when possible — search is slower and less reliable
- Set maxEpisodes to 0 to scrape all available episodes (may be slow for shows with 1000+ episodes)
Limitations
- Only public podcasts on Spotify can be scraped
- Episode audio content is not downloaded (only metadata)
- Spotify may change their page structure, which could temporarily reduce data coverage
- Very new or obscure podcasts may have limited metadata available
- Search functionality depends on Google/Spotify search availability
- Rate limiting may slow down large-scale scrapes
Changelog
1.0.0 (2026-03-02)
- Initial release
- Multi-strategy scraping (oEmbed, embed, direct, API, CheerioCrawler)
- Podcast search by keyword
- Episode deduplication and data merging
- Pay Per Event billing ($0.004/episode)
- Proxy support
- JSON-LD and structured data parsing
- Automatic pagination for large episode lists
Integration — Python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

# Input fields follow the Input table above
run = client.actor("sovereigntaylor/spotify-scraper").call(
    run_input={"searchTerms": ["true crime"], "maxEpisodes": 50}
)

# Each dataset item is one show with its episodes
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item.get("showName", "N/A"), len(item.get("episodes", [])))
Integration — JavaScript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Input fields follow the Input table above
const run = await client.actor('sovereigntaylor/spotify-scraper').call({
    searchTerms: ['true crime'],
    maxEpisodes: 50,
});

// Each dataset item is one show with its episodes
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => console.log(item.showName || 'N/A'));