Spotify Podcast & Episode Scraper

Pricing

Pay per usage

Extract Spotify podcast data: show info, episode lists, descriptions, durations, release dates, and publisher details. Scrape any public podcast via URL or search. Uses oEmbed and embed APIs. Export JSON, CSV, Excel. Pay per episode scraped.

Rating: 0.0 (0)

Developer: Ricardo Akiyoshi (maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 1
  • Monthly active users: 0
  • Last modified: 2 hours ago

Scrape Spotify podcasts and episodes at scale. Extract show metadata, full episode lists, descriptions, durations, release dates, and publisher information from any public Spotify podcast.

What does it do?

This actor extracts structured data from Spotify podcasts, including:

  • Show metadata: name, publisher, description, total episode count, rating, cover image
  • Episode data: title, description, duration, release date, URL, explicit flag
  • Search: find podcasts by keyword and scrape their episodes
  • Multiple strategies: oEmbed API, embed pages, direct HTML scraping, and internal API

No Spotify account or API key required — uses only publicly accessible endpoints.

Use Cases

Podcast Research & Analytics

Analyze podcast landscapes in any niche. Track episode frequency, show growth, and content trends across hundreds of podcasts simultaneously.

Competitive Intelligence

Monitor competing podcasts in your space. Compare episode counts, release cadence, topic coverage, and audience engagement.

Content Planning

Research what topics successful podcasts cover. Find gaps in content coverage for your own show planning and editorial calendar.

Media Monitoring

Track brand mentions and industry discussions across podcast descriptions and episode titles. Build a structured database of podcast content.

Market Research

Identify trending podcast categories, popular formats, and emerging topics. Analyze the podcast ecosystem with structured, exportable data.

Academic & Data Analysis

Collect large-scale podcast metadata for research purposes. Study publishing patterns, content trends, and platform dynamics.

SEO & Marketing

Extract podcast metadata for backlink research, guest identification, and cross-promotion opportunities in your niche.

Input

Field | Type | Default | Description
podcastUrls | array | - | List of Spotify podcast URLs (open.spotify.com/show/...)
searchTerms | array | - | Search keywords to find podcasts
maxEpisodes | integer | 200 | Max episodes per show (0 = unlimited)
proxy | object | - | Optional proxy configuration

Example: Scrape by URL

{
  "podcastUrls": [
    "https://open.spotify.com/show/4rOoJ6Egrf8K2IrywzwOMk",
    "https://open.spotify.com/show/2mTUnDkuKUkhiueKcVWoP0"
  ],
  "maxEpisodes": 100
}

Example: Search and Scrape

{
  "searchTerms": ["true crime", "tech news", "comedy interviews"],
  "maxEpisodes": 50
}

Example: Combined

{
  "podcastUrls": [
    "https://open.spotify.com/show/4rOoJ6Egrf8K2IrywzwOMk"
  ],
  "searchTerms": ["AI podcast"],
  "maxEpisodes": 200,
  "proxy": {
    "useApifyProxy": true
  }
}
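
The three input shapes above can also be assembled programmatically. The following is a minimal Python sketch, not part of the actor: `build_run_input` is a hypothetical helper, and the check that at least one URL or search term is present is an assumption, since the page does not say how the actor treats an empty input.

```python
from typing import Any, Dict, Iterable, Optional


def build_run_input(
    podcast_urls: Optional[Iterable[str]] = None,
    search_terms: Optional[Iterable[str]] = None,
    max_episodes: int = 200,
    use_apify_proxy: bool = False,
) -> Dict[str, Any]:
    """Assemble an input object using the field names from the Input table."""
    urls = list(podcast_urls or [])
    terms = list(search_terms or [])
    # Assumption: the actor needs at least one source of shows to do anything.
    if not urls and not terms:
        raise ValueError("provide at least one podcast URL or search term")
    if max_episodes < 0:
        raise ValueError("maxEpisodes must be 0 (unlimited) or a positive integer")

    run_input: Dict[str, Any] = {"maxEpisodes": max_episodes}
    if urls:
        run_input["podcastUrls"] = urls
    if terms:
        run_input["searchTerms"] = terms
    if use_apify_proxy:
        run_input["proxy"] = {"useApifyProxy": True}
    return run_input


if __name__ == "__main__":
    print(build_run_input(search_terms=["AI podcast"], use_apify_proxy=True))
```

Optional fields are left out rather than sent as empty values, which matches how the examples above omit `proxy` and unused lists.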

Output

Each podcast show is saved as a structured JSON object containing the show metadata and its episodes:

{
  "showName": "The Joe Rogan Experience",
  "publisher": "Joe Rogan",
  "description": "The official podcast of comedian Joe Rogan...",
  "totalEpisodes": 2150,
  "rating": 4.8,
  "ratingCount": 125000,
  "coverImage": "https://i.scdn.co/image/...",
  "categories": ["Comedy", "Society & Culture"],
  "language": "en",
  "showUrl": "https://open.spotify.com/show/4rOoJ6Egrf8K2IrywzwOMk",
  "showId": "4rOoJ6Egrf8K2IrywzwOMk",
  "episodesScraped": 200,
  "episodes": [
    {
      "title": "#2100 - Elon Musk",
      "description": "Elon Musk is the CEO of Tesla, SpaceX...",
      "duration": "3h 12m 45s",
      "durationMs": 11565000,
      "durationISO": "PT3H12M45S",
      "releaseDate": "2026-02-14",
      "url": "https://open.spotify.com/episode/abc123",
      "episodeId": "abc123",
      "isExplicit": true,
      "isPlayable": true,
      "language": "en",
      "showId": "4rOoJ6Egrf8K2IrywzwOMk",
      "episodeNumber": 1
    },
    {
      "title": "#2099 - Naval Ravikant",
      "description": "Naval Ravikant is an entrepreneur and investor...",
      "duration": "2h 45m 10s",
      "durationMs": 9910000,
      "durationISO": "PT2H45M10S",
      "releaseDate": "2026-02-12",
      "url": "https://open.spotify.com/episode/def456",
      "episodeId": "def456",
      "isExplicit": false,
      "isPlayable": true,
      "language": "en",
      "showId": "4rOoJ6Egrf8K2IrywzwOMk",
      "episodeNumber": 2
    }
  ],
  "scrapedAt": "2026-03-02T10:00:00.000Z"
}
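
The three duration fields in each episode carry the same value in different notations, and a show item can be summarized from its nested episodes. The sketch below illustrates both in Python. `format_duration` and `summarize_show` are hypothetical helpers, and always emitting hours, minutes, and seconds is an assumption: the page does not show how the actor formats durations shorter than one hour.

```python
from typing import Any, Dict, Tuple


def format_duration(duration_ms: int) -> Tuple[str, str]:
    """Convert milliseconds to ("3h 12m 45s", "PT3H12M45S") style strings."""
    total_seconds = duration_ms // 1000
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    human = f"{hours}h {minutes}m {seconds}s"
    iso = f"PT{hours}H{minutes}M{seconds}S"
    return human, iso


def summarize_show(show: Dict[str, Any]) -> Dict[str, Any]:
    """Aggregate a few per-show figures from the nested episode list."""
    episodes = show.get("episodes", [])
    total_ms = sum(ep.get("durationMs", 0) for ep in episodes)
    return {
        "showName": show.get("showName"),
        "episodes": len(episodes),
        "explicit": sum(1 for ep in episodes if ep.get("isExplicit")),
        "totalDuration": format_duration(total_ms)[0],
    }


if __name__ == "__main__":
    sample = {
        "showName": "The Joe Rogan Experience",
        "episodes": [
            {"durationMs": 11565000, "isExplicit": True},
            {"durationMs": 9910000, "isExplicit": False},
        ],
    }
    print(format_duration(11565000))
    print(summarize_show(sample))
```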

How it Works

The scraper layers several strategies to maximize data coverage:

  1. Spotify oEmbed API — Lightweight public endpoint for show title and thumbnail
  2. Embed page scraping — Parses Spotify's embed player pages for structured data in script tags
  3. Direct page scraping — Crawls open.spotify.com/show pages for meta tags, JSON-LD, and embedded state
  4. Internal API — Attempts to use Spotify's internal GraphQL API with anonymous tokens for complete episode listings
  5. CheerioCrawler fallback — Full crawl using Crawlee's CheerioCrawler as a last resort

Data from all strategies is merged, deduplicated, and cleaned for the best possible output.
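
One way to picture the merge and deduplication step is the Python sketch below. It is an illustration under stated assumptions, not the actor's code: `merge_episodes` is a hypothetical function, episodes are assumed to be keyed by `episodeId`, and the rule that earlier strategies win while later ones only fill gaps is a guess at a reasonable policy.

```python
from typing import Any, Dict, List


def merge_episodes(*sources: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Merge episode lists from several strategies into one list.

    Episodes are matched on "episodeId". A field already set by an earlier
    source is kept; later sources only fill fields that are missing or empty.
    """
    merged: Dict[str, Dict[str, Any]] = {}
    for source in sources:
        for episode in source:
            episode_id = episode.get("episodeId")
            if not episode_id:
                continue  # no key to deduplicate on, so skip the record
            target = merged.setdefault(episode_id, {})
            for key, value in episode.items():
                is_missing = target.get(key) in (None, "")
                has_value = value not in (None, "")
                if is_missing and has_value:
                    target[key] = value
    # dicts keep insertion order, so first-seen order is preserved
    return list(merged.values())


if __name__ == "__main__":
    from_embed = [{"episodeId": "abc123", "title": "#2100 - Elon Musk", "description": ""}]
    from_api = [
        {"episodeId": "abc123", "title": "Other title", "description": "Full text", "durationMs": 11565000},
        {"episodeId": "def456", "title": "#2099 - Naval Ravikant"},
    ]
    for episode in merge_episodes(from_embed, from_api):
        print(episode)
```

Boolean values such as `"isExplicit": false` count as present in this sketch, so a later source cannot overwrite them.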

Supported URL Formats

  • https://open.spotify.com/show/ABC123
  • https://open.spotify.com/show/ABC123?si=xyz
  • spotify:show:ABC123
  • Raw show ID: ABC123
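
All four formats reduce to the same show ID. The Python sketch below shows one way to normalize them. `extract_show_id` is a hypothetical helper, and treating IDs as plain alphanumeric strings is an assumption based on the examples on this page.

```python
import re
from urllib.parse import urlparse

# Assumption: show IDs contain only letters and digits, as in the examples.
_SHOW_ID = re.compile(r"^[A-Za-z0-9]+$")


def extract_show_id(value: str) -> str:
    """Return the show ID from a URL, a spotify:show: URI, or a raw ID."""
    value = value.strip()
    if value.startswith("spotify:show:"):
        candidate = value.split(":")[2]
    elif "://" in value:
        # urlparse drops the query string (?si=...) from .path
        parts = [p for p in urlparse(value).path.split("/") if p]
        if "show" not in parts or parts.index("show") + 1 >= len(parts):
            raise ValueError(f"not a Spotify show URL: {value!r}")
        candidate = parts[parts.index("show") + 1]
    else:
        candidate = value
    if not _SHOW_ID.match(candidate):
        raise ValueError(f"invalid show ID: {candidate!r}")
    return candidate


if __name__ == "__main__":
    for raw in (
        "https://open.spotify.com/show/ABC123",
        "https://open.spotify.com/show/ABC123?si=xyz",
        "spotify:show:ABC123",
        "ABC123",
    ):
        print(raw, "->", extract_show_id(raw))
```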

Rate Limiting

  • Automatic delays between requests (0.8-1.2 seconds)
  • Exponential backoff on 429 (Too Many Requests) responses
  • Respects Retry-After headers
  • Proxy support for higher throughput on large scrapes
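
The delay and backoff behavior listed above could be computed roughly as in the Python sketch below. Only the 0.8 to 1.2 second range comes from this page. The function names, the 1 second base, the 60 second cap, and the doubling factor are assumptions chosen for illustration.

```python
import random
from typing import Optional


def polite_delay() -> float:
    """Random pause between ordinary requests, in seconds (0.8 to 1.2)."""
    return random.uniform(0.8, 1.2)


def retry_delay(
    attempt: int,
    retry_after: Optional[str] = None,
    base: float = 1.0,
    cap: float = 60.0,
) -> float:
    """Seconds to wait before retrying a request that returned HTTP 429.

    A numeric Retry-After header takes priority. Otherwise the wait doubles
    with each attempt (attempt 0 -> base, 1 -> 2 * base, ...) up to `cap`.
    """
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # the HTTP-date form of Retry-After is not handled here
    return min(cap, base * (2 ** attempt))


if __name__ == "__main__":
    print([retry_delay(n) for n in range(5)])
    print(retry_delay(2, retry_after="5"))
```

A caller would sleep for `polite_delay()` between normal requests and for `retry_delay(attempt, header_value)` after each 429 response.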

Pay Per Event Pricing

This actor uses Apify's Pay Per Event model:

Event | Price
Episode scraped | $0.004

You only pay for successfully scraped episodes. Show metadata is included at no extra charge.

Cost examples:

  • 50 episodes from 1 podcast = $0.20
  • 200 episodes each from 5 podcasts (1,000 episodes total) = $4.00
  • 1,000 episodes each from 10 podcasts (10,000 episodes total) = $40.00
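
The arithmetic behind these figures is the episode count multiplied by the per-episode price. The Python sketch below makes that explicit. `estimate_cost` is a hypothetical helper, and it reads the episode count as per show, which is an interpretation of the examples rather than something the page states.

```python
from decimal import Decimal

# Price per successfully scraped episode, from the pricing table.
PRICE_PER_EPISODE = Decimal("0.004")


def estimate_cost(shows: int, episodes_per_show: int) -> Decimal:
    """Estimated charge in USD when every requested episode is scraped."""
    if shows < 0 or episodes_per_show < 0:
        raise ValueError("counts must be non-negative")
    return PRICE_PER_EPISODE * shows * episodes_per_show


if __name__ == "__main__":
    print(estimate_cost(1, 50))
    print(estimate_cost(5, 200))
```

`Decimal` is used instead of `float` so that small per-episode prices add up without rounding drift. The actual charge can be lower, because only successfully scraped episodes are billed.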

Tips for Best Results

  1. Use proxy for large scrapes (10+ shows or 500+ episodes total)
  2. Start with a small maxEpisodes (e.g., 10) to test, then scale up
  3. Provide direct URLs when possible — search is slower and less reliable
  4. Set maxEpisodes to 0 to scrape all available episodes (may be slow for shows with 1000+ episodes)

Limitations

  • Only public podcasts on Spotify can be scraped
  • Episode audio content is not downloaded (only metadata)
  • Spotify may change their page structure, which could temporarily reduce data coverage
  • Very new or obscure podcasts may have limited metadata available
  • Search functionality depends on Google/Spotify search availability
  • Rate limiting may slow down large-scale scrapes

Changelog

1.0.0 (2026-03-02)

  • Initial release
  • Multi-strategy scraping (oEmbed, embed, direct, API, CheerioCrawler)
  • Podcast search by keyword
  • Episode deduplication and data merging
  • Pay Per Event billing ($0.004/episode)
  • Proxy support
  • JSON-LD and structured data parsing
  • Automatic pagination for large episode lists

Integration — Python

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

# Input fields follow the Input table above.
run = client.actor("sovereigntaylor/spotify-scraper").call(run_input={
    "searchTerms": ["true crime"],
    "maxEpisodes": 50,
})

# Each dataset item is one show with its episodes nested inside.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{item.get('showName', 'N/A')}: {item.get('episodesScraped', 0)} episodes")

Integration — JavaScript

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Input fields follow the Input table above.
const run = await client.actor('sovereigntaylor/spotify-scraper').call({
    searchTerms: ['true crime'],
    maxEpisodes: 50,
});

// Each dataset item is one show with its episodes nested inside.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => console.log(`${item.showName ?? 'N/A'}: ${item.episodesScraped ?? 0} episodes`));