TMDB AI-Powered Movie & TV Scraper

Pricing

from $1.00 / 1,000 results

Scrape full TMDB data for movies, TV shows, cast, seasons, and episodes using search terms or start URLs, or run an AI search that automatically turns a vibe query into specific titles.

Developer

Inus Grobler

Maintained by Community

Last modified: 3 days ago

TMDB Scraper for Apify

Scrape TMDB movies, TV shows, people, seasons, episodes, cast, ratings, posters, trailers, keywords, and recommendations from The Movie Database (TMDB).

This Apify actor is built for people who want a TMDB scraper that is easy to run in 3 ways:

  • Search by title with searchTerms
  • Scrape known TMDB pages with startUrls
  • Turn a natural-language prompt into titles with AI, then scrape those results

What You Can Scrape

  • Movie details from TMDB
  • TV show details from TMDB
  • Person and cast profiles from TMDB
  • Seasons and episodes for TV shows
  • Full cast and crew links
  • Ratings, genres, keywords, posters, backdrops, trailers, and social links
  • AI-assisted recommendation searches based on a vibe or theme

Best Use Cases

  • Build a TMDB movie metadata dataset
  • Scrape TV shows with seasons and episodes
  • Collect actor, actress, director, and cast profile data
  • Enrich internal entertainment databases
  • Generate title lists from prompts like “dark sci-fi thrillers” and scrape the matching TMDB pages
  • Monitor specific TMDB URLs over time with startUrls

How It Works

  1. Choose an input mode:
    • searchTerms for normal TMDB search
    • startUrls for direct TMDB pages
    • enableAiSearch + aiQuery for AI-assisted title discovery
  2. Choose what entity types to keep with searchTypes
  3. Optionally enable:
    • scrapeFullCast
    • scrapeEpisodes
    • strictExactSeriesMatch
  4. Run the actor and read the dataset output
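The choice between the three input modes in step 1 can be sketched as a small helper that assembles an input object. This is a hypothetical illustration for callers, not the actor's own code: `buildInput` and its validation rule are assumptions, while the field names come from this README.

```javascript
// Hypothetical helper: assembles an actor input object from caller options.
// Field names follow this README; the validation rule is an assumption.
function buildInput(options) {
  const { searchTerms = [], startUrls = [], aiQuery = '', ...rest } = options;

  const hasAi = aiQuery.trim().length > 0;
  if (searchTerms.length === 0 && startUrls.length === 0 && !hasAi) {
    throw new Error('Provide searchTerms, startUrls, or aiQuery');
  }

  // Pass through the remaining options (searchTypes, scrapeEpisodes, ...).
  const input = { ...rest };
  if (searchTerms.length > 0) input.searchTerms = searchTerms;
  // startUrls entries use the { "url": ... } shape shown in Quick Start.
  if (startUrls.length > 0) input.startUrls = startUrls.map((url) => ({ url }));
  if (hasAi) {
    input.enableAiSearch = true;
    input.aiQuery = aiQuery.trim();
  }
  return input;
}

const input = buildInput({
  searchTerms: ['Breaking Bad'],
  searchTypes: ['tv'],
  scrapeEpisodes: true,
  strictExactSeriesMatch: true,
});
console.log(JSON.stringify(input, null, 2));
```

Rejecting an input with no mode selected is a design choice of this sketch; the README does not say how the actor itself treats an empty input.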

Quick Start

1. Search by title

Use this when you want to search TMDB by movie, TV show, or person name.

{
  "searchTerms": ["Inception", "Breaking Bad", "Tom Hanks"],
  "searchTypes": ["movie", "tv", "person"],
  "maxResultsPerSearchTerm": 5,
  "maxItems": 100,
  "language": "en-US"
}

2. Scrape direct TMDB URLs

Use this when you already know the TMDB pages you want.

{
  "startUrls": [
    { "url": "https://www.themoviedb.org/movie/27205" },
    { "url": "https://www.themoviedb.org/tv/1399" },
    { "url": "https://www.themoviedb.org/person/31" }
  ],
  "maxItems": 100,
  "language": "en-US"
}
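
The example URLs follow a /{type}/{id} pattern. Before a run, a caller could classify a list of URLs with a helper like the one below. `parseTmdbUrl` is a hypothetical name, and the sketch only recognizes movie, TV, and person detail pages; season, search, and list pages return null here even though the actor accepts them.

```javascript
// Hypothetical helper: classifies a TMDB detail-page URL.
// Returns { entityType, tmdbId } or null for anything it does not recognize.
function parseTmdbUrl(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch (err) {
    return null; // not a valid absolute URL
  }
  if (!/(^|\.)themoviedb\.org$/.test(url.hostname)) return null;

  // Matches /movie/27205 and slugged forms such as /movie/27205-inception,
  // but not nested pages such as /tv/1399/season/1.
  const match = url.pathname.match(/^\/(movie|tv|person)\/(\d+)(?:-[^/]*)?\/?$/);
  if (!match) return null;
  return { entityType: match[1], tmdbId: Number(match[2]) };
}

const urls = [
  'https://www.themoviedb.org/movie/27205',
  'https://www.themoviedb.org/tv/1399',
  'https://www.themoviedb.org/person/31',
];
for (const u of urls) {
  console.log(u, parseTmdbUrl(u));
}
```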

3. Discover titles with AI

Use this when you want the scraper to translate a natural-language prompt into likely TMDB titles first.

Before using AI mode, set these actor environment variables:

  • OPENROUTER_API_KEY
  • OPENROUTER_MODEL

{
  "enableAiSearch": true,
  "aiQuery": "dark detective thrillers with plot twists",
  "searchTypes": ["movie", "tv"],
  "maxResultsPerSearchTerm": 5,
  "maxItems": 150,
  "language": "en-US"
}

Key Input Fields

  • searchTerms: List of titles or names to search on TMDB
  • startUrls: Direct TMDB URLs for movies, TV shows, people, season pages, or search/list pages
  • searchTypes: Filter results to movie, tv, or person
  • maxResultsPerSearchTerm: Max detail pages to enqueue from one search page
  • maxItems: Max processed requests in one run
  • scrapeFullCast: Visit full cast pages and output linked people as separate records
  • scrapeEpisodes: Visit season pages and output season and episode records
  • strictExactSeriesMatch: Force one exact TV-series match per search term
  • language: TMDB language code added to crawled URLs
  • enableAiSearch: Enable AI title generation before scraping
  • aiQuery: Natural-language query for AI mode
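
The language field is described as a code added to crawled URLs, and the tmdbUrl values in the example output carry a ?language=en-US query parameter. A minimal sketch of that transformation, assuming it is a plain query parameter as those examples suggest; `withLanguage` is a hypothetical helper, not the actor's implementation.

```javascript
// Hypothetical helper: appends (or overwrites) the language query parameter.
function withLanguage(rawUrl, language) {
  const url = new URL(rawUrl);
  url.searchParams.set('language', language);
  return url.toString();
}

console.log(withLanguage('https://www.themoviedb.org/movie/27205', 'en-US'));
// prints "https://www.themoviedb.org/movie/27205?language=en-US"
```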

Output

The actor outputs structured dataset records for:

  • movie
  • tv
  • person
  • season
  • episode

Typical fields include:

  • tmdbId
  • tmdbUrl
  • title or name
  • description and overview
  • release or air dates
  • ratings and vote counts
  • genres and keywords
  • posters and backdrops
  • trailer and social links
  • cast, directors, networks, parent show references, and recommendation links

Example Output Shapes

Movie

{
  "entityType": "movie",
  "tmdbId": 27205,
  "title": "Inception",
  "releaseDate": "2010-07-15",
  "ratingAverage": 8.4,
  "genres": ["Action", "Science Fiction", "Adventure"],
  "directors": ["Christopher Nolan"],
  "topCast": ["Leonardo DiCaprio", "Joseph Gordon-Levitt", "Elliot Page"],
  "posterUrl": "https://media.themoviedb.org/t/p/original/....jpg",
  "tmdbUrl": "https://www.themoviedb.org/movie/27205?language=en-US"
}

TV Show

{
  "entityType": "tv",
  "tmdbId": 1399,
  "title": "Game of Thrones",
  "firstAirDate": "2011-04-17",
  "lastAirDate": "2019-05-19",
  "numberOfEpisodes": 73,
  "numberOfSeasons": 8,
  "network": "HBO",
  "tmdbUrl": "https://www.themoviedb.org/tv/1399?language=en-US"
}

Person

{
  "entityType": "person",
  "tmdbId": 31,
  "name": "Tom Hanks",
  "knownFor": "Acting",
  "birthday": "1956-07-09",
  "placeOfBirth": "Concord, California, USA",
  "tmdbUrl": "https://www.themoviedb.org/person/31?language=en-US"
}

Season / Episode

{
  "entityType": "season",
  "title": "Season 1",
  "seasonNumber": 1,
  "releaseDate": "1994-09-22",
  "parentShowTitle": "Friends"
}

{
  "entityType": "episode",
  "title": "Pilot",
  "seasonNumber": 1,
  "episodeNumber": 1,
  "airDate": "1994-09-22",
  "parentShowTitle": "Friends"
}
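
Because all five record types land in one dataset, a consumer will usually split them by entityType first. The sketch below does that on a few trimmed-down records taken from the examples in this section; `groupByEntityType` is a hypothetical helper, and the 'unknown' bucket is an assumption for records that lack the field.

```javascript
// Hypothetical helper: buckets dataset records by their entityType field.
function groupByEntityType(items) {
  const groups = {};
  for (const item of items) {
    const key = item.entityType || 'unknown';
    if (!groups[key]) groups[key] = [];
    groups[key].push(item);
  }
  return groups;
}

// Trimmed-down records based on the example output shapes.
const records = [
  { entityType: 'movie', tmdbId: 27205, title: 'Inception' },
  { entityType: 'season', title: 'Season 1', seasonNumber: 1, parentShowTitle: 'Friends' },
  { entityType: 'episode', title: 'Pilot', seasonNumber: 1, episodeNumber: 1, parentShowTitle: 'Friends' },
];

const grouped = groupByEntityType(records);
for (const [type, list] of Object.entries(grouped)) {
  console.log(`${type}: ${list.length} record(s)`);
}
```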

AI Mode

AI mode uses OpenRouter to convert a prompt into 10 to 20 likely TMDB titles, then the actor runs the normal scraping flow on those titles.

Set these actor environment variables before running AI mode:

  1. OPENROUTER_API_KEY
  2. OPENROUTER_MODEL

Set OPENROUTER_MODEL to the identifier of the OpenRouter model you want to use, for example:

  • openai/gpt-4.1-mini
  • google/gemini-2.5-flash
  • anthropic/claude-3.5-haiku

If AI mode succeeds and generated titles are enqueued, the actor calls:

await Actor.charge({ eventName: 'ai-recommendation' });

This is wrapped in try/catch, so local development will not fail if charging is unavailable.
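
A minimal sketch of that try/catch pattern follows. The exact handling inside the actor is not shown in this README, so `chargeSafely`, its return value, and the warning text are assumptions. The `actor` parameter stands in for the SDK's Actor object so the sketch runs without the Apify SDK installed.

```javascript
// Hypothetical wrapper: attempts the charge and swallows any failure.
// `actor` is any object exposing an async charge({ eventName }) method.
async function chargeSafely(actor, eventName) {
  try {
    await actor.charge({ eventName });
    return true;
  } catch (err) {
    // For example, charging may be unavailable during local development.
    console.warn(`Charging skipped: ${err.message}`);
    return false;
  }
}

// Stub that simulates an environment where charging is unavailable.
const localStub = {
  charge: async () => {
    throw new Error('charging is unavailable');
  },
};

chargeSafely(localStub, 'ai-recommendation').then((charged) => {
  console.log('charged:', charged);
});
```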

Notes and Limitations

  • Metadata coverage is currently best with language: "en-US", because some TMDB fact labels are localized in other languages
  • AI mode runs first; if it returns no valid titles, the actor falls back to normal searchTerms and startUrls
  • AI mode requires both OPENROUTER_API_KEY and OPENROUTER_MODEL to be set in the actor environment
  • strictExactSeriesMatch=true is useful when you only want one exact TV result per search term
  • If both scrapeEpisodes=true and scrapeFullCast=true, strict exact TV-series behavior is also applied automatically
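
Two of the rules above are simple boolean conditions and can be written out as a sketch. `resolveBehavior` and its return shape are hypothetical; the README does not say what the actor does when AI mode is requested without both environment variables, so the sketch only reports the flags.

```javascript
// Hypothetical helper: derives effective flags from the input and environment.
function resolveBehavior(input, env) {
  const aiRequested = input.enableAiSearch === true;
  // AI mode needs both variables set in the actor environment.
  const aiConfigured = Boolean(env.OPENROUTER_API_KEY && env.OPENROUTER_MODEL);
  // Strict matching is explicit, or implied by scrapeEpisodes + scrapeFullCast.
  const strictSeriesMatch =
    input.strictExactSeriesMatch === true ||
    (input.scrapeEpisodes === true && input.scrapeFullCast === true);
  return { aiRequested, aiUsable: aiRequested && aiConfigured, strictSeriesMatch };
}

// In an actor, `env` would typically be process.env; a plain object is used here.
const behavior = resolveBehavior(
  { enableAiSearch: true, scrapeEpisodes: true, scrapeFullCast: true },
  { OPENROUTER_API_KEY: 'example-key' } // OPENROUTER_MODEL is missing
);
console.log(behavior);
```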

Actor Schema Files

This actor includes Apify schema files for input and output:

  • .actor/actor.json
  • .actor/output_schema.json
  • .actor/dataset_schema.json
  • INPUT_SCHEMA.json
  • OUTPUT_SCHEMA.json