IMDb Scraper — Movies, Ratings & Top Charts

Pricing

$10.00 / 1,000 results scraped

Scrape IMDb movie and TV show data without authentication. Extract titles, ratings, genres, cast, directors, runtime, and reviews. Search by keyword, genre, or year. Monitor trending content. Handles IMDb Top 250 and full catalog. Export to JSON/CSV.

Rating: 0.0 (0)

Developer: Web Data Labs

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 3
  • Monthly active users: 1
  • Last modified: 20 hours ago

IMDb Scraper — Movies, TV Shows, Ratings & Charts

Scrape IMDb for comprehensive movie and TV show data. Search for titles, get detailed information (ratings, cast, plot, and box office numbers), or fetch the Top 250 and Most Popular charts, all from the world's largest movie database.

Why scrape IMDb?

IMDb is the definitive source for entertainment data with over 10 million titles and 83 million registered users. Whether you're building a recommendation engine, conducting market research, or analyzing entertainment trends, IMDb data is essential:

  • Movie research & analytics — Track ratings, box office performance, and audience reception over time
  • Content recommendation systems — Build recommendation engines using ratings, genres, and cast data
  • Entertainment industry intelligence — Monitor what's trending, what's climbing the charts, and audience sentiment
  • Academic research — Film studies, cultural analysis, and media research datasets
  • Data journalism — Stories about the film industry backed by comprehensive data
  • Portfolio building — Curate and showcase movie collections with rich metadata
  • Market analysis — Compare box office performance, budget-to-revenue ratios, and genre trends
  • Watchlist curation — Build smart watchlists based on ratings, genres, directors, or cast

Features

1. Search (action: "search")

Search IMDb's catalog of movies and TV shows. Returns structured results with ratings, genres, and poster images.

Input:

{
    "action": "search",
    "query": "Interstellar",
    "type": "movie",
    "maxItems": 10
}

Output fields: imdbId, title, type, year, rating, ratingCount, genres, plot, poster, url

2. Title Details (action: "title")

Get comprehensive details for any movie or TV show. Accepts either an IMDb URL or title ID.

Input (by URL):

{
    "action": "title",
    "url": "https://www.imdb.com/title/tt0816692/"
}

Input (by ID):

{
    "action": "title",
    "url": "tt0816692"
}

Output fields: imdbId, title, type, year, datePublished, contentRating, rating, ratingCount, plot, genres, director, creator, cast (top 5), keywords, poster, url, runtimeMinutes, languages, budget, boxOffice

Example output:

{
    "imdbId": "tt0816692",
    "title": "Interstellar",
    "type": "Movie",
    "year": "2014",
    "datePublished": "2014-11-07",
    "contentRating": "PG-13",
    "rating": 8.7,
    "ratingCount": 2497887,
    "plot": "When Earth becomes uninhabitable in the future, a farmer and ex-NASA pilot is tasked to pilot a spacecraft to find a new planet for humans.",
    "genres": ["Adventure", "Drama", "Sci-Fi"],
    "director": ["Christopher Nolan"],
    "cast": ["Matthew McConaughey", "Anne Hathaway", "Jessica Chastain", "Mackenzie Foy", "Ellen Burstyn"],
    "runtimeMinutes": 169,
    "languages": ["English"],
    "budget": "$165,000,000",
    "boxOffice": "$677,463,813",
    "poster": "https://m.media-amazon.com/images/M/...",
    "url": "https://www.imdb.com/title/tt0816692/"
}
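
The budget and boxOffice values in the example output are formatted strings, so they need to be parsed before any arithmetic such as a budget-to-revenue comparison. A minimal sketch, assuming both fields are whole-dollar strings shaped like the example above; the helper names parse_money and revenue_ratio are hypothetical and not part of the actor:

```python
import re
from typing import Optional


def parse_money(value: Optional[str]) -> Optional[int]:
    """Turn a string such as "$165,000,000" into an integer amount.

    Assumption: the amount is a whole number, so every non-digit character
    (currency symbol, thousands separators) can simply be dropped.
    Returns None when the value is missing or contains no digits.
    """
    if not value:
        return None
    digits = re.sub(r"\D", "", value)
    return int(digits) if digits else None


def revenue_ratio(item: dict) -> Optional[float]:
    """Return boxOffice divided by budget, or None if either is unusable."""
    budget = parse_money(item.get("budget"))
    box_office = parse_money(item.get("boxOffice"))
    if not budget or box_office is None:
        return None
    return box_office / budget


# Values copied from the example output above.
item = {
    "title": "Interstellar",
    "budget": "$165,000,000",
    "boxOffice": "$677,463,813",
}
ratio = revenue_ratio(item)
print(f"{item['title']}: {ratio:.2f}x budget")
```

Returning None instead of raising keeps the helper usable on titles where IMDb lists no budget or box office figure.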

3. Top Charts (action: "top-chart")

Fetch IMDb's curated charts — the legendary Top 250 or the Most Popular movies right now.

Input:

{
    "action": "top-chart",
    "chart": "top250",
    "maxItems": 50
}

Output fields: rank, imdbId, title, type, year, rating, ratingCount, genres, poster, url
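
Chart items can be flattened to CSV using the output fields listed above. The following is a sketch only, not the actor's own export path: the function name chart_items_to_csv is hypothetical, the sample item is made-up placeholder data, and joining genres with "; " is an arbitrary choice for fitting a list into one CSV cell.

```python
import csv
import io

# Column order follows the "Output fields" line for the top-chart action.
CHART_FIELDS = [
    "rank", "imdbId", "title", "type", "year",
    "rating", "ratingCount", "genres", "poster", "url",
]


def chart_items_to_csv(items: list) -> str:
    """Serialize chart items to CSV text, one row per item."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops unexpected keys; missing keys become "".
    writer = csv.DictWriter(buffer, fieldnames=CHART_FIELDS, extrasaction="ignore")
    writer.writeheader()
    for item in items:
        row = dict(item)
        genres = row.get("genres")
        if isinstance(genres, list):
            row["genres"] = "; ".join(genres)  # list -> single cell
        writer.writerow(row)
    return buffer.getvalue()


# Placeholder item for illustration; not real chart data.
sample_items = [
    {
        "rank": 1,
        "imdbId": "tt0000000",
        "title": "Example Title",
        "type": "Movie",
        "year": "2000",
        "rating": 9.0,
        "ratingCount": 1000,
        "genres": ["Drama", "Sci-Fi"],
        "url": "https://www.imdb.com/title/tt0000000/",
    }
]
csv_text = chart_items_to_csv(sample_items)
print(csv_text)
```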

How it works

This actor uses IMDb's embedded structured data for maximum reliability:

  1. JSON-LD (application/ld+json): IMDb embeds Schema.org structured data on every title page, including the name, description, ratings, cast, and director. This is the same kind of markup that search engines such as Google read to build rich results.

  2. __NEXT_DATA__: IMDb's Next.js frontend embeds the full page data as JSON in a script tag. The actor reads it for search results, chart listings, and additional title details (budget, box office, languages, runtime).

This approach tends to be more robust than selecting visible HTML elements, because these embedded JSON structures usually stay the same when the visual design is updated.
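
The two sources described above can be pulled out of a page with a few lines of code. This is a minimal sketch and not the actor's actual implementation: it uses only the standard library (the actor itself lists beautifulsoup4 as a dependency), the function name extract_embedded_json is hypothetical, and the sample HTML is a made-up stand-in for a real IMDb page.

```python
import json
import re

# Captures the attribute string and body of every <script> element.
SCRIPT_RE = re.compile(r"<script\b([^>]*)>(.*?)</script>", re.DOTALL | re.IGNORECASE)


def extract_embedded_json(html: str) -> dict:
    """Return {"json_ld": [...], "next_data": dict or None} from raw HTML."""
    result = {"json_ld": [], "next_data": None}
    for attrs, body in SCRIPT_RE.findall(html):
        try:
            if "application/ld+json" in attrs:
                result["json_ld"].append(json.loads(body))
            elif "__NEXT_DATA__" in attrs:
                result["next_data"] = json.loads(body)
        except json.JSONDecodeError:
            continue  # skip a malformed block rather than fail the whole page
    return result


# Made-up stand-in for a title page; real pages carry far more data.
sample_html = """
<html><head>
<script type="application/ld+json">{"@type": "Movie", "name": "Interstellar"}</script>
</head><body>
<script id="__NEXT_DATA__" type="application/json">{"props": {"pageProps": {}}}</script>
</body></html>
"""

data = extract_embedded_json(sample_html)
print(data["json_ld"][0]["name"])
print(data["next_data"]["props"])
```

Where a given field lives inside __NEXT_DATA__ is not documented here, so any deeper key path would have to be checked against a real response.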

Use cases

Entertainment data pipelines

Build automated pipelines that track new releases, monitor rating changes, or aggregate box office data across hundreds of titles.

Movie recommendation APIs

Power your recommendation engine with rich IMDb metadata — combine ratings, genres, cast overlap, and director filmographies to suggest the perfect next watch.

Film industry dashboards

Create dashboards showing trending movies, genre performance over time, or director/actor career trajectories based on IMDb ratings.

Academic & research datasets

Generate clean, structured datasets for film studies, cultural analysis, NLP training data (plot descriptions), or media consumption research.

Content aggregation

Build entertainment portals, review aggregators, or streaming guide apps with comprehensive movie metadata from IMDb.

Competitive analysis

Track how movies perform relative to their budgets, compare franchise performance, or analyze seasonal release patterns.

Input schema

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| action | string | Yes | "search" | Action to perform: search, title, or top-chart |
| query | string | For search | (none) | Search query string |
| type | string | No | "movie" | Title type filter: movie, tv, or all |
| url | string | For title | (none) | IMDb URL or title ID (e.g., tt0816692) |
| chart | string | No | "top250" | Chart to fetch: top250 or popular |
| maxItems | integer | No | 25 | Maximum results (1-250) |
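
The schema above can be checked on the client side before starting a run. The sketch below only mirrors the documented fields, defaults, and ranges; the function name validate_input is hypothetical, and how the actor itself validates input is not stated in this README.

```python
VALID_ACTIONS = {"search", "title", "top-chart"}
VALID_TYPES = {"movie", "tv", "all"}
VALID_CHARTS = {"top250", "popular"}


def validate_input(run_input: dict) -> dict:
    """Apply the documented defaults and raise ValueError on invalid input."""
    cleaned = {
        "action": run_input.get("action", "search"),
        "type": run_input.get("type", "movie"),
        "chart": run_input.get("chart", "top250"),
        "maxItems": run_input.get("maxItems", 25),
    }
    if cleaned["action"] not in VALID_ACTIONS:
        raise ValueError(f"action must be one of {sorted(VALID_ACTIONS)}")
    if cleaned["type"] not in VALID_TYPES:
        raise ValueError(f"type must be one of {sorted(VALID_TYPES)}")
    if cleaned["chart"] not in VALID_CHARTS:
        raise ValueError(f"chart must be one of {sorted(VALID_CHARTS)}")

    max_items = cleaned["maxItems"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_items, bool) or not isinstance(max_items, int):
        raise ValueError("maxItems must be an integer")
    if not 1 <= max_items <= 250:
        raise ValueError("maxItems must be between 1 and 250")

    # query and url are required only for the action that uses them.
    required_field = {"search": "query", "title": "url"}.get(cleaned["action"])
    if required_field:
        value = run_input.get(required_field)
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"{required_field} is required for action {cleaned['action']!r}")
        cleaned[required_field] = value.strip()
    return cleaned


print(validate_input({"action": "title", "url": "tt0816692"}))
```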

Output format

All results are pushed to the default dataset. Each item is a JSON object whose fields depend on the action used. See the feature sections above for the fields each action returns.

Rate limiting & best practices

  • The actor uses a single HTTP request per search/title/chart operation — no crawling or spidering
  • IMDb pages are fetched with standard browser-like headers
  • For bulk operations, consider adding delays between runs to be respectful of IMDb's servers
  • Results are extracted from structured data (JSON-LD and __NEXT_DATA__), not screen-scraped HTML
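
The advice above about spacing out bulk operations can be sketched as a small helper that pauses between consecutive runs. This is an illustration under assumptions: run_with_delay and fake_run are hypothetical names, the 5-second default is an arbitrary placeholder rather than a documented limit, and in real use start_run would wrap a call such as the apify_client one shown under "Example integrations".

```python
import time
from typing import Callable, Iterable, List


def run_with_delay(
    inputs: Iterable[dict],
    start_run: Callable[[dict], object],
    delay_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[object]:
    """Call start_run once per input, sleeping between consecutive calls."""
    results = []
    for index, run_input in enumerate(inputs):
        if index > 0:
            sleep(delay_seconds)  # no pause before the first run
        results.append(start_run(run_input))
    return results


def fake_run(run_input: dict) -> str:
    """Stand-in for a real actor call, so the sketch runs offline."""
    return f"ran {run_input['action']} for {run_input['url']}"


# The same title ID is reused here only because it is the one in this README.
batch = [
    {"action": "title", "url": "tt0816692"},
    {"action": "title", "url": "https://www.imdb.com/title/tt0816692/"},
]
for line in run_with_delay(batch, fake_run, delay_seconds=0.01):
    print(line)
```

Passing sleep as a parameter keeps the helper easy to test without real waiting.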

Technical details

  • Language: Python 3
  • Dependencies: httpx (async HTTP), beautifulsoup4 (HTML parsing), apify (Actor SDK)
  • Data sources: JSON-LD structured data, Next.js __NEXT_DATA__
  • No browser required — pure HTTP requests, no Playwright or Puppeteer needed
  • Fast execution — typically completes in 1-3 seconds per request

Example integrations

Python

from apify_client import ApifyClient

client = ApifyClient("your_api_token")

# Search for movies
run = client.actor("cryptosignals/imdb-scraper").call(run_input={
    "action": "search",
    "query": "Christopher Nolan",
    "type": "movie",
    "maxItems": 10,
})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{item['title']} ({item['year']}) - {item['rating']}/10")

JavaScript

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'your_api_token' });

const run = await client.actor('cryptosignals/imdb-scraper').call({
    action: 'title',
    url: 'tt0816692',
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items[0]);

cURL (API)

curl "https://api.apify.com/v2/acts/cryptosignals~imdb-scraper/runs?token=YOUR_TOKEN" \
    -X POST \
    -d '{"action": "top-chart", "chart": "top250", "maxItems": 10}' \
    -H 'Content-Type: application/json'

Changelog

  • v0.1 — Initial release: search, title details, and top chart support

Using proxies

IMDb (owned by Amazon) applies bot detection that can block datacenter IPs and rate-limit automated requests, which may show up as CAPTCHAs or 503 errors during bulk scraping. Residential proxies route requests through real ISP addresses, which are less likely to be flagged as automated traffic. ThorData advertises 200M+ residential IPs for this purpose.