IMDb Scraper — Movies, TV Shows, Ratings, Cast & Reviews

Pricing

Pay per usage

Developer

Ricardo Akiyoshi

Maintained by Community

Scrape IMDb for comprehensive movie and TV show data at scale. Extract titles, ratings, cast, crew, reviews, box office figures, awards, and dozens more fields from any search query.

What does IMDb Scraper do?

This actor searches IMDb for movies, TV shows, or any entertainment content and extracts detailed structured data from each title page. It navigates from search results to individual title pages, pulling data from multiple sources on each page (JSON-LD structured data, Next.js hydration payloads, and DOM parsing) for maximum reliability.

Key capabilities

  • Search any keyword, title, person name, or genre on IMDb
  • Filter results by content type: movies only, TV shows only, or all
  • Sort by relevance, IMDb rating, popularity, release date, or vote count
  • Extract 40+ fields per title including ratings, cast, box office, and awards
  • Optionally collect user reviews with ratings, dates, helpfulness votes, and spoiler flags
  • Automatic pagination across search results and review pages
  • Bot detection handling with automatic retries and realistic browser headers
  • Pay-per-event (PPE) pricing: you pay only for successfully scraped titles ($0.004/title)

Use cases

Entertainment data analysis

Build datasets of movie ratings, box office performance, and audience reception across genres, decades, or studios. Compare how different franchises perform, track rating trends over time, or analyze which genres dominate each year.

Content recommendation engines

Feed structured movie and TV show data into recommendation algorithms. Use ratings, genres, cast overlap, director filmographies, and user review sentiment to power personalized content suggestions.

Film industry research

Researchers and journalists can gather comprehensive data on production companies, budgets vs. box office returns, award correlations with ratings, and career trajectories of directors and actors.

Box office tracking and forecasting

Extract budget and worldwide gross data to build financial models. Analyze opening weekend performance vs. total gross, compare marketing spend to returns, and identify patterns in successful releases.

Academic and market research

Universities and research organizations can build large-scale entertainment datasets for studying cultural trends, representation in media, audience preferences across demographics, and the economics of film production.

Competitive intelligence for streaming platforms

Analyze which titles have the highest ratings and vote counts to understand audience demand. Track upcoming releases, monitor content rating distributions, and identify gaps in content libraries.

Sentiment analysis

Extract user reviews with ratings and helpfulness scores to perform natural language processing. Understand audience sentiment, identify common complaints, and track how reception changes over a title's lifetime.

Data journalism

Power visual stories about the film industry with structured data. Create interactive visualizations of Oscar trends, box office records, franchise performance, and genre popularity shifts.

Input configuration

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| searchQuery | String | (required) | Keyword, title, person name, or genre to search |
| type | Enum | "all" | Filter: "movie", "tv", or "all" |
| maxResults | Integer | 50 | Maximum number of titles to scrape (1-5,000) |
| includeReviews | Boolean | false | Also scrape up to 25 user reviews per title |
| sortBy | Enum | "relevance" | Sort: "relevance", "rating", "popularity", "newest", "votes" |
| maxConcurrency | Integer | 3 | Concurrent requests (1-10) |
| maxRetries | Integer | 5 | Retries per failed request (1-10) |
| proxyConfiguration | Object | null | Apify proxy settings (recommended for 100+ titles) |

Example input

{
  "searchQuery": "Christopher Nolan",
  "type": "movie",
  "maxResults": 20,
  "includeReviews": true,
  "sortBy": "rating",
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
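Before starting a run, it can help to check an input object against the parameter table locally. The following is a minimal sketch in Python, not part of the actor: `validate_input` and its constants are hypothetical names, and the ranges and defaults are copied from the table above.

```python
from typing import Any

ALLOWED_TYPES = {"movie", "tv", "all"}
ALLOWED_SORTS = {"relevance", "rating", "popularity", "newest", "votes"}

# Integer parameters: name -> (default, minimum, maximum), taken from the table.
INT_RANGES = {
    "maxResults": (50, 1, 5000),
    "maxConcurrency": (3, 1, 10),
    "maxRetries": (5, 1, 10),
}


def validate_input(raw: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `raw` with defaults applied, or raise ValueError."""
    query = raw.get("searchQuery")
    if not isinstance(query, str) or not query.strip():
        raise ValueError("searchQuery is required and must be a non-empty string")

    result: dict[str, Any] = {
        "searchQuery": query.strip(),
        "type": raw.get("type", "all"),
        "sortBy": raw.get("sortBy", "relevance"),
        "includeReviews": raw.get("includeReviews", False),
        "proxyConfiguration": raw.get("proxyConfiguration"),
    }
    if result["type"] not in ALLOWED_TYPES:
        raise ValueError(f"type must be one of {sorted(ALLOWED_TYPES)}")
    if result["sortBy"] not in ALLOWED_SORTS:
        raise ValueError(f"sortBy must be one of {sorted(ALLOWED_SORTS)}")
    if not isinstance(result["includeReviews"], bool):
        raise ValueError("includeReviews must be a boolean")

    for name, (default, low, high) in INT_RANGES.items():
        value = raw.get(name, default)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or not low <= value <= high:
            raise ValueError(f"{name} must be an integer in [{low}, {high}]")
        result[name] = value
    return result
```

Passing the example input above through `validate_input` should return it with `maxConcurrency` and `maxRetries` filled in from their defaults.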

Output format

Each scraped title produces a JSON object with the following fields:

{
  "imdbId": "tt0468569",
  "url": "https://www.imdb.com/title/tt0468569/",
  "title": "The Dark Knight",
  "alternateTitle": "",
  "year": "2008",
  "type": "movie",
  "contentRating": "PG-13",
  "imdbRating": 9.0,
  "imdbVotes": 2800000,
  "metacriticScore": 84,
  "genres": ["Action", "Crime", "Drama"],
  "director": "Christopher Nolan",
  "directors": [
    {
      "name": "Christopher Nolan",
      "imdbId": "nm0634240",
      "url": "https://www.imdb.com/name/nm0634240/"
    }
  ],
  "creators": [
    {
      "name": "Jonathan Nolan",
      "imdbId": "nm0634300",
      "url": "https://www.imdb.com/name/nm0634300/"
    },
    {
      "name": "Christopher Nolan",
      "imdbId": "nm0634240",
      "url": "https://www.imdb.com/name/nm0634240/"
    }
  ],
  "cast": [
    {
      "name": "Christian Bale",
      "imdbId": "nm0000288",
      "url": "https://www.imdb.com/name/nm0000288/"
    },
    {
      "name": "Heath Ledger",
      "imdbId": "nm0005132",
      "url": "https://www.imdb.com/name/nm0005132/"
    },
    {
      "name": "Aaron Eckhart",
      "imdbId": "nm0001173",
      "url": "https://www.imdb.com/name/nm0001173/"
    }
  ],
  "runtime": "2h 32min",
  "runtimeSeconds": 9120,
  "plot": "When a menace known as the Joker wreaks havoc and chaos on the people of Gotham, Batman, James Gordon and Harvey Dent must work together to put an end to the madness.",
  "posterUrl": "https://m.media-amazon.com/images/M/...",
  "budget": "$185,000,000",
  "openingWeekendUSA": "$158,411,483",
  "grossUSA": "$533,345,358",
  "cumulativeWorldwideGross": "$1,006,234,167",
  "awards": "Won 2 Oscars. 163 wins & 164 nominations total",
  "awardsWins": 163,
  "awardsNominations": 164,
  "oscarWins": 2,
  "oscarNominations": null,
  "releaseDate": "July 18, 2008 (United States)",
  "languages": ["English", "Mandarin"],
  "countries": ["United States", "United Kingdom"],
  "productionCompanies": [
    {
      "name": "Warner Bros.",
      "imdbId": "co0002663",
      "url": "https://www.imdb.com/company/co0002663/"
    },
    {
      "name": "Legendary Entertainment",
      "imdbId": "co0159111",
      "url": "https://www.imdb.com/company/co0159111/"
    }
  ],
  "filmingLocations": ["Chicago, Illinois, USA", "London, England, UK"],
  "alsoKnownAs": "",
  "keywords": ["superhero", "dc comics", "joker", "batman", "sequel"],
  "totalSeasons": null,
  "totalEpisodes": null,
  "trailerUrl": "",
  "trailerName": "",
  "productionStatus": "",
  "scrapedAt": "2026-03-02T12:00:00.000Z"
}
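The monetary fields in this sample (`budget`, `grossUSA`, `cumulativeWorldwideGross`) are formatted strings, not numbers, so financial analysis needs a parsing step first. Below is a minimal Python sketch; `parse_usd` and `gross_to_budget_ratio` are hypothetical helpers, and the sketch assumes every amount is a US dollar figure with comma separators as in the sample. Values in other currencies or formats would need extra handling.

```python
import re
from typing import Optional


def parse_usd(amount: Optional[str]) -> Optional[int]:
    """Turn a string such as "$185,000,000" into 185000000.

    Returns None when the string holds no digits (for example an empty field).
    Assumes whole-dollar amounts; the currency symbol is discarded.
    """
    digits = re.sub(r"\D", "", amount or "")
    return int(digits) if digits else None


def gross_to_budget_ratio(title: dict) -> Optional[float]:
    """Worldwide gross divided by budget, or None if either value is missing."""
    budget = parse_usd(title.get("budget"))
    gross = parse_usd(title.get("cumulativeWorldwideGross"))
    if not budget or gross is None:
        return None
    return gross / budget


sample = {"budget": "$185,000,000", "cumulativeWorldwideGross": "$1,006,234,167"}
print(round(gross_to_budget_ratio(sample), 2))  # 5.44
```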

Reviews output (when includeReviews is enabled)

Reviews are stored as separate dataset entries linked by imdbId:

{
  "imdbId": "tt0468569",
  "parentTitle": "The Dark Knight",
  "dataType": "reviews",
  "reviewCount": 25,
  "averageReviewRating": 8.7,
  "reviews": [
    {
      "reviewTitle": "A masterpiece of modern cinema",
      "reviewRating": 10,
      "reviewText": "Heath Ledger's performance as the Joker is...",
      "author": "moviefan123",
      "authorUrl": "https://www.imdb.com/user/ur12345678/",
      "date": "18 July 2008",
      "helpfulVotes": 1234,
      "totalVotes": 1456,
      "hasSpoiler": false
    }
  ],
  "scrapedAt": "2026-03-02T12:00:05.000Z"
}
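Because title records and review records share one dataset, downstream code usually has to separate them and join them on `imdbId`. The sketch below shows one way to do that in Python. It assumes that only review records carry `"dataType": "reviews"` (the sample title record has no `dataType` field); `split_records` and `helpful_ratio` are hypothetical names.

```python
from collections import defaultdict
from typing import Optional


def split_records(items: list[dict]) -> tuple[dict, dict]:
    """Separate a mixed dataset into titles and reviews, both keyed by imdbId."""
    titles: dict[str, dict] = {}
    reviews_by_id: dict[str, list] = defaultdict(list)
    for item in items:
        if item.get("dataType") == "reviews":
            reviews_by_id[item["imdbId"]].extend(item.get("reviews", []))
        else:
            titles[item["imdbId"]] = item
    return titles, dict(reviews_by_id)


def helpful_ratio(review: dict) -> Optional[float]:
    """Share of voters who found the review helpful, or None with no votes."""
    total = review.get("totalVotes") or 0
    if total == 0:
        return None
    return (review.get("helpfulVotes") or 0) / total


items = [
    {"imdbId": "tt0468569", "title": "The Dark Knight"},
    {
        "imdbId": "tt0468569",
        "dataType": "reviews",
        "reviews": [{"reviewRating": 10, "helpfulVotes": 1234, "totalVotes": 1456}],
    },
]
titles, reviews = split_records(items)
for imdb_id, title in titles.items():
    print(title["title"], len(reviews.get(imdb_id, [])))  # The Dark Knight 1
```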

Extraction strategies

The scraper uses three independent extraction strategies and merges results for maximum reliability:

  1. JSON-LD (schema.org): IMDb embeds structured schema.org/Movie or schema.org/TVSeries data in <script type="application/ld+json"> tags. This is the most standardized and reliable source for core fields like title, rating, director, cast, and description.

  2. __NEXT_DATA__: IMDb is built with Next.js and includes a <script id="__NEXT_DATA__"> tag with the server-side rendered data payload. This contains the most complete data, including runtime in seconds, Metacritic scores, production status, season/episode counts, countries, languages, and filming locations.

  3. DOM parsing: Direct Cheerio-based HTML scraping using data-testid attributes and CSS selectors. This serves as the final fallback and is the most resilient to changes in the embedded data structures, since it reads what the user sees. It tries multiple selectors per field.
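To make the first two strategies concrete, the sketch below pulls both embedded JSON payloads out of a page using only the Python standard library. It is an illustration, not the actor's code (which the description above says is Cheerio-based); `ScriptPayloadParser` and `extract_payloads` are hypothetical names, and the sketch keeps only payloads whose top level is a JSON object.

```python
import json
from html.parser import HTMLParser
from typing import Optional


class ScriptPayloadParser(HTMLParser):
    """Collect JSON-LD blocks and the __NEXT_DATA__ payload from an HTML page."""

    def __init__(self) -> None:
        super().__init__()
        self.json_ld: list[dict] = []
        self.next_data: Optional[dict] = None
        self._kind: Optional[str] = None  # which payload we are currently inside
        self._buffer: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        attributes = dict(attrs)
        if attributes.get("type") == "application/ld+json":
            self._kind = "json_ld"
        elif attributes.get("id") == "__NEXT_DATA__":
            self._kind = "next_data"
        else:
            self._kind = None
        self._buffer = []

    def handle_data(self, data):
        # Script content can arrive in several chunks, so buffer it.
        if self._kind is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != "script" or self._kind is None:
            return
        try:
            payload = json.loads("".join(self._buffer))
        except json.JSONDecodeError:
            payload = None  # malformed JSON: skip and rely on other strategies
        if isinstance(payload, dict):
            if self._kind == "json_ld":
                self.json_ld.append(payload)
            else:
                self.next_data = payload
        self._kind = None


def extract_payloads(html: str) -> tuple[list[dict], Optional[dict]]:
    """Return (json_ld_blocks, next_data) for one page of HTML."""
    parser = ScriptPayloadParser()
    parser.feed(html)
    parser.close()
    return parser.json_ld, parser.next_data
```

If either payload is missing or malformed, the function returns an empty list or `None` for it, which is the point at which a DOM fallback would take over.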

The merge function takes the best available value for each field using the priority order JSON-LD > __NEXT_DATA__ > DOM.
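A priority merge of this kind can be sketched in a few lines of Python. This is an assumption about how such a merge might work, not the actor's implementation: `merge_by_priority` is a hypothetical name, and treating `None`, empty strings, and empty containers as "missing" is a choice made for the example.

```python
from typing import Any

# Values treated as "not found" so that a lower-priority source can fill them in.
EMPTY_VALUES = (None, "", [], {})


def merge_by_priority(*sources: dict[str, Any]) -> dict[str, Any]:
    """Merge field dicts; earlier sources win, empty values never count."""
    merged: dict[str, Any] = {}
    for source in sources:
        for field, value in source.items():
            if value in EMPTY_VALUES:
                continue
            # setdefault keeps the value from the highest-priority source.
            merged.setdefault(field, value)
    return merged


json_ld = {"title": "The Dark Knight", "imdbRating": 9.0, "runtimeSeconds": None}
next_data = {"title": "The Dark Knight (2008)", "runtimeSeconds": 9120, "metacriticScore": 84}
dom = {"contentRating": "PG-13", "metacriticScore": None}

record = merge_by_priority(json_ld, next_data, dom)
print(record["title"], record["runtimeSeconds"], record["contentRating"])
# The Dark Knight 9120 PG-13
```

Note that a numeric `0` is kept as a real value here; only the listed empty values are skipped.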

Performance and limits

| Metric | Value |
| --- | --- |
| Speed (with proxies) | 15-40 titles/minute |
| Speed (without proxies) | 5-15 titles/minute |
| Max results | 5,000 titles per run |
| Reviews per title | Up to 25 |
| Concurrency | 1-10 (default: 3) |
| Retries | 1-10 per request (default: 5) |
| Bot detection | Automatic retry with header rotation |
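The "automatic retry with header rotation" behaviour can be illustrated with a small retry loop. The Python sketch below is a generic pattern under stated assumptions, not the actor's code: `fetch_with_retries` is a hypothetical name, the `fetch` callable is injected by the caller, the set of retryable status codes is a guess, and the exponential backoff is an addition for the example.

```python
import itertools
import time
from typing import Callable, Sequence

# Assumed signals of blocking or rate limiting; adjust to what the site returns.
RETRYABLE_STATUSES = {403, 429, 503}


def fetch_with_retries(
    fetch: Callable[[str, dict], tuple[int, str]],
    url: str,
    user_agents: Sequence[str],
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[int, str]:
    """Call fetch(url, headers), rotating the User-Agent on each attempt.

    `fetch` is any function returning (status_code, body). The delay doubles
    after each blocked attempt: base_delay, 2 * base_delay, 4 * base_delay, ...
    """
    agents = itertools.cycle(user_agents)
    for attempt in range(max_retries + 1):
        headers = {"User-Agent": next(agents), "Accept-Language": "en-US,en;q=0.9"}
        status, body = fetch(url, headers)
        if status not in RETRYABLE_STATUSES:
            return status, body
        if attempt < max_retries:
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"gave up on {url} after {max_retries} retries")
```

Injecting `fetch` and `sleep` keeps the loop testable without network access; in real use `fetch` would wrap an HTTP client of your choice.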

Cost estimation (PPE pricing)

| Titles | Reviews | Estimated cost |
| --- | --- | --- |
| 10 | No | $0.04 |
| 50 | No | $0.20 |
| 100 | Yes | $0.40 |
| 500 | No | $2.00 |
| 1,000 | Yes | $4.00 |
| 5,000 | No | $20.00 |

Apify platform usage costs (compute units, proxy bandwidth) are charged in addition to these amounts.
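The figures in the cost table follow from the per-title price alone: each row equals the title count times $0.004, with or without reviews. The Python sketch below reproduces that arithmetic and adds a rough duration range from the speeds listed under "Performance and limits". `estimate_cost` and `estimate_minutes` are hypothetical helpers, and platform usage costs are not included.

```python
PRICE_PER_TITLE_USD = 0.004  # PPE price stated on this page


def estimate_cost(titles: int) -> float:
    """PPE charge in USD for a number of successfully scraped titles."""
    return round(titles * PRICE_PER_TITLE_USD, 2)


def estimate_minutes(titles: int, with_proxies: bool) -> tuple[float, float]:
    """Return (best_case, worst_case) run time in minutes.

    Uses the throughput ranges from the performance table:
    15-40 titles/minute with proxies, 5-15 without.
    """
    slow, fast = (15, 40) if with_proxies else (5, 15)
    return titles / fast, titles / slow


print(estimate_cost(500))             # 2.0
print(estimate_minutes(1000, True))   # (25.0, 66.66666666666667)
```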

Tips for best results

  • Use proxies for runs with 100+ titles. IMDb may rate-limit after many requests from the same IP.
  • Start small with 10-20 titles to verify the data you need, then scale up.
  • Filter by type in the actor input (set type to "movie" or "tv"). This is more efficient than scraping an "all" result set and filtering it afterwards.
  • Reviews add time — each title with reviews enabled requires an extra page load. Only enable when you need sentiment data.
  • Concurrency of 3 is a good balance between speed and reliability. Going above 5 increases the risk of rate limiting.
  • Sort by votes to get the most well-known titles first — useful when you want popular content for recommendation engines.

Integrations

Export your results to any format or destination:

  • JSON, CSV, Excel — download directly from the Apify dataset
  • Google Sheets — automatic sync via Apify integrations
  • Webhooks — trigger downstream processing when the run completes
  • API — access results programmatically via the Apify API
  • Zapier / Make — connect to 5,000+ apps for automated workflows
  • Amazon S3, Google Cloud Storage — store large datasets in the cloud
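For programmatic access, results can be read from the run's dataset over HTTP. The Python sketch below builds a request against the Apify API's dataset items endpoint using only the standard library. The endpoint path and the `format`, `clean`, and `limit` query parameters reflect the public Apify API as I understand it and should be checked against the current API reference; `dataset_items_url` and `fetch_items` are hypothetical helper names, and the dataset ID and token are placeholders you supply.

```python
import json
import urllib.request
from typing import Optional
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id: str, fmt: str = "json", limit: Optional[int] = None) -> str:
    """Build the URL for downloading items from a dataset in a given format."""
    params = {"format": fmt, "clean": "true"}
    if limit is not None:
        params["limit"] = str(limit)
    return f"{API_BASE}/datasets/{dataset_id}/items?{urlencode(params)}"


def fetch_items(dataset_id: str, token: str) -> list:
    """Download all items as parsed JSON (performs a network request)."""
    request = urllib.request.Request(
        dataset_items_url(dataset_id),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


print(dataset_items_url("YOUR_DATASET_ID", fmt="csv", limit=10))
# https://api.apify.com/v2/datasets/YOUR_DATASET_ID/items?format=csv&clean=true&limit=10
```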

Changelog

v1.0.0 (2026-03-02)

  • Initial release
  • Three-strategy extraction (JSON-LD, __NEXT_DATA__, DOM)
  • Search, title detail, and reviews page handling
  • PPE pricing at $0.004 per title
  • Rotation across 12 User-Agent strings, plus realistic browser headers
  • Support for movies, TV shows, and all content types
  • Box office, awards, production company extraction
  • Automatic pagination and bot detection handling

License

MIT License. Built by Sovereign AI.