IMDb Scraper — Movies, TV Shows, Ratings, Cast & Reviews
Pricing
Pay per usage
Developer: Ricardo Akiyoshi
Scrape IMDb for comprehensive movie and TV show data at scale. Extract titles, ratings, cast, crew, reviews, box office figures, awards, and dozens more fields from any search query.
What does IMDb Scraper do?
This actor searches IMDb for movies, TV shows, or any entertainment content and extracts detailed structured data from each title page. It navigates from search results to individual title pages, pulling data from multiple sources on each page (JSON-LD structured data, Next.js hydration payloads, and DOM parsing) for maximum reliability.
Key capabilities
- Search any keyword, title, person name, or genre on IMDb
- Filter results by content type: movies only, TV shows only, or all
- Sort by relevance, IMDb rating, popularity, release date, or vote count
- Extract 40+ fields per title including ratings, cast, box office, and awards
- User reviews with ratings, dates, helpfulness votes, and spoiler flags
- Automatic pagination across search results and review pages
- Bot detection handling with automatic retries and realistic browser headers
- Pay-per-event (PPE) pricing: pay only for successfully scraped titles ($0.004 per title)
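The bot-detection handling listed above can be pictured with a small Python sketch. It is not the actor's code: the User-Agent strings, the set of "blocked" status codes, the exponential backoff, and the `fetch_with_retries` name are all assumptions made for illustration. The `fetch` argument is any callable that returns a `(status_code, body)` pair, so the retry logic can be tried without touching the network.

```python
import itertools
import time
from typing import Callable, Sequence

# Hypothetical User-Agent pool; the actor's real list is not documented here.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Example/1.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 14_0) Example/1.0",
    "Mozilla/5.0 (X11; Linux x86_64) Example/1.0",
]

# Status codes treated as "blocked" in this sketch (an assumption).
BLOCKED_STATUSES = {403, 429, 503}


def fetch_with_retries(
    url: str,
    fetch: Callable[[str, dict], tuple],
    max_retries: int = 5,
    user_agents: Sequence[str] = USER_AGENTS,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch(url, headers) until it returns a non-blocked status.

    One initial attempt is followed by up to `max_retries` retries. Each
    attempt uses the next User-Agent in the pool, and the wait between
    attempts doubles every time.
    """
    agents = itertools.cycle(user_agents)
    for attempt in range(max_retries + 1):
        headers = {
            "User-Agent": next(agents),
            "Accept-Language": "en-US,en;q=0.9",
        }
        status, body = fetch(url, headers)
        if status not in BLOCKED_STATUSES:
            return body
        if attempt < max_retries:
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"Blocked after {max_retries} retries: {url}")
```

The default of five retries mirrors the `maxRetries` default in the input table; everything else is a free design choice.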
Use cases
Entertainment data analysis
Build datasets of movie ratings, box office performance, and audience reception across genres, decades, or studios. Compare how different franchises perform, track rating trends over time, or analyze which genres dominate each year.
Content recommendation engines
Feed structured movie and TV show data into recommendation algorithms. Use ratings, genres, cast overlap, director filmographies, and user review sentiment to power personalized content suggestions.
Film industry research
Researchers and journalists can gather comprehensive data on production companies, budgets vs. box office returns, award correlations with ratings, and career trajectories of directors and actors.
Box office tracking and forecasting
Extract budget and worldwide gross data to build financial models. Analyze opening weekend performance vs. total gross, compare marketing spend to returns, and identify patterns in successful releases.
Academic and market research
Universities and research organizations can build large-scale entertainment datasets for studying cultural trends, representation in media, audience preferences across demographics, and the economics of film production.
Competitive intelligence for streaming platforms
Analyze which titles have the highest ratings and vote counts to understand audience demand. Track upcoming releases, monitor content rating distributions, and identify gaps in content libraries.
Sentiment analysis
Extract user reviews with ratings and helpfulness scores to perform natural language processing. Understand audience sentiment, identify common complaints, and track how reception changes over a title's lifetime.
Data journalism
Power visual stories about the film industry with structured data. Create interactive visualizations of Oscar trends, box office records, franchise performance, and genre popularity shifts.
Input configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| `searchQuery` | String | Required | Keyword, title, person name, or genre to search |
| `type` | Enum | "all" | Filter: "movie", "tv", or "all" |
| `maxResults` | Integer | 50 | Maximum number of titles to scrape (1-5,000) |
| `includeReviews` | Boolean | false | Also scrape up to 25 user reviews per title |
| `sortBy` | Enum | "relevance" | Sort: "relevance", "rating", "popularity", "newest", "votes" |
| `maxConcurrency` | Integer | 3 | Concurrent requests (1-10) |
| `maxRetries` | Integer | 5 | Retries per failed request (1-10) |
| `proxyConfiguration` | Object | null | Apify proxy settings (recommended for 100+ titles) |
Example input
{
  "searchQuery": "Christopher Nolan",
  "type": "movie",
  "maxResults": 20,
  "includeReviews": true,
  "sortBy": "rating",
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
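Before starting a run, it can help to check an input against the defaults and ranges in the parameter table. The following is a minimal Python sketch of such a check; `build_input`, `DEFAULTS`, and `RANGES` are hypothetical helper names and not part of the actor, which may validate its input differently.

```python
import json

ALLOWED_TYPES = {"movie", "tv", "all"}
ALLOWED_SORTS = {"relevance", "rating", "popularity", "newest", "votes"}

# Defaults and ranges copied from the input configuration table.
DEFAULTS = {
    "type": "all",
    "maxResults": 50,
    "includeReviews": False,
    "sortBy": "relevance",
    "maxConcurrency": 3,
    "maxRetries": 5,
    "proxyConfiguration": None,
}
RANGES = {"maxResults": (1, 5000), "maxConcurrency": (1, 10), "maxRetries": (1, 10)}


def build_input(search_query, **overrides):
    """Return a run input dict, raising ValueError on invalid values."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"Unknown parameters: {sorted(unknown)}")
    if not search_query or not search_query.strip():
        raise ValueError("searchQuery is required")

    run_input = {"searchQuery": search_query.strip(), **DEFAULTS, **overrides}

    if run_input["type"] not in ALLOWED_TYPES:
        raise ValueError(f"type must be one of {sorted(ALLOWED_TYPES)}")
    if run_input["sortBy"] not in ALLOWED_SORTS:
        raise ValueError(f"sortBy must be one of {sorted(ALLOWED_SORTS)}")
    for name, (low, high) in RANGES.items():
        value = run_input[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or not low <= value <= high:
            raise ValueError(f"{name} must be an integer in {low}-{high}")
    return run_input


print(json.dumps(build_input("Christopher Nolan", type="movie", maxResults=20), indent=2))
```

Unspecified parameters fall back to the documented defaults, so the printed object can be pasted into the actor's input editor as-is.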
Output format
Each scraped title produces a JSON object with the following fields:
{
  "imdbId": "tt0468569",
  "url": "https://www.imdb.com/title/tt0468569/",
  "title": "The Dark Knight",
  "alternateTitle": "",
  "year": "2008",
  "type": "movie",
  "contentRating": "PG-13",
  "imdbRating": 9.0,
  "imdbVotes": 2800000,
  "metacriticScore": 84,
  "genres": ["Action", "Crime", "Drama"],
  "director": "Christopher Nolan",
  "directors": [
    {"name": "Christopher Nolan", "imdbId": "nm0634240", "url": "https://www.imdb.com/name/nm0634240/"}
  ],
  "creators": [
    {"name": "Jonathan Nolan", "imdbId": "nm0634300", "url": "https://www.imdb.com/name/nm0634300/"},
    {"name": "Christopher Nolan", "imdbId": "nm0634240", "url": "https://www.imdb.com/name/nm0634240/"}
  ],
  "cast": [
    {"name": "Christian Bale", "imdbId": "nm0000288", "url": "https://www.imdb.com/name/nm0000288/"},
    {"name": "Heath Ledger", "imdbId": "nm0005132", "url": "https://www.imdb.com/name/nm0005132/"},
    {"name": "Aaron Eckhart", "imdbId": "nm0001173", "url": "https://www.imdb.com/name/nm0001173/"}
  ],
  "runtime": "2h 32min",
  "runtimeSeconds": 9120,
  "plot": "When a menace known as the Joker wreaks havoc and chaos on the people of Gotham, Batman, James Gordon and Harvey Dent must work together to put an end to the madness.",
  "posterUrl": "https://m.media-amazon.com/images/M/...",
  "budget": "$185,000,000",
  "openingWeekendUSA": "$158,411,483",
  "grossUSA": "$533,345,358",
  "cumulativeWorldwideGross": "$1,006,234,167",
  "awards": "Won 2 Oscars. 163 wins & 164 nominations total",
  "awardsWins": 163,
  "awardsNominations": 164,
  "oscarWins": 2,
  "oscarNominations": null,
  "releaseDate": "July 18, 2008 (United States)",
  "languages": ["English", "Mandarin"],
  "countries": ["United States", "United Kingdom"],
  "productionCompanies": [
    {"name": "Warner Bros.", "imdbId": "co0002663", "url": "https://www.imdb.com/company/co0002663/"},
    {"name": "Legendary Entertainment", "imdbId": "co0159111", "url": "https://www.imdb.com/company/co0159111/"}
  ],
  "filmingLocations": ["Chicago, Illinois, USA", "London, England, UK"],
  "alsoKnownAs": "",
  "keywords": ["superhero", "dc comics", "joker", "batman", "sequel"],
  "totalSeasons": null,
  "totalEpisodes": null,
  "trailerUrl": "",
  "trailerName": "",
  "productionStatus": "",
  "scrapedAt": "2026-03-02T12:00:00.000Z"
}
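Money fields such as `budget` and `cumulativeWorldwideGross` arrive as formatted strings, so numeric analysis needs a conversion step first. The Python sketch below shows one way to do that and to compute a gross-to-budget ratio. `parse_money` and `gross_to_budget` are hypothetical helpers, and the parser assumes whole amounts in a single currency, as in the sample above; it does not handle decimals or currency conversion.

```python
import re


def parse_money(value):
    """Convert a string such as "$185,000,000" to an int.

    Returns None for empty or digit-free values. Every non-digit character
    is dropped, so this only suits whole amounts in one currency.
    """
    if not value:
        return None
    digits = re.sub(r"\D", "", value)
    return int(digits) if digits else None


def gross_to_budget(title):
    """Return worldwide gross divided by budget, or None if either is missing."""
    budget = parse_money(title.get("budget"))
    gross = parse_money(title.get("cumulativeWorldwideGross"))
    if not budget or gross is None:
        return None
    return gross / budget


# Values taken from the sample output for The Dark Knight.
sample = {"budget": "$185,000,000", "cumulativeWorldwideGross": "$1,006,234,167"}
print(f"{gross_to_budget(sample):.2f}x")  # prints "5.44x"
```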
Reviews output (when includeReviews is enabled)
Reviews are stored as separate dataset entries linked by imdbId:
{
  "imdbId": "tt0468569",
  "parentTitle": "The Dark Knight",
  "dataType": "reviews",
  "reviewCount": 25,
  "averageReviewRating": 8.7,
  "reviews": [
    {
      "reviewTitle": "A masterpiece of modern cinema",
      "reviewRating": 10,
      "reviewText": "Heath Ledger's performance as the Joker is...",
      "author": "moviefan123",
      "authorUrl": "https://www.imdb.com/user/ur12345678/",
      "date": "18 July 2008",
      "helpfulVotes": 1234,
      "totalVotes": 1456,
      "hasSpoiler": false
    }
  ],
  "scrapedAt": "2026-03-02T12:00:05.000Z"
}
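Because review entries are separate dataset items, downstream code usually has to join them back to their titles. The Python sketch below does that join on `imdbId` and derives a helpfulness ratio per review. It assumes that review entries are exactly the items with `"dataType": "reviews"` and that title entries do not carry that value; `attach_reviews` and `helpful_ratio` are hypothetical helper names.

```python
def attach_reviews(items):
    """Join review entries onto their title entries by imdbId.

    Returns a list of title dicts, each with a "reviews" list (empty when
    no review entry exists for that title).
    """
    titles = {}
    reviews_by_id = {}
    for item in items:
        if item.get("dataType") == "reviews":
            reviews_by_id[item["imdbId"]] = item.get("reviews", [])
        else:
            titles[item["imdbId"]] = dict(item)  # copy to avoid mutating input
    for imdb_id, title in titles.items():
        title["reviews"] = reviews_by_id.get(imdb_id, [])
    return list(titles.values())


def helpful_ratio(review):
    """Return helpfulVotes / totalVotes, or None when nobody has voted."""
    total = review.get("totalVotes") or 0
    if total == 0:
        return None
    return review.get("helpfulVotes", 0) / total


# Trimmed-down items based on the two sample outputs.
dataset = [
    {"imdbId": "tt0468569", "title": "The Dark Knight"},
    {
        "imdbId": "tt0468569",
        "dataType": "reviews",
        "reviews": [{"reviewRating": 10, "helpfulVotes": 1234, "totalVotes": 1456}],
    },
]
for title in attach_reviews(dataset):
    for review in title["reviews"]:
        print(title["title"], round(helpful_ratio(review), 3))
```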
Extraction strategies
The scraper uses three independent extraction strategies and merges results for maximum reliability:
- JSON-LD (schema.org): IMDb embeds structured `schema.org/Movie` or `schema.org/TVSeries` data in `<script type="application/ld+json">` tags. This is the most standardized and reliable source for core fields like title, rating, director, cast, and description.
- `__NEXT_DATA__`: IMDb is built with Next.js and includes a `<script id="__NEXT_DATA__">` tag with the server-side rendered data payload. This contains the most complete data, including runtime in seconds, Metacritic scores, production status, season/episode counts, countries, languages, and filming locations.
- DOM parsing: Direct Cheerio-based HTML scraping using `data-testid` attributes and CSS selectors. This serves as the ultimate fallback and is most resilient to data structure changes since it reads what the user sees. It handles multiple selector strategies per field.
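The first two strategies both come down to locating a `<script>` tag and parsing its contents as JSON. The Python sketch below shows that step with only the standard library. The actor itself is described as Cheerio-based, so this is a swapped-in illustration of the idea and not its implementation; `ScriptPayloadParser`, `extract_payloads`, and the sample HTML are made up for the example.

```python
import json
from html.parser import HTMLParser


class ScriptPayloadParser(HTMLParser):
    """Collect JSON from ld+json scripts and from the __NEXT_DATA__ script."""

    def __init__(self):
        super().__init__()
        self.json_ld = []      # one entry per ld+json script tag
        self.next_data = None  # parsed __NEXT_DATA__ payload, if present
        self._target = None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        attr = dict(attrs)
        if attr.get("type") == "application/ld+json":
            self._target = "json_ld"
        elif attr.get("id") == "__NEXT_DATA__":
            self._target = "next_data"
        if self._target:
            self._buffer = []

    def handle_data(self, data):
        if self._target:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != "script" or not self._target:
            return
        try:
            payload = json.loads("".join(self._buffer))
        except json.JSONDecodeError:
            payload = None  # malformed JSON: skip and rely on other sources
        if payload is not None:
            if self._target == "json_ld":
                self.json_ld.append(payload)
            else:
                self.next_data = payload
        self._target = None


def extract_payloads(html):
    """Return (list of JSON-LD objects, __NEXT_DATA__ object or None)."""
    parser = ScriptPayloadParser()
    parser.feed(html)
    parser.close()
    return parser.json_ld, parser.next_data


# Made-up page fragment for demonstration only.
page = """<html><head>
<script type="application/ld+json">{"@type": "Movie", "name": "The Dark Knight"}</script>
</head><body>
<script id="__NEXT_DATA__" type="application/json">{"props": {"pageProps": {}}}</script>
</body></html>"""
json_ld, next_data = extract_payloads(page)
print(json_ld[0]["name"], list(next_data))
```

Malformed JSON in either tag is skipped here, which matches the spirit of having independent fallbacks.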
The merge function takes the best available value for each field using the priority order JSON-LD > `__NEXT_DATA__` > DOM: a field is filled from the highest-priority source that provides a value for it.
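A priority merge of this kind can be sketched in a few lines of Python. The version below assumes "best available" means the first non-empty value in priority order, and that each strategy yields a flat dict of fields; the actor's real merge function may apply different rules. `merge_sources` and `is_empty` are hypothetical names.

```python
# Source names in descending priority, following the order stated above.
PRIORITY = ("json_ld", "next_data", "dom")


def is_empty(value):
    """Treat None, empty strings, and empty containers as missing.

    Zero and False count as real values.
    """
    return value is None or value == "" or value == [] or value == {}


def merge_sources(sources):
    """Merge per-source field dicts, keeping the highest-priority non-empty value."""
    merged = {}
    for name in PRIORITY:
        for field, value in sources.get(name, {}).items():
            if is_empty(merged.get(field)) and not is_empty(value):
                merged[field] = value
    return merged


example = {
    "json_ld": {"title": "The Dark Knight", "runtimeSeconds": None},
    "next_data": {"title": "Dark Knight", "runtimeSeconds": 9120, "metacriticScore": 84},
    "dom": {"contentRating": "PG-13", "metacriticScore": 80},
}
print(merge_sources(example))
```

In this example the title comes from JSON-LD, the runtime and Metacritic score from `__NEXT_DATA__` (JSON-LD had no usable value), and the content rating from the DOM.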
Performance and limits
| Metric | Value |
|---|---|
| Speed (with proxies) | 15-40 titles/minute |
| Speed (without proxies) | 5-15 titles/minute |
| Max results | 5,000 titles per run |
| Reviews per title | Up to 25 |
| Concurrency | 1-10 (default: 3) |
| Retries | 1-10 per request (default: 5) |
| Bot detection | Automatic retry with header rotation |
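The speed ranges in the table give a rough way to bound how long a run will take. The Python sketch below does that arithmetic; `estimate_minutes` is a hypothetical helper, and the estimate ignores the extra page loads that reviews require, so treat it as a lower bound when `includeReviews` is on.

```python
# Titles per minute (slowest, fastest), taken from the performance table.
SPEED_TITLES_PER_MINUTE = {
    True: (15, 40),   # with proxies
    False: (5, 15),   # without proxies
}


def estimate_minutes(titles, with_proxies=True):
    """Return (best_case, worst_case) run time in minutes for `titles` titles."""
    slowest, fastest = SPEED_TITLES_PER_MINUTE[with_proxies]
    return titles / fastest, titles / slowest


best, worst = estimate_minutes(1000, with_proxies=True)
print(f"1,000 titles with proxies: {best:.0f}-{worst:.0f} minutes")
```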
Cost estimation (PPE pricing)
| Titles | Reviews | Estimated cost |
|---|---|---|
| 10 | No | $0.04 |
| 50 | No | $0.20 |
| 100 | Yes | $0.40 |
| 500 | No | $2.00 |
| 1,000 | Yes | $4.00 |
| 5,000 | No | $20.00 |
Plus Apify platform usage costs (compute units, proxy bandwidth).
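Each row in the table is the number of titles multiplied by $0.004, independent of whether reviews are enabled. A short Python sketch of that calculation follows, using `Decimal` to avoid floating-point rounding on small amounts. `estimate_cost` is a hypothetical helper and covers only the per-title charge, not platform usage.

```python
from decimal import Decimal

# Per-title price stated in this README.
PRICE_PER_TITLE = Decimal("0.004")


def estimate_cost(titles):
    """Return the per-title charge in USD for a run of `titles` titles."""
    if titles < 0:
        raise ValueError("titles must be non-negative")
    return PRICE_PER_TITLE * titles


for count in (10, 50, 100, 500, 1000, 5000):
    print(f"{count:>5} titles: ${estimate_cost(count):.2f}")
```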
Tips for best results
- Use proxies for runs with 100+ titles. IMDb may rate-limit after many requests from the same IP.
- Start small with 10-20 titles to verify the data you need, then scale up.
- Type filtering at the actor level is more efficient than filtering a large "all" result set.
- Reviews add time — each title with reviews enabled requires an extra page load. Only enable when you need sentiment data.
- Concurrency of 3 is a good balance between speed and reliability. Going above 5 increases the risk of rate limiting.
- Sort by votes to get the most well-known titles first — useful when you want popular content for recommendation engines.
Integrations
Export your results to any format or destination:
- JSON, CSV, Excel — download directly from the Apify dataset
- Google Sheets — automatic sync via Apify integrations
- Webhooks — trigger downstream processing when the run completes
- API — access results programmatically via the Apify API
- Zapier / Make — connect to 5,000+ apps for automated workflows
- Amazon S3, Google Cloud Storage — store large datasets in the cloud
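For programmatic access, results can be pulled from the run's dataset through the Apify API. The Python sketch below builds the dataset items URL and downloads the items with the standard library. The dataset ID and token are placeholders you supply yourself, `dataset_items_url` and `fetch_items` are hypothetical helper names, and the endpoint shape should be checked against the current Apify API reference before relying on it.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id, fmt="json", limit=None):
    """Build the URL for a dataset's items in the given export format."""
    params = {"format": fmt}
    if limit is not None:
        params["limit"] = limit
    return f"{API_BASE}/datasets/{dataset_id}/items?{urlencode(params)}"


def fetch_items(dataset_id, token, limit=None):
    """Download dataset items as parsed JSON (performs a network request)."""
    request = urllib.request.Request(
        dataset_items_url(dataset_id, "json", limit),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# "abc123" is a placeholder dataset ID, not a real one.
print(dataset_items_url("abc123", fmt="csv", limit=10))
```

Only the URL builder runs in the example; call `fetch_items` with a real dataset ID and API token to retrieve data.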
Changelog
v1.0.0 (2026-03-02)
- Initial release
- Three-strategy extraction (JSON-LD, NEXT_DATA, DOM)
- Search, title detail, and reviews page handling
- PPE pricing at $0.004 per title
- Rotation across 12 User-Agent strings, with realistic browser headers
- Support for movies, TV shows, and all content types
- Box office, awards, production company extraction
- Automatic pagination and bot detection handling
License
MIT License. Built by Sovereign AI.
Related Actors
- Google Search Scraper — Search for movies/shows
- YouTube Scraper — Trailer data
- Reddit Scraper — Movie discussions
- Twitter Scraper — Movie buzz on social