# IMDb Scraper — Movies, Ratings & Top Charts (`cryptosignals/imdb-scraper`) Actor

Scrapes IMDb movie and TV show data. Returns titles, ratings, cast, plot summaries, release dates, box office figures, and review scores.

- **URL**: https://apify.com/cryptosignals/imdb-scraper.md
- **Developed by:** [CryptoSignals Agent](https://apify.com/cryptosignals) (community)
- **Categories:** Developer tools, Lead generation, E-commerce
- **Stats:** 3 total users, 1 monthly user, 88.5% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor itself is free to use; you pay only for Apify platform usage, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## IMDb Scraper — Movies, TV Shows, Ratings & Charts

Scrape IMDb for comprehensive movie and TV show data. Search for titles, get detailed information including ratings, cast, plot, box office numbers, or fetch the Top 250 and Most Popular charts — all from the world's largest movie database.

### Why scrape IMDb?

IMDb is the definitive source for entertainment data with **over 10 million titles** and **83 million registered users**. Whether you're building a recommendation engine, conducting market research, or analyzing entertainment trends, IMDb data is essential:

- **Movie research & analytics** — Track ratings, box office performance, and audience reception over time
- **Content recommendation systems** — Build recommendation engines using ratings, genres, and cast data
- **Entertainment industry intelligence** — Monitor what's trending, what's climbing the charts, and audience sentiment
- **Academic research** — Film studies, cultural analysis, and media research datasets
- **Data journalism** — Stories about the film industry backed by comprehensive data
- **Portfolio building** — Curate and showcase movie collections with rich metadata
- **Market analysis** — Compare box office performance, budget-to-revenue ratios, and genre trends
- **Watchlist curation** — Build smart watchlists based on ratings, genres, directors, or cast

### Features

#### 1. Search (`action: "search"`)

Search IMDb's catalog of movies and TV shows. Returns structured results with ratings, genres, and poster images.

**Input:**
```json
{
    "action": "search",
    "query": "Interstellar",
    "type": "movie",
    "maxItems": 10
}
````

**Output fields:** `imdbId`, `title`, `type`, `year`, `rating`, `ratingCount`, `genres`, `plot`, `poster`, `url`

#### 2. Title Details (`action: "title"`)

Get comprehensive details for any movie or TV show. Accepts either an IMDb URL or title ID.

**Input (by URL):**

```json
{
    "action": "title",
    "url": "https://www.imdb.com/title/tt0816692/"
}
```

**Input (by ID):**

```json
{
    "action": "title",
    "url": "tt0816692"
}
```
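
Because the `url` field accepts either a full IMDb URL or a bare title ID, callers may want to normalize both forms before deduplicating or caching requests. The sketch below is one way to do that on the client side. The helper names are hypothetical, and the pattern assumes title IDs are `tt` followed by at least seven digits; it does not describe how the Actor itself parses the field.

```python
import re

# Assumption: IMDb title IDs are "tt" followed by seven or more digits.
_TITLE_ID = re.compile(r"tt\d{7,}")


def normalize_title_id(url_or_id: str) -> str:
    """Return the bare title ID found in a URL or ID string (hypothetical helper)."""
    match = _TITLE_ID.search(url_or_id.strip())
    if match is None:
        raise ValueError(f"no IMDb title ID found in {url_or_id!r}")
    return match.group(0)


def canonical_title_url(url_or_id: str) -> str:
    """Build the title page URL in the same shape as the example above."""
    return f"https://www.imdb.com/title/{normalize_title_id(url_or_id)}/"


print(normalize_title_id("https://www.imdb.com/title/tt0816692/"))
print(canonical_title_url("tt0816692"))
```

Both calls resolve to the same title, so either form can be used as a cache key after normalization.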

**Output fields:** `imdbId`, `title`, `type`, `year`, `datePublished`, `contentRating`, `rating`, `ratingCount`, `plot`, `genres`, `director`, `creator`, `cast` (top 5), `keywords`, `poster`, `url`, `runtimeMinutes`, `languages`, `budget`, `boxOffice`

**Example output:**

```json
{
    "imdbId": "tt0816692",
    "title": "Interstellar",
    "type": "Movie",
    "year": "2014",
    "datePublished": "2014-11-07",
    "contentRating": "PG-13",
    "rating": 8.7,
    "ratingCount": 2497887,
    "plot": "When Earth becomes uninhabitable in the future, a farmer and ex-NASA pilot is tasked to pilot a spacecraft to find a new planet for humans.",
    "genres": ["Adventure", "Drama", "Sci-Fi"],
    "director": ["Christopher Nolan"],
    "cast": ["Matthew McConaughey", "Anne Hathaway", "Jessica Chastain", "Mackenzie Foy", "Ellen Burstyn"],
    "runtimeMinutes": 169,
    "languages": ["English"],
    "budget": "$165,000,000",
    "boxOffice": "$677,463,813",
    "poster": "https://m.media-amazon.com/images/M/...",
    "url": "https://www.imdb.com/title/tt0816692/"
}
```
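
In the example output, `budget` and `boxOffice` are formatted strings rather than numbers, so any budget-to-revenue comparison needs a parsing step first. The following is a minimal sketch that assumes both values are whole US-dollar amounts formatted like the example; other currencies or missing values would need extra handling. The function names are illustrative only.

```python
import re
from typing import Optional


def parse_usd(amount: str) -> int:
    """Strip everything except digits, e.g. "$165,000,000" becomes 165000000."""
    digits = re.sub(r"\D", "", amount)
    if not digits:
        raise ValueError(f"no digits found in {amount!r}")
    return int(digits)


def revenue_multiple(item: dict) -> Optional[float]:
    """Return boxOffice divided by budget, or None if either field is absent."""
    budget = item.get("budget")
    box_office = item.get("boxOffice")
    if not budget or not box_office:
        return None
    return parse_usd(box_office) / parse_usd(budget)


# Values copied from the example output above.
item = {"title": "Interstellar", "budget": "$165,000,000", "boxOffice": "$677,463,813"}
print(round(revenue_multiple(item), 2))  # 4.11
```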

#### 3. Top Charts (`action: "top-chart"`)

Fetch IMDb's curated charts — the legendary Top 250 or the Most Popular movies right now.

**Input:**

```json
{
    "action": "top-chart",
    "chart": "top250",
    "maxItems": 50
}
```

**Output fields:** `rank`, `imdbId`, `title`, `type`, `year`, `rating`, `ratingCount`, `genres`, `poster`, `url`

### How it works

This Actor uses IMDb's **embedded structured data** for maximum reliability:

1. **JSON-LD** (`application/ld+json`) — IMDb embeds Schema.org structured data in every title page containing name, description, ratings, cast, director, and more. This is the same data Google uses for rich search results.

2. **`__NEXT_DATA__`** — IMDb's Next.js frontend embeds the full page data as JSON, which we extract for search results, chart listings, and additional title details (budget, box office, languages, runtime).

This approach is significantly more reliable than HTML parsing because structured data formats rarely change even when the visual design is updated.
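
The two sources described above can be pulled out of a page with very little code. The sketch below uses only the Python standard library and runs against a made-up HTML sample; it is not the Actor's actual implementation (which lists `beautifulsoup4` as a dependency), and the class name and the sample's JSON contents are assumptions for illustration.

```python
import json
from html.parser import HTMLParser


class EmbeddedJsonParser(HTMLParser):
    """Collect JSON-LD blocks and the __NEXT_DATA__ payload from <script> tags."""

    def __init__(self):
        super().__init__()
        self.json_ld = []      # every application/ld+json block, parsed
        self.next_data = None  # parsed __NEXT_DATA__ payload, if present
        self._kind = None      # which kind of script we are inside, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        attr = dict(attrs)
        if attr.get("type") == "application/ld+json":
            self._kind = "json_ld"
        elif attr.get("id") == "__NEXT_DATA__":
            self._kind = "next_data"
        self._buffer = []

    def handle_data(self, data):
        # Script bodies may arrive in several chunks, so buffer them.
        if self._kind is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != "script" or self._kind is None:
            return
        payload = json.loads("".join(self._buffer))
        if self._kind == "json_ld":
            self.json_ld.append(payload)
        else:
            self.next_data = payload
        self._kind = None


# Made-up sample page; real IMDb pages carry far more data.
SAMPLE = """<html><head>
<script type="application/ld+json">{"@type": "Movie", "name": "Interstellar"}</script>
</head><body>
<script id="__NEXT_DATA__" type="application/json">{"props": {"pageProps": {}}}</script>
</body></html>"""

parser = EmbeddedJsonParser()
parser.feed(SAMPLE)
parser.close()
print(parser.json_ld[0]["name"])
print(sorted(parser.next_data))
```

Only the script tags are inspected, which is why a visual redesign of the surrounding markup would not affect this kind of extraction.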

### Use cases

#### Entertainment data pipelines

Build automated pipelines that track new releases, monitor rating changes, or aggregate box office data across hundreds of titles.

#### Movie recommendation APIs

Power your recommendation engine with rich IMDb metadata — combine ratings, genres, cast overlap, and director filmographies to suggest the perfect next watch.

#### Film industry dashboards

Create dashboards showing trending movies, genre performance over time, or director/actor career trajectories based on IMDb ratings.

#### Academic & research datasets

Generate clean, structured datasets for film studies, cultural analysis, NLP training data (plot descriptions), or media consumption research.

#### Content aggregation

Build entertainment portals, review aggregators, or streaming guide apps with comprehensive movie metadata from IMDb.

#### Competitive analysis

Track how movies perform relative to their budgets, compare franchise performance, or analyze seasonal release patterns.

### Input schema

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `action` | string | Yes | `"search"` | Action to perform: `search`, `title`, or `top-chart` |
| `query` | string | For search | — | Search query string |
| `type` | string | No | `"movie"` | Title type filter: `movie`, `tv`, or `all` |
| `url` | string | For title | — | IMDb URL or title ID (e.g., `tt0816692`) |
| `chart` | string | No | `"top250"` | Chart to fetch: `top250` or `popular` |
| `maxItems` | integer | No | `25` | Maximum results (1-250) |
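
Checking an input object against the table before starting a run can avoid paying for a run that fails right away. The sketch below encodes the rules from the table as a client-side check; `validate_input` is a hypothetical helper, and the Actor's own validation may be stricter or looser.

```python
from typing import List

ACTIONS = {"search", "title", "top-chart"}
TYPES = {"movie", "tv", "all"}
CHARTS = {"top250", "popular"}


def validate_input(run_input: dict) -> List[str]:
    """Return a list of problems; an empty list means the input looks valid."""
    errors = []
    action = run_input.get("action")
    if action not in ACTIONS:
        errors.append("action must be one of: search, title, top-chart")
    if action == "search" and not run_input.get("query"):
        errors.append("query is required for the search action")
    if action == "title" and not run_input.get("url"):
        errors.append("url is required for the title action")
    if "type" in run_input and run_input["type"] not in TYPES:
        errors.append("type must be one of: movie, tv, all")
    if "chart" in run_input and run_input["chart"] not in CHARTS:
        errors.append("chart must be one of: top250, popular")
    max_items = run_input.get("maxItems")
    if max_items is not None:
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_int = isinstance(max_items, int) and not isinstance(max_items, bool)
        if not is_int or not 1 <= max_items <= 250:
            errors.append("maxItems must be an integer between 1 and 250")
    return errors


print(validate_input({"action": "search", "query": "Interstellar", "maxItems": 10}))
print(validate_input({"action": "title", "maxItems": 500}))
```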

### Output format

All results are pushed to the default dataset. Each item is a JSON object with fields specific to the action used. See the feature sections above for detailed field descriptions.

### Rate limiting & best practices

- The Actor makes a single HTTP request per search, title, or chart operation, with no crawling or spidering
- IMDb pages are fetched with standard browser-like headers
- For bulk operations, consider adding delays between runs to be respectful of IMDb's servers
- Results are extracted from structured data (JSON-LD and `__NEXT_DATA__`), not screen-scraped HTML
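
For the bulk case mentioned above, a small wrapper that pauses between runs is often enough. This is a sketch under stated assumptions: `run_paced` is a hypothetical helper, the two-second default is an arbitrary placeholder rather than a recommended value, and `run_one` stands in for whatever actually starts a run (for example a function wrapping `client.actor(...).call(run_input=...)`).

```python
import time
from typing import Any, Callable, Iterable, List


def run_paced(
    inputs: Iterable[dict],
    run_one: Callable[[dict], Any],
    delay_seconds: float = 2.0,  # arbitrary placeholder, tune for your workload
    sleep: Callable[[float], None] = time.sleep,
) -> List[Any]:
    """Call run_one for each input, sleeping between consecutive calls."""
    results = []
    for index, run_input in enumerate(inputs):
        if index > 0:
            sleep(delay_seconds)  # no pause before the first run
        results.append(run_one(run_input))
    return results


# Stand-in for a real Actor call so the sketch runs without network access.
def fake_run(run_input: dict) -> str:
    return f"ran title lookup for {run_input['url']}"


batch = [{"action": "title", "url": tid} for tid in ("tt0816692", "tt1375666")]
for line in run_paced(batch, fake_run, delay_seconds=0):
    print(line)
```

Passing `sleep` as a parameter keeps the helper easy to test, since a test can substitute a function that records the requested delays instead of waiting.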

### Technical details

- **Language:** Python 3
- **Dependencies:** `httpx` (async HTTP), `beautifulsoup4` (HTML parsing), `apify` (Actor SDK)
- **Data sources:** JSON-LD structured data, Next.js `__NEXT_DATA__`
- **No browser required** — pure HTTP requests, no Playwright or Puppeteer needed
- **Fast execution** — typically completes in 1-3 seconds per request

### Example integrations

#### Python

```python
from apify_client import ApifyClient

client = ApifyClient("your_api_token")

# Search for movies
run = client.actor("cryptosignals/imdb-scraper").call(run_input={
    "action": "search",
    "query": "Christopher Nolan",
    "type": "movie",
    "maxItems": 10,
})
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{item['title']} ({item['year']}) - {item['rating']}/10")
```

#### JavaScript

```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'your_api_token' });

const run = await client.actor('cryptosignals/imdb-scraper').call({
    action: 'title',
    url: 'tt0816692',
});
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items[0]);
```

#### cURL (API)

```bash
curl "https://api.apify.com/v2/acts/cryptosignals~imdb-scraper/runs?token=YOUR_TOKEN" \
  -X POST \
  -d '{"action": "top-chart", "chart": "top250", "maxItems": 10}' \
  -H 'Content-Type: application/json'
```

### Changelog

- **v0.1** — Initial release: search, title details, and top chart support

### Using proxies

IMDb (owned by Amazon) applies sophisticated bot detection that blocks datacenter IPs and rate-limits automated requests, returning CAPTCHAs or 503 errors during bulk scraping. Residential proxies use real ISP addresses that IMDb's detection systems treat as normal browser traffic. [ThorData](https://thordata.partnerstack.com/partner/0a0x4nzjr3ky) provides 200M+ residential IPs that reliably bypass Amazon's anti-bot infrastructure.

# Actor input Schema

## `action` (type: `string`):

What to do: search for titles, get details for a specific title, or fetch a top chart.

## `query` (type: `string`):

Search term (for 'search' action). Example: 'Interstellar', 'Breaking Bad'.

## `type` (type: `string`):

Filter search results by type.

## `url` (type: `string`):

IMDb title URL or ID (for 'title' action). Example: 'https://www.imdb.com/title/tt0816692/' or 'tt0816692'.

## `chart` (type: `string`):

Which chart to fetch (for 'top-chart' action).

## `maxItems` (type: `integer`):

Maximum number of results to return.

## Actor input object example

```json
{
  "action": "search",
  "query": "Interstellar",
  "type": "movie",
  "chart": "top250",
  "maxItems": 5
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "action": "search",
    "query": "Interstellar",
    "type": "movie",
    "url": "",
    "chart": "top250",
    "maxItems": 5
};

// Run the Actor and wait for it to finish
const run = await client.actor("cryptosignals/imdb-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "action": "search",
    "query": "Interstellar",
    "type": "movie",
    "url": "",
    "chart": "top250",
    "maxItems": 5,
}

# Run the Actor and wait for it to finish
run = client.actor("cryptosignals/imdb-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "action": "search",
  "query": "Interstellar",
  "type": "movie",
  "url": "",
  "chart": "top250",
  "maxItems": 5
}' |
apify call cryptosignals/imdb-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=cryptosignals/imdb-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "IMDb Scraper — Movies, Ratings & Top Charts",
        "description": "Scrapes IMDb movie and TV show data. Returns titles, ratings, cast, plot summaries, release dates, box office figures, and review scores.",
        "version": "0.1",
        "x-build-id": "WUlCzBwlKWlzfURT1"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/cryptosignals~imdb-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-cryptosignals-imdb-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/cryptosignals~imdb-scraper/runs": {
            "post": {
                "operationId": "runs-sync-cryptosignals-imdb-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/cryptosignals~imdb-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-cryptosignals-imdb-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "action"
                ],
                "properties": {
                    "action": {
                        "title": "Action",
                        "enum": [
                            "search",
                            "title",
                            "top-chart"
                        ],
                        "type": "string",
                        "description": "What to do: search for titles, get details for a specific title, or fetch a top chart.",
                        "default": "search"
                    },
                    "query": {
                        "title": "Search Query",
                        "type": "string",
                        "description": "Search term (for 'search' action). Example: 'Interstellar', 'Breaking Bad'.",
                        "default": "Interstellar"
                    },
                    "type": {
                        "title": "Title Type",
                        "enum": [
                            "movie",
                            "tv",
                            "all"
                        ],
                        "type": "string",
                        "description": "Filter search results by type.",
                        "default": "movie"
                    },
                    "url": {
                        "title": "IMDb URL or ID",
                        "type": "string",
                        "description": "IMDb title URL or ID (for 'title' action). Example: 'https://www.imdb.com/title/tt0816692/' or 'tt0816692'."
                    },
                    "chart": {
                        "title": "Chart",
                        "enum": [
                            "top250",
                            "popular"
                        ],
                        "type": "string",
                        "description": "Which chart to fetch (for 'top-chart' action).",
                        "default": "top250"
                    },
                    "maxItems": {
                        "title": "Max Items",
                        "minimum": 1,
                        "maximum": 250,
                        "type": "integer",
                        "description": "Maximum number of results to return.",
                        "default": 5
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
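
The `run-sync-get-dataset-items` path in the specification above can also be called without a client library. The sketch below only builds the request with the Python standard library; nothing is sent until the object is passed to `urllib.request.urlopen`. The helper name is hypothetical, and because the specification declares no response schema for this path, treating the response body as a JSON array of items is an assumption.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"  # server URL from the specification
ACTOR_PATH = "cryptosignals~imdb-scraper"


def build_sync_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build a POST request for the run-sync-get-dataset-items endpoint."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_sync_request("<YOUR_API_TOKEN>", {"action": "title", "url": "tt0816692"})
print(request.get_method(), request.full_url)

# To actually send it (requires a real token and network access):
# with urllib.request.urlopen(request) as response:
#     items = json.loads(response.read())  # assumed to be a JSON array
```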
