# Met Museum Open Access Art Scraper (`jungle_synthesizer/met-museum-open-access-art-scraper`) Actor

Scrape The Met's open-access collection API. Returns full metadata for 490k+ objects: title, artist, department, medium, dimensions, classification, culture, credit line, AAT tags, and CC0 image URLs. Modes: keyword search, department walk, incremental, and direct object-ID lookup.

- **URL**: https://apify.com/jungle_synthesizer/met-museum-open-access-art-scraper.md
- **Developed by:** [BowTiedRaccoon](https://apify.com/jungle_synthesizer) (community)
- **Categories:** AI, Developer tools
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per event

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see the Apify documentation as a [Markdown index](https://docs.apify.com/llms.txt) or as [Markdown full text](https://docs.apify.com/llms-full.txt).


# README

## Met Museum Open Access Art Scraper

Scrape artwork records from The Metropolitan Museum of Art's open-access collection API. Returns full metadata for 490k+ objects including title, artist, department, medium, dimensions, classification, culture, period, credit line, AAT subject tags, and CC0 image URLs.

### What you get

Each scraped record contains:

| Field | Description |
|-------|-------------|
| `object_id` | Numeric Met Museum object ID |
| `accession_number` | Museum accession number |
| `accession_year` | Year the object was accessioned |
| `is_public_domain` | CC0 licence flag |
| `is_highlight` | Met highlight designation |
| `is_on_view` | Currently on display |
| `title` | Object title |
| `artist_display_name` | Artist name |
| `artist_display_bio` | Artist biographical note |
| `artist_nationality` | Artist nationality |
| `artist_begin_date` | Artist birth year |
| `artist_end_date` | Artist death year |
| `artist_wikidata_url` | Artist Wikidata URL |
| `artist_ulan_url` | Artist Getty ULAN URL |
| `object_date` | Creation date (human-readable) |
| `object_begin_date` | Earliest creation year (integer) |
| `object_end_date` | Latest creation year (integer) |
| `medium` | Materials (e.g. "Oil on canvas") |
| `dimensions` | Physical dimensions |
| `classification` | General classification of the artwork (e.g. "Paintings", "Prints") |
| `department` | Museum department |
| `culture` | Culture of origin |
| `period` | Historical period |
| `dynasty` | Dynasty (for antiquities) |
| `credit_line` | Acquisition credit line |
| `gallery_number` | Gallery number |
| `object_name` | Physical object type (e.g. "Painting", "Vase") |
| `primary_image_url` | Full-resolution CC0 image URL |
| `primary_image_small_url` | Web-size CC0 image URL |
| `additional_image_urls` | Pipe-separated additional image URLs |
| `tags` | Pipe-separated AAT subject tags |
| `object_url` | Met collection page URL |
| `object_wikidata_url` | Object Wikidata URL |
| `metadata_date` | Date the record was last updated |
| `repository` | Repository name |
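
Because `additional_image_urls` and `tags` arrive as pipe-separated strings, most consumers will want to split them into lists. A minimal Python sketch, assuming the separator is a bare `|` and that an empty or missing value means "no entries". The helper name `split_pipe` and the sample values are illustrative, not part of the Actor.

```python
from typing import List, Optional


def split_pipe(value: Optional[str]) -> List[str]:
    """Split a pipe-separated field into a list of trimmed, non-empty parts."""
    if not value:
        return []
    return [part.strip() for part in value.split("|") if part.strip()]


# Illustrative record; the values are made up for the example.
record = {
    "tags": "Landscapes|Trees| Cypresses",
    "additional_image_urls": "",
}

tags = split_pipe(record.get("tags"))
extra_images = split_pipe(record.get("additional_image_urls"))
print(tags)
print(extra_images)
```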

### Modes

#### Search mode (default)

Discover objects by keyword query with optional filters:

```json
{
  "mode": "search",
  "query": "van gogh",
  "hasImages": true,
  "departmentId": 11,
  "dateBegin": 1880,
  "dateEnd": 1920,
  "maxItems": 100
}
````

Filters available in search mode: `departmentId`, `hasImages`, `isHighlight`, `isOnView`, `dateBegin`, `dateEnd`, `geoLocation`, `medium`.

#### Walk mode

Iterate the full collection or a department without a keyword:

```json
{
  "mode": "walk",
  "departmentId": 11,
  "maxItems": 500
}
```

Omit `departmentId` to walk the entire 490k+ object collection.

#### Incremental mode

Fetch only objects whose metadata was updated on or after a date:

```json
{
  "mode": "incremental",
  "metadataDate": "2026-01-01",
  "maxItems": 1000
}
```

Use this for scheduled runs that pick up newly added or corrected records.
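
For a scheduled run, the `metadataDate` value can be derived from the date of the previous successful run. A minimal Python sketch, assuming you persist that date yourself. The helper `incremental_input` and the one-day overlap are illustrative choices, not Actor features.

```python
from datetime import date, timedelta


def incremental_input(last_run: date, overlap_days: int = 1, max_items: int = 1000) -> dict:
    """Build an incremental-mode input that starts slightly before the last run."""
    since = last_run - timedelta(days=overlap_days)
    return {
        "mode": "incremental",
        "metadataDate": since.isoformat(),  # YYYY-MM-DD, as the input schema expects
        "maxItems": max_items,
    }


print(incremental_input(date(2026, 1, 2)))
```

Since the filter is "on or after", an overlap re-fetches some records, so deduplicate on `object_id` downstream if you use one.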

#### By-ID mode

Fetch specific objects by their Met object ID:

```json
{
  "mode": "by_ids",
  "objectIds": ["436535", "436529", "436944"],
  "maxItems": 50
}
```
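
Each mode relies on a different input field: `query` for search, `metadataDate` for incremental, and `objectIds` for by-ID lookups. Checking this before starting a run can catch an input that would return nothing. The Python sketch below assumes these are the only mode-specific requirements (taken from the input schema descriptions). `check_input` is a hypothetical helper, and the Actor's own validation may differ.

```python
import re

# Field each mode needs, per the input schema descriptions (assumed complete).
REQUIRED_BY_MODE = {
    "search": "query",
    "walk": None,
    "incremental": "metadataDate",
    "by_ids": "objectIds",
}


def check_input(run_input: dict) -> list:
    """Return a list of problems found in run_input; an empty list means it looks usable."""
    problems = []
    mode = run_input.get("mode", "search")  # "search" is the schema default
    if mode not in REQUIRED_BY_MODE:
        return [f"unknown mode: {mode!r}"]
    field = REQUIRED_BY_MODE[mode]
    if field and not run_input.get(field):
        problems.append(f"mode={mode} needs a non-empty {field!r}")
    meta = run_input.get("metadataDate")
    if meta and not re.fullmatch(r"\d{4}-\d{2}-\d{2}", meta):
        problems.append("metadataDate must be YYYY-MM-DD")
    if "maxItems" not in run_input:
        problems.append("maxItems is required")
    return problems


print(check_input({"mode": "by_ids", "objectIds": ["436535"], "maxItems": 50}))
print(check_input({"mode": "search", "maxItems": 10}))
```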

### Department IDs

| ID | Department |
|----|-----------|
| 1 | American Wing |
| 3 | Ancient West Asian Art |
| 4 | Arms and Armor |
| 5 | Arts of Africa, Oceania, and the Americas |
| 6 | Asian Art |
| 7 | The Cloisters |
| 8 | The Costume Institute |
| 9 | Drawings and Prints |
| 10 | Egyptian Art |
| 11 | European Paintings |
| 12 | European Sculpture and Decorative Arts |
| 13 | Greek and Roman Art |
| 14 | Islamic Art |
| 15 | The Robert Lehman Collection |
| 16 | The Libraries |
| 17 | Medieval Art |
| 18 | Musical Instruments |
| 19 | Photographs |
| 21 | Modern Art |

### Notes on image URLs

Image URLs point to the Met's CDN (`images.metmuseum.org`) and are served under the CC0 licence for public-domain objects. This Actor exports URL strings only; it does not download or rehost pixel data.
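
Downloading is therefore left to the consumer. The Python sketch below picks download targets from dataset items: it keeps records flagged `is_public_domain` that have a `primary_image_url`, and derives a local filename from `object_id` plus the URL's extension. It assumes `is_public_domain` is truthy for CC0 objects (the exact output type is not documented here). `image_targets` is a hypothetical helper and the sample URL is made up.

```python
import os
from urllib.parse import urlparse


def image_targets(items):
    """Yield (url, filename) pairs for public-domain records that have an image."""
    for item in items:
        url = item.get("primary_image_url")
        if not url or not item.get("is_public_domain"):
            continue
        # Fall back to .jpg when the URL path carries no extension (an assumption).
        ext = os.path.splitext(urlparse(url).path)[1] or ".jpg"
        yield url, f"{item['object_id']}{ext}"


# Illustrative items; real ones come from the run's dataset.
items = [
    {"object_id": 1, "is_public_domain": True,
     "primary_image_url": "https://images.metmuseum.org/example/original/sample.jpg"},
    {"object_id": 2, "is_public_domain": False, "primary_image_url": ""},
]
print(list(image_targets(items)))
```

Each URL can then be fetched with the HTTP client of your choice.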

### Data source

Data is sourced from the Metropolitan Museum of Art's public collection API at `collectionapi.metmuseum.org`. No authentication is required. The Met requests polite use, and this Actor operates well below the advised cap of 80 requests per second.

# Actor input Schema

## `sp_intended_usage` (type: `string`):

Please describe how you plan to use the data extracted by this crawler.

## `sp_improvement_suggestions` (type: `string`):

Provide any feedback or suggestions for improvements.

## `sp_contact` (type: `string`):

Provide your email address so we can get in touch with you.

## `mode` (type: `string`):

How to discover objects: "search" (keyword + filters), "walk" (full collection or department), "incremental" (changed since a date), or "by_ids" (explicit object ID list).

## `query` (type: `string`):

Keyword query string (required when mode=search). Examples: "van gogh", "impressionism", "japanese woodblock".

## `departmentId` (type: `integer`):

Filter by department. European Paintings=11, Drawings and Prints=9, European Sculpture and Decorative Arts=12, Modern Art=21, Greek and Roman Art=13, Egyptian Art=10, Asian Art=6, American Wing=1. Leave blank for all. See the Department IDs table for the full list.

## `hasImages` (type: `boolean`):

When true, only return objects that have a primary image (applies to search and walk modes).

## `isHighlight` (type: `boolean`):

When true, only return objects designated as Met highlights (applies to search and walk modes).

## `isOnView` (type: `boolean`):

When true, only return objects currently on view at the museum.

## `dateBegin` (type: `integer`):

Earliest object creation year (search mode only). Example: 1800.

## `dateEnd` (type: `integer`):

Latest object creation year (search mode only). Example: 1900.

## `geoLocation` (type: `string`):

Filter by geographic location (search mode only). Example: "France", "Japan", "Egypt".

## `medium` (type: `string`):

Filter by medium (search mode only). Example: "oil on canvas", "watercolor", "woodblock".

## `metadataDate` (type: `string`):

Return only objects updated on or after this date (mode=incremental). Format: YYYY-MM-DD. Example: 2026-01-01.

## `objectIds` (type: `array`):

Explicit list of Met object IDs to fetch (mode=by_ids). Example: `["436535", "436529"]`.

## `maxItems` (type: `integer`):

Maximum number of artwork records to return.

## Actor input object example

```json
{
  "sp_intended_usage": "Describe your intended use...",
  "sp_improvement_suggestions": "Share your suggestions here...",
  "sp_contact": "Share your email here...",
  "mode": "search",
  "query": "van gogh",
  "hasImages": false,
  "isHighlight": false,
  "isOnView": false,
  "objectIds": [
    "436535",
    "436529"
  ],
  "maxItems": 10
}
```

# Actor output Schema

## `results` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "sp_intended_usage": "Describe your intended use...",
    "sp_improvement_suggestions": "Share your suggestions here...",
    "sp_contact": "Share your email here...",
    "mode": "search",
    "query": "van gogh",
    "objectIds": [
        "436535",
        "436529"
    ],
    "maxItems": 10
};

// Run the Actor and wait for it to finish
const run = await client.actor("jungle_synthesizer/met-museum-open-access-art-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "sp_intended_usage": "Describe your intended use...",
    "sp_improvement_suggestions": "Share your suggestions here...",
    "sp_contact": "Share your email here...",
    "mode": "search",
    "query": "van gogh",
    "objectIds": [
        "436535",
        "436529",
    ],
    "maxItems": 10,
}

# Run the Actor and wait for it to finish
run = client.actor("jungle_synthesizer/met-museum-open-access-art-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "sp_intended_usage": "Describe your intended use...",
  "sp_improvement_suggestions": "Share your suggestions here...",
  "sp_contact": "Share your email here...",
  "mode": "search",
  "query": "van gogh",
  "objectIds": [
    "436535",
    "436529"
  ],
  "maxItems": 10
}' |
apify call jungle_synthesizer/met-museum-open-access-art-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=jungle_synthesizer/met-museum-open-access-art-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Met Museum Open Access Art Scraper",
        "description": "Scrape The Met's open-access collection API. Returns full metadata for 490k+ objects: title, artist, department, medium, dimensions, classification, culture, credit line, AAT tags, and CC0 image URLs. Modes: keyword search, department walk, incremental, and direct object-ID lookup.",
        "version": "0.1",
        "x-build-id": "Kh91dU30YaK08L44J"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/jungle_synthesizer~met-museum-open-access-art-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-jungle_synthesizer-met-museum-open-access-art-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/jungle_synthesizer~met-museum-open-access-art-scraper/runs": {
            "post": {
                "operationId": "runs-sync-jungle_synthesizer-met-museum-open-access-art-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/jungle_synthesizer~met-museum-open-access-art-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-jungle_synthesizer-met-museum-open-access-art-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "maxItems"
                ],
                "properties": {
                    "sp_intended_usage": {
                        "title": "What is the intended usage of this data?",
                        "minLength": 1,
                        "type": "string",
                        "description": "Please describe how you plan to use the data extracted by this crawler."
                    },
                    "sp_improvement_suggestions": {
                        "title": "How can we improve this crawler for you?",
                        "minLength": 1,
                        "type": "string",
                        "description": "Provide any feedback or suggestions for improvements."
                    },
                    "sp_contact": {
                        "title": "Contact Email",
                        "minLength": 1,
                        "type": "string",
                        "description": "Provide your email address so we can get in touch with you."
                    },
                    "mode": {
                        "title": "Mode",
                        "enum": [
                            "search",
                            "walk",
                            "incremental",
                            "by_ids"
                        ],
                        "type": "string",
                        "description": "How to discover objects: \"search\" (keyword + filters), \"walk\" (full collection or department), \"incremental\" (changed since a date), or \"by_ids\" (explicit object ID list).",
                        "default": "search"
                    },
                    "query": {
                        "title": "Search Query",
                        "type": "string",
                        "description": "Keyword query string (required when mode=search). Examples: \"van gogh\", \"impressionism\", \"japanese woodblock\"."
                    },
                    "departmentId": {
                        "title": "Department ID",
                        "type": "integer",
                        "description": "Filter by department. Paintings=11, Drawings & Prints=9, European Sculpture=12, Modern Art=21, Greek & Roman=13, Egyptian Art=10, Asian Art=6, American Wing=1. Leave blank for all."
                    },
                    "hasImages": {
                        "title": "Has Images Only",
                        "type": "boolean",
                        "description": "When true, only return objects that have a primary image (applies to search and walk modes).",
                        "default": false
                    },
                    "isHighlight": {
                        "title": "Highlights Only",
                        "type": "boolean",
                        "description": "When true, only return objects designated as Met highlights (applies to search and walk modes).",
                        "default": false
                    },
                    "isOnView": {
                        "title": "On View Only",
                        "type": "boolean",
                        "description": "When true, only return objects currently on view at the museum.",
                        "default": false
                    },
                    "dateBegin": {
                        "title": "Date Begin",
                        "type": "integer",
                        "description": "Earliest object creation year (search mode only). Example: 1800."
                    },
                    "dateEnd": {
                        "title": "Date End",
                        "type": "integer",
                        "description": "Latest object creation year (search mode only). Example: 1900."
                    },
                    "geoLocation": {
                        "title": "Geographic Location",
                        "type": "string",
                        "description": "Filter by geographic location (search mode only). Example: \"France\", \"Japan\", \"Egypt\"."
                    },
                    "medium": {
                        "title": "Medium",
                        "type": "string",
                        "description": "Filter by medium (search mode only). Example: \"oil on canvas\", \"watercolor\", \"woodblock\"."
                    },
                    "metadataDate": {
                        "title": "Metadata Date (Incremental)",
                        "type": "string",
                        "description": "Return only objects updated on or after this date (mode=incremental). Format: YYYY-MM-DD. Example: 2026-01-01."
                    },
                    "objectIds": {
                        "title": "Object IDs (By-ID mode)",
                        "type": "array",
                        "description": "Explicit list of Met object IDs to fetch (mode=by_ids). Example: [\"436535\", \"436529\"].",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxItems": {
                        "title": "Max Items",
                        "type": "integer",
                        "description": "Maximum number of artwork records to return.",
                        "default": 10
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
