# Cleveland Museum of Art Scraper — Open Access Artworks (`compute-edge/cleveland-museum-art-scraper`) Actor

Extract artworks from the Cleveland Museum of Art Open Access API. Get title, creator, creation date, type, department, culture, technique, image URLs, and web links. Search by keyword, department, or type. Clean JSON for art research and catalogs.

- **URL**: https://apify.com/compute-edge/cleveland-museum-art-scraper.md
- **Developed by:** [Compute Edge](https://apify.com/compute-edge) (community)
- **Categories:** Lead generation
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

from $3.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
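As a rough budgeting aid, the listed starting rate can be turned into a per-run estimate. The sketch below assumes the charge scales linearly with the number of results at $3.00 per 1,000; the function name `estimate_cost_usd` is hypothetical, and the actual rate may differ by plan, so treat the output as an approximation.

```python
def estimate_cost_usd(results: int, rate_per_1000: float = 3.00) -> float:
    """Estimate the pay-per-event charge for a given number of results.

    rate_per_1000 defaults to the listed starting rate; this is an
    assumption, not a guaranteed price.
    """
    if results < 0:
        raise ValueError("results must be non-negative")
    return round(results * rate_per_1000 / 1000, 2)


for count in (100, 1000, 5000):
    print(f"{count} results: about ${estimate_cost_usd(count):.2f}")
```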

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see the Apify documentation as a [Markdown index](https://docs.apify.com/llms.txt) or as [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Cleveland Museum of Art Scraper — Open Access Artworks

Extract structured artwork data from the **Cleveland Museum of Art Open Access API** — one of the most comprehensive public collections of fine art data available anywhere. This Actor queries the museum's free, publicly accessible REST API and returns clean, structured records ready for research, visualization, AI enrichment, or RAG pipelines.

The Cleveland Museum of Art holds over 68,000 open-access artworks spanning 5,000+ years of human creativity, from ancient Egyptian artifacts to contemporary American paintings. All records are released under **CC0 (no rights reserved)**, making this dataset ideal for academic research, machine learning training datasets, art history analysis, and commercial applications.

### Key Features

- **68,000+ open-access artwork records** — One of the largest freely accessible museum datasets in the world
- **Keyword search** — Query by artist name, medium, title, or any text found in artwork metadata
- **Department and type filtering** — Narrow to Paintings, Prints, Photography, Sculpture, Drawing, and more
- **Image filter** — Optionally restrict results to artworks with available images
- **Paginated extraction** — Handles the full collection in batches of 100 for reliable, complete extraction
- **No authentication required** — Cleveland Museum of Art's public Open Access API, no API key needed
- **CC0 licensed data** — All extracted data is released under Creative Commons Zero (public domain dedication)
- **Clean, structured output** — Normalized JSON with 11 fields per artwork, missing values coerced to empty strings
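The batching described under "Paginated extraction" can be pictured as a sequence of offset and page-size pairs. The following is a minimal sketch, not the Actor's actual code: the helper `page_plan` and the names `skip` and `limit` are assumptions chosen for illustration.

```python
from typing import Iterator, Tuple


def page_plan(max_results: int, batch_size: int = 100) -> Iterator[Tuple[int, int]]:
    """Yield (skip, limit) pairs that together cover max_results records."""
    if max_results < 0:
        raise ValueError("max_results must be non-negative")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    skip = 0
    while skip < max_results:
        # The final batch may be smaller than batch_size.
        limit = min(batch_size, max_results - skip)
        yield skip, limit
        skip += limit


print(list(page_plan(250)))  # [(0, 100), (100, 100), (200, 50)]
```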

### Output Data Fields

| Field | Type | Description |
|-------|------|-------------|
| `id` | integer | Internal museum identifier |
| `accessionNumber` | string | Museum accession number (e.g., "1915.534") |
| `title` | string | Artwork title |
| `creationDate` | string | Date or date range of creation (e.g., "c. 1765", "1880–1885") |
| `type` | string | Artwork type (e.g., "Painting", "Print", "Sculpture") |
| `department` | string | Museum department (e.g., "American Painting and Sculpture") |
| `culture` | string | Cultural origin(s), comma-separated (e.g., "French, Italian") |
| `technique` | string | Medium and technique (e.g., "oil on canvas", "woodblock print") |
| `creator` | string | Primary creator description (e.g., "Vincent van Gogh (Dutch, 1853–1890)") |
| `imageUrl` | string | Web-resolution image URL (empty if no image available) |
| `webUrl` | string | Direct URL to the artwork page on clevelandart.org |
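If you post-process records from other sources into the same shape, a small normalizer can reproduce the "missing values become empty strings" convention from the table above. This is a hypothetical helper for downstream code, not the Actor's implementation; only the field names come from the table.

```python
from typing import Any, Dict

# String-typed output fields, in the order listed in the table above.
STRING_FIELDS = (
    "accessionNumber", "title", "creationDate", "type", "department",
    "culture", "technique", "creator", "imageUrl", "webUrl",
)


def normalize_record(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Return a record with all 11 fields, coercing missing strings to ''."""
    record: Dict[str, Any] = {"id": raw.get("id")}
    for field in STRING_FIELDS:
        value = raw.get(field)
        record[field] = "" if value is None else str(value)
    return record


print(normalize_record({"id": 94979, "title": "Nathaniel Hurd"}))
```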

### How to Scrape Cleveland Museum of Art Data

Follow these steps to extract artwork records from the Cleveland Museum of Art Open Access collection:

1. **Open the Actor** — Find "Cleveland Museum of Art Scraper" in the Apify Store and click **Try for free**.

2. **Configure your search** — In the input form, set your filters:
   - Leave **Search Query** empty to retrieve all artworks
   - Set **Department** to "Paintings" to target only paintings
   - Set **Type** to "Painting", "Print", "Sculpture", etc. for type-level filtering
   - Enable **Has Image Only** if you need artworks with downloadable images
   - Set **Max Results** to control how many records to extract (default: 100, max: 5000)

3. **Run the Actor** — Click **Start**. The Actor connects directly to the Cleveland Museum's public API. No authentication, no proxies required.

4. **Review results** — When the run completes, click **Results** to browse extracted artwork records in the dataset viewer.

5. **Export your data** — Download as JSON, CSV, XLSX, or XML. Use the Apify API to integrate directly with your pipeline.
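If you fetch the dataset items through the API instead of downloading a file, you can produce a CSV locally with the Python standard library. The sketch below assumes each item is a flat dictionary like the records this Actor returns; `items_to_csv` is a hypothetical helper name.

```python
import csv
import io
from typing import Any, Mapping, Sequence


def items_to_csv(items: Sequence[Mapping[str, Any]]) -> str:
    """Serialize flat records to CSV text, using the first item's keys as columns."""
    if not items:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(items[0].keys()),
        extrasaction="ignore",  # skip keys that the first item did not have
        restval="",             # fill columns missing from later items
    )
    writer.writeheader()
    for item in items:
        writer.writerow(item)
    return buffer.getvalue()


print(items_to_csv([{"id": 94979, "title": "Nathaniel Hurd"}]))
```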

#### Input Example

```json
{
  "q": "portrait",
  "department": "American Painting and Sculpture",
  "type": "Painting",
  "hasImage": true,
  "maxResults": 500
}
````

#### Output Example

```json
{
  "id": 94979,
  "accessionNumber": "1915.534",
  "title": "Nathaniel Hurd",
  "creationDate": "c. 1765",
  "type": "Painting",
  "department": "American Painting and Sculpture",
  "culture": "America",
  "technique": "oil on canvas",
  "creator": "John Singleton Copley (American, 1738–1815)",
  "imageUrl": "https://openaccess-api.clevelandart.org/api/artworks/94979/images/web",
  "webUrl": "https://clevelandart.org/art/1915.534"
}
```

### Pricing

This Actor uses Apify's pay-per-event pricing model, listed from $3.00 per 1,000 results. Runs are lightweight because the Cleveland Museum of Art API is fast and requires no browser rendering or proxy infrastructure.

| Volume | Estimated Cost |
|--------|---------------|
| 100 records | ~$0.30 |
| 1,000 records | ~$3.00 |
| 5,000 records | ~$15.00 |

Estimates use the listed starting rate. Apify platform usage (compute time) is not charged separately under this model.

### Other Scrapers

Looking for more cultural institution and open access data? Check out these related Actors:

- **[CISA Known Exploited Vulnerabilities Scraper](https://apify.com/seatsignal/cisa-kev-scraper)** — Extract CVE threat intelligence data from CISA's KEV catalog
- Browse the full [seatsignal collection](https://apify.com/seatsignal) for more specialized data extractors

### FAQ

**Q: Is scraping the Cleveland Museum of Art legal?**
A: Yes. The Cleveland Museum of Art explicitly makes this data available via their Open Access API under CC0 (Creative Commons Zero). The data is intended for public use with no restrictions.

**Q: Do I need an API key or account?**
A: No. The Cleveland Museum of Art Open Access API is completely free and requires no authentication or API key.

**Q: How many artworks are available?**
A: Over 68,000 artworks as of 2026. The total grows as the museum digitizes additional works.

**Q: Can I get artworks currently on view only?**
A: The API supports a `currently_on_view` filter. Contact support if you need this feature enabled in the Actor.

**Q: What image sizes are available?**
A: The API provides web (small), print (medium), and full (large) resolution images. This Actor returns the web-resolution URL by default.

**Q: Why are some `imageUrl` fields empty?**
A: Not all artworks have digitized images. Enable the **Has Image Only** filter to exclude records without images.
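For filters the Actor does not expose, such as `currently_on_view`, one option is to query the museum's API directly. The sketch below only builds a request URL. The host and `/api/artworks/` path appear in the output example above; the query parameter names `has_image` and the value `1` used for both flags are assumptions that should be checked against the museum's Open Access API documentation.

```python
from urllib.parse import urlencode

API_BASE = "https://openaccess-api.clevelandart.org/api/artworks/"


def build_artworks_url(
    q: str = "",
    department: str = "",
    artwork_type: str = "",
    has_image: bool = False,
    currently_on_view: bool = False,
) -> str:
    """Build a query URL for the artworks endpoint, omitting unset filters."""
    params = {}
    if q:
        params["q"] = q
    if department:
        params["department"] = department
    if artwork_type:
        params["type"] = artwork_type
    if has_image:
        params["has_image"] = 1  # assumed flag value
    if currently_on_view:
        params["currently_on_view"] = 1  # assumed flag value
    return f"{API_BASE}?{urlencode(params)}" if params else API_BASE


print(build_artworks_url(q="portrait", currently_on_view=True))
```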

### Legal Disclaimer

This Actor accesses only publicly available data from the Cleveland Museum of Art's official Open Access API (`openaccess-api.clevelandart.org`). All data extracted is released under the CC0 Public Domain Dedication by the Cleveland Museum of Art. This Actor does not bypass any authentication, circumvent any access controls, or collect personal data. Use of extracted data is subject to the Cleveland Museum of Art's Open Access Policy. The Actor author is not affiliated with the Cleveland Museum of Art. For support, contact the author through the Apify Store listing.

# Actor input Schema

## `q` (type: `string`):

Search artworks by keyword (e.g., 'portrait', 'landscape', 'Van Gogh'). Leave empty to retrieve all artworks.

## `department` (type: `string`):

Filter by museum department (e.g., 'Paintings', 'Prints', 'Photography', 'American Painting and Sculpture'). Leave empty for all departments.

## `type` (type: `string`):

Filter by artwork type (e.g., 'Painting', 'Drawing', 'Sculpture', 'Print', 'Photograph'). Leave empty for all types.

## `hasImage` (type: `boolean`):

If enabled, only return artworks that have an associated image.

## `maxResults` (type: `integer`):

Maximum number of artwork records to return. Accepts values from 1 to 5000 (default: 100), matching the `minimum` and `maximum` declared in the input schema.

## Actor input object example

```json
{
  "q": "",
  "department": "",
  "type": "",
  "hasImage": false,
  "maxResults": 100
}
```
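When building the input object in code, it can help to validate it before starting a run. The sketch below uses the defaults shown above and the `maxResults` bounds (1 to 5000) declared in the OpenAPI specification later in this document; `build_input` is a hypothetical helper, not part of any Apify library.

```python
from typing import Any, Dict

MAX_RESULTS_MIN = 1
MAX_RESULTS_MAX = 5000


def build_input(
    q: str = "",
    department: str = "",
    artwork_type: str = "",
    has_image: bool = False,
    max_results: int = 100,
) -> Dict[str, Any]:
    """Return an Actor input dict, rejecting out-of-range maxResults values."""
    if isinstance(max_results, bool) or not isinstance(max_results, int):
        raise TypeError("max_results must be an integer")
    if not MAX_RESULTS_MIN <= max_results <= MAX_RESULTS_MAX:
        raise ValueError(
            f"max_results must be between {MAX_RESULTS_MIN} and {MAX_RESULTS_MAX}"
        )
    return {
        "q": q,
        "department": department,
        "type": artwork_type,
        "hasImage": has_image,
        "maxResults": max_results,
    }


print(build_input(q="portrait", artwork_type="Painting", has_image=True, max_results=500))
```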

# Actor output Schema

## `dataset` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {};

// Run the Actor and wait for it to finish
const run = await client.actor("compute-edge/cleveland-museum-art-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {}

# Run the Actor and wait for it to finish
run = client.actor("compute-edge/cleveland-museum-art-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
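The example above prints whatever the dataset contains, even if the run did not finish successfully. A slightly more defensive variant is sketched below. It assumes the client returns the run as a dictionary, as in the example above, and that a successful run reports the status `SUCCEEDED`; `fetch_artworks` is a hypothetical wrapper that accepts any client object exposing the same `actor(...)` and `dataset(...)` methods.

```python
from typing import Any, Dict, List

ACTOR_ID = "compute-edge/cleveland-museum-art-scraper"


def fetch_artworks(client: Any, run_input: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Run the Actor, verify that the run succeeded, and return all dataset items."""
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return run details")
    if run.get("status") != "SUCCEEDED":
        raise RuntimeError(f"Actor run ended with status {run.get('status')!r}")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

With the official client, this could be called as `fetch_artworks(ApifyClient("<YOUR_API_TOKEN>"), {"q": "portrait", "maxResults": 50})`.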

## CLI example

```bash
echo '{}' |
apify call compute-edge/cleveland-museum-art-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=compute-edge/cleveland-museum-art-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Cleveland Museum of Art Scraper — Open Access Artworks",
        "description": "Extract artworks from the Cleveland Museum of Art Open Access API. Get title, creator, creation date, type, department, culture, technique, image URLs, and web links. Search by keyword, department, or type. Clean JSON for art research and catalogs.",
        "version": "0.1",
        "x-build-id": "Evkz19X8MlrnCA4KQ"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/compute-edge~cleveland-museum-art-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-compute-edge-cleveland-museum-art-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/compute-edge~cleveland-museum-art-scraper/runs": {
            "post": {
                "operationId": "runs-sync-compute-edge-cleveland-museum-art-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/compute-edge~cleveland-museum-art-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-compute-edge-cleveland-museum-art-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "q": {
                        "title": "Search Query",
                        "type": "string",
                        "description": "Search artworks by keyword (e.g., 'portrait', 'landscape', 'Van Gogh'). Leave empty to retrieve all artworks.",
                        "default": ""
                    },
                    "department": {
                        "title": "Department",
                        "type": "string",
                        "description": "Filter by museum department (e.g., 'Paintings', 'Prints', 'Photography', 'American Painting and Sculpture'). Leave empty for all departments.",
                        "default": ""
                    },
                    "type": {
                        "title": "Type",
                        "type": "string",
                        "description": "Filter by artwork type (e.g., 'Painting', 'Drawing', 'Sculpture', 'Print', 'Photograph'). Leave empty for all types.",
                        "default": ""
                    },
                    "hasImage": {
                        "title": "Has Image Only",
                        "type": "boolean",
                        "description": "If enabled, only return artworks that have an associated image.",
                        "default": false
                    },
                    "maxResults": {
                        "title": "Max Results",
                        "minimum": 1,
                        "maximum": 5000,
                        "type": "integer",
                        "description": "Maximum number of artwork records to return. Set to 0 for unlimited (up to 68,000+ artworks).",
                        "default": 100
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
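For projects without a client library, the `run-sync-get-dataset-items` path in the specification above can be called with plain HTTP. The sketch below only constructs the request, using the server URL, path, and `token` query parameter from the specification; `build_sync_request` is a hypothetical helper name.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict

API_ROOT = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/compute-edge~cleveland-museum-art-scraper/run-sync-get-dataset-items"


def build_sync_request(token: str, run_input: Dict[str, Any]) -> urllib.request.Request:
    """Build a POST request that runs the Actor and returns its dataset items."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_ROOT}{ACTOR_PATH}?{query}",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen(build_sync_request(token, run_input))` and decoding the body as JSON should yield the dataset items. Because the call waits for the run to finish, it is best suited to small `maxResults` values.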
