Art Institute of Chicago Artworks Scraper
Pricing
Pay per event
Scrapes the Art Institute of Chicago open API — 131k artworks with metadata, dominant colour analysis (HSL), style/subject taxonomy, and IIIF image URLs. Supports full-walk, keyword search, date-range, type/department filters, and by-ID lookups. No API key required.
Developer
BowTiedRaccoon
Maintained by Community
Scrape artwork records from the Art Institute of Chicago's open REST API. Returns full metadata for 131k+ artworks including title, artist, department, medium, dimensions, classification, style/subject taxonomy, dominant colour analysis (HSL), and IIIF image URLs. No API key required — the AIC API is fully public with CC0 metadata.
What you get
Each scraped record contains:
| Field | Description |
|---|---|
| id | Unique AIC artwork ID |
| title | Artwork title |
| artist_display | Full artist display string (name, nationality, dates) |
| artist_title | Primary artist name (short form) |
| date_display | Human-readable date string (e.g. "1906" or "c. 1890–1900") |
| date_start | Earliest possible year for the artwork |
| date_end | Latest possible year for the artwork |
| medium_display | Materials and technique description |
| artwork_type_title | Artwork type (Painting, Drawing, Print, Sculpture, etc.) |
| department_title | Museum department holding the artwork |
| classification_titles | Classification tags array (e.g. "oil on canvas", "european painting") |
| dimensions | Physical dimensions string |
| place_of_origin | Country or region where the artwork was created |
| is_public_domain | True if the artwork image is in the public domain (CC0) |
| credit_line | Donor/acquisition credit line |
| gallery_title | Gallery currently displaying the artwork (null if in storage) |
| color_h | Dominant colour hue (HSL, 0–360) |
| color_s | Dominant colour saturation (HSL, 0–100) |
| color_l | Dominant colour lightness (HSL, 0–100) |
| subject_titles | Subject/theme tags array (e.g. "landscapes", "water lilies") |
| style_titles | Art style tags array (e.g. "Impressionism", "20th Century") |
| image_id | IIIF image identifier (UUID) |
| iiif_image_url | Full IIIF image URL (843px wide JPEG); null if no image |
| aic_artwork_url | Canonical artwork page URL on artic.edu |
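The colour and image fields lend themselves to small post-processing helpers. The sketch below is illustrative and not part of the actor: `hsl_to_hex` and `iiif_url` are hypothetical names, and the URL template assumes the public IIIF pattern used by the Art Institute of Chicago (`/full/843,/0/default.jpg`), which is consistent with the 843px width listed for `iiif_image_url`.

```python
import colorsys


def hsl_to_hex(color_h, color_s, color_l):
    """Convert a record's HSL values (0-360, 0-100, 0-100) to a hex RGB string."""
    # colorsys works with fractions in [0, 1] and takes arguments in H, L, S order.
    r, g, b = colorsys.hls_to_rgb(color_h / 360.0, color_l / 100.0, color_s / 100.0)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))


def iiif_url(image_id, width=843):
    """Build an IIIF image URL from image_id, or return None when there is no image.

    Assumption: host and path follow AIC's public IIIF URL pattern.
    """
    if not image_id:
        return None
    return f"https://www.artic.edu/iiif/2/{image_id}/full/{width},/0/default.jpg"
```

Passing a different `width` requests another rendition of the same image; whether a given size is served is up to the IIIF server, so treat non-default widths as untested.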
Modes
Walk mode (default)
Paginate through the full collection of 131k+ artworks:
{"mode": "walk","maxItems": 1000}
Omit maxItems to walk the entire collection.
Add publicDomainOnly: true to restrict results to CC0 artworks, whose images are safe to download and reuse:
{"mode": "walk","publicDomainOnly": true,"maxItems": 500}
Search mode
Full-text keyword search across all artwork metadata:
{"mode": "search","query": "impressionism","maxItems": 100}
Filter mode
Filter by artwork type, department, or date range:
{"mode": "filter","artworkTypeTitle": "Painting","dateFrom": 1880,"dateTo": 1920,"maxItems": 200}
Available filters:
- artworkTypeTitle: artwork type string (e.g. "Painting", "Drawing", "Print", "Sculpture", "Photograph")
- departmentTitle: department name string (e.g. "Photography and Media", "Painting and Sculpture of Europe")
- dateFrom: filter artworks with date_start >= this year
- dateTo: filter artworks with date_start <= this year
Combine multiple filters in one run.
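To make the combined-filter semantics concrete, here is a minimal local re-implementation that could be applied to already-scraped records. It is a sketch, not the actor's code: `matches_filters` is a hypothetical helper, and both the exact string comparison and the decision to reject records with a missing `date_start` are assumptions.

```python
def matches_filters(record, artworkTypeTitle=None, departmentTitle=None,
                    dateFrom=None, dateTo=None):
    """Return True if a record satisfies every filter that was supplied."""
    if artworkTypeTitle is not None and record.get("artwork_type_title") != artworkTypeTitle:
        return False
    if departmentTitle is not None and record.get("department_title") != departmentTitle:
        return False
    # Both date bounds are compared against date_start, as described above.
    start = record.get("date_start")
    if dateFrom is not None and (start is None or start < dateFrom):
        return False
    if dateTo is not None and (start is None or start > dateTo):
        return False
    return True
```

Filters left as `None` are ignored, so supplying several of them narrows the result set in the same way as combining them in one run.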
By-ID mode
Fetch specific artworks by their AIC numeric ID:
{"mode": "by_ids","artworkIds": [16568, 27992, 28560],"maxItems": 10}
IDs can be found in AIC URLs: https://www.artic.edu/artworks/16568/water-lilies.
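If you start from a list of artwork page URLs, the IDs can be extracted programmatically. The following sketch assumes URLs of the form shown above (`/artworks/<id>/<slug>`); `artwork_id_from_url` and `by_ids_input` are hypothetical helper names.

```python
import re

# Matches the numeric ID that follows /artworks/ in an artic.edu URL.
_ARTWORK_URL = re.compile(r"artic\.edu/artworks/(\d+)")


def artwork_id_from_url(url):
    """Return the numeric artwork ID from an AIC artwork URL, or None if absent."""
    match = _ARTWORK_URL.search(url)
    return int(match.group(1)) if match else None


def by_ids_input(urls):
    """Build a by_ids input object, skipping URLs that carry no artwork ID."""
    ids = [artwork_id_from_url(u) for u in urls]
    return {"mode": "by_ids", "artworkIds": [i for i in ids if i is not None]}
```

For example, `by_ids_input(["https://www.artic.edu/artworks/16568/water-lilies"])` produces an input with `artworkIds` set to `[16568]`.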
Use cases
- Art analytics datasets: combine with the Met Museum actor for a comprehensive GLAM collection
- Colour research: the color_h/s/l fields enable colour-palette analysis across 131k artworks
- Style/subject taxonomy: style_titles and subject_titles provide ready-made ML training labels
- Public-domain image pipelines: filter is_public_domain: true and use iiif_image_url for CC0 downloads
- Museum collection monitoring: run incrementally to detect newly added records
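For the collection-monitoring use case, one simple approach is to compare the IDs from an earlier run with the records from the latest one. This is a sketch under the assumption that you persist the previous run's IDs yourself; `newly_added` is a hypothetical helper, not something the actor provides.

```python
def newly_added(previous_ids, current_records):
    """Return records whose id was not seen in an earlier run, in original order."""
    seen = set(previous_ids)
    return [r for r in current_records if r["id"] not in seen]
```

A set lookup keeps the comparison linear in the number of records, which matters when diffing the full collection.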
Rate limits
The AIC API is fully open — no authentication required. The actor runs at ~60 requests/minute with a 1-second delay between pages, well within the API's polite-use advisory. No proxy is needed.
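The pacing above gives a rough lower bound on run time. The arithmetic below is a sketch only: `estimate_walk_seconds` is a hypothetical helper, the default page size of 100 records is an assumption (this page does not state it), and network latency is ignored.

```python
import math


def estimate_walk_seconds(total_items, page_size=100, delay_s=1.0):
    """Estimate the time spent on inter-page delays for a walk of total_items records."""
    # One request per page; page_size is an assumed value, not documented here.
    pages = math.ceil(total_items / page_size)
    return pages * delay_s
```

Under those assumptions, a full walk of 131,000 records is 1,310 pages, or roughly 22 minutes of pacing delay before request latency is added.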