Art Institute of Chicago Scraper

Pricing

from $3.75 / 1,000 result items

Export 131,000+ artworks from the Art Institute of Chicago catalog. Pull titles, artists, dates, mediums, dimensions, places of origin, classifications, departments, credit lines, and high-resolution IIIF image links by browse, search, classification, or department.

Developer

ParseForge

Maintained by Community

🖼️ Art Institute of Chicago Scraper

🚀 Export the AIC catalog in seconds. Pull 131,000+ artworks across paintings, sculptures, photographs, prints, textiles, and decorative arts. No login, no API key, no manual CSV wrangling.

🕒 Last updated: 2026-05-21 · 📊 38 fields per record · 🖼️ 131,945 artworks · 🏛️ 16 curatorial departments · 🎨 30+ classifications

The Art Institute of Chicago Scraper exports the museum's public catalog and returns 38 fields per artwork, including title, artist, date, medium, dimensions, place of origin, classification, department, credit line, provenance text, and a direct IIIF high-resolution image link. The underlying dataset is one of the largest open museum catalogs in the world, maintained by AIC curators and continuously refreshed.

Coverage spans 131,945 records across 16 curatorial departments and 30+ classifications, from European old masters and Impressionist paintings to Japanese prints, African sculpture, and contemporary textiles. Records include public-domain flags and on-view status, so you can build galleries, study collections, or training sets without rights ambiguity.

🎯 Target Audience | 💡 Primary Use Cases
Art historians, museum tech teams, EdTech builders, ML researchers, gallery apps, digital archivists, OSM contributors | High-resolution image datasets, provenance research, classroom slide decks, ML training (vision), gallery autocomplete, public-domain image discovery

📋 What the Art Institute of Chicago Scraper does

Two collection modes plus four optional filters in a single run:

  • 🖼️ Browse the full catalog. Walk all 131,945 records in catalog order with optional filters.
  • 🔍 Keyword search. Free-text query across titles, artists, mediums, and credit lines.
  • 🏛️ Department filter. Restrict to one of 16 curatorial departments such as Modern and Contemporary Art or Photography and Media.
  • 🎨 Classification filter. Narrow to one technique (painting, sculpture, etching, lithograph, woodblock print, photograph, and more).
  • ✅ Public-domain only. Surface artworks released into the public domain for free reuse.
  • 🪟 On-view only. Surface artworks currently on display in a museum gallery.

Each record includes the artist credit, dating, medium and dimensions, classification, department, provenance, credit line, and a direct IIIF image URL sized for web use.

💡 Why it matters: access to a structured, well-maintained museum catalog powers research projects, classroom slide decks, training data for vision models, and inspiration libraries for designers. The catalog mixes public-domain works that anyone can reuse with copyrighted contemporary holdings, and this scraper exposes that distinction in a single field.


🎬 Full Demo

🚧 Coming soon: a 3-minute walkthrough showing how to go from sign-up to a downloaded dataset.


⚙️ Input

Input | Type | Default | Behavior
maxItems | integer | 10 | Records to return. Free plan caps at 10, paid plan at 1,000,000.
mode | string | "list" | list walks the full catalog; search runs a keyword query.
query | string | "" | Free-text search string, used when mode is search.
departmentId | string | "" | One of 16 department IDs (e.g. PC-11 for Modern Art). Empty = any.
classification | string | "" | One classification (e.g. painting, etching, textile). Empty = any.
publicDomainOnly | boolean | false | When true, returns only public-domain artworks.
isOnView | boolean | false | When true, returns only artworks currently on view.

Example: 50 public-domain paintings from Painting and Sculpture of Europe.

{
  "maxItems": 50,
  "mode": "list",
  "departmentId": "PC-10",
  "classification": "painting",
  "publicDomainOnly": true
}

Example: 25 search results for "monet water lilies".

{
  "maxItems": 25,
  "mode": "search",
  "query": "monet water lilies"
}
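
The input rules in the table above can be checked locally before a run starts. The sketch below is a hypothetical helper, not part of the Actor: the `build_input` name and its validation rules (for example, rejecting an empty query in search mode) are assumptions layered on the documented field names.

```python
import json

VALID_MODES = {"list", "search"}


def build_input(max_items=10, mode="list", query="", department_id="",
                classification="", public_domain_only=False, is_on_view=False):
    """Return an Actor input dict using the field names from the input table."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}, got {mode!r}")
    if max_items < 1:
        raise ValueError("maxItems must be at least 1")
    # Assumption: a search without a query is a mistake, so fail early.
    if mode == "search" and not query.strip():
        raise ValueError("query is required when mode is 'search'")
    return {
        "maxItems": max_items,
        "mode": mode,
        "query": query,
        "departmentId": department_id,
        "classification": classification,
        "publicDomainOnly": public_domain_only,
        "isOnView": is_on_view,
    }


# Same request as the first example: 50 public-domain paintings from PC-10.
run_input = build_input(max_items=50, department_id="PC-10",
                        classification="painting", public_domain_only=True)
print(json.dumps(run_input, indent=2))
```

Filters left at their defaults are sent as empty strings or false, which the table describes as "any".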

⚠️ Good to Know: the AIC catalog includes both public-domain and copyrighted works. The isPublicDomain field tells you which is which. The IIIF image URL is the standard 843-pixel-wide web rendition; for higher resolution use the IIIF service directly with the image_id path component.
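
To request a different width, the size segment of the image URL can be rewritten. This is a minimal sketch that assumes the returned URL follows the IIIF Image API layout (`.../{identifier}/{region}/{size}/{rotation}/{quality}.{format}`); the `resize_iiif_url` helper and the `EXAMPLE-IMAGE-ID` identifier are placeholders, and the image server may refuse or cap very large sizes.

```python
def resize_iiif_url(image_url: str, width: int) -> str:
    """Replace the size segment of an IIIF Image API URL with '<width>,'."""
    if width < 1:
        raise ValueError("width must be a positive integer")
    parts = image_url.split("/")
    # The last four segments are region, size, rotation, and quality.format.
    if len(parts) < 5:
        raise ValueError("URL does not look like an IIIF Image API request")
    parts[-3] = f"{width},"  # "<width>," keeps the aspect ratio
    return "/".join(parts)


# Placeholder URL in the 843-pixel form described above.
web_url = "https://www.artic.edu/iiif/2/EXAMPLE-IMAGE-ID/full/843,/0/default.jpg"
print(resize_iiif_url(web_url, 1686))
```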


📊 Output

Each artwork record contains 38 fields. Download the dataset as CSV, Excel, JSON, or XML.
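
The same formats can also be fetched over the Apify API once a run has finished. The sketch below only builds the export URL for the dataset items endpoint; the `dataset_export_url` helper and the `YOUR_DATASET_ID` placeholder are illustrative, "Excel" is assumed to correspond to the `xlsx` format value, and private datasets still need an API token supplied with the request.

```python
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"
EXPORT_FORMATS = {"csv", "xlsx", "json", "xml"}


def dataset_export_url(dataset_id, fmt="json", limit=None):
    """Build a dataset items URL for one of the export formats listed above."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    params = {"format": fmt}
    if limit is not None:
        params["limit"] = limit  # cap the number of items returned
    return f"{API_BASE}/datasets/{dataset_id}/items?{urlencode(params)}"


print(dataset_export_url("YOUR_DATASET_ID", "csv", limit=100))
```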

🧾 Schema

Field | Type | Example
🖼️ imageUrl | string or null | IIIF web rendition (843 px wide)
🆔 artworkId | integer | 65133
🏷️ title | string | "Panel (Upholstery Fabric)"
👤 artistDisplay | string or null | "Yvonne Palmer Pacanovsky Bobrowicz (American, 1928-2022)"
👤 artistTitle | string or null | Short artist label
📅 dateDisplay | string or null | "1951"
📅 dateStart / dateEnd | number or null | Decimal year bounds
🌍 placeOfOrigin | string or null | "Pennsylvania"
🎨 mediumDisplay | string or null | "Cotton, rayon, and polyolefin, float weave"
📐 dimensions | string or null | "210.8 x 103.8 cm"
🗂️ classificationTitle | string or null | "textile"
🏛️ departmentTitle | string or null | "Textiles"
💳 creditLine | string or null | "Gift of Mr. and Mrs. Joseph R. Bobrowicz"
🔖 mainReferenceNumber | string or null | Accession number
✅ isPublicDomain | boolean or null | false
🪟 isOnView | boolean or null | false
🎨 colorfulness | number or null | Numeric color metric
🎨 colorHsl | object or null | {h, s, l, percentage, population}
📚 subjectTitles / styleTitles / themeTitles | string[] or null | Curated tags
📜 provenanceText | string or null | Ownership history
📖 publicationHistory | string or null | Publications referencing the work
🎟️ exhibitionHistory | string or null | Exhibition history text
🔗 artworkUrl | string | "https://www.artic.edu/artworks/65133"
🕒 scrapedAt | string (ISO 8601) | "2026-05-20T23:14:00.000Z"
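
Because several fields are nullable, downstream code should treat a missing flag as "not cleared". The sketch below keeps only records that are explicitly public domain and carry an image URL; the `reusable_images` helper is hypothetical, and the sample records are made-up values that borrow only the field names from the schema.

```python
def reusable_images(records):
    """Keep records flagged public domain that also carry an image URL."""
    return [
        r for r in records
        # "is True" excludes both False and null (None) flags.
        if r.get("isPublicDomain") is True and r.get("imageUrl")
    ]


# Illustrative records only; values are invented for the example.
records = [
    {"artworkId": 1, "title": "Sample A", "isPublicDomain": True,
     "imageUrl": "https://example.org/iiif/a/full/843,/0/default.jpg"},
    {"artworkId": 2, "title": "Sample B", "isPublicDomain": False,
     "imageUrl": "https://example.org/iiif/b/full/843,/0/default.jpg"},
    {"artworkId": 3, "title": "Sample C", "isPublicDomain": True, "imageUrl": None},
    {"artworkId": 4, "title": "Sample D", "isPublicDomain": None, "imageUrl": None},
]

for record in reusable_images(records):
    print(record["artworkId"], record["title"])
```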


✨ Why choose this Actor

Capability
🖼️ Massive catalog. 131,945 artworks across paintings, sculpture, photography, prints, textiles, and decorative arts.
🎯 Multi-dimensional filtering. Department, classification, search query, public-domain flag, and on-view flag combine freely.
🖼️ Direct IIIF image links. One field returns a ready-to-render high-resolution image URL per record.
⚡ Fast. 10 records in under 10 seconds, 10,000 in a few minutes.
📚 Rich metadata. Provenance, publication history, exhibition history, credit line, and dating bounds included.
🔁 Always fresh. Every run pulls the latest catalog so new accessions and corrections appear automatically.
🚫 No authentication. Public catalog data. No login or token required.

📊 The AIC catalog is one of the most-cited open museum datasets in the cultural-heritage community. The combination of metadata depth and high-resolution imagery is unusual at this scale.


📈 How it compares to alternatives

Approach | Cost | Coverage | Refresh | Filters | Setup
⭐ Art Institute of Chicago Scraper (this Actor) | $5 free credit, then pay-per-use | 131,945 artworks | Live per run | mode, department, classification, public-domain, on-view | ⚡ 2 min
Manual catalog browsing | Free | Per artwork | Manual | None | 🐢 Tedious
Aggregator museum portals | Free / mixed | Smaller subsets | Variable | Limited | ⏳ Hours
Build your own crawler | Engineering time | Full | Custom | Custom | 🛠️ Days

Pick this Actor when you want a clean, filtered slice of the AIC catalog without writing parsers, handling pagination, or maintaining schemas.


🚀 How to use

  1. 📝 Sign up. Create a free account with $5 credit (takes 2 minutes).
  2. 🌐 Open the Actor. Go to the Art Institute of Chicago Scraper page on the Apify Store.
  3. 🎯 Set input. Pick a department, classification, search query, or leave defaults for a catalog browse.
  4. 🚀 Run it. Click Start and let the Actor collect your data.
  5. 📥 Download. Grab your results in the Dataset tab as CSV, Excel, JSON, or XML.

⏱️ Total time from signup to downloaded dataset: 3-5 minutes. No coding required.


💼 Business use cases

🏫 EdTech and digital classrooms

  • Build slide decks with high-resolution public-domain artwork
  • Power art-history flashcard apps with structured metadata
  • Generate quiz datasets keyed to artist, period, or technique
  • Localize museum content for language-learning platforms

🤖 ML and computer vision

  • Train fine-art classifiers on labeled, well-curated images
  • Build style-transfer or generative models on public-domain works
  • Seed multimodal models with rich text-image pairs
  • Benchmark image embedding models on art-domain queries

🖼️ Galleries, apps, and visualization

  • Sync gallery autocomplete with the canonical AIC titles
  • Build interactive timelines or geographic origin maps
  • Render virtual exhibitions sorted by department or theme
  • Power discovery feeds with on-view filters

📚 Research and journalism

  • Provenance investigations with structured ownership history
  • Stylistic comparisons across artist, place, and period
  • Open-data exercises around museum metadata quality
  • Citation-ready datasets for arts journalism stories

🔌 Automating Art Institute of Chicago Scraper

Control the scraper programmatically for scheduled runs and pipeline integrations:

  • 🟢 Node.js. Install the apify-client NPM package.
  • 🐍 Python. Use the apify-client PyPI package.
  • 📚 See the Apify documentation for full details.

The Apify Schedules feature lets you trigger this Actor on any cron interval. Daily or weekly refreshes keep downstream catalogs in sync with new accessions and metadata corrections.
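
A programmatic run with the Python client might look like the sketch below. The Actor ID `parseforge/art-institute-of-chicago-scraper` is an assumption (copy the real ID from the Actor's page), the token is read from an `APIFY_TOKEN` environment variable by convention, and `run_scraper` is a hypothetical wrapper name. It requires `pip install apify-client`.

```python
import os


def run_scraper(run_input, actor_id="parseforge/art-institute-of-chicago-scraper"):
    """Start the Actor, wait for it to finish, and return its dataset items."""
    # Imported lazily so this module loads even where apify-client is absent.
    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    # call() blocks until the run ends and returns the run record.
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Input taken from the search example in the Input section.
SEARCH_INPUT = {"maxItems": 25, "mode": "search", "query": "monet water lilies"}

# Usage (needs network access and a valid token):
#   for item in run_scraper(SEARCH_INPUT):
#       print(item["artworkId"], item["title"])
```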


🌟 Beyond business use cases

Data like this powers more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.

🎓 Research and academia

  • Art-history theses with reproducible, versioned dataset pulls
  • Quantitative studies on collection composition by period or region
  • Reproducible open-access research with cited records
  • Provenance and looted-art research with structured ownership text

🎨 Personal and creative

  • Inspiration libraries for designers, illustrators, and animators
  • Mood boards keyed to medium, palette, or period
  • Side projects: art-of-the-day bots, gallery wallpapers
  • Hobbyist databases for collectors and museum-goers

🤝 Non-profit and civic

  • Cultural-heritage transparency around public-domain works
  • Accessible companion apps for museum tours and field trips
  • Investigative journalism on collection history
  • Wikimedia Commons and Wikipedia cultural-heritage uploads

🧪 Experimentation

  • Vision-model benchmarks on a single curated catalog
  • Prompt engineering for AI image-description tasks
  • Agent pipelines that resolve titles, artists, and provenance
  • Hackathon projects around digital art and discovery


❓ Frequently Asked Questions

🧩 How does it work?

Configure your department, classification, search query, or public-domain filter, click Start, and the Actor pulls the AIC catalog page by page, returning one clean structured record per artwork. No browser automation, no captchas, no setup.

📏 How accurate is the data?

The records mirror the museum's own catalog. Cataloging accuracy depends on AIC curatorial staff. Older records may have less complete provenance or dimensions; modern accessions tend to be fully populated.

🔁 How often is the dataset refreshed?

The catalog is updated continuously as the museum revises records and accessions new artworks. Every run of this Actor pulls the latest data, so your dataset reflects current entries at run time.

🖼️ Are the images free to reuse?

Only artworks with isPublicDomain set to true are explicitly cleared for free reuse. Other records remain under copyright; consult the museum's terms before redistribution.

⏰ Can I schedule regular runs?

Yes. Use Apify Schedules to run this Actor on any cron interval (hourly, daily, weekly) and keep a downstream database in sync.

⚖️ Is the catalog data open to reuse?

The structured metadata is published openly. Image rights vary by artwork; the isPublicDomain field flags works released for free reuse. Review the museum's terms for your specific use case.

💼 Can I use this data commercially?

Yes for the metadata and for any artwork flagged as public domain. Other images may carry copyright restrictions. You are responsible for compliance.

💳 Do I need a paid Apify plan to use this Actor?

No. The free Apify plan is enough for testing and small runs (10 records per run). A paid plan lifts the limit and gives you access to scheduling, higher concurrency, and larger datasets.

🔁 What happens if a run fails or gets interrupted?

Apify automatically retries transient errors. If a run still fails, inspect the log in the Runs tab, adjust the input, and re-run. Partial datasets from failed runs are preserved.

🏛️ Can I filter by gallery or exhibition?

The schema returns galleryTitle and galleryId per record, so you can post-filter after a run. Add a search query like the exhibition title for tighter matching at collection time.
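
Post-filtering by gallery can be as simple as counting records per galleryTitle. This is a small sketch with a hypothetical `count_by_gallery` helper; the gallery names and IDs in the sample are invented, and only the galleryTitle and galleryId field names come from the text above.

```python
from collections import Counter


def count_by_gallery(records):
    """Count records per galleryTitle, skipping works with no gallery."""
    return Counter(r["galleryTitle"] for r in records if r.get("galleryTitle"))


# Illustrative records; real values come from a finished run.
records = [
    {"artworkId": 1, "galleryId": 101, "galleryTitle": "Gallery A"},
    {"artworkId": 2, "galleryId": 101, "galleryTitle": "Gallery A"},
    {"artworkId": 3, "galleryId": 202, "galleryTitle": "Gallery B"},
    {"artworkId": 4, "galleryId": None, "galleryTitle": None},
]

counts = count_by_gallery(records)
print(counts.most_common())
```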

🆘 What if I need help?

Our support team is here to help. Contact us through the Apify platform or use the Tally form linked below.


🔌 Integrate with any app

Art Institute of Chicago Scraper connects to any cloud service via Apify integrations:

  • Make - Automate multi-step workflows
  • Zapier - Connect with 5,000+ apps
  • Slack - Get run notifications in your channels
  • Airbyte - Pipe catalog data into your warehouse
  • GitHub - Trigger runs from commits and releases
  • Google Drive - Export datasets straight to Sheets

You can also use webhooks to trigger downstream actions when a run finishes. Push fresh catalog data into your product backend, or alert your team in Slack.
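
On the receiving side, a webhook handler mostly needs the dataset ID of the finished run. The sketch below assumes Apify's default webhook payload, with an `eventType` string and a `resource` object holding the run record; if the payload template has been customized, the keys will differ. `dataset_id_from_webhook` and the sample values are hypothetical.

```python
import json


def dataset_id_from_webhook(body: bytes):
    """Return the run's default dataset ID for succeeded runs, else None."""
    payload = json.loads(body)
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    # "resource" is assumed to hold the run record (default payload template).
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")


# Made-up payload in the assumed default shape.
sample_body = json.dumps({
    "eventType": "ACTOR.RUN.SUCCEEDED",
    "eventData": {"actorId": "EXAMPLE_ACTOR", "actorRunId": "EXAMPLE_RUN"},
    "resource": {"id": "EXAMPLE_RUN", "defaultDatasetId": "EXAMPLE_DATASET"},
}).encode("utf-8")

print(dataset_id_from_webhook(sample_body))
```

The returned ID can then be passed to whatever job loads the dataset into your backend.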


💡 Pro Tip: browse the complete ParseForge collection for more reference-data scrapers.


🆘 Need Help? Open our contact form to request a new scraper, propose a custom data project, or report an issue.