IMSLP Public Domain Sheet Music Scraper
Pricing
Pay per event
Scrape the full IMSLP public-domain score catalog — 230k+ works across 24k composers, with file URLs, copyright tags, and work metadata via the IMSLP worklist API and MediaWiki API.
Developer
BowTiedRaccoon
Walk the full IMSLP catalog and pull structured data on 230,000+ musical works, 24,000+ composers, and their associated score files — all public-domain by construction.
IMSLP has two public APIs. This scraper uses both. The worklist API delivers the complete work index at 1,000 records per page. The MediaWiki API fills in per-work details: key, genre, instrumentation, composition year, and the file manifest with direct PDF download links. You can run fast (worklist-only, no detail calls) or complete (full enrichment). Both modes respect the site's request etiquette.
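The paging behaviour described above can be sketched as a small generator. This is a minimal illustration and not the scraper's actual code: `fetch_page` is a hypothetical callable standing in for one worklist API request, and the "short page means done" stopping rule is an assumption about how the end of the index is detected.

```python
from typing import Callable, Iterator, List, Optional

PAGE_SIZE = 1000  # records per worklist page, as stated above


def iter_worklist(
    fetch_page: Callable[[int], List[dict]],
    max_items: Optional[int] = None,
) -> Iterator[dict]:
    """Yield work records page by page until the index is exhausted.

    fetch_page(start) is a hypothetical callable returning the records
    that begin at offset `start` (at most PAGE_SIZE of them).
    """
    start = 0
    yielded = 0
    while True:
        page = fetch_page(start)
        for record in page:
            if max_items is not None and yielded >= max_items:
                return
            yield record
            yielded += 1
        # Assumption: a page shorter than PAGE_SIZE is the last one.
        if len(page) < PAGE_SIZE:
            return
        start += PAGE_SIZE


# Usage with an in-memory stand-in for the real API (2,500 fake works).
fake_index = [{"work_id": str(i)} for i in range(2500)]
fake_fetch = lambda start: fake_index[start:start + PAGE_SIZE]
first_1200 = list(iter_worklist(fake_fetch, max_items=1200))
```

Passing the fetch step in as a callable keeps the paging logic testable without network access; a real implementation would also need pacing and error handling between requests.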
What You Get
Each record covers one musical work.
| Field | Type | Description |
|---|---|---|
| work_id | string | IMSLP/MediaWiki page ID |
| work_title | string | Work title as listed on IMSLP |
| composer | string | Composer full name |
| composer_slug | string | IMSLP category identifier |
| opus_catalogue | string | Op., BWV, K., or other catalogue number |
| genre | string | Piece style and genre (e.g. "Baroque — fugues") |
| instrumentation | string | Scored for (e.g. "piano", "2 violins, viola, cello") |
| key | string | Musical key |
| composition_year | string | Year or date of composition |
| first_publication | string | Year of first publication |
| score_files | string | JSON array of score PDFs with filename, description, file URL, copyright, editor |
| parts_files | string | JSON array of parts PDFs (same structure) |
| arrangements | string | JSON array of arrangement PDFs (same structure) |
| copyright_status | string | Copyright tag from IMSLP (almost always "Public Domain") |
| license | string | Specific license |
| imslp_url | string | Canonical IMSLP work page URL |
| scraped_at | string | ISO 8601 timestamp |
File arrays are JSON-encoded strings. Each entry has: filename, description, editor, copyright, file_url.
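Because the file arrays arrive as JSON-encoded strings, they need one decoding step before use. The sketch below shows one way to do that with the standard library. The sample record is made up for illustration, and treating a missing or empty field as "no files" is an assumption rather than documented behaviour.

```python
import json

FILE_ARRAY_FIELDS = ("score_files", "parts_files", "arrangements")


def decode_file_arrays(record: dict) -> dict:
    """Return a copy of the record with the file arrays decoded into lists."""
    decoded = dict(record)
    for field in FILE_ARRAY_FIELDS:
        raw = record.get(field)
        # Assumption: a missing or empty field means the work has no such files.
        decoded[field] = json.loads(raw) if raw else []
    return decoded


# Illustrative record; every value here is invented for the example.
sample = {
    "work_title": "Example Work",
    "score_files": json.dumps([
        {
            "filename": "example.pdf",
            "description": "Complete score",
            "editor": "",
            "copyright": "Public Domain",
            "file_url": "https://imslp.org/wiki/Special:ReverseLookup/example.pdf",
        }
    ]),
    "parts_files": "[]",
}

decoded = decode_file_arrays(sample)
pdf_urls = [entry["file_url"] for entry in decoded["score_files"]]
```

After decoding, each entry is an ordinary dictionary with the five keys listed above, so filtering by `copyright` or `editor` is a plain list comprehension.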
Input
| Parameter | Type | Default | Description |
|---|---|---|---|
| maxItems | integer | 10 | Maximum number of works to return |
| includeFileDetails | boolean | true | Call the MediaWiki API for file lists, key, genre, and instrumentation. Disable for faster bulk exports: you get the catalog skeleton without per-work details. |
| composerFilter | string | (none) | Optional composer name filter (e.g. "Bach, Johann Sebastian"). Leave blank for the full catalog. |
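
As an illustration of how the three parameters combine, the sketch below builds input objects for two common profiles. The parameter names and defaults come from the table; the helper `build_run_input` is hypothetical, and how the resulting dictionary is submitted to the scraper is outside the scope of the example.

```python
from typing import Optional


def build_run_input(
    max_items: int = 10,
    include_file_details: bool = True,
    composer_filter: Optional[str] = None,
) -> dict:
    """Assemble an input object using the parameter names from the table."""
    run_input = {
        "maxItems": max_items,
        "includeFileDetails": include_file_details,
    }
    # Omit the filter entirely when no composer is given (full catalog).
    if composer_filter:
        run_input["composerFilter"] = composer_filter
    return run_input


# Fast index run: catalog skeleton only, no per-work detail calls.
fast_index = build_run_input(max_items=5000, include_file_details=False)

# Enriched run scoped to one composer, using the name format from the table.
bach_full = build_run_input(
    max_items=500,
    composer_filter="Bach, Johann Sebastian",
)
```

The `maxItems` values here are arbitrary; choose them to match the size of the export you need.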
File Detail Mode
When includeFileDetails is enabled, the scraper makes one additional MediaWiki API call per work to parse the work's wikitext. This populates score_files, parts_files, arrangements, instrumentation, key, genre, composition_year, and first_publication. It also adds ~200ms per record to the run time. For full-catalog exports where you only need the work index, disable it.
Coverage
IMSLP's public-domain focus is by design. The library was built specifically to host scores whose copyright has expired or that have been dedicated to the public domain. The copyright_status field reflects IMSLP's own tagging, which in turn rests on the copyright review IMSLP applies when a score is submitted.
Score file URLs point to imslp.org/wiki/Special:ReverseLookup/<filename>, which resolves to the PDF download. These are the same URLs end users click in the IMSLP UI.
Use Cases
- Build a searchable public-domain score database
- Feed OMR (optical music recognition) or generative music training pipelines
- Music education platforms that need structured work metadata
- Digital library catalogs with direct PDF access
- Composer or instrumentation research at scale
Data Volume
The full catalog is approximately 230,000 works. Without a composer filter and with includeFileDetails enabled, a complete run takes several hours due to polite pacing between MediaWiki API calls. Use composerFilter to scope to a specific composer, or set includeFileDetails: false for a fast full-catalog index run.
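The figures quoted in this document allow a rough back-of-the-envelope estimate. The sketch below combines the approximate catalog size, the 1,000-record worklist page size, and the ~200ms per-record detail overhead. It assumes that overhead applies uniformly to every work and ignores everything else (worklist requests, retries, platform startup), so treat the result as an order of magnitude and not a measured run time.

```python
import math

CATALOG_SIZE = 230_000      # approximate number of works, per this document
PAGE_SIZE = 1_000           # worklist records per page
DETAIL_OVERHEAD_S = 0.2     # ~200ms added per record in file detail mode


def worklist_pages(n_works: int) -> int:
    """Number of worklist pages needed to cover n_works records."""
    return math.ceil(n_works / PAGE_SIZE)


def detail_overhead_hours(n_works: int) -> float:
    """Hours added by per-work detail calls, assuming a uniform overhead."""
    return n_works * DETAIL_OVERHEAD_S / 3600


pages = worklist_pages(CATALOG_SIZE)         # 230 pages for the full index
hours = detail_overhead_hours(CATALOG_SIZE)  # roughly 12.8 hours of detail calls
```

Under these assumptions the index alone is a few hundred requests, while full enrichment is dominated by the per-work calls, which is why scoping with composerFilter or disabling includeFileDetails shortens a run so sharply.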
Built by OrbTop. Data sourced from IMSLP via its public APIs.
