Open Library Scraper
Pricing
from $3.00 / 1,000 results
Developer
Crawler Bros
Scrape Open Library — Internet Archive's open catalog of 50M+ books. Search by title / author / subject, fetch by ISBN or work ID, get full bibliographic metadata, cover images, edition counts, and full-text availability. HTTP-only via the public openlibrary.org JSON API. No auth, no proxy.
What this actor does
- Four modes: `search`, `byIsbn`, `byWorkIds`, `byAuthorIds`
- Universal IDs: OpenLibrary work IDs, ISBN-10, ISBN-13, author IDs
- Filters: publication year range, min edition count, language, full-text availability
- Cover images: auto-builds CDN URLs at S/M/L sizes
- Empty fields are omitted
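The filters listed above can be pictured as a single predicate applied to each search result. The sketch below is only an illustration, not the actor's code: it assumes documents shaped like the public `search.json` response (`first_publish_year`, `edition_count`, `language`, `has_fulltext`), and the function name `keep_work` and the choice to drop works with an unknown year when a year bound is set are assumptions.

```python
from typing import Optional


def keep_work(
    doc: dict,
    publish_year_min: Optional[int] = None,
    publish_year_max: Optional[int] = None,
    min_edition_count: Optional[int] = None,
    language: Optional[str] = None,
    include_fulltext_only: bool = False,
) -> bool:
    """Return True if a search.json-style doc passes every configured filter."""
    year = doc.get("first_publish_year")
    # Assumption: a work with no known year fails any year bound.
    if publish_year_min is not None and (year is None or year < publish_year_min):
        return False
    if publish_year_max is not None and (year is None or year > publish_year_max):
        return False
    if min_edition_count is not None and doc.get("edition_count", 0) < min_edition_count:
        return False
    # `language` holds 3-letter codes such as "eng" or "fre".
    if language is not None and language not in doc.get("language", []):
        return False
    if include_fulltext_only and not doc.get("has_fulltext", False):
        return False
    return True
```

Calling `keep_work(doc, publish_year_min=1900, publish_year_max=2000, min_edition_count=10, include_fulltext_only=True)` mirrors the "classic fantasy" input example.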
Output per book
- `workId`: OpenLibrary work key (e.g. `OL27482W`)
- `title`, `subtitle`, `description`
- `firstPublishYear`, `editionCount`
- `authors[]`, `primaryAuthor`, `authorIds[]`
- `subjects[]`, `subjectPlaces[]`, `subjectPeople[]`, `subjectTimes[]`
- `languages[]`: 3-letter ISO codes
- `isbn10[]`, `isbn13[]`: capped at 5 each
- `coverUrl`: large cover via covers.openlibrary.org CDN
- `hasFulltext`, `publicScan`, `ebookAccess`
- `openLibraryUrl`
- `recordType: "book"`, `scrapedAt`
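To make the output shape concrete, here is a rough sketch of how a `search.json` document could be mapped to a subset of these fields. It is not the actor's implementation: `to_record` is a hypothetical helper, the input keys (`key`, `author_name`, `author_key`, `isbn`, `cover_i`, and so on) are the public search API's field names, and the `isbn` list is assumed to have been requested from the API.

```python
from datetime import datetime, timezone

ISBN_CAP = 5  # ISBN lists are capped at 5 each


def to_record(doc: dict) -> dict:
    """Map a search.json-style doc to an output record, omitting empty fields."""
    work_id = doc.get("key", "").rsplit("/", 1)[-1]  # "/works/OL27482W" -> "OL27482W"
    authors = doc.get("author_name", [])
    isbns = doc.get("isbn", [])
    cover_id = doc.get("cover_i")
    record = {
        "workId": work_id,
        "title": doc.get("title"),
        "firstPublishYear": doc.get("first_publish_year"),
        "editionCount": doc.get("edition_count"),
        "authors": authors,
        "primaryAuthor": authors[0] if authors else None,
        "authorIds": doc.get("author_key", []),
        "languages": doc.get("language", []),
        "isbn10": [i for i in isbns if len(i) == 10][:ISBN_CAP],
        "isbn13": [i for i in isbns if len(i) == 13][:ISBN_CAP],
        "coverUrl": (
            f"https://covers.openlibrary.org/b/id/{cover_id}-L.jpg" if cover_id else None
        ),
        "hasFulltext": doc.get("has_fulltext"),
        "openLibraryUrl": f"https://openlibrary.org/works/{work_id}" if work_id else None,
        "recordType": "book",
        "scrapedAt": datetime.now(timezone.utc).isoformat(),
    }
    # Drop None, empty strings, and empty lists; False and 0 are kept.
    return {k: v for k, v in record.items() if v not in (None, "", [])}
```

Keeping `False` while dropping `None` means a work without full text still reports `hasFulltext: false` instead of silently losing the field.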
Input
| Field | Type | Default | Description |
|---|---|---|---|
| `mode` | string | `search` | `search` / `byIsbn` / `byWorkIds` / `byAuthorIds` |
| `searchQuery` | string | `the lord of the rings` | Free-text query |
| `queryAuthor` | string | – | Constrain to author |
| `querySubject` | string | – | Constrain to subject |
| `isbns` | array | – | ISBNs (mode=byIsbn) |
| `workIds` | array | – | Work IDs (mode=byWorkIds) |
| `authorIds` | array | – | Author IDs (mode=byAuthorIds) |
| `publishYearMin` | int | – | Drop works before this year |
| `publishYearMax` | int | – | Drop works after this year |
| `minEditionCount` | int | – | Drop works with fewer editions |
| `language` | string | – | 3-letter ISO (e.g. eng, fre) |
| `includeFulltextOnly` | bool | `false` | Only emit works with full-text on Internet Archive |
| `maxItems` | int | `50` | Hard cap (1–1000) |
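As an illustration of how the search inputs might translate into a request, the sketch below builds an Open Library `search.json` URL. The `q`, `author`, `subject`, and `limit` query parameters belong to the public search API; the function name `build_search_url` and the exact mapping from input fields to parameters are assumptions, not the actor's documented behaviour.

```python
from typing import Optional
from urllib.parse import urlencode


def build_search_url(
    search_query: str = "",
    query_author: Optional[str] = None,
    query_subject: Optional[str] = None,
    max_items: int = 50,
) -> str:
    """Build a search.json URL; max_items is clamped to the 1-1000 hard cap."""
    limit = max(1, min(int(max_items), 1000))
    params = {}
    if search_query:
        params["q"] = search_query
    if query_author:
        params["author"] = query_author
    if query_subject:
        params["subject"] = query_subject
    params["limit"] = limit
    return "https://openlibrary.org/search.json?" + urlencode(params)
```

For example, `build_search_url("fantasy", query_author="tolkien", max_items=20)` yields a URL ending in `?q=fantasy&author=tolkien&limit=20`.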
Example: classic fantasy with full-text scans
{"mode": "search","searchQuery": "fantasy","publishYearMin": 1900,"publishYearMax": 2000,"minEditionCount": 10,"includeFulltextOnly": true}
Example: lookup by ISBN
{"mode": "byIsbn","isbns": ["9780261103573", "9780261102217", "0-261-10357-1"]}
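Because ISBNs arrive in mixed forms (ISBN-10, ISBN-13, with or without hyphens), it can help to normalize and check them before a run. The sketch below uses the standard ISBN check-digit rules; the helper names are hypothetical, and whether the actor performs the same validation is not stated here.

```python
def normalize_isbn(raw: str) -> str:
    """Strip hyphens, spaces, and other separators; uppercase a trailing 'x'."""
    return "".join(ch for ch in raw.upper() if ch in "0123456789X")


def is_valid_isbn(raw: str) -> bool:
    """Validate the ISBN-10 (mod 11) or ISBN-13 (mod 10) check digit."""
    s = normalize_isbn(raw)
    if len(s) == 10:
        if "X" in s[:9]:  # 'X' may only appear as the check digit
            return False
        total = sum((10 - i) * (10 if ch == "X" else int(ch)) for i, ch in enumerate(s))
        return total % 11 == 0
    if len(s) == 13:
        if "X" in s:
            return False
        total = sum(int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(s))
        return total % 10 == 0
    return False


def isbn_lookup_url(raw: str) -> str:
    """URL of Open Library's public per-ISBN JSON endpoint."""
    return f"https://openlibrary.org/isbn/{normalize_isbn(raw)}.json"
```

A pre-flight pass such as `[i for i in isbns if is_valid_isbn(i)]` avoids spending requests on identifiers with a mistyped check digit.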
Example: all of an author's works
{"mode": "byAuthorIds","authorIds": ["OL26320A"],"maxItems": 200}
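For a sense of what fetching an author's works involves, the sketch below generates paged URLs for Open Library's public `/authors/{id}/works.json` endpoint, which accepts `limit` and `offset`. The function name and the page size of 50 are assumptions chosen for illustration; the actor's actual paging strategy is not documented here.

```python
def author_works_urls(author_id: str, max_items: int = 200, page_size: int = 50) -> list:
    """Return the page URLs needed to fetch up to max_items works for one author."""
    urls = []
    for offset in range(0, max_items, page_size):
        # The last page may be shorter so the total never exceeds max_items.
        limit = min(page_size, max_items - offset)
        urls.append(
            f"https://openlibrary.org/authors/{author_id}/works.json"
            f"?limit={limit}&offset={offset}"
        )
    return urls
```

With `authorIds: ["OL26320A"]` and `maxItems: 200`, this produces four URLs with offsets 0, 50, 100, and 150.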
Use cases
- Library systems — bulk-import metadata from Open Library by ISBN
- Edtech — discover books by subject for curriculum design
- Content discovery — find books by author / subject / time period
- Recommendation engines — feed Open Library subject taxonomy into your recommender
- Publishing intelligence — track edition counts to gauge popularity
- Academic research — bulk-export a subject's bibliographic record
FAQ
What's Open Library? An open, editable, library-grade catalog from Internet Archive. ~50M books, ~10M authors, free for any use. See openlibrary.org.
Is there a rate limit? Generous; no documented hard cap for normal scraping. The actor uses small delays to be polite.
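A polite delay of this kind can be as simple as sleeping between consecutive requests. The generator below is a minimal sketch of that idea; the name `polite`, the 0.5 second default, and the injectable `sleep` argument are all assumptions rather than the actor's real settings.

```python
import time
from typing import Callable, Iterable, Iterator


def polite(
    items: Iterable,
    delay_seconds: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator:
    """Yield items one by one, pausing between them but not before the first."""
    for index, item in enumerate(items):
        if index > 0:
            sleep(delay_seconds)
        yield item
```

Wrapping a URL list as `for url in polite(urls): ...` spaces requests out without changing the rest of the loop.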
What are work IDs vs edition IDs? A "work" is the abstract concept of a book (e.g. "The Hobbit"); an "edition" is a specific publication of it (paperback 1995 ed.). The actor returns work-level records by default.
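Open Library identifiers carry their type in the final letter: `W` for works, `M` for editions, and `A` for authors. The sketch below classifies an identifier on that basis; `olid_kind` is a hypothetical helper, not part of the actor.

```python
import re
from typing import Optional

_OLID = re.compile(r"^OL\d+([WMA])$")
_KINDS = {"W": "work", "M": "edition", "A": "author"}


def olid_kind(identifier: str) -> Optional[str]:
    """Return 'work', 'edition', or 'author', or None if not an Open Library ID."""
    # Accept bare IDs ("OL27482W") as well as keys ("/works/OL27482W").
    tail = identifier.strip().rsplit("/", 1)[-1]
    match = _OLID.match(tail)
    return _KINDS[match.group(1)] if match else None
```

A check like this makes it easy to catch an edition ID passed to `workIds` by mistake before the run starts.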
Why are some ISBN lists capped at 5? Popular works have hundreds of editions, each with its own ISBN. We keep the first 5 for table compactness; full lists are available via per-edition queries.
How do subjects, subjectPlaces, subjectPeople, subjectTimes differ? Open Library's subject taxonomy is faceted: regular subjects (Fantasy), geographic places (Middle-earth), people (Bilbo Baggins), and time periods (Third Age).
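What's ebookAccess? The work's e-book availability: borrowable (Internet Archive lending), printdisabled (only for users with print disabilities), or no_ebook (text not digitized). Open Library's own search index also defines public (freely readable) and unclassified, so those values may appear as well.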
Can I get cover images at other sizes? The actor emits the L (large) URL. To get S or M, swap the suffix in the URL: -L.jpg → -M.jpg or -S.jpg.
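The suffix swap is easy to automate. The sketch below rewrites the size letter at the end of a cover URL; `cover_at_size` is a hypothetical helper, and it assumes the URL ends in `-S.jpg`, `-M.jpg`, or `-L.jpg` as described above.

```python
import re

_SIZE_SUFFIX = re.compile(r"-[SML]\.jpg$")


def cover_at_size(cover_url: str, size: str) -> str:
    """Return the same cover URL at size 'S', 'M', or 'L'."""
    if size not in ("S", "M", "L"):
        raise ValueError(f"unsupported size: {size!r}")
    if not _SIZE_SUFFIX.search(cover_url):
        raise ValueError("URL does not end in -S.jpg, -M.jpg, or -L.jpg")
    return _SIZE_SUFFIX.sub(f"-{size}.jpg", cover_url)
```

For example, `cover_at_size(record["coverUrl"], "M")` gives the medium variant of the emitted large cover.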
How fresh is the data? Daily — Open Library re-indexes nightly from edits, ISBN ingestion, and Internet Archive's MARC import pipeline.