Open Library Scraper

Pricing

from $3.00 / 1,000 results

Scrape Open Library, Internet Archive's open catalog of 50M+ books. Search by title/author/subject, fetch by ISBN or work ID, get full bibliographic metadata, cover images, ratings, and edition counts.

Rating: 5.0 (13)

Developer

Crawler Bros

Maintained by Community

Actor stats

  • Bookmarked: 13
  • Total users: 2
  • Monthly active users: 1
  • Last modified: a day ago

Scrape Open Library — Internet Archive's open catalog of 50M+ books. Search by title / author / subject, fetch by ISBN or work ID, get full bibliographic metadata, cover images, edition counts, and full-text availability. HTTP-only via the public openlibrary.org JSON API. No auth, no proxy.

What this actor does

  • Four modes: search, byIsbn, byWorkIds, byAuthorIds
  • Universal IDs: OpenLibrary work IDs, ISBN-10, ISBN-13, author IDs
  • Filters: publication year range, min edition count, language, full-text availability
  • Cover images: auto-builds CDN URLs at S/M/L sizes
  • Empty fields are omitted

Output per book

  • workId — OpenLibrary work key (e.g. OL27482W)
  • title, subtitle, description
  • firstPublishYear, editionCount
  • authors[], primaryAuthor, authorIds[]
  • subjects[], subjectPlaces[], subjectPeople[], subjectTimes[]
  • languages[] — 3-letter ISO codes
  • isbn10[], isbn13[] — capped at 5 each
  • coverUrl — large cover via covers.openlibrary.org CDN
  • hasFulltext, publicScan, ebookAccess
  • openLibraryUrl
  • recordType (set to "book"), scrapedAt

Input

| Field | Type | Default | Description |
|---|---|---|---|
| mode | string | search | One of search, byIsbn, byWorkIds, byAuthorIds |
| searchQuery | string | the lord of the rings | Free-text query |
| queryAuthor | string | | Constrain to author |
| querySubject | string | | Constrain to subject |
| isbns | array | | ISBNs (mode=byIsbn) |
| workIds | array | | Work IDs (mode=byWorkIds) |
| authorIds | array | | Author IDs (mode=byAuthorIds) |
| publishYearMin | int | | Drop works before this year |
| publishYearMax | int | | Drop works after this year |
| minEditionCount | int | | Drop works with fewer editions |
| language | string | | 3-letter ISO code (e.g. eng, fre) |
| includeFulltextOnly | bool | false | Only emit works with full text on Internet Archive |
| maxItems | int | 50 | Hard cap (1–1000) |
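
As a rough illustration of how these fields fit together, the sketch below assembles a run input and checks the two constraints the table states (the four modes and the 1–1000 range for maxItems). The helper names build_input and run_actor are hypothetical, and the actor ID passed to run_actor is a placeholder you would copy from the Apify Store page. The run_actor part assumes the apify-client Python package and needs a network connection and an API token.

```python
VALID_MODES = {"search", "byIsbn", "byWorkIds", "byAuthorIds"}


def build_input(mode="search", max_items=50, **fields):
    """Assemble a run input dict, dropping fields that were left unset."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if not 1 <= max_items <= 1000:
        raise ValueError("maxItems must be between 1 and 1000")
    run_input = {"mode": mode, "maxItems": max_items}
    run_input.update({k: v for k, v in fields.items() if v is not None})
    return run_input


def run_actor(token, actor_id, run_input):
    """Run the actor and return its dataset items (hypothetical helper)."""
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

For example, build_input("byIsbn", isbns=["9780261103573"]) yields a dict with mode, maxItems, and isbns that can be passed to run_actor or pasted into the Apify console as JSON.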

Example: classic fantasy with full-text scans

{
  "mode": "search",
  "searchQuery": "fantasy",
  "publishYearMin": 1900,
  "publishYearMax": 2000,
  "minEditionCount": 10,
  "includeFulltextOnly": true
}
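
One way to read the filter semantics is as a predicate over the output fields firstPublishYear, editionCount, and hasFulltext. The sketch below is a client-side approximation, not the actor's own code: it assumes the year bounds are inclusive and that a record missing a filtered field is dropped. The function name passes_filters and the sample records are made up for illustration.

```python
def passes_filters(book, year_min=None, year_max=None,
                   min_editions=None, fulltext_only=False):
    """Return True if a work-level record survives the given filters."""
    year = book.get("firstPublishYear")
    if year_min is not None and (year is None or year < year_min):
        return False
    if year_max is not None and (year is None or year > year_max):
        return False
    if min_editions is not None and book.get("editionCount", 0) < min_editions:
        return False
    if fulltext_only and not book.get("hasFulltext", False):
        return False
    return True


# Made-up sample records using the documented output field names.
books = [
    {"title": "A", "firstPublishYear": 1937, "editionCount": 120, "hasFulltext": True},
    {"title": "B", "firstPublishYear": 2005, "editionCount": 40, "hasFulltext": True},
    {"title": "C", "firstPublishYear": 1954, "editionCount": 3, "hasFulltext": True},
    {"title": "D", "firstPublishYear": 1968, "editionCount": 25},
]
kept = [b["title"] for b in books
        if passes_filters(b, 1900, 2000, min_editions=10, fulltext_only=True)]
```

With the sample data, only record A passes: B is too recent, C has too few editions, and D has no hasFulltext field.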

Example: lookup by ISBN

{
  "mode": "byIsbn",
  "isbns": ["9780261103573", "9780261102217", "0-261-10357-1"]
}
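
Before submitting a long ISBN list, it can help to check each entry locally. The sketch below strips hyphens and spaces and verifies the standard ISBN-10 (mod 11) and ISBN-13 (mod 10) check digits. Whether the actor performs the same normalization or validation is not stated here, so treat this as an optional pre-check. The function names are illustrative.

```python
def normalize_isbn(raw):
    """Remove hyphens and spaces and upper-case a trailing 'x'."""
    return raw.replace("-", "").replace(" ", "").upper()


def is_valid_isbn(raw):
    """Check the ISBN-10 or ISBN-13 check digit of a possibly hyphenated string."""
    isbn = normalize_isbn(raw)
    if not isbn.isascii():
        return False
    if len(isbn) == 10:
        body, check = isbn[:9], isbn[9]
        if not body.isdigit() or not (check.isdigit() or check == "X"):
            return False
        # Weights run 10, 9, ..., 1; 'X' stands for 10 in the last position.
        total = sum((10 - i) * (10 if ch == "X" else int(ch))
                    for i, ch in enumerate(isbn))
        return total % 11 == 0
    if len(isbn) == 13 and isbn.isdigit():
        # Weights alternate 1, 3, 1, 3, ... across all 13 digits.
        total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(isbn))
        return total % 10 == 0
    return False
```

For instance, is_valid_isbn("9780261103573") and is_valid_isbn("0-261-10357-1") both return True, since the two strings are the ISBN-13 and ISBN-10 forms of the same number.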

Example: all of an author's works

{
  "mode": "byAuthorIds",
  "authorIds": ["OL26320A"],
  "maxItems": 200
}

Use cases

  • Library systems — bulk-import metadata from Open Library by ISBN
  • Edtech — discover books by subject for curriculum design
  • Content discovery — find books by author / subject / time period
  • Recommendation engines — feed Open Library subject taxonomy into your recommender
  • Publishing intelligence — track edition counts to gauge popularity
  • Academic research — bulk-export a subject's bibliographic record

FAQ

What's Open Library? An open, editable, library-grade catalog from Internet Archive. ~50M books, ~10M authors, free for any use. See openlibrary.org.

Is there a rate limit? Generous; no documented hard cap for normal scraping. The actor uses small delays to be polite.

What are work IDs vs edition IDs? A "work" is the abstract concept of a book (e.g. "The Hobbit"); an "edition" is a specific publication of it (paperback 1995 ed.). The actor returns work-level records by default.
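
Open Library identifiers carry their type in the final letter: W for works, M for editions, and A for authors. A small sketch for telling them apart before choosing between workIds and authorIds follows. The helper name classify_ol_id is hypothetical, and it also accepts path-style keys such as /works/OL27482W.

```python
import re

_OL_ID = re.compile(r"OL\d+([WMA])")
_KINDS = {"W": "work", "M": "edition", "A": "author"}


def classify_ol_id(identifier):
    """Return 'work', 'edition', or 'author', or None if the ID is not recognized."""
    # Keep only the last path segment so '/works/OL27482W' is handled too.
    last_segment = identifier.strip().rsplit("/", 1)[-1]
    match = _OL_ID.fullmatch(last_segment)
    return _KINDS[match.group(1)] if match else None
```

Here classify_ol_id("OL27482W") returns "work" and classify_ol_id("OL26320A") returns "author", matching the IDs used in the examples above.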

Why are some ISBN lists capped at 5? Popular works have hundreds of editions, each with its own ISBN. We keep the first 5 for table compactness; full lists are available via per-edition queries.

How do subjects, subjectPlaces, subjectPeople, subjectTimes differ? Open Library's subject taxonomy is faceted: regular subjects (Fantasy), geographic places (Middle-earth), people (Bilbo Baggins), and time periods (Third Age).

What's ebookAccess? borrowable (Internet Archive lending), printdisabled (only for users with print disabilities), or no_ebook (text not digitized).

Can I get cover images at other sizes? The actor emits the L (large) URL. To get S or M, replace the -L.jpg suffix in the URL with -M.jpg or -S.jpg.
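
The suffix swap can be sketched as a small helper. The function name cover_url_for_size is illustrative, and the cover ID in the example URL is made up. The code assumes coverUrl ends in -S.jpg, -M.jpg, or -L.jpg, as the covers.openlibrary.org URLs described above do.

```python
def cover_url_for_size(cover_url, size):
    """Return the same cover URL at size 'S', 'M', or 'L'."""
    if size not in {"S", "M", "L"}:
        raise ValueError("size must be 'S', 'M', or 'L'")
    for suffix in ("-L.jpg", "-M.jpg", "-S.jpg"):
        if cover_url.endswith(suffix):
            return cover_url[: -len(suffix)] + f"-{size}.jpg"
    raise ValueError(f"unexpected cover URL format: {cover_url}")


# Example with a made-up cover ID.
large = "https://covers.openlibrary.org/b/id/12345-L.jpg"
medium = cover_url_for_size(large, "M")
```

After this runs, medium holds the same URL ending in -M.jpg.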

How fresh is the data? Daily — Open Library re-indexes nightly from edits, ISBN ingestion, and Internet Archive's MARC import pipeline.