Pricing

Pay per usage

OpenLibrary Books — Metadata, ISBNs, Authors, CSV, No API Key

19 runs. OpenLibrary metadata as CSV/JSON — titles, authors, ISBNs, subjects, languages, pageCount, coverUrl, ebookAccess, ratings. By query/ISBN/subject/author. For library cataloguing + book-rec engines + academic research. No API key. Backed by 951-run Trustpilot flagship + 31-actor portfolio.

Developer

Alex

Last modified: a month ago

OpenLibrary Book Scraper — Metadata, ISBNs, Authors, Subjects

Scrape book metadata from the free OpenLibrary API. No API key, no rate-limit token, no auth wall. Four input modes: search queries, ISBN lookups, subject browse, author works. Output JSON or CSV.

Built for: library data builds, reading-list automation, ISBN enrichment, book recommendation datasets, academic citation enrichment.

What this actor does (honest scope, verified against `src/main.js`)

Calls these public OpenLibrary endpoints under the hood:

Input field	Endpoint hit	Returns
`searchQueries`	`/search.json?q=…&page=N&limit=50`	50 docs/page, paginated until `maxBooksPerSource` reached or `numFound <= collected`
`isbns`	`/isbn/{isbn}.json`	One book per ISBN
`subjects`	`/subjects/{slug}.json?limit=min(maxBooksPerSource, 50)`	Hard-capped at 50 — even if you set `maxBooksPerSource=200`, subject browse returns at most 50
`authors`	`/search/authors.json?limit=1` + `/authors/{key}/works.json?limit=min(maxBooksPerSource, 50)`	First author match only (no disambiguation), then up to 50 works

Sets User-Agent: ApifyOpenLibraryScraper/1.0. Inserts polite delays between requests: 200ms after each work-description fetch, 300ms before each ISBN/subject/author lookup, 500ms between search-mode pages. If includeDescription=true (default), search-mode and isbn-mode fire one extra /works/{key}.json per book to pull the description text — slower but richer. Subject-mode and author-mode never fetch the work-description endpoint — they read whatever description is already in the listing payload.
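The endpoint table above can be sketched as a handful of URL builders. This is an illustration in Python, not the actor's own `src/main.js`; the function names are hypothetical, and the exact encoding the actor applies to queries and slugs is an assumption.

```python
from urllib.parse import quote, urlencode

BASE = "https://openlibrary.org"
PAGE_SIZE = 50  # search page size and subject/author limit, per the table above


def search_url(query: str, page: int) -> str:
    """Search-mode URL for one page (pages are 1-indexed)."""
    params = urlencode({"q": query, "page": page, "limit": PAGE_SIZE})
    return f"{BASE}/search.json?{params}"


def isbn_url(isbn: str) -> str:
    """ISBN-mode URL: one edition per ISBN."""
    return f"{BASE}/isbn/{quote(isbn, safe='')}.json"


def subject_slug(subject: str) -> str:
    """Lowercase, spaces to underscores, then URL-encode whatever is left."""
    return quote(subject.lower().replace(" ", "_"), safe="")


def subject_url(subject: str, max_books: int) -> str:
    """Subject-mode URL; the limit never exceeds 50."""
    limit = min(max_books, PAGE_SIZE)
    return f"{BASE}/subjects/{subject_slug(subject)}.json?limit={limit}"


def author_works_url(author_key: str, max_books: int) -> str:
    """Author-mode works URL for an already-resolved author key."""
    limit = min(max_books, PAGE_SIZE)
    return f"{BASE}/authors/{author_key}/works.json?limit={limit}"
```

Note how `subject_url("Science Fiction", 200)` still asks for only 50 records, which is the cap described in the table.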

Input parameters

Field	Type	Default	Description
`searchQueries`	array of strings	`[]`	Free-text search (title/keyword/phrase)
`isbns`	array of strings	`[]`	ISBN-10 or ISBN-13 lookups
`subjects`	array of strings	`[]`	Subject names — auto-lowercased and spaces replaced with underscores (e.g. `"Science Fiction"` → slug `science_fiction`). Special characters NOT escaped beyond URL-encoding — exotic subject names may 404.
`authors`	array of strings	`[]`	Author names (e.g. `"Isaac Asimov"`). Only the first match is taken (`limit=1`) — common-name authors may resolve to a different person than expected. Use the OpenLibrary author-key directly via a custom build if disambiguation matters.
`maxBooksPerSource`	integer	`50`	Cap per query/ISBN/subject/author (schema allows 1-200, but subjects and authors are server-capped at 50 regardless)
`includeDescription`	boolean	`true`	Fetch full description (extra API call per book in search-mode and isbn-mode only)

You can mix all four modes in a single run. Each output record carries a source field telling you which mode produced it (search:<query>, isbn:<n>, subject:<s>, author:<a>).
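When a run mixes modes, the `source` field is the simplest way to separate the records afterwards. A minimal post-processing sketch, assuming the dataset has been downloaded as a list of dicts (the helper names are hypothetical):

```python
from collections import defaultdict


def split_source(source: str) -> tuple[str, str]:
    """Split 'search:foundation' into ('search', 'foundation').

    Only the first colon is used, so a query that itself contains
    a colon stays intact in the second element.
    """
    mode, _, value = source.partition(":")
    return mode, value


def group_by_mode(records: list[dict]) -> dict[str, list[dict]]:
    """Bucket records by the mode prefix of their source field."""
    groups = defaultdict(list)
    for record in records:
        mode, _ = split_source(record.get("source", ""))
        groups[mode].append(record)
    return dict(groups)
```

Grouping first is useful because each mode emits a different field set, so per-mode handling is usually easier than one shared code path.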

Output schema (varies by source mode — fields differ deliberately)

Records from different modes carry different field sets. This is by design — OpenLibrary returns richer metadata for search results than for ISBN / subject / author endpoints.

`search:` mode (22 base fields, +`description` with `includeDescription`, +2 metadata = up to 25)

{
  "title": "Foundation",
  "authors": ["Isaac Asimov"],
  "authorKeys": ["OL26320A"],
  "firstPublishYear": 1951,
  "publishYears": [1951, 1952, 1955, 1962, 1974],
  "isbn": "9780553293357",
  "allIsbns": ["9780553293357", "9780553382570", "..."],
  "subjects": ["Science fiction", "Galactic empire", "..."],
  "publishers": ["Bantam Spectra", "Doubleday", "..."],
  "languages": ["eng"],
  "pageCount": 244,
  "editionCount": 142,
  "coverUrl": "https://covers.openlibrary.org/b/id/9261361-L.jpg",
  "openLibraryKey": "/works/OL46828W",
  "openLibraryUrl": "https://openlibrary.org/works/OL46828W",
  "ebookAccess": "borrowable",
  "hasFulltext": true,
  "ratingsAverage": 4.12,
  "ratingsCount": 1284,
  "wantToRead": 8421,
  "currentlyReading": 412,
  "alreadyRead": 6203,
  "description": "In the waning days of a future Galactic Empire...",
  "source": "search:foundation",
  "scrapedAt": "2026-04-29T12:00:00.000Z"
}

Field caps in search-mode: allIsbns truncated to first 10, subjects truncated to first 20, publishers truncated to first 5. pageCount is number_of_pages_median (median across editions, not the specific-edition page count).

`isbn:` mode (10 base fields, +`description`+`subjects` if `includeDescription`=true)

title, isbn, publishers (uncapped), publishDate, pageCount (specific-edition number_of_pages, NOT median), coverUrl, openLibraryKey, openLibraryUrl, source, scrapedAt. With includeDescription=true, adds description and subjects (uncapped). Description fetch is wrapped in a silent try/catch — on failure, both description and subjects are simply absent (no error field, no retry).

`subject:` mode (10 fields)

title, authors (array of names; this mode has no author-key field), coverUrl, openLibraryKey, openLibraryUrl, editionCount, firstPublishYear, subject, source, scrapedAt. No ratings, no ISBN, no description in this mode: that's an OpenLibrary /subjects/ endpoint limitation, not ours. Server hard-caps to 50 records regardless of maxBooksPerSource.

`author:` mode (8 base fields, +`description` if includeDescription=true)

title, authors (1-element array with the resolved author name), authorKey (singular — different from search-mode's plural authorKeys), openLibraryKey, openLibraryUrl, covers (capped to first 3 cover URLs), source, scrapedAt. With includeDescription=true, adds description IF the author-works payload already contains it (no extra API call — purely best-effort). Server hard-caps to 50 works regardless of maxBooksPerSource.

Field-name asymmetry across modes: search-mode emits authorKeys (plural array) + coverUrl (single URL); author-mode emits authorKey (singular string) + covers (array of up to 3); subject-mode and isbn-mode emit coverUrl but no author-key field. If you join across modes, normalize these explicitly.
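One way to normalize the asymmetric fields before joining is to map every record onto the search-mode shape. The sketch below is an assumption about what a reasonable common shape looks like (plural `authorKeys`, single `coverUrl`); the function name and the sample values in the usage note are hypothetical.

```python
def normalize(record: dict) -> dict:
    """Return a copy of the record with authorKeys (list) and coverUrl (str or None)."""
    out = dict(record)

    # Author keys: plural list in search-mode, singular string in author-mode,
    # missing entirely in the other modes.
    if "authorKeys" not in out:
        key = out.pop("authorKey", None)
        out["authorKeys"] = [key] if key else []

    # Covers: single URL in most modes, list of up to 3 in author-mode.
    if "coverUrl" not in out:
        covers = out.pop("covers", None) or []
        out["coverUrl"] = covers[0] if covers else None

    return out
```

For an author-mode record such as `{"authorKey": "OL26320A", "covers": ["a.jpg", "b.jpg"]}`, this yields `authorKeys == ["OL26320A"]` and `coverUrl == "a.jpg"`; the extra covers are dropped, so keep the original record if you need them.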

Operational caveats

⚠️ Outer try/catch wraps the entire 4-mode for-loop (src/main.js lines 57-222). ISBN, subject, and author loops have inner try/catch so individual lookup failures don't halt their batch. BUT search-mode does NOT have inner protection — a single search-API failure (e.g. transient HTTP 500, network blip) kills the run mid-stream and skips ALL remaining search queries, ISBN lookups, subject browses, and author lookups. Run problematic queries in isolation if dropout matters.
No retry / no proxy. Single fetch() per URL. Heavy bursts may eventually trigger OpenLibrary's polite-use ceiling (~100 req/min unofficial); the actor will surface that as a thrown HTTP error.
Description-fetch silent-empty. When includeDescription=true and the work-page fetch fails, description is set to empty string (search-mode) or absent (isbn-mode) — no error is logged per book.
Subject slug transform is naive. Input "Science Fiction" → slug "science_fiction". Special characters beyond letters/spaces are URL-encoded but not slug-normalized; subjects like "René Magritte's books" will likely 404.
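Because one failed search request can end the whole run, a caller can limit the damage by giving each search query its own run and keeping the other modes together. A minimal sketch of that input split, assuming the input field names from the parameters table (the helper itself is hypothetical and does not start any runs):

```python
MODE_FIELDS = ("searchQueries", "isbns", "subjects", "authors")


def isolate_search_queries(actor_input: dict) -> list[dict]:
    """Split one mixed input into several: one per search query, plus one
    for the ISBN/subject/author lookups, which already tolerate
    individual failures."""
    # Settings such as maxBooksPerSource are copied into every run.
    shared = {k: v for k, v in actor_input.items() if k not in MODE_FIELDS}

    runs = [
        {**shared, "searchQueries": [query]}
        for query in actor_input.get("searchQueries", [])
    ]

    rest = {
        field: actor_input[field]
        for field in ("isbns", "subjects", "authors")
        if actor_input.get(field)
    }
    if rest:
        runs.append({**shared, **rest})
    return runs
```

Each returned dict can be submitted as a separate run, so a transient failure on one query costs only that query's results.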

What this actor does NOT do

No reading-progress / personal-list scraping — OpenLibrary doesn't expose individual users' lists.
No full-text book content — only metadata + descriptions. Read free books at openlibrary.org or via Internet Archive.
No price comparison — OpenLibrary is metadata-only, not a bookstore.
No deduplication across modes — if you search "Foundation" and lookup ISBN 9780553293357, you'll get 2 records. Dedupe by openLibraryKey post-run if needed.
No incremental crawl / cursor state — each run starts fresh from page 1.
No author disambiguation — first match wins.
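The post-run deduplication mentioned above can be as small as the sketch below. It keeps the first record seen for each `openLibraryKey` and only collapses records whose keys match exactly; whether two modes report the same key for the same book is something to check on your own data. The function name is hypothetical.

```python
def dedupe_by_key(records: list[dict]) -> list[dict]:
    """Keep the first record per openLibraryKey, preserving input order.

    Records without a key are always kept, since there is nothing
    to compare them on.
    """
    seen = set()
    unique = []
    for record in records:
        key = record.get("openLibraryKey")
        if key is not None:
            if key in seen:
                continue
            seen.add(key)
        unique.append(record)
    return unique
```

Since search-mode records carry the richest field set, sorting them to the front before calling this keeps the most complete copy of each book.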

When this stops being enough

If you need book full-text → use Internet Archive. If you need real-time bookstore prices → write a separate Amazon/Bookshop scraper. If you need annotated bibliographies → look at Goodreads (no public API since 2020, harder).

Custom builds — pilot tiers

This actor runs on Apify's standard compute. If you need a custom variant — search-mode-only with retry+backoff, ISBN-bulk with deduplication, subject browse paginated past the 50-cap (via search workaround), author-key direct lookup, hourly cron, Slack alerts on new releases — three tiers:

Pilot — $97 · 1 actor, basic config, 7-day support. Good for one-off "top 200 books in subject X" via search + subject hybrid.
Standard — $297 · custom actor + Slack/email alerts on results, 30-day support. Most reading-list / catalog-enrichment projects fit here.
Premium — $797 · custom actor + dashboard + 90-day support + 1 modification round. For ongoing pipelines (weekly new-release feed, ISBN-stream enrichment, author-tracking dashboards).

Email: spinov001@gmail.com — drop the input shape and the schema you need; quote within 48h.

Proof of work: 31 published Apify scrapers (78 total in portfolio) — Trustpilot 949 runs, Reddit 80+, Google News 43, Glassdoor 37, Email Extractor 36+. Recently delivered a paid 3-article series for a client in the proxy industry ($150).

More tips: t.me/scraping_ai · blog.spinov.online

Actor	Data returned	Category
OpenLibrary (this)	Book metadata + ISBN/subject/author	Bibliographic
Wikipedia Scraper	Article + sections + references	Encyclopedic
arXiv Paper Scraper	Academic preprints	Research
[Google Books style — request a custom build via email]	—	—

All 31 published actors free to inspect on Apify Store.

Disclaimer

Scrapes the publicly accessible OpenLibrary API endpoints. Respects polite delays (200-500ms between requests). Not affiliated with the Internet Archive or OpenLibrary.

Honest disclosure: search-mode 22 base fields (up to 25 with description + 2 metadata fields), isbn-mode 10 base, subject-mode 10 fields, author-mode 8 base. Subject and author endpoints server-capped at 50 records regardless of maxBooksPerSource. Outer try/catch: a single search-API failure halts the entire run. Single-attempt fetch, no retry/no proxy. Author-mode uses limit=1 with no disambiguation; first match wins.
