Open Library Scraper

Pricing: Pay per event

Comprehensive scraper for Open Library to extract books, authors, subjects, and list data from the Internet Archive’s platform. Supports multiple search types and ebook filtering, providing automated, structured access to Open Library’s extensive bibliographic collection.

Rating: 5.0 (1)

Developer: ParseForge · Maintained by Community

Actor stats: 1 bookmarked · 6 total users · 0 monthly active users · last modified 2 days ago

📚 Open Library Scraper

🚀 Extract book data from Open Library in seconds. Search by title, author, or subject with ebook filtering. No coding, no API keys required.

🕒 Last updated: 2026-04-16 · 📊 20 fields · 🔍 5 search types · 📖 Supports books, authors, subjects, lists, and full-text search

Open Library is the Internet Archive's free, open catalog, which aims to provide a record for every book ever published. This scraper connects to Open Library's public API and returns structured book data including titles, authors, ISBNs, publishers, cover images, ratings, descriptions, page counts, and download links. It supports 5 search types (Books, Authors, Search Inside, Subjects, and Lists), handles pagination automatically, and exports data as JSON, CSV, or Excel.

Whether you are building a book recommendation engine, tracking ISBN availability, or conducting bibliographic research, this actor delivers up to 1,000,000 structured records per run on paid plans (free runs are limited to 10 items). Each book result can include cover images, publication dates, publisher names, available formats (PDF, EPUB, AZW3), community ratings, edition counts, and subject classifications. No manual searching, copying, or format conversion is needed.

| 🎯 Target Audience | 💡 Use Cases |
| --- | --- |
| Librarians | Build digital catalogs with cover images and ISBNs |
| Publishers | Research publication history and edition counts |
| Book bloggers | Generate reading lists with ratings and descriptions |
| Data scientists | Analyze publishing trends by subject and year |
| App developers | Feed book metadata into recommendation engines |
| Educators | Curate subject-specific reading lists for courses |

📋 What the Open Library Scraper does

  • 🔍 Keyword search across books, authors, subjects, lists, and full-text content
  • 📖 Ebook filtering to show only books available as free digital downloads
  • 🖼️ Cover image extraction with URLs for small, medium, and large sizes
  • 📊 Edition and rating data including community ratings and total edition counts
  • 📥 Download link collection for PDF, EPUB, and AZW3 formats when available
  • 🌐 Direct URL support to scrape any Open Library search results page

The scraper sends your query to Open Library's public API, retrieves matching records, and extracts full metadata for each item. For book searches, it collects titles, authors, ISBNs, publishers, page counts, descriptions, cover images, ratings, available formats, and download links. For author searches, it returns author profiles with their works. Every record is timestamped and includes a direct link to the Open Library entry.
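To make the request flow above more concrete, here is a minimal Python sketch of two public Open Library URL patterns that a lookup like this can rely on: the `search.json` endpoint and the Covers API. The helper names `build_search_url` and `cover_urls` are hypothetical, and whether the actor calls these exact endpoints internally is an assumption; the description above only says it uses Open Library's public API.

```python
from urllib.parse import urlencode

# Public Open Library endpoints. Whether the actor uses exactly these
# internally is an assumption made for illustration.
SEARCH_ENDPOINT = "https://openlibrary.org/search.json"
COVER_HOST = "https://covers.openlibrary.org"


def build_search_url(query: str, page: int = 1, limit: int = 10) -> str:
    """Return a search.json URL for one page of results (hypothetical helper)."""
    params = {"q": query, "page": page, "limit": limit}
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


def cover_urls(cover_id: int) -> dict[str, str]:
    """Return small, medium, and large cover URLs for a numeric cover ID."""
    sizes = {"small": "S", "medium": "M", "large": "L"}
    return {
        name: f"{COVER_HOST}/b/id/{cover_id}-{code}.jpg"
        for name, code in sizes.items()
    }


if __name__ == "__main__":
    print(build_search_url("Isaac Asimov", page=2))
    print(cover_urls(12345))
```

The three size codes (S, M, L) correspond to the small, medium, and large cover variants listed in the feature summary; the dictionary keys used here are illustrative and may not match the actor's `coverImages` object.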

💡 Why it matters: Open Library contains metadata for millions of books, but browsing and exporting data manually is tedious. This scraper automates collection and delivers clean, structured data ready for databases, spreadsheets, or applications.


🎬 Full Demo

🚧 Coming soon...


⚙️ Input

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| searchQuery | string | No | Search term for books, authors, or subjects (e.g., "Space") |
| searchType | string | No | What to search: books, authors, searchInside, subjects, or lists |
| ebooksOnly | boolean | No | Show only ebooks (only works with "books" search type) |
| startUrl | string | No | Direct Open Library search URL (overrides search filters) |
| maxItems | integer | No | Max results to collect. Free: up to 10. Paid: up to 1,000,000 |

Example 1: Basic book search

{
    "searchQuery": "Space",
    "searchType": "books",
    "maxItems": 10
}

Example 2: Ebook-only book search by author name

{
    "searchQuery": "Isaac Asimov",
    "searchType": "books",
    "ebooksOnly": true,
    "maxItems": 50
}

⚠️ Good to Know: Use either a Start URL or search filters, not both. If you provide a Start URL, search filters are ignored. The ebooks-only filter only works with the "books" search type.
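The input rules above can be checked locally before starting a run. The following is a minimal sketch, not part of the actor: `build_input` is a hypothetical helper, and the requirement that at least one of `searchQuery` or `startUrl` be present is an assumption (the input table marks every field as optional).

```python
from typing import Any, Optional

# Search types listed in the input table.
SEARCH_TYPES = {"books", "authors", "searchInside", "subjects", "lists"}
MAX_ITEMS_PAID = 1_000_000  # documented upper bound for paid users


def build_input(
    search_query: Optional[str] = None,
    search_type: str = "books",
    ebooks_only: bool = False,
    start_url: Optional[str] = None,
    max_items: int = 10,
) -> dict[str, Any]:
    """Build an actor input dict that follows the documented rules."""
    if not 1 <= max_items <= MAX_ITEMS_PAID:
        raise ValueError("maxItems must be between 1 and 1,000,000")
    if start_url:
        # A Start URL overrides search filters, so reject mixed input early.
        if search_query or ebooks_only:
            raise ValueError("Use either startUrl or search filters, not both")
        return {"startUrl": start_url, "maxItems": max_items}
    if not search_query:
        # Assumption: an input with neither a query nor a URL is not useful.
        raise ValueError("Provide searchQuery or startUrl")
    if search_type not in SEARCH_TYPES:
        raise ValueError(f"Unknown searchType: {search_type}")
    if ebooks_only and search_type != "books":
        raise ValueError('ebooksOnly only works with the "books" search type')
    run_input: dict[str, Any] = {
        "searchQuery": search_query,
        "searchType": search_type,
        "maxItems": max_items,
    }
    if ebooks_only:
        run_input["ebooksOnly"] = True
    return run_input


if __name__ == "__main__":
    print(build_input("Isaac Asimov", ebooks_only=True, max_items=50))
```

Failing fast on a mixed Start URL and search-filter input avoids a run in which the filters are silently ignored.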


📊 Output

🧾 Schema

| Emoji | Field | Type | Description |
| --- | --- | --- | --- |
| 🖼️ | coverImage | string | Book cover image URL |
| 📝 | title | string | Full book title |
| 👤 | author | string | Author name |
| 🔗 | detailUrl | string | Direct link to the Open Library entry |
| 📅 | publicationDate | string | First publication date |
| 🏢 | publisher | string | Publisher name |
| 📊 | isbn | string | ISBN identifier |
| 🌍 | language | string | Available languages |
| 📖 | pageCount | number | Number of pages |
| 📄 | description | string | Full book description |
| 🏷️ | subjects | array | Subject classifications |
| 🏷️ | subjectTags | array | Subject tag identifiers |
| 🖼️ | coverImages | object | Cover image URLs in multiple sizes |
| 💾 | availableFormats | array | Ebook formats (PDF, EPUB, AZW3) |
| 📥 | downloadLinks | array | Direct download links for ebooks |
| | rating | number | Community rating score |
| 📊 | editionCount | number | Number of published editions |
| 🆔 | itemId | string | Open Library work identifier |
| | scrapedAt | string | Timestamp of when the record was collected |
| ⚠️ | error | string | Error message if processing failed |
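As a sketch of how the schema above might be consumed in Python, the block below mirrors the 20 fields in a `TypedDict` and separates successful records from failed ones. The names `BookRecord` and `split_records` are hypothetical; the element types inside arrays and the `coverImages` object, and the convention that `error` is missing or empty on successful records, are assumptions rather than documented behavior.

```python
from typing import Any, TypedDict


class BookRecord(TypedDict, total=False):
    """Field names follow the schema table; nested types are assumed."""
    coverImage: str
    title: str
    author: str
    detailUrl: str
    publicationDate: str
    publisher: str
    isbn: str
    language: str
    pageCount: float
    description: str
    subjects: list[str]
    subjectTags: list[str]
    coverImages: dict[str, str]
    availableFormats: list[str]
    downloadLinks: list[Any]
    rating: float
    editionCount: float
    itemId: str
    scrapedAt: str
    error: str


def split_records(items: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split items into (ok, failed) based on a non-empty `error` field."""
    ok: list[dict] = []
    failed: list[dict] = []
    for item in items:
        # Assumption: successful records have no error, or an empty one.
        (failed if item.get("error") else ok).append(item)
    return ok, failed


if __name__ == "__main__":
    # Illustrative records only, not real actor output.
    sample = [{"title": "Example Book"}, {"error": "Request timed out"}]
    ok, failed = split_records(sample)
    print(len(ok), "ok,", len(failed), "failed")
```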

✨ Why choose this Actor

| Feature | Details |
| --- | --- |
| 🔍 5 search types | Books, Authors, Search Inside, Subjects, and Lists |
| 📖 Ebook filtering | Show only books available for free digital download |
| 🖼️ Cover images | URLs in small, medium, and large sizes |
| 📥 Download links | Direct links to PDF, EPUB, and AZW3 files |
| ⭐ Community ratings | Rating scores and edition counts |
| 📊 Full metadata | ISBNs, publishers, page counts, descriptions, and subjects |
| 📦 Flexible export | JSON, CSV, or Excel output for any use case |

📊 Collect up to 1,000,000 book records per run with cover images, ISBNs, download links, and ratings.


📈 How it compares to alternatives

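| Feature | This Actor | Manual Browsing | Generic Scrapers |
| --- | --- | --- | --- |
| 5 search types | | | |
| Ebook-only filtering | | | |
| Cover image URLs | | | Varies |
| Download link extraction | | Manual | |
| Bulk collection (up to 1M records) | | | |
| Structured JSON/CSV output | | | Varies |
| Scheduled runs | | | |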

Get structured book data at scale without manual browsing or copy-pasting.


🚀 How to use

  1. Create an Apify account - Sign up free with $5 credit
  2. Open the Open Library Scraper - Navigate to the actor page on Apify
  3. Enter your search query - Type a title, author name, or subject
  4. Select search type and filters - Choose Books, Authors, Subjects, etc. and enable ebook filtering if needed
  5. Click Start - The actor collects matching records and delivers structured data

⏱️ A typical run with 10 books completes in under 30 seconds.


💼 Business use cases

📚 Library Management

  • Build digital catalogs with ISBNs and cover images
  • Track edition availability across languages
  • Identify books available as free ebooks
  • Cross-reference holdings with Open Library records

📊 Publishing Research

  • Analyze edition counts and publication histories
  • Track subject trends by publication year
  • Compare publisher catalogs across genres
  • Monitor new releases in specific subject areas

🖥️ App Development

  • Feed book metadata into recommendation engines
  • Populate product databases with cover images and ISBNs
  • Build reading list applications with ratings
  • Create book discovery tools with subject filtering

🎓 Education

  • Curate subject-specific reading lists for courses
  • Find free ebook versions of required textbooks
  • Build bibliographies with full publication details
  • Track available formats for accessibility planning
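As one illustration of the publishing research use case, the sketch below counts subjects per publication year from a list of output records. It assumes `publicationDate` is a free-form string that contains a four-digit year and that `subjects` is a list of strings; neither format is specified in the output schema, and the function names are hypothetical.

```python
import re
from collections import Counter
from typing import Optional

# Assumption: the first four-digit number in the 1000-2099 range is the year.
YEAR_PATTERN = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")


def year_of(record: dict) -> Optional[int]:
    """Extract a publication year from a record, or None if absent."""
    match = YEAR_PATTERN.search(str(record.get("publicationDate") or ""))
    return int(match.group(1)) if match else None


def subject_counts_by_year(records: list[dict]) -> dict[int, Counter]:
    """Map each publication year to a Counter of subject occurrences."""
    counts: dict[int, Counter] = {}
    for record in records:
        year = year_of(record)
        if year is None:
            continue  # skip records without a recognizable year
        counts.setdefault(year, Counter()).update(record.get("subjects") or [])
    return counts


if __name__ == "__main__":
    # Illustrative records only, not real actor output.
    sample = [
        {"publicationDate": "1951", "subjects": ["Science fiction"]},
        {"publicationDate": "June 1951", "subjects": ["Science fiction", "Robots"]},
    ]
    for year, counter in sorted(subject_counts_by_year(sample).items()):
        print(year, counter.most_common(3))
```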

🔌 Automating Open Library Scraper

Integrate the Open Library Scraper into your workflow using the Apify API or client libraries.

Node.js:

import { ApifyClient } from 'apify-client';

// Replace with your Apify API token.
const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Start the actor and wait for the run to finish.
const run = await client.actor('parseforge/open-library-scraper').call({
    searchQuery: 'Space',
    searchType: 'books',
    maxItems: 50,
});

// Read the results from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);

Python:

from apify_client import ApifyClient

# Replace with your Apify API token.
client = ApifyClient("YOUR_API_TOKEN")

# Start the actor and wait for the run to finish.
run = client.actor("parseforge/open-library-scraper").call(run_input={
    "searchQuery": "Space",
    "searchType": "books",
    "maxItems": 50,
})

# Read the results from the run's default dataset.
items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
print(items)

Schedules: Set up recurring runs to track new book additions, monitor ebook availability, or build growing datasets of book metadata. Configure daily, weekly, or monthly schedules from the Apify Console.
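If you have already pulled `items` into Python as in the example above and want a local CSV, a small standard-library sketch like the following may be enough. `items_to_csv` is a hypothetical helper; serializing nested arrays and objects as JSON text inside a cell is one possible choice, not necessarily how the platform's own CSV export lays out the same fields.

```python
import csv
import io
import json


def items_to_csv(items: list[dict], fields: list[str]) -> str:
    """Render selected fields of each item as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields)
    writer.writeheader()
    for item in items:
        row = {}
        for field in fields:
            value = item.get(field)
            if isinstance(value, (list, dict)):
                # Arrays and objects (e.g. subjects, coverImages) become JSON text.
                value = json.dumps(value, ensure_ascii=False)
            row[field] = "" if value is None else value
        writer.writerow(row)
    return buffer.getvalue()


if __name__ == "__main__":
    # Illustrative record only, not real actor output.
    sample = [{"title": "Example Book", "subjects": ["Space", "Astronomy"]}]
    print(items_to_csv(sample, ["title", "isbn", "subjects"]))
```

Passing an explicit `fields` list keeps the column order stable across scheduled runs, even when some records lack optional fields.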


🔌 Integrate with any app

  • 🔗 Make (Integromat) - Connect book data to Google Sheets, Airtable, or any of 1,500+ apps
  • 🔗 Zapier - Trigger workflows when new book records are collected
  • 🔗 Slack - Get notified when a book data run completes
  • 🔗 Airbyte - Stream book metadata into your data warehouse
  • 🔗 GitHub - Store book datasets in repositories for version control
  • 🔗 Google Drive - Automatically save CSV exports to shared folders

| Actor | Description |
| --- | --- |
| PubMed Citation Scraper | Extract publication metadata from PubMed for research analysis |
| Crossref Scraper | Extract DOI metadata for 155M+ research publications |
| NASA Reports Scraper | Collect technical reports from NASA's NTRS database |
| US Census Bureau Scraper | Extract demographic and economic data from the Census Bureau |
| ROR Scraper | Collect research organization data from the Research Organization Registry |

💡 Pro Tip: Combine the Open Library Scraper with the Crossref Scraper to match book ISBNs with DOI metadata and citation counts.
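Matching on ISBNs across datasets usually works better when every identifier is in one canonical form. The sketch below normalizes ISBN-10 and ISBN-13 strings to a bare 13-digit ISBN using the standard ISBN-13 check digit rule. `normalize_isbn` is a hypothetical helper; it assumes the `isbn` field holds a single identifier, and it does not verify the check digit of its input.

```python
from typing import Optional


def isbn13_check_digit(first12: str) -> str:
    """Compute the ISBN-13 check digit for the first 12 digits."""
    # Weights alternate 1, 3, 1, 3, ... from the leftmost digit.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def normalize_isbn(raw: str) -> Optional[str]:
    """Return a bare ISBN-13, converting from ISBN-10 where needed."""
    cleaned = raw.replace("-", "").replace(" ", "").upper()
    if len(cleaned) == 13 and cleaned.isdecimal():
        return cleaned
    if len(cleaned) == 10 and cleaned[:9].isdecimal():
        # ISBN-10 to ISBN-13: prefix 978, drop the old check digit, recompute.
        core = "978" + cleaned[:9]
        return core + isbn13_check_digit(core)
    return None  # not recognizable as an ISBN


if __name__ == "__main__":
    print(normalize_isbn("0-306-40615-2"))  # → 9780306406157
```

Normalized values can then serve as join keys between datasets, provided the other dataset exposes ISBNs at all, which is not confirmed here.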


🆘 Need Help? Open our contact form and we will get back to you within 24 hours. We are happy to help with custom setups, integrations, or feature requests.


Disclaimer: This actor is not affiliated with, endorsed by, or connected to Open Library or the Internet Archive. It accesses publicly available data through Open Library's public API. Use responsibly and in accordance with applicable terms of service.