Goodreads Book Scraper

Pricing: $29.00/month + usage
Goodreads Book Scraper is an Apify Actor that extracts book details from Goodreads search results. It retrieves the title, author, rating, ratings count, published year, editions count, book URL, and cover image URL, outputting the data in structured JSON format.

Rating: 1.1 (2)
Developer: scraping automation (Maintained by Community)

Actor stats: 3 bookmarked, 44 total users, 2 monthly active users, last modified 19 days ago

Scrape book data from Goodreads search results for a given query (keyword, title, author, ISBN).

Quick start

  1. Set searchTerm (required).
  2. Optionally set bookMax and enableDetailedScraping.
  3. Run the Actor and read results from the default dataset.
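For programmatic runs, the steps above can be sketched with the Apify API client for Python. This is a sketch, not the Actor's documented API surface: the Actor ID string below is a placeholder assumption (copy the real one from the Actor's API tab), and the client call is shown commented out because it needs a real API token.

```python
# Sketch of a programmatic run, assuming the apify-client Python package.
# The Actor ID below is a placeholder; take the real one from the API tab.

def build_run_input(search_term, book_max=None, enable_detailed_scraping=False):
    """Assemble the Actor input. Only searchTerm is required (see Input)."""
    if not search_term:
        raise ValueError("searchTerm is required")
    run_input = {
        "searchTerm": search_term,
        "enableDetailedScraping": enable_detailed_scraping,
    }
    if book_max is not None:
        run_input["bookMax"] = book_max
    return run_input

# With a real API token:
# from apify_client import ApifyClient
# client = ApifyClient("<YOUR_APIFY_TOKEN>")
# run = client.actor("scraping_automation/goodreads-book-scraper").call(
#     run_input=build_run_input("architecture", book_max=20),
# )
# items = client.dataset(run["defaultDatasetId"]).list_items().items
```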

Output

Where results are stored

  • Dataset: results are written to the default dataset.
  • Run output: the Actor writes an OUTPUT record to the default key-value store so the Output tab is enabled and includes dataset/log links.

Dataset item types

Each dataset item is one of:

  • Book row (found: true): scraped book data.
  • Diagnostic row (found: false): explains why no book rows were produced.
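Because book rows and diagnostic rows share one dataset, a consumer should branch on the found flag. A minimal sketch in Python (item shapes follow the fields described in this README):

```python
def split_items(items):
    """Split dataset items into book rows (found: true)
    and diagnostic rows (found: false)."""
    books = [item for item in items if item.get("found")]
    diagnostics = [item for item in items if not item.get("found")]
    return books, diagnostics
```

An empty books list together with a diagnostic row tells you why the run produced nothing.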

Common diagnostic reasons

  • NO_RESULTS: No matching books were found for the query.
  • BLOCKED_OR_CAPTCHA: Goodreads likely blocked automation (captcha / unusual traffic / sign-in wall).
  • BOOKS_SELECTOR_NOT_FOUND: Goodreads UI/layout changed, or the page did not render book rows.
  • PAGE_BODY_NOT_FOUND: Navigation/response issue (page body not available).
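These reasons differ in how a caller should react: NO_RESULTS is a final answer, while blocking and page-body failures are often transient. The classification below is an assumption of this sketch, not behavior prescribed by the Actor:

```python
# Assumed split of diagnostic reasons into transient vs. final;
# the Actor itself does not prescribe retry behavior.
RETRYABLE_REASONS = {"BLOCKED_OR_CAPTCHA", "PAGE_BODY_NOT_FOUND"}

def should_retry(diagnostic):
    """True for failures worth retrying later; False for NO_RESULTS
    and for selector/layout changes, which a rerun will not fix."""
    return diagnostic.get("reason") in RETRYABLE_REASONS
```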

Book fields

Core fields (always present on book rows):

  • title, author, rating, ratingsCount, published, editions, url, coverUrl

Detailed fields (when enableDetailedScraping = true, best-effort):

  • pages, format, firstPublished, currentlyReading, wantToRead, description, genres, authorBooks, authorFollowers
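For typed consumers, the two field groups can be modeled as one structure with optional members. A TypedDict sketch follows; field names come from this README, but the types are assumptions, since the schema here does not state them:

```python
from typing import List, TypedDict

class BookRow(TypedDict, total=False):
    # Core fields (always present on book rows)
    found: bool
    title: str
    author: str
    rating: float
    ratingsCount: int
    published: int
    editions: int
    url: str
    coverUrl: str
    # Detailed fields (best-effort, enableDetailedScraping = true)
    pages: int
    format: str
    firstPublished: int
    currentlyReading: int
    wantToRead: int
    description: str
    genres: List[str]
    authorBooks: int
    authorFollowers: int
```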

Input

All fields and defaults are defined in .actor/input_schema.json. Most-used options:

Required

  • searchTerm (string): Goodreads query (keyword/title/author/ISBN).

Optional

  • bookMax (integer): max number of books to push to the dataset.
  • enableDetailedScraping (boolean): visit each book page for extra fields (slower).
  • stopOnNoResults (boolean, default true): stop successfully after page 1 if no results are found.

Timeouts (advanced)

  • navigationTimeout (ms): timeout for search page navigation.
  • requestHandlerTimeout (ms): max time to process one search results page.
  • detailNavigationTimeoutMs (ms): timeout for individual book detail pages.
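An input that sets all three timeouts alongside the basics might look like this; the millisecond values are illustrative assumptions, not recommended defaults:

```json
{
  "searchTerm": "architecture",
  "bookMax": 10,
  "enableDetailedScraping": true,
  "navigationTimeout": 45000,
  "requestHandlerTimeout": 120000,
  "detailNavigationTimeoutMs": 30000
}
```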

Usage examples

Keyword search (fast)

{
  "searchTerm": "architecture",
  "bookMax": 20,
  "enableDetailedScraping": false
}

ISBN lookup (graceful when not found)

{
  "searchTerm": "978-2-07-061275-845645646",
  "bookMax": 5,
  "enableDetailedScraping": false,
  "stopOnNoResults": true
}

Detailed scraping (richer fields)

{
  "searchTerm": "Brunelleschi's Dome",
  "bookMax": 5,
  "enableDetailedScraping": true,
  "detailNavigationTimeoutMs": 30000
}

Notes / limitations

  • Goodreads may rate-limit or block automated access. When that happens, the dataset will include a diagnostic row with found: false and reason: "BLOCKED_OR_CAPTCHA".
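When a run yields only the BLOCKED_OR_CAPTCHA diagnostic, the usual recourse is to wait and rerun. A minimal backoff sketch under that assumption; run_actor is a stand-in for however you launch the Actor and collect its dataset items:

```python
import time

def run_with_backoff(run_actor, max_attempts=3, base_delay=60.0):
    """Rerun the Actor when blocking is detected, doubling the wait each time.
    `run_actor` must return the list of dataset items for one run."""
    items = []
    for attempt in range(max_attempts):
        items = run_actor()
        blocked = any(
            not item.get("found") and item.get("reason") == "BLOCKED_OR_CAPTCHA"
            for item in items
        )
        if not blocked:
            return items
        if attempt < max_attempts - 1:
            time.sleep(base_delay * (2 ** attempt))
    return items  # still blocked after max_attempts
```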