Goodreads Scraper

Pricing

from $5.00 / 1,000 results

Scrape Goodreads book data by shelf or genre name. Returns ratings, review counts, genres, page counts, and publication info.

Rating: 0.0 (0)

Developer: lulz bot (Maintained by Community)

Actor stats: 0 bookmarked, 2 total users, 0 monthly active users, last modified 13 hours ago

An Apify Actor that scrapes book data from Goodreads shelf/genre pages. Built with Crawlee CheerioCrawler for fast, efficient HTML parsing.

What it does

This scraper extracts book data from Goodreads shelf pages (e.g., goodreads.com/shelf/show/science-fiction). Each shelf page lists 50 popular books in that genre, ranked by how many users shelved them.
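As a rough illustration of how a shelf name and a book limit translate into page requests, the sketch below builds the shelf URLs from the `goodreads.com/shelf/show/{name}` pattern and the `?page=N` pagination described on this page. The function name `shelf_page_urls` is hypothetical, and requesting the first page as `?page=1` is an assumption; the actor's real code may differ.

```python
import math
from urllib.parse import quote

BOOKS_PER_PAGE = 50  # stated on this page: each shelf page lists 50 books
BASE_URL = "https://www.goodreads.com/shelf/show/"


def shelf_page_urls(shelf: str, max_books: int) -> list[str]:
    """Return the shelf page URLs needed to cover max_books listings.

    Hypothetical helper: assumes page 1 can be requested as ?page=1.
    """
    if max_books <= 0:
        raise ValueError("this sketch needs a positive max_books")
    pages = math.ceil(max_books / BOOKS_PER_PAGE)
    slug = quote(shelf.strip().lower(), safe="-")
    return [f"{BASE_URL}{slug}?page={n}" for n in range(1, pages + 1)]


if __name__ == "__main__":
    for url in shelf_page_urls("science-fiction", 120):
        print(url)  # three URLs, ?page=1 through ?page=3
```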

Two modes of operation

  1. Shelf mode (default): Scrapes book listings from shelf pages. Fast -- extracts title, author, rating, rating count, published year, shelf count, and cover image directly from the listing.

  2. Detail mode (scrapeDetails: true): Also visits each book's individual page to extract rich structured data from Goodreads' JSON-LD schema, including full description, genres, page count, ISBN, book format, language, awards, and review count.
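Detail mode relies on the JSON-LD block embedded in each book page. The sketch below shows one way such a block could be pulled out of raw HTML with the Python standard library. It is not the actor's implementation (which uses CheerioCrawler), and the mapping from schema.org property names (`name`, `inLanguage`, `aggregateRating.reviewCount`, and so on) to the output fields is an assumption based on the schema.org `Book` type.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._chunks = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._chunks = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._chunks)))
            except json.JSONDecodeError:
                pass  # skip malformed blocks instead of failing the whole page


def extract_book(html):
    """Return a dict of detail fields from the first Book block, or None."""
    parser = JsonLdCollector()
    parser.feed(html)
    parser.close()
    for block in parser.blocks:
        if isinstance(block, dict) and block.get("@type") == "Book":
            rating = block.get("aggregateRating") or {}
            # Assumed mapping from schema.org properties to output fields.
            return {
                "title": block.get("name"),
                "isbn": block.get("isbn"),
                "bookFormat": block.get("bookFormat"),
                "numberOfPages": block.get("numberOfPages"),
                "language": block.get("inLanguage"),
                "awards": block.get("awards"),
                "reviewCount": rating.get("reviewCount"),
            }
    return None
```

Fields missing from the JSON-LD block come back as `None`, so a caller can still merge the result with the shelf-mode record.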

Input

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| searchQueries | string[] | ["science-fiction"] | Shelf/genre names to scrape. Maps to goodreads.com/shelf/show/{name}. |
| maxBooks | integer | 100 | Maximum books per shelf. Set to 0 for unlimited. |
| scrapeDetails | boolean | false | Visit each book's detail page for full data (slower). |
| proxyConfiguration | object | { useApifyProxy: false } | Proxy settings for large-scale runs. |

Example shelf/genre names: science-fiction, fantasy, mystery, romance, non-fiction, thriller, horror, historical-fiction, young-adult, classics, biography, self-help, poetry, graphic-novels, philosophy, true-crime, dystopian, adventure, humor, manga
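To make the defaults in the input table concrete, here is a minimal sketch of how an input object could be validated and filled in before a run. The helper `normalize_input` is hypothetical and is not part of the actor; representing "unlimited" as `None` is a choice made for this example only.

```python
# Defaults copied from the input table above.
DEFAULTS = {
    "searchQueries": ["science-fiction"],
    "maxBooks": 100,
    "scrapeDetails": False,
    "proxyConfiguration": {"useApifyProxy": False},
}


def normalize_input(raw):
    """Merge user input with the documented defaults and validate it."""
    merged = {**DEFAULTS, **(raw or {})}

    queries = merged["searchQueries"]
    if (
        not isinstance(queries, list)
        or not queries
        or not all(isinstance(q, str) and q.strip() for q in queries)
    ):
        raise ValueError("searchQueries must be a non-empty list of shelf names")

    max_books = merged["maxBooks"]
    if isinstance(max_books, bool) or not isinstance(max_books, int) or max_books < 0:
        raise ValueError("maxBooks must be a non-negative integer")

    merged["searchQueries"] = [q.strip() for q in queries]
    # The table says 0 means unlimited; this sketch represents that as None.
    merged["maxBooks"] = None if max_books == 0 else max_books
    merged["scrapeDetails"] = bool(merged["scrapeDetails"])
    merged["proxyConfiguration"] = dict(merged["proxyConfiguration"])
    return merged
```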

Output

Shelf mode fields

| Field | Type | Description |
| --- | --- | --- |
| title | string | Book title (may include series info) |
| author | string | Author name |
| authorUrl | string | Link to author's Goodreads page |
| rating | number | Average rating (1-5 scale) |
| ratingCount | number | Total number of ratings |
| publishedYear | number | Year of first publication |
| shelfCount | number | Times shelved under this genre |
| coverImage | string | URL of book cover image |
| bookId | string | Goodreads book ID |
| url | string | Full Goodreads URL for the book |
| searchQuery | string | The shelf/genre name used |
| scrapedAt | string | ISO 8601 timestamp |

Additional detail mode fields

| Field | Type | Description |
| --- | --- | --- |
| description | string | Full book description |
| genres | string[] | List of genres/tags |
| isbn | string | ISBN number |
| bookFormat | string | Format (Hardcover, Paperback, etc.) |
| numberOfPages | number | Page count |
| language | string | Book language |
| awards | string | Awards received |
| reviewCount | number | Total number of text reviews |
| authors | object[] | Array of author objects with name and URL |
| publicationInfo | string | Full publication details |

Example usage

Basic: Top science fiction books

{
  "searchQueries": ["science-fiction"],
  "maxBooks": 50
}

Multiple genres with details

{
  "searchQueries": ["fantasy", "mystery", "romance"],
  "maxBooks": 100,
  "scrapeDetails": true
}

Large-scale with proxy

{
  "searchQueries": ["science-fiction", "fantasy", "thriller", "horror"],
  "maxBooks": 500,
  "scrapeDetails": true,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
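For a sense of scale, the large-scale input can be turned into a back-of-envelope request count and run time. The sketch below is an estimate under stated assumptions, not a measurement: it assumes every shelf actually yields `maxBooks` books, that each book needs one detail request, that there are no retries or duplicate books across shelves, and that the 30-40 requests/minute rate from the technical notes applies to all requests.

```python
import math

BOOKS_PER_PAGE = 50  # from the technical notes: 50 books per shelf page


def estimate_requests(shelves: int, max_books: int, scrape_details: bool) -> int:
    """Estimate total HTTP requests for a run (hypothetical helper)."""
    shelf_pages = shelves * math.ceil(max_books / BOOKS_PER_PAGE)
    detail_pages = shelves * max_books if scrape_details else 0
    return shelf_pages + detail_pages


def estimate_minutes(requests: int, per_minute: float) -> float:
    """Run time in minutes at a steady request rate."""
    return requests / per_minute


if __name__ == "__main__":
    total = estimate_requests(shelves=4, max_books=500, scrape_details=True)
    print(total)                        # 2040 (40 shelf pages + 2000 detail pages)
    print(estimate_minutes(total, 40))  # 51.0
    print(estimate_minutes(total, 30))  # 68.0
```

Under these assumptions the run would take roughly 51 to 68 minutes. Without `scrapeDetails`, the same input needs only the 40 shelf page requests.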

Example output

{
  "title": "Dune (Dune, #1)",
  "author": "Frank Herbert",
  "authorUrl": "https://www.goodreads.com/author/show/58.Frank_Herbert",
  "rating": 4.29,
  "ratingCount": 1645579,
  "publishedYear": 1965,
  "shelfCount": 21643,
  "coverImage": "https://i.gr-assets.com/images/S/compressed.photo.goodreads.com/books/1555447414l/44767458._SX318_.jpg",
  "bookId": "44767458",
  "url": "https://www.goodreads.com/book/show/44767458-dune",
  "searchQuery": "science-fiction",
  "scrapedAt": "2026-03-17T12:00:00.000Z"
}
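Because `title` may carry series info, as in "Dune (Dune, #1)", downstream code may want to separate the two. The sketch below is one possible post-processing step, not something the actor does. The pattern is inferred from the single example above, so other title formats may not match, and the key `seriesNumber` is hypothetical.

```python
import re

# Assumed format: "<title> (<series>, #<number>)", inferred from the example output.
SERIES_PATTERN = re.compile(
    r"^(?P<title>.+?)\s+\((?P<series>.+),\s+#(?P<number>[^)]+)\)$"
)


def split_series(raw_title: str) -> dict:
    """Split a Goodreads-style title into title, series, and series number."""
    cleaned = raw_title.strip()
    match = SERIES_PATTERN.match(cleaned)
    if match is None:
        # No recognizable series suffix: keep the title unchanged.
        return {"title": cleaned, "series": None, "seriesNumber": None}
    return {
        "title": match["title"],
        "series": match["series"],
        # Kept as a string, since the number may not always be a plain integer.
        "seriesNumber": match["number"],
    }


if __name__ == "__main__":
    print(split_series("Dune (Dune, #1)"))
    # {'title': 'Dune', 'series': 'Dune', 'seriesNumber': '1'}
```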

Technical notes

  • Uses CheerioCrawler (HTTP-only, no browser) for maximum speed
  • Shelf pages (/shelf/show/) and book pages (/book/show/) are allowed by Goodreads robots.txt
  • Rate-limited to avoid overloading servers (30-40 requests/minute)
  • Shelf pages return 50 books per page, paginated with ?page=N
  • Detail pages use Goodreads' JSON-LD @type: Book structured data for reliable extraction
  • No Cloudflare or CAPTCHA protection on these endpoints
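The rate limit in the notes above can be pictured as a minimum gap between requests: at 30 requests/minute, that is one request every 2 seconds. The sketch below shows the idea with a hypothetical `MinIntervalLimiter` class. How the actor actually enforces its limit is not stated here (Crawlee offers a `maxRequestsPerMinute` crawler option, which may be what it uses).

```python
import time


class MinIntervalLimiter:
    """Spaces calls so that at most requests_per_minute happen per minute."""

    def __init__(self, requests_per_minute, clock=time.monotonic, sleep=time.sleep):
        if requests_per_minute <= 0:
            raise ValueError("requests_per_minute must be positive")
        self.interval = 60.0 / requests_per_minute
        self._clock = clock    # injectable for testing
        self._sleep = sleep    # injectable for testing
        self._next_allowed = None

    def wait(self) -> float:
        """Block until the next request is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.interval
        return delay


if __name__ == "__main__":
    limiter = MinIntervalLimiter(requests_per_minute=30)
    for page in range(1, 4):
        limiter.wait()  # the first call returns immediately, later calls pause
        print(f"fetching page {page}")
```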

Running locally

apify run --purge

Make sure to set your input in storage/key_value_stores/default/INPUT.json.
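If you prefer to generate the local input file from a script, something along these lines should work. The helper `write_input` is hypothetical; the path is the one given above, taken relative to the project root.

```python
import json
from pathlib import Path

# Path from the instructions above, relative to the actor project root.
INPUT_PATH = Path("storage/key_value_stores/default/INPUT.json")


def write_input(actor_input: dict, project_root: str = ".") -> Path:
    """Write the actor input as JSON, creating parent folders if needed."""
    path = Path(project_root) / INPUT_PATH
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(actor_input, indent=2) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    written = write_input({"searchQueries": ["science-fiction"], "maxBooks": 50})
    print(f"Input written to {written}")
```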

Deploy to Apify

apify login
apify push

More marketplace scrapers and data tools by lulzasaur: