Goodreads Scraper

Pricing

from $0.90 / 1,000 Goodreads books

Scrape public Goodreads books from search terms, book URLs, shelves, genres, lists, and author pages. Export ratings, authors, ISBNs, descriptions, covers, source ranks, and optional review rows.

Rating

0.0

(0)

Developer

Maxime Dupré

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 18 hours ago

📚 Goodreads scraper for books, shelves, and reviews

Goodreads Scraper collects public book data from Goodreads search terms, ISBNs, book URLs, shelves, genres, lists, search pages, author pages, and series pages. Add a Goodreads URL, ISBN, or a query such as science fiction, then export book titles, authors, ratings, review counts, descriptions, ISBNs, genres, covers, and scrape metadata from a clean Apify dataset.

Use this Goodreads scraper when you need repeatable book data for market research, reading-list curation, publishing research, recommendation datasets, SEO briefs, content planning, or internal book intelligence workflows. You can run it in Apify Console, schedule repeat runs, call it from the Apify API, and export results as JSON, CSV, Excel, XML, RSS, or HTML.

The Actor is built for public Goodreads pages. It does not require a Goodreads account, cookies, Goodreads API key, or user credentials. It only saves usable scraped results; invalid targets, unavailable pages, and source-side problems are reported in logs instead of placeholder dataset rows.

✅ What this Actor does

  • Scrapes Goodreads book results from search terms.
  • Looks up books by ISBN-10 or ISBN-13.
  • Scrapes public Goodreads book, shelf, genre, list, search, author, and series URLs.
  • Saves one dataset item per discovered book.
  • Optionally saves public review rows found on book pages.
  • Optionally saves author, list, and series metadata rows.
  • Filters books by rating, ratings count, publish year, language, genre, and author.
  • Opens book pages for detail enrichment when includeBookDetails is enabled.
  • Keeps source URL, source type, source rank, and scrape time on every book row.
  • Deduplicates books across all submitted searches and URLs by default.
  • Stops at your maxItems, maxPages, and maxReviewsPerBook limits.

For keyword search, the Actor uses Goodreads public autocomplete results. For shelf, genre, list, search URL, and author targets, it discovers book links from the submitted Goodreads pages and then enriches each book from its public book page.

📦 Data you can extract

Each book output item can include:

  • goodreadsId, url, and canonicalUrl
  • title
  • authors with author name, URL, and Goodreads ID when available
  • rating, ratingCount, and reviewCount
  • description
  • imageUrl
  • isbn and isbn13
  • publisher, publishedDate, and firstPublishedDate
  • pageCount, bookFormat, and language
  • series, genres, characters, and awards
  • buyLinks when available
  • sourceTargetUrl, sourceTargetType, sourcePage, and sourceRank
  • status, missingFields, and scrapedAt

When includeReviews is enabled and maxReviewsPerBook is greater than 0, the Actor can also save review rows with the source book URL, visible reviewer profile data, visible review text, HTML, shelves when available, and scrape time.

When includeMetadata is enabled, author, list, and series URL targets can also emit metadata rows with metadataType, source URL, title, description, book count when visible, and scrape time.

Some Goodreads pages do not expose every field for every book. Missing scalar values are returned as null, and missing lists are returned as empty arrays. The Actor does not invent metadata that is not visible in the public source page.

🚀 How to run

  1. Add one or more Goodreads search terms, ISBNs, Goodreads URLs, or a mix of all three.
  2. Keep Book limit at 25 or 50 for a quick first run.
  3. Keep Page limit per target low while testing shelf, genre, list, or author pages.
  4. Leave Include book details on if you want descriptions, ISBNs, page counts, genres, ratings, and review counts.
  5. Turn on Include reviews only when you need review rows, then set a small Review limit per book.
  6. Run the Actor and open the dataset in Apify Console, export it, or pull it through the Apify API.

For the fastest first run, use the prefilled science fiction search term or the prefilled https://www.goodreads.com/shelf/show/fantasy shelf URL with a small book limit.

⚙️ Input

{
  "searchTerms": ["science fiction"],
  "isbns": ["9780062316097"],
  "targets": [
    {
      "url": "https://www.goodreads.com/series/45175-harry-potter"
    }
  ],
  "maxItems": 25,
  "maxPages": 2,
  "includeBookDetails": true,
  "includeReviews": false,
  "maxReviewsPerBook": 0,
  "includeMetadata": true,
  "minRating": 3.5,
  "minRatingsCount": 10,
  "deduplicateBooks": true
}
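The input above can also be assembled in code before calling the Actor through the Apify API. The following is a minimal Python sketch, not the author's implementation. The helper names build_run_input and run_actor are hypothetical, and the actor_id argument is a placeholder: copy the real ID from the Actor's API tab in Apify Console. The run_actor function assumes the apify-client package is installed.

```python
from typing import Any, Optional


def build_run_input(
    search_terms: Optional[list[str]] = None,
    isbns: Optional[list[str]] = None,
    urls: Optional[list[str]] = None,
    max_items: int = 25,
    max_pages: int = 2,
    include_reviews: bool = False,
    max_reviews_per_book: int = 0,
) -> dict[str, Any]:
    """Assemble an input object using the field names from the example above."""
    if not (search_terms or isbns or urls):
        raise ValueError("Provide at least one search term, ISBN, or Goodreads URL.")
    return {
        "searchTerms": list(search_terms or []),
        "isbns": list(isbns or []),
        # Each URL target is wrapped in an object with a "url" key.
        "targets": [{"url": url} for url in (urls or [])],
        "maxItems": max_items,
        "maxPages": max_pages,
        "includeBookDetails": True,
        "includeReviews": include_reviews,
        # Keep the review limit at 0 unless review rows were requested.
        "maxReviewsPerBook": max_reviews_per_book if include_reviews else 0,
        "deduplicateBooks": True,
    }


def run_actor(token: str, actor_id: str, run_input: dict[str, Any]) -> list[dict[str, Any]]:
    """Start a run, wait for it to finish, and return its dataset items."""
    # Imported lazily so build_run_input works without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("The Actor run did not return a result.")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

A small first run might pass build_run_input(search_terms=["science fiction"]) to run_actor and inspect the returned rows before raising max_items.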

🔎 Search terms

Use plain Goodreads book searches such as:

  • science fiction
  • romantasy
  • Stephen King
  • business books
  • historical fiction

Each search term is searched separately. Empty and duplicate terms are ignored.

🔢 ISBNs

Use ISBN-10 or ISBN-13 values such as:

  • 9780062316097
  • 978-0-7432-7356-5
  • 0747532699

Hyphens and spaces are accepted. Each ISBN is searched on Goodreads and saved as a book row when Goodreads returns a public match.
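If you want to catch typos before spending a run on them, you can clean and check ISBNs on your side first. The sketch below strips hyphens and spaces and verifies the standard ISBN-10 and ISBN-13 check digits. The function name normalize_isbn is hypothetical, and the checksum step is an added client-side precaution: the source does not say whether the Actor validates check digits itself.

```python
import re


def normalize_isbn(raw: str) -> str:
    """Strip hyphens and spaces, then verify the ISBN-10 or ISBN-13 check digit."""
    compact = re.sub(r"[\s-]", "", raw).upper()
    if re.fullmatch(r"[0-9]{9}[0-9X]", compact):
        # ISBN-10: weights run 10 down to 1, "X" counts as 10,
        # and the weighted sum must be divisible by 11.
        total = sum(
            (10 if ch == "X" else int(ch)) * weight
            for ch, weight in zip(compact, range(10, 0, -1))
        )
        if total % 11 == 0:
            return compact
    elif re.fullmatch(r"[0-9]{13}", compact):
        # ISBN-13: weights alternate 1, 3 and the sum must be divisible by 10.
        total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(compact))
        if total % 10 == 0:
            return compact
    raise ValueError(f"Not a valid ISBN-10 or ISBN-13: {raw!r}")


# The three example values listed above all pass the check.
cleaned = [normalize_isbn(v) for v in ["9780062316097", "978-0-7432-7356-5", "0747532699"]]
```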

🔗 Goodreads URLs

You can submit public Goodreads URLs such as:

  • Book pages: https://www.goodreads.com/book/show/5907.The_Hobbit
  • Shelf pages: https://www.goodreads.com/shelf/show/fantasy
  • Genre pages: https://www.goodreads.com/genres/most_read/fantasy
  • List pages: https://www.goodreads.com/list/show/1.Best_Books_Ever
  • Series pages: https://www.goodreads.com/series/45175-harry-potter
  • Search pages, author pages, and author book-list pages

The Actor skips unsupported or unavailable targets instead of saving fake rows.

🎚️ Limits and options

  • maxItems caps saved book rows across the whole run.
  • maxPages caps listing pages scanned for each Goodreads URL target.
  • includeBookDetails controls whether the Actor opens each book page for enriched fields.
  • includeReviews and maxReviewsPerBook control optional review row output.
  • includeMetadata controls optional author/list/series summary rows.
  • deduplicateBooks keeps the same Goodreads book from being saved more than once in a run.

Filters apply when the relevant field is visible on Goodreads: minRating, minRatingsCount, publishYearMin, publishYearMax, language, containsGenre, and containsAuthor.
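One way to read this rule is that a filter is skipped for a row whose field is missing, so the row is kept. The sketch below re-checks exported rows on that assumption. It is a client-side illustration, not the Actor's filtering code. The function name passes_filters is hypothetical, and the case-insensitive substring match for genres is also an assumption.

```python
from typing import Any, Optional


def passes_filters(
    row: dict[str, Any],
    min_rating: Optional[float] = None,
    min_ratings_count: Optional[int] = None,
    contains_genre: Optional[str] = None,
) -> bool:
    """Return False only when a field is present and fails its filter."""
    rating = row.get("rating")
    if min_rating is not None and rating is not None and rating < min_rating:
        return False

    count = row.get("ratingCount")
    if min_ratings_count is not None and count is not None and count < min_ratings_count:
        return False

    # An empty or missing genre list is treated as "not visible", so it passes.
    genres = row.get("genres") or []
    if contains_genre is not None and genres:
        wanted = contains_genre.casefold()
        if not any(wanted in genre.casefold() for genre in genres):
            return False

    return True
```

For example, a row with "rating": null is kept under min_rating=3.5 with this reading, while a row with "rating": 3.0 is dropped.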

🧪 Output example

{
  "rowType": "book",
  "goodreadsId": "5907",
  "url": "https://www.goodreads.com/book/show/5907.The_Hobbit",
  "canonicalUrl": "https://www.goodreads.com/book/show/5907.The_Hobbit",
  "title": "The Hobbit, or There and Back Again",
  "authors": [
    {
      "name": "J.R.R. Tolkien",
      "url": "https://www.goodreads.com/author/show/656983.J_R_R_Tolkien",
      "goodreadsId": "656983"
    }
  ],
  "rating": 4.29,
  "ratingCount": 4510933,
  "reviewCount": 74706,
  "description": "Bilbo Baggins is a hobbit who enjoys a comfortable, unambitious life...",
  "imageUrl": "https://m.media-amazon.com/images/S/compressed.photo.goodreads.com/books/1546071216i/5907.jpg",
  "isbn": "9780547928227",
  "isbn13": null,
  "publisher": null,
  "publishedDate": null,
  "firstPublishedDate": "First published September 21, 1937",
  "pageCount": 366,
  "bookFormat": "Paperback",
  "language": "English",
  "series": null,
  "genres": ["Fantasy", "Classics", "Fiction"],
  "characters": [],
  "awards": [],
  "buyLinks": [],
  "sourceTargetUrl": "https://www.goodreads.com/shelf/show/fantasy",
  "sourceTargetType": "shelf",
  "sourcePage": 1,
  "sourceRank": 1,
  "status": "ok",
  "missingFields": [],
  "scrapedAt": "2026-05-27T19:17:00.000Z"
}
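Because book, review, and metadata rows can share one dataset, it can help to group rows by rowType and flatten nested fields before loading them into a spreadsheet. The following is a rough sketch under stated assumptions. The helper names group_by_row_type and flatten_book are hypothetical, only the "book" rowType value is confirmed by the example above, and falling back from isbn13 to isbn is an illustrative choice.

```python
from collections import defaultdict
from typing import Any


def group_by_row_type(items: list[dict[str, Any]]) -> dict[str, list[dict[str, Any]]]:
    """Group dataset items by their rowType value ("unknown" when absent)."""
    groups: dict[str, list[dict[str, Any]]] = defaultdict(list)
    for item in items:
        groups[item.get("rowType", "unknown")].append(item)
    return dict(groups)


def flatten_book(row: dict[str, Any]) -> dict[str, Any]:
    """Reduce a book row to flat, spreadsheet-friendly columns."""
    authors = row.get("authors") or []  # missing lists arrive as empty arrays
    return {
        "goodreadsId": row.get("goodreadsId"),
        "title": row.get("title"),
        "authors": "; ".join(a["name"] for a in authors if a.get("name")),
        "rating": row.get("rating"),  # stays None when Goodreads hides it
        "isbn": row.get("isbn13") or row.get("isbn"),
        "genres": "; ".join(row.get("genres") or []),
    }


# Trimmed copy of the example row above.
sample = {
    "rowType": "book",
    "goodreadsId": "5907",
    "title": "The Hobbit, or There and Back Again",
    "authors": [{"name": "J.R.R. Tolkien", "goodreadsId": "656983"}],
    "rating": 4.29,
    "isbn": "9780547928227",
    "isbn13": None,
    "genres": ["Fantasy", "Classics", "Fiction"],
}
books = [flatten_book(r) for r in group_by_row_type([sample]).get("book", [])]
```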

💳 Pricing

This Actor uses pay-per-event pricing. You are charged for each saved Goodreads book. If you enable review scraping, each saved Goodreads review is charged separately. If you enable author/list/series metadata, each saved metadata row is charged separately.

Book and metadata pricing by Apify tier:

Tier        Price per 1,000
FREE        $1.80
BRONZE      $1.50
SILVER      $1.15
GOLD        $0.90
PLATINUM    $0.90
DIAMOND     $0.90

Review rows are $0.90 per 1,000.

Review scraping can create many more rows than book scraping, so keep maxReviewsPerBook small for your first review run.
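To see how review rows affect a bill, the per-event prices above can be turned into a quick estimate. This is a back-of-the-envelope sketch, not an official calculator. The function name estimate_cost is hypothetical, and it covers only the book, metadata, and review event prices listed in this section.

```python
from decimal import Decimal

# Book and metadata price per 1,000 rows, by Apify tier (from the table above).
BOOK_PRICE_PER_1000 = {
    "FREE": Decimal("1.80"),
    "BRONZE": Decimal("1.50"),
    "SILVER": Decimal("1.15"),
    "GOLD": Decimal("0.90"),
    "PLATINUM": Decimal("0.90"),
    "DIAMOND": Decimal("0.90"),
}
# Review rows are priced the same on every tier.
REVIEW_PRICE_PER_1000 = Decimal("0.90")


def estimate_cost(tier: str, books: int, reviews: int = 0, metadata_rows: int = 0) -> Decimal:
    """Estimate event charges in USD for one run."""
    tier_price = BOOK_PRICE_PER_1000[tier.upper()]
    total_per_1000 = (books + metadata_rows) * tier_price + reviews * REVIEW_PRICE_PER_1000
    return total_per_1000 / 1000


quick_run = estimate_cost("FREE", books=25)                   # 25 * 1.80 / 1000
review_run = estimate_cost("GOLD", books=1000, reviews=5000)  # 0.90 + 4.50
```

In the second case, five reviews per book account for $4.50 of the $5.40 total, which is why a small maxReviewsPerBook matters on a first review run.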

⚠️ Limits and caveats

  • The Actor works with public Goodreads data only.
  • It does not scrape private Goodreads data, logged-in-only pages, or account-specific shelves.
  • Review extraction works best on the public review cards visible on book pages; it does not guarantee every review for books with large review histories.
  • Source pages can change, omit fields, or temporarily return no usable data. In those cases, the Actor saves available book rows and reports problems in logs.
  • Large runs over shelves, genres, lists, or authors should use bounded limits first, then scale after you inspect the output shape.

❓ FAQ

🔐 Do I need a Goodreads account?

No. The Actor is designed for public Goodreads pages and does not ask for Goodreads cookies, credentials, or an API key.

💬 Can I scrape Goodreads reviews?

Yes. Turn on Include reviews and set Review limit per book above 0. Review rows are separate from book rows so CSV and spreadsheet exports stay easier to work with.

🔗 Can I use direct Goodreads book URLs?

Yes. Add public Goodreads book URLs to the Goodreads URLs input. You can also add shelves, genres, lists, series pages, search pages, and author book-list pages.

🧩 Why are some fields null?

Goodreads does not expose every field on every public page. The Actor keeps missing values as null or empty arrays rather than guessing.

🗓️ Can I schedule this Goodreads scraper?

Yes. After you save an input, you can schedule repeat Apify runs, call the Actor through the Apify API, or connect results to downstream tools with Apify integrations and webhooks.

📝 Changelog

  • 0.2: Added ISBN lookup, series URLs, optional author/list/series metadata rows, book filters, and lower tiered pay-per-event pricing.
  • 0.1: Initial release.

🆘 Support

For issues, questions, or feature requests, file a ticket and I'll fix or implement it in less than 24h 🫡

Made with ❤️ by Maxime Dupré