Pricing

Pay per event

Goodreads Scraper - Books, Authors, Ratings, ISBN & Reviews

Scrape Goodreads books, authors and lists. Title, ISBN, pages, format, language, rating, ratings count, reviews count, author. HTTP only, $5/1K.

Developer

deusex machine

Last modified: 2 months ago

Goodreads Scraper — Books, Authors, Ratings, ISBN & Reviews

Scrape Goodreads — the world's largest book community with 150M members and 4 billion+ ratings — and extract complete book metadata, author profiles and curated book lists. HTTP-only, no browser, $5 per 1,000 items ($0.005 each).

If you build an author platform, run a publishing house, sell book-discovery apps, analyze the literary market, write academic papers about reading trends, or train a recommendation model on book data, this Goodreads scraper turns the canonical book-ratings graph into a clean structured feed in seconds.

Why use this Goodreads scraper

Goodreads is one of the most comprehensive book databases available, with millions of user ratings and reviews per major title. But Goodreads stopped issuing new API keys in December 2020 and has not published a replacement API, and some endpoints (such as search) challenge automated requests.

This actor extracts data from the canonical JSON-LD blocks Goodreads ships on every book detail page. That means:

✅ Stable selectors: <script type="application/ld+json"> exists for SEO, so Goodreads is unlikely to remove it without hurting its search ranking
✅ Complete fields: title, ISBN, page count, format, language, description, cover, author(s) with URLs, average rating, ratings count, reviews count
✅ No anti-bot measures encountered on book, author and list pages
✅ Fast: typically 4–6 books per second per worker
✅ Cheap: $0.005 per record ($5 per 1,000), among the lowest-priced book-database scrapers in the Apify Store
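
To make the JSON-LD approach concrete, here is a minimal sketch of pulling book fields out of a `<script type="application/ld+json">` block using only the Python standard library. It is not this actor's actual code: the schema.org property names (`name`, `inLanguage`, `aggregateRating.ratingValue`, and so on) and the sample HTML are assumptions based on the schema.org `Book` type, and real Goodreads pages may differ.

```python
import json
from html.parser import HTMLParser


class JsonLdParser(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append("".join(self._buffer))


def extract_book(html):
    """Return book fields from the first JSON-LD block typed "Book", or None."""
    parser = JsonLdParser()
    parser.feed(html)
    for raw in parser.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks
        if not isinstance(data, dict) or data.get("@type") != "Book":
            continue
        agg = data.get("aggregateRating") or {}
        authors = data.get("author") or []
        if isinstance(authors, dict):  # a single author may be a bare object
            authors = [authors]
        return {
            "title": data.get("name"),
            "isbn": data.get("isbn"),
            "numberOfPages": data.get("numberOfPages"),
            "bookFormat": data.get("bookFormat"),
            "language": data.get("inLanguage"),
            "authors": [{"name": a.get("name"), "url": a.get("url")} for a in authors],
            "rating": agg.get("ratingValue"),
            "ratingsCount": agg.get("ratingCount"),
            "reviewsCount": agg.get("reviewCount"),
        }
    return None


# Hypothetical page fragment; the property layout is an assumption.
SAMPLE_HTML = """<html><head><script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Book",
 "name": "And Then There Were None", "isbn": "9780312330873",
 "numberOfPages": 264, "bookFormat": "Paperback", "inLanguage": "English",
 "author": [{"@type": "Person", "name": "Agatha Christie",
             "url": "https://www.goodreads.com/author/show/123715.Agatha_Christie"}],
 "aggregateRating": {"@type": "AggregateRating", "ratingValue": 4.27,
                     "ratingCount": 1662794, "reviewCount": 86031}}
</script></head><body></body></html>"""

book = extract_book(SAMPLE_HTML)
print(book["title"], book["rating"], book["ratingsCount"])
```

Parsing structured data this way avoids CSS selectors entirely, which is why a visual redesign of the page is less likely to break extraction.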

What this Goodreads scraper extracts

Per book (`/book/show/...`)

Field	Description	Example
`bookId`	Goodreads internal book ID	`16299`
`slug`	URL-safe title slug	`And_Then_There_Were_None`
`url`	Canonical book URL	`https://www.goodreads.com/book/show/16299...`
`title`	Title as displayed on Goodreads	`And Then There Were None`
`authors`	Array of `{name, url}` — supports multi-author books	`[{name: "Agatha Christie", url: "..."}]`
`isbn`	Goodreads-canonical ISBN	`9780312330873`
`numberOfPages`	Page count (integer)	`264`
`bookFormat`	Hardcover, Paperback, Kindle, Audible, etc	`Paperback`
`language`	Edition language	`English`
`description`	Marketing description (from OG tag)	`"First, there were ten—a curious assortment..."`
`coverImage`	High-resolution cover image URL	`https://m.media-amazon.com/images/...`
`rating`	Average rating (1.00–5.00)	`4.27`
`ratingsCount`	Total number of ratings (integer)	`1662794`
`reviewsCount`	Total written reviews (integer)	`86031`
`bestRating` / `worstRating`	Bounds of the rating scale	`5.0` / `1.0`
`scrapedAt`	ISO 8601 UTC timestamp	`2026-05-18T20:34:12+00:00`

Per author (`/author/show/...`)

Field	Example
`authorId`	`123715`
`name`	`Agatha Christie`
`born` / `died`	`September 15, 1890` / `January 12, 1976`
`website` / `twitter`	author's social presence
`genres`	`["Mystery", "Fiction", "Crime"]`
`avgRating` / `ratingsCount`	aggregate across all the author's books
`image`	author photo URL
`books`	up to 30 visible works `[{title, url}]` from the profile page
`booksCount`	length of `books`

Per list (`/list/show/...`)

Field	Example
`listId` / `slug`	`1` / `Best_Books_Ever`
`title`	`Best Books Ever`
`description`	Marketing copy of the list
`books`	`[{bookId, title, url}]` array (up to 100 by default)
`booksCount`	length of `books`

If you enable enrichBooksFromLists: true, every book referenced in the list is also fetched individually and emitted as a separate type: "book" record with full metadata (ISBN, page count, rating, etc).

Use cases for this Goodreads data API

📚 Author platforms, book promotion, indie publishing

Tools like Reedsy, BookBub, BookFunnel and indie publishing platforms need fresh rating and review metrics for every book they promote. Schedule this scraper weekly to keep those metrics current.

🛒 Book discovery / recommendation apps

Use aggregate Goodreads ratings as item features for a recommender, or build a "Books like X" feature by pulling all books from canonical lists ("Best Mystery", "Best of 2025") and ranking by rating × ratingsCount.
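
The rating × ratingsCount ranking can be sketched in a few lines. This is an illustrative example, not part of the actor: `rank_books` is a hypothetical helper, and the two "Hypothetical Book" records are made-up data shaped like this actor's book output.

```python
def rank_books(records, top_n=10):
    """Rank book records by rating * ratingsCount, highest first."""
    scored = []
    for record in records:
        if record.get("type") != "book":
            continue  # ignore list and author records
        rating = record.get("rating")
        count = record.get("ratingsCount")
        if rating is None or count is None:
            continue  # unrated books cannot be scored
        scored.append((rating * count, record))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in scored[:top_n]]


# First record uses the values from this page; the other two are invented.
records = [
    {"type": "book", "title": "Hypothetical Book A", "rating": 4.8, "ratingsCount": 1000},
    {"type": "book", "title": "And Then There Were None", "rating": 4.27, "ratingsCount": 1662794},
    {"type": "book", "title": "Hypothetical Book B", "rating": 3.9, "ratingsCount": 500000},
    {"type": "list", "title": "Best Books Ever"},
]

ranked = rank_books(records)
for book in ranked:
    print(book["title"])
```

Because ratings counts span several orders of magnitude, this score is dominated by popularity; if that is not what you want, damping the count (for example with a logarithm) is a common alternative.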

🎓 Academic literary research

Researchers studying genre evolution, demographic reading patterns or literary canon formation use Goodreads as a primary corpus. Bulk-extract one book per year per genre and feed the results into your analysis pipeline.

📊 Publishing house competitive intelligence

Knowing the rating curve of every Stephen King vs every Dean Koontz vs every Lee Child release lets editors and marketing teams price advances, plan releases and pick mid-list bets.

🤖 LLM training data + RAG pipelines

Build a book-aware AI assistant that knows the ISBN, page count and average rating of every title you scrape, and can ground its recommendations in that data.

Subscriptions like "5-Bullet Book Brief" use book data to build reading lists for paid subscribers. This scraper feeds your CMS with consistent metadata.

📈 Financial / market analysis

Hedge funds tracking the "audiobook revolution" or "Kindle Unlimited churn" use Goodreads engagement metrics (ratings velocity, review counts) as leading indicators for traditional publisher earnings.

How to use this Goodreads scraper

Three input modes — combine them freely in a single run.

Mode 1: Book URLs

Pass canonical book URLs to extract one full record per book.

{
  "bookUrls": [
    "https://www.goodreads.com/book/show/16299.And_Then_There_Were_None",
    "https://www.goodreads.com/book/show/40961427-educated"
  ]
}

Mode 2: Author URLs

Pass author profile URLs to extract author identity plus visible book list.

{
  "authorUrls": [
    "https://www.goodreads.com/author/show/123715.Agatha_Christie",
    "https://www.goodreads.com/author/show/16667.Isaac_Asimov"
  ]
}

Mode 3: List URLs (with optional enrichment)

Lists are curated collections — "Best Books Ever", "Pulitzer Prize Winners", "Best Science Fiction of the Decade", etc. Each list yields ~100 books per page.

{
  "listUrls": [
    "https://www.goodreads.com/list/show/1.Best_Books_Ever",
    "https://www.goodreads.com/list/show/2.Best_Books_of_the_Decade__2010s"
  ],
  "enrichBooksFromLists": true,
  "maxBooksPerList": 50
}

When enrichBooksFromLists: true, each list emits one type: "list" record plus one type: "book" record per enriched book. If you only need the URL references, leave it off and you'll get a much cheaper run.
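
Since the three modes can be combined in one run, a mixed list of Goodreads URLs can be routed into the right input fields by path prefix. The sketch below assumes only the input field names shown above; `build_run_input` is a hypothetical helper, not something the actor ships.

```python
from urllib.parse import urlparse

# URL path prefix -> actor input field, as documented in the three modes above.
PREFIX_TO_FIELD = {
    "/book/show/": "bookUrls",
    "/author/show/": "authorUrls",
    "/list/show/": "listUrls",
}


def build_run_input(urls, enrich=False, max_books_per_list=None):
    """Group Goodreads URLs into bookUrls / authorUrls / listUrls."""
    run_input = {}
    for url in urls:
        path = urlparse(url).path
        for prefix, field in PREFIX_TO_FIELD.items():
            if path.startswith(prefix):
                run_input.setdefault(field, []).append(url)
                break
        else:
            # e.g. /search URLs, which this actor does not support
            raise ValueError(f"Unsupported Goodreads URL: {url}")
    if "listUrls" in run_input:
        run_input["enrichBooksFromLists"] = enrich
        if max_books_per_list is not None:
            run_input["maxBooksPerList"] = max_books_per_list
    return run_input


run_input = build_run_input(
    [
        "https://www.goodreads.com/book/show/16299.And_Then_There_Were_None",
        "https://www.goodreads.com/author/show/123715.Agatha_Christie",
        "https://www.goodreads.com/list/show/1.Best_Books_Ever",
    ],
    enrich=True,
    max_books_per_list=50,
)
print(run_input)
```

The resulting dictionary can be pasted into the actor's input editor as JSON or passed as `run_input` when calling the actor from code.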

Step-by-step tutorial — your first Goodreads run in 2 minutes

Click "Try for free" on this actor's Apify Store page. New users get $5 in credit.

Paste a starter input for the most popular Goodreads list:

{
  "listUrls": ["https://www.goodreads.com/list/show/1.Best_Books_Ever"],
  "enrichBooksFromLists": true,
  "maxBooksPerList": 20,
  "maxTotalItems": 25
}

Click "Start" and watch the live log.
Download your dataset as JSON, CSV, Excel, RSS or HTML.

You'll get one list record + 20 fully enriched book records (ISBN, ratings, page count) in ~30 seconds.

Performance and cost

HTTP only — no Playwright, no proxy, runs on minimal Apify compute units.
4–6 items per second sustained, single worker, 256 MB memory.
Pricing: $0.005 per item + $0.00005 per actor start.

Pricing scenarios

Workload	Items	Cost
Try the actor	5 books	$0.025
Apify's free $5 credit	~1,000 items	$5.00
Full enrich of a 100-book list	101 items	$0.51
Top 10 lists × 100 books × enrich	1,010 items	$5.05
50 authors × (1 profile + 30 visible books)	1,550 items	$7.75
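
The scenarios above follow directly from the two prices in the Performance and cost section. A small sketch for estimating your own workloads, using `Decimal` to avoid floating-point rounding; the helper names are hypothetical, and the table rounds the per-start fee away.

```python
from decimal import Decimal

# Prices as stated above: per emitted item and per actor start.
PRICE_PER_ITEM = Decimal("0.005")
PRICE_PER_START = Decimal("0.00005")


def estimate_cost(items, starts=1):
    """Estimated cost in USD for a number of emitted items and actor starts."""
    return items * PRICE_PER_ITEM + starts * PRICE_PER_START


def enriched_list_items(lists, books_per_list):
    """Each enriched list emits one list record plus one record per book."""
    return lists * (1 + books_per_list)


# Top 10 lists x 100 books with enrichment, in a single run.
items = enriched_list_items(lists=10, books_per_list=100)
print(items, estimate_cost(items))
```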

Output example (single book, JSON)

{
  "type": "book",
  "bookId": "16299",
  "slug": "And_Then_There_Were_None",
  "url": "https://www.goodreads.com/book/show/16299.And_Then_There_Were_None",
  "title": "And Then There Were None",
  "authors": [
    {"name": "Agatha Christie", "url": "https://www.goodreads.com/author/show/123715.Agatha_Christie"}
  ],
  "isbn": "9780312330873",
  "numberOfPages": 264,
  "bookFormat": "Paperback",
  "language": "English",
  "description": "First, there were ten—a curious assortment of strangers...",
  "coverImage": "https://m.media-amazon.com/images/S/compressed.photo.goodreads.com/books/1638425885i/16299.jpg",
  "rating": 4.27,
  "ratingsCount": 1662794,
  "reviewsCount": 86031,
  "bestRating": 5.0,
  "worstRating": 1.0,
  "scrapedAt": "2026-05-18T20:34:12+00:00"
}

How this Goodreads scraper compares

Approach	Pros	Cons
This actor	Stable JSON-LD selectors, $5/1K, no proxy, 3 modes	No full review text extraction in v1
Goodreads legacy API	Was free	No new API keys issued since December 2020; no replacement published
Open Library API	Free	Sparse coverage, missing ratings, no Goodreads-specific metrics
Manual scraping with BeautifulSoup	Total flexibility	Selectors break with every Goodreads UI update; you maintain forever
Hiring a freelancer	Custom output	$300–$1,000 one-off; not maintained

How to call this Goodreads scraper from your code

Python

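from apify_client import ApifyClient

# Replace with your Apify API token.
client = ApifyClient("YOUR_API_TOKEN")

# Start the actor and block until the run finishes.
run = client.actor("makework36/goodreads-scraper").call(run_input={
    "listUrls": ["https://www.goodreads.com/list/show/1.Best_Books_Ever"],
    "enrichBooksFromLists": True,
    "maxBooksPerList": 100,
})

# The dataset mixes list and book records; print only the books.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    if item.get("type") == "book":
        # isbn is null for books without a canonical ISBN.
        print(item["title"], item["rating"], item.get("isbn"))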

Node.js

// ES module syntax: top-level await needs "type": "module" in package.json or a .mjs file.
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });
// Start the actor and wait for the run to finish.
const run = await client.actor('makework36/goodreads-scraper').call({
  bookUrls: ['https://www.goodreads.com/book/show/16299.And_Then_There_Were_None'],
});
// Fetch the items from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach(b => console.log(b.title, b.rating, b.numberOfPages));

Frequently Asked Questions about scraping Goodreads

Is scraping Goodreads legal?

This actor extracts metadata Goodreads itself renders publicly to every visitor on book, author and list pages, and that is embedded as JSON-LD specifically to help search engines and aggregators consume the same data. You are still responsible for how you use it — respect copyright (descriptions and cover art remain Amazon/publisher property), Goodreads' Terms of Service for derivative-product creation, and applicable data protection laws.

Why not use the official Goodreads API?

Goodreads stopped issuing new API keys in December 2020 and has not published a replacement. Scraping the JSON-LD embedded in public pages is one of the few remaining programmatic paths to fresh Goodreads data.

Does the scraper extract full review text?

Not in v1. Aggregate counts (ratingsCount, reviewsCount) plus average rating come from JSON-LD. Individual review text is on a separately rendered page; an enhancement is planned in v1.1.

How current is the rating data?

Live — every run hits Goodreads directly and reflects the rating as displayed at request time. There is no internal cache.

What if a book has no ISBN listed?

Goodreads stores the ISBN of the default edition for each work. Some older or self-published books have no canonical ISBN, in which case the field returns null. When the field holds a 10-digit ISBN, the ISBN-13 can be computed from it by prefixing 978 and recalculating the check digit.
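
If an `isbn` value comes back as a 10-digit ISBN and you need the 13-digit form, the conversion is plain arithmetic: prefix `978`, drop the old check digit, and recompute the check digit with alternating weights of 1 and 3. The helper below is a hypothetical post-processing sketch, not part of the actor, and it does not validate the incoming ISBN-10 check digit.

```python
def isbn10_to_isbn13(isbn10):
    """Convert an ISBN-10 string to its ISBN-13 equivalent (978 prefix)."""
    digits = isbn10.replace("-", "").replace(" ", "")
    if len(digits) != 10:
        raise ValueError("expected a 10-character ISBN")
    # Keep the first nine digits; the tenth is the old check digit (may be 'X').
    core = "978" + digits[:9]
    if not core.isdigit():
        raise ValueError("ISBN contains non-digit characters")
    # ISBN-13 check digit: weights alternate 1, 3, 1, 3, ... from the left.
    total = sum(int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(core))
    check = (10 - total % 10) % 10
    return core + str(check)


# ISBN-10 of the edition used in the examples on this page.
print(isbn10_to_isbn13("0312330871"))
```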

Can I scrape every book by an author?

Pass the author URL, and the actor returns the visible books on their profile page (typically 30 entries). Deeper coverage via the author's "Books" tab (https://www.goodreads.com/author/list/{authorId}) is planned for v1.1.

Does Goodreads block bots?

The book, author and list endpoints do not actively challenge bot traffic as of this release. The /search?q=... endpoint, however, returns HTTP 202 to requests without cookies, which is why this actor does not offer a keyword search mode. Use lists or direct URLs instead.

How do I find list URLs?

Visit https://www.goodreads.com/list and browse by genre, decade, or theme. Copy the URL of any list. Popular starters: /list/show/1.Best_Books_Ever, /list/show/2.Best_Books_of_the_Decade__2010s, /list/show/43.Best_Science_Fiction_Fantasy_Books.

Can I schedule this scraper?

Yes. Use Apify's built-in scheduler to refresh your dataset daily, weekly or monthly, and push results directly to Google Sheets, BigQuery, Postgres or your CMS via Apify integrations.

Will my dataset have duplicates?

The actor deduplicates by URL within a single run. Across runs, build a primary key on bookId / authorId / listId to merge cleanly.
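
One way to do the cross-run merge is to key every record on its type plus its ID and keep the most recently scraped copy. The sketch below assumes that author and list records carry `type` values of `"author"` and `"list"` and a `scrapedAt` timestamp like book records do, and that all timestamps use the same UTC offset so they compare correctly as strings; `merge_runs` is a hypothetical helper.

```python
# Record type -> name of its ID field (the "author" type name is an assumption).
ID_FIELDS = {"book": "bookId", "author": "authorId", "list": "listId"}


def record_key(record):
    """Primary key for a record: (type, id as string)."""
    id_field = ID_FIELDS[record["type"]]
    return (record["type"], str(record[id_field]))


def merge_runs(existing, incoming):
    """Merge two runs, keeping the latest scrapedAt per primary key."""
    merged = {record_key(r): r for r in existing}
    for record in incoming:
        key = record_key(record)
        current = merged.get(key)
        # ISO 8601 strings with identical offsets sort chronologically.
        if current is None or record.get("scrapedAt", "") >= current.get("scrapedAt", ""):
            merged[key] = record
    return list(merged.values())


# Hypothetical data from two weekly runs.
last_week = [
    {"type": "book", "bookId": "16299", "rating": 4.26, "scrapedAt": "2026-05-11T20:00:00+00:00"},
    {"type": "list", "listId": "1", "scrapedAt": "2026-05-11T20:00:00+00:00"},
]
this_week = [
    {"type": "book", "bookId": "16299", "rating": 4.27, "scrapedAt": "2026-05-18T20:34:12+00:00"},
]

merged = merge_runs(last_week, this_week)
print(len(merged))
```

The same key works as a composite primary key if you load the data into Postgres or BigQuery instead of merging in memory.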

How accurate is the page count?

Page count reflects the default edition. A Kindle edition may show a different page count than the paperback. Use bookFormat to disambiguate.

Is there a free trial?

Apify gives every new user $5 in platform credit, enough to extract ~1,000 Goodreads items with this actor.

Can I use this data to build a recommendation engine?

Absolutely. Many recommender-system projects start with a Goodreads books CSV. Combining bookId, authors, numberOfPages, language, rating, ratingsCount and description gives you a rich feature matrix for content-based recommendations. Collaborative filtering additionally needs per-user ratings, which this actor does not extract.

🔗 Other actors by makework36

Building a content, publishing or recommendation product? You'll also want these:

Shopify Products Scraper — full Shopify catalog: title, SKU, price, variants, inventory
IndiaMART Suppliers Scraper — India B2B suppliers with phone, GST verified & ratings
Email Finder Scraper — verified business emails by domain
Reddit SaaS Leads Scraper — startup pain points & buyers
Trustpilot Reviews Scraper — customer reviews & ratings
YouTube Shorts Scraper — short-form video creator data

See all actors by makework36 on the Apify Store.

Roadmap

v1.1: full review text extraction per book, deeper author bibliography via /author/list/{id} pagination.
v1.2: book genre/shelf hierarchy extraction.
v1.3: ISBN-to-book reverse lookup via /search/?q={isbn}&search_type=isbn.
v2: similar-books graph extraction (for recommendation pipelines).

Disclaimer

This actor extracts public book and author metadata that Goodreads itself renders to every visitor and embeds as JSON-LD for SEO consumption. You are responsible for how you store, transform and redistribute the data. Cover images and book descriptions remain the property of their original publishers. This actor is not affiliated with Goodreads or Amazon.

🙏 Ran this Goodreads scraper successfully? Leaving a review helps the Apify algorithm surface this actor to other book platforms and publishing teams. Much appreciated.
