Pricing

from $1.00 / 1,000 results

Goodreads Book Scraper

Extract book data from Goodreads: titles, authors, ratings, reviews, genres, ISBN, publisher, and more. HTTP-based, no proxy required.

Rating

5.0

(23)

Developer

Crawler Bros

Actor stats

Issues response: 13 hours

Last modified: 2 days ago

Goodreads Scraper

Scrape Goodreads — books, authors, series, Listopia lists, popular shelves, and genres. Look up books by direct URL, search query, or ISBN. Get titles, authors, ratings, reviews, genres, ISBN-10/13, publisher, page count, language, format, cover image, and more. HTTP-based via the public goodreads.com pages. No proxy required, no authentication.

New in v1.1: ISBN lookup, author/series/list/shelf/genre modes, omit-empty output (no null fields), retry layer, optional metadata records.

What this actor does

9 modes: auto (default), books, search, isbns, authors, series, lists, shelves, genres.
ISBN lookup — paste ISBN-10 or ISBN-13; Goodreads' search redirects to the matching book.
Listings — author / series / list / shelf / genre pages walked (paginated where applicable) and every book scraped in detail.
Filters — minimum rating, minimum ratings count, publish year range, language, contains-genre, contains-author.
Metadata records — optionally emit per-author / per-series / per-list summary records (bio, description, book count) alongside book records.
Empty fields are omitted — records never contain null, "", [], or {}.
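
The omit-empty contract can be mirrored on the consumer side, for example when merging records from other sources. The sketch below is an illustration under assumptions, not the actor's own code: `prune_empty` is a hypothetical helper, and it assumes that `0` and `False` count as real values and are kept.

```python
from typing import Any


def _is_empty(value: Any) -> bool:
    """True for None, "", [] and {}; numbers and booleans are never empty."""
    if value is None:
        return True
    return isinstance(value, (str, list, dict)) and len(value) == 0


def prune_empty(value: Any) -> Any:
    """Recursively drop None, "", [] and {} from nested dicts and lists."""
    if isinstance(value, dict):
        cleaned = {key: prune_empty(item) for key, item in value.items()}
        return {key: item for key, item in cleaned.items() if not _is_empty(item)}
    if isinstance(value, list):
        cleaned_list = [prune_empty(item) for item in value]
        return [item for item in cleaned_list if not _is_empty(item)]
    return value


record = {"title": "The Great Gatsby", "isbn": None, "genres": [], "ratingsCount": 0}
print(prune_empty(record))
```

Because empty fields are absent rather than null, downstream code should read fields with `record.get("publisher")` instead of indexing directly.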

Quick start

The default mode, "auto", runs every populated input array. Provide what you have:

{
  "mode": "auto",
  "bookUrls": ["https://www.goodreads.com/book/show/4671.The_Great_Gatsby"],
  "searchQueries": ["sapiens"],
  "maxItems": 5
}

Expected output: 1 book record (Gatsby) + up to 5 books from the Sapiens search.

Modes

`auto` (default)

Runs whichever input arrays are non-empty. Fully backward-compatible with v1.0 (which only had bookUrls + searchQueries).
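
As a rough illustration of how `auto` could resolve to concrete modes, the sketch below maps each input array named in this README to its mode. The function name `modes_to_run` and the execution order are assumptions; the actor's real dispatch logic is not documented here.

```python
from typing import Any, Dict, List

# Input array name -> mode name, both taken from the mode sections of this README.
INPUT_TO_MODE = {
    "bookUrls": "books",
    "searchQueries": "search",
    "isbns": "isbns",
    "authorUrls": "authors",
    "seriesUrls": "series",
    "listUrls": "lists",
    "shelfNames": "shelves",
    "genreNames": "genres",
}


def modes_to_run(run_input: Dict[str, Any]) -> List[str]:
    """Return the modes a given input would trigger (hypothetical helper)."""
    mode = run_input.get("mode", "auto")
    if mode != "auto":
        return [mode]
    # In auto mode, every non-empty input array contributes its mode.
    return [name for key, name in INPUT_TO_MODE.items() if run_input.get(key)]


quick_start = {
    "mode": "auto",
    "bookUrls": ["https://www.goodreads.com/book/show/4671.The_Great_Gatsby"],
    "searchQueries": ["sapiens"],
    "maxItems": 5,
}
print(modes_to_run(quick_start))
```

A v1.0-style input with only `bookUrls` and `searchQueries` resolves to the same two modes, which is the backward-compatibility property described above.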

`books` — direct URLs

{
  "mode": "books",
  "bookUrls": [
    "https://www.goodreads.com/book/show/4671.The_Great_Gatsby",
    "https://www.goodreads.com/book/show/23692271-sapiens"
  ]
}

`search` — text queries

{
  "mode": "search",
  "searchQueries": ["atomic habits", "the lord of the rings"],
  "maxItems": 20
}

`isbns` — ISBN-10 or ISBN-13

{
  "mode": "isbns",
  "isbns": ["9780743273565", "0747532699", "9780261103252"]
}

Hyphens and spaces in ISBN strings are tolerated ("978-0-7432-7356-5" works). Goodreads redirects ISBN searches directly to the matching book — no extra fetch needed.
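
If you want to clean or deduplicate ISBNs before passing them in, the normalization described here (stripping everything that is not a letter or digit) can be sketched as follows. `normalize_isbn` and `isbn_kind` are hypothetical helper names, and upper-casing a trailing `x` check digit is an assumption; the sketch does not validate check digits.

```python
import re
from typing import Optional


def normalize_isbn(raw: str) -> str:
    """Strip hyphens, spaces and other non-alphanumerics; upper-case a trailing 'x'."""
    return re.sub(r"[^0-9A-Za-z]", "", raw).upper()


def isbn_kind(raw: str) -> Optional[str]:
    """Classify by length only: 10 characters -> ISBN-10, 13 -> ISBN-13."""
    length = len(normalize_isbn(raw))
    if length == 10:
        return "ISBN-10"
    if length == 13:
        return "ISBN-13"
    return None


for value in ["978-0-7432-7356-5", "0 7475 3269 9"]:
    print(normalize_isbn(value), isbn_kind(value))
```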

`authors` — all books by an author

{
  "mode": "authors",
  "authorUrls": ["https://www.goodreads.com/author/show/1077326.J_K_Rowling"],
  "maxItems": 20,
  "includeMetadata": true
}

Walks /author/list/<id>?page=N for every book. With includeMetadata: true, also emits one author summary record (bio, photo, genres, born/died) before the books.
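
To make the URL walk concrete, here is a small sketch of deriving the paginated listing URL from an author profile URL. The helper name `author_list_url` and the regular expression are assumptions; only the `/author/show/<id>` and `/author/list/<id>?page=N` path shapes come from this README.

```python
import re
from typing import Optional


def author_list_url(author_url: str, page: int = 1) -> Optional[str]:
    """Build the /author/list/<id>?page=N URL from an /author/show/<id> URL."""
    match = re.search(r"/author/show/(\d+)", author_url)
    if match is None:
        return None  # not an author profile URL
    return f"https://www.goodreads.com/author/list/{match.group(1)}?page={page}"


profile = "https://www.goodreads.com/author/show/1077326.J_K_Rowling"
print(author_list_url(profile, page=2))
```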

`series` — all books in a series

{
  "mode": "series",
  "seriesUrls": ["https://www.goodreads.com/series/49075-harry-potter"],
  "includeMetadata": true
}

`lists` — Listopia curated lists

{
  "mode": "lists",
  "listUrls": ["https://www.goodreads.com/list/show/1.Best_Books_Ever"],
  "maxItems": 50
}

Paginated automatically. Try "https://www.goodreads.com/list/show/264.Books_That_Should_Be_Made_Into_Movies" or any Listopia URL.

`shelves` — popular shelf names

{
  "mode": "shelves",
  "shelfNames": ["mystery", "fantasy", "historical-fiction"],
  "maxItems": 30
}

Shelf names are case-insensitive; spaces are converted to hyphens (e.g. "Best Mystery 2024" → "best-mystery-2024").
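
The shelf-name rule can be sketched in a couple of lines. `shelf_slug` is a hypothetical name, and collapsing runs of whitespace into a single hyphen is an assumption beyond what is stated above.

```python
def shelf_slug(name: str) -> str:
    """Lower-case the name and join whitespace-separated words with hyphens."""
    return "-".join(name.strip().lower().split())


print(shelf_slug("Best Mystery 2024"))
```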

`genres` — top books in a genre

{
  "mode": "genres",
  "genreNames": ["fiction", "romance", "mystery"]
}

Filters

All filters are optional. They apply across every mode. Records missing the filtered field pass through (filters reject only when the field is present and out of bounds).

Filter	Type	Effect
`minRating`	number 0–5	Drop books with `averageRating` below this
`minRatingsCount`	int	Drop books with fewer ratings than this
`publishYearMin`	int	Drop books published before this year
`publishYearMax`	int	Drop books published after this year
`language`	string	Keep only books whose `language` starts with this string (case-insensitive), e.g. `"en"` matches `"en"` / `"eng"` / `"English"`
`containsGenre`	string	Drop books unless one of their genres contains this substring (case-insensitive)
`containsAuthor`	string	Drop books unless one of their authors contains this substring (case-insensitive)

Example: highly-rated recent fantasy

{
  "mode": "shelves",
  "shelfNames": ["fantasy"],
  "minRating": 4.0,
  "minRatingsCount": 10000,
  "publishYearMin": 2020,
  "maxItems": 50
}
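
The filter rules above (reject only when the field is present and out of bounds) can be reproduced client-side, for instance to re-filter an existing dataset without re-running the actor. This is a minimal sketch under that stated rule, not the actor's implementation; `passes_filters` is a hypothetical helper and the sample records are made up for illustration.

```python
from typing import Any, Dict


def passes_filters(book: Dict[str, Any], filters: Dict[str, Any]) -> bool:
    """Return False only when a filtered field is present and out of bounds."""
    rating = book.get("averageRating")
    if "minRating" in filters and rating is not None and rating < filters["minRating"]:
        return False
    count = book.get("ratingsCount")
    if "minRatingsCount" in filters and count is not None and count < filters["minRatingsCount"]:
        return False
    year = book.get("publishedYear")
    if year is not None:
        if "publishYearMin" in filters and year < filters["publishYearMin"]:
            return False
        if "publishYearMax" in filters and year > filters["publishYearMax"]:
            return False
    language = book.get("language")
    if "language" in filters and language is not None:
        # Case-insensitive prefix match, as described in the table.
        if not language.lower().startswith(filters["language"].lower()):
            return False
    # Case-insensitive substring match against any genre / any author.
    for filter_key, field in (("containsGenre", "genres"), ("containsAuthor", "authors")):
        values = book.get(field)
        if filter_key in filters and values is not None:
            needle = filters[filter_key].lower()
            if not any(needle in value.lower() for value in values):
                return False
    return True


filters = {"minRating": 4.0, "minRatingsCount": 10000, "publishYearMin": 2020}
sample = [
    {"title": "A", "averageRating": 4.3, "ratingsCount": 25000, "publishedYear": 2021},
    {"title": "B", "averageRating": 3.9, "ratingsCount": 25000, "publishedYear": 2021},
    {"title": "C"},  # no filtered fields present, so it passes through
]
print([book["title"] for book in sample if passes_filters(book, filters)])
```

Note the pass-through behavior: record "C" survives because it has none of the filtered fields, so a strict consumer may want an extra check for required fields.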

Output fields per record type

Every record has recordType: "book" | "author" | "series" | "list" and scrapedAt (ISO 8601 UTC).

Book record (`recordType: "book"`)

Field	Description
`title`	Book title
`url`	Goodreads book URL
`bookId`	Goodreads numeric book ID
`authors[]`	Author names
`primaryAuthor`	First author
`authorUrls[]`	Goodreads author profile URLs
`description`	Plain-text description (HTML stripped)
`isbn`, `isbn10`, `isbn13`	ISBN identifiers (when known)
`averageRating`	Average rating, 0–5
`ratingsCount`	Total number of ratings
`reviewsCount`	Total number of text reviews
`pagesCount`	Page count
`publishedYear`	Year of original publication
`publisher`	Publisher name
`language`	Language (varies — sometimes ISO code, sometimes name)
`format`	Paperback, Hardcover, Kindle, etc.
`genres[]`	List of genre tags
`coverImage`	Cover image URL on Goodreads CDN

Author record (`recordType: "author"`, only when `includeMetadata: true`)

Field	Description
`name`	Author display name
`authorId`, `authorUrl`	Goodreads identifiers
`photoUrl`	Author photo on Goodreads CDN
`description`	"About the author" text
`born`, `died`	Birth/death info (when public)
`genres[]`	Top author genres
`website`	External author website (when listed)

Series record (`recordType: "series"`, only when `includeMetadata: true`)

Field	Description
`name`	Series name
`seriesId`, `seriesUrl`	Goodreads identifiers
`description`	Series description
`primaryAuthor`	First author of the series
`bookCount`	Number of books on the series page

List record (`recordType: "list"`, only when `includeMetadata: true`)

Field	Description
`name`	List name
`listId`, `listUrl`	Goodreads identifiers
`description`	List description
`bookCount`	Total books in the list
`voterCount`	Total voters
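
With `includeMetadata: true` the dataset mixes record types, so a consumer will usually split it on `recordType` first. The sketch below assumes the dataset items have already been loaded as a list of dicts; `split_by_record_type` is a hypothetical helper and the sample items are illustrative, using only field names from the tables above.

```python
from collections import defaultdict
from typing import Any, Dict, List


def split_by_record_type(items: List[Dict[str, Any]]) -> Dict[str, List[Dict[str, Any]]]:
    """Group dataset items by their recordType field."""
    groups: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for item in items:
        groups[item["recordType"]].append(item)
    return dict(groups)


items = [
    {"recordType": "author", "name": "J.K. Rowling"},
    {"recordType": "book", "title": "Example Book One"},
    {"recordType": "book", "title": "Example Book Two"},
]
groups = split_by_record_type(items)
print({record_type: len(records) for record_type, records in groups.items()})
```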

Use cases

Library systems — bulk-import metadata from Goodreads by ISBN.
Reading recommendation — feed Goodreads genre + rating data into your recommender.
Author catalog — get every book by a specific author in one run.
Series tracking — pull all books in a series with publication years and ratings.
Curated discovery — scrape Listopia lists like "Best Books of the Decade" or "Best Mystery 2024".
Reading-level filtering — only books rated ≥4.0 with ≥10k ratings published in the last 5 years.
Publishing intelligence — track ratings/reviews velocity for a series of releases.

FAQ

Why was ISBN lookup added in v1.1? A user reported that v1.0 had no documented path to look up a book by ISBN. v1.1 ships a dedicated isbns input that accepts ISBN-10 / ISBN-13 with or without hyphens. Goodreads' search endpoint redirects ISBN queries to the matching book page, so lookup is direct.

Is a proxy required? No. Goodreads' public pages are accessible from datacenter IPs. The actor includes an optional proxyConfiguration field for cases where you hit sustained 429s, but the default is no proxy.

What's the rate limit? The actor uses 0.3–0.7s polite delays between fetches. If Goodreads rate-limits, the actor retries with exponential backoff (10s/20s/40s, capped at 90s) up to 3 times per fetch.
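
The retry schedule described in this answer works out as follows. This is a sketch of the stated numbers (10s base, doubling, 90s cap, 3 retries, retry on 429/5xx), not the actor's code; `backoff_delays` and `fetch_with_retry` are hypothetical names, and the `fetch` callable stands in for whatever HTTP client is used.

```python
import time
from typing import Callable, List, Tuple


def backoff_delays(base: float = 10.0, cap: float = 90.0, max_retries: int = 3) -> List[float]:
    """Exponential schedule: base, 2*base, 4*base, ... with each delay capped."""
    return [min(base * 2 ** attempt, cap) for attempt in range(max_retries)]


def fetch_with_retry(
    fetch: Callable[[], Tuple[int, str]],
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call fetch(); on 429 or 5xx wait and retry, up to len(delays) retries."""
    delays = backoff_delays()
    status, body = fetch()
    for delay in delays:
        if status != 429 and not 500 <= status < 600:
            break  # success or a non-retryable error
        sleep(delay)
        status, body = fetch()
    return status, body


print(backoff_delays())
```

With three retries the delays never reach the 90s cap; the cap only matters if the retry count is raised.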

Why is includeMetadata opt-in? By default the dataset is uniform (all recordType: "book"). Enabling includeMetadata mixes in author / series / list records — useful for analysis but breaks consumers expecting only books. Off by default for backward compatibility.

Why do some coverImage URLs return 404? Goodreads sometimes references covers that aren't on its CDN (very old or rare books). The URL is what Goodreads publishes; not all of them resolve.

What does mode: "auto" mean? It runs every populated input array sequentially. This is the default, and it preserves v1.0 behavior — pre-v1.1 callers passing only bookUrls and searchQueries continue to work unchanged.

What's the difference between language: "en" and language: "English"? The filter is a case-insensitive prefix match, so "en" is the broader choice: it matches "en", "eng", and "English" (all of which start with "en"). "English" matches only values that start with "English", so it skips records whose language is given as "en" or "eng".

Can I use ISBN-13 or ISBN-10? Both. The actor normalizes by stripping non-alphanumerics; hyphens and spaces in your input are fine.

Is this affiliated with Goodreads or Amazon? No. This is a third-party actor that uses Goodreads' public pages.

Limitations (v1.1)

Review detail pages (/review/show/<id>) are not scraped. The book's reviewsCount is captured, but individual review text is not. Planned for v2.
Award pages (/award/show/<id>) have inconsistent layouts and are not supported. Planned for v2.
Quotes (/quotes) are a separate record type and not in scope for v1.1.
User shelves (/user/show/<id>) are not supported — most are login-gated; public shelves duplicate /list/show.
No country localization — Goodreads runs on goodreads.com globally with no country subdomains.
mode=shelves is capped at ~50 books by Goodreads — /shelf/show/<name>?page=N ignores the page parameter; pages 2+ return identical content. The actor detects this and stops cleanly. For deeper coverage, use mode=lists (Listopia paginates correctly to thousands of books) or mode=genres (different surface).
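
The shelf limitation implies a stop condition: when a page contributes no book that has not been seen already, further pages are pointless. One way such a check could look is sketched below; how the actor actually detects repeats is not documented here, and `walk_until_repeat`, `fetch_page`, and `max_pages` are hypothetical names.

```python
from typing import Callable, List


def walk_until_repeat(fetch_page: Callable[[int], List[str]], max_pages: int = 20) -> List[str]:
    """Collect book IDs page by page; stop when a page adds nothing new."""
    seen = set()
    ordered: List[str] = []
    for page in range(1, max_pages + 1):
        added = 0
        for book_id in fetch_page(page):
            if book_id not in seen:
                seen.add(book_id)
                ordered.append(book_id)
                added += 1
        if added == 0:
            break  # identical (or empty) page: the listing has stopped paginating
    return ordered


# Stand-in for a shelf that ignores ?page=N and always returns the same books.
def fake_shelf_page(page: int) -> List[str]:
    return ["4671", "23692271"]


print(walk_until_repeat(fake_shelf_page))
```

The same loop also terminates correctly on a listing that paginates properly, since the last page is followed by an empty one.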

Changelog

v1.1 (current)

NEW: isbns mode (the v1.0 ISBN gap is closed).
NEW: authors, series, lists, shelves, genres modes.
NEW: filters — minRating, minRatingsCount, publishYearMin/Max, language, containsGenre, containsAuthor.
NEW: optional metadata records via includeMetadata.
NEW: retry layer on 429/5xx with exponential backoff.
FIX: omit-empty contract — records no longer contain null, "", [], or {}.
DEPRECATED: maxResultsPerQuery (use maxItems instead — both honored).
BACKWARD-COMPAT: v1.0 callers passing only bookUrls + searchQueries continue to work via mode: "auto".

v1.0

Initial release: bookUrls, searchQueries, maxResultsPerQuery.
