Goodreads Scraper
Pricing: from $0.90 per 1,000 Goodreads books
Scrape public Goodreads books from search terms, book URLs, shelves, genres, lists, and author pages. Export ratings, authors, ISBNs, descriptions, covers, source ranks, and optional review rows.
Developer
Maxime Dupré
📚 Goodreads scraper for books, shelves, and reviews
Goodreads Scraper collects public book data from Goodreads search terms, ISBNs, book URLs, shelves, genres, lists, search pages, author pages, and series pages. Add a Goodreads URL, ISBN, or a query such as science fiction, then export book titles, authors, ratings, review counts, descriptions, ISBNs, genres, covers, and scrape metadata from a clean Apify dataset.
Use this Goodreads scraper when you need repeatable book data for market research, reading-list curation, publishing research, recommendation datasets, SEO briefs, content planning, or internal book intelligence workflows. You can run it in Apify Console, schedule repeat runs, call it from the Apify API, and export results as JSON, CSV, Excel, XML, RSS, or HTML.
The Actor is built for public Goodreads pages. It does not require a Goodreads account, cookies, Goodreads API key, or user credentials. It only saves usable scraped results; invalid targets, unavailable pages, and source-side problems are reported in logs instead of placeholder dataset rows.
✅ What this Actor does
- Scrapes Goodreads book results from search terms.
- Looks up books by ISBN-10 or ISBN-13.
- Scrapes public Goodreads book, shelf, genre, list, search, author, and series URLs.
- Saves one dataset item per discovered book.
- Optionally saves public review rows found on book pages.
- Optionally saves author, list, and series metadata rows.
- Filters books by rating, ratings count, publish year, language, genre, and author.
- Opens book pages for detail enrichment when includeBookDetails is enabled.
- Keeps source URL, source type, source rank, and scrape time on every book row.
- Deduplicates books across all submitted searches and URLs by default.
- Stops at your maxItems, maxPages, and maxReviewsPerBook limits.
For keyword search, the Actor uses Goodreads public autocomplete results. For shelf, genre, list, search URL, and author targets, it discovers book links from the submitted Goodreads pages and then enriches each book from its public book page.
📦 Data you can extract
Each book output item can include:
- goodreadsId, url, and canonicalUrl
- title
- authors with author name, URL, and Goodreads ID when available
- rating, ratingCount, and reviewCount
- description
- imageUrl
- isbn and isbn13
- publisher, publishedDate, and firstPublishedDate
- pageCount, bookFormat, and language
- series, genres, characters, and awards
- buyLinks when available
- sourceTargetUrl, sourceTargetType, sourcePage, and sourceRank
- status, missingFields, and scrapedAt
When includeReviews is enabled and maxReviewsPerBook is greater than 0, the Actor can also save review rows with the source book URL, visible reviewer profile data, visible review text, HTML, shelves when available, and scrape time.
When includeMetadata is enabled, author, list, and series URL targets can also emit metadata rows with metadataType, source URL, title, description, book count when visible, and scrape time.
Some Goodreads pages do not expose every field for every book. Missing scalar values are returned as null, and missing lists are returned as empty arrays. The Actor does not invent metadata that is not visible in the public source page.
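Because missing scalars arrive as null and missing lists as empty arrays, downstream code can look for gaps without special-casing absent keys. The following is a minimal Python sketch, assuming the rows have already been loaded from the dataset as dictionaries. The helper name field_coverage and the sample rows are hypothetical and are not part of the Actor.

```python
from collections import Counter


def field_coverage(rows):
    """Count, per field, how many rows have no usable value.

    A value counts as missing when it is None (JSON null) or an empty list,
    which matches how this README describes missing scalars and lists.
    """
    missing = Counter()
    for row in rows:
        for field, value in row.items():
            if value is None or value == []:
                missing[field] += 1
    return dict(missing)


# Two hand-written rows for illustration, not real scraped output.
sample_rows = [
    {"title": "Book A", "isbn13": None, "genres": ["Fantasy"], "awards": []},
    {"title": "Book B", "isbn13": None, "genres": [], "awards": []},
]
print(field_coverage(sample_rows))
```

A report like this can help decide which columns are reliable enough to keep before scaling up a run.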
🚀 How to run
- Add one or more Goodreads search terms, ISBNs, Goodreads URLs, or a mix of all three.
- Keep Book limit at 25 or 50 for a quick first run.
- Keep Page limit per target low while testing shelf, genre, list, or author pages.
- Leave Include book details on if you want descriptions, ISBNs, page counts, genres, ratings, and review counts.
- Turn on Include reviews only when you need review rows, then set a small Review limit per book.
- Run the Actor and open the dataset in Apify Console, export it, or pull it through the Apify API.
For the fastest first run, use the prefilled science fiction search term or the prefilled https://www.goodreads.com/shelf/show/fantasy shelf URL with a small book limit.
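For API use, one possible shape of a first run is sketched below with the Python standard library only. It assumes the Apify API's run-sync-get-dataset-items endpoint and bearer-token authentication. The Actor ID and token are placeholders you must replace, and build_run_request is a hypothetical helper name. The request is only built, not sent, so nothing is billed until you uncomment the last lines.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id, token, run_input):
    """Build (but do not send) a POST request that runs an Actor
    synchronously and returns its dataset items."""
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Placeholder values: replace with the real Actor ID and your own API token.
request = build_run_request(
    actor_id="USERNAME~ACTOR-NAME",
    token="YOUR_APIFY_TOKEN",
    run_input={"searchTerms": ["science fiction"], "maxItems": 25},
)
print(request.get_method(), request.full_url)

# To actually send it (uses the network and is billed per saved row):
# with urllib.request.urlopen(request) as response:
#     books = json.load(response)
```

A synchronous call like this is best kept for small, bounded runs. For larger runs, starting the Actor and reading the dataset afterwards is likely the safer pattern.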
⚙️ Input
{
  "searchTerms": ["science fiction"],
  "isbns": ["9780062316097"],
  "targets": [
    { "url": "https://www.goodreads.com/series/45175-harry-potter" }
  ],
  "maxItems": 25,
  "maxPages": 2,
  "includeBookDetails": true,
  "includeReviews": false,
  "maxReviewsPerBook": 0,
  "includeMetadata": true,
  "minRating": 3.5,
  "minRatingsCount": 10,
  "deduplicateBooks": true
}
🔎 Search terms
Use plain Goodreads book searches such as:
- science fiction
- romantasy
- Stephen King
- business books
- historical fiction
Each search term is searched separately. Empty and duplicate terms are ignored.
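If you build the term list programmatically, you can apply the same idea before submitting it. This is a client-side sketch only: the Actor's own matching rules are not documented here, so the case-insensitive comparison and the helper name clean_search_terms are assumptions.

```python
def clean_search_terms(terms):
    """Trim whitespace, drop empty strings, and drop repeats while keeping order."""
    seen = set()
    cleaned = []
    for term in terms:
        normalized = " ".join(term.split())  # collapse inner and outer whitespace
        key = normalized.casefold()  # assumption: compare case-insensitively
        if normalized and key not in seen:
            seen.add(key)
            cleaned.append(normalized)
    return cleaned


print(clean_search_terms(["science fiction", "", "  romantasy ", "Science Fiction"]))
```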
🔢 ISBNs
Use ISBN-10 or ISBN-13 values such as:
- 9780062316097
- 978-0-7432-7356-5
- 0747532699
Hyphens and spaces are accepted. Each ISBN is searched on Goodreads and saved as a book row when Goodreads returns a public match.
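To catch typos before spending a run on them, you can pre-check ISBNs locally. The sketch below strips hyphens and spaces and applies the standard ISBN-10 and ISBN-13 checksum rules. It is an optional client-side check, not a description of how the Actor validates input, and the function names are hypothetical.

```python
def normalize_isbn(raw):
    """Strip hyphens and spaces, and uppercase a trailing ISBN-10 'x'."""
    return raw.replace("-", "").replace(" ", "").upper()


def is_valid_isbn(raw):
    """Return True when the value passes the ISBN-10 or ISBN-13 checksum."""
    isbn = normalize_isbn(raw)
    if not isbn.isascii():
        return False
    if len(isbn) == 10 and isbn[:9].isdigit() and (isbn[9].isdigit() or isbn[9] == "X"):
        digits = [int(c) for c in isbn[:9]]
        digits.append(10 if isbn[9] == "X" else int(isbn[9]))
        # ISBN-10: weights run from 10 down to 1; the total must divide by 11.
        return sum(d * w for d, w in zip(digits, range(10, 0, -1))) % 11 == 0
    if len(isbn) == 13 and isbn.isdigit():
        # ISBN-13: weights alternate 1 and 3; the total must divide by 10.
        return sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(isbn)) % 10 == 0
    return False


for value in ["9780062316097", "978-0-7432-7356-5", "0747532699"]:
    print(normalize_isbn(value), is_valid_isbn(value))
```

A valid checksum only means the number is well formed. Whether Goodreads has a public match for it is still decided at run time.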
🔗 Goodreads URLs
You can submit public Goodreads URLs such as:
- Book pages: https://www.goodreads.com/book/show/5907.The_Hobbit
- Shelf pages: https://www.goodreads.com/shelf/show/fantasy
- Genre pages: https://www.goodreads.com/genres/most_read/fantasy
- List pages: https://www.goodreads.com/list/show/1.Best_Books_Ever
- Series pages: https://www.goodreads.com/series/45175-harry-potter
- Search pages, author pages, and author book-list pages
The Actor skips unsupported or unavailable targets instead of saving fake rows.
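If you assemble URL lists from other sources, a rough pre-filter can keep obviously unsupported links out of a run. The sketch below classifies a URL by its path prefix. The prefixes come from the example URLs in this README, except /search, which is an assumption. The type labels and the helper name classify_target are illustrative and may not match the Actor's own sourceTargetType values.

```python
from urllib.parse import urlparse

# Path prefixes are taken from the example URLs in this README, except
# "/search", which is an assumption about Goodreads search page URLs.
PATH_PREFIXES = [
    ("/book/show/", "book"),
    ("/shelf/show/", "shelf"),
    ("/genres/", "genre"),
    ("/list/show/", "list"),
    ("/series/", "series"),
    ("/author/", "author"),
    ("/search", "search"),
]


def classify_target(url):
    """Return a target type label for a Goodreads URL, or None if unrecognized."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return None
    if parsed.hostname not in ("goodreads.com", "www.goodreads.com"):
        return None
    for prefix, label in PATH_PREFIXES:
        if parsed.path.startswith(prefix):
            return label
    return None


print(classify_target("https://www.goodreads.com/series/45175-harry-potter"))
```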
🎚️ Limits and options
- maxItems caps saved book rows across the whole run.
- maxPages caps listing pages scanned for each Goodreads URL target.
- includeBookDetails controls whether the Actor opens each book page for enriched fields.
- includeReviews and maxReviewsPerBook control optional review row output.
- includeMetadata controls optional author/list/series summary rows.
- deduplicateBooks keeps the same Goodreads book from being saved more than once in a run.
Filters apply when the relevant field is visible on Goodreads: minRating, minRatingsCount, publishYearMin, publishYearMax, language, containsGenre, and containsAuthor.
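One way to read "apply when the relevant field is visible" is that a book is rejected only when a visible value fails a threshold. The sketch below applies that reading to already-exported rows for two of the filters. It is an interpretation for post-processing, not the Actor's confirmed logic, and passes_filters is a hypothetical helper.

```python
def passes_filters(book, min_rating=None, min_ratings_count=None):
    """Return False only when a visible value fails a threshold.

    Assumption: a book whose rating or ratingCount is None (not visible)
    is kept rather than rejected.
    """
    rating = book.get("rating")
    if min_rating is not None and rating is not None and rating < min_rating:
        return False
    count = book.get("ratingCount")
    if min_ratings_count is not None and count is not None and count < min_ratings_count:
        return False
    return True


# Hand-written rows for illustration only.
books = [
    {"title": "High", "rating": 4.29, "ratingCount": 4510933},
    {"title": "Low", "rating": 3.1, "ratingCount": 500},
    {"title": "Unknown", "rating": None, "ratingCount": None},
]
kept = [b["title"] for b in books if passes_filters(b, min_rating=3.5, min_ratings_count=10)]
print(kept)
```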
🧪 Output example
{
  "rowType": "book",
  "goodreadsId": "5907",
  "url": "https://www.goodreads.com/book/show/5907.The_Hobbit",
  "canonicalUrl": "https://www.goodreads.com/book/show/5907.The_Hobbit",
  "title": "The Hobbit, or There and Back Again",
  "authors": [
    {
      "name": "J.R.R. Tolkien",
      "url": "https://www.goodreads.com/author/show/656983.J_R_R_Tolkien",
      "goodreadsId": "656983"
    }
  ],
  "rating": 4.29,
  "ratingCount": 4510933,
  "reviewCount": 74706,
  "description": "Bilbo Baggins is a hobbit who enjoys a comfortable, unambitious life...",
  "imageUrl": "https://m.media-amazon.com/images/S/compressed.photo.goodreads.com/books/1546071216i/5907.jpg",
  "isbn": "9780547928227",
  "isbn13": null,
  "publisher": null,
  "publishedDate": null,
  "firstPublishedDate": "First published September 21, 1937",
  "pageCount": 366,
  "bookFormat": "Paperback",
  "language": "English",
  "series": null,
  "genres": ["Fantasy", "Classics", "Fiction"],
  "characters": [],
  "awards": [],
  "buyLinks": [],
  "sourceTargetUrl": "https://www.goodreads.com/shelf/show/fantasy",
  "sourceTargetType": "shelf",
  "sourcePage": 1,
  "sourceRank": 1,
  "status": "ok",
  "missingFields": [],
  "scrapedAt": "2026-05-27T19:17:00.000Z"
}
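Nested fields such as authors and genres often need flattening before they fit a spreadsheet column. The sketch below keeps only rows whose rowType is "book" and joins list values with semicolons. The column selection, the separator, and the helper names are choices made for this example. Apify's built-in CSV export may lay out nested fields differently.

```python
import csv
import io

CSV_COLUMNS = ["goodreadsId", "title", "authors", "rating", "ratingCount", "isbn", "genres"]


def flatten_book(row):
    """Turn one book row into a flat dict suitable for a CSV line."""
    flat = {column: row.get(column) for column in CSV_COLUMNS}
    # authors is a list of objects; keep only the names.
    flat["authors"] = "; ".join(a["name"] for a in row.get("authors") or [])
    flat["genres"] = "; ".join(row.get("genres") or [])
    return flat


def books_to_csv(rows):
    """Write rows whose rowType is "book" to a CSV string."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=CSV_COLUMNS)
    writer.writeheader()
    for row in rows:
        if row.get("rowType") == "book":
            writer.writerow(flatten_book(row))
    return buffer.getvalue()


# A trimmed copy of the output example above.
hobbit = {
    "rowType": "book",
    "goodreadsId": "5907",
    "title": "The Hobbit, or There and Back Again",
    "authors": [{"name": "J.R.R. Tolkien", "goodreadsId": "656983"}],
    "rating": 4.29,
    "ratingCount": 4510933,
    "isbn": "9780547928227",
    "genres": ["Fantasy", "Classics", "Fiction"],
}
print(books_to_csv([hobbit]))
```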
💳 Pricing
This Actor uses pay-per-event pricing. You are charged for each saved Goodreads book. If you enable review scraping, each saved Goodreads review is charged separately. If you enable author/list/series metadata, each saved metadata row is charged separately.
Book and metadata pricing by Apify tier:
| Tier | Price per 1,000 |
|---|---|
| FREE | $1.80 |
| BRONZE | $1.50 |
| SILVER | $1.15 |
| GOLD | $0.90 |
| PLATINUM | $0.90 |
| DIAMOND | $0.90 |
Review rows are $0.90 per 1,000.
Review scraping can create many more rows than book scraping, so keep maxReviewsPerBook small for your first review run.
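To see how quickly review rows add up, here is a small cost estimate based on the prices listed in this section. It covers only the per-row event prices shown here. The function name estimate_cost and the example volumes are illustrative.

```python
# Prices per 1,000 rows, copied from the tier table in this section.
BOOK_PRICE_PER_1000 = {
    "FREE": 1.80,
    "BRONZE": 1.50,
    "SILVER": 1.15,
    "GOLD": 0.90,
    "PLATINUM": 0.90,
    "DIAMOND": 0.90,
}
REVIEW_PRICE_PER_1000 = 0.90


def estimate_cost(tier, books, metadata_rows=0, reviews=0):
    """Estimate the event charges in USD for one run.

    Book and metadata rows share the tier price; review rows use the flat
    review price.
    """
    book_rate = BOOK_PRICE_PER_1000[tier]
    total = (books + metadata_rows) * book_rate / 1000
    total += reviews * REVIEW_PRICE_PER_1000 / 1000
    return round(total, 4)


# 500 books on the FREE tier, with and without 20 reviews per book.
print(estimate_cost("FREE", books=500))
print(estimate_cost("FREE", books=500, reviews=500 * 20))
```

In this example the review rows cost ten times as much as the book rows, which is why a small maxReviewsPerBook is a sensible starting point.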
⚠️ Limits and caveats
- The Actor works with public Goodreads data only.
- It does not scrape private Goodreads data, logged-in-only pages, or account-specific shelves.
- Review extraction is best for visible public review cards on book pages; it does not guarantee every review for books with large review histories.
- Source pages can change, omit fields, or temporarily return no usable data. In those cases, the Actor saves available book rows and reports problems in logs.
- Large runs over shelves, genres, lists, or authors should use bounded limits first, then scale after you inspect the output shape.
❓ FAQ
🔐 Do I need a Goodreads account?
No. The Actor is designed for public Goodreads pages and does not ask for Goodreads cookies, credentials, or an API key.
💬 Can I scrape Goodreads reviews?
Yes. Turn on Include reviews and set Review limit per book above 0. Review rows are separate from book rows so CSV and spreadsheet exports stay easier to work with.
🔗 Can I use direct Goodreads book URLs?
Yes. Add public Goodreads book URLs to the Goodreads URLs input. You can also add shelves, genres, lists, series pages, search pages, and author book-list pages.
🧩 Why are some fields null?
Goodreads does not expose every field on every public page. The Actor keeps missing values as null or empty arrays rather than guessing.
🗓️ Can I schedule this Goodreads scraper?
Yes. After you save an input, you can schedule repeat Apify runs, call the Actor through the Apify API, or connect results to downstream tools with Apify integrations and webhooks.
📝 Changelog
- 0.2: Added ISBN lookup, series URLs, optional author/list/series metadata rows, book filters, and lower tiered pay-per-event pricing.
- 0.1: Initial release.
🆘 Support
For issues, questions, or feature requests, file a ticket and I'll fix or implement it in less than 24h 🫡
🔗 Other actors
- Product Hunt Scraper - Find startup launches, ranks, followers, reviews, and optional website details.
- Reddit Scraper - Search Reddit posts and comments for research, monitoring, and conversation tracking.
- Quora Search Scraper - Find public Quora questions from search terms or direct question URLs.
- Website URL Crawler - Crawl rendered websites and export link maps for SEO, QA, and audits.
- Unsplash Image Scraper - Collect Unsplash image search results with source ranks and image URLs.
Made with ❤️ by Maxime Dupré