Goodreads Books Scraper
Pricing
Pay per usage
Extract detailed book data from Goodreads with the Goodreads Books Scraper. Ideal for building reading lists or analyzing metadata. Note: to scrape more than about 50 books, you must provide cookies in JSON format, because Goodreads only serves the first page of a shelf to visitors who are not logged in.
Developer

Shahid Irfan
Goodreads Book Scraper
Extract comprehensive book data from Goodreads shelves including titles, authors, ratings, reviews, descriptions, ISBNs, genres, and publication details. Perfect for book analysis, market research, reading list creation, and literary data collection.
What does the Goodreads Book Scraper do?
The Goodreads Book Scraper enables you to extract detailed book information from any Goodreads shelf or category. Whether you're building a reading recommendation system, conducting market research, or creating a personal book database, this scraper provides all the data you need.
Key capabilities:
- Extract book details - Titles, authors, ratings, review counts, descriptions, and more
- Automatic pagination - Seamlessly navigate through multiple pages of results
- Fast & efficient - Lightweight design optimized for speed and reliability
- Structured data - Clean JSON output ready for analysis or integration
- Flexible targeting - Scrape any Goodreads shelf by name or URL
- Two scraping modes - Quick overview or detailed book information
Why scrape Goodreads?
Goodreads is the world's largest community of book lovers with over 90 million members and data on millions of books. Access to this data enables:
- Market research - Analyze book trends, popular genres, and reader preferences
- Recommendation systems - Build personalized book recommendation engines
- Content curation - Create reading lists and book collections
- Inventory planning - Track book popularity to inform stocking decisions
- Academic research - Study reading patterns and literary trends
- Personal libraries - Organize and manage your reading lists
How much does it cost to scrape Goodreads?
The cost depends on the number of books you scrape and whether you enable detailed scraping. Here are typical usage estimates:
- 100 books (basic) - ~0.01-0.02 Apify compute units
- 100 books (detailed) - ~0.03-0.05 Apify compute units
- 1,000 books (detailed) - ~0.30-0.50 Apify compute units
Apify provides 5 USD of free credits monthly, enough to scrape thousands of books. For larger projects, paid plans start at $49/month.
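The estimates above can be turned into a rough calculator. This is a minimal sketch that assumes compute usage scales linearly with the number of books, which the listed figures suggest but do not guarantee. The names `RATES_PER_100_BOOKS` and `estimateComputeUnits` are illustrative and not part of the actor.

```javascript
// Per-100-book compute unit ranges, copied from the estimates above.
const RATES_PER_100_BOOKS = {
  basic: { low: 0.01, high: 0.02 },
  detailed: { low: 0.03, high: 0.05 },
};

// Assumption: usage scales linearly with the number of books.
function estimateComputeUnits(books, detailed = true) {
  const rate = detailed ? RATES_PER_100_BOOKS.detailed : RATES_PER_100_BOOKS.basic;
  // Round to 4 decimals to hide floating-point noise.
  const round = (x) => Math.round(x * 10000) / 10000;
  return {
    low: round((books / 100) * rate.low),
    high: round((books / 100) * rate.high),
  };
}

const estimate = estimateComputeUnits(1000, true);
console.log(`1,000 detailed books: ~${estimate.low}-${estimate.high} compute units`);
```

Actual usage depends on proxy type, retries, and page load times, so treat the result as a planning figure only.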
Input configuration
Configure the scraper using these parameters:
Basic settings
| Setting | Description |
|---|---|
| Start URL | Direct URL to a Goodreads shelf (e.g., https://www.goodreads.com/shelf/show/fantasy) |
| Shelf Name | Name of the shelf to scrape (e.g., fantasy, science-fiction, bestsellers) |
| Maximum Books | Number of books to scrape (default: 100) |
| Maximum Pages | Safety limit on pages to visit (default: 10) |
Advanced settings
| Setting | Description |
|---|---|
| Collect Details | Enable to extract full book information including descriptions, ISBNs, and genres (default: enabled) |
| Cookies | Authentication cookies for accessing paginated results (required for pages beyond the first) |
| Proxy Configuration | Proxy settings (residential proxies recommended) |
Example input
{
  "shelf": "fantasy",
  "results_wanted": 100,
  "max_pages": 5,
  "collectDetails": true,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
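If you build the input programmatically, a small helper can apply the documented defaults and catch obvious mistakes before a run starts. The sketch below is an assumption-laden illustration: `buildInput` and `DEFAULT_INPUT` are hypothetical names, the key names are taken from the example input, and only the shelf-name case is covered because the key for a start URL is not shown in this document.

```javascript
// Defaults as documented in the settings tables above.
const DEFAULT_INPUT = {
  results_wanted: 100,
  max_pages: 10,
  collectDetails: true,
};

// Merge caller overrides onto the defaults and run basic sanity checks.
function buildInput(overrides) {
  const input = { ...DEFAULT_INPUT, ...overrides };
  if (typeof input.shelf !== 'string' || input.shelf.trim() === '') {
    throw new Error('shelf must be a non-empty string, e.g. "fantasy"');
  }
  for (const key of ['results_wanted', 'max_pages']) {
    if (!Number.isInteger(input[key]) || input[key] < 1) {
      throw new Error(`${key} must be a positive integer`);
    }
  }
  return input;
}

console.log(buildInput({ shelf: 'fantasy', max_pages: 5 }));
```

Validating locally is cheaper than discovering a bad value after the actor has already started and consumed compute units.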
Output format
The scraper provides structured JSON data for each book:
Basic output (without detailed scraping)
{
  "title": "The Name of the Wind",
  "author": "Patrick Rothfuss",
  "rating": 4.52,
  "ratingCount": 985432,
  "reviewCount": 45678,
  "image": "https://i.gr-assets.com/images/S/...",
  "url": "https://www.goodreads.com/book/show/186074"
}
Detailed output (with detailed scraping enabled)
{
  "title": "The Name of the Wind",
  "author": "Patrick Rothfuss",
  "rating": 4.52,
  "ratingCount": 985432,
  "reviewCount": 45678,
  "description": "Told in Kvothe's own voice, this is the tale of the magically gifted young man...",
  "image": "https://i.gr-assets.com/images/S/...",
  "isbn": "0756404746",
  "publisher": "DAW Books",
  "publishDate": "March 27, 2007",
  "genres": ["Fantasy", "Fiction", "Magic", "Adventure"],
  "url": "https://www.goodreads.com/book/show/186074"
}
Output fields
| Field | Type | Description |
|---|---|---|
| title | string | Book title |
| author | string | Primary author name(s) |
| rating | number | Average rating (0-5 scale) |
| ratingCount | number | Total number of ratings |
| reviewCount | number | Total number of reviews |
| description | string | Book description/synopsis (detailed mode only) |
| image | string | URL to book cover image |
| isbn | string | ISBN identifier (detailed mode only) |
| publisher | string | Publisher name (detailed mode only) |
| publishDate | string | Publication date (detailed mode only) |
| genres | array | List of book genres/categories (detailed mode only) |
| url | string | Goodreads book URL |
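As an illustration of how these fields might be consumed, the sketch below ranks books by `rating` while ignoring titles with too few ratings to be meaningful. The helper name `topRated`, its default thresholds, and the two sample records marked as made up are assumptions for demonstration only.

```javascript
// Return the n highest-rated books that have at least minRatings ratings.
// filter() creates a new array, so the caller's array is not reordered.
function topRated(items, minRatings = 1000, n = 10) {
  return items
    .filter((b) => typeof b.rating === 'number' && (b.ratingCount ?? 0) >= minRatings)
    .sort((a, b) => b.rating - a.rating)
    .slice(0, n);
}

const sample = [
  { title: 'The Name of the Wind', rating: 4.52, ratingCount: 985432 },
  { title: 'Made-up Book A', rating: 4.9, ratingCount: 12 },    // too few ratings
  { title: 'Made-up Book B', rating: 4.1, ratingCount: 50000 },
];

console.log(topRated(sample, 1000, 2).map((b) => b.title));
```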
How to use the Goodreads Book Scraper
Using the Apify Console
- Navigate to the Goodreads Book Scraper on Apify
- Click Try for free
- Enter your configuration:
  - Shelf name (e.g., "fantasy", "bestsellers")
  - Number of books you want to scrape
  - Toggle Collect Details for comprehensive data
- Click Start to begin scraping
- Download results in JSON, CSV, Excel, or HTML format
Using the Apify API
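const { ApifyClient } = require('apify-client');

// Authenticate with your Apify API token.
const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// CommonJS modules do not support top-level await, so wrap the calls.
async function main() {
  // Start the actor and wait for the run to finish.
  const run = await client.actor('YOUR_USERNAME/goodreads-book-scraper').call({
    shelf: 'fantasy',
    results_wanted: 100,
    collectDetails: true,
  });

  // Fetch the scraped books from the run's default dataset.
  const { items } = await client.dataset(run.defaultDatasetId).listItems();
  console.log(items);
}

main().catch(console.error);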
Using as a standalone script
- Clone this repository
- Run npm install
- Configure INPUT.json with your parameters
- Run npm start
Important notes on pagination
Authentication requirement: Goodreads restricts pagination to authenticated users. Visitors who are not logged in can only access the first page (approximately 50 books).
To access multiple pages:
- Log in to Goodreads in your browser
- Open DevTools (F12) → Network tab
- Reload the page and find a request to goodreads.com
- Copy the Cookie header from the request headers
- Paste the cookie value into the "Authentication cookies" field
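If your workflow needs the cookies as structured JSON rather than the raw header string, the copied `Cookie` header can be split into name/value pairs. This is a sketch only: the `{ name, value }` shape and the helper name `parseCookieHeader` are assumptions, the cookie names in the example are placeholders, and this document does not confirm which exact format the actor's cookie field expects.

```javascript
// Split a raw Cookie request header into { name, value } objects.
// Only the first "=" separates name from value; later ones belong to the value.
function parseCookieHeader(header) {
  return header
    .split(';')
    .map((pair) => pair.trim())
    .filter((pair) => pair.includes('='))
    .map((pair) => {
      const idx = pair.indexOf('=');
      return { name: pair.slice(0, idx), value: pair.slice(idx + 1) };
    });
}

// Placeholder header; real Goodreads cookie names will differ.
const cookies = parseCookieHeader('session_id=abc123; token=x=y');
console.log(JSON.stringify(cookies, null, 2));
```

Treat the copied cookies like a password: they grant access to your Goodreads session, so avoid committing them to a repository.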
The scraper will use your cookies to access paginated results. Pagination URLs follow this pattern:
https://www.goodreads.com/shelf/show/fantasy?page=2
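The URL pattern above, combined with the figure of roughly 50 books per page, gives a quick way to see how many pages a run will need. The sketch below is illustrative: `shelfPageUrl`, `pagesNeeded`, and the `BOOKS_PER_PAGE` constant are hypothetical names, and 50 is the approximate page size stated earlier rather than a guaranteed value.

```javascript
const BOOKS_PER_PAGE = 50; // approximate, per the note on unauthenticated access

// Build the URL for a given page of a shelf, following the pattern above.
function shelfPageUrl(shelf, page = 1) {
  const url = new URL(`https://www.goodreads.com/shelf/show/${encodeURIComponent(shelf)}`);
  if (page > 1) url.searchParams.set('page', String(page));
  return url.href;
}

// Estimate how many pages are needed to collect resultsWanted books.
function pagesNeeded(resultsWanted) {
  return Math.ceil(resultsWanted / BOOKS_PER_PAGE);
}

const wanted = 120;
for (let page = 1; page <= pagesNeeded(wanted); page++) {
  console.log(shelfPageUrl('fantasy', page));
}
```

Any result above one page means cookies are needed, and the Maximum Pages setting should be at least the computed page count.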
Popular Goodreads shelves to scrape
Get started quickly with these popular shelves:
- fantasy - Fantasy fiction and magic
- science-fiction - Sci-fi and speculative fiction
- romance - Romance novels
- mystery - Mystery and thriller books
- young-adult - YA fiction
- classics - Classic literature
- non-fiction - Non-fiction works
- biography - Biographies and memoirs
- history - Historical works
- self-help - Self-improvement books
- business - Business books
- philosophy - Philosophy texts
You can find more shelves by browsing Goodreads Shelves.
Scraping best practices
Performance optimization
- Set reasonable limits - Use results_wanted to control scraping volume
- Enable detailed scraping selectively - Disable if you only need basic information
- Use residential proxies - Required for accessing multiple pages
- Implement rate limiting - The scraper includes built-in concurrency controls
Data quality
- Validate output - Check that all expected fields are populated
- Handle missing data - Some books may have incomplete information
- Monitor for changes - Goodreads may update their HTML structure
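One way to act on these points is to check each scraped item for empty or absent fields before using it. The sketch below uses the field names from the output table; the helper name `findMissingFields` and the decision to treat empty strings and empty arrays as missing are assumptions, not behavior of the scraper itself.

```javascript
// Field names taken from the output fields table.
const BASIC_FIELDS = ['title', 'author', 'rating', 'ratingCount', 'reviewCount', 'image', 'url'];
const DETAILED_FIELDS = ['description', 'isbn', 'publisher', 'publishDate', 'genres'];

// Return the names of expected fields that are absent or empty on an item.
function findMissingFields(item, detailed = false) {
  const expected = detailed ? [...BASIC_FIELDS, ...DETAILED_FIELDS] : BASIC_FIELDS;
  return expected.filter((field) => {
    const value = item[field];
    if (value === undefined || value === null) return true;
    if (typeof value === 'string') return value.trim() === '';
    if (Array.isArray(value)) return value.length === 0;
    return false; // numbers (including 0) count as present
  });
}

const partial = { title: 'The Name of the Wind', author: 'Patrick Rothfuss', rating: 4.52 };
console.log(findMissingFields(partial));
```

A sudden rise in missing fields across a whole run is a reasonable signal that Goodreads has changed its page structure.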
Compliance
- Respect robots.txt - The scraper follows Goodreads guidelines
- Don't overload servers - Use appropriate concurrency settings
- Review Terms of Service - Ensure your use case complies with Goodreads policies
- Personal use recommended - Commercial use may require additional consideration
Troubleshooting
No books found on page 2+
Solution: You need to provide authentication cookies. See the pagination section above.
Scraper returns incomplete data
Solution: Enable "Collect Details" to fetch comprehensive book information.
Rate limiting or blocked requests
Solution: Use residential proxies and reduce concurrency if needed.
Outdated selectors
Solution: Goodreads occasionally updates their website. Contact support if selectors need updating.
Use cases
Market Research
Analyze book trends, identify popular genres, and understand reader preferences to make data-driven publishing decisions.
Recommendation Systems
Build sophisticated book recommendation engines using ratings, genres, and reader reviews.
Academic Research
Study literary trends, analyze reading patterns, and conduct research on book popularity and cultural impact.
Content Creation
Create curated reading lists, book blogs, and literary content based on comprehensive book data.
Personal Library Management
Organize your reading lists, track books to read, and manage your personal book collection.
Support
Need help? Have questions?
- Documentation: Check out the detailed Apify documentation
- Community: Join the Apify Discord
- Issues: Report bugs or request features on the GitHub repository
Related actors
Explore similar scrapers:
- Amazon Book Scraper - Extract book data from Amazon
- Barnes & Noble Scraper - Scrape B&N book listings
- Google Books Scraper - Extract data from Google Books
- Book Price Monitor - Track book prices across platforms
Built with ❤️ for the reading community. Happy scraping!