Goodreads Booklist Scraper
Pricing
$4.99/month + usage
Scrape book data from Goodreads including titles, authors, ratings, and publication info using AWS Lambda API.
Developer: ZeroBreak (Maintained by Community)
Goodreads Booklist Scraper
A powerful Apify Actor that scrapes book information from Goodreads through an AWS Lambda API.
Features
- ✅ Multiple Search Queries: Scrape multiple search terms in one run
- ✅ Concurrent Processing: Process multiple queries simultaneously (up to 10 concurrent requests)
- ✅ Flexible Configuration: Custom pages and books per page for each query
- ✅ Rate Limiting: Built-in cap on the number of concurrent requests (see maxConcurrentRequests)
- ✅ Error Handling: Robust error handling with detailed logging
- ✅ AWS Lambda Integration: Uses serverless Lambda function for scraping
Input
Required Fields
searchQueries (Array)
List of search terms to scrape. Can be:
- Simple strings:
  ["python programming", "machine learning"]
- Objects with custom settings:
  [{"searchTerm": "tolkien", "pages": 5, "booksPerPage": 30}]
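Because `searchQueries` accepts both plain strings and objects, it can help to see how the two forms resolve to the same shape. The sketch below is a hypothetical client-side helper (`normalize_queries` is not part of the actor), and it assumes that per-query settings override the top-level `pages` and `booksPerPage` defaults, which is how the field descriptions read.

```python
from typing import Any


def normalize_queries(actor_input: dict[str, Any]) -> list[dict[str, Any]]:
    """Expand mixed string/object queries into uniform dicts.

    Assumption: per-query settings override the top-level defaults.
    """
    default_pages = actor_input.get("pages", 1)          # documented default: 1
    default_books = actor_input.get("booksPerPage", 20)  # documented default: 20

    normalized = []
    for query in actor_input["searchQueries"]:
        if isinstance(query, str):
            # A bare string is shorthand for an object with only a search term.
            query = {"searchTerm": query}
        normalized.append({
            "searchTerm": query["searchTerm"],
            "pages": query.get("pages", default_pages),
            "booksPerPage": query.get("booksPerPage", default_books),
        })
    return normalized
```

Under that assumption, a string entry simply inherits whatever defaults the run input sets, while an object entry only needs to list the settings it wants to change.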
Optional Fields
pages (Integer)
- Default number of pages to scrape per search term
- Range: 1-15
- Default: 1
booksPerPage (Integer)
- Default maximum books to return per page
- Range: 1-100
- Default: 20
maxConcurrentRequests (Integer)
- Maximum number of concurrent API requests
- Range: 1-10
- Default: 5
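The ranges and defaults above can be checked before a run is started, so that an out-of-range value fails locally rather than inside the actor. This is a minimal sketch: `LIMITS` and `check_optional_fields` are hypothetical names, and the actor's own validation may behave differently (for example, it may clamp values instead of rejecting them).

```python
# Documented ranges for the optional fields: (minimum, maximum, default).
LIMITS = {
    "pages": (1, 15, 1),
    "booksPerPage": (1, 100, 20),
    "maxConcurrentRequests": (1, 10, 5),
}


def check_optional_fields(actor_input: dict) -> dict:
    """Return the optional fields with defaults applied, or raise ValueError."""
    resolved = {}
    for name, (low, high, default) in LIMITS.items():
        value = actor_input.get(name, default)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{name} must be an integer, got {value!r}")
        if not low <= value <= high:
            raise ValueError(f"{name} must be in {low}-{high}, got {value}")
        resolved[name] = value
    return resolved
```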
Output
The actor stores results in the Apify dataset. Each result contains:
{
  "search_term": "python programming",
  "status": "success",
  "pages_requested": 5,
  "books_per_page": 15,
  "data": {
    "search_term": "python programming",
    "total_pages_scraped": 5,
    "total_books_found": 75,
    "books": [
      {
        "id": "123456",
        "title": "Python Crash Course",
        "url": "https://www.goodreads.com/book/show/123456",
        "cover_image": "https://...",
        "authors": [
          {
            "name": "Eric Matthes",
            "url": "https://www.goodreads.com/author/show/...",
            "role": "Author"
          }
        ],
        "average_rating": 4.5,
        "ratings_count": 12345,
        "publication_year": 2019,
        "publication_info": "published 2019",
        "rank": 1
      }
    ]
  }
}
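Each dataset item nests its books under `data.books`, so downstream tools such as spreadsheets usually want one flat row per book. The following sketch shows one way to do that for items shaped like the sample above. `iter_book_rows` is a hypothetical helper, and it assumes that items whose `status` is not "success" carry no usable `data` and can be skipped.

```python
from typing import Any, Iterator


def iter_book_rows(items: list[dict[str, Any]]) -> Iterator[dict[str, Any]]:
    """Yield one flat row per book from dataset items shaped like the sample."""
    for item in items:
        if item.get("status") != "success":
            continue  # assumption: unsuccessful items have no usable "data"
        for book in item["data"]["books"]:
            yield {
                "search_term": item["search_term"],
                "rank": book["rank"],
                "title": book["title"],
                # Join multiple authors into a single display string.
                "authors": ", ".join(a["name"] for a in book["authors"]),
                "average_rating": book["average_rating"],
                "ratings_count": book["ratings_count"],
                "publication_year": book["publication_year"],
                "url": book["url"],
            }
```

The resulting rows can be passed directly to `csv.DictWriter` or loaded into a DataFrame.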
Example Input
Simple Example
{
  "searchQueries": [
    "python programming",
    "machine learning",
    "web development"
  ],
  "pages": 3,
  "booksPerPage": 25
}
Advanced Example
{
  "searchQueries": [
    "harry potter",
    {"searchTerm": "tolkien", "pages": 10, "booksPerPage": 50},
    {"searchTerm": "stephen king", "pages": 5, "booksPerPage": 30}
  ],
  "pages": 1,
  "booksPerPage": 20,
  "maxConcurrentRequests": 3
}
Usage Limits
- Pages per query: 1-15 (Lambda enforced)
- Books per page: 1-100 (Lambda enforced)
- Concurrent requests: 1-10 (Actor enforced)
- Request timeout: 300 seconds (5 minutes)
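To illustrate how a concurrency cap and a per-request timeout interact, here is a minimal sketch using `asyncio.Semaphore` and `asyncio.wait_for`. It is not the actor's implementation: `run_limited` and the `fetch` callable are hypothetical stand-ins for whatever the actor uses to call its Lambda function, and the `"error"` status value is illustrative only.

```python
import asyncio

REQUEST_TIMEOUT_SECONDS = 300  # documented request timeout


async def run_limited(queries, fetch, max_concurrent=5,
                      timeout=REQUEST_TIMEOUT_SECONDS):
    """Run fetch(query) for every query, at most max_concurrent at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def one(query):
        # Only max_concurrent coroutines can hold the semaphore at once.
        async with semaphore:
            try:
                data = await asyncio.wait_for(fetch(query), timeout=timeout)
                return {"search_term": query, "status": "success", "data": data}
            except asyncio.TimeoutError:
                # Status and field names here are illustrative assumptions.
                return {"search_term": query, "status": "error", "error": "timeout"}

    # gather preserves the order of the input queries in its result list.
    return await asyncio.gather(*(one(q) for q in queries))
```

With this pattern a slow query only occupies one of the available slots, and a query that exceeds the timeout produces an error record instead of stalling the whole run.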
Use Cases
- 📖 Book research and analysis
- 📊 Market research for publishers
- 🎓 Academic research
- 📚 Reading list generation
- 🔍 Book discovery and recommendations
Error Handling
The actor handles various error scenarios:
- Network errors: Timeout and connection issues
- API errors: Invalid API key, rate limiting
- Lambda errors: Scraping failures
- Invalid input: Missing search queries
All errors are logged and included in the output for debugging.
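Since errors are included in the output alongside successful results, a consumer will typically want to separate the two before processing. The sketch below assumes only what the sample output shows, namely that successful items have `"status": "success"`. The shape of failed items is not documented here, so anything with a different or missing status is treated as a failure. `split_by_status` is a hypothetical helper.

```python
def split_by_status(items):
    """Separate successful dataset items from everything else."""
    succeeded = [item for item in items if item.get("status") == "success"]
    # Assumption: any other (or missing) status means the query failed.
    failed = [item for item in items if item.get("status") != "success"]
    return succeeded, failed


def failed_terms(items):
    """List the search terms that may be worth retrying."""
    _, failed = split_by_status(items)
    return [item.get("search_term", "<unknown>") for item in failed]
```

The list returned by `failed_terms` can be fed back in as `searchQueries` for a follow-up run.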
Support
For issues or questions:
- Check the actor logs for detailed error messages
- Verify your environment variables are set correctly
- Ensure your Lambda function is running and accessible

