Goodreads Book Scraper
Under maintenance

Pricing

$9.00 / 1,000 results

Maintained by Community

Goodreads Book Scraper is an Apify Actor that extracts book details from Goodreads search results. It retrieves the title, author, rating, ratings count, published year, editions count, book URL, and cover image URL, outputting the data in structured JSON format.

Last modified

2 hours ago

Goodreads Book Scraper is an Apify Actor that collects book details from Goodreads search results. The actor uses a search term to find books, scrapes key information from the results, and stops pagination once the desired number of books has been reached.

Features

  • Searches Goodreads for books using a specified search term.
  • Scrapes key details including title, author, average rating, ratings count (with commas removed), published year, editions count, book URL (with query parameters removed), and cover image URL.
  • Supports pagination with configurable maximum pages and maximum books.
  • Stops fetching new pages once the desired number of books is reached.
  • Optionally uses the Apify Proxy for request management.
  • Make.com Optimized: Built-in rate limiting, concurrency control, and error handling for seamless workflow integration.
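
The two clean-up steps mentioned in the feature list (removing commas from the ratings count and removing query parameters from the book URL) can be sketched with the Python standard library. This is only an illustration of the idea, not the actor's actual code; the function names `parse_ratings_count` and `strip_query` are hypothetical.

```python
from urllib.parse import urlsplit


def parse_ratings_count(text: str) -> int:
    """Turn a display string such as '1,234,567' into an integer."""
    return int(text.replace(",", "").strip())


def strip_query(url: str) -> str:
    """Return the URL with its query string removed."""
    # urlsplit returns a named tuple, so _replace can blank out the query.
    return urlsplit(url)._replace(query="").geturl()
```

Storing the count as an integer and the URL without tracking parameters makes records easier to sort and de-duplicate downstream.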

Input

  • searchTerm: The term to search for in Goodreads books (e.g., "architecture").
  • bookMax: The maximum number of books to scrape. The actor keeps paging through search results until this number is reached.
  • useApifyProxy: Boolean flag that controls whether requests go through the Apify Proxy.
  • delayBetweenRequests: Delay in milliseconds between scraping requests (default: 1000ms).
  • maxConcurrency: Maximum number of concurrent requests (default: 1, safe for Make.com).
  • timeout: Request timeout in seconds (default: 30s).
  • enableDetailedScraping: When enabled, the actor visits each book page individually to extract additional details (pages, format, description, genres, author info). This is slower but returns much richer data (default: true).
  • Additional proxy configuration options are available.
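
As a rough sketch, the input can be assembled in Python before it is passed to the actor. The key names and default values come from the list above; the helper `build_input` and its validation rules are assumptions made for illustration, and `useApifyProxy` is left out unless set because no default is documented for it.

```python
import json
from typing import Any

# Defaults as documented in the Input list.
DOCUMENTED_DEFAULTS: dict[str, Any] = {
    "delayBetweenRequests": 1000,  # milliseconds
    "maxConcurrency": 1,
    "timeout": 30,  # seconds
    "enableDetailedScraping": True,
}
ALLOWED_OVERRIDES = set(DOCUMENTED_DEFAULTS) | {"useApifyProxy"}


def build_input(search_term: str, book_max: int, **overrides: Any) -> dict[str, Any]:
    """Merge required fields, documented defaults, and caller overrides."""
    # The checks below are assumed sanity rules, not the actor's own validation.
    if not search_term.strip():
        raise ValueError("searchTerm must not be empty")
    if book_max < 1:
        raise ValueError("bookMax must be at least 1")
    unknown = set(overrides) - ALLOWED_OVERRIDES
    if unknown:
        raise ValueError(f"unknown input keys: {sorted(unknown)}")
    run_input = {"searchTerm": search_term, "bookMax": book_max}
    run_input.update(DOCUMENTED_DEFAULTS)
    run_input.update(overrides)
    return run_input


if __name__ == "__main__":
    print(json.dumps(build_input("architecture", 50, useApifyProxy=True), indent=2))
```

The printed JSON can be pasted into the actor's input editor as a starting point.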

Output

The actor outputs a JSON dataset where each record represents a book with the following fields:

  • title: The title of the book.
  • author: The author of the book.
  • rating: The average rating (0-5 scale).
  • ratingsCount: The number of ratings (integer).
  • published: The published year.
  • editions: The number of editions.
  • url: The book URL without query parameters.
  • coverUrl: The cover image URL.

Detailed Fields (When enableDetailedScraping = true)

  • pages: Number of pages in the book.
  • format: Book format (Paperback, Hardcover, Kindle, ebook, etc.).
  • firstPublished: Original publication date.
  • currentlyReading: Number of people currently reading the book.
  • wantToRead: Number of people who want to read the book.
  • description: Full book description/summary.
  • genres: Comma-separated list of book genres.
  • authorBooks: Number of books by the author.
  • authorFollowers: Number of author followers on Goodreads.
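
For downstream processing, a record can be mapped onto a typed structure. The sketch below assumes the field names listed above and that detailed fields may be missing in simple mode; the `Book` class and `book_from_record` helper are hypothetical, and the sample values are made up rather than real actor output.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Book:
    title: str
    author: str
    rating: float
    ratings_count: int
    url: str
    genres: list[str] = field(default_factory=list)
    pages: Optional[int] = None


def book_from_record(record: dict[str, Any]) -> Book:
    """Build a Book from one dataset record, tolerating missing detailed fields."""
    raw_genres = record.get("genres") or ""
    pages = record.get("pages")
    return Book(
        title=record["title"],
        author=record["author"],
        rating=float(record["rating"]),
        ratings_count=int(record["ratingsCount"]),
        url=record["url"],
        # genres arrives as one comma-separated string; split it into a list.
        genres=[g.strip() for g in raw_genres.split(",") if g.strip()],
        pages=int(pages) if pages is not None else None,
    )


# Made-up record for illustration only.
sample = {
    "title": "Example Title",
    "author": "Example Author",
    "rating": 4.1,
    "ratingsCount": 1234,
    "url": "https://www.goodreads.com/book/show/0.Example",
    "genres": "Architecture, Nonfiction, Design",
}
book = book_from_record(sample)
```

Splitting `genres` into a list up front avoids repeated string handling when filtering by genre later.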

Usage

To use the actor:

  1. Provide the required input parameters in the actor’s input configuration.
  2. Deploy and run the actor on the Apify platform.
  3. The actor will fetch results page by page until the maximum number of books is reached or the maximum number of pages is scraped.
  4. The resulting data will be stored in the default Apify dataset.
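
The stopping rule in step 3 can be sketched as a small loop. This is not the actor's source code: `fetch_page` stands in for whatever function scrapes one search results page, `max_pages` is a hypothetical name for the page limit, and treating an empty page as the end of results is an assumption.

```python
from typing import Callable


def collect_books(
    fetch_page: Callable[[int], list[dict]],
    book_max: int,
    max_pages: int,
) -> list[dict]:
    """Fetch pages in order until book_max books or max_pages pages are reached."""
    books: list[dict] = []
    for page in range(1, max_pages + 1):
        results = fetch_page(page)
        if not results:
            break  # assumption: an empty page means there are no more results
        books.extend(results)
        if len(books) >= book_max:
            break  # enough books collected, so no further pages are requested
    # The last page may overshoot the limit, so trim the surplus.
    return books[:book_max]
```

Checking the count after each page, rather than after each book, keeps the loop simple at the cost of occasionally fetching a few results that are then discarded.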

Scraping Modes

This actor offers two scraping modes to balance speed and detail:

Detailed Mode (Default)

  • Comprehensive: Visits each book page individually
  • Rich data: Extracts additional fields like description, genres, format, pages, author stats
  • Slower: Takes more time due to individual page visits
  • Best for: Research, detailed analysis, complete book profiles

Simple Mode

  • Fast: Only extracts data from search result pages
  • Efficient: Processes multiple books quickly
  • Basic fields: title, author, rating, ratingsCount, published, editions, url, coverUrl
  • Best for: Large datasets, basic book information, performance-critical workflows

To enable simple mode, set enableDetailedScraping to false in the input.
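
To get a feel for the speed difference between the two modes, the request count can be estimated with simple arithmetic. The sketch below assumes one request per search results page plus, in detailed mode, one request per book; `results_per_page` is an assumed parameter, since the number of results per page is not stated here. The time figure is a lower bound that counts only the configured delay with maxConcurrency set to 1 and ignores network latency.

```python
import math


def estimate_requests(book_max: int, results_per_page: int, detailed: bool) -> int:
    """Search pages needed, plus one book page per book in detailed mode."""
    pages = math.ceil(book_max / results_per_page)
    return pages + (book_max if detailed else 0)


def estimate_min_seconds(requests: int, delay_ms: int) -> float:
    """Total delay when requests run one at a time with a pause between them."""
    return max(requests - 1, 0) * delay_ms / 1000


if __name__ == "__main__":
    for detailed in (False, True):
        n = estimate_requests(book_max=50, results_per_page=20, detailed=detailed)
        print(detailed, n, estimate_min_seconds(n, delay_ms=1000))
```

Under these assumptions, detailed mode scales with the number of books rather than the number of pages, which is why simple mode suits large datasets.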

Make.com Integration

This actor is tuned for Make.com (formerly Integromat) workflows. Recommended settings:

  • delayBetweenRequests: 2000-5000ms (reduces the risk of rate limiting)
  • maxConcurrency: 1 (ensures stable workflow execution)
  • timeout: 60s (gives enough time for complex pages)
  • enableDetailedScraping: true (recommended for comprehensive data)

Workflow Benefits:

  • Stable Execution: Built-in rate limiting reduces the risk of requests being blocked
  • Error Handling: Clear success/failure indicators for conditional logic
  • Data Consistency: Proper data types for filtering and mapping operations
  • Metadata Tracking: Timestamps and positioning for data processing workflows
  • Proxy Support: Reliable scraping with Apify Proxy integration

Common Use Cases:

  • CRM Integration: Automatically add books to customer databases
  • Content Management: Populate book catalogs and recommendation systems
  • Research Automation: Collect book data for academic or market research
  • Social Media: Automate book-related content creation and posting
  • E-commerce: Build comprehensive book databases for online stores
  • Analytics: Collect detailed book metadata for market analysis

This tool is intended for personal, educational, and research purposes only. Please ensure your use complies with Goodreads' terms and conditions. All content scraped belongs to Goodreads and its respective owners.