# Goodreads Review Scraper (`kawsar/goodreads-review-scraper`) Actor

Goodreads review scraper that collects book reviews, star ratings, and reviewer profiles without login or authentication, giving authors and researchers clean data for sentiment analysis and competitive research.

- **URL**: https://apify.com/kawsar/goodreads-review-scraper.md
- **Developed by:** [Kawsar](https://apify.com/kawsar) (community)
- **Categories:** Developer tools, Automation, Other
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor itself is free to use; you pay only for the Apify platform usage, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Goodreads Review Scraper: Extract Book Reviews, Ratings, and Reviewer Profiles

Goodreads Review Scraper pulls reviews for any book on Goodreads using Goodreads' internal API. No login or authentication is needed. Paste in one or more book URLs, set how many reviews you want per book, and run.

You get structured data including review text, star ratings, reviewer profiles, shelf labels, and tags, ready to export as CSV or JSON for analysis.

Unlike scrapers that parse HTML, this Actor talks directly to the same API that Goodreads uses in its own interface, so results come back fast and in a consistent format.

### Use cases

- **Sentiment analysis**: collect hundreds of reader reviews to train classifiers or run opinion mining on reader reactions to a book
- **Author research**: pull reviews for your own books or titles you compete with, and see what readers actually say
- **Book recommendation engines**: gather ratings and tags to power recommendation logic or similarity scoring
- **Market research**: analyze reader responses to new releases in a specific genre or category
- **Academic research**: build datasets of reader reviews for literary analysis or sociological studies
- **Publishing decisions**: check reader feedback on comparable titles before commissioning similar work

### Input

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `bookUrl` | string | | Single Goodreads book page URL. Use for one book at a time. |
| `bookUrls` | array | | List of Goodreads book page URLs, one per line. Use for batch runs across multiple books. |
| `maxReviews` | integer | 100 | Maximum reviews to collect per book. Hard cap of 1000. |
| `timeoutSecs` | integer | 300 | Overall actor timeout in seconds. |
| `requestTimeoutSecs` | integer | 30 | Per-request timeout in seconds. |
| `proxyConfiguration` | object | Datacenter (Anywhere) | Proxy type and location for requests. Supports Datacenter, Residential, Special, and custom proxies. Optional. |

You can use `bookUrl`, `bookUrls`, or both at the same time. Duplicate URLs are ignored.
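
As a rough illustration of how the two URL parameters could be combined, the sketch below merges `bookUrl` and `bookUrls` into one ordered list and drops duplicates. The helper name `collect_book_urls` is hypothetical, and the exact-match comparison after trimming whitespace is an assumption; the Actor's own de-duplication logic is not documented here and may normalize URLs differently.

```python
from typing import Any


def collect_book_urls(actor_input: dict[str, Any]) -> list[str]:
    """Merge `bookUrl` and `bookUrls` into one ordered, de-duplicated list.

    Hypothetical helper: duplicates are detected by exact string match
    after trimming whitespace, which is an assumption about the Actor.
    """
    candidates: list[str] = []
    single = actor_input.get("bookUrl")
    if single:
        candidates.append(single)
    candidates.extend(actor_input.get("bookUrls") or [])

    seen: set[str] = set()
    unique: list[str] = []
    for url in candidates:
        key = url.strip()
        if key and key not in seen:
            seen.add(key)
            unique.append(key)
    return unique
```

Keeping the first occurrence of each URL preserves the order in which the books were supplied.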

#### Example: single book

```json
{
    "bookUrl": "https://www.goodreads.com/book/show/3.Harry_Potter_and_the_Sorcerer_s_Stone",
    "maxReviews": 100
}
````

#### Example: multiple books

```json
{
    "bookUrls": [
        "https://www.goodreads.com/book/show/3.Harry_Potter_and_the_Sorcerer_s_Stone",
        "https://www.goodreads.com/book/show/5907.The_Hobbit",
        "https://www.goodreads.com/book/show/7260188-the-hunger-games"
    ],
    "maxReviews": 200
}
```

### What data does this Actor extract?

The Actor stores results in an Apify dataset. Each entry is one review:

```json
{
    "reviewId": "kca://review:goodreads/amzn1.gr.review:goodreads.v1.tI_H8-8bJGQv1O4IIOpeTA",
    "reviewUrl": "https://www.goodreads.com/review/show/2280609898",
    "reviewText": "<b>4.5 Stars!</b><br /><br />Buddy read with...",
    "reviewTextPlain": "4.5 Stars! Buddy read with the one who started it all!...",
    "rating": 4,
    "createdAt": "2018-02-01T20:15:18+00:00",
    "updatedAt": "2021-02-19T16:45:57+00:00",
    "lastRevisionAt": "2018-03-20T18:24:36+00:00",
    "spoilerStatus": false,
    "likeCount": 114,
    "commentCount": 17,
    "recommendFor": null,
    "userId": 44125660,
    "userName": "Beth",
    "userUrl": "https://www.goodreads.com/user/show/44125660-beth",
    "userImageUrl": "https://i.gr-assets.com/images/S/compressed.photo.goodreads.com/users/...",
    "isAuthor": false,
    "userFollowersCount": 627,
    "userTextReviewsCount": 929,
    "shelfName": "read",
    "shelfDisplayName": "Read",
    "shelfUrl": "https://www.goodreads.com/review/list/44125660?shelf=read",
    "tags": ["2018-reads", "audiobook", "book-series", "4-stars", "young-adult-fantasy"],
    "workId": "kca://work/amzn1.gr.work.v1.TbpxJa2CwiSSz_9W2FruoA",
    "scrapedAt": "2025-01-15T10:30:00+00:00"
}
```

| Field | Type | Description |
|-------|------|-------------|
| `reviewId` | string | Unique review identifier |
| `reviewUrl` | string | Direct link to the review |
| `reviewText` | string | Full review text with HTML markup |
| `reviewTextPlain` | string | Review text with HTML stripped |
| `rating` | integer | Star rating (1 to 5, or null if no rating given) |
| `createdAt` | string | When the review was first posted |
| `updatedAt` | string | When the review was last updated |
| `lastRevisionAt` | string | When the review text was last revised |
| `spoilerStatus` | boolean | Whether the review is flagged as a spoiler |
| `likeCount` | integer | Number of likes |
| `commentCount` | integer | Number of comments |
| `recommendFor` | string | Who the reviewer recommends the book for, if set |
| `userId` | integer | Reviewer's numeric ID |
| `userName` | string | Reviewer's display name |
| `userUrl` | string | Link to the reviewer's profile |
| `userImageUrl` | string | Reviewer's profile picture URL |
| `isAuthor` | boolean | Whether the reviewer is a published author on Goodreads |
| `userFollowersCount` | integer | How many followers the reviewer has |
| `userTextReviewsCount` | integer | Total reviews written by the reviewer |
| `shelfName` | string | Shelf slug (e.g. `read`, `currently-reading`) |
| `shelfDisplayName` | string | Human-readable shelf name (e.g. `Read`) |
| `shelfUrl` | string | Link to the reviewer's shelf |
| `tags` | array | Custom shelf tags applied by the reviewer |
| `workId` | string | The book's internal Goodreads identifier |
| `scrapedAt` | string | When this review was collected |
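
To show how the fields above might be consumed, here is a minimal sketch that summarizes a list of dataset items. The function name `summarize_reviews` and the shape of its return value are illustrative choices, not part of the Actor; the field names (`rating`, `tags`, `createdAt`, `spoilerStatus`) come from the table above.

```python
from collections import Counter
from datetime import datetime
from statistics import mean
from typing import Any


def summarize_reviews(items: list[dict[str, Any]]) -> dict[str, Any]:
    """Compute a few aggregate figures from scraped review items."""
    # `rating` may be null when the reviewer gave no stars, so skip those.
    ratings = [item["rating"] for item in items if item.get("rating") is not None]
    # Count every custom shelf tag across all reviews.
    tag_counts = Counter(tag for item in items for tag in item.get("tags") or [])
    # Timestamps such as "2018-02-01T20:15:18+00:00" parse with fromisoformat.
    created = [
        datetime.fromisoformat(item["createdAt"])
        for item in items
        if item.get("createdAt")
    ]
    return {
        "reviewCount": len(items),
        "ratedCount": len(ratings),
        "averageRating": mean(ratings) if ratings else None,
        "spoilerCount": sum(1 for item in items if item.get("spoilerStatus")),
        "topTags": tag_counts.most_common(3),
        "earliestReview": min(created).isoformat() if created else None,
    }
```

The items can come from `client.dataset(...).iterate_items()` as in the Python example in the API section, or from an exported JSON file.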

### How it works

1. The Actor fetches each book page URL and extracts the internal book identifier from the page.
2. It then calls the Goodreads API, using cursor-based pagination to collect reviews page by page.
3. Each page returns up to 30 reviews; the Actor keeps fetching until it reaches your `maxReviews` limit or runs out of reviews.
4. For batch runs, the Actor processes each book in order and pushes all results into the same dataset.
5. Each review is pushed to the dataset as it arrives.
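
The pagination loop in steps 2 and 3 can be sketched generically as follows. This is not the Actor's source code: `fetch_page` is a hypothetical callable standing in for the Goodreads API request, and the assumption is that it returns a list of reviews plus the cursor for the next page (`None` when no pages remain). Only the page size of 30 comes from the description above.

```python
import math
from typing import Callable, Iterator, Optional

PAGE_SIZE = 30  # page size stated in the steps above

# Hypothetical fetcher: takes a cursor (None for the first page) and
# returns (reviews, next_cursor); next_cursor is None on the last page.
FetchPage = Callable[[Optional[str]], tuple[list[dict], Optional[str]]]


def iter_reviews(fetch_page: FetchPage, max_reviews: int) -> Iterator[dict]:
    """Yield reviews page by page until the limit or the last page is reached."""
    cursor: Optional[str] = None
    yielded = 0
    while yielded < max_reviews:
        reviews, cursor = fetch_page(cursor)
        for review in reviews:
            if yielded >= max_reviews:
                return
            yield review
            yielded += 1
        # Stop when the API reports no further page or returns an empty one.
        if cursor is None or not reviews:
            return


def pages_needed(max_reviews: int, page_size: int = PAGE_SIZE) -> int:
    """Upper bound on the number of page requests for one book."""
    return math.ceil(max_reviews / page_size)
```

Under these assumptions, a book with the default `maxReviews` of 100 needs at most `pages_needed(100)` page requests, which is 4.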

### FAQ

**Do I need a Goodreads account to use this?**
No. The Actor requires no login or personal credentials.

**How many reviews can I collect per run?**
Up to 1000 per book. The per-run total is the number of unique book URLs multiplied by `maxReviews`: for example, 5 book URLs with `maxReviews: 200` yield up to 1000 reviews in that run.

**Does this work for all books on Goodreads?**
It works for any book with publicly visible reviews on Goodreads.
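
The per-run limit discussed in the FAQ reduces to a small calculation, sketched below. The function name `max_total_reviews` is hypothetical, and clamping values above 1000 is an assumption made for illustration; the platform may instead reject such input because the input schema declares a maximum of 1000.

```python
MAX_REVIEWS_CAP = 1000  # per-book hard cap stated in the Input section


def max_total_reviews(book_count: int, max_reviews: int = 100) -> int:
    """Upper bound on reviews collected in one run.

    `book_count` is the number of unique book URLs, since duplicates
    are ignored. Values above the cap are clamped here (an assumption).
    """
    per_book = min(max_reviews, MAX_REVIEWS_CAP)
    return book_count * per_book
```

For the FAQ example, `max_total_reviews(5, 200)` gives 1000. The actual count can be lower when a book has fewer public reviews than the limit.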

### Integrations

Connect Goodreads Review Scraper with other apps using [Apify integrations](https://apify.com/integrations). You can pipe results into Google Sheets, send notifications via Slack, trigger workflows in Make or Zapier, or sync with Airbyte. You can also use [webhooks](https://docs.apify.com/integrations/webhooks) to trigger actions as soon as results are ready.
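
As a hedged sketch of the webhook flow, the function below pulls the dataset ID out of a webhook payload so that a receiver knows where to fetch the results. It assumes the webhook uses Apify's default payload template, in which `eventType` names the event and `resource` holds the run object; a customized payload template would need different handling. The name `dataset_id_from_webhook` is hypothetical.

```python
from typing import Any, Optional


def dataset_id_from_webhook(payload: dict[str, Any]) -> Optional[str]:
    """Return the run's default dataset ID for a successful run, else None.

    Assumes the default webhook payload template, where `resource`
    is the run object and includes `defaultDatasetId`.
    """
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")
```

The returned ID can then be passed to `client.dataset(...)` as in the Python example in the API section.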

# Actor input Schema

## `bookUrl` (type: `string`):

Single Goodreads book page URL. Use this for one book. For multiple books at once, use the Book URLs list below.

## `bookUrls` (type: `array`):

List of Goodreads book page URLs to scrape. Add one URL per line. The Actor collects up to `maxReviews` reviews for each book.

## `maxReviews` (type: `integer`):

Maximum number of reviews to collect per book. Capped at 1000.

## `timeoutSecs` (type: `integer`):

Overall actor timeout in seconds.

## `requestTimeoutSecs` (type: `integer`):

Timeout for each individual API request in seconds.

## `proxyConfiguration` (type: `object`):

Select proxies to use for requests. Helps avoid IP blocking and rate limits. Datacenter proxies are fastest; Residential proxies are harder to detect.

## Actor input object example

```json
{
  "bookUrl": "https://www.goodreads.com/book/show/3.Harry_Potter_and_the_Sorcerer_s_Stone",
  "bookUrls": [
    "https://www.goodreads.com/book/show/3.Harry_Potter_and_the_Sorcerer_s_Stone",
    "https://www.goodreads.com/book/show/5907.The_Hobbit"
  ],
  "maxReviews": 100,
  "timeoutSecs": 300,
  "requestTimeoutSecs": 30,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
```
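
Before starting a run, it can help to check an input object locally. The sketch below is one possible pre-flight check; the numeric bounds are copied from the input schema in the OpenAPI specification later in this document. The helper name `validate_input` is hypothetical, and the rule that at least one URL must be present is an assumption, since the schema lists no required fields.

```python
from typing import Any

# (minimum, maximum) bounds taken from the input schema in the OpenAPI specification.
INTEGER_BOUNDS = {
    "maxReviews": (1, 1000),
    "timeoutSecs": (30, 3600),
    "requestTimeoutSecs": (5, 120),
}


def validate_input(actor_input: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems: list[str] = []
    # Assumption: a run without any book URL has nothing to scrape.
    if not actor_input.get("bookUrl") and not actor_input.get("bookUrls"):
        problems.append("provide bookUrl, bookUrls, or both")
    for name, (low, high) in INTEGER_BOUNDS.items():
        if name not in actor_input:
            continue  # omitted fields fall back to the schema defaults
        value = actor_input[name]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            problems.append(f"{name} must be an integer")
        elif not low <= value <= high:
            problems.append(f"{name} must be between {low} and {high}")
    return problems
```

Returning a list instead of raising on the first problem lets a caller report every issue at once.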

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "bookUrl": "https://www.goodreads.com/book/show/3.Harry_Potter_and_the_Sorcerer_s_Stone",
    "proxyConfiguration": {
        "useApifyProxy": true
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("kawsar/goodreads-review-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "bookUrl": "https://www.goodreads.com/book/show/3.Harry_Potter_and_the_Sorcerer_s_Stone",
    "proxyConfiguration": { "useApifyProxy": True },
}

# Run the Actor and wait for it to finish
run = client.actor("kawsar/goodreads-review-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "bookUrl": "https://www.goodreads.com/book/show/3.Harry_Potter_and_the_Sorcerer_s_Stone",
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}' |
apify call kawsar/goodreads-review-scraper --silent --output-dataset
```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=kawsar/goodreads-review-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}
```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Goodreads Review Scraper",
        "description": "Goodreads review scraper that collects book reviews, star ratings, and reviewer profiles without login or authentication, giving authors and researchers clean data for sentiment analysis and competitive research.",
        "version": "0.0",
        "x-build-id": "fDMCREDr6IF14dvEr"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/kawsar~goodreads-review-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-kawsar-goodreads-review-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/kawsar~goodreads-review-scraper/runs": {
            "post": {
                "operationId": "runs-sync-kawsar-goodreads-review-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/kawsar~goodreads-review-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-kawsar-goodreads-review-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "bookUrl": {
                        "title": "Book URL",
                        "type": "string",
                        "description": "Single Goodreads book page URL. Use this for one book. For multiple books at once, use the Book URLs list below."
                    },
                    "bookUrls": {
                        "title": "Book URLs",
                        "type": "array",
                        "description": "List of Goodreads book page URLs to scrape. Add one URL per line. The actor collects up to Max reviews for each book.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxReviews": {
                        "title": "Max reviews per book",
                        "minimum": 1,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "Maximum number of reviews to collect per book. Capped at 1000.",
                        "default": 100
                    },
                    "timeoutSecs": {
                        "title": "Timeout (seconds)",
                        "minimum": 30,
                        "maximum": 3600,
                        "type": "integer",
                        "description": "Overall actor timeout in seconds.",
                        "default": 300
                    },
                    "requestTimeoutSecs": {
                        "title": "Request timeout (seconds)",
                        "minimum": 5,
                        "maximum": 120,
                        "type": "integer",
                        "description": "Timeout for each individual API request in seconds.",
                        "default": 30
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Select proxies to use for requests. Helps avoid IP blocking and rate limits. Datacenter proxies are fastest; Residential proxies are harder to detect."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
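
For projects that call the REST API without a client library, the `run-sync-get-dataset-items` path from the specification above can be used roughly as follows. This is a minimal sketch using only the Python standard library: the function names are hypothetical, the token is passed as the `token` query parameter as the specification describes, and the response is assumed to be a JSON array of dataset items. The 330-second client timeout is an arbitrary choice slightly above the Actor's default `timeoutSecs` of 300.

```python
import json
import urllib.parse
import urllib.request
from typing import Any

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/kawsar~goodreads-review-scraper/run-sync-get-dataset-items"


def build_run_request(token: str, actor_input: dict[str, Any]) -> urllib.request.Request:
    """Build the POST request that starts a run and waits for its items."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_and_fetch_reviews(
    token: str, actor_input: dict[str, Any], timeout: float = 330.0
) -> list[dict[str, Any]]:
    """Send the request and decode the response body.

    Assumes the endpoint answers with a JSON array of review items.
    """
    request = build_run_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `run_and_fetch_reviews("<YOUR_API_TOKEN>", {"bookUrl": "https://www.goodreads.com/book/show/5907.The_Hobbit", "maxReviews": 30})` would block until the run finishes. For long runs, the asynchronous `/runs` path in the specification is likely a better fit.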
