# OnlineBookClub Book Reviews Scraper (`thescrapelab/onlinebookclub-book-reviews-scraper`) Actor

OnlineBookClub scraper for authorized public book reviews, ratings, reviewer names, metadata, genres, and purchase links.

- **URL**: https://apify.com/thescrapelab/onlinebookclub-book-reviews-scraper.md
- **Developed by:** [Inus Grobler](https://apify.com/thescrapelab) (community)
- **Categories:** SEO tools, E-commerce, Developer tools
- **Stats:** 2 total users, 1 monthly users, 0.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $0.99 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## OnlineBookClub Book Reviews Scraper

This Actor extracts public OnlineBookClub book review data and book metadata from authorized pages. It is designed for users who have permission to collect data from OnlineBookClub.org.

Use it as an authorized OnlineBookClub scraper for public official review text, ratings, reviewer display names, genres, book metadata, and purchase links.

**This Actor is for authorized use only. OnlineBookClub.org’s Terms prohibit scraping without express written permission. Use this Actor only if you have permission.**

### What This Actor Does

OnlineBookClub Book Reviews Scraper collects public official review listings, public official review text, and public book metadata from OnlineBookClub pages you are authorized to access. It is HTTP-based by default, uses polite request delays, and does not use browser automation unless a future debugging mode is explicitly added.

It does not log in, does not accept cookies, does not accept session tokens, does not create accounts, and does not bypass Cloudflare, captchas, bot challenges, rate limits, login walls, or access controls.

### Who It Is For

Use this Actor if you have written permission to collect public OnlineBookClub review data for research, catalog enrichment, book discovery workflows, review monitoring, or internal analysis.

Do not use it for private pages, member-only discussions, login-only replies, contact harvesting, or account-based access.

### Data Extracted

- Review listing details: book title, author, rating, reviewer display name, review URL, listing date text, genre, reply count, book URL, cover image URL, and public purchase links.
- Full public official review pages: review title, topic ID, reviewer display name, public profile URL when visible, post date, declaration text, book title, author, rating, review text, book URL, cover image URL, and public purchase links.
- Public book metadata: book ID, title, author, author URL, genre, release date, word count, language, average reviewer rating, official review link, official review rating, cover image URL, and public purchase links.

Replies that require login are not collected. If a page says login is required to view replies, the Actor skips replies and records a warning in run statistics.

### Simple Setup

1. Confirm you have permission to collect data from OnlineBookClub.org.
2. Set **I have permission to collect this data** to `true`.
3. Add review index URLs, official review URLs, or book page URLs in **Start URLs**.
4. Choose the maximum number of reviews.
5. Start the Actor and open the Dataset while it is running to see review records appear progressively.

The input form is intentionally simple. The Actor automatically detects URL types and uses built-in polite crawling defaults.

### Input Examples

#### Scrape Latest Review Listings

```json
{
  "confirmAuthorizedUse": true,
  "startUrls": [
    "https://onlinebookclub.org/reviews/"
  ],
  "maxReviews": 100
}
````

#### Scrape Specific Review URLs

```json
{
  "confirmAuthorizedUse": true,
  "startUrls": [
    "https://forums.onlinebookclub.org/viewtopic.php?f=63&t=745520"
  ],
  "maxReviews": 25
}
```

#### Scrape Specific Book Pages

```json
{
  "confirmAuthorizedUse": true,
  "startUrls": [
    "https://onlinebookclub.org/shelves/book.php?id=728007"
  ],
  "maxReviews": 25
}
```

### Output

The Actor always outputs one dataset row per review item. Book metadata, cover images, public purchase links, and listing details are included in the same review row when available.

### Example Output

```json
{
  "entityType": "review",
  "source": "onlinebookclub",
  "reviewUrl": "https://forums.onlinebookclub.org/viewtopic.php?f=63&t=745520",
  "reviewTitle": "Review of Souls Run Wild",
  "bookTitle": "Souls Run Wild",
  "authorName": "David Payne",
  "genre": "Historical Fiction",
  "rating": 4,
  "ratingScale": 5,
  "normalizedRatingOutOf5": 4,
  "reviewerName": "Amanda Collier",
  "postedAt": "2026-05-13T22:08:00",
  "reviewText": "Full public review text...",
  "bookUrl": "https://onlinebookclub.org/shelves/book.php?id=728007",
  "scrapedAt": "2026-05-29T00:00:00.000Z"
}
```

### Rating Scales

OnlineBookClub reviews may use different rating scales. Some older reviews use a 4-star scale, such as `3 out of 4 stars`, while newer reviews may use a 5-star scale.

The Actor preserves:

- `rating`
- `ratingScale`
- `ratingText`
- `normalizedRatingOutOf5`

For example, `3 out of 4 stars` is preserved as rating `3`, scale `4`, and normalized rating `3.75` out of 5.
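
The arithmetic behind the normalized value can be sketched as follows. This is a minimal illustration, not the Actor's actual parser: the `parse_rating` helper, the regular expression, and the rounding to two decimals are assumptions, and only the four field names come from this README.

```python
import re
from typing import Optional, TypedDict


class ParsedRating(TypedDict):
    rating: float
    ratingScale: int
    ratingText: str
    normalizedRatingOutOf5: float


# Assumed text format: "<rating> out of <scale> stars"
RATING_RE = re.compile(r"(\d+(?:\.\d+)?)\s+out\s+of\s+(\d+)\s+stars?", re.IGNORECASE)


def parse_rating(text: str) -> Optional[ParsedRating]:
    """Hypothetical helper: keep the original scale and add a 5-star equivalent."""
    match = RATING_RE.search(text)
    if match is None:
        return None
    rating = float(match.group(1))
    scale = int(match.group(2))
    if scale <= 0 or rating > scale:
        return None  # reject impossible values such as "5 out of 4"
    return {
        "rating": rating,
        "ratingScale": scale,
        "ratingText": match.group(0),
        # Linear rescaling: rating / scale * 5
        "normalizedRatingOutOf5": round(rating / scale * 5, 2),
    }
```

With this sketch, `parse_rating("3 out of 4 stars")` yields rating `3`, scale `4`, and a normalized rating of `3.75`, matching the example above.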

### Important Limitations

- Authorized use only.
- No login support.
- No cookies or session tokens.
- No private or member-only pages.
- No account creation.
- No bypassing Cloudflare, captchas, bot challenges, rate limits, or access controls.
- Replies requiring login are skipped.
- Purchase links are stored as public links found on the page and are not followed.
- Pages that require JavaScript are parsed only when public HTML content is present.

### Recommended Run Settings

Based on fixture stress tests and the HTTP-only design:

```json
{
  "memoryMbytes": 512,
  "timeoutSecs": 3600
}
```

Small run, up to 50 reviews:

- Memory: 512 MB
- Timeout: 15-30 minutes

Medium run, up to 250 reviews:

- Memory: 512 MB
- Timeout: 1-2 hours
- Use 1024 MB only if your run includes large book enrichment or unusually large pages.

Large authorized run:

- Keep browser automation disabled.
- Estimate timeout from `maxReviews` and page response speed.
- Increase timeout before increasing memory unless memory warnings appear.
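
One way to estimate a timeout from `maxReviews` and page response speed is sketched below. Every default here is an assumption for illustration (two requests per review, five seconds per request including the polite delay, and 50% headroom); measure your own runs and adjust.

```python
import math


def estimate_timeout_secs(
    max_reviews: int,
    requests_per_review: float = 2.0,  # assumption: review page plus book page
    secs_per_request: float = 5.0,     # assumption: response time plus polite delay
    safety_factor: float = 1.5,        # assumption: headroom for slow pages
) -> int:
    """Hypothetical helper: rough run timeout, rounded up to a whole minute."""
    if max_reviews < 1:
        raise ValueError("max_reviews must be at least 1")
    raw_secs = max_reviews * requests_per_review * secs_per_request * safety_factor
    return math.ceil(raw_secs / 60) * 60
```

Under these assumptions, 250 reviews come out at 3780 seconds (63 minutes), which falls inside the 1-2 hour range suggested for medium runs.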

### Cost Control

- Lower `maxReviews` to reduce runtime and compute cost.
- Keep browser automation disabled.
- Use 512 MB memory unless testing shows higher memory is needed.
- Increase timeout only for larger authorized runs.

### Troubleshooting

Actor refuses to run because `confirmAuthorizedUse` is false:
Set it to `true` only if you have permission to collect data from OnlineBookClub.org.

No reviews found:
Check that your URLs are public review index, review, or book pages and that robots.txt allows the requested paths.

Review page has login-required replies:
Replies are skipped. Public official review text can still be extracted when visible without login.

Book page missing average rating:
Some books do not show an average reviewer rating until published reviews exist. The field will be `null`.

Rating scale is 4 instead of 5:
This is expected for older reviews. The original scale is preserved and `normalizedRatingOutOf5` is also provided.

Date has inferred year:
Some listing dates omit the year. The Actor infers the year using the run date and marks `dateParseConfidence` as `inferred_year`.
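
A minimal sketch of year inference follows. The rule used here (pick the most recent year in which the date is not after the run date) and the `May 13` text format are assumptions, not the Actor's documented behavior. Only the `dateParseConfidence` value `inferred_year` comes from this README.

```python
from datetime import date, datetime


def infer_year(listing_text: str, run_date: date) -> dict:
    """Hypothetical helper: attach a year to text such as 'May 13'."""
    # Walk back from the run year; eight years always covers a "Feb 29" gap.
    for year in range(run_date.year, run_date.year - 8, -1):
        try:
            candidate = datetime.strptime(f"{listing_text} {year}", "%b %d %Y").date()
        except ValueError:
            continue  # e.g. "Feb 29" in a non-leap year
        if candidate <= run_date:
            return {
                "date": candidate.isoformat(),
                "dateParseConfidence": "inferred_year",
            }
    raise ValueError(f"Could not infer a year for {listing_text!r}")
```

For a run on 2026-01-05, the text `Dec 30` resolves to `2025-12-30` rather than a date in the future.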

Site returned 403 or 429:
The Actor does not aggressively retry forbidden pages. For 429 rate limits, it respects `Retry-After` when provided.
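
Respecting `Retry-After` means handling both forms the header can take: a number of seconds or an HTTP date. The sketch below shows one way to do that with the Python standard library. The `retry_after_seconds` helper is hypothetical and is not taken from the Actor's source.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(
    header_value: Optional[str], now: Optional[datetime] = None
) -> Optional[float]:
    """Hypothetical helper: seconds to wait, or None if the header is unusable."""
    if not header_value:
        return None
    value = header_value.strip()
    if value.isdigit():
        return float(value)  # delay-seconds form, e.g. "120"
    try:
        retry_at = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None
    if retry_at.tzinfo is None:
        retry_at = retry_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means no further wait is needed.
    return max(0.0, (retry_at - now).total_seconds())
```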

Challenge or access-denied page detected:
The Actor stops or skips the page and records a warning. It will not attempt to bypass the restriction.

Some purchase links are redirects:
The Actor stores public href values as found and does not follow affiliate or redirect links.

Some pages require JavaScript but public content was still parseable:
The Actor may continue when public HTML content is present. It will not use JavaScript execution to bypass restrictions.

Dataset appears empty at first:
For full review runs, the first records appear after the first review page and optional book metadata page are processed.

Run timed out before finishing:
Reduce `maxReviews` or increase timeout for larger authorized runs.

Memory limit exceeded:
Use 512 MB for normal runs and 1024 MB for larger enriched runs.

### Pricing Suggestion

Recommended pay-per-event pricing:

- `review-scraped`: suggested launch price $0.00099 per successful review row ($0.99 per 1,000).

Charge only for successful review rows. Do not charge for failed requests, duplicate records, skipped restricted pages, blocked/challenge pages, or login-required replies. Keep Apify's default synthetic Actor start event enabled.
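
As a rough budgeting aid, the per-row price can be turned into an estimate like this. The sketch covers only the `review-scraped` event at the suggested price and assumes no other events are charged; the helper name is hypothetical.

```python
from decimal import Decimal

# Suggested `review-scraped` price from this README, in USD per successful row
PRICE_PER_REVIEW_USD = Decimal("0.00099")


def estimate_review_cost_usd(successful_rows: int) -> Decimal:
    """Hypothetical helper: cost of review rows only, excluding any other events."""
    if successful_rows < 0:
        raise ValueError("successful_rows must not be negative")
    return PRICE_PER_REVIEW_USD * successful_rows
```

`Decimal` is used instead of `float` so that 1,000 rows come out at exactly $0.99.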

### API Example

```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")

run_input = {
    "confirmAuthorizedUse": True,
    "startUrls": ["https://onlinebookclub.org/reviews/"],
    "maxReviews": 25,
}

run = client.actor("thescrapelab/onlinebookclub-book-reviews-scraper").call(run_input=run_input)

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)
```

### FAQ

Can this Actor scrape private member data?
No. It is designed for public authorized pages only.

Can I provide login credentials or cookies?
No. Login, cookies, and session tokens are not supported.

Does it use browser automation?
No browser automation is used by default. The Actor is optimized for low-cost HTTP extraction.

Does it bypass blocking?
No. It stops or skips pages when challenges, access-denied pages, or login requirements are detected.

Can I collect forum replies?
Not in this version. Public replies visible without login may be considered in a future version. Login-required replies are skipped.

### Changelog

#### 1.0.0

- Initial authorized-use-only Actor.
- Public review index, official review page, and book page extraction.
- Low-concurrency HTTP client.
- Rating/date parsing with preserved original scales.
- Streaming dataset output and RUN-STATS reporting.

### Support

For support, contact the Actor maintainer through your Apify support or marketplace contact channel.

# Actor input Schema

## `confirmAuthorizedUse` (type: `boolean`):

Required. This Actor is for authorized use only. OnlineBookClub.org's Terms prohibit scraping without express written permission. Set this to true only if you have permission.

## `startUrls` (type: `array`):

Paste review index pages, official review URLs, or book page URLs. The Actor detects the URL type automatically.
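
The detection rules are not documented here. As a rough illustration, the three URL shapes used in this README can be told apart by host and path. The `classify_url` helper, its return labels, and the `www.` host variant are assumptions, not the Actor's internals.

```python
from urllib.parse import parse_qs, urlparse

# Hosts taken from the example URLs; the "www." variant is an assumption
MAIN_HOSTS = {"onlinebookclub.org", "www.onlinebookclub.org"}
FORUM_HOST = "forums.onlinebookclub.org"


def classify_url(url: str) -> str:
    """Hypothetical helper: guess the page type from host, path, and query."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    path = parsed.path
    query = parse_qs(parsed.query)

    if host == FORUM_HOST and path == "/viewtopic.php" and "t" in query:
        return "review"
    if host in MAIN_HOSTS and path == "/shelves/book.php" and "id" in query:
        return "book"
    if host in MAIN_HOSTS and path.rstrip("/") == "/reviews":
        return "review_index"
    return "unknown"
```

A pre-check like this can catch mistyped or unsupported URLs before a run is started.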

## `maxReviews` (type: `integer`):

Maximum number of review records to push. Lower values reduce runtime and cost.

## Actor input object example

```json
{
  "confirmAuthorizedUse": false,
  "startUrls": [
    "https://onlinebookclub.org/reviews/",
    "https://forums.onlinebookclub.org/viewtopic.php?f=63&t=745520",
    "https://onlinebookclub.org/shelves/book.php?id=728007"
  ],
  "maxReviews": 100
}
```
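
Because the example above uses the schema default `confirmAuthorizedUse: false`, a client-side check before starting a run can save a failed attempt. The sketch below mirrors the constraints in the OpenAPI input schema (`maxReviews` between 1 and 100000, default 100). The `validate_input` helper itself is hypothetical.

```python
from typing import Any, Dict, List


def validate_input(actor_input: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: return a list of problems, empty if the input looks valid."""
    errors: List[str] = []

    if actor_input.get("confirmAuthorizedUse") is not True:
        errors.append("confirmAuthorizedUse must be true (only set it if you have permission)")

    start_urls = actor_input.get("startUrls", [])
    if not isinstance(start_urls, list) or not all(isinstance(u, str) for u in start_urls):
        errors.append("startUrls must be an array of strings")

    max_reviews = actor_input.get("maxReviews", 100)
    # bool is a subclass of int in Python, so exclude it explicitly
    if (
        isinstance(max_reviews, bool)
        or not isinstance(max_reviews, int)
        or not 1 <= max_reviews <= 100000
    ):
        errors.append("maxReviews must be an integer between 1 and 100000")

    return errors
```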

# Actor output Schema

## `results` (type: `string`):

Streamed flat review items. Each dataset row represents one review item with book metadata and purchase links included when available.

## `runStats` (type: `string`):

RUN-STATS key-value store record with counts, warnings, and peak memory metrics.

## `keyValueStore` (type: `string`):

Default key-value store containing run statistics and restart checkpoints.
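
To inspect warnings after a run, the `RUN-STATS` record can be read from the run's default key-value store. The sketch below assumes the record key is literally `RUN-STATS` and that `client` is an `ApifyClient` instance from the official Python client. The structure of the returned value is not documented here, so it is passed through unchanged.

```python
from typing import Any, Optional


def fetch_run_stats(client: Any, run: dict) -> Optional[Any]:
    """Hypothetical helper: return the RUN-STATS value for a run, or None if absent."""
    store = client.key_value_store(run["defaultKeyValueStoreId"])
    record = store.get_record("RUN-STATS")  # assumed key name
    return record["value"] if record else None


# Usage, assuming `client` and `run` come from the Python example in this document:
#   stats = fetch_run_stats(client, run)
#   print(stats)
```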

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

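// Prepare Actor input
// confirmAuthorizedUse is required; set it to true only if you have permission
// to collect data from OnlineBookClub.org
const input = {
    "confirmAuthorizedUse": true,
    "startUrls": [
        "https://onlinebookclub.org/reviews/",
        "https://forums.onlinebookclub.org/viewtopic.php?f=63&t=745520",
        "https://onlinebookclub.org/shelves/book.php?id=728007"
    ]
};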

// Run the Actor and wait for it to finish
const run = await client.actor("thescrapelab/onlinebookclub-book-reviews-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

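# Prepare the Actor input
# confirmAuthorizedUse is required; set it to True only if you have permission
# to collect data from OnlineBookClub.org
run_input = {
    "confirmAuthorizedUse": True,
    "startUrls": [
        "https://onlinebookclub.org/reviews/",
        "https://forums.onlinebookclub.org/viewtopic.php?f=63&t=745520",
        "https://onlinebookclub.org/shelves/book.php?id=728007",
    ],
}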

# Run the Actor and wait for it to finish
run = client.actor("thescrapelab/onlinebookclub-book-reviews-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
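# confirmAuthorizedUse is required; set it to true only if you have permission
echo '{
  "confirmAuthorizedUse": true,
  "startUrls": [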
    "https://onlinebookclub.org/reviews/",
    "https://forums.onlinebookclub.org/viewtopic.php?f=63&t=745520",
    "https://onlinebookclub.org/shelves/book.php?id=728007"
  ]
}' |
apify call thescrapelab/onlinebookclub-book-reviews-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=thescrapelab/onlinebookclub-book-reviews-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "OnlineBookClub Book Reviews Scraper",
        "description": "OnlineBookClub scraper for authorized public book reviews, ratings, reviewer names, metadata, genres, and purchase links.",
        "version": "1.0",
        "x-build-id": "B8i9YBRt4Tc8lprFx"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/thescrapelab~onlinebookclub-book-reviews-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-thescrapelab-onlinebookclub-book-reviews-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/thescrapelab~onlinebookclub-book-reviews-scraper/runs": {
            "post": {
                "operationId": "runs-sync-thescrapelab-onlinebookclub-book-reviews-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/thescrapelab~onlinebookclub-book-reviews-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-thescrapelab-onlinebookclub-book-reviews-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "confirmAuthorizedUse"
                ],
                "properties": {
                    "confirmAuthorizedUse": {
                        "title": "I have permission to collect this data",
                        "type": "boolean",
                        "description": "Required. This Actor is for authorized use only. OnlineBookClub.org's Terms prohibit scraping without express written permission. Set this to true only if you have permission.",
                        "default": false
                    },
                    "startUrls": {
                        "title": "Start URLs",
                        "type": "array",
                        "description": "Paste review index pages, official review URLs, or book page URLs. The Actor detects the URL type automatically.",
                        "items": {
                            "type": "string"
                        },
                        "default": [
                            "https://onlinebookclub.org/reviews/",
                            "https://forums.onlinebookclub.org/viewtopic.php?f=63&t=745520",
                            "https://onlinebookclub.org/shelves/book.php?id=728007"
                        ]
                    },
                    "maxReviews": {
                        "title": "Maximum reviews",
                        "minimum": 1,
                        "maximum": 100000,
                        "type": "integer",
                        "description": "Maximum number of review records to push. Lower values reduce runtime and cost.",
                        "default": 100
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
