Pricing

from $1.19 / 1,000 results

BookWyrm Book Reviews Scraper

Scrape public BookWyrm reviews, ratings, book metadata, and reviewer details by book name, book URL, or profile handle from federated BookWyrm instances.

Developer

Inus Grobler

Why use this BookWyrm scraper?

Collect public BookWyrm reviews from selected books, profile handles, or book searches
Build a public book review dataset from federated BookWyrm instances
Monitor public reviews from selected BookWyrm users
Enrich book-review records with reviewer and book metadata
Combine BookWyrm reviews with Goodreads, StoryGraph, Open Library, or other book data sources
Export clean book and review rows to a dataset, database, spreadsheet, or downstream analytics workflow

What you can extract

Review URL, title, rating, review text, original HTML, publication date, language, visibility, tags, and content warning
Reviewer profile URL, handle, display name, avatar URL, ActivityPub ID, public profile bio, outbox URL, and federation metadata
Book title, subtitle, authors, author aliases, author bio/ISNI links, ISBNs, cover image, BookWyrm book URL, work/edition URL, publisher, page count, subjects, languages, series, physical format, and visible identifiers
Public comments and quotes linked to books when enabled through advanced input
Optional raw ActivityPub and RSS fields for advanced workflows

Simple setup

Most runs use one of four inputs:

Any BookWyrm URLs: paste mixed public BookWyrm book, profile, review, shelf, list, RSS, or ActivityPub JSON URLs
Book URLs: scrape selected public BookWyrm book pages
Profiles: add one federated handle, profile URL, or instance | handle per line
Book search: add one book title, author, or ISBN per line

Example book URL:

https://bookwyrm.world/book/20954

Example profile:

sigvie@bookwyrm.world
mouse@bookwyrm.social
bookwyrm.world | sigvie
https://bookwyrm.world/user/sigvie
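
The three profile formats above can be normalized to an (instance, username) pair before use. The sketch below is illustrative only and is not the Actor's own parser; the function name parse_profile_entry is hypothetical, and it assumes profile URLs follow the /user/<username> pattern shown in the examples.

```python
from urllib.parse import urlparse


def parse_profile_entry(line: str) -> tuple[str, str]:
    """Return (instance_host, username) for one profile input line.

    Hypothetical helper: accepts a federated handle, an
    "instance | handle" pair, or a profile URL.
    """
    text = line.strip()
    if text.startswith(("http://", "https://")):
        # Profile URL form, assumed to be https://<instance>/user/<username>
        parsed = urlparse(text)
        parts = [p for p in parsed.path.split("/") if p]
        if len(parts) >= 2 and parts[0] == "user":
            return parsed.netloc, parts[1]
        raise ValueError(f"Not a profile URL: {line!r}")
    if "|" in text:
        # "instance | handle" form
        instance, _, username = text.partition("|")
        instance, username = instance.strip(), username.strip()
    elif "@" in text:
        # Federated handle form, with or without a leading "@"
        username, _, instance = text.lstrip("@").rpartition("@")
    else:
        raise ValueError(f"Unrecognized profile entry: {line!r}")
    if not instance or not username:
        raise ValueError(f"Incomplete profile entry: {line!r}")
    return instance, username
```

With this helper, sigvie@bookwyrm.world, bookwyrm.world | sigvie, and https://bookwyrm.world/user/sigvie all resolve to the same pair.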

Example book search:

Min skyld Abid Raja
Sula Toni Morrison
9788202713461

Plain searches use a built-in BookWyrm search instance. To target a specific server, use instance | query, such as bookwyrm.social | Sula Toni Morrison.
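
To make the instance | query convention concrete, here is a minimal sketch of how a search line could be split. The function name parse_search_entry is hypothetical, and None is used here to stand for "let the Actor pick its built-in search instance", since the page does not name that instance.

```python
from typing import Optional


def parse_search_entry(line: str) -> tuple[Optional[str], str]:
    """Split one search line into (instance, query).

    Hypothetical helper: instance is None for plain searches, meaning
    the built-in search instance should be used.
    """
    text = line.strip()
    if "|" in text:
        instance, _, query = text.partition("|")
        return instance.strip(), query.strip()
    return None, text
```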

Example input

{
  "startUrls": [
    "https://bookwyrm.world/book/20954"
  ],
  "books": [
    "https://bookwyrm.world/book/15515/s/dune"
  ],
  "profiles": [
    "sigvie@bookwyrm.world",
    "mouse@bookwyrm.social"
  ],
  "search": [
    "Min skyld Abid Raja",
    "Sula Toni Morrison"
  ],
  "maxReviews": 100,
  "maxSearchResults": 10
}

Example output

Book rows are emitted as soon as book metadata is available:

{
  "entityType": "book",
  "source": "bookwyrm",
  "sourceInstance": "https://bookwyrm.social",
  "activityPubId": "https://bookwyrm.social/book/2",
  "bookUrl": "https://bookwyrm.social/book/2/s/hamlet",
  "title": "Hamlet",
  "authors": [
    {
      "name": "William Shakespeare",
      "url": "https://bookwyrm.social/author/1/s/william-shakespeare",
      "activityPubId": "https://bookwyrm.social/author/1"
    }
  ],
  "isbn10": "0140714545",
  "isbn13": "9780140714548",
  "aggregateRating": 3.7,
  "reviewsCount": 272,
  "bookReviewDiscoveryStatus": "collected_from_html_public_page",
  "scrapedAt": "2026-05-23T10:21:23.443Z"
}

Every review is emitted as its own dataset row:

{
  "entityType": "review",
  "source": "bookwyrm",
  "sourceInstance": "https://bookwyrm.world",
  "activityPubId": "https://bookwyrm.world/user/sigvie/review/10524",
  "reviewUrl": "https://bookwyrm.world/user/sigvie/review/10524",
  "reviewType": "Article",
  "title": "Review of \"Min skyld\" (5 stars)",
  "rating": 5,
  "ratingScale": 5,
  "ratingSource": "activitypub",
  "reviewText": "Fantastisk og rørende bok. Elsker skrivinga!",
  "publishedAt": "2022-12-06T00:00:00.000Z",
  "visibility": "public",
  "reviewer": {
    "activityPubId": "https://bookwyrm.world/user/sigvie",
    "profileUrl": "https://bookwyrm.world/user/sigvie",
    "handle": "sigvie@bookwyrm.world",
    "displayName": "Sigurd Vie"
  },
  "book": {
    "activityPubId": "https://bookwyrm.world/book/20954",
    "bookUrl": "https://bookwyrm.world/book/20954",
    "title": "Min skyld",
    "authors": ["Abid Qayyum Raja"],
    "isbn13": "9788202713461"
  },
  "bookDetails": {
    "title": "Min skyld",
    "publisher": "Cappelen Damm",
    "publishedDate": "2021-08-11",
    "pageCount": 240,
    "languages": ["Norsk (Bokmål)"],
    "authors": [
      {
        "name": "Abid Qayyum Raja",
        "aliases": ["Abid Raja", "Abid Q. Raja"],
        "isni": "0000000041008712",
        "wikipediaLink": "https://da.wikipedia.org/wiki/Abid_Raja"
      }
    ],
    "bookwyrm": {
      "workUrl": "https://bookwyrm.world/book/20953",
      "physicalFormat": "Hardcover"
    }
  },
  "reviewerProfile": {
    "handle": "sigvie@bookwyrm.world",
    "displayName": "Sigurd Vie",
    "outboxUrl": "https://bookwyrm.world/user/sigvie/outbox"
  },
  "bookwyrm": {
    "repliesUrl": "https://bookwyrm.world/user/sigvie/review/10524/replies",
    "repliesCount": 0
  },
  "discoveryMethod": "rss_reviews",
  "scrapedAt": "2026-05-23T10:21:23.443Z"
}

Output

The dataset contains one standalone row for each discovered book and one standalone row for each public review. Reviews are never nested under books, so large runs stay easy to stream into spreadsheets, databases, dashboards, or analytics pipelines.
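
Because books and reviews are flat rows, regrouping them downstream is a single pass. The sketch below assumes that a review's book.activityPubId matches the activityPubId of the corresponding book row, which the examples suggest but this page does not guarantee; the function name split_rows is hypothetical.

```python
from collections import defaultdict


def split_rows(rows):
    """Separate dataset rows into books and reviews grouped by book.

    Returns (books_by_id, reviews_by_book_id). Rows with an unknown
    entityType are ignored.
    """
    books = {}
    reviews_by_book = defaultdict(list)
    for row in rows:
        kind = row.get("entityType")
        if kind == "book":
            books[row.get("activityPubId")] = row
        elif kind == "review":
            # Assumption: the nested book summary carries the join key.
            book_id = (row.get("book") or {}).get("activityPubId")
            reviews_by_book[book_id].append(row)
    return books, dict(reviews_by_book)
```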

Book rows include the best available public metadata. Review rows include compact nested book and reviewer summaries, plus richer bookDetails and reviewerProfile fields when enrichment is available. Rows are streamed in small batches as they are collected, which lowers memory pressure on large runs and lets you inspect partial results before the run finishes.

For faster large book runs, full reviewer profile enrichment is off by default. Review rows still include the reviewer information visible on the review page, such as reviewer name, profile URL, and handle when available.

How the Actor gets BookWyrm data

The Actor tries public sources in this order, preferring structured data over HTML:

ActivityPub JSON for structured review, profile, book, and status data
RSS feeds for reliable profile-level review, comment, quote, and activity discovery
Public BookWyrm search pages when you provide book search queries
Public HTML fallback for visible metadata when structured sources do not expose enough data

It does not use browser automation by default. It does not log in, solve CAPTCHAs, bypass Cloudflare, bypass anti-bot pages, or access private/followers-only content.

For book URLs, the Actor follows public BookWyrm review pagination when it is visible on the book page. This is the cheapest way to collect visible reviews for a book because it uses normal HTTP requests and Cheerio parsing, not a browser.

ActivityPub support

BookWyrm often exposes ActivityPub JSON when a URL is requested with ActivityPub headers or when .json is appended to a public entity URL. The Actor supports public actors, outboxes, collections, collection pages, Review objects, Article review objects, Create activities, comments, quotes, books, shelves, and lists where those objects are exposed.
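
The two access patterns described above can be sketched with the standard library. This only builds the request objects and does not send anything; the helper names are hypothetical, and application/activity+json is the standard ActivityPub media type rather than a value confirmed by this page.

```python
from urllib.request import Request

# Standard ActivityPub media type (assumed to be what the instance accepts).
ACTIVITYPUB_ACCEPT = "application/activity+json"


def activitypub_request(url: str) -> Request:
    """Build a GET request that asks for the ActivityPub representation."""
    return Request(url, headers={"Accept": ACTIVITYPUB_ACCEPT})


def json_variant(url: str) -> str:
    """Return the ".json" form of a public entity URL.

    Simplification: assumes the URL has no query string or fragment.
    """
    return url.rstrip("/") + ".json"
```

Passing the request to urllib.request.urlopen would perform the fetch; whether a given instance answers with JSON depends on that instance.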

RSS support

When you provide profile handles, the Actor automatically checks public BookWyrm profile feeds where available:

/rss-reviews for public reviews
/rss for public activity
/rss-quotes for public quotes
/rss-comments for public comments
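
As a rough illustration, the four feed URLs for one profile could be derived as follows. This assumes the feed paths are appended to the profile URL (https://<instance>/user/<username>), which matches the profile and outbox URLs in the example output but is not stated explicitly here; profile_feed_urls is a hypothetical name.

```python
# Feed suffixes as listed above, keyed by what each feed contains.
FEED_SUFFIXES = {
    "reviews": "/rss-reviews",
    "activity": "/rss",
    "quotes": "/rss-quotes",
    "comments": "/rss-comments",
}


def profile_feed_urls(instance: str, username: str) -> dict[str, str]:
    """Return candidate RSS feed URLs for one public profile."""
    # Assumption: feeds live directly under the profile URL.
    base = f"https://{instance}/user/{username}"
    return {name: base + suffix for name, suffix in FEED_SUFFIXES.items()}
```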

RSS feeds are often the best way to collect profile-level reviews. Some RSS items include ratings in the title, such as (5 stars). If a rating is not available in RSS or enrichment sources, the Actor returns rating: null and labels the source clearly.
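
Reading a rating out of an RSS title can be sketched with a small regular expression. This is an illustration of the idea, not the Actor's actual parsing code; the pattern assumes the "(N stars)" form shown above and returns None when no rating is present.

```python
import re
from typing import Optional

# Matches "(5 stars)", "(1 star)", or "(4.5 stars)" anywhere in the title.
_STARS = re.compile(r"\((\d+(?:\.\d+)?)\s+stars?\)", re.IGNORECASE)


def rating_from_title(title: Optional[str]) -> Optional[float]:
    """Return the star rating embedded in an RSS item title, or None."""
    match = _STARS.search(title or "")
    return float(match.group(1)) if match else None
```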

Important coverage limits

BookWyrm is not one centralized review database. A book page on one instance may not expose all reviews from all BookWyrm servers. Profile-level scraping is usually more complete because profile RSS feeds and ActivityPub outboxes are scoped to that user.

The Actor does not claim to return every review for a book; it returns only the reviews whose links the public page or ActivityPub JSON actually exposes. When a book page exposes paginated public reviews, the Actor follows those pages until maxReviews, the hidden safety page limit, or the end of pagination is reached. When book-level review discovery is incomplete, the book row labels that limitation with bookReviewDiscoveryStatus.
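
The three stop conditions for book-page pagination can be sketched as a simple loop. Everything here is hypothetical: fetch_page stands in for whatever retrieves one page of review links, and max_pages stands in for the hidden safety page limit, whose real value is not documented.

```python
from typing import Callable


def collect_review_links(
    fetch_page: Callable[[int], tuple[list[str], bool]],
    max_reviews: int,
    max_pages: int = 20,  # placeholder for the undocumented safety limit
) -> list[str]:
    """Follow review pagination until a limit or the last page is hit.

    fetch_page(n) must return (links_on_page_n, has_next_page).
    """
    links: list[str] = []
    page = 1
    while page <= max_pages and len(links) < max_reviews:
        page_links, has_next = fetch_page(page)
        links.extend(page_links)
        if not has_next:
            break  # end of pagination
        page += 1
    # Trim any overshoot from the final page.
    return links[:max_reviews]
```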

Privacy and ethical use

This Actor is for public BookWyrm data only.

No login is required or supported
Private, followers-only, restricted, and login-only pages are skipped
Cloudflare challenges, CAPTCHAs, and access controls are not bypassed
robots.txt is respected where practical
Defaults use public HTML fallback, low concurrency, and polite delays

Use this Actor only for lawful, ethical collection of public data from instances you are allowed to access.

Troubleshooting

No reviews found

Add profiles as handles when possible. Book pages and book search results do not always expose review collections.

The instance returned 403, 404, or 410

The page may be private, deleted, restricted, unavailable through ActivityPub, or blocked by the instance. The Actor records the failed URL in run statistics and continues with other sources.
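
The "record and continue" behavior can be pictured as below. This is a sketch under assumptions: get_status is a hypothetical callable returning an HTTP status code for a URL, and the shape of the stats dictionary is invented for illustration, not taken from the Actor's run statistics.

```python
# Statuses treated as "skip this source and move on".
SKIP_STATUSES = {403, 404, 410}


def fetch_sources(urls, get_status):
    """Visit each URL once, recording skipped ones instead of aborting."""
    stats = {"fetched": [], "failed": {}}
    for url in urls:
        status = get_status(url)
        if status in SKIP_STATUSES:
            # Keep the failure for later inspection and continue.
            stats["failed"][url] = status
            continue
        stats["fetched"].append(url)
    return stats
```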

RSS ratings are missing

Some BookWyrm RSS feeds include ratings in titles, and some do not. If the Actor cannot find a rating in RSS, ActivityPub, or public HTML, it returns null instead of guessing.
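
The "null instead of guessing" rule amounts to a first-non-null lookup across sources. In this sketch the caller decides the source order, the function name resolve_rating is hypothetical, and using None as the ratingSource when nothing is found is an assumption, since the page does not say which label the Actor emits in that case.

```python
def resolve_rating(candidates):
    """Pick the first available rating from (source, value) pairs.

    Returns a dict with rating and ratingSource; both are None when no
    source provided a rating.
    """
    for source, value in candidates:
        if value is not None:
            return {"rating": value, "ratingSource": source}
    return {"rating": None, "ratingSource": None}
```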

A book page did not return all reviews

That is expected on some federated instances. Add profile handles for reviewers you care about to get better profile-level coverage.

The instance rate-limited requests

Lower the maximum number of reviews or search results and keep runs targeted to the profiles and books you need. The Actor uses polite built-in request delays and respects robots.txt where practical.

Pricing

Recommended Apify Store model: pay per event. The simplest setup is one charged dataset result event for every emitted row. This keeps pricing predictable because the dataset contains both book metadata rows and review rows.

apify-default-dataset-item: recommended starting price $0.00119 per dataset row ($1.19 per 1,000 book or review rows)
Keep Apify's synthetic apify-actor-start event at the default low price if using pay-per-event monetization
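
For a quick estimate at the suggested price, the arithmetic is rows times $0.00119. The sketch below uses Decimal to avoid floating-point drift and deliberately leaves out the apify-actor-start event, whose price is not given here; estimate_cost is a hypothetical name.

```python
from decimal import Decimal

# Suggested starting price per dataset row, from the list above.
PRICE_PER_ROW = Decimal("0.00119")


def estimate_cost(book_rows: int, review_rows: int) -> Decimal:
    """Estimate the per-row charge in USD for one run.

    Excludes the actor-start event and any platform usage costs.
    """
    return (book_rows + review_rows) * PRICE_PER_ROW
```

For example, a run that emits 200 book rows and 800 review rows comes to $1.19 in row charges.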

Stress testing showed that BookWyrm review density varies heavily by instance and book. Some searches return many low-review books, while popular books can return more visible public review links. A simple per-row price keeps small tests cheap while still covering discovery work for low-density or zero-result inputs.

For higher-volume customers, revisit pricing after real user runs and monitor cost per 1,000 rows, zero-result search rates, average reviews per book, and profile enrichment usage.

FAQ

Can this scrape BookWyrm reviews by book title?

Yes. Add book titles, authors, or ISBNs in Book search. To search one instance, use instance | query, such as bookwyrm.social | Dune Frank Herbert.

Can this scrape reviews from BookWyrm profiles?

Yes. Add federated handles like sigvie@bookwyrm.world or profile URLs. Profile RSS feeds are often the best public source for profile-level reviews.

Does it scrape every BookWyrm review for a book?

Only reviews exposed by the public book page, ActivityPub data, RSS feeds, or linked public pages are returned. BookWyrm is federated, so one instance may not show every review from every server.

Does it need a browser or login?

No. The Actor uses public HTTP, ActivityPub, RSS, and HTML pages. It does not log in or bypass private pages, CAPTCHAs, Cloudflare challenges, or followers-only content.

Why did I get book rows but few review rows?

Some books have little public review data on the selected instance. Try adding reviewer profile handles or targeted book URLs from the instance where the reviews are visible.

API usage

from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")
actor = client.actor("thescrapelab/bookwyrm-book-reviews-scraper")

run_input = {
    "startUrls": [
        "https://bookwyrm.world/book/20954",
    ],
    "books": [
        "https://bookwyrm.world/book/15515/s/dune",
    ],
    "profiles": [
        "sigvie@bookwyrm.world",
        "mouse@bookwyrm.social",
    ],
    "search": [
        "Min skyld Abid Raja",
        "Sula Toni Morrison",
    ],
    "maxReviews": 100,
    "maxSearchResults": 10,
}

# Start the Actor and wait for the run to finish.
run = actor.call(run_input=run_input)
dataset = client.dataset(run["defaultDatasetId"])

# Books and reviews are separate rows; entityType tells them apart.
for item in dataset.list_items().items:
    entity_type = item.get("entityType")
    if entity_type == "book":
        print("BOOK", item.get("title"), item.get("reviewsCount"))
    elif entity_type == "review":
        book_title = (item.get("book") or {}).get("title")
        print("REVIEW", item.get("reviewUrl"), item.get("rating"), book_title)
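
Once rows are fetched, review rows can be flattened for a spreadsheet. The following sketch uses only the standard library and works on any list of row dictionaries shaped like the example output; the column selection and the function name reviews_to_csv are illustrative choices, not part of the Actor.

```python
import csv
import io

CSV_FIELDS = ["reviewUrl", "rating", "publishedAt", "reviewer", "bookTitle"]


def reviews_to_csv(rows) -> str:
    """Flatten review rows into CSV text, skipping non-review rows."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=CSV_FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        if row.get("entityType") != "review":
            continue
        writer.writerow({
            "reviewUrl": row.get("reviewUrl"),
            "rating": row.get("rating"),  # None is written as an empty cell
            "publishedAt": row.get("publishedAt"),
            "reviewer": (row.get("reviewer") or {}).get("handle"),
            "bookTitle": (row.get("book") or {}).get("title"),
        })
    return buffer.getvalue()
```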
