# Capterra 0.9$💰 Reviews Products, Categories & Compare Scraper (`memo23/capterra-scraper`) Actor

\[Only 0.9$💰] Capterra all-in-one scraper: product, reviews, category & compare URLs in one actor. Flat per-review rows (100% field fill: reviewer name, company size, recommendation score, incentivization), Capterra awards & badges, sentiment breakdown, structured integrations. Pure HTTP, no browser

- **URL**: https://apify.com/memo23/capterra-scraper.md
- **Developed by:** [Muhamed Didovic](https://apify.com/memo23) (community)
- **Categories:** Developer tools, Automation, Agents
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $0.90 / 1,000 results

This Actor is paid per event. You are not charged for the Apify platform usage, but only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Capterra Software Reviews, Products & Categories Scraper


### How it works

<p align="center">
  <img src="readme-stuff/how-it-works-capterra.svg" alt="How it works" width="100%" />
</p>

All-in-one Capterra.com scraper — paste any mix of URL kinds in the same input list and the actor auto-classifies each one:

| Input | Row(s) emitted |
|---|---|
| **Product URL** — `/p/{id}/{slug}/` | 1 product row: name, description, application category, rating, vendor, pricing, features, screenshots, alternatives, review snippets, **awards & badges (Best Value, Shortlist, Ease of Use)**, **structured integrations** (with logos + Capterra URLs), **per-sub-rating averages with reviewer counts**, **Capterra-computed sentiment breakdown** |
| **Reviews URL** — `/p/{id}/{slug}/reviews/` | **N review rows** — one per individual review (`rowType: "review"`). Each row carries a nested `productAggregate` object with the thread-level rating distribution, sub-rating averages, total reviews, etc. No separate summary row. |
| **Category URL** — `/{slug}-software/` (e.g. `/project-management-software/`) | **N category-product rows** — one per product in the category (`rowType: "category-product"`). Each row carries a nested `categoryAggregate` with the category name, description, total count, and related categories. Same flat-with-nested-aggregate pattern as reviews. |
| **Compare URL** — `/compare/{ids}/{slug-a}-vs-{slug-b}/` | 1 compare row: side-by-side products with ratings, pricing, deployments |

Bare numeric product IDs (`137005`) and bare category slugs (`project-management`) also work. JSON + CSV.

> **Reviews are flat with the aggregate denormalized into every row.** Every dataset row from a reviews URL is a single review. The thread-level aggregate (rating distribution, sub-rating averages, total reviews) lives under each row's `productAggregate` field — duplicated on every review row from the same URL, but that keeps each row self-contained and makes billing trivial: one row = one paid `outputrecord` event.

> Pure HTTP. No Puppeteer, no Playwright, no headless Chromium, no third-party Cloudflare-bypass service.

---

Capterra is **Cloudflare-gated** with stronger deep-page protection than most B2B-software sites. The actor breaks past it without a browser via a session-aware HTTP pattern:

1. **Warm session.** Open one sticky-IP residential proxy session, hit a known-good product page, retry until 200. Cloudflare sets `__cf_bm` + `_cfuvid` cookies on success — captured automatically by the impit cookie jar.
2. **Iterate URLs.** Reuse the same Impit instance + sticky IP for every subsequent URL. Cookie + IP persistence pushes per-URL pass rate from a cold ~25% to a warm ~80%, with per-URL retry bringing cumulative success to ~95%.
3. **Auto-refresh.** On 3 consecutive 403s, drop the session and re-warm with a fresh IP. No human intervention needed.

Every page is then parsed via JSON-LD (the rock-solid `SoftwareApplication` schema + `BreadcrumbList`) plus targeted Cheerio walks of the visible HTML for sections (Features, Pricing, Integrations, Support, Alternatives, User reviews). No DOM execution; no DOMContentLoaded wait.

---

### Input

Field | Type | Required | Notes
--- | --- | --- | ---
`startUrls` | `string[]` | yes | Any mix of product / reviews / category / compare URLs (or bare product IDs / category slugs). Auto-classified.
`maxItems` | `integer` | no | Safety cap on **total dataset rows**, not URLs. A reviews URL with 100 reviews emits 100 review rows; a category URL with 50 products emits 50 category-product rows. Default `1000`. Free-tier users are capped at `100`.
`maxReviewsPerProduct` | `integer` | no | Max individual review rows emitted per reviews URL. Default `100`, `0` = no limit.
`maxProductsPerCategory` | `integer` | no | Max individual category-product rows emitted per category URL. Default `50`, `0` = no limit.
`maxRequestRetries` | `integer` | no | Per-URL retry budget inside the warm session. Default `8`.
`proxy` | `object` | no | Apify residential proxy (US country) recommended. The field is optional in the schema, but in practice a residential proxy is needed because Capterra rejects datacenter IPs.

#### Example input

```json
{
  "startUrls": [
    "https://www.capterra.com/p/137005/SmartPM/",
    "https://www.capterra.com/p/137005/SmartPM/reviews/",
    "https://www.capterra.com/project-management-software/",
    "https://www.capterra.com/compare/137005-229614/SmartPM-vs-Notion/"
  ],
  "maxItems": 50,
  "maxReviewsPerProduct": 100,
  "maxProductsPerCategory": 50,
  "proxy": { "useApifyProxy": true, "apifyProxyGroups": ["RESIDENTIAL"], "apifyProxyCountry": "US" }
}
````

Without a tighter `maxItems` cap, these four URLs yield **90 rows** on SmartPM: 1 product + 46 individual review rows (each with `productAggregate` nested) + 42 category-product rows from `/project-management-software/` (each with `categoryAggregate` nested) + 1 compare. With `maxItems: 50` as in the example above, the run stops at 50 rows. Use `maxProductsPerCategory` / `maxReviewsPerProduct` to cap the per-URL fan-out.
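
The row arithmetic above can be sketched as a small helper. This is only an illustration of how the documented caps interact, assuming `0` means "no limit" for the per-URL caps and that `maxItems` is applied last across all URLs. The function name and the `(kind, available)` tuple shape are hypothetical and not part of the actor.

```python
def expected_rows(url_counts, max_reviews=100, max_products=50, max_items=1000):
    """Estimate how many dataset rows a run would emit.

    url_counts: list of (kind, available) tuples. kind is "product",
    "reviews", "category" or "compare"; available is the number of reviews
    or listed products on that page (ignored for product/compare URLs).
    """
    total = 0
    for kind, available in url_counts:
        if kind in ("product", "compare"):
            rows = 1  # always exactly one row
        elif kind == "reviews":
            rows = available if max_reviews == 0 else min(available, max_reviews)
        elif kind == "category":
            rows = available if max_products == 0 else min(available, max_products)
        else:
            raise ValueError(f"unknown URL kind: {kind}")
        total += rows
    # maxItems is a cap on total rows, not on URLs.
    return min(total, max_items)


smartpm = [("product", 1), ("reviews", 46), ("category", 42), ("compare", 1)]
print(expected_rows(smartpm))                # 90
print(expected_rows(smartpm, max_items=50))  # 50
```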

***

### Output schema

Four row shapes, distinguished by the `rowType` field: `product`, `review`, `category-product`, `compare`.
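
Because one run can mix all four shapes in a single dataset, a consumer will usually want to split rows by `rowType` first. A minimal sketch, assuming `items` is the list of dataset rows already downloaded (for example via the Python client shown in the API section); the helper name is hypothetical.

```python
from collections import defaultdict


def split_by_row_type(items):
    """Group dataset rows by their rowType field."""
    groups = defaultdict(list)
    for item in items:
        # Rows without a rowType should not occur; bucket them defensively.
        groups[item.get("rowType", "unknown")].append(item)
    return dict(groups)


sample = [
    {"rowType": "product", "name": "SmartPM"},
    {"rowType": "review", "reviewId": "Capterra___6325867"},
    {"rowType": "category-product", "name": "Teamwork Projects"},
    {"rowType": "compare"},
]
groups = split_by_row_type(sample)
print({kind: len(rows) for kind, rows in groups.items()})
```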

#### `rowType: "product"`

```jsonc
{
  "rowType":             "product",
  "productUrl":          "https://www.capterra.com/p/137005/SmartPM/",
  "productId":           "137005",
  "name":                "SmartPM",
  "description":         "SmartPM is an AI-powered Automated Project Controls™ platform built by construction for construction…",
  "applicationCategory": "BusinessApplication",
  "categories":          ["Construction Software", "Project Management Software"],
  "rating": {
    "average":           4.9,
    "count":             46,
    "bestRating":        5
  },
  "vendor": {
    "name":              "SmartPM",
    "websiteUrl":        null
  },
  "pricingStarts":       "$49",
  "logoUrl":             "https://gdm-catalog-fmapi-prod.imgix.net/ProductLogo/7c88b5d2-…",
  "screenshotUrls":      [ /* up to ~22 imgix URLs */ ],
  "features":            [ /* visible feature list */ ],
  "deployments":         [ "Cloud", "Web-Based" ],
  "trainingOptions":     [ "Live Online", "Documentation", "Webinars" ],
  "supportOptions":      [ "Email/Help Desk", "Phone Support", "Chat" ],
  "customerTypes":       [ "Small Business", "Mid-size", "Enterprise" ],
  "reviewSnippets": [
    {
      "title":           "SmartPM, an amazing tool for schedule analysis",
      "pros":            "I had a 2 year experience using SmartPM…",
      "cons":            "Some advanced features have a learning curve…",
      "overallRating":   null,
      "reviewer":        { "name": null, "jobTitle": null, "industry": null, "companySize": null },
      "source":          "capterra"
    }
    /* up to 10 surfaced on the product page */
  ],
  "alternatives": [
    { "productId": "183213", "productUrl": "https://www.capterra.com/p/183213/Contractor-Foreman/", "name": "Contractor Foreman", "rating": { "average": null, "count": null, "bestRating": null }, "vendor": null, "logoUrl": null, "pricingStarts": null, "sponsored": null }
    /* + ~4 more */
  ],
  "breadcrumbs": [
    { "name": "Capterra", "url": "https://www.capterra.com/" },
    { "name": "Construction Software", "url": "https://www.capterra.com/construction-software/" },
    { "name": "SmartPM", "url": "https://www.capterra.com/p/137005/SmartPM/" }
  ],

  // ── RSC-derived rich data (from Capterra's embedded product JSON) ──
  "awards": [
    {
      "title":           "2025 Capterra Best Value for Money",
      "badgeUrl":        "/public-bx-capterra-v0/badge-best-value.png",
      "badgeAlt":        "Best Value Badge",
      "additionalInfo":  "and in",
      "linkText":        "+4 categories",
      "linkAction":      "best-value"
    }
  ],
  "valueBadges": [
    { "name": "Business Intelligence",   "year": "2025", "categorySlug": "business-intelligence-software" },
    { "name": "Construction Management", "year": "2025", "categorySlug": "construction-management-software" },
    { "name": "Data Analysis",           "year": "2025", "categorySlug": "data-analysis-software" },
    { "name": "Project Planning",        "year": "2025", "categorySlug": "project-planning-software" }
  ],
  "shortlistBadges":   [],                                  // populated when product is in Capterra Shortlist reports
  "easeOfUseBadges":   [],                                  // populated when product holds "Best Ease of Use" awards
  "integrations": [
    {
      "name":            "Microsoft Power BI",
      "id":              "e3df9c2d-1567-427c-acb2-a6d200b52931",
      "logo":            "https://gdm-catalog-fmapi-prod.imgix.net/ProductLogo/e48bb429-…png",
      "href":            "https://www.capterra.com/p/176586/Power-BI/",
      "reviewCount":     1878
    }
    /* + 7 more */
  ],
  "usedFor": [
    { "name": "Construction Scheduling", "slug": "construction-scheduling-software" },
    { "name": "Construction Management", "slug": "construction-management-software" }
    /* + 6 more */
  ],
  "pricing": {
    "startingPrice":         null,                          // raw value — may be blank when not disclosed
    "startingPriceCurrency": "$$",
    "startingPriceUnit":     "",
    "pricingModel":          "Other",
    "paymentFrequency":      "Per Month",
    "hasFreeTrial":          true,
    "hasFreeVersion":        false
  },
  "subRatingAverages": {                                    // Capterra's authoritative per-sub-rating means + counts
    "easeOfUse":       { "rating": 4.7, "reviews": 46 },
    "customerService": { "rating": 5,   "reviews": 45 },
    "valueForMoney":   { "rating": 4.9, "reviews": 42 }
  },
  "reviewSentiment": {                                      // Capterra-computed sentiment breakdown
    "totalReviews":  46,
    "overallRating": 4.87,
    "positive":      { "count": 96, "percentage": 96 },
    "neutral":       { "count":  4, "percentage":  4 },
    "negative":      { "count":  0, "percentage":  0 }
  },

  "scrapedAt":           "2026-05-12T13:35:00.412Z"
}
```

#### `rowType: "review"` (one per individual review)

Every dataset row from a reviews URL is a single review. Product info (`productId`, `productName`, `reviewsUrl`, `productUrl`, `reviewUrl`) and the **thread-level aggregate** (`productAggregate`) are denormalized onto every row, so each row is fully self-contained — no joining required for CSV consumers.

Per-review fields come directly from Capterra's embedded RSC JSON (one structured object per review, decoded brace-by-brace), giving **100% fill on every field the source provides**: reviewer name / job title / industry / company size / profile pic / LinkedIn verification, all four sub-ratings, recommendation score (0-10), incentivization status, plus structured `alternativeProducts[]` and `switchedProducts[]` arrays for the competitor products each reviewer mentioned, plus the vendor's reply when one exists.

The actor **auto-paginates** through every review page (following each page's `?page=N` link). `productAggregate.totalReviews` and `productAggregate.pagesScraped` reflect the full thread up to `maxReviewsPerProduct`. On SmartPM that's 46 reviews across 2 pages; on bigger products it can be hundreds across 10+ pages.

```jsonc
{
  "rowType":             "review",
  "reviewsUrl":          "https://www.capterra.com/p/137005/SmartPM/reviews/",
  "productId":           "137005",
  "productName":         "SmartPM",

  // ── Per-review fields ────────────────────────────────────────────
  "reviewId":            "Capterra___6325867",                // stable primary key
  "reviewUrl":           "https://www.capterra.com/p/137005/x/#Capterra___6325867",
  "productUrl":          "https://www.capterra.com/p/137005/x/",
  "title":               "SmartPM Elevates Company Schedule Performance & Health",
  "overallRating":       5,
  "recommendationScore": 10,                                  // 0–10 NPS, when provided
  "subRatings":          { "easeOfUse": 5, "customerService": 5, "features": 5, "valueForMoney": 5 },
  "text":                "Our SmartPM experience has been fantastic - integration, support, implementation - everything!",
  "pros":                "SmartPM has already proved itself to be invaluable to our company…",
  "cons":                "Some advanced features have a learning curve…",
  "adviceToOthers":      null,                                // Optional "Advice to others" text
  "reasonsForChoosing":  null,                                // "Reasons for Choosing" (~20% fill)
  "reasonsForSwitching": null,                                // "Reasons for Switching" (~11% fill)
  "switchedFrom":        "Deltek Acumen",                     // Name of previous tool (= switchedProducts[0].productName)
  "switchedProducts": [                                       // Full structured list (rare — 4% fill on SmartPM)
    { "productId": "10001903", "productSlug": "Deltek-Acumen", "productName": "Deltek Acumen" }
  ],
  "alternativeProducts": [                                    // Competitors the reviewer ALSO evaluated (9% fill on SmartPM)
    { "productId": "10012584", "productSlug": "Schedule-Validator", "productName": "Schedule Validator" }
  ],
  "publishedAt":         "May 31, 2024",
  "timeUsed":            "1-2 years",                         // "2+ years" / "6-12 months" / "Less than 6 months" / "1-2 years"
  "frequencyOfUse":      null,
  "reviewer": {
    "name":              "John M.",                           // null when reviewer is anonymous ("Verified User")
    "jobTitle":          "Scheduler",
    "industry":          "Construction",
    "companySize":       "201-500 employees",                 // "Self-employed" / "1-10 employees" / … / "10,001+ employees"
    "profilePicUrl":     "https://reviews.capterra.com/cdn/profile-images/linkedin/e7e3268d2b...jpeg",
    "verifiedLinkedIn":  false,
    "validationsPassed": ["ProofOfLink"]                      // Methods Capterra used to validate the reviewer
  },
  "verified":            true,                                // Capterra "isValidated" flag
  "incentivization":     "vendor-referred-incentive",         // "none" | "vendor-referred" | "nominal-gift" | "vendor-referred-incentive" | "unknown"
  "source":              "capterra:incentivized",             // back-compat string form of `incentivization`
  "reviewSourceTooltip": "Vendor Referred - Incentive Offered: This reviewer was invited by the software vendor…",
  "vendorResponse": {                                         // null when the vendor didn't reply (typical)
    "date":              "July 11, 2022",
    "text":              "Thanks Joe! It's been great working with you!"
  },

  // ── Thread-level aggregate (same on every review row from the same URL) ──
  "productAggregate": {
    "rating":            { "average": 4.9, "count": 46, "bestRating": 5 },
    "subRatingAverages": { "easeOfUse": 4.7, "customerService": 5, "features": 4.8, "valueForMoney": 4.9 },
    "ratingDistribution": [
      { "stars": 5, "count": 21, "percentage": 84 },
      { "stars": 4, "count":  2, "percentage":  8 },
      { "stars": 3, "count":  2, "percentage":  8 },
      { "stars": 2, "count":  0, "percentage":  0 },
      { "stars": 1, "count":  0, "percentage":  0 }
    ],
    "commonPros":        [],
    "commonCons":        [],
    "totalReviews":      46,
    "pagesScraped":      2
  },

  "scrapedAt":           "2026-05-12T10:55:14.123Z"
}
```
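
If you post-process the JSON output yourself and need one column per field, the nested `reviewer` and `productAggregate` objects can be flattened into dotted column names. The sketch below is one possible approach, not the actor's own export logic: it recurses into objects and serializes arrays as JSON strings. The `flatten` helper is hypothetical.

```python
import json


def flatten(row, prefix=""):
    """Flatten nested dicts into dotted keys; keep lists as JSON strings."""
    flat = {}
    for key, value in row.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        elif isinstance(value, list):
            flat[name] = json.dumps(value, ensure_ascii=False)
        else:
            flat[name] = value
    return flat


review = {
    "rowType": "review",
    "overallRating": 5,
    "reviewer": {"name": "John M.", "companySize": "201-500 employees"},
    "productAggregate": {"rating": {"average": 4.9, "count": 46}, "commonPros": []},
}
print(flatten(review))
```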

#### `rowType: "category-product"` (one per product listed in the category)

Same flat-with-nested-aggregate pattern as reviews. Every dataset row from a category URL is a single product listed in that category. The category-level aggregate (name, slug, description, total count, related categories) is denormalized under `categoryAggregate` on every row.

```jsonc
{
  "rowType":             "category-product",
  "categoryUrl":         "https://www.capterra.com/project-management-software/",
  "categorySlug":        "project-management",

  // ── Per-product fields (different on every row) ──────────────────
  "productId":           "120390",
  "productUrl":          "https://www.capterra.com/p/120390/Teamwork-Projects/",
  "name":                "Teamwork Projects",
  "description":         null,
  "rating":              { "average": 4.5, "count": 1234, "bestRating": 5 },
  "vendor":              null,
  "logoUrl":             "https://gdm-catalog-fmapi-prod.imgix.net/…",
  "pricingStarts":       null,
  "sponsored":           false,

  // ── Category-level aggregate (same on every category-product row from the same URL) ──
  "categoryAggregate": {
    "categoryUrl":       "https://www.capterra.com/project-management-software/",
    "categorySlug":      "project-management",
    "categoryName":      "Project Management Software",
    "description":       "Project management software helps teams plan, track, and deliver work …",
    "totalProductsOnPage": 42,
    "relatedCategories": [
      { "name": "Task Management Software", "url": "https://www.capterra.com/task-management-software/" }
      /* up to 30 */
    ]
  },

  "scrapedAt":           "2026-05-12T..."
}
```

#### `rowType: "compare"`

```jsonc
{
  "rowType":     "compare",
  "compareUrl":  "https://www.capterra.com/compare/137005-229614/SmartPM-vs-Notion/",
  "products": [
    { "productId": "137005", "name": "SmartPM", "rating": null, "vendor": null, ... },
    { "productId": "229614", "name": "Notion",  "rating": null, "vendor": null, ... }
  ],
  "summary": {
    "ratings":     [ { "average": 4.9, "count": 46, "bestRating": 5 }, { "average": 4.7, "count": 5000, "bestRating": 5 } ],
    "pricingStarts": [ "$49", "Free" ],
    "vendors":     [ "SmartPM", "Notion Labs" ],
    "deployments": [ ["Cloud"], ["Cloud", "Mobile"] ]
  }
}
```

> **Compare rows are populated by per-product fetch.** Capterra's `/compare/...` pages are 100% client-rendered SPAs — their HTML carries no product data. Instead we read the product IDs from the URL itself and fetch each compared product's own page through the warm session, giving full ratings + vendor + pricing + deployments per side-by-side product. Cost: N extra HTTP calls per compare URL (where N = number of products compared, usually 2–3).
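
The first step of that per-product fetch, reading the product IDs from the compare URL, might look like the sketch below. It assumes the `{ids}` segment is a hyphen-separated list of numeric IDs (as in `137005-229614`) and builds product URLs with the `/x/` placeholder slug that appears in the review rows above. The function name and regex are illustrative, not the actor's code.

```python
import re
from urllib.parse import urlparse

# Assumption: IDs are numeric and joined by hyphens, e.g. /compare/137005-229614/...
COMPARE_RE = re.compile(r"^/compare/(\d+(?:-\d+)+)/")


def compare_product_urls(compare_url):
    """Return one product URL per ID found in a Capterra compare URL."""
    path = urlparse(compare_url).path
    match = COMPARE_RE.match(path)
    if not match:
        raise ValueError(f"not a compare URL: {compare_url}")
    ids = match.group(1).split("-")
    # The slug is decorative (Capterra routes by numeric ID), so "x" is enough.
    return [f"https://www.capterra.com/p/{pid}/x/" for pid in ids]


urls = compare_product_urls(
    "https://www.capterra.com/compare/137005-229614/SmartPM-vs-Notion/"
)
print(urls)
```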

***

### Pricing

Pay-per-event. **Both reviews and category products are billed flat** — every review row is one `outputrecord`, every category-product row is one `outputrecord`. Cost = `rows × outputrecord-price`. No nested-array arithmetic.

Event | When | Suggested rate
--- | --- | ---
`outputrecord` | Once per dataset row pushed — `product`, `review`, `category-product`, `compare`. The thread/category aggregate is nested INSIDE each flat row, not charged separately. | configured on the actor page
`additional-data` | Once per nested item inside a `product` or `compare` row (features, screenshots, alternatives, breadcrumbs, compared products). Reviews and category-products are NEVER billed here — each is its own paid `outputrecord` row. | **$0.75 per 1,000 items** ($0.00075 each)

#### What counts as `additional-data`

Only the URL kinds that still keep nested arrays bill here. Reviews and category products never do.

Row type | Items billed as `additional-data`
--- | ---
**Product row** | every feature + screenshot + review snippet + alternative + breadcrumb category
**Compare row** | every compared product
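
A rough cost estimate follows directly from the two event types. The sketch below assumes the `outputrecord` price equals the "from $0.90 / 1,000 results" headline rate (the actual rate is whatever is configured on the actor page) and uses the $0.75 per 1,000 `additional-data` rate from the table. The function and the item counts in the usage lines are hypothetical.

```python
def estimate_cost(output_rows, additional_items=0,
                  output_price_per_1000=0.90, additional_price_per_1000=0.75):
    """Return the estimated charge in USD for one run.

    output_rows:      dataset rows pushed (each one outputrecord event)
    additional_items: nested items inside product/compare rows
    """
    return (output_rows * output_price_per_1000
            + additional_items * additional_price_per_1000) / 1000


# 46 review rows: flat billing, nothing nested to count.
print(round(estimate_cost(46), 4))                       # 0.0414
# One product row with a hypothetical 30 nested items.
print(round(estimate_cost(1, additional_items=30), 4))   # 0.0234
```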

***

### What makes this richer than the competition

We surveyed all 16 Capterra actors on the Apify marketplace (May 2026). Most cover one URL surface only:

Capability | Competitor actors | **This actor**
--- | --- | ---
Product + reviews + category + compare in **one actor** | ❌ (split across 4+ SKUs) | ✅
URL auto-classification (paste any mix) | ❌ | ✅
Pure HTTP, no headless browser | mixed (some use Playwright) | ✅
No paid third-party CF-bypass service | mixed | ✅
Cookie-warmed sticky session pattern | ❌ | ✅
Per-product alternatives + screenshots in product row | ❌ (separate calls) | ✅
Flat-per-review billing (1 review = 1 paid row, no nested-array math) | mixed | ✅
Thread-level aggregate (rating distribution + sub-rating averages) **on every review row** | ❌ (none surface this at all) | ✅
RSC-stream embedded review JSON extraction — 100% fill on `reviewId`, `companySize`, `recommendationScore`, `incentivization`, etc. | partial (some fields 0%) | ✅
Capterra **awards & badges** (Best Value, Shortlist, Ease-of-Use) per product | ❌ | ✅
**Structured integrations** per product (name + logo + Capterra URL + review count, not just feature strings) | ❌ | ✅
Capterra-computed **review sentiment breakdown** (positive / neutral / negative %) | ❌ | ✅
Per-sub-rating averages **with reviewer counts** (more precise than mean alone) | partial | ✅

The top marketplace actor (`focused_vanguard/multi-platform-reviews-scraper`, 220 users) covers multiple sources (Capterra + G2 + Trustpilot + Gartner + Software Advice + Reddit) but only emits review data. We do everything Capterra exposes in one actor — products, reviews, categories, comparisons — without crossing into other platforms.

***

### Notes & limitations

- **Capterra routes by numeric product ID** — the slug in the URL is decorative. `/p/137005/Slack/` redirects to whatever product 137005 actually is (currently SmartPM).
- **Cloudflare gating.** We rely on residential-proxy IP rotation + cookie reuse to slip past CF's WAF. With Apify residential pool (US country), expect ~95% per-URL success on a normal-cost run.
- **Reviews pagination** is automatic — the actor follows `?page=N` links from each page's own HTML (not constructed URLs, since `/x/` placeholder slugs lose query-strings on redirect). Stops when no next-page link found or `maxReviewsPerProduct` is hit. Each individual review is then emitted as its own flat dataset row with the thread aggregate nested in `productAggregate`.
- **Category flattening.** Same pattern as reviews — every product listed in a category becomes its own dataset row with `categoryAggregate` nested. A category URL with 50 products fills 50 rows of the `maxItems` budget (capped by `maxProductsPerCategory`).
- **Compare extraction** uses per-product fetch — Capterra's compare pages are SPAs, so we read product IDs from the URL and fetch each compared product's full page through the warm session. Result: rich compare rows with real ratings, vendor info, pricing, and deployments per product.
- **No mobile app.** Capterra is web-only, so there is no mobile API that could be used to bypass the WAF.
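
The session handling described in "How it works" (reuse one warm session, retry per URL, re-warm after 3 consecutive 403s) can be illustrated with a logic-only sketch. This is not the actor's implementation: `open_session` is a hypothetical factory that would open a sticky-IP proxy session, warm it up, and return a `fetch(url)` callable yielding an HTTP status code. Here it is replaced by a fake so the control flow can be run offline.

```python
def fetch_all(urls, open_session, max_retries=8, refresh_after=3):
    """Fetch every URL through one warm session, re-warming after repeated 403s."""
    fetch = open_session()
    consecutive_403 = 0
    statuses = {}
    for url in urls:
        status = None
        for _attempt in range(max_retries + 1):  # first try + retries
            status = fetch(url)
            if status == 403:
                consecutive_403 += 1
                if consecutive_403 >= refresh_after:
                    fetch = open_session()  # drop the session, re-warm on a fresh IP
                    consecutive_403 = 0
            else:
                consecutive_403 = 0
            if status == 200:
                break
        statuses[url] = status
    return statuses


# Fake session factory: the first session is "burned" (always 403),
# every later session succeeds.
opened = []


def fake_open_session():
    opened.append(1)
    burned = len(opened) == 1
    return lambda url: 403 if burned else 200


result = fetch_all(["https://www.capterra.com/p/137005/x/"], fake_open_session)
print(result, "sessions opened:", len(opened))
```

In the fake run, the first three attempts return 403, the session is replaced, and the fourth attempt succeeds on the second session.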

***

### Support

Issues, feature requests, or custom output shapes? Open a ticket on the actor's Apify page or message the maintainer.

# Actor input Schema

## `startUrls` (type: `array`):

Paste any mix of:
• Product URLs — `https://www.capterra.com/p/{id}/{slug}/`
• Reviews URLs — `https://www.capterra.com/p/{id}/{slug}/reviews/`
• Category URLs — `https://www.capterra.com/{slug}-software/`
• Compare URLs — `https://www.capterra.com/compare/{ids}/{slug-a}-vs-{slug-b}/`

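Bare numeric product IDs (e.g. `137005`) and bare category slugs (e.g. `project-management`) also work. Each entry is auto-classified and emits the matching row type: one row per product or compare URL, one row per review for a reviews URL, and one row per listed product for a category URL.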
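
To check in advance how an entry is likely to be classified, the documented URL patterns can be matched with a few regular expressions. This is a sketch based only on the patterns listed above; the actor's real classifier may differ, and the `classify` function and its return labels are hypothetical.

```python
import re
from urllib.parse import urlparse


def classify(entry):
    """Guess the URL kind of a startUrls entry from the documented patterns."""
    entry = entry.strip()
    if entry.isdigit():
        return "product"    # bare numeric product ID, e.g. "137005"
    if "://" not in entry:
        return "category"   # bare category slug, e.g. "project-management"
    path = urlparse(entry).path
    # Check reviews before product: a reviews URL extends the product path.
    if re.fullmatch(r"/p/\d+/[^/]+/reviews/?", path):
        return "reviews"
    if re.fullmatch(r"/p/\d+/[^/]+/?", path):
        return "product"
    if re.fullmatch(r"/compare/[\d-]+/[^/]+/?", path):
        return "compare"
    if re.fullmatch(r"/[^/]+-software/?", path):
        return "category"
    return "unknown"


for entry in [
    "https://www.capterra.com/p/137005/SmartPM/",
    "https://www.capterra.com/p/137005/SmartPM/reviews/",
    "https://www.capterra.com/project-management-software/",
    "https://www.capterra.com/compare/137005-229614/SmartPM-vs-Notion/",
    "137005",
    "project-management",
]:
    print(classify(entry), entry)
```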

## `maxItems` (type: `integer`):

Safety cap on total dataset rows. Product / compare URLs each emit 1 row. A reviews URL emits one row per individual review (each carrying the thread-level aggregate under `productAggregate`); a category URL emits one row per product listed (each carrying the category-level aggregate under `categoryAggregate`). So a reviews URL with 100 reviews fills 100 rows, a category URL with 50 products fills 50 rows. Each emitted row = one paid `outputrecord` event.

## `maxReviewsPerProduct` (type: `integer`):

Maximum number of individual review rows emitted per reviews URL. Reviews pages can list hundreds; capping keeps the dataset size manageable for CSV consumers. Set 0 for no limit.

## `maxProductsPerCategory` (type: `integer`):

Maximum number of category-product rows emitted per category URL. Category index pages list up to ~100 products. Set 0 for no limit.

## `maxConcurrency` (type: `integer`):

Reserved for future use. Capterra requires a session-aware sticky-IP pattern, so the actor currently processes URLs sequentially through one warm session for highest pass rate.

## `minConcurrency` (type: `integer`):

Reserved for future use.

## `maxRequestRetries` (type: `integer`):

Maximum number of retries inside the warm session before giving up on a URL.

## `proxy` (type: `object`):

Apify residential proxy works well. The actor uses a sticky session (same IP for the whole run) plus cookie-jar reuse to ride past Cloudflare's deep-page WAF — see the README's 'How it works' section.

## Actor input object example

```json
{
  "startUrls": [
    "https://www.capterra.com/p/137005/SmartPM/",
    "https://www.capterra.com/p/137005/SmartPM/reviews/",
    "https://www.capterra.com/project-management-software/",
    "https://www.capterra.com/compare/137005-229614/SmartPM-vs-Notion/"
  ],
  "maxItems": 1000,
  "maxReviewsPerProduct": 100,
  "maxProductsPerCategory": 50,
  "maxConcurrency": 1,
  "minConcurrency": 1,
  "maxRequestRetries": 8,
  "proxy": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ],
    "apifyProxyCountry": "US"
  }
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        "https://www.capterra.com/p/137005/SmartPM/",
        "https://www.capterra.com/p/137005/SmartPM/reviews/",
        "https://www.capterra.com/project-management-software/",
        "https://www.capterra.com/compare/137005-229614/SmartPM-vs-Notion/"
    ],
    "proxy": {
        "useApifyProxy": true,
        "apifyProxyGroups": [
            "RESIDENTIAL"
        ],
        "apifyProxyCountry": "US"
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("memo23/capterra-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": [
        "https://www.capterra.com/p/137005/SmartPM/",
        "https://www.capterra.com/p/137005/SmartPM/reviews/",
        "https://www.capterra.com/project-management-software/",
        "https://www.capterra.com/compare/137005-229614/SmartPM-vs-Notion/",
    ],
    "proxy": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"],
        "apifyProxyCountry": "US",
    },
}

# Run the Actor and wait for it to finish
run = client.actor("memo23/capterra-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    "https://www.capterra.com/p/137005/SmartPM/",
    "https://www.capterra.com/p/137005/SmartPM/reviews/",
    "https://www.capterra.com/project-management-software/",
    "https://www.capterra.com/compare/137005-229614/SmartPM-vs-Notion/"
  ],
  "proxy": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ],
    "apifyProxyCountry": "US"
  }
}' |
apify call memo23/capterra-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=memo23/capterra-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Capterra 0.9$💰 Reviews Products, Categories & Compare Scraper",
        "description": "[Only 0.9$💰] Capterra all-in-one scraper: product, reviews, category & compare URLs in one actor. Flat per-review rows (100% field fill: reviewer name, company size, recommendation score, incentivization), Capterra awards & badges, sentiment breakdown, structured integrations. Pure HTTP, no browser",
        "version": "0.1",
        "x-build-id": "mmRKl5zSihQJnwkpA"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/memo23~capterra-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-memo23-capterra-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/memo23~capterra-scraper/runs": {
            "post": {
                "operationId": "runs-sync-memo23-capterra-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/memo23~capterra-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-memo23-capterra-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "startUrls"
                ],
                "properties": {
                    "startUrls": {
                        "title": "Capterra URLs",
                        "type": "array",
                        "description": "Paste any mix of:\n• Product URLs: `https://www.capterra.com/p/{id}/{slug}/`\n• Reviews URLs: `https://www.capterra.com/p/{id}/{slug}/reviews/`\n• Category URLs: `https://www.capterra.com/{slug}-software/`\n• Compare URLs: `https://www.capterra.com/compare/{ids}/{slug-a}-vs-{slug-b}/`\n\nBare numeric product IDs (e.g. `137005`) and bare category slugs (e.g. `project-management`) also work. Each entry is auto-classified and emits the matching row type: one row per product or compare URL, one row per review for a reviews URL, and one row per listed product for a category URL.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxItems": {
                        "title": "Max dataset rows to emit",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Safety cap on total dataset rows. Product / compare URLs each emit 1 row. A reviews URL emits one row per individual review (each carrying the thread-level aggregate under `productAggregate`); a category URL emits one row per product listed (each carrying the category-level aggregate under `categoryAggregate`). So a reviews URL with 100 reviews fills 100 rows, a category URL with 50 products fills 50 rows. Each emitted row = one paid output-record event.",
                        "default": 1000
                    },
                    "maxReviewsPerProduct": {
                        "title": "Max reviews per product",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Maximum number of individual review rows emitted per reviews URL. Reviews pages can list hundreds; capping keeps the dataset manageable for CSV consumers. Set 0 for no limit.",
                        "default": 100
                    },
                    "maxProductsPerCategory": {
                        "title": "Max products per category",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Maximum number of category-product rows emitted per category URL. Category index pages list up to ~100 products. Set 0 for no limit.",
                        "default": 50
                    },
                    "maxConcurrency": {
                        "title": "Max concurrency",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Reserved for future use. Capterra requires a session-aware sticky-IP pattern, so the Actor currently processes URLs sequentially through one warm session for the highest pass rate.",
                        "default": 1
                    },
                    "minConcurrency": {
                        "title": "Min concurrency",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Reserved for future use.",
                        "default": 1
                    },
                    "maxRequestRetries": {
                        "title": "Max request retries",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Maximum number of retries inside the warm session before giving up on a URL.",
                        "default": 8
                    },
                    "proxy": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Apify residential proxy works well. The Actor uses a sticky session (same IP for the whole run) plus cookie-jar reuse to get past Cloudflare's deep-page WAF; see the README's 'How it works' section.",
                        "default": {
                            "useApifyProxy": true,
                            "apifyProxyGroups": [
                                "RESIDENTIAL"
                            ]
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
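
The `run-sync-get-dataset-items` path in the specification above can also be called without the client library. Here is a minimal sketch using only the Python standard library: it builds a POST request with the token as a query parameter and the input as a JSON body, as the spec describes. The function names and the timeout value are illustrative choices. The spec defines no response schema for this path beyond `200 OK`, so the sketch assumes the body is JSON and parses it as such.

```python
import json
import urllib.parse
import urllib.request

# Server URL and path taken from the OpenAPI specification above.
BASE_URL = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/memo23~capterra-scraper/run-sync-get-dataset-items"


def build_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request described by the spec."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{BASE_URL}{ACTOR_PATH}?{query}",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_and_fetch_items(token: str, run_input: dict, timeout: float = 300.0):
    """Send the request and return the parsed JSON response body.

    This starts a real, billable Actor run. The timeout is an arbitrary
    client-side choice, not a value taken from the spec.
    """
    request = build_request(token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Inspect the request without sending it.
req = build_request("<YOUR_API_TOKEN>", {"startUrls": ["137005"]})
print(req.get_method(), req.full_url)
```

Keeping request construction separate from sending makes the URL and body easy to inspect or test without starting a run. For long scrapes, the asynchronous `/runs` path in the same spec may be a better fit than waiting on a single HTTP response.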
