Capterra 0.9$💰 Reviews Products, Categories & Compare Scraper

Pricing

from $0.90 / 1,000 results

[Only 0.9$💰] Capterra all-in-one scraper: product, reviews, category & compare URLs in one actor. Flat per-review rows (100% field fill: reviewer name, company size, recommendation score, incentivization), Capterra awards & badges, sentiment breakdown, structured integrations. Pure HTTP, no browser.

Rating

0.0

(0)

Developer

Muhamed Didovic

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 2
Monthly active users: 1
Last modified: 3 days ago

Capterra Software Reviews, Products & Categories Scraper

How it works

All-in-one Capterra.com scraper. Paste any mix of URL kinds in the same input list and the actor auto-classifies each one:

| Input | Row(s) emitted |
| --- | --- |
| Product URL: /p/{id}/{slug}/ | 1 product row: name, description, application category, rating, vendor, pricing, features, screenshots, alternatives, review snippets, awards & badges (Best Value, Shortlist, Ease of Use), structured integrations (with logos + Capterra URLs), per-sub-rating averages with reviewer counts, Capterra-computed sentiment breakdown |
| Reviews URL: /p/{id}/{slug}/reviews/ | N review rows, one per individual review (rowType: "review"). Each row carries a nested productAggregate object with the thread-level rating distribution, sub-rating averages, total reviews, etc. No separate summary row. |
| Category URL: /{slug}-software/ (e.g. /project-management-software/) | N category-product rows, one per product in the category (rowType: "category-product"). Each row carries a nested categoryAggregate with the category name, description, total count, and related categories. Same flat-with-nested-aggregate pattern as reviews. |
| Compare URL: /compare/{ids}/{slug-a}-vs-{slug-b}/ | 1 compare row: side-by-side products with ratings, pricing, deployments |

Bare numeric product IDs (137005) and bare category slugs (project-management) also work. JSON + CSV.
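
As a rough illustration of the auto-classification described above, the following Python sketch maps an input entry to one of the four URL kinds. The regular expressions are inferred from the URL shapes in the table, and the function name `classify` is hypothetical; the actor's real classifier is not published and may differ.

```python
import re
from urllib.parse import urlparse

# Patterns inferred from the documented URL shapes (an assumption, not the
# actor's actual rules). Order matters: reviews is checked before product.
_PATTERNS = (
    ("reviews", re.compile(r"^/p/(\d+)/[^/]+/reviews/?$")),
    ("product", re.compile(r"^/p/(\d+)/[^/]+/?$")),
    ("compare", re.compile(r"^/compare/(\d+(?:-\d+)+)/[^/]+/?$")),
    ("category", re.compile(r"^/([a-z0-9-]+)-software/?$")),
)


def classify(entry: str) -> tuple:
    """Return (kind, key) for a URL, a bare product ID, or a bare category slug."""
    entry = entry.strip()
    if entry.isdigit():
        # Bare numeric product ID, e.g. "137005"
        return ("product", entry)
    if "/" not in entry:
        # Bare category slug, e.g. "project-management"
        return ("category", entry.removesuffix("-software"))
    path = urlparse(entry).path
    for kind, pattern in _PATTERNS:
        match = pattern.match(path)
        if match:
            return (kind, match.group(1))
    return ("unknown", entry)
```

Checking the reviews pattern before the product pattern keeps `/p/{id}/{slug}/reviews/` from being mistaken for a plain product URL.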

Reviews are flat with the aggregate denormalized into every row. Every dataset row from a reviews URL is a single review. The thread-level aggregate (rating distribution, sub-rating averages, total reviews) lives under each row's productAggregate field. It is duplicated on every review row from the same URL, but that keeps each row self-contained and makes billing trivial: one row = one paid outputrecord event.

Pure HTTP. No Puppeteer, no Playwright, no headless Chromium, no third-party Cloudflare-bypass service.


Capterra is Cloudflare-gated with stronger deep-page protection than most B2B-software sites. The actor breaks past it without a browser via a session-aware HTTP pattern:

  1. Warm session. Open one sticky-IP residential proxy session, hit a known-good product page, retry until 200. Cloudflare sets __cf_bm + _cfuvid cookies on success, which the impit cookie jar captures automatically.
  2. Iterate URLs. Reuse the same Impit instance + sticky IP for every subsequent URL. Cookie + IP persistence pushes per-URL pass rate from a cold ~25% to a warm ~80%, with per-URL retry bringing cumulative success to ~95%.
  3. Auto-refresh. On 3 consecutive 403s, drop the session and re-warm with a fresh IP. No human intervention needed.
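
The control flow of the three steps above can be sketched in Python as follows. This is only an outline under stated assumptions: `Fetch` and `new_session` are hypothetical stand-ins for an HTTP client bound to one sticky proxy session (the actor itself uses impit), and the defaults mirror the documented retry budget of 8 and the 3-consecutive-403 refresh rule.

```python
from typing import Callable, Dict, Iterable

# Hypothetical: a callable bound to one cookie jar + sticky IP that performs
# a GET and returns the HTTP status code.
Fetch = Callable[[str], int]


def crawl(urls: Iterable[str], new_session: Callable[[], Fetch], warm_url: str,
          max_retries: int = 8, max_403_streak: int = 3) -> Dict[str, bool]:
    """Warm a session, reuse it for every URL, re-warm after repeated 403s."""

    def warm() -> Fetch:
        # Step 1: open one session and retry the known-good page until 200.
        fetch = new_session()
        for _ in range(max_retries):
            if fetch(warm_url) == 200:
                return fetch
        raise RuntimeError("could not warm a session")

    fetch = warm()
    streak = 0  # consecutive 403 responses seen on the current session
    results: Dict[str, bool] = {}
    for url in urls:
        results[url] = False
        # Step 2: per-URL retry budget inside the warm session.
        for _ in range(max_retries):
            status = fetch(url)
            if status == 403:
                streak += 1
                if streak >= max_403_streak:
                    # Step 3: drop the session and re-warm on a fresh one.
                    fetch, streak = warm(), 0
                continue
            streak = 0
            if status == 200:
                results[url] = True
                break
    return results
```

Passing the session factory in as a parameter keeps the retry logic testable without any network access.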

Every page is then parsed via JSON-LD (the rock-solid SoftwareApplication schema + BreadcrumbList) plus targeted Cheerio walks of the visible HTML for sections (Features, Pricing, Integrations, Support, Alternatives, User reviews). No DOM execution; no DOMContentLoaded wait.
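
To illustrate the JSON-LD step, here is a minimal standard-library Python sketch that pulls a schema.org node of a given `@type` out of raw HTML. The helper name `find_json_ld` is hypothetical, and the sample markup in the usage line is invented for illustration; the actor itself pairs this kind of extraction with Cheerio walks of the visible HTML.

```python
import json
from html.parser import HTMLParser
from typing import Optional


class _JsonLdCollector(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> block."""

    def __init__(self) -> None:
        super().__init__()
        self._inside = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._inside = True
            self.blocks.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.blocks[-1] += data


def find_json_ld(html: str, schema_type: str) -> Optional[dict]:
    """Return the first JSON-LD node whose @type equals schema_type, else None."""
    collector = _JsonLdCollector()
    collector.feed(html)
    for raw in collector.blocks:
        try:
            doc = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the page
        if isinstance(doc, dict):
            # A block may hold one node or wrap several in "@graph".
            doc = doc.get("@graph", [doc])
        if not isinstance(doc, list):
            continue
        for node in doc:
            if isinstance(node, dict) and node.get("@type") == schema_type:
                return node
    return None
```

For example, `find_json_ld(page_html, "SoftwareApplication")` would return the product node, and the same call with `"BreadcrumbList"` would return the breadcrumb node when the page carries one.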


Input

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| startUrls | string[] | yes | Any mix of product / reviews / category / compare URLs (or bare product IDs / category slugs). Auto-classified. |
| maxItems | integer | no | Safety cap on total dataset rows, not URLs. A reviews URL with 100 reviews emits 100 review rows; a category URL with 50 products emits 50 category-product rows. Default 1000. Free-tier users are capped at 100. |
| maxReviewsPerProduct | integer | no | Max individual review rows emitted per reviews URL. Default 100, 0 = no limit. |
| maxProductsPerCategory | integer | no | Max individual category-product rows emitted per category URL. Default 50, 0 = no limit. |
| maxRequestRetries | integer | no | Per-URL retry budget inside the warm session. Default 8. |
| proxy | object | no | Apify residential proxy recommended (US country). Required because Capterra rejects datacenter IPs. |

Example input

{
"startUrls": [
"https://www.capterra.com/p/137005/SmartPM/",
"https://www.capterra.com/p/137005/SmartPM/reviews/",
"https://www.capterra.com/project-management-software/",
"https://www.capterra.com/compare/137005-229614/SmartPM-vs-Notion/"
],
"maxItems": 50,
"maxReviewsPerProduct": 100,
"maxProductsPerCategory": 50,
"proxy": { "useApifyProxy": true, "apifyProxyGroups": ["RESIDENTIAL"], "apifyProxyCountry": "US" }
}

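Uncapped, that input yields 90 rows on SmartPM: 1 product + 46 individual review rows (each with productAggregate nested) + 42 category-product rows from /project-management-software/ (each with categoryAggregate nested) + 1 compare. Because maxItems caps total dataset rows, the value of 50 shown above would stop the run at 50 rows; raise it to at least 90 to collect everything. Use maxProductsPerCategory / maxReviewsPerProduct to cap the per-URL fan-out.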


Output schema

Four row shapes, distinguished by the rowType field: product, review, category-product, compare.

rowType: "product"

{
"rowType": "product",
"productUrl": "https://www.capterra.com/p/137005/SmartPM/",
"productId": "137005",
"name": "SmartPM",
"description": "SmartPM is an AI-powered Automated Project Controls™ platform built by construction for construction…",
"applicationCategory": "BusinessApplication",
"categories": ["Construction Software", "Project Management Software"],
"rating": {
"average": 4.9,
"count": 46,
"bestRating": 5
},
"vendor": {
"name": "SmartPM",
"websiteUrl": null
},
"pricingStarts": "$49",
"logoUrl": "https://gdm-catalog-fmapi-prod.imgix.net/ProductLogo/7c88b5d2-…",
"screenshotUrls": [ /* up to ~22 imgix URLs */ ],
"features": [ /* visible feature list */ ],
"deployments": [ "Cloud", "Web-Based" ],
"trainingOptions": [ "Live Online", "Documentation", "Webinars" ],
"supportOptions": [ "Email/Help Desk", "Phone Support", "Chat" ],
"customerTypes": [ "Small Business", "Mid-size", "Enterprise" ],
"reviewSnippets": [
{
"title": "SmartPM, an amazing tool for schedule analysis",
"pros": "I had a 2 year experience using SmartPM…",
"cons": "Some advanced features have a learning curve…",
"overallRating": null,
"reviewer": { "name": null, "jobTitle": null, "industry": null, "companySize": null },
"source": "capterra"
}
/* up to 10 surfaced on the product page */
],
"alternatives": [
{ "productId": "183213", "productUrl": "https://www.capterra.com/p/183213/Contractor-Foreman/", "name": "Contractor Foreman", "rating": { "average": null, "count": null, "bestRating": null }, "vendor": null, "logoUrl": null, "pricingStarts": null, "sponsored": null }
/* + ~4 more */
],
"breadcrumbs": [
{ "name": "Capterra", "url": "https://www.capterra.com/" },
{ "name": "Construction Software", "url": "https://www.capterra.com/construction-software/" },
{ "name": "SmartPM", "url": "https://www.capterra.com/p/137005/SmartPM/" }
],
// ── RSC-derived rich data (from Capterra's embedded product JSON) ──
"awards": [
{
"title": "2025 Capterra Best Value for Money",
"badgeUrl": "/public-bx-capterra-v0/badge-best-value.png",
"badgeAlt": "Best Value Badge",
"additionalInfo": "and in",
"linkText": "+4 categories",
"linkAction": "best-value"
}
],
"valueBadges": [
{ "name": "Business Intelligence", "year": "2025", "categorySlug": "business-intelligence-software" },
{ "name": "Construction Management", "year": "2025", "categorySlug": "construction-management-software" },
{ "name": "Data Analysis", "year": "2025", "categorySlug": "data-analysis-software" },
{ "name": "Project Planning", "year": "2025", "categorySlug": "project-planning-software" }
],
"shortlistBadges": [], // populated when product is in Capterra Shortlist reports
"easeOfUseBadges": [], // populated when product holds "Best Ease of Use" awards
"integrations": [
{
"name": "Microsoft Power BI",
"id": "e3df9c2d-1567-427c-acb2-a6d200b52931",
"logo": "https://gdm-catalog-fmapi-prod.imgix.net/ProductLogo/e48bb429-…png",
"href": "https://www.capterra.com/p/176586/Power-BI/",
"reviewCount": 1878
}
/* + 7 more */
],
"usedFor": [
{ "name": "Construction Scheduling", "slug": "construction-scheduling-software" },
{ "name": "Construction Management", "slug": "construction-management-software" }
/* + 6 more */
],
"pricing": {
"startingPrice": null, // raw value; may be blank when not disclosed
"startingPriceCurrency": "$$",
"startingPriceUnit": "",
"pricingModel": "Other",
"paymentFrequency": "Per Month",
"hasFreeTrial": true,
"hasFreeVersion": false
},
"subRatingAverages": { // Capterra's authoritative per-sub-rating means + counts
"easeOfUse": { "rating": 4.7, "reviews": 46 },
"customerService": { "rating": 5, "reviews": 45 },
"valueForMoney": { "rating": 4.9, "reviews": 42 }
},
"reviewSentiment": { // Capterra-computed sentiment breakdown
"totalReviews": 46,
"overallRating": 4.87,
"positive": { "count": 96, "percentage": 96 },
"neutral": { "count": 4, "percentage": 4 },
"negative": { "count": 0, "percentage": 0 }
},
"scrapedAt": "2026-05-12T13:35:00.412Z"
}

rowType: "review" (one per individual review)

Every dataset row from a reviews URL is a single review. Product info (productId, productName, reviewsUrl, productUrl, reviewUrl) and the thread-level aggregate (productAggregate) are denormalized onto every row, so each row is fully self-contained: no joining required for CSV consumers.

Per-review fields come directly from Capterra's embedded RSC JSON (one structured object per review, decoded brace-by-brace), giving 100% fill on every field the source provides: reviewer name / job title / industry / company size / profile pic / LinkedIn verification, all four sub-ratings, recommendation score (0-10), incentivization status, plus structured alternativeProducts[] and switchedProducts[] arrays for the competitor products each reviewer mentioned, plus the vendor's reply when one exists.

The actor auto-paginates through every review page (following each page's ?page=N link). productAggregate.totalReviews and productAggregate.pagesScraped reflect the full thread up to maxReviewsPerProduct. On SmartPM that's 46 reviews across 2 pages; on bigger products it can be hundreds across 10+ pages.

{
"rowType": "review",
"reviewsUrl": "https://www.capterra.com/p/137005/SmartPM/reviews/",
"productId": "137005",
"productName": "SmartPM",
// ── Per-review fields ────────────────────────────────────────────
"reviewId": "Capterra___6325867", // stable primary key
"reviewUrl": "https://www.capterra.com/p/137005/x/#Capterra___6325867",
"productUrl": "https://www.capterra.com/p/137005/x/",
"title": "SmartPM Elevates Company Schedule Performance & Health",
"overallRating": 5,
"recommendationScore": 10, // 0–10 NPS, when provided
"subRatings": { "easeOfUse": 5, "customerService": 5, "features": 5, "valueForMoney": 5 },
"text": "Our SmartPM experience has been fantastic - integration, support, implementation - everything!",
"pros": "SmartPM has already proved itself to be invaluable to our company…",
"cons": "Some advanced features have a learning curve…",
"adviceToOthers": null, // Optional "Advice to others" text
"reasonsForChoosing": null, // "Reasons for Choosing" (~20% fill)
"reasonsForSwitching": null, // "Reasons for Switching" (~11% fill)
"switchedFrom": "Deltek Acumen", // Name of previous tool (= switchedProducts[0].productName)
"switchedProducts": [ // Full structured list (rare: 4% fill on SmartPM)
{ "productId": "10001903", "productSlug": "Deltek-Acumen", "productName": "Deltek Acumen" }
],
"alternativeProducts": [ // Competitors the reviewer ALSO evaluated (9% fill on SmartPM)
{ "productId": "10012584", "productSlug": "Schedule-Validator", "productName": "Schedule Validator" }
],
"publishedAt": "May 31, 2024",
"timeUsed": "1-2 years", // "2+ years" / "6-12 months" / "Less than 6 months" / "1-2 years"
"frequencyOfUse": null,
"reviewer": {
"name": "John M.", // null when reviewer is anonymous ("Verified User")
"jobTitle": "Scheduler",
"industry": "Construction",
"companySize": "201-500 employees", // "Self-employed" / "1-10 employees" / … / "10,001+ employees"
"profilePicUrl": "https://reviews.capterra.com/cdn/profile-images/linkedin/e7e3268d2b...jpeg",
"verifiedLinkedIn": false,
"validationsPassed": ["ProofOfLink"] // Methods Capterra used to validate the reviewer
},
"verified": true, // Capterra "isValidated" flag
"incentivization": "vendor-referred-incentive", // "none" | "vendor-referred" | "nominal-gift" | "vendor-referred-incentive" | "unknown"
"source": "capterra:incentivized", // back-compat string form of `incentivization`
"reviewSourceTooltip": "Vendor Referred - Incentive Offered: This reviewer was invited by the software vendor…",
"vendorResponse": { // null when the vendor didn't reply (typical)
"date": "July 11, 2022",
"text": "Thanks Joe! It's been great working with you!"
},
// ── Thread-level aggregate (same on every review row from the same URL) ──
"productAggregate": {
"rating": { "average": 4.9, "count": 46, "bestRating": 5 },
"subRatingAverages": { "easeOfUse": 4.7, "customerService": 5, "features": 4.8, "valueForMoney": 4.9 },
"ratingDistribution": [
{ "stars": 5, "count": 21, "percentage": 84 },
{ "stars": 4, "count": 2, "percentage": 8 },
{ "stars": 3, "count": 2, "percentage": 8 },
{ "stars": 2, "count": 0, "percentage": 0 },
{ "stars": 1, "count": 0, "percentage": 0 }
],
"commonPros": [],
"commonCons": [],
"totalReviews": 46,
"pagesScraped": 2
},
"scrapedAt": "2026-05-12T10:55:14.123Z"
}
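
Consumers who prefer a normalized layout can undo the denormalization after download. The Python sketch below is one possible approach, not part of the actor: the helper `split_review_rows` is hypothetical and relies only on the rowType, reviewsUrl, and productAggregate fields shown in the sample row.

```python
from typing import Dict, Iterable, List, Tuple


def split_review_rows(rows: Iterable[dict]) -> Tuple[List[dict], Dict[str, dict]]:
    """Split flat review rows into (reviews, aggregates keyed by reviewsUrl)."""
    reviews: List[dict] = []
    aggregates: Dict[str, dict] = {}
    for row in rows:
        if row.get("rowType") != "review":
            continue  # ignore product / category-product / compare rows
        # Keep every per-review field, drop the duplicated aggregate.
        reviews.append({k: v for k, v in row.items() if k != "productAggregate"})
        # The aggregate is identical on every row from the same URL, so the
        # first copy seen per reviewsUrl is enough.
        aggregates.setdefault(row["reviewsUrl"], row.get("productAggregate"))
    return reviews, aggregates
```

Because reviewId is described as a stable primary key, the resulting review list can be deduplicated or joined on it across runs.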

rowType: "category-product" (one per product listed in the category)

Same flat-with-nested-aggregate pattern as reviews. Every dataset row from a category URL is a single product listed in that category. The category-level aggregate (name, slug, description, total count, related categories) is denormalized under categoryAggregate on every row.

{
"rowType": "category-product",
"categoryUrl": "https://www.capterra.com/project-management-software/",
"categorySlug": "project-management",
// ── Per-product fields (different on every row) ──────────────────
"productId": "120390",
"productUrl": "https://www.capterra.com/p/120390/Teamwork-Projects/",
"name": "Teamwork Projects",
"description": null,
"rating": { "average": 4.5, "count": 1234, "bestRating": 5 },
"vendor": null,
"logoUrl": "https://gdm-catalog-fmapi-prod.imgix.net/…",
"pricingStarts": null,
"sponsored": false,
// ── Category-level aggregate (same on every category-product row from the same URL) ──
"categoryAggregate": {
"categoryUrl": "https://www.capterra.com/project-management-software/",
"categorySlug": "project-management",
"categoryName": "Project Management Software",
"description": "Project management software helps teams plan, track, and deliver work …",
"totalProductsOnPage": 42,
"relatedCategories": [
{ "name": "Task Management Software", "url": "https://www.capterra.com/task-management-software/" }
/* up to 30 */
]
},
"scrapedAt": "2026-05-12T..."
}

rowType: "compare"

{
"rowType": "compare",
"compareUrl": "https://www.capterra.com/compare/137005-229614/SmartPM-vs-Notion/",
"products": [
{ "productId": "137005", "name": "SmartPM", "rating": null, "vendor": null, ... },
{ "productId": "229614", "name": "Notion", "rating": null, "vendor": null, ... }
],
"summary": {
"ratings": [ { "average": 4.9, "count": 46, "bestRating": 5 }, { "average": 4.7, "count": 5000, "bestRating": 5 } ],
"pricingStarts": [ "$49", "Free" ],
"vendors": [ "SmartPM", "Notion Labs" ],
"deployments": [ ["Cloud"], ["Cloud", "Mobile"] ]
}
}

Compare rows are populated by per-product fetch. Capterra's /compare/... pages are 100% client-rendered SPAs whose HTML carries no product data. Instead we read the product IDs from the URL itself and fetch each compared product's own page through the warm session, giving full ratings + vendor + pricing + deployments per side-by-side product. Cost: N extra HTTP calls per compare URL (where N = number of products compared, usually 2–3).
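
Reading the product IDs out of a compare URL might look like the Python sketch below. Both helper names are hypothetical. The `/x/` placeholder slug is an assumption borrowed from the productUrl values in the review sample, which works because Capterra routes by numeric ID rather than by slug.

```python
import re
from typing import List
from urllib.parse import urlparse


def compared_product_ids(compare_url: str) -> List[str]:
    """Extract the hyphen-separated product IDs from a /compare/ URL."""
    match = re.match(r"^/compare/(\d+(?:-\d+)+)/", urlparse(compare_url).path)
    if not match:
        raise ValueError(f"not a compare URL: {compare_url}")
    return match.group(1).split("-")


def product_page_urls(compare_url: str) -> List[str]:
    """Build one product-page URL per compared product (placeholder slug)."""
    return [f"https://www.capterra.com/p/{pid}/x/"
            for pid in compared_product_ids(compare_url)]


url = "https://www.capterra.com/compare/137005-229614/SmartPM-vs-Notion/"
print(compared_product_ids(url))  # ['137005', '229614']
```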


Pricing

Pay-per-event. Both reviews and category products are billed flat: every review row is one outputrecord, every category-product row is one outputrecord. Cost = rows × outputrecord-price. No nested-array arithmetic.

| Event | When | Suggested rate |
| --- | --- | --- |
| outputrecord | Once per dataset row pushed: product, review, category-product, compare. The thread/category aggregate is nested INSIDE each flat row, not charged separately. | configured on the actor page |
| additional-data | Once per nested item inside a product or compare row (features, screenshots, alternatives, breadcrumbs, compared products). Reviews and category-products are NEVER billed here: each is its own paid outputrecord row. | $0.75 per 1,000 items ($0.00075 each) |

What counts as additional-data

Only the URL kinds that still keep nested arrays bill here. Reviews and category products never do.

| Row type | Items billed as additional-data |
| --- | --- |
| Product row | every feature + screenshot + review snippet + alternative + breadcrumb category |
| Compare row | every compared product |
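
A rough cost estimate can be sketched in Python as below. Several things here are assumptions rather than confirmed billing logic: the outputrecord rate defaults to the listing's "from $0.90 / 1,000 results" figure (the real rate is configured on the actor page), every breadcrumbs entry is counted as billable, and the array field names are taken from the sample rows in the Output schema section.

```python
# Nested arrays on a product row that are billed as additional-data,
# using the field names from the sample product row (an assumption).
_PRODUCT_ARRAYS = ("features", "screenshotUrls", "reviewSnippets",
                   "alternatives", "breadcrumbs")


def additional_items(row: dict) -> int:
    """Count nested items billed as additional-data for one dataset row."""
    if row.get("rowType") == "product":
        return sum(len(row.get(key) or []) for key in _PRODUCT_ARRAYS)
    if row.get("rowType") == "compare":
        return len(row.get("products") or [])
    return 0  # review and category-product rows never bill here


def estimate_cost(rows: list,
                  outputrecord_price: float = 0.90 / 1000,   # assumed rate
                  additional_data_price: float = 0.75 / 1000) -> float:
    """Cost in USD = rows x outputrecord price + nested items x item price."""
    nested = sum(additional_items(row) for row in rows)
    return len(rows) * outputrecord_price + nested * additional_data_price
```

Under these assumptions, a run that emits only review rows costs exactly the row count times the outputrecord rate, which matches the flat-billing claim above.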

What makes this richer than the competition

We surveyed all 16 Capterra actors on the Apify marketplace (May 2026). Most cover one URL surface only:

| Capability | Competitor actors | This actor |
| --- | --- | --- |
| Product + reviews + category + compare in one actor | ❌ (split across 4+ SKUs) | ✅ |
| URL auto-classification (paste any mix) | ❌ | ✅ |
| Pure HTTP, no headless browser | mixed (some use Playwright) | ✅ |
| No paid third-party CF-bypass service | mixed | ✅ |
| Cookie-warmed sticky session pattern | ❌ | ✅ |
| Per-product alternatives + screenshots in product row | ❌ (separate calls) | ✅ |
| Flat-per-review billing (1 review = 1 paid row, no nested-array math) | mixed | ✅ |
| Thread-level aggregate (rating distribution + sub-rating averages) on every review row | ❌ (none surface this at all) | ✅ |
| RSC-stream embedded review JSON extraction: 100% fill on reviewId, companySize, recommendationScore, incentivization, etc. | partial (some fields 0%) | ✅ |
| Capterra awards & badges (Best Value, Shortlist, Ease-of-Use) per product | ❌ | ✅ |
| Structured integrations per product (name + logo + Capterra URL + review count, not just feature strings) | ❌ | ✅ |
| Capterra-computed review sentiment breakdown (positive / neutral / negative %) | ❌ | ✅ |
| Per-sub-rating averages with reviewer counts (more precise than mean alone) | partial | ✅ |

The top marketplace actor (focused_vanguard/multi-platform-reviews-scraper, 220 users) covers multiple sources (Capterra + G2 + Trustpilot + Gartner + Software Advice + Reddit) but only emits review data. We do everything Capterra exposes in one actor (products, reviews, categories, comparisons) without crossing into other platforms.


Notes & limitations

  • Capterra routes by numeric product ID; the slug in the URL is decorative. /p/137005/Slack/ redirects to whatever product 137005 actually is (currently SmartPM).
  • Cloudflare gating. We rely on residential-proxy IP rotation + cookie reuse to slip past CF's WAF. With Apify residential pool (US country), expect ~95% per-URL success on a normal-cost run.
  • Reviews pagination is automatic: the actor follows ?page=N links from each page's own HTML (not constructed URLs, since /x/ placeholder slugs lose query-strings on redirect). It stops when no next-page link is found or maxReviewsPerProduct is hit. Each individual review is then emitted as its own flat dataset row with the thread aggregate nested in productAggregate.
  • Category flattening. Same pattern as reviews: every product listed in a category becomes its own dataset row with categoryAggregate nested. A category URL with 50 products fills 50 rows of the maxItems budget (capped by maxProductsPerCategory).
  • Compare extraction uses per-product fetch. Capterra's compare pages are SPAs, so we read product IDs from the URL and fetch each compared product's full page through the warm session. Result: rich compare rows with real ratings, vendor info, pricing, and deployments per product.
  • No mobile app. Capterra is web-only; no API to bypass the WAF via.

Support

Issues, feature requests, or custom output shapes? Open a ticket on the actor's Apify page or message the maintainer.