G2 & Capterra Review Intelligence API | B2B Review Analytics
Pricing
from $10.00 / 1,000 results
Collect and normalize G2 and Capterra software reviews into one schema with product metadata, ratings, pros/cons, review samples, and structural change warnings.
Developer
太郎 山田
Monitor competitor software and track brand reputation by extracting structured data from G2 and Capterra review pages. This scraper normalizes user feedback from two major B2B review platforms into a single dataset. Given your target URLs, it identifies the source website and extracts details such as overall product ratings, individual review text, and reviewer backgrounds.
Data teams and market researchers use this API to schedule recurring extraction jobs that feed fresh customer sentiment into their data pipelines. If a request is blocked or the target page changes its layout, the actor emits an explicit structural change warning rather than silently returning bad data. When full review text is unavailable, it degrades gracefully and returns the metadata it can still extract, marked with a clear status, so automated workflows can keep running.
Use the scraped results to feed machine learning models, populate internal business intelligence tools, or compile competitor analysis reports. The unified schema keeps output fields consistent whether the data originates from G2 or Capterra, so you can extract B2B intelligence at scale without maintaining separate scraping scripts for each site.
Store Quickstart
- Start with 1–5 review URLs and a modest `reviewLimit` (25–50).
- Mix G2 and Capterra in one run only after you validate the schema on a small sample.
- Use `dryRun: true` for validation-only checks before larger monitoring runs.
Features
- Dual-source support: Accepts both G2 (`g2.com/products/*/reviews`) and Capterra (`capterra.com/p/*/reviews/`) URLs
- Automatic source detection: Classifies each URL and routes to the correct parser
- Normalized output: Unified review schema across both platforms — product metadata, ratings, individual reviews with pros/cons, rating breakdowns
- Structural change warnings: Emits explicit warnings when page shapes change or access is blocked, rather than silently returning bad data
- Graceful degradation: Returns partial results (metadata without reviews) when possible, with clear status indicators
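As a rough illustration of the automatic source detection described above, the sketch below classifies a URL by hostname and path. The function name `detect_source` and the exact regular expressions are assumptions made for this example; only the two URL shapes come from the feature list, and the actor's real matching rules may differ.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Path shapes follow the URL patterns listed above. How strictly the actor
# matches them is not documented, so these regexes are an assumption.
_G2_PATH = re.compile(r"^/products/[^/]+/reviews/?$")
_CAPTERRA_PATH = re.compile(r"^/p/.+/reviews/?$")


def detect_source(url: str) -> Optional[str]:
    """Return "g2", "capterra", or None for an unsupported URL."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host == "g2.com" and _G2_PATH.match(parsed.path):
        return "g2"
    if host == "capterra.com" and _CAPTERRA_PATH.match(parsed.path):
        return "capterra"
    return None
```

Running a check like this locally before submitting a batch is a cheap way to catch URLs that point at category or pricing pages instead of review pages.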
Use Cases
| Who | Why |
|---|---|
| B2B product marketers | Benchmark category leaders and review themes across two major software-review sites |
| RevOps / sales enablement | Track competitor proof points, objections, and complaint patterns |
| Analysts | Build one normalized review dataset across G2 and Capterra |
| Agencies | Monitor multiple software products with a reusable schema |
Input
| Field | Type | Default | Description |
|---|---|---|---|
| `reviewPageUrls` | string[] | (required) | G2 or Capterra review page URLs |
| `reviewLimit` | integer | 200 | Max reviews to collect per product (1–5000) |
| `delivery` | string | "dataset" | "dataset" or "webhook" |
| `webhookUrl` | string | | Webhook URL for delivery mode |
| `dryRun` | boolean | false | Validate input without fetching |
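The constraints in the table can be checked client-side before a run is started. The following is a minimal sketch, not the actor's own validator: `validate_input` is a hypothetical helper, and the rule that `webhookUrl` is required when `delivery` is "webhook" is an assumption inferred from the field description.

```python
from typing import Any, Dict, List


def validate_input(actor_input: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the input looks valid."""
    problems: List[str] = []

    urls = actor_input.get("reviewPageUrls")
    if not isinstance(urls, list) or not urls:
        problems.append("reviewPageUrls must be a non-empty list of URLs")
    elif not all(isinstance(u, str) and u.startswith("http") for u in urls):
        problems.append("every reviewPageUrls entry must be an http(s) URL string")

    limit = actor_input.get("reviewLimit", 200)  # documented default
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(limit, bool) or not isinstance(limit, int) or not 1 <= limit <= 5000:
        problems.append("reviewLimit must be an integer between 1 and 5000")

    delivery = actor_input.get("delivery", "dataset")  # documented default
    if delivery not in ("dataset", "webhook"):
        problems.append('delivery must be "dataset" or "webhook"')
    elif delivery == "webhook" and not actor_input.get("webhookUrl"):
        # Assumption: webhookUrl is mandatory in webhook mode.
        problems.append("webhookUrl is required when delivery is webhook")

    if not isinstance(actor_input.get("dryRun", False), bool):
        problems.append("dryRun must be a boolean")

    return problems
```

An input such as `{"reviewPageUrls": [...], "reviewLimit": 25, "dryRun": True}` passes with an empty list, which matches the small validation-first runs recommended in the quickstart.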
Output
Each product in the `products` array contains:
- `source`: "g2" or "capterra"
- `productName`, `overallRating`, `totalReviewCount`, `ratingBreakdown`
- `vendorName`, `categoryName` (when available)
- `reviews[]`: normalized reviews with `title`, `rating`, `date`, `author`, `body`, `pros`, `cons`, `verified`
- `status`: "ok", "partial", "blocked", or "error"
- `fetchError`: error details if fetch failed

The `meta` section reports `implementationStatus`: "live", "partial", "degraded", or "no_valid_sources".
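Because each product carries its own `status`, a downstream job can triage results before loading them. The sketch below assumes the `products` array has already been read into a list of dictionaries; the helper names `group_by_status` and `usable_reviews` are hypothetical, and treating a missing `status` as "error" is a defensive choice made for this example, not documented actor behavior.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_by_status(products: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Map each status value to the products (by name or source URL) that reported it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for product in products:
        # A missing status is treated as "error" so it is never mistaken for a clean result.
        status = product.get("status", "error")
        label = product.get("productName") or product.get("sourceUrl") or "<unknown>"
        groups[status].append(label)
    return dict(groups)


def usable_reviews(products: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Collect reviews from products whose status is "ok" or "partial"."""
    reviews: List[Dict[str, Any]] = []
    for product in products:
        if product.get("status") in ("ok", "partial"):
            for review in product.get("reviews") or []:
                # Tag each review with its platform so merged datasets stay traceable.
                reviews.append({**review, "source": product.get("source")})
    return reviews
```

Keeping "blocked" and "error" products out of the review list, while still reporting them by name, makes it easier to alert on access problems without polluting sentiment data.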
Output Example

    {
      "source": "g2",
      "status": "ok",
      "sourceUrl": "https://www.g2.com/products/notion/reviews",
      "productName": "Notion",
      "vendorName": "Notion Labs",
      "overallRating": 4.5,
      "totalReviewCount": 4321,
      "ratingBreakdown": { "5": 1200, "4": 800, "3": 300, "2": 50, "1": 10 },
      "reviews": [
        {
          "title": "Great tool",
          "rating": 5,
          "author": "Alice",
          "pros": "Fast setup",
          "cons": "Needs offline mode",
          "verified": true
        }
      ],
      "warnings": []
    }
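The `ratingBreakdown` object maps star values (as string keys) to counts, so a weighted mean can be computed from it directly. The helper below is a hypothetical sketch. Note that in the sample above the breakdown counts sum to 2,360 while `totalReviewCount` is 4321 and `overallRating` is 4.5, so it seems safer not to assume those fields are derived from the breakdown.

```python
from typing import Dict


def breakdown_summary(rating_breakdown: Dict[str, int]) -> Dict[str, float]:
    """Return the number of ratings in the breakdown and their weighted mean."""
    count = sum(rating_breakdown.values())
    if count == 0:
        return {"count": 0, "mean": 0.0}
    # Keys are star values stored as strings, e.g. "5" -> 1200 ratings.
    weighted = sum(int(stars) * n for stars, n in rating_breakdown.items())
    return {"count": count, "mean": round(weighted / count, 2)}


example = {"5": 1200, "4": 800, "3": 300, "2": 50, "1": 10}
print(breakdown_summary(example))  # {'count': 2360, 'mean': 4.33}
```

Comparing this computed mean with `overallRating` over time is one way to notice when a platform changes how it reports or weights ratings.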
Parsing Strategy
- JSON-LD (structured data in `<script type="application/ld+json">`): most stable
- Embedded app data (e.g., `__NEXT_DATA__` for Capterra): second layer
- Meta tags (`og:title`): product name fallback
- HTML pattern matching: last resort for reviews
This layered approach means the actor degrades gracefully as sites change markup.
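The fallback order above can be sketched for a single field, the product name. This is an illustrative simplification rather than the actor's implementation: it covers only the JSON-LD and `og:title` layers, uses regular expressions that expect double-quoted attributes with `property` before `content`, and all function names are hypothetical.

```python
import json
import re
from typing import Any, Dict, Optional

_JSON_LD = re.compile(
    r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.IGNORECASE | re.DOTALL,
)
_OG_TITLE = re.compile(
    r'<meta[^>]*property="og:title"[^>]*content="([^"]*)"',
    re.IGNORECASE,
)


def product_name_from_json_ld(html: str) -> Optional[str]:
    """First layer: look for a top-level "name" field in any JSON-LD block."""
    for raw in _JSON_LD.findall(html):
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed block: try the next one
        if isinstance(data, dict) and isinstance(data.get("name"), str):
            return data["name"]
    return None


def product_name_from_og_title(html: str) -> Optional[str]:
    """Fallback layer: read the og:title meta tag."""
    match = _OG_TITLE.search(html)
    return match.group(1) if match else None


def extract_product_name(html: str) -> Dict[str, Any]:
    """Try each layer in order and record which one produced the value."""
    layers = [
        ("json-ld", product_name_from_json_ld),
        ("og:title", product_name_from_og_title),
    ]
    for layer_name, extractor in layers:
        name = extractor(html)
        if name:
            return {"productName": name, "layer": layer_name, "warnings": []}
    # No layer matched: surface a warning instead of returning a guess.
    return {
        "productName": None,
        "layer": None,
        "warnings": ["no product name found in JSON-LD or og:title"],
    }
```

Recording which layer produced a value is a design choice worth copying in any similar pipeline: a sudden shift from the first layer to a fallback is an early signal that the page structure has changed.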
Local Run

    # Edit input.json with your URLs, then:
    npm start

    # Dry run (validation only, no network):
    # Set "dryRun": true in input.json
    npm start
Tests

    npm test
Limitations
- Reviews are extracted from server-rendered HTML; sites that render reviews only via client-side JavaScript may return partial results
- G2 and Capterra actively protect against automated access — blocked requests are reported honestly
- HTML parsing is inherently fragile; the actor emits warnings when expected patterns are missing
Related Actors
Pair this actor with other flagship intelligence APIs in the same portfolio:
- Trustpilot Review Intelligence API — add broader company reputation and reply-rate signals.
- Google Maps Review Intelligence API — compare software-brand review data with location sentiment when relevant.
- Shopify Store Intelligence API — enrich vendor or competitor research with public storefront and catalog signals.
- YouTube Channel Analytics API — combine review themes with public content cadence and audience-facing proof.
Pricing & Cost Control
Apify Store pricing is usage-based, so total cost mainly follows how many reviewPageUrls you process and how deep you sample reviews. Check the Store pricing card for the current per-event rates.
- Keep `reviewPageUrls` batches small while validating source mix.
- Lower `reviewLimit` for faster exploratory runs.
- Use dataset delivery first to inspect `blocked` or `partial` results.
- Use `dryRun: true` before scaling to scheduled monitoring.
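For budgeting, a rough upper bound can be worked out from the batch size and `reviewLimit`. The sketch below assumes that each collected review counts as one billable result and uses the listed starting price of $10.00 per 1,000 results as a default; both are assumptions, so confirm what counts as a result and the current rates on the Store pricing card. The function name is hypothetical.

```python
def estimate_max_cost_usd(
    url_count: int,
    review_limit: int,
    usd_per_1000_results: float = 10.0,
) -> float:
    """Upper-bound cost if every URL returns review_limit billable results."""
    if url_count < 0 or review_limit < 0:
        raise ValueError("counts must be non-negative")
    # Assumption: one collected review equals one billable result.
    max_results = url_count * review_limit
    return round(max_results * usd_per_1000_results / 1000, 2)


print(estimate_max_cost_usd(5, 50))    # 2.5
print(estimate_max_cost_usd(20, 200))  # 40.0
```

Under these assumptions, a 5-URL validation run at `reviewLimit` 50 is capped at about $2.50, while 20 URLs at the default limit of 200 could reach about $40.00.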
⭐ Was this helpful?
If this actor saved you time, please leave a ★ rating on Apify Store. It takes 10 seconds, helps other developers discover it, and keeps updates free.
Bug report or feature request? Open an issue on the Issues tab of this actor.