# G2 Software Reviews Scraper — Ratings, Pros/Cons & Confidence (`bovi/g2-scraper`) Actor

Scrape G2.com software reviews for any product. Each review includes the star rating, title, review body, pros, cons, reviewer role, company size, and date. The structural parser is keyed on schema.org review microdata, so it survives CSS churn. Every record carries a `parse_confidence` score for drift detection.

- **URL**: https://apify.com/bovi/g2-scraper.md
- **Developed by:** [Vitalii Bondarev](https://apify.com/bovi) (community)
- **Categories:** Lead generation, Marketing
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

from $3.87 / 1,000 review items

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price drops as your subscription plan tier rises.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## G2 Software Reviews Scraper

Scrape software reviews from [G2.com](https://www.g2.com) — the leading B2B software review platform.
Extract structured review data including ratings, pros/cons, reviewer role, and company size.

### What you get

| Field | Description |
|---|---|
| `product_name` | G2 product display name |
| `product_slug` | URL slug (e.g. `notion`, `hubspot-marketing-hub`) |
| `product_url` | URL of the product's G2 reviews page |
| `review_id` | Unique G2 review ID |
| `rating` | Star rating (1–5) |
| `title` | Review headline |
| `text` | Full review body |
| `reviewer_name` | Reviewer display name |
| `reviewer_role` | Job title (B2B-critical field for buyer intent analysis) |
| `company_size` | Company size bracket (e.g. `51-100`, `201-500`) |
| `pros` | Pros text (G2 signature structured field) |
| `cons` | Cons text (G2 signature structured field) |
| `review_date` | ISO 8601 publish date |
| `url` | Direct link to review on G2 |
| `parse_confidence` | Per-record parse quality (0.0–1.0) for drift detection |
| `warnings` | Machine-readable warning codes |

### Use cases

- **Competitive intelligence** — monitor competitor reviews; track sentiment changes over time
- **CRM enrichment** — identify leads by role/company size from verified buyer reviews
- **Product research** — extract structured pros/cons for feature analysis
- **Market research** — aggregate review sentiment across software categories
- **Sales enablement** — understand objections (cons) at scale

### Proxy requirement

> ⚠️ **Residential proxy required.** G2 blocks datacenter IPs with HTTP 403. Configure **Apify Residential proxy** in the input.
> Without a proxy, all runs will fail.

This is the same requirement as other review platforms (Trustpilot, Booking.com).
The **buyer pays proxy costs** as part of their Apify Actor usage budget.

### Input

```json
{
  "productSlugs": ["notion", "hubspot-marketing-hub"],
  "maxReviews": 100,
  "sort": "most_helpful",
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
````

#### Finding the product slug

The slug is the path segment that follows `/products/` in any G2 product URL:

- `https://www.g2.com/products/notion/reviews` → slug is `notion`
- `https://www.g2.com/products/hubspot-marketing-hub/reviews` → slug is `hubspot-marketing-hub`
- `https://www.g2.com/products/salesforce/reviews` → slug is `salesforce`
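
The mapping above can be sketched as a small helper. This is an illustration only, not the Actor's own code: the function name `slug_from_g2_url` is hypothetical, and it assumes the `/products/<slug>/...` URL shape shown in the examples.

```python
from urllib.parse import urlparse


def slug_from_g2_url(value: str) -> str:
    """Return the product slug from a G2 product URL.

    A value without a scheme is assumed to already be a slug and is
    returned with surrounding whitespace and slashes removed.
    """
    if "://" not in value:
        return value.strip().strip("/")
    # Split the path into non-empty segments, e.g. ["products", "notion", "reviews"].
    parts = [p for p in urlparse(value).path.split("/") if p]
    if len(parts) >= 2 and parts[0] == "products":
        return parts[1]
    raise ValueError(f"Not a G2 product URL: {value!r}")


print(slug_from_g2_url("https://www.g2.com/products/hubspot-marketing-hub/reviews"))
```

Since `productSlugs` also accepts full review URLs, a helper like this is mainly useful for deduplicating or logging inputs on your side.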

#### Sort options

| Value | Description |
|---|---|
| `most_helpful` | Highest-quality reviews (default, best for analysis) |
| `most_recent` | Newest reviews first (best for monitoring) |
| `highest_rated` | 5-star reviews first |
| `lowest_rated` | 1-star reviews first (surface pain points) |

### Output example

```json
{
  "product_name": "Notion",
  "product_slug": "notion",
  "product_url": "https://www.g2.com/products/notion/reviews",
  "review_id": "abc-review-123",
  "rating": 5,
  "title": "Best collaboration tool we have used",
  "text": "We switched from Confluence and never looked back...",
  "reviewer_name": "Sarah M.",
  "reviewer_role": "Product Manager",
  "company_size": "51-100",
  "pros": "Extremely flexible, great templates, excellent for documentation",
  "cons": "Can be slow with large databases, search could be better",
  "review_date": "2024-03-15T10:00:00.000Z",
  "url": "https://www.g2.com/reviews/abc-review-123",
  "parse_confidence": 1.0,
  "warnings": []
}
```
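
Records of this shape are straightforward to post-process. As a rough sketch of the sales-enablement use case, the snippet below groups `cons` text from low-rated reviews by `company_size`. The function name, the rating cutoff of 3, and the sample records are all assumptions made for illustration.

```python
from collections import defaultdict


def cons_by_company_size(records, max_rating=3):
    """Group non-empty `cons` text by `company_size` for reviews rated <= max_rating."""
    grouped = defaultdict(list)
    for rec in records:
        rating = rec.get("rating")
        cons = (rec.get("cons") or "").strip()
        if rating is not None and rating <= max_rating and cons:
            # Records without a company size are collected under "unknown".
            grouped[rec.get("company_size") or "unknown"].append(cons)
    return dict(grouped)


# Hypothetical records that follow the output example's field names.
sample = [
    {"rating": 2, "cons": "Slow search", "company_size": "51-100"},
    {"rating": 5, "cons": "Minor quirks", "company_size": "51-100"},
    {"rating": 3, "cons": "Steep pricing", "company_size": None},
]
print(cons_by_company_size(sample))
```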

### Parse confidence

Every record includes `parse_confidence` (0.0–1.0). This is our reliability edge:

- **1.0** — all core fields extracted successfully
- **0.7–0.99** — some optional fields missing (normal)
- **< 0.5** — structural issue (G2 schema change) — check warnings

Keep only records with `parse_confidence >= 0.8` when you need high-quality data, and watch for records below that threshold to detect whether G2 has changed its page structure.
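
A minimal sketch of both uses follows, assuming the dataset items are already loaded as a list of dicts. The helper names are hypothetical, and treating a missing `parse_confidence` as 0.0 is an assumption of this sketch rather than documented Actor behavior.

```python
def split_by_confidence(records, threshold=0.8):
    """Split records into (trusted, suspect) lists around a confidence threshold."""
    trusted, suspect = [], []
    for rec in records:
        # Assumption: a record without the field is treated as lowest confidence.
        conf = rec.get("parse_confidence", 0.0)
        (trusted if conf >= threshold else suspect).append(rec)
    return trusted, suspect


def structural_issue_share(records, cutoff=0.5):
    """Return the fraction of records whose confidence is below the cutoff."""
    if not records:
        return 0.0
    low = sum(1 for rec in records if rec.get("parse_confidence", 0.0) < cutoff)
    return low / len(records)


items = [
    {"review_id": "a", "parse_confidence": 1.0},
    {"review_id": "b", "parse_confidence": 0.85},
    {"review_id": "c", "parse_confidence": 0.4},
]
trusted, suspect = split_by_confidence(items)
print(len(trusted), len(suspect), structural_issue_share(items))
```

A rising share of records below 0.5 across runs is the signal to inspect the `warnings` field.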

### Pricing

Pay-per-event (PPE): **$2.00 per 1,000 reviews**.

G2 shows 20 reviews per page, so a 100-review run needs 5 page fetches.
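
The arithmetic can be sketched as follows. The function names are hypothetical, the price is passed in as a parameter so you can use the rate listed for your plan, and the cost estimate assumes the per-review event is the only charge (proxy traffic is not included).

```python
import math

REVIEWS_PER_PAGE = 20  # page size stated in this README


def page_fetches(review_count: int) -> int:
    """Number of page requests needed to collect review_count reviews."""
    return math.ceil(review_count / REVIEWS_PER_PAGE)


def estimated_cost_usd(review_count: int, price_per_1000: float) -> float:
    """Review charge in USD at a given price per 1,000 reviews."""
    return review_count / 1000 * price_per_1000


print(page_fetches(100), page_fetches(10_000))
print(round(estimated_cost_usd(100, 2.00), 2))
```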

### Technical notes

- **Parser approach:** Extracts `__NEXT_DATA__` JSON embedded in G2's Next.js HTML — not fragile CSS class scraping
- **Pagination:** Uses G2's `/_next/data/` JSON API for pages 2+ (no HTML parsing overhead)
- **Rate limiting:** 3-second delay between pages; automatic retry on transient blocks
- **Schema stability:** JSON key paths are more stable than CSS class names across G2 UI updates
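
To illustrate the parser approach, here is a minimal sketch of pulling the `__NEXT_DATA__` JSON out of a Next.js page with the standard library. It is not the Actor's implementation: the regular expression, the function name, and in particular the `props.pageProps.reviews` key path in the sample HTML are assumptions, since the real key paths on G2 are not documented here.

```python
import json
import re

# Next.js embeds page data in a <script id="__NEXT_DATA__"> tag.
NEXT_DATA_RE = re.compile(
    r'<script[^>]*\bid="__NEXT_DATA__"[^>]*>(.*?)</script>',
    re.DOTALL,
)


def extract_next_data(html: str) -> dict:
    """Return the parsed __NEXT_DATA__ payload, or raise if the tag is missing."""
    match = NEXT_DATA_RE.search(html)
    if match is None:
        raise ValueError("no __NEXT_DATA__ script tag found")
    return json.loads(match.group(1))


# Hypothetical page fragment; the key path below is invented for the example.
sample_html = (
    '<html><body><script id="__NEXT_DATA__" type="application/json">'
    '{"props": {"pageProps": {"reviews": [{"rating": 5}]}}}'
    "</script></body></html>"
)
data = extract_next_data(sample_html)
print(data["props"]["pageProps"]["reviews"][0]["rating"])
```

Reading values by JSON key keeps the extraction independent of CSS class names, which is the stability argument made in the notes above.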

### Limitations

1. **Residential proxy required** — datacenter IPs are blocked
2. **20 reviews per page** — 10,000 reviews = 500 proxy requests
3. **Not affiliated with G2** — this actor scrapes public review data from G2.com

### Integrations

Built for B2B marketers and product teams mining competitor reviews, ratings, and buyer-persona signals on G2 — the JSON/dataset output drops into the tools you already run, no glue code:

- **n8n / Make / Zapier** — trigger a run or pipe every new dataset item into 500+ apps (Google Sheets, Airtable, Slack, HubSpot, your database) with no code: [n8n](https://docs.apify.com/platform/integrations/n8n), [Make](https://docs.apify.com/platform/integrations/make), [Zapier](https://docs.apify.com/platform/integrations/zapier).
- **Webhooks** — fire your own endpoint the moment a run finishes, to push results straight into your pipeline ([docs](https://docs.apify.com/platform/integrations/webhooks)).
- **MCP server** — expose this actor as a tool to Claude, Cursor, or any [MCP client](https://mcp.apify.com) so an AI agent can pull this data mid-conversation ([guide](https://blog.apify.com/how-to-use-mcp/)).
- **API & SDKs** — fetch the dataset as JSON, CSV, or Excel through the Apify REST API or the Python / JS SDKs.

See all [Apify integrations](https://apify.com/integrations).

### Disclaimer

This Actor scrapes publicly available data from G2.com. Use in compliance with G2's Terms of Service and applicable data protection laws.

# Actor input Schema

## `productSlugs` (type: `array`):

List of G2 product slugs (e.g. 'notion', 'hubspot-marketing-hub') or full G2 review URLs (https://www.g2.com/products/notion/reviews). Find the slug in the G2 URL for any product.

## `maxReviews` (type: `integer`):

Maximum number of reviews to scrape per product. Set to 0 to scrape all reviews (may take several minutes for products with thousands of reviews).

## `proxyConfiguration` (type: `object`):

REQUIRED: G2 needs a residential proxy for reliable access — datacenter and home IPs are rate-limited. Configure Apify Residential proxy group — runs will fail without a residential proxy.

## `sort` (type: `string`):

How to sort reviews. 'most\_helpful' returns highest-quality reviews first (recommended for analysis).

## Actor input object example

```json
{
  "productSlugs": [
    "notion",
    "hubspot-marketing-hub",
    "salesforce"
  ],
  "maxReviews": 500,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  },
  "sort": "most_helpful"
}
```
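
The constraints published in this page's OpenAPI specification (required `productSlugs`, `maxReviews` with a minimum of 0 and a default of 100, four allowed `sort` values) can be checked on the client before starting a run. The sketch below is one way to do that. The `build_input` helper is hypothetical, and rejecting an empty slug list is an assumption, since the schema does not state a minimum length.

```python
ALLOWED_SORTS = {"most_helpful", "most_recent", "highest_rated", "lowest_rated"}


def build_input(product_slugs, max_reviews=100, sort="most_helpful"):
    """Build an Actor input dict, validating it against the published schema."""
    if not product_slugs:
        # Assumption: an empty list is treated as invalid.
        raise ValueError("productSlugs must contain at least one slug or URL")
    if max_reviews < 0:
        raise ValueError("maxReviews must be 0 (all reviews) or a positive integer")
    if sort not in ALLOWED_SORTS:
        raise ValueError(f"sort must be one of {sorted(ALLOWED_SORTS)}")
    return {
        "productSlugs": list(product_slugs),
        "maxReviews": max_reviews,
        "sort": sort,
        "proxyConfiguration": {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
        },
    }


print(build_input(["notion", "salesforce"], max_reviews=500))
```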

# Actor output Schema

## `results` (type: `string`):

Dataset containing G2 Scraper records (product\_name, rating, title, reviewer\_name, reviewer\_role, company\_size, review\_date, url, product\_url, parse\_confidence).

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "productSlugs": [
        "notion"
    ],
    "maxReviews": 100,
    "proxyConfiguration": {
        "useApifyProxy": true,
        "apifyProxyGroups": [
            "RESIDENTIAL"
        ]
    },
    "sort": "most_helpful"
};

// Run the Actor and wait for it to finish
const run = await client.actor("bovi/g2-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "productSlugs": ["notion"],
    "maxReviews": 100,
    "proxyConfiguration": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"],
    },
    "sort": "most_helpful",
}

# Run the Actor and wait for it to finish
run = client.actor("bovi/g2-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "productSlugs": [
    "notion"
  ],
  "maxReviews": 100,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  },
  "sort": "most_helpful"
}' |
apify call bovi/g2-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=bovi/g2-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "G2 Software Reviews Scraper — Ratings, Pros/Cons & Confidence",
        "description": "Scrape G2.com software reviews for any product. Full fields per review: star rating, title, review body, pros, cons, reviewer role, company size and date. Resilient structural parser keyed on schema.org review microdata — survives CSS churn. parse_confidence per record for drift detection.",
        "version": "0.1",
        "x-build-id": "2ol1THAP2uHhy657d"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/bovi~g2-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-bovi-g2-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/bovi~g2-scraper/runs": {
            "post": {
                "operationId": "runs-sync-bovi-g2-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/bovi~g2-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-bovi-g2-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "productSlugs"
                ],
                "properties": {
                    "productSlugs": {
                        "title": "Product slugs or G2 review URLs",
                        "type": "array",
                        "description": "List of G2 product slugs (e.g. 'notion', 'hubspot-marketing-hub') or full G2 review URLs (https://www.g2.com/products/notion/reviews). Find the slug in the G2 URL for any product.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxReviews": {
                        "title": "Max reviews per product",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Maximum number of reviews to scrape per product. Set to 0 to scrape all reviews (may take several minutes for products with thousands of reviews).",
                        "default": 100
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "REQUIRED: G2 needs a residential proxy for reliable access — datacenter and home IPs are rate-limited. Configure Apify Residential proxy group — runs will fail without a residential proxy."
                    },
                    "sort": {
                        "title": "Sort reviews by",
                        "enum": [
                            "most_helpful",
                            "most_recent",
                            "highest_rated",
                            "lowest_rated"
                        ],
                        "type": "string",
                        "description": "How to sort reviews. 'most_helpful' returns highest-quality reviews first (recommended for analysis).",
                        "default": "most_helpful"
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
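
For projects that do not use a client library, the `run-sync-get-dataset-items` path from the specification above can be called directly. The following sketch only builds the request with the standard library, so it runs without network access or a token. The helper name `build_sync_request` is hypothetical.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/bovi~g2-scraper/run-sync-get-dataset-items"


def build_sync_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build a POST request that runs the Actor and returns its dataset items."""
    # The specification defines `token` as a required query parameter.
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{BASE_URL}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_sync_request(
    "<YOUR_API_TOKEN>",
    {
        "productSlugs": ["notion"],
        "maxReviews": 100,
        "proxyConfiguration": {"useApifyProxy": True, "apifyProxyGroups": ["RESIDENTIAL"]},
    },
)
print(request.get_method(), request.full_url)
# Passing `request` to urllib.request.urlopen would perform the call.
```

Because the synchronous endpoint waits for the run to finish, a long scrape may be better served by the asynchronous `/runs` path followed by a dataset fetch.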
