# Trustpilot Review Scraper - AI Reputation Monitor (`harvestlab/trustpilot-scraper`) Actor

Trustpilot review scraper for reputation monitoring: reviews, ratings, owner replies, AI sentiment/topics, rating-drop tracking, probe-mode block diagnostics, and user-controlled bypass hooks. Experimental anti-bot tier. x402-ready PPE at $0.003/review.

- **URL**: https://apify.com/harvestlab/trustpilot-scraper.md
- **Developed by:** [Nick](https://apify.com/harvestlab) (community)
- **Categories:** Marketing, Business, MCP servers
- **Stats:** 1 total users, 1 monthly users, 0.0% runs succeeded, NaN bookmarks
- **User rating**: No ratings yet

## Pricing

from $3.00 / 1,000 reviews scraped

This Actor is paid per event. You are not charged for the Apify platform usage, but only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Trustpilot Review Scraper - AI Reputation Monitor

> **Watchlist status - May 2026:** Trustpilot may block automated access. The actor includes probe mode and clear `block_detected: true` diagnostics so production workflows can detect block state before relying on a run.

**$0.003/review replaces $299/mo Birdeye.** Scrape Trustpilot reviews with AI sentiment analysis, rating trend tracking, and Apify webhook-friendly output. Built for brand managers, marketing agencies, investors, and product teams who need structured reputation data at scale -- with no monthly subscription.

### Key Features

- **AI sentiment analysis and topic tags** -- Powered by your choice of 5 LLM providers (OpenRouter, Anthropic, Google AI, OpenAI, Ollama). Extracts sentiment scores, praise themes, complaint themes, differentiators, and actionable recommendations from any batch of reviews.
- **Rating-drop workflows via Apify webhooks** -- Use Apify run-finished webhooks, n8n, Make, or Zapier to push negative review alerts into Slack, create HubSpot tickets, or trigger downstream workflows when a run completes.
- **Commercial anti-bot bypass hook** -- Route fetches through ScrapingBee or Bright Data Web Unlocker when Trustpilot's Cloudflare Enterprise wall blocks direct requests. Bring your own API key; the actor charges $0.015 per provider call.
- **Probe mode** -- Run a single $0.0005 health-check fetch before launching an expensive batch. Records the result to the 7-day rolling telemetry store so your pipeline can skip a batch when the block rate has spiked.
- **7-day rolling success telemetry** -- Every run records its block/success outcome to a named Apify KV store. The rolling rate is surfaced at run start so you see the real odds before spending on a batch.
- **5 LLM providers** -- OpenRouter (300+ models, recommended), Anthropic (Claude), Google AI (Gemini), OpenAI (GPT), or Ollama (self-hosted, zero API cost).
- **Domain auto-lookup** -- Enter any company domain (e.g., "revolut.com") and the actor builds the Trustpilot URL automatically. No need to look up profiles manually.
- **Full review extraction** -- Scrapes reviewer name, star rating, title, full review text, date, `review_age_days`, and `has_company_response` with automatic pagination through all available review pages.
- **Business profile data** -- Extracts TrustScore, overall rating, total review count, and business categories from each company profile.
- **Multi-language AI output** -- Generate analysis in English, Dutch, German, French, Spanish, Italian, or Portuguese.
- **Structured JSON output** -- Clean, machine-readable data ready for dashboards, CRM imports, spreadsheets, and automated workflows.

### Competitor Comparison

| Tool | Price | AI Analysis | Contract |
|------|-------|-------------|----------|
| **harvestlab/trustpilot-scraper (this)** | **$0.003/review** | **Yes -- 5 LLM providers** | **No subscription** |
| Birdeye | $299/mo | Limited | Annual contract |
| Podium | $289/mo | No | Monthly |
| Reputology | ~$79/mo | No | Monthly |
| Grade.us | $110/mo | No | Monthly |
| ReviewTrackers | $49/mo | No | Monthly |
| kaix/trustpilot-reviews-scraper | $0.00004/review | No | Pay-per-event |
| automation-lab/trustpilot | $0.0002/review (tiered) | No | Pay-per-event |

Our $0.003/review is higher than raw-data competitors but bundles multi-provider AI reputation analysis that none of them ship. If you only need raw data, kaix or automation-lab are cheaper per review. If you need structured sentiment and competitive intelligence in one run instead of a separate scrape-then-LLM pipeline, this actor delivers that.

### Use Cases

#### Reputation Monitoring for SMBs

Schedule weekly runs to track TrustScore, sentiment distribution, and complaint themes over time. Detect emerging negative trends before they escalate into a public PR problem. Set `maxReviews: 50` to keep weekly monitoring costs at roughly $0.20/run per business with AI analysis ($0.15 without).

#### Competitive Analysis

Scrape multiple competitor domains in a single run. Compare sentiment scores, top complaint themes, and differentiators across your category. Identify the service gaps your competitors consistently fail to close.

#### NPS Tracking and Trend Analysis

Use "deep" analysis depth to get temporal trend analysis showing whether recent reviews are improving or declining. Correlate sentiment trends with product launches, pricing changes, or support team changes.

#### Alert on 1-Star Reviews via n8n

Use an Apify run-finished webhook or n8n Apify trigger. Every time the actor finishes, fetch the dataset, filter for low-rated reviews, and route them to a Slack `#reputation-alerts` channel or create a customer support ticket automatically.

#### Scheduled Weekly Brand Health Dashboard

Combine with Apify Schedules (cron: `0 9 * * 1`) for a weekly Monday-morning reputation digest across 5-10 competitor brands. Each run produces a timestamped dataset entry you can pipe into a Google Sheets dashboard.

#### Investor Due Diligence

Assess company reputation before investment decisions. Trustpilot reviews provide unfiltered customer sentiment that complements financial metrics. "Deep" analysis mode extracts complaint themes, persona breakdowns, and competitive positioning for board-level reporting.

#### AI Training Data

Export large batches of labeled reviews (star rating + text) as training data for fine-tuning sentiment classifiers or review quality models. `enableAiAnalysis: false` keeps costs at pure per-review scraping.
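
As a rough illustration of that export step, the sketch below flattens dataset items into `(text, rating)` pairs. It assumes items follow the Output Schema shown later in this README; the helper name `to_training_pairs` and the sample data are made up for the example and are not part of the actor.

```python
from typing import Any


def to_training_pairs(items: list[dict[str, Any]]) -> list[tuple[str, int]]:
    """Flatten dataset items into (review text, star rating) pairs."""
    pairs = []
    for item in items:
        if item.get("block_detected"):
            continue  # blocked sentinel rows carry no reviews
        for review in item.get("reviews", []):
            text = (review.get("text") or "").strip()
            rating = review.get("rating")
            # Rating-only reviews have no text to learn from, so skip them.
            if text and rating is not None:
                pairs.append((text, int(rating)))
    return pairs


# Made-up sample items in the documented output shape.
sample_items = [
    {
        "reviews": [
            {"rating": 5, "text": "Switched from my traditional bank and never looked back..."},
            {"rating": 1, "text": ""},
        ]
    },
    {"block_detected": True},
]
print(to_training_pairs(sample_items))
```

Dropping empty-text reviews is a choice made for this sketch; keep them if your model should also learn from rating-only rows.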

#### Customer Success Early Warning

Monitor enterprise customer domains on Trustpilot. When a key account's rating drops, surface the complaint themes to your CSM team before a churn event. The rating-drop workflow pattern is documented in the Scheduling and Webhooks section below.

### Cost and Performance

This actor uses **pay-per-event (PPE)** pricing. Review and AI-analysis events are charged only for successful results; bypass and probe events are charged per call, as listed below.

| Event | Price | Description |
|-------|-------|-------------|
| `review-scraped` | $0.003 | Per review extracted (not charged when anti-bot blocks the URL) |
| `ai-analysis-completed` | $0.05 | Per business analyzed with AI (only fires when reviews were scraped) |
| `bypass-request` | $0.015 | Per commercial bypass provider call -- charged regardless of success to reflect upstream cost |
| `probe-completed` | $0.0005 | Per probe-mode health-check fetch |

#### Cost Examples

| Scenario | Reviews | AI analyses | Estimated Cost |
|----------|---------|----|----------------|
| Quick reputation check | 50 | 1 | ~$0.20 |
| Standard analysis (100 reviews) | 100 | 1 | ~$0.35 |
| Competitor comparison (3 companies) | 300 | 3 | ~$1.05 |
| Raw data export (no AI) | 500 | 0 | ~$1.50 |
| Weekly brand health (5 competitors x 50) | 250 | 5 | ~$1.00 |
| Blocked run (anti-bot won) | 0 | 0 | $0.00 |

Add your LLM provider costs (typically $0.001-0.01 per analysis with OpenRouter/Gemini Flash); under pay-per-event pricing, Apify platform usage is not billed separately. **Blocked runs incur no review or AI charges**; bypass and probe events, if used, are still charged per call.
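
The arithmetic behind the cost examples can be reproduced with a few lines of Python. This is only a sketch built from the event prices in the table above; `estimate_cost` is a hypothetical helper, and LLM provider fees are not included.

```python
# Event prices copied from the pricing table (USD per event).
PRICES = {
    "review-scraped": 0.003,
    "ai-analysis-completed": 0.05,
    "bypass-request": 0.015,
    "probe-completed": 0.0005,
}


def estimate_cost(reviews=0, ai_analyses=0, bypass_calls=0, probes=0):
    """Estimate the pay-per-event charge for a run, excluding LLM provider fees."""
    total = (
        reviews * PRICES["review-scraped"]
        + ai_analyses * PRICES["ai-analysis-completed"]
        + bypass_calls * PRICES["bypass-request"]
        + probes * PRICES["probe-completed"]
    )
    return round(total, 4)


print(estimate_cost(reviews=50, ai_analyses=1))   # quick reputation check
print(estimate_cost(reviews=300, ai_analyses=3))  # competitor comparison
print(estimate_cost(reviews=500))                 # raw data export
```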

### Input Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `businessUrls` | array | -- | One or more Trustpilot review page URLs (e.g., `https://www.trustpilot.com/review/amazon.com`). |
| `companyDomain` | string | -- | Alternative to full URL. Enter "amazon.com" and the actor builds the Trustpilot URL automatically. |
| `maxReviews` | integer | 50 | Maximum reviews per business (1-500). More reviews improve AI analysis accuracy. |
| `language` | select | English | Language for AI analysis output. Supported: English, Dutch, German, French, Spanish, Italian, Portuguese. |
| `analysisDepth` | select | standard | `quick` (summary + top 3 themes), `standard` (full breakdown + recommendations), or `deep` (trend analysis, personas, keywords). |
| `enableAiAnalysis` | boolean | false | Enable AI-powered review analysis. Requires an LLM API key. |
| `llmProvider` | select | openrouter | AI provider: openrouter (recommended), anthropic, google, openai, or ollama. |
| `llmModel` | string | -- | Override the default model. Leave empty for provider default. |
| `openrouterApiKey` | string | -- | OpenRouter API key. Get one at openrouter.ai/keys. |
| `anthropicApiKey` | string | -- | Anthropic API key. Get one at console.anthropic.com. |
| `googleApiKey` | string | -- | Google AI API key. Get one at aistudio.google.com/app/apikey. |
| `openaiApiKey` | string | -- | OpenAI API key. Get one at platform.openai.com/api-keys. |
| `ollamaBaseUrl` | string | http://localhost:11434 | Base URL for self-hosted Ollama instance. |
| `proxyConfiguration` | object | RESIDENTIAL | Proxy settings. Residential proxy strongly recommended for Trustpilot. |
| `mode` | select | scrape | `scrape` (full run) or `probe` (single health-check fetch, $0.0005). |
| `useProxyBypass` | select | off | Anti-bot bypass provider: `off`, `auto`, `scrapingbee`, or `brightdata-unlocker`. |
| `scrapingBeeApiKey` | string | -- | ScrapingBee API key. Required when useProxyBypass is `scrapingbee` or `auto`. |
| `brightDataApiKey` | string | -- | Bright Data account token. Required together with `brightDataZone`. |
| `brightDataZone` | string | -- | Bright Data unlocker zone name. |
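
A client-side pre-flight check can catch obviously inconsistent input before a run is started. The sketch below encodes only the constraints stated in the table above (a URL or domain, `maxReviews` from 1 to 500, the listed `mode` and `analysisDepth` values). `validate_input` is a hypothetical helper, it ignores the CLI aliases, and the actor's own validation may differ.

```python
VALID_MODES = {"scrape", "probe"}
VALID_DEPTHS = {"quick", "standard", "deep"}


def validate_input(run_input: dict) -> list[str]:
    """Return a list of problems; an empty list means the input looks consistent."""
    problems = []
    if not run_input.get("businessUrls") and not run_input.get("companyDomain"):
        problems.append("set businessUrls or companyDomain")
    max_reviews = run_input.get("maxReviews", 50)
    if not isinstance(max_reviews, int) or not 1 <= max_reviews <= 500:
        problems.append("maxReviews must be an integer from 1 to 500")
    if run_input.get("mode", "scrape") not in VALID_MODES:
        problems.append("mode must be 'scrape' or 'probe'")
    if run_input.get("analysisDepth", "standard") not in VALID_DEPTHS:
        problems.append("analysisDepth must be 'quick', 'standard', or 'deep'")
    return problems


print(validate_input({"companyDomain": "revolut.com", "maxReviews": 100}))
print(validate_input({"maxReviews": 900, "mode": "fast"}))
```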

### Quick Start

Analyze a company by domain name -- the simplest possible input:

```json
{
    "companyDomain": "revolut.com",
    "maxReviews": 100,
    "analysisDepth": "standard",
    "enableAiAnalysis": true,
    "openrouterApiKey": "sk-or-..."
}
````

Or use a direct Trustpilot URL without AI:

```json
{
    "businessUrls": ["https://www.trustpilot.com/review/amazon.com"],
    "maxReviews": 200,
    "enableAiAnalysis": false
}
```

### Output Schema

Each business produces a structured JSON object combining profile data, raw reviews, and optional AI analysis:

```json
{
    "url": "https://www.trustpilot.com/review/revolut.com",
    "business_url": "https://www.trustpilot.com/review/revolut.com",
    "business_name": "Revolut",
    "trustpilot_rating": 4.3,
    "trust_score": 4.3,
    "total_reviews_on_trustpilot": 184521,
    "categories": ["Financial Services", "Banking"],
    "reviews_scraped": 100,
    "reviews": [
        {
            "reviewer_name": "John D.",
            "rating": 5,
            "title": "Best banking app ever",
            "text": "Switched from my traditional bank and never looked back...",
            "date": "2024-03-15",
            "review_age_days": 774,
            "has_company_response": true
        }
    ],
    "ai_analysis": {
        "sentiment_score": 7.8,
        "sentiment_breakdown": {
            "positive_pct": 72,
            "neutral_pct": 12,
            "negative_pct": 16
        },
        "praise_themes": [
            {
                "theme": "Easy international transfers",
                "frequency": "Very frequent (40+ mentions)",
                "example_snippet": "sending money abroad is instant"
            }
        ],
        "complaint_themes": [
            {
                "theme": "Account freezes",
                "frequency": "Moderate (15 mentions)",
                "example_snippet": "froze my account without warning"
            }
        ],
        "differentiators": [
            "Multi-currency accounts with real exchange rates",
            "Instant peer-to-peer payments"
        ],
        "recommendations": [
            "Improve communication during account verification holds",
            "Add phone support for premium customers"
        ],
        "summary": "Revolut receives strong praise for its modern app and international transfer capabilities. Key pain points center on account freezes and slow support response times."
    },
    "bypass_used": null,
    "bypass_attempts": 0,
    "bypass_succeeded": false,
    "scraped_at": "2026-04-28T14:30:00Z"
}
```

Field notes:

- `trust_score` is a portfolio-standard alias of `trustpilot_rating` -- same numeric value, 0.0-5.0 scale. Trustpilot removed the standalone TrustScore metric in 2023; both fields are emitted for backward compatibility.
- `review_age_days` -- days since the review was posted, computed at scrape time.
- `has_company_response` -- true when the business has publicly replied to the review.
- `bypass_used`, `bypass_attempts`, `bypass_succeeded` -- telemetry for commercial bypass provider calls. Present on every item so downstream consumers always see a consistent schema.

Deep analysis mode adds `customer_personas`, `temporal_trends`, `keyword_extraction`, and `competitive_positioning` fields. Output is available as JSON, CSV, or Excel.
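
To make the `review_age_days` field note concrete, here is one way the value could be recomputed from a review's `date` and the item's `scraped_at`. This is a sketch under the assumption that both values are UTC and that age is counted in whole calendar days; the actor's exact rounding is not documented here.

```python
from datetime import date, datetime


def review_age_days(review_date: str, scraped_at: str) -> int:
    """Days between the review date and the scrape timestamp (calendar days)."""
    posted = date.fromisoformat(review_date)
    # fromisoformat() on older Python versions rejects a trailing "Z".
    scraped = datetime.fromisoformat(scraped_at.replace("Z", "+00:00")).date()
    return (scraped - posted).days


print(review_age_days("2026-03-17", "2026-04-28T14:30:00Z"))  # 42
```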

### Reliability Notes

Trustpilot may block automated access, especially from datacenter IPs. Use Apify residential proxy for production runs, start with probe mode, and treat `block_detected: true` as a clean signal to pause or switch access strategy.

The actor fetches review pages with polite delays, retries temporary failures, and records block/success telemetry so repeated scheduled runs can decide whether to proceed. Optional commercial bypass providers are available for teams that already use ScrapingBee or Bright Data Web Unlocker; every item records whether a bypass provider was used.
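
The idea of a rolling block/success rate can be sketched as follows. The record shape (a list of `(timestamp, blocked)` tuples) and the function name are assumptions made for illustration; the layout of the actor's telemetry store is not described in this README.

```python
from datetime import datetime, timedelta, timezone


def rolling_success_rate(outcomes, now, window_days=7):
    """Fraction of non-blocked runs inside the window, or None if there are none.

    outcomes: iterable of (timestamp, blocked) tuples (hypothetical record shape).
    """
    cutoff = now - timedelta(days=window_days)
    recent = [blocked for ts, blocked in outcomes if ts >= cutoff]
    if not recent:
        return None
    return sum(1 for blocked in recent if not blocked) / len(recent)


now = datetime(2026, 4, 28, tzinfo=timezone.utc)
history = [
    (now - timedelta(days=10), False),  # outside the 7-day window, ignored
    (now - timedelta(days=3), True),
    (now - timedelta(days=2), False),
    (now - timedelta(days=1), True),
    (now - timedelta(hours=2), True),
]
rate = rolling_success_rate(history, now)
print(f"7-day success rate: {rate:.0%}")  # 25%
```

A scheduler could compare this rate against a threshold of its own choosing before launching a paid batch.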

AI analysis is optional. When enabled, review text is sent to your chosen provider to produce sentiment, themes, complaint clusters, and recommendations.

### Anti-Bot Bypass (v1.7)

Set `useProxyBypass` to route fetches through a commercial bypass provider when standard residential proxy + fingerprint rotation is not sufficient.

| Mode | Behavior |
|------|----------|
| `off` (default) | Standard access path only. No bypass charges. |
| `auto` | Tries ScrapingBee then Bright Data, picking whichever has credentials configured. |
| `scrapingbee` | Route via ScrapingBee -- requires `scrapingBeeApiKey`. |
| `brightdata-unlocker` | Route via Bright Data Web Unlocker -- requires `brightDataApiKey` + `brightDataZone`. |

Every dataset item carries `bypass_used`, `bypass_attempts`, and `bypass_succeeded` fields so you can audit which provider produced the result.
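
The mode table above implies a simple selection order. The sketch below mirrors that documented behavior for illustration only; `pick_bypass_provider` is a hypothetical helper, not the actor's code, and how the actor reacts to missing credentials is an assumption (here it simply returns `None`).

```python
from typing import Optional


def pick_bypass_provider(run_input: dict) -> Optional[str]:
    """Return the provider the documented rules would select, or None."""
    mode = run_input.get("useProxyBypass", "off")
    has_bee = bool(run_input.get("scrapingBeeApiKey"))
    # Bright Data needs both the account token and the zone name.
    has_bright = bool(run_input.get("brightDataApiKey") and run_input.get("brightDataZone"))

    if mode == "off":
        return None
    if mode == "auto":
        # ScrapingBee is tried first, then Bright Data.
        if has_bee:
            return "scrapingbee"
        if has_bright:
            return "brightdata-unlocker"
        return None
    if mode == "scrapingbee":
        return mode if has_bee else None
    if mode == "brightdata-unlocker":
        return mode if has_bright else None
    raise ValueError(f"unknown useProxyBypass value: {mode!r}")


print(pick_bypass_provider({
    "useProxyBypass": "auto",
    "brightDataApiKey": "token",
    "brightDataZone": "zone",
}))
```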

### Probe Mode

For pipelines that batch hundreds of scrapes, a failed batch is expensive. Set `mode=probe` to run a single lightweight fetch against the first URL, record the outcome to the 7-day rolling success-rate telemetry, and emit a single summary item for $0.0005.

```json
{
    "mode": "probe",
    "businessUrls": ["https://www.trustpilot.com/review/revolut.com"]
}
```

The probe item includes `block_detected`, `bypass_stats`, and `success_rate_summary` so your pipeline can skip the batch when the block rate has collapsed.
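
A pipeline might gate its batch on the probe result roughly like this. The sketch uses only the `block_detected` field, because the structure of `bypass_stats` and `success_rate_summary` is not specified here. `probe_then_scrape` and `fake_call_actor` are hypothetical names; the latter stands in for a real Apify call so the example runs offline.

```python
from typing import Any, Callable, Optional

Items = list[dict[str, Any]]


def probe_then_scrape(
    call_actor: Callable[[dict], Items],
    urls: list[str],
    max_reviews: int = 50,
) -> Optional[Items]:
    """Run a probe first; start the full scrape only if the probe was not blocked."""
    probe_items = call_actor({"mode": "probe", "businessUrls": urls[:1]})
    if not probe_items or probe_items[0].get("block_detected"):
        return None  # blocked or no probe result: skip the paid batch
    return call_actor({"mode": "scrape", "businessUrls": urls, "maxReviews": max_reviews})


def fake_call_actor(run_input: dict) -> Items:
    """Offline stand-in that pretends the probe was blocked."""
    if run_input["mode"] == "probe":
        return [{"block_detected": True}]
    return [{"business_name": "Revolut", "reviews_scraped": 50}]


print(probe_then_scrape(fake_call_actor, ["https://www.trustpilot.com/review/revolut.com"]))  # None
```

With the Python client, `call_actor` could wrap `client.actor("harvestlab/trustpilot-scraper").call(run_input=...)` followed by `client.dataset(run["defaultDatasetId"]).list_items().items`, as in the API examples at the end of this page.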

### Scheduling and Webhooks

Schedule weekly Trustpilot reputation runs in Apify Console under Schedules to track TrustScore drift and emerging complaint themes for your brand or competitors. Use Apify run-finished webhooks, n8n, or Make to forward new negative reviews into a Slack `#reputation-alerts` channel or create a HubSpot ticket the moment a run completes -- ideal for customer-success teams who need to respond to 1-star reviews within hours, not days.

**Example n8n workflow**: Apify Webhook trigger -> Filter node (rating <= 2) -> Slack message to `#reputation-alerts` with business name, reviewer name, rating, and review text.
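
If you prefer to do the filtering in code rather than in an n8n Filter node, the same step could look like the sketch below. It assumes dataset items follow the Output Schema above; `low_rating_alerts` and the sample reviewer are made up for the example, and delivery to Slack is left out because the webhook URL and payload are deployment-specific.

```python
def low_rating_alerts(items, max_rating=2):
    """Build one alert line per review rated at or below max_rating."""
    alerts = []
    for item in items:
        if item.get("block_detected"):
            continue  # blocked sentinel rows contain no reviews
        for review in item.get("reviews", []):
            rating = review.get("rating")
            if rating is not None and rating <= max_rating:
                alerts.append(
                    f"{item.get('business_name', 'Unknown business')}: "
                    f"{rating}-star review by {review.get('reviewer_name', 'anonymous')}: "
                    f"{review.get('text', '')}"
                )
    return alerts


# Made-up sample item in the documented output shape.
sample = [
    {
        "business_name": "Revolut",
        "reviews": [
            {"reviewer_name": "John D.", "rating": 5, "text": "Best banking app ever"},
            {"reviewer_name": "Jane S.", "rating": 1, "text": "froze my account without warning"},
        ],
    }
]
for line in low_rating_alerts(sample):
    print(line)
```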

### Pair With

Build a multi-platform reputation suite by combining this actor with:

- **[AI Review Analyzer](https://apify.com/harvestlab/review-analyzer)** -- Analyze Google Maps reviews with AI. Combine Trustpilot reviews (online reputation) with Google Maps reviews (local customer sentiment) for a comprehensive reputation audit.
- **[App Store Scraper](https://apify.com/harvestlab/app-store-scraper)** -- Scrape Apple App Store and Google Play reviews. Together with Trustpilot, covers mobile app reputation alongside web service reputation in one pipeline.
- **[Google News Monitor](https://apify.com/harvestlab/news-monitor)** -- Monitor news coverage about a brand alongside its Trustpilot reputation. Correlate media sentiment with customer review sentiment for a complete brand health dashboard.
- **[Contact Extractor](https://apify.com/harvestlab/contact-extractor)** -- Extract contact details from company websites after evaluating their Trustpilot reputation. Build qualified outreach lists of businesses that meet your reputation standards.
- **[Reddit Scraper](https://apify.com/harvestlab/reddit-scraper)** -- Compare Trustpilot reviews with unfiltered Reddit community opinions for a more complete picture of public sentiment.

### Detecting Blocks in Your Pipeline

When the anti-bot wall blocks a URL, the dataset item looks like this:

```json
{
    "url": "https://www.trustpilot.com/review/revolut.com",
    "business_url": "https://www.trustpilot.com/review/revolut.com",
    "block_detected": true,
    "block_reason": "Trustpilot Cloudflare Enterprise 403",
    "error": "Trustpilot blocked ... after retries.",
    "scraped_at": "2026-04-28T10:11:35Z"
}
```

Filter out blocked rows client-side:

```python
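from apify_client import ApifyClient

client = ApifyClient("<YOUR_API_TOKEN>")
dataset_id = "<DATASET_ID>"  # e.g. run["defaultDatasetId"] from a finished run
items = client.dataset(dataset_id).list_items().items
scraped = [i for i in items if not i.get("block_detected")]
blocked = [i for i in items if i.get("block_detected")]
print(f"{len(scraped)} businesses scraped, {len(blocked)} blocked")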
```

You are only charged for successful reviews (`review-scraped` events count per review actually extracted, not per URL attempted). Blocked URLs do not incur review-scraping charges.

### Frequently Asked Questions

#### Do I need to find the Trustpilot URL for a company?

No. Enter the company's domain name (e.g., "amazon.com") in the `companyDomain` field and the actor builds the correct Trustpilot URL automatically.
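
Judging by the example URLs in this README, the resulting URL has the form `https://www.trustpilot.com/review/<domain>`. The sketch below shows one plausible way to build it; `trustpilot_url` is a hypothetical helper, and the actor's actual normalization rules are not documented here.

```python
def trustpilot_url(company_domain: str) -> str:
    """Build a Trustpilot review-page URL from a bare company domain."""
    domain = company_domain.strip().lower()
    # Tolerate a pasted URL by dropping the scheme and any path.
    for prefix in ("https://", "http://"):
        if domain.startswith(prefix):
            domain = domain[len(prefix):]
    domain = domain.split("/")[0]
    if not domain:
        raise ValueError("company domain is empty")
    return f"https://www.trustpilot.com/review/{domain}"


print(trustpilot_url("revolut.com"))
```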

#### Which LLM provider should I choose?

OpenRouter is recommended for most users: its default model (`google/gemini-2.0-flash-001`, Gemini Flash) costs a fraction of a cent per analysis and produces high-quality results. Choose Anthropic if you specifically need Claude's reasoning. Ollama lets you use self-hosted models (the default is Llama 3.1) at zero API cost.

#### Can I scrape reviews without AI analysis?

Yes. Set `enableAiAnalysis` to `false` to get raw review data only. You are charged only the per-review scraping fee ($0.003) without the AI analysis fee ($0.05). Useful for feeding data into your own analysis pipeline.

#### How many reviews should I scrape for good AI analysis?

50-200 reviews is the sweet spot. Below 50, the AI may not detect reliable patterns. Above 200, accuracy improves marginally while cost increases linearly. For businesses with thousands of reviews, 200 typically captures the full range of sentiment and themes.

#### What is the difference between quick, standard, and deep analysis?

**Quick** gives you a sentiment score, top 3 praise/complaint themes, and a brief summary. **Standard** provides a complete sentiment breakdown, more themes with example snippets, differentiators, and actionable recommendations. **Deep** adds temporal trend analysis, customer persona identification, keyword extraction, and competitive positioning -- best for investor reports and strategic planning.

#### Can I track reputation changes over time?

Yes. Schedule the actor to run weekly or monthly using Apify Schedules. Each run produces a timestamped dataset. Compare sentiment scores and theme distributions across runs to measure how reputation evolves after product launches, PR events, or operational changes.
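
Comparing two runs can be as simple as diffing the rating and sentiment fields of the same business. The sketch below assumes items in the documented output shape; `detect_drop`, the 0.2 threshold, and the sample numbers are illustrative choices, not actor behavior.

```python
def detect_drop(previous: dict, current: dict, threshold: float = 0.2) -> dict:
    """Compare two dataset items for the same business from different runs."""
    rating_delta = round(current["trustpilot_rating"] - previous["trustpilot_rating"], 2)
    result = {
        "rating_delta": rating_delta,
        "rating_dropped": rating_delta <= -threshold,
    }
    # Sentiment is only comparable when both runs had AI analysis enabled.
    prev_ai, cur_ai = previous.get("ai_analysis"), current.get("ai_analysis")
    if prev_ai and cur_ai:
        result["sentiment_delta"] = round(
            cur_ai["sentiment_score"] - prev_ai["sentiment_score"], 2
        )
    return result


# Made-up snapshots from two weekly runs.
last_week = {"trustpilot_rating": 4.3, "ai_analysis": {"sentiment_score": 7.8}}
this_week = {"trustpilot_rating": 4.0, "ai_analysis": {"sentiment_score": 7.1}}
print(detect_drop(last_week, this_week))
```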

### Known Limitations

- **ANTI-BOT BLOCKS (EXPERIMENTAL TIER)** -- As of April 2026, Trustpilot's Cloudflare Enterprise wall returns HTTP 403 on the majority of requests from Apify infrastructure, even with RESIDENTIAL proxy. The actor rotates 4 TLS fingerprints + proxy IPs across 5 retries but often cannot get through. When all retries fail, the dataset contains a `{block_detected: true, block_reason, error}` sentinel row per URL. This is a known upstream constraint, not a bug. Use the commercial bypass hook for higher success rates.
- **Requires an LLM API key for AI analysis** -- You need an API key from OpenRouter, Anthropic, Google AI, or OpenAI (or a self-hosted Ollama instance). LLM usage is billed separately by your chosen provider.
- **Maximum 500 reviews per business** -- Trustpilot caps public review pagination at 500 reviews. The actor respects this cap.
- **Trustpilot page required** -- The company must have a Trustpilot profile. The actor reports an error if no profile is found.
- **Review text availability** -- Some Trustpilot reviews contain only a star rating without text. These are included in the data but contribute less to AI theme analysis.

### Legal and Compliance

This actor scrapes publicly available data. By using this actor, you agree to the following:

- **Your responsibility** -- You are solely responsible for ensuring your use complies with all applicable laws, regulations, and the target website's terms of service. This includes GDPR (EU), CCPA (California), and other data protection laws in your jurisdiction.
- **Trustpilot terms** -- Trustpilot's terms of service govern how their data may be used. Review <https://www.trustpilot.com/legal/terms-for-businesses> before use, particularly for commercial applications.
- **Personal data notice** -- Reviews contain reviewer names and profile data which may constitute personal data under GDPR. Ensure you have a lawful basis (such as legitimate interest) for processing. Exercise caution when storing or republishing review data. Implement data retention policies and honor deletion requests where applicable.
- **No legal advice** -- This actor does not constitute legal advice. Consult a qualified attorney if you have questions about the legality of your specific use case.
- **Intended use** -- This actor is designed for legitimate business purposes such as market research, competitive analysis, and academic research using publicly accessible data.
- **Rate limiting** -- This actor implements polite crawling practices including request delays (1.0-2.5s jitter between pages) and retry backoff to minimize impact on target servers.
- **No warranty** -- This actor is provided "as is" without warranty. Data accuracy depends on the target website's content and structure.
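
The rate-limiting item above mentions 1.0-2.5s jitter between pages and retry backoff. A generic version of that pattern is sketched below. The exponential schedule, its base and cap, and the `TemporaryFetchError` type are assumptions for illustration; only the jitter range and the five-attempt figure come from this README.

```python
import random
import time


class TemporaryFetchError(Exception):
    """Hypothetical error type for a retryable fetch failure."""


def page_delay() -> float:
    """Random pause between pages, in seconds (1.0-2.5s jitter)."""
    return random.uniform(1.0, 2.5)


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff: base * 2^attempt seconds, capped (assumed schedule)."""
    return min(cap, base * 2 ** attempt)


def fetch_with_retries(fetch, url, max_retries=5, sleep=time.sleep):
    """Call fetch(url), retrying temporary failures with backoff between attempts."""
    for attempt in range(max_retries):
        try:
            return fetch(url)
        except TemporaryFetchError:
            if attempt == max_retries - 1:
                raise  # out of attempts: surface the failure to the caller
            sleep(backoff_delay(attempt))


print([backoff_delay(a) for a in range(5)])  # [1.0, 2.0, 4.0, 8.0, 16.0]
```

Passing `sleep` as a parameter keeps the retry loop testable without real waiting.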

# Actor input Schema

## `businessUrls` (type: `array`):

One or more Trustpilot business review page URLs (e.g. https://www.trustpilot.com/review/amazon.com, https://www.trustpilot.com/review/revolut.com, https://uk.trustpilot.com/review/booking.com). Each URL points to a company's reviews page on Trustpilot.

## `companyDomain` (type: `string`):

Shortcut for Business URLs — enter just the company domain (e.g. 'amazon.com', 'revolut.com', 'booking.com') and the actor builds the Trustpilot URL automatically.

## `businessUrl` (type: `string`):

CLI alias for businessUrls (single URL). Hidden from Console form.

## `url` (type: `string`):

CLI alias for businessUrls (single URL). Hidden from Console form.

## `businessDomain` (type: `string`):

CLI alias for companyDomain. Hidden from Console form.

## `domain` (type: `string`):

CLI alias for companyDomain. Hidden from Console form.

## `maxReviews` (type: `integer`):

Maximum number of reviews to scrape per business (1-500). Typical values: 50 (quick sample), 100 (balanced), 300 (deep analysis). More reviews = better AI analysis but higher runtime and cost.

## `maxItems` (type: `integer`):

CLI alias for maxReviews. Hidden from Console form.

## `language` (type: `string`):

Language for the AI analysis output (reviews themselves can be in any language — the AI translates as needed).

## `analysisDepth` (type: `string`):

How detailed the AI analysis should be. Quick = executive summary only (fastest, cheapest). Standard = themes + sentiment breakdown. Deep = full competitive intelligence report with recommendations.

## `enableAiAnalysis` (type: `boolean`):

Generate an AI-powered reputation report with sentiment scoring, top complaints, and competitive insights from the collected reviews. Requires an LLM API key for your chosen provider.

## `llmProvider` (type: `string`):

Which LLM provider to use for AI analysis. 'openrouter' supports many cheap models (Gemini Flash, Llama, etc.), 'anthropic' uses Claude directly.

## `llmModel` (type: `string`):

Specific model to use. Leave empty for the provider default (google/gemini-2.0-flash-001 for OpenRouter, claude-sonnet-4-20250514 for Anthropic, gemini-2.0-flash for Google AI, gpt-4o-mini for OpenAI, llama3.1 for Ollama).

## `openrouterApiKey` (type: `string`):

OpenRouter API key for AI analysis. Set OPENROUTER\_API\_KEY env var OR openrouterApiKey input. Get one at https://openrouter.ai/keys.

## `anthropicApiKey` (type: `string`):

Anthropic API key for AI analysis. Set ANTHROPIC\_API\_KEY env var OR anthropicApiKey input. Get one at https://console.anthropic.com/settings/keys.

## `googleApiKey` (type: `string`):

Google AI API key for Gemini analysis. Set GOOGLE\_API\_KEY env var OR googleApiKey input. Get one at https://aistudio.google.com/app/apikey.

## `openaiApiKey` (type: `string`):

OpenAI API key for AI analysis. Set OPENAI\_API\_KEY env var OR openaiApiKey input. Get one at https://platform.openai.com/api-keys.

## `ollamaBaseUrl` (type: `string`):

Ollama base URL for self-hosted analysis. Set ollamaBaseUrl input or run Ollama locally. Install at https://ollama.com/download. Default: http://localhost:11434

## `proxyConfiguration` (type: `object`):

Proxy settings. Trustpilot aggressively blocks datacenter IPs with 403 errors - Apify residential proxy is strongly recommended.

## `mode` (type: `string`):

'scrape' (default) runs the full scrape + AI pipeline. 'probe' performs a single lightweight health-check fetch against the first URL and records the outcome to the 7-day rolling success-rate telemetry - useful for pipelines that want to pre-check the Cloudflare wall state before paying for a batch. Probe runs are charged at $0.0005 via the probe-completed event.

## `useProxyBypass` (type: `string`):

Route Trustpilot fetches through a commercial anti-bot bypass service. 'auto' picks the first provider with an API key configured. 'off' (default) uses only the built-in curl\_cffi fingerprint rotation. NOTE: the provider list is schema-stable. ScrapingBee and Bright Data Web Unlocker execute real API calls when credentials are present; apify-scraping is a placeholder reserved for a future iteration.

## `scrapingBeeApiKey` (type: `string`):

API key for ScrapingBee. Get one at app.scrapingbee.com/account. Only needed if useProxyBypass includes 'scrapingbee' or 'auto'.

## `brightDataApiKey` (type: `string`):

Account token for Bright Data Web Unlocker. Get one at brightdata.com. Required together with brightDataZone. Only used when useProxyBypass includes 'brightdata-unlocker' or 'auto'.

## `brightDataZone` (type: `string`):

Bright Data unlocker zone name (set up in the Bright Data dashboard). Required together with brightDataApiKey.

## Actor input object example

```json
{
  "businessUrls": [
    "https://www.trustpilot.com/review/amazon.com"
  ],
  "maxReviews": 50,
  "language": "English",
  "analysisDepth": "standard",
  "enableAiAnalysis": false,
  "llmProvider": "openrouter",
  "ollamaBaseUrl": "http://localhost:11434",
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  },
  "mode": "scrape",
  "useProxyBypass": "off"
}
```

# Actor output Schema

## `datasetOutput` (type: `string`):

Dataset containing scraped Trustpilot review records plus probe, block, AI, and telemetry summary records.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "businessUrls": [
        "https://www.trustpilot.com/review/amazon.com"
    ],
    "ollamaBaseUrl": "http://localhost:11434",
    "proxyConfiguration": {
        "useApifyProxy": true,
        "apifyProxyGroups": [
            "RESIDENTIAL"
        ]
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("harvestlab/trustpilot-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "businessUrls": ["https://www.trustpilot.com/review/amazon.com"],
    "ollamaBaseUrl": "http://localhost:11434",
    "proxyConfiguration": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"],
    },
}

# Run the Actor and wait for it to finish
run = client.actor("harvestlab/trustpilot-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
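
The input schema in the OpenAPI specification also lists a `mode` field (`scrape` or `probe`) and a `maxReviews` field limited to 1-500. The sketch below builds an input for either mode and checks those two constraints locally before anything is sent. The helper name `build_input` is hypothetical, and the proxy settings are copied from the example above.

```python
ALLOWED_MODES = ("scrape", "probe")  # enum values from the input schema


def build_input(business_url, mode="scrape", max_reviews=50):
    """Return an Actor input dict (hypothetical helper, not part of any client).

    The defaults mirror the schema defaults: mode "scrape", maxReviews 50.
    """
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {ALLOWED_MODES}, got {mode!r}")
    if not 1 <= max_reviews <= 500:
        raise ValueError("maxReviews must be between 1 and 500")
    return {
        "businessUrls": [business_url],
        "mode": mode,
        "maxReviews": max_reviews,
        "proxyConfiguration": {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
        },
    }


url = "https://www.trustpilot.com/review/amazon.com"
probe_input = build_input(url, mode="probe")
scrape_input = build_input(url, max_reviews=100)
print(probe_input["mode"], scrape_input["maxReviews"])
```

Either dict can be passed as `run_input` to `client.actor("harvestlab/trustpilot-scraper").call(...)`. How to read a probe result is not described in this section, so check the probe records in the dataset before relying on them to gate a full scrape.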

## CLI example

```bash
echo '{
  "businessUrls": [
    "https://www.trustpilot.com/review/amazon.com"
  ],
  "ollamaBaseUrl": "http://localhost:11434",
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  }
}' |
apify call harvestlab/trustpilot-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=harvestlab/trustpilot-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Trustpilot Review Scraper - AI Reputation Monitor",
        "description": "Trustpilot review scraper for reputation monitoring: reviews, ratings, owner replies, AI sentiment/topics, rating-drop tracking, probe-mode block diagnostics, and user-controlled bypass hooks. Experimental anti-bot tier. x402-ready PPE at $0.003/review.",
        "version": "1.7",
        "x-build-id": "kNYK5TgYnNU52dDE3"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/harvestlab~trustpilot-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-harvestlab-trustpilot-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/harvestlab~trustpilot-scraper/runs": {
            "post": {
                "operationId": "runs-sync-harvestlab-trustpilot-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/harvestlab~trustpilot-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-harvestlab-trustpilot-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "businessUrls": {
                        "title": "Trustpilot Business URLs",
                        "type": "array",
                        "description": "One or more Trustpilot business review page URLs (e.g. https://www.trustpilot.com/review/amazon.com, https://www.trustpilot.com/review/revolut.com, https://uk.trustpilot.com/review/booking.com). Each URL points to a company's reviews page on Trustpilot.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "companyDomain": {
                        "title": "Company Domain (alternative)",
                        "type": "string",
                        "description": "Shortcut for Business URLs — enter just the company domain (e.g. 'amazon.com', 'revolut.com', 'booking.com') and the actor builds the Trustpilot URL automatically."
                    },
                    "businessUrl": {
                        "title": "Business URL (CLI alias)",
                        "type": "string",
                        "description": "CLI alias for businessUrls (single URL). Hidden from Console form."
                    },
                    "url": {
                        "title": "URL (CLI alias)",
                        "type": "string",
                        "description": "CLI alias for businessUrls (single URL). Hidden from Console form."
                    },
                    "businessDomain": {
                        "title": "Business Domain (CLI alias)",
                        "type": "string",
                        "description": "CLI alias for companyDomain. Hidden from Console form."
                    },
                    "domain": {
                        "title": "Domain (CLI alias)",
                        "type": "string",
                        "description": "CLI alias for companyDomain. Hidden from Console form."
                    },
                    "maxReviews": {
                        "title": "Max Reviews per Business",
                        "minimum": 1,
                        "maximum": 500,
                        "type": "integer",
                        "description": "Maximum number of reviews to scrape per business (1-500). Typical values: 50 (quick sample), 100 (balanced), 300 (deep analysis). More reviews = better AI analysis but higher runtime and cost.",
                        "default": 50
                    },
                    "maxItems": {
                        "title": "Max Items (CLI alias)",
                        "minimum": 1,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "CLI alias for maxReviews. Hidden from Console form."
                    },
                    "language": {
                        "title": "Analysis Language",
                        "enum": [
                            "English",
                            "Dutch",
                            "German",
                            "French",
                            "Spanish",
                            "Italian",
                            "Portuguese"
                        ],
                        "type": "string",
                        "description": "Language for the AI analysis output (reviews themselves can be in any language — the AI translates as needed).",
                        "default": "English"
                    },
                    "analysisDepth": {
                        "title": "Analysis Depth",
                        "enum": [
                            "quick",
                            "standard",
                            "deep"
                        ],
                        "type": "string",
                        "description": "How detailed the AI analysis should be. Quick = executive summary only (fastest, cheapest). Standard = themes + sentiment breakdown. Deep = full competitive intelligence report with recommendations.",
                        "default": "standard"
                    },
                    "enableAiAnalysis": {
                        "title": "Enable AI Analysis",
                        "type": "boolean",
                        "description": "Generate an AI-powered reputation report with sentiment scoring, top complaints, and competitive insights from the collected reviews. Requires an LLM API key for your chosen provider.",
                        "default": false
                    },
                    "llmProvider": {
                        "title": "LLM Provider",
                        "enum": [
                            "openrouter",
                            "anthropic",
                            "google",
                            "openai",
                            "ollama"
                        ],
                        "type": "string",
                        "description": "Which LLM provider to use for AI analysis. 'openrouter' supports many cheap models (Gemini Flash, Llama, etc.), 'anthropic' uses Claude directly.",
                        "default": "openrouter"
                    },
                    "llmModel": {
                        "title": "LLM Model (optional)",
                        "type": "string",
                        "description": "Specific model to use. Leave empty for the provider default (google/gemini-2.0-flash-001 for OpenRouter, claude-sonnet-4-20250514 for Anthropic, gemini-2.0-flash for Google AI, gpt-4o-mini for OpenAI, llama3.1 for Ollama)."
                    },
                    "openrouterApiKey": {
                        "title": "OpenRouter API Key",
                        "type": "string",
                        "description": "OpenRouter API key for AI analysis. Set OPENROUTER_API_KEY env var OR openrouterApiKey input. Get one at https://openrouter.ai/keys."
                    },
                    "anthropicApiKey": {
                        "title": "Anthropic API Key",
                        "type": "string",
                        "description": "Anthropic API key for AI analysis. Set ANTHROPIC_API_KEY env var OR anthropicApiKey input. Get one at https://console.anthropic.com/settings/keys."
                    },
                    "googleApiKey": {
                        "title": "Google AI API Key",
                        "type": "string",
                        "description": "Google AI API key for Gemini analysis. Set GOOGLE_API_KEY env var OR googleApiKey input. Get one at https://aistudio.google.com/app/apikey."
                    },
                    "openaiApiKey": {
                        "title": "OpenAI API Key",
                        "type": "string",
                        "description": "OpenAI API key for AI analysis. Set OPENAI_API_KEY env var OR openaiApiKey input. Get one at https://platform.openai.com/api-keys."
                    },
                    "ollamaBaseUrl": {
                        "title": "Ollama Base URL",
                        "type": "string",
                        "description": "Ollama base URL for self-hosted analysis. Set ollamaBaseUrl input or run Ollama locally. Install at https://ollama.com/download. Default: http://localhost:11434"
                    },
                    "proxyConfiguration": {
                        "title": "Proxy Configuration",
                        "type": "object",
                        "description": "Proxy settings. Trustpilot aggressively blocks datacenter IPs with 403 errors - Apify residential proxy is strongly recommended."
                    },
                    "mode": {
                        "title": "Run Mode",
                        "enum": [
                            "scrape",
                            "probe"
                        ],
                        "type": "string",
                        "description": "'scrape' (default) runs the full scrape + AI pipeline. 'probe' performs a single lightweight health-check fetch against the first URL and records the outcome to the 7-day rolling success-rate telemetry - useful for pipelines that want to pre-check the Cloudflare wall state before paying for a batch. Probe runs are charged at $0.0005 via the probe-completed event.",
                        "default": "scrape"
                    },
                    "useProxyBypass": {
                        "title": "Anti-Bot Bypass Provider",
                        "enum": [
                            "off",
                            "auto",
                            "scrapingbee",
                            "brightdata-unlocker",
                            "apify-scraping"
                        ],
                        "type": "string",
                        "description": "Route Trustpilot fetches through a commercial anti-bot bypass service. 'auto' picks the first provider with an API key configured. 'off' (default) uses only the built-in curl_cffi fingerprint rotation. NOTE: providers are schema-stable placeholders - ScrapingBee and Bright Data Web Unlocker execute real API calls when credentials are present; apify-scraping is reserved for a future iteration.",
                        "default": "off"
                    },
                    "scrapingBeeApiKey": {
                        "title": "ScrapingBee API Key",
                        "type": "string",
                        "description": "API key for ScrapingBee. Get one at app.scrapingbee.com/account. Only needed if useProxyBypass includes 'scrapingbee' or 'auto'."
                    },
                    "brightDataApiKey": {
                        "title": "Bright Data API Token",
                        "type": "string",
                        "description": "Account token for Bright Data Web Unlocker. Get one at brightdata.com. Required together with brightDataZone. Only used when useProxyBypass includes 'brightdata-unlocker' or 'auto'."
                    },
                    "brightDataZone": {
                        "title": "Bright Data Zone",
                        "type": "string",
                        "description": "Bright Data unlocker zone name (set up in the Bright Data dashboard). Required together with brightDataApiKey."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
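
If you prefer not to install a client library, the `run-sync-get-dataset-items` path from the specification above can be called directly. This is a minimal sketch using only the Python standard library. The function names and the 300-second timeout are assumptions, and the spec documents the `200` response only as "OK", so the JSON parsing of the response body is an assumption as well.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"  # from "servers" in the spec
SYNC_PATH = "/acts/harvestlab~trustpilot-scraper/run-sync-get-dataset-items"


def build_request(token, actor_input):
    """Build the POST request; the token goes in the query string per the spec."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{SYNC_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_and_fetch_items(token, actor_input, timeout=300):
    """Send the request and decode the body as JSON (assumed response format)."""
    request = build_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Building the request makes no network call; run_and_fetch_items does.
    req = build_request("<YOUR_API_TOKEN>", {"companyDomain": "amazon.com"})
    print(req.get_method(), req.full_url)
```

Splitting request construction from sending keeps the URL and payload easy to inspect or unit-test without starting a billable run.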
