Capterra reviews (Works and Fast)
Pricing: from $4.00 / 1,000 results
Extract verified Capterra software reviews at scale. Bypasses Cloudflare, handles pagination & filters, returns clean JSON/CSV/Excel with reviewer details, ratings, pros/cons, and full text. Pay-per-result pricing. Perfect for competitive intel, lead gen, and sentiment analysis.
Developer: Archit Jain
Last modified: 2 days ago
Capterra reviews scraper (Apify Actor)
Scrapes Capterra product review pages (including behind Cloudflare) using SeleniumBase UC, Playwright over CDP, and a residential proxy. Each review is pushed to the default dataset; a summary is stored under the key OUTPUT in the default key-value store. The same run still writes a timestamped JSON file under DATA_DIR (see .env.example).
Requirements
- Proxy: set `PROXY_USER` and `PROXY_PASSWORD` (or `PROXY_PASS`), or enable Apify Proxy on the Actor so `APIFY_PROXY_PASSWORD` is injected (with `PROXY_PROVIDER=apify`, an empty `PROXY_USER` defaults to `groups-RESIDENTIAL` in code).
- Memory: browser automation is heavy; this repo sets default 4096 MB / min 2048 MB in `.actor/actor.json` (adjust if runs OOM).
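For local runs, the proxy variables above can live in a `.env` file. A minimal sketch (variable names follow the text and `.env.example`; the values here are placeholders):

```
# Placeholder values — fill in your own credentials.
PROXY_PROVIDER=apify
# With PROXY_PROVIDER=apify, an empty PROXY_USER defaults to groups-RESIDENTIAL in code.
PROXY_USER=
PROXY_PASSWORD=your_proxy_password
```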
Input
| Field | Description |
|---|---|
| `reviewsUrl` | Capterra reviews-tab URLs as a JSON array of strings (see `.actor/actor.json`). Each URL must be `https://www.capterra.com/p/{numericId}/{slug}/reviews/` (optional trailing slash; no query string or `#fragment`). At most four URLs per run. The scraper still accepts a single string or legacy `{ "url": "..." }` rows if present in saved input. All URLs run in one browser session, in order. |
| `maxReviews` | Cap per URL (default 50). Example: 10 with two URLs → up to 20 reviews in one run. |
| `dataDir` | Optional `DATA_DIR` for JSON + CF screenshots. |
Default local input for `apify run`: `storage/key_value_stores/default/INPUT.json`.
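A minimal `INPUT.json` sketch (field names from the table above; the product URL and values are placeholders, not a real product):

```
{
  "reviewsUrl": [
    "https://www.capterra.com/p/123456/example-product/reviews/"
  ],
  "maxReviews": 10
}
```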
Run
Use a virtualenv inside this repo so dependencies match Docker (requirements.txt):
```
cd /path/to/capterra-scraper
python3 -m venv .venv
.venv/bin/pip install -r requirements.txt
```
Apify CLI uses `$VIRTUAL_ENV/bin/python3` when set, otherwise `./.venv/bin/python3`, otherwise `which python3` (so a parent venv left on PATH still wins). Use the wrapper so only this repo's `.venv` is used (it must exist; see above):
```
./scripts/apify-run.sh
```
If nothing else is activated and `./.venv` exists, plain `apify run` is usually enough. Otherwise:

```
env -u VIRTUAL_ENV PATH="$PWD/.venv/bin:$PATH" apify run
```
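The interpreter-resolution order described above can be sketched as a small function. This mirrors the documented behavior for illustration only; it is not a copy of `scripts/apify-run.sh` or the CLI's actual logic:

```python
import os
import shutil

def resolve_python(env, repo_venv_exists):
    """Illustrative: pick the python3 the Apify CLI would use, per the text above."""
    # 1. An activated venv (possibly a parent repo's) always wins.
    if env.get("VIRTUAL_ENV"):
        return os.path.join(env["VIRTUAL_ENV"], "bin", "python3")
    # 2. Otherwise this repo's ./.venv, if it exists.
    if repo_venv_exists:
        return "./.venv/bin/python3"
    # 3. Otherwise whatever `which python3` finds on PATH.
    return shutil.which("python3", path=env.get("PATH", ""))
```

This is why `deactivate` (or `env -u VIRTUAL_ENV`) fixes runs that pick up a parent repo's venv.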
Without Apify (Docker / plain Python):
```
python src/main.py
```
After `apify run`, results are under `storage/`: default dataset items in `storage/datasets/<id>/` (one JSON file per `push_data` row), and `OUTPUT` in `storage/key_value_stores/default/OUTPUT.json` (`reviewCount`, `reviewsUrls`, and `reviewsUrl` as the first URL for backward compatibility). In Apify Console, the same appears on the run's Dataset and Key-value store tabs.
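The `OUTPUT` shape described above can be sketched as follows (field names are from the text; the function itself is illustrative, not the Actor's code):

```python
def build_output(reviews, reviews_urls):
    """Illustrative: the OUTPUT summary stored in the default key-value store."""
    return {
        "reviewCount": len(reviews),
        "reviewsUrls": reviews_urls,
        # First URL kept under the old singular key for backward compatibility.
        "reviewsUrl": reviews_urls[0] if reviews_urls else None,
    }
```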
Docker image entrypoint is unchanged: Xvfb + python src/main.py.
Deploy
From this directory (with Apify CLI logged in):
```
apify push
```
Configure Actor → Settings → Environment variables (or secrets) for proxy credentials; see .env.example for variable names.
Troubleshooting apify run
- `ModuleNotFoundError: No module named 'apify'` — the CLI uses `$VIRTUAL_ENV/bin/python3` when your shell has any venv activated, which may belong to a parent repo (see the Run section). Fix: `./scripts/apify-run.sh`, or `deactivate` then `apify run`, or `env -u VIRTUAL_ENV apify run` after installing into `./.venv` with `pip install -r requirements.txt`.
- `TypeError: cannot specify both default and default_factory` when importing `apify` — your environment has Pydantic ≥ 2.12, which breaks crawlee (pulled in by `apify` 2.x). Run `pip install 'pydantic>=2.10.3,<2.12'` (already constrained in `requirements.txt`).
- `browserforge.download` has no attribute `DATA_FILES` — pin `browserforge==1.2.3` (see `requirements.txt`) and reinstall: `.venv/bin/pip install -r requirements.txt`.