Review & Rating Extractor (aggregate + individual)
Under maintenance
Pricing
Pay per usage
Extract the aggregate rating (value, count, best) and individual reviews (author, rating, date, title, body) from public product, business, and article pages via JSON-LD Review and AggregateRating. HTML-only, fast, structured output with clean ok/error parity.
Rating: 0.0 (0)
Developer: Tommy G
Maintained by Community
Actor stats
- Bookmarked: 0
- Total users: 2
- Monthly active users: 1
- Last modified: 3 days ago
Reviews & Ratings Extractor (Apify Actor)
Give it any public page URL and get back clean, normalized review and rating data — the
item being reviewed, its aggregate rating (value/best/count), and the individual reviews
on the page (author, rating, body, date) — pulled from schema.org Review, AggregateRating,
and review-bearing types in JSON-LD and microdata. HTML-only (no headless browser) so it's fast
and cheap. Ideal for reputation monitoring, review datasets, and product/listing research.
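As a rough illustration of the HTML-only approach described above, the sketch below reads JSON-LD blocks out of raw HTML and returns the first AggregateRating it finds. It is not the actor's actual `src/extract.js`: the function names (`findJsonLdBlocks`, `walk`, `extractAggregateRating`) and the regex-based script scan are assumptions made for this example, and it ignores microdata and individual reviews.

```javascript
// Illustrative sketch only; names and approach are assumptions, not the actor's code.

// Collect every parseable <script type="application/ld+json"> payload.
function findJsonLdBlocks(html) {
  const re = /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const blocks = [];
  let match;
  while ((match = re.exec(html)) !== null) {
    try {
      blocks.push(JSON.parse(match[1]));
    } catch {
      // Skip malformed JSON-LD instead of failing the whole page.
    }
  }
  return blocks;
}

// Yield every object node, descending into arrays and @graph containers.
function* walk(node) {
  if (Array.isArray(node)) {
    for (const child of node) yield* walk(child);
  } else if (node && typeof node === 'object') {
    yield node;
    if (node['@graph']) yield* walk(node['@graph']);
  }
}

// Return normalized aggregate fields, or null when no rating markup exists.
function extractAggregateRating(html) {
  for (const block of findJsonLdBlocks(html)) {
    for (const node of walk(block)) {
      const agg = node.aggregateRating;
      if (!agg || typeof agg !== 'object') continue;
      const type = node['@type'];
      const count = agg.ratingCount ?? agg.reviewCount;
      return {
        item_name: node.name ?? null,
        item_type: (Array.isArray(type) ? type[0] : type) ?? null,
        aggregate_rating: Number(agg.ratingValue),
        aggregate_best: agg.bestRating != null ? Number(agg.bestRating) : null,
        aggregate_count: count != null ? Number(count) : null,
      };
    }
  }
  return null;
}
```

A page whose ratings are injected by client-side JavaScript has no such script block in the fetched HTML, so this sketch returns `null` for it.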
What it extracts
For each page it returns one flat record with:
- item_name, item_type (what the reviews are about, e.g. Product / LocalBusiness / Recipe)
- aggregate_rating, aggregate_best, aggregate_count (the summary star rating)
- reviews[] — individual reviews found on the page, reviews_extracted (how many)
Plus control keys present on every row (ok and error alike, for clean buyer tables):
status, requested_url, final_url, http_status, redirected, found, complete, page_type, source, render_required, fields_found, error, extracted_at
Input
{ "startUrls": [{ "url": "https://example.com/product/123" }], "maxConcurrency": 5, "maxPages": 100 }
maxPages capped at 200, maxConcurrency at 20 (cost guard).
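The caps above could be enforced with a small normalization step before any page is fetched. The sketch below is hypothetical: `normalizeInput` and `clampInt` are names invented for this example, and the fallback values (100 pages, concurrency 5) are simply borrowed from the sample input, not confirmed defaults.

```javascript
// Hypothetical input normalizer. The caps come from the text above;
// the helper names and fallback values are assumptions.
const MAX_PAGES_CAP = 200;
const MAX_CONCURRENCY_CAP = 20;

// Parse an integer and clamp it to [min, max]; use fallback if unparseable.
function clampInt(value, fallback, min, max) {
  const n = Number.parseInt(value, 10);
  if (Number.isNaN(n)) return fallback;
  return Math.min(max, Math.max(min, n));
}

function normalizeInput(input = {}) {
  // Accept both { url: "..." } objects and bare strings in startUrls.
  const startUrls = (input.startUrls ?? [])
    .map((entry) => (typeof entry === 'string' ? entry : entry?.url))
    .filter((url) => typeof url === 'string' && url.length > 0);

  return {
    startUrls,
    maxPages: clampInt(input.maxPages, 100, 1, MAX_PAGES_CAP),
    maxConcurrency: clampInt(input.maxConcurrency, 5, 1, MAX_CONCURRENCY_CAP),
  };
}
```

With this sketch, an input asking for `maxPages: 1000` and `maxConcurrency: 50` would be reduced to 200 and 20 respectively.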
Output — one STABLE record per URL (ok and error rows share the shape)
{
  "status": "ok",
  "requested_url": "https://example.com/product/123",
  "final_url": "https://example.com/product/123",
  "http_status": 200,
  "found": true,
  "complete": true,
  "page_type": "review",
  "source": "json-ld",
  "item_name": "Acme Widget",
  "item_type": "Product",
  "aggregate_rating": 4.5,
  "aggregate_best": 5,
  "aggregate_count": 231,
  "reviews": [
    { "author": "Sam", "rating": 5, "body": "Works great.", "date": "2026-04-10" },
    { "author": "Lee", "rating": 4, "body": "Good value.", "date": "2026-04-02" }
  ],
  "reviews_extracted": 2,
  "fields_found": ["item_name", "aggregate_rating", "aggregate_count", "reviews"],
  "extracted_at": "2026-05-29T..."
}
found:false means no review/rating markup was present (e.g. a page with no schema.org review
data, or a JS-rendered review widget). Failed fetches return the same keys with
status:"error" + error.
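Because ok and error rows share the same keys, a consumer can sort a dataset into three groups with plain property checks. The sketch below assumes only the documented keys (`status`, `found`, `error`, `requested_url`, `final_url`, `item_name`, `aggregate_rating`, `aggregate_count`); the `summarize` function and its bucket names are made up for this example.

```javascript
// Hypothetical consumer-side helper: groups output rows by outcome.
function summarize(rows) {
  const summary = { rated: [], noMarkup: [], failed: [] };
  for (const row of rows) {
    if (row.status === 'error') {
      // Fetch failed: same keys, but only the error text is meaningful.
      summary.failed.push({ url: row.requested_url, error: row.error });
    } else if (!row.found) {
      // Page loaded, but no review/rating markup was present in the HTML.
      summary.noMarkup.push(row.requested_url);
    } else {
      summary.rated.push({
        url: row.final_url,
        name: row.item_name,
        rating: row.aggregate_rating,
        count: row.aggregate_count,
      });
    }
  }
  return summary;
}
```

Rows that land in `noMarkup` are the ones worth re-checking by hand, since a JS-rendered review widget and a page with no reviews at all look identical to an HTML-only fetch.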
Use cases
- Reputation monitoring — track aggregate rating and review counts for your listings over time.
- Review datasets — collect individual reviews across many product/service pages for analysis.
- Competitor / market research — compare rating distributions and review volume across pages.
Notes / safety
- Reads only public schema.org review/rating markup — facts-only, no PII beyond what the page itself publicly publishes; no raw page bodies stored.
- SSRF-guarded (scheme + private/metadata IP block + redirect re-check), robots-respecting, rate-limited, cost-capped, all via the shared src/lib/actor_runner.js.
- HTML-only: client-rendered review widgets that inject JSON via JS return found:false (no server-side markup to read). Core logic in src/extract.js (pure, unit-tested).
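To make the SSRF guard in the notes above more concrete, here is a minimal sketch of a scheme check plus a private/metadata IP block. It is an assumption about what such a guard might look like, not the contents of `src/lib/actor_runner.js`: `checkUrl`, `isBlockedIPv4`, and `isBlockedIPv6` are hypothetical names, the range list is deliberately short, and DNS resolution of hostnames is left out.

```javascript
const net = require('node:net');

// IPv4 ranges a public-page fetcher would typically refuse (not exhaustive).
function isBlockedIPv4(ip) {
  const [a, b] = ip.split('.').map(Number);
  return (
    a === 0 ||
    a === 10 ||                          // 10.0.0.0/8 private
    a === 127 ||                         // loopback
    (a === 169 && b === 254) ||          // link-local, incl. 169.254.169.254 metadata
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12 private
    (a === 192 && b === 168)             // 192.168.0.0/16 private
  );
}

// Loopback, unspecified, unique-local, link-local, and IPv4-mapped addresses.
function isBlockedIPv6(ip) {
  const lower = ip.toLowerCase();
  return (
    lower === '::1' ||
    lower === '::' ||
    /^f[cd]/.test(lower) ||      // fc00::/7 unique-local
    /^fe[89ab]/.test(lower) ||   // fe80::/10 link-local
    lower.startsWith('::ffff:')  // IPv4-mapped, blocked conservatively
  );
}

// Returns { allowed, reason } for a candidate URL.
function checkUrl(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return { allowed: false, reason: 'invalid_url' };
  }
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    return { allowed: false, reason: 'bad_scheme' };
  }
  // URL.hostname keeps the brackets around IPv6 literals; strip them.
  const host = url.hostname.replace(/^\[|\]$/g, '');
  const family = net.isIP(host);
  if (host === 'localhost' || host.endsWith('.localhost')) {
    return { allowed: false, reason: 'blocked_host' };
  }
  if ((family === 4 && isBlockedIPv4(host)) || (family === 6 && isBlockedIPv6(host))) {
    return { allowed: false, reason: 'blocked_ip' };
  }
  return { allowed: true, reason: null };
}
```

A fuller guard would also resolve hostnames and apply the same IP check to the resolved addresses, then run `checkUrl` again on every redirect target, which is what the "redirect re-check" in the note refers to.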
Run locally / test
npm install
npm test   # unit tests on the pure extractor (node:test)
Publish to Apify (account-holder's step)
$ npm install -g apify-cli && apify login && apify push
Keep it free initially; enable pricing later via the adult account-holder once it shows repeat organic usage and clears a margin gate.