TripAdvisor Reviews [Only $0.45💰] Scraper
Pricing
from $0.45 / 1,000 results
💰 $0.45 per 1,000 results. Scrape TripAdvisor hotel reviews: title, rating, language, text, dates, owner response, photos, sub-ratings, and optional reviewer profiles. Each review is enriched with place metadata (rating, address, geo, website, histogram). Filter by rating, language, date, and per-place limit.
Developer
Muhamed Didovic
TripAdvisor Reviews Scraper
Turn TripAdvisor hotel pages into structured review datasets. Pull every public review for a property — title, rating, language, text, travel date, owner response, photos, sub‑ratings — already enriched with the host place's full metadata (address, geo, ranking, ratingHistogram). One run, one clean dataset.
✨ Why use this scraper?
Manually opening hotel pages and copying reviews? Stitching together separate "reviews" and "place details" scrapes? Getting blocked by DataDome the moment you scale?
- 🏨 Reviews + place metadata in the same row. Every review already carries `placeInfo` (rating, address, lat/lng, ranking, histogram). No follow‑up enrichment.
- 🎯 Server‑side filters wired through TripAdvisor's GraphQL. Star ratings, languages and per‑place limits are pushed down to the API, so you get back what you asked for. The date cutoff is applied client‑side after each page.
- 📅 Absolute or relative date cutoff. `"2026-01-01"` or `"22 days"`, `"3 weeks"`, `"6 months"`, `"1 year"` are all valid for `lastReviewDate`.
- 👤 Optional reviewer profiles. Set `scrapeReviewerInfo` to populate each review's `user` object with username, hometown, contributions, avatar, profile link.
- 🛡️ Hardened anti‑bot path. Mobile‑Safari UA fallback through `undici` for HTML, real browser fingerprinting via `ImpitHttpClient` for the GraphQL endpoint, single‑shot DataDome detection.
- 📑 Per‑location failure dataset. Skipped or blocked hotels land in a side dataset (`tripadvisor-failures`) instead of getting buried in logs.
- ⚡ Parallel pagination. Each hotel pages through GraphQL in concurrent batches (default 3), respecting your global and per‑place caps.
🎯 Use cases
| Team | What they build |
|---|---|
| Hotel ops | Daily review monitoring + owner‑response SLA tracking |
| Reputation managers | Multi‑property reputation dashboards with ratingHistogram drift over time |
| Market analysts | Competitive benchmarks across cities or chains using placeInfo.rating + numberOfReviews |
| Content / NLP teams | Multilingual review corpora for sentiment and topic models, filtered by language and rating |
| Travel media | Curated "best of" articles backed by recent verified reviews |
| Data teams | One‑shot dataset exports for BI, lake or warehouse ingestion (JSON, CSV, Excel) |
🔧 How it works (pipeline)
- Detect location ID from each `*_Review-...` URL (`-d{id}-`). Works for hotels, restaurants and attractions.
- Fetch the place HTML with a mobile‑Safari User‑Agent. Falls back to a direct `undici` fetch when DataDome blocks the desktop fingerprint, and to URL‑derived `placeInfo` when even that is blocked (reviews still come through over GraphQL).
- Extract `placeInfo` from the page's JSON‑LD + meta description (rating, review count, address, geo, ranking position).
- Page through GraphQL reviews at `/data/graphql/ids` in concurrent batches, with `reviewRatings` / `reviewsLanguages` / `maxItemsPerQuery` pushed into the filter payload.
- Apply `lastReviewDate` client‑side after each page, and break early once an entire page is older than the cutoff.
- Map and push each review enriched with `placeInfo` to the default dataset; skipped or blocked places go to `tripadvisor-failures`.
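
The client-side date cutoff with early break can be sketched roughly as below. This is an illustration only, not the actor's code: `filter_pages` is a hypothetical helper, and it assumes pages arrive newest first and that `publishedDate` is an ISO `YYYY-MM-DD` string.

```python
from datetime import date
from typing import Iterable, Iterator


def filter_pages(pages: Iterable[list[dict]], cutoff: date) -> Iterator[dict]:
    """Yield reviews published on or after `cutoff`, stopping early.

    Hypothetical helper: assumes each page is a list of review dicts with an
    ISO `publishedDate`, and that pages are ordered newest first.
    """
    for page in pages:
        kept = [r for r in page if date.fromisoformat(r["publishedDate"]) >= cutoff]
        yield from kept
        # Early break: a non-empty page with nothing kept is entirely older
        # than the cutoff, so later pages are not requested at all.
        if page and not kept:
            break
```

Because `pages` can be a lazy generator, breaking out of the loop also means the remaining pages are never fetched.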
📥 Supported inputs
Currently supported start URLs (mix and match in a single run):
| Pattern | Example |
|---|---|
| `Hotel_Review-g{geoId}-d{locationId}-Reviews-{slug}.html` | https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html |
| `Restaurant_Review-g{geoId}-d{locationId}-Reviews-{slug}.html` | https://www.tripadvisor.com/Restaurant_Review-g60763-d25324283-Reviews-Allora_Fifth_Ave-New_York_City_New_York.html |
| `Attraction_Review-g{geoId}-d{locationId}-Reviews-{slug}.html` | https://www.tripadvisor.com/Attraction_Review-g60763-d105123-Reviews-Statue_of_Liberty-New_York_City_New_York.html |
All three share the same GraphQL endpoint and review schema — only placeInfo.type differs (HOTEL, EATERY, ATTRACTION).
Not currently supported:
- Search / list pages (`FindRestaurants?...`, `Hotels-g…`, `Attractions-g…`).
- `tripadvisor.co.*` country domains (use `.com`).
- Non‑TripAdvisor hosts.
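
If you want to pre-check a list of URLs before a run, the supported patterns can be approximated with a small validator. This is a sketch under assumptions: `parse_start_url` and the regular expression are hypothetical, not the actor's own parser, and the host check is deliberately strict (`www.tripadvisor.com` only).

```python
import re
from urllib.parse import urlparse

# Maps the URL prefix to the placeInfo.type values named above.
PLACE_TYPES = {"Hotel": "HOTEL", "Restaurant": "EATERY", "Attraction": "ATTRACTION"}

# Hypothetical pattern for {Kind}_Review-g{geoId}-d{locationId}-Reviews-{slug}.html
REVIEW_PATH = re.compile(
    r"^/(Hotel|Restaurant|Attraction)_Review-g(\d+)-d(\d+)-Reviews-[^/]+\.html$"
)


def parse_start_url(url: str):
    """Return (place_type, geo_id, location_id), or None if unsupported."""
    parsed = urlparse(url)
    if parsed.hostname != "www.tripadvisor.com":
        return None  # country domains and other hosts are not supported
    match = REVIEW_PATH.match(parsed.path)
    if match is None:
        return None  # search / list pages do not follow the review pattern
    kind, geo_id, location_id = match.groups()
    return PLACE_TYPES[kind], geo_id, location_id
```

URLs that return `None` here would most likely end up as skipped locations, so filtering them out first keeps the run clean.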
⚙️ Input parameters
Required
| Parameter | Type | Default | Description |
|---|---|---|---|
| `startUrls` | array of `{ url }` | none | One or more TripAdvisor place URLs (`Hotel_Review-…`, `Restaurant_Review-…`, or `Attraction_Review-…`). They are processed in parallel up to `maxConcurrency`. |
Filters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `maxItemsPerQuery` | integer | `50` | Max reviews to fetch per place. `0` = unlimited (paginate to the end). |
| `reviewRatings` | array | `[]` (all) | Star ratings to keep. Values: `"1"`, `"2"`, `"3"`, `"4"`, `"5"`, or `"ALL_REVIEW_RATINGS"`. Pushed down into the GraphQL filter payload. |
| `reviewsLanguages` | array | `[]` (all) | ISO 639‑1 codes (e.g. `["en", "de", "fr"]`) or `"ALL_REVIEW_LANGUAGES"`. Pushed down into the GraphQL filter payload. |
| `lastReviewDate` | string | none | Skip reviews published before this date. Accepts an absolute date `YYYY-MM-DD` or a relative duration: `"22 days"`, `"2 weeks"`, `"3 months"`, `"1 year"` (singular or plural). |
| `scrapeReviewerInfo` | boolean | `true` | When `true`, the `user` object on each review is populated with the reviewer's profile (username, display name, avatar, hometown, contributions, profile link). When `false`, `user` is `null`. The rest of the review fields (`id`, `lang`, `helpfulVotes`, `tripType`, `subratings`, `photos`, …) are emitted in both modes. |
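
To see what a given `lastReviewDate` value resolves to, the two accepted formats can be parsed roughly like this. It is a sketch, not the actor's parser: `resolve_cutoff` is a hypothetical name, and approximating a month as 30 days and a year as 365 days is an assumption that may differ from how the actor resolves them.

```python
import re
from datetime import date, timedelta

# Assumption: months and years are approximated as 30 and 365 days.
UNIT_DAYS = {"day": 1, "week": 7, "month": 30, "year": 365}
RELATIVE = re.compile(r"^(\d+)\s+(day|week|month|year)s?$")


def resolve_cutoff(value: str, today: date) -> date:
    """Turn an absolute or relative `lastReviewDate` into a concrete date."""
    match = RELATIVE.match(value.strip().lower())
    if match:
        amount, unit = int(match.group(1)), match.group(2)
        return today - timedelta(days=amount * UNIT_DAYS[unit])
    # Anything else must be an absolute YYYY-MM-DD date; raises ValueError if not.
    return date.fromisoformat(value.strip())
```

Passing `today` explicitly keeps the function deterministic, which makes scheduled runs easier to reason about and test.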
Advanced
| Parameter | Type | Default | Description |
|---|---|---|---|
| `maxItems` | integer | `10000` | Hard ceiling on the total number of items written to the dataset across all places. `0` = no global cap. |
| `maxConcurrency` | integer | `100` | Max start URLs (hotels) processed concurrently. |
| `minConcurrency` | integer | `1` | Crawler floor. |
| `maxRequestRetries` | integer | `15` (range 0–50) | Retries per request before giving up. Lower = surface failures fast, higher = absorb transient anti‑bot blocks. |
| `proxy` | object | `{ useApifyProxy: true, apifyProxyGroups: ["RESIDENTIAL"] }` | Standard Apify proxy block. Apify Residential is strongly recommended for TripAdvisor. |
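
One way to picture how the global cap (`maxItems`) and the per-place cap (`maxItemsPerQuery`) combine is the small calculation below. It is a hypothetical illustration of the documented semantics (`0` means no cap), not the actor's internal bookkeeping; `remaining_for_place` and its arguments are made-up names.

```python
def remaining_for_place(max_items: int, max_items_per_query: int,
                        written_total: int, written_for_place: int):
    """Return how many more reviews a place may write, or None for unlimited.

    Hypothetical helper: 0 means "no cap" for both settings.
    """
    limits = []
    if max_items > 0:
        # Global ceiling across all places.
        limits.append(max(max_items - written_total, 0))
    if max_items_per_query > 0:
        # Ceiling for this single place.
        limits.append(max(max_items_per_query - written_for_place, 0))
    return min(limits) if limits else None
```

Whichever cap is tighter wins, so a small `maxItems` can stop a place before it reaches its own `maxItemsPerQuery`.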
📊 Output overview
Each dataset row is one review enriched with the host place's placeInfo. The full review schema is emitted in both scrapeReviewerInfo modes — the only difference is whether the user object is populated (true) or set to null (false).
Review schema (both modes)
```json
{
  "id": "1003456789",
  "url": "https://www.tripadvisor.com/ShowUserReviews-g60763-d208453-r1003456789-Hilton_New_York_Times_Square-New_York_City_New_York.html",
  "title": "Great location, friendly staff",
  "lang": "en",
  "language": "en",
  "originalLanguage": "en",
  "locationId": "208453",
  "publishedDate": "2026-03-14",
  "publishedPlatform": "OTHER",
  "rating": 5,
  "helpfulVotes": 2,
  "text": "We stayed three nights and...",
  "travelDate": "2026-03",
  "stayDate": "2026-03-14",
  "tripType": "COUPLES",
  "user": null,
  "ownerResponse": {
    "id": "987654",
    "text": "Thank you for staying with us...",
    "lang": "en",
    "publishedDate": "2026-03-16",
    "responder": "Hilton Times Square",
    "connectionToSubject": "Manager"
  },
  "subratings": [
    { "name": "Service", "value": 5 },
    { "name": "Cleanliness", "value": 5 },
    { "name": "Value", "value": 4 },
    { "name": "Location", "value": 5 },
    { "name": "Rooms", "value": 4 },
    { "name": "Sleep Quality", "value": 5 }
  ],
  "photos": [
    { "id": "812340000", "width": 4032, "height": 3024, "image": "https://media-cdn.tripadvisor.com/media/photo-o/30/68/0c/00/lobby.jpg" }
  ],
  "placeInfo": {
    "id": "208453",
    "name": "Hilton New York Times Square",
    "rating": 4.0,
    "numberOfReviews": 8944,
    "locationString": "New York City, New York",
    "latitude": 40.756,
    "longitude": -73.989,
    "webUrl": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html",
    "website": "https://www.hilton.com/...",
    "address": "234 W 42nd St, New York City, NY 10036",
    "addressObj": {
      "street1": "234 W 42nd St",
      "city": "New York City",
      "state": "NY",
      "postalcode": "10036",
      "country": "United States"
    },
    "ratingHistogram": { "count1": 412, "count2": 480, "count3": 1100, "count4": 2680, "count5": 4272 }
  }
}
```
With scrapeReviewerInfo: true the only field that changes is user — it is replaced with the reviewer's profile object:
```json
{
  "user": {
    "userId": "ABCDEF12345",
    "name": "Traveler 123",
    "contributions": { "totalContributions": 42 },
    "username": "traveler123",
    "userLocation": { "shortName": "Berlin", "name": "Berlin, Germany", "id": "187323" },
    "avatar": { "id": "5567890", "width": 200, "height": 200, "image": "https://media-cdn.tripadvisor.com/media/photo-l/...jpg" },
    "link": "www.tripadvisor.com/Profile/traveler123"
  }
}
```
🗂️ Output fields
Review core
All fields below are emitted in both scrapeReviewerInfo modes.
| Field | Type | Description |
|---|---|---|
| `id` | string | TripAdvisor review ID |
| `url` | string | Permalink to the review on TripAdvisor |
| `title` | string | Review headline |
| `text` | string | Review body |
| `rating` | integer (1–5) | Stars left by the reviewer |
| `lang` | string (ISO 639‑1) | Source language (preserves `originalLanguage` for machine‑translated reviews) |
| `language` | string (ISO 639‑1) | Current API language (may be the translated language) |
| `originalLanguage` | string (ISO 639‑1) | Source language before machine translation |
| `publishedDate` | string (ISO date) | When the review was published |
| `publishedPlatform` | string | Source platform (e.g. `OTHER`, `MOBILE`) |
| `helpfulVotes` | integer | Helpful‑vote count |
| `travelDate` | string (`YYYY-MM`) | Month/year of the stay |
| `stayDate` | string (`YYYY-MM-DD`) | Full stay date |
| `tripType` | string \| null | `COUPLES`, `FAMILY`, `BUSINESS`, `SOLO`, `FRIENDS`, … |
| `locationId` | string | TripAdvisor location ID for the place |
Owner response
| Field | Type | Description |
|---|---|---|
| `ownerResponse.id` | string | Response ID |
| `ownerResponse.text` | string | Response body |
| `ownerResponse.lang` | string | Response language |
| `ownerResponse.publishedDate` | string | Response date |
| `ownerResponse.responder` | string | Display name (e.g. property name or manager) |
| `ownerResponse.connectionToSubject` | string | `Manager`, `Owner`, etc. |
Subratings
`subratings` is an array of `{ name, value }`:
| Field | Type | Description |
|---|---|---|
| `subratings[].name` | string | `Service`, `Cleanliness`, `Value`, `Location`, `Rooms`, `Sleep Quality`, … |
| `subratings[].value` | integer (1–5) | Per‑aspect rating |
Photos
`photos` is an array of `{ id, width, height, image }`:
| Field | Type | Description |
|---|---|---|
| `photos[].id` | string | TripAdvisor photo ID |
| `photos[].width` / `height` | integer | Native pixel dimensions |
| `photos[].image` | string | Bare CDN URL (no `?w=...` query) |
Reviewer (only when scrapeReviewerInfo: true)
| Field | Type | Description |
|---|---|---|
| `user.userId` | string \| null | Internal TripAdvisor user ID |
| `user.name` | string \| null | Display name |
| `user.username` | string \| null | URL handle |
| `user.contributions.totalContributions` | integer | Lifetime contribution count |
| `user.userLocation` | object \| null | `{ shortName, name, id }` resolved from hometown |
| `user.avatar` | object \| null | `{ id, width, height, image }` |
| `user.link` | string \| null | Profile link without scheme (`www.tripadvisor.com/Profile/...`) |
Place metadata (placeInfo)
| Field | Type | Description |
|---|---|---|
| `placeInfo.id` | string | Numeric location ID |
| `placeInfo.name` | string | Place name (hotel, restaurant or attraction) |
| `placeInfo.rating` | number | Aggregate 1–5 rating |
| `placeInfo.numberOfReviews` | integer | Total review count on TripAdvisor |
| `placeInfo.locationString` | string | Human‑readable city/state/country |
| `placeInfo.latitude` / `longitude` | number | Geo coordinates |
| `placeInfo.address` | string | Single‑line address |
| `placeInfo.addressObj` | object | `{ street1, street2, city, state, postalcode, country }` |
| `placeInfo.website` | string | The place's official website |
| `placeInfo.webUrl` | string | TripAdvisor URL |
| `placeInfo.ratingHistogram` | object | `{ count1, count2, count3, count4, count5 }` |
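
For CSV or BI exports it is often handy to flatten the nested row into one level. The sketch below shows one possible way to do that using only the fields documented above; `flatten_review` and its output column names are hypothetical choices, not something the actor produces.

```python
def flatten_review(row: dict) -> dict:
    """Flatten one dataset row into a single-level dict (hypothetical helper)."""
    place = row.get("placeInfo") or {}
    flat = {
        "review_id": row.get("id"),
        "rating": row.get("rating"),
        "lang": row.get("lang"),
        "published_date": row.get("publishedDate"),
        "has_owner_response": row.get("ownerResponse") is not None,
        "photo_count": len(row.get("photos") or []),
        "place_id": place.get("id"),
        "place_name": place.get("name"),
        "place_rating": place.get("rating"),
    }
    # One column per sub-rating, e.g. "Sleep Quality" -> "sub_sleep_quality".
    for sub in row.get("subratings") or []:
        key = "sub_" + sub["name"].lower().replace(" ", "_")
        flat[key] = sub["value"]
    return flat
```

Missing or `null` nested objects are tolerated, so the same function works whether or not a review has an owner response, photos or sub-ratings.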
🚀 Examples
Single hotel, defaults
```json
{
  "startUrls": [
    { "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html" }
  ],
  "maxItemsPerQuery": 50
}
```
Multiple hotels, English‑only, last 30 days
```json
{
  "startUrls": [
    { "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html" },
    { "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d93589-Reviews-The_Manhattan_at_Times_Square_Hotel-New_York_City_New_York.html" }
  ],
  "maxItemsPerQuery": 100,
  "reviewsLanguages": ["en"],
  "lastReviewDate": "30 days"
}
```
Recent 3‑star reviews with full reviewer profiles
```json
{
  "startUrls": [
    { "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html" }
  ],
  "maxItemsPerQuery": 50,
  "reviewRatings": ["3"],
  "reviewsLanguages": ["en"],
  "lastReviewDate": "2026-01-01",
  "scrapeReviewerInfo": true
}
```
Mix hotels and restaurants in one run
```json
{
  "startUrls": [
    { "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html" },
    { "url": "https://www.tripadvisor.com/Restaurant_Review-g60763-d25324283-Reviews-Allora_Fifth_Ave-New_York_City_New_York.html" }
  ],
  "maxItemsPerQuery": 50,
  "reviewsLanguages": ["en"]
}
```
💻 Integrations
Python
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

# Start the actor and wait for the run to finish.
run = client.actor("muhamed-didovic/apify-tripadvisor").call(
    run_input={
        "startUrls": [
            {"url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html"},
        ],
        "maxItemsPerQuery": 50,
        "reviewsLanguages": ["en"],
        "lastReviewDate": "30 days",
    }
)

# Each item is one review enriched with placeInfo.
for review in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{review['rating']}★ {review['title']} ({review['lang']})")
    print(f" place: {review['placeInfo']['name']} rating: {review['placeInfo']['rating']}")
```
JavaScript / Node
```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Start the actor and wait for the run to finish.
const run = await client.actor('muhamed-didovic/apify-tripadvisor').call({
    startUrls: [
        { url: 'https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html' },
    ],
    maxItemsPerQuery: 50,
    scrapeReviewerInfo: true,
});

// Each item is one review; user is populated because scrapeReviewerInfo is true.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((r) => {
    console.log(`${r.rating}★ ${r.title} by ${r.user?.username ?? 'anon'}`);
});
```
📈 Performance
| Metric | Value |
|---|---|
| Reviews per hotel page (GraphQL) | 10 |
| Concurrent review pages per hotel | up to 3 (auto‑capped to 1 when many hotels run in parallel) |
| Anti‑bot path | Mobile‑Safari UA + ImpitHttpClient GraphQL fingerprint, single‑shot DataDome detection |
| Failure visibility | Per‑location rows in tripadvisor-failures named dataset |
| Output formats | JSON, CSV, Excel, RSS, XML, HTML (via Apify dataset views) |
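
As a back-of-the-envelope check, the page size and batch size above translate into request counts roughly as follows. This is only an estimate under stated assumptions: `estimate_requests` is a hypothetical helper, and it ignores retries, the initial HTML fetch and any early break from `lastReviewDate`.

```python
import math

REVIEWS_PER_PAGE = 10  # page size stated in the table above


def estimate_requests(reviews_wanted: int, batch_size: int = 3):
    """Return (graphql_pages, sequential_rounds) for one place.

    Hypothetical estimate: ignores retries, the HTML fetch, and early breaks.
    """
    pages = math.ceil(reviews_wanted / REVIEWS_PER_PAGE)
    # Pages are requested `batch_size` at a time, so rounds are sequential.
    rounds = math.ceil(pages / batch_size)
    return pages, rounds
```

With the default `maxItemsPerQuery` of 50 this gives 5 pages in 2 rounds; with the batch size capped to 1 the same 5 pages take 5 rounds.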
💡 Tips for best results
- Start with one place and `maxItemsPerQuery: 10` to validate the schema. Works the same for hotels, restaurants and attractions. Confirm the fields you care about land in the dataset before launching big runs.
- Use server‑side filters. `reviewRatings` and `reviewsLanguages` are pushed into the GraphQL payload, which is much cheaper than fetching everything and filtering downstream.
- Prefer relative dates over absolute when running on a schedule. `"30 days"` always means "last 30 days from now"; an absolute `2026-01-01` will silently widen as time passes.
- Lower `maxRequestRetries` for fast feedback. The default `15` absorbs anti‑bot blocks but can hide a misconfigured proxy. Try `3` while iterating, raise it for production.
- Toggle `scrapeReviewerInfo` only when you need profiles. All review fields are kept in both modes; the only thing that changes is whether `user` is populated. Disabling it makes the dataset smaller and the run faster.
- Check the `tripadvisor-failures` dataset after each run. Empty = perfect run. Rows = location‑level issues you can replay.
❓ FAQ
Which TripAdvisor URLs are supported?
Hotel_Review-…, Restaurant_Review-…, and Attraction_Review-… URLs on www.tripadvisor.com (anything that follows the …-g{geoId}-d{locationId}-Reviews-… slug pattern). They share the same GraphQL endpoint and review schema — only placeInfo.type differs (HOTEL / EATERY / ATTRACTION). Search/list pages and tripadvisor.co.* country domains are not supported — use .com.
How many reviews can I scrape per run?
As many as your Apify plan and maxItems / maxItemsPerQuery allow. The actor paginates TripAdvisor's GraphQL API in batches of 10 per request, in parallel where possible.
What does scrapeReviewerInfo actually change?
The only field that changes is user. With true (default) the user object is populated with the reviewer's profile (username, display name, avatar, hometown, contributions, profile link). With false it's null. All other review fields (id, lang, subratings, photos, helpfulVotes, tripType, …) are emitted in both modes.
What date formats does lastReviewDate accept?
Both absolute (2026-01-01, ISO YYYY-MM-DD) and relative durations: "22 days", "3 weeks", "6 months", "1 year" (singular and plural both work). Reviews older than the resolved date are dropped client‑side, with an early‑break optimisation that stops paginating once an entire page is older than the cutoff.
Why is lang sometimes different from the on‑site language toggle?
TripAdvisor machine‑translates reviews. We always emit the source language (originalLanguage), not the displayed translation, so analytics groupings stay accurate.
Can I scrape multiple places in one run?
Yes — pass any number of startUrls. Hotels, restaurants and attractions can be mixed in the same run; they're processed concurrently up to maxConcurrency. Intra‑place pagination is auto‑capped to 1 in that case so the GraphQL endpoint isn't hammered with concurrency × pages requests.
Where do failed locations end up?
In a named dataset called tripadvisor-failures with rows like { url, locationId, locationType, reason, message, timestamp }. The reason field is one of invalid-location-id, html-blocked, datadome-block, reviews-fetch-failed, no-reviews-saved. The Apify run page surfaces a direct link.
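
Once the failure rows are loaded (for example via the dataset export), they can be summarised and turned into a replay input along these lines. This is a sketch: `summarize_failures` is a hypothetical helper, and the choice of which reasons count as retryable is an assumption, not something the actor defines.

```python
from collections import Counter

# Assumption: which reasons are worth retrying is a judgment call, not
# something the actor defines.
RETRYABLE = {"html-blocked", "datadome-block", "reviews-fetch-failed"}


def summarize_failures(rows: list[dict]):
    """Count failures per reason and collect URLs worth replaying."""
    by_reason = Counter(row["reason"] for row in rows)
    # Shape the replay list like the actor's startUrls input.
    replay = [{"url": row["url"]} for row in rows if row["reason"] in RETRYABLE]
    return by_reason, replay
```

The returned `replay` list has the same `{ url }` shape as `startUrls`, so it can be passed straight into a follow-up run.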
Is private content accessible?
No. The actor only sees public TripAdvisor pages and the public review GraphQL endpoint.
📬 Support
- Bug reports / feature requests: use the Issues tab on the actor page.
- Direct contact: muhamed.didovic@gmail.com
- Author's website: muhamed-didovic.github.io
🛠️ Additional services
- Custom output schemas, one‑off datasets, scheduled exports.
- Other platforms (Booking, Yelp, Google Maps, etc.) on request.
- API integrations and automation pipelines.
Email: muhamed.didovic@gmail.com.
🔭 Explore more scrapers
If this actor was useful, browse other scrapers from memo23 on Apify.
⚖️ Legal & compliance
This scraper targets publicly accessible TripAdvisor content for legitimate research, monitoring, and analytics. You are responsible for:
- Complying with TripAdvisor's terms of service.
- Respecting `robots.txt` and reasonable rate limits.
- Using scraped data lawfully (privacy, copyright, GDPR/CCPA where applicable).
- Obtaining any necessary permissions for commercial reuse of the data.