TripAdvisor Reviews [Only $0.45💰] Scraper

Pricing

from $0.45 / 1,000 results

💰 $0.45 per 1,000 results. Scrape TripAdvisor reviews for hotels, restaurants, and attractions: title, rating, language, text, dates, owner response, photos, sub-ratings, and optional reviewer profiles. Each review is enriched with place metadata (rating, address, geo, website, histogram). Filter by rating, language, date, and per-place limit.

Rating

0.0

(0)

Developer

Muhamed Didovic

Maintained by Community

Actor stats

0

Bookmarked

2

Total users

1

Monthly active users

20 hours ago

Last modified

TripAdvisor Reviews Scraper

Turn TripAdvisor hotel pages into structured review datasets. Pull every public review for a property — title, rating, language, text, travel date, owner response, photos, sub‑ratings — already enriched with the host place's full metadata (address, geo, ranking, ratingHistogram). One run, one clean dataset.

✨ Why use this scraper?

Manually opening hotel pages and copying reviews? Stitching together separate "reviews" and "place details" scrapes? Getting blocked by DataDome the moment you scale?

  • 🏨 Reviews + place metadata in the same row. Every review already carries placeInfo (rating, address, lat/lng, ranking, histogram). No follow‑up enrichment.
  • 🎯 Server‑side filters wired through TripAdvisor's GraphQL. Star ratings, languages, per‑place limits and dates are pushed down to the API — you get back what you asked for.
  • 📅 Absolute or relative date cutoff. "2026-01-01" or "22 days", "3 weeks", "6 months", "1 year" — all valid for lastReviewDate.
  • 👤 Optional reviewer profiles. scrapeReviewerInfo controls whether the user object on each review is populated with the reviewer's profile (username, hometown, contributions, avatar, profile link) or set to null. All other review fields are the same in both modes.
  • 🛡️ Hardened anti‑bot path. Mobile‑Safari UA fallback through undici for HTML, real browser fingerprinting via ImpitHttpClient for the GraphQL endpoint, single‑shot DataDome detection.
  • 📑 Per‑location failure dataset. Skipped or blocked hotels land in a side dataset (tripadvisor-failures) instead of getting buried in logs.
  • Parallel pagination. Each hotel pages through GraphQL in concurrent batches (default 3), respecting your global and per‑place caps.

🎯 Use cases

| Team | What they build |
| --- | --- |
| Hotel ops | Daily review monitoring + owner-response SLA tracking |
| Reputation managers | Multi-property reputation dashboards with ratingHistogram drift over time |
| Market analysts | Competitive benchmarks across cities or chains using placeInfo.rating + numberOfReviews |
| Content / NLP teams | Multilingual review corpora for sentiment and topic models, filtered by language and rating |
| Travel media | Curated "best of" articles backed by recent verified reviews |
| Data teams | One-shot dataset exports for BI, lake or warehouse ingestion (JSON, CSV, Excel) |

🔧 How it works (pipeline)

  1. Detect location ID from each *_Review-... URL (-d{id}-) — works for hotels, restaurants and attractions.
  2. Fetch the place HTML with a mobile‑Safari User‑Agent. Falls back to a direct undici fetch when DataDome blocks the desktop fingerprint, and to URL‑derived placeInfo when even that is blocked (reviews still come through over GraphQL).
  3. Extract placeInfo from the page's JSON‑LD + meta description (rating, review count, address, geo, ranking position).
  4. Page through GraphQL reviews at /data/graphql/ids in concurrent batches, with reviewRatings / reviewsLanguages / maxItemsPerQuery pushed into the filter payload.
  5. Apply lastReviewDate client‑side after each page, and break early once an entire page is older than the cutoff.
  6. Map and push each review enriched with placeInfo to the default dataset; skipped or blocked places go to tripadvisor-failures.
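
The early break in step 5 can be sketched as follows. This is a minimal illustration, not the actor's own code: `fetch_page` is a hypothetical callable standing in for one GraphQL request, and the sketch assumes `publishedDate` is a `YYYY-MM-DD` string.

```python
from datetime import date
from typing import Callable, Iterator

Review = dict  # one review row, e.g. {"id": "1", "publishedDate": "2026-03-14"}


def paginate_until_cutoff(
    fetch_page: Callable[[int], list[Review]],  # hypothetical: page index -> reviews
    cutoff: date,
    limit: int = 0,  # 0 = unlimited, mirroring maxItemsPerQuery
) -> Iterator[Review]:
    """Yield reviews published on or after `cutoff`; stop once a whole page is older."""
    emitted = 0
    page_index = 0
    while True:
        page = fetch_page(page_index)
        if not page:
            return  # no more reviews for this place
        fresh = [r for r in page if date.fromisoformat(r["publishedDate"]) >= cutoff]
        for review in fresh:
            yield review
            emitted += 1
            if limit and emitted >= limit:
                return  # per-place cap reached
        if not fresh:
            return  # entire page is older than the cutoff: break early
        page_index += 1
```

A page that mixes newer and older reviews keeps the newer ones and continues to the next page; only a page with no review on or after the cutoff ends pagination.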

📥 Supported inputs

Currently supported start URLs (mix and match in a single run):

| Pattern | Example |
| --- | --- |
| Hotel_Review-g{geoId}-d{locationId}-Reviews-{slug}.html | https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html |
| Restaurant_Review-g{geoId}-d{locationId}-Reviews-{slug}.html | https://www.tripadvisor.com/Restaurant_Review-g60763-d25324283-Reviews-Allora_Fifth_Ave-New_York_City_New_York.html |
| Attraction_Review-g{geoId}-d{locationId}-Reviews-{slug}.html | https://www.tripadvisor.com/Attraction_Review-g60763-d105123-Reviews-Statue_of_Liberty-New_York_City_New_York.html |

All three share the same GraphQL endpoint and review schema — only placeInfo.type differs (HOTEL, EATERY, ATTRACTION).

Not currently supported:

  • Search / list pages (FindRestaurants?..., Hotels-g…, Attractions-g…).
  • tripadvisor.co.* country domains (use .com).
  • Non‑TripAdvisor hosts.
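
To pre-check a list of start URLs before launching a run, something like the sketch below may help. It is an illustration based on the patterns listed above, not the actor's own validation: the function name `parse_start_url` is hypothetical, and treating `www.tripadvisor.com` as the only accepted host is an assumption drawn from the "use .com" note.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Path pattern from the table above: {Type}_Review-g{geoId}-d{locationId}-Reviews-{slug}.html
_PLACE_PATH = re.compile(
    r"^/(Hotel|Restaurant|Attraction)_Review-g(\d+)-d(\d+)-Reviews-(.+)\.html$"
)
# URL prefix -> placeInfo.type value, as listed above.
_PLACE_TYPES = {"Hotel": "HOTEL", "Restaurant": "EATERY", "Attraction": "ATTRACTION"}


def parse_start_url(url: str) -> Optional[dict]:
    """Return type, geoId and locationId for a supported URL, else None."""
    parsed = urlparse(url)
    if parsed.hostname != "www.tripadvisor.com":
        return None  # country domains and non-TripAdvisor hosts
    match = _PLACE_PATH.match(parsed.path)
    if match is None:
        return None  # search/list pages and anything else
    kind, geo_id, location_id, _slug = match.groups()
    return {"type": _PLACE_TYPES[kind], "geoId": geo_id, "locationId": location_id}
```

URLs for which this returns `None` can be dropped or rewritten to the `.com` domain before they reach `startUrls`.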

⚙️ Input parameters

Required

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| startUrls | array of { url } | (none) | One or more TripAdvisor place URLs (Hotel_Review-…, Restaurant_Review-…, or Attraction_Review-…). They run in parallel up to maxConcurrency. |

Filters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| maxItemsPerQuery | integer | 50 | Max reviews to fetch per place. 0 = unlimited (paginate to the end). |
| reviewRatings | array | [] (all) | Star ratings to keep. Values: "1", "2", "3", "4", "5", or "ALL_REVIEW_RATINGS". Pushed down into the GraphQL filter payload. |
| reviewsLanguages | array | [] (all) | ISO 639-1 codes (e.g. ["en", "de", "fr"]) or "ALL_REVIEW_LANGUAGES". Pushed down into the GraphQL filter payload. |
| lastReviewDate | string | (none) | Skip reviews published before this date. Accepts an absolute date YYYY-MM-DD or a relative duration: "22 days", "2 weeks", "3 months", "1 year" (singular or plural). |
| scrapeReviewerInfo | boolean | true | When true, the user object on each review is populated with the reviewer's profile (username, display name, avatar, hometown, contributions, profile link). When false, user is null. The rest of the review fields (id, lang, helpfulVotes, tripType, subratings, photos, …) are emitted in both modes. |
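
The two accepted forms of lastReviewDate can be resolved to a single cutoff date along these lines. This is a sketch under stated assumptions rather than the actor's parser: the name `resolve_cutoff` is hypothetical, and months and years are approximated as 30 and 365 days, which may differ from how the actor resolves them.

```python
import re
from datetime import date, timedelta

# "<number> <unit>" with an optional plural "s", e.g. "22 days" or "1 year".
_RELATIVE = re.compile(r"^(\d+)\s+(day|week|month|year)s?$", re.IGNORECASE)
# Assumption: months and years are approximated as 30 and 365 days.
_DAYS_PER_UNIT = {"day": 1, "week": 7, "month": 30, "year": 365}


def resolve_cutoff(value: str, today: date) -> date:
    """Turn an absolute or relative lastReviewDate value into a cutoff date."""
    text = value.strip()
    match = _RELATIVE.match(text)
    if match:
        amount, unit = int(match.group(1)), match.group(2).lower()
        return today - timedelta(days=amount * _DAYS_PER_UNIT[unit])
    return date.fromisoformat(text)  # raises ValueError for anything else


print(resolve_cutoff("3 weeks", date(2026, 3, 1)))      # 2026-02-08
print(resolve_cutoff("2026-01-01", date(2026, 3, 1)))   # 2026-01-01
```

Passing `today` explicitly keeps the function deterministic, which makes scheduled runs easier to reason about and test.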

Advanced

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| maxItems | integer | 10000 | Hard ceiling on the total number of items written to the dataset across all places. 0 = no global cap. |
| maxConcurrency | integer | 100 | Max start URLs (places) processed concurrently. |
| minConcurrency | integer | 1 | Crawler floor. |
| maxRequestRetries | integer | 15 (range 0–50) | Retries per request before giving up. Lower = surface failures fast, higher = absorb transient anti-bot blocks. |
| proxy | object | { useApifyProxy: true, apifyProxyGroups: ["RESIDENTIAL"] } | Standard Apify proxy block. Apify Residential is strongly recommended for TripAdvisor. |

📊 Output overview

Each dataset row is one review enriched with the host place's placeInfo. The full review schema is emitted in both scrapeReviewerInfo modes — the only difference is whether the user object is populated (true) or set to null (false).

Review schema (both modes)

{
  "id": "1003456789",
  "url": "https://www.tripadvisor.com/ShowUserReviews-g60763-d208453-r1003456789-Hilton_New_York_Times_Square-New_York_City_New_York.html",
  "title": "Great location, friendly staff",
  "lang": "en",
  "language": "en",
  "originalLanguage": "en",
  "locationId": "208453",
  "publishedDate": "2026-03-14",
  "publishedPlatform": "OTHER",
  "rating": 5,
  "helpfulVotes": 2,
  "text": "We stayed three nights and...",
  "travelDate": "2026-03",
  "stayDate": "2026-03-14",
  "tripType": "COUPLES",
  "user": null,
  "ownerResponse": {
    "id": "987654",
    "text": "Thank you for staying with us...",
    "lang": "en",
    "publishedDate": "2026-03-16",
    "responder": "Hilton Times Square",
    "connectionToSubject": "Manager"
  },
  "subratings": [
    { "name": "Service", "value": 5 },
    { "name": "Cleanliness", "value": 5 },
    { "name": "Value", "value": 4 },
    { "name": "Location", "value": 5 },
    { "name": "Rooms", "value": 4 },
    { "name": "Sleep Quality", "value": 5 }
  ],
  "photos": [
    { "id": "812340000", "width": 4032, "height": 3024, "image": "https://media-cdn.tripadvisor.com/media/photo-o/30/68/0c/00/lobby.jpg" }
  ],
  "placeInfo": {
    "id": "208453",
    "name": "Hilton New York Times Square",
    "rating": 4.0,
    "numberOfReviews": 8944,
    "locationString": "New York City, New York",
    "latitude": 40.756,
    "longitude": -73.989,
    "webUrl": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html",
    "website": "https://www.hilton.com/...",
    "address": "234 W 42nd St, New York City, NY 10036",
    "addressObj": {
      "street1": "234 W 42nd St",
      "city": "New York City",
      "state": "NY",
      "postalcode": "10036",
      "country": "United States"
    },
    "ratingHistogram": { "count1": 412, "count2": 480, "count3": 1100, "count4": 2680, "count5": 4272 }
  }
}

The example above shows user as null, which is the scrapeReviewerInfo: false output. With scrapeReviewerInfo: true, the only field that changes is user, which is replaced with the reviewer's profile object:

{
  "user": {
    "userId": "ABCDEF12345",
    "name": "Traveler 123",
    "contributions": { "totalContributions": 42 },
    "username": "traveler123",
    "userLocation": { "shortName": "Berlin", "name": "Berlin, Germany", "id": "187323" },
    "avatar": { "id": "5567890", "width": 200, "height": 200, "image": "https://media-cdn.tripadvisor.com/media/photo-l/...jpg" },
    "link": "www.tripadvisor.com/Profile/traveler123"
  }
}

🗂️ Output fields

Review core

All fields below are emitted in both scrapeReviewerInfo modes.

| Field | Type | Description |
| --- | --- | --- |
| id | string | TripAdvisor review ID |
| url | string | Permalink to the review on TripAdvisor |
| title | string | Review headline |
| text | string | Review body |
| rating | integer (1–5) | Stars left by the reviewer |
| lang | string (ISO 639-1) | Source language (preserves originalLanguage for machine-translated reviews) |
| language | string (ISO 639-1) | Current API language (may be the translated language) |
| originalLanguage | string (ISO 639-1) | Source language before machine translation |
| publishedDate | string (ISO date) | When the review was published |
| publishedPlatform | string | Source platform (e.g. OTHER, MOBILE) |
| helpfulVotes | integer | Helpful-vote count |
| travelDate | string (YYYY-MM) | Month/year of the stay |
| stayDate | string (YYYY-MM-DD) | Full stay date |
| tripType | string or null | COUPLES, FAMILY, BUSINESS, SOLO, FRIENDS, … |
| locationId | string | TripAdvisor location ID for the place |

Owner response

| Field | Type | Description |
| --- | --- | --- |
| ownerResponse.id | string | Response ID |
| ownerResponse.text | string | Response body |
| ownerResponse.lang | string | Response language |
| ownerResponse.publishedDate | string | Response date |
| ownerResponse.responder | string | Display name (e.g. property name or manager) |
| ownerResponse.connectionToSubject | string | Manager, Owner, etc. |

Subratings

subratings is an array of { name, value }:

| Field | Type | Description |
| --- | --- | --- |
| subratings[].name | string | Service, Cleanliness, Value, Location, Rooms, Sleep Quality, … |
| subratings[].value | integer (1–5) | Per-aspect rating |

Photos

photos is an array of { id, width, height, image }:

| Field | Type | Description |
| --- | --- | --- |
| photos[].id | string | TripAdvisor photo ID |
| photos[].width / height | integer | Native pixel dimensions |
| photos[].image | string | Bare CDN URL (no ?w=... query) |

Reviewer (only when scrapeReviewerInfo: true)

| Field | Type | Description |
| --- | --- | --- |
| user.userId | string or null | Internal TripAdvisor user ID |
| user.name | string or null | Display name |
| user.username | string or null | URL handle |
| user.contributions.totalContributions | integer | Lifetime contribution count |
| user.userLocation | object or null | { shortName, name, id } resolved from hometown |
| user.avatar | object or null | { id, width, height, image } |
| user.link | string or null | Profile link without scheme (www.tripadvisor.com/Profile/...) |
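
Because user.link comes without a scheme and user itself is null when scrapeReviewerInfo is false, downstream code usually needs a small guard before using the link. A minimal sketch, with `profile_url` as a hypothetical helper name and `https` as an assumed scheme:

```python
from typing import Optional


def profile_url(review: dict) -> Optional[str]:
    """Return an absolute profile URL for a review row, or None if unavailable."""
    user = review.get("user")
    if not user:  # user is null when scrapeReviewerInfo is false
        return None
    link = user.get("link")
    if not link:  # link itself may be null
        return None
    if link.startswith(("http://", "https://")):
        return link  # already absolute, leave untouched
    return "https://" + link.lstrip("/")  # assumption: https is the right scheme
```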

Place metadata (placeInfo)

| Field | Type | Description |
| --- | --- | --- |
| placeInfo.id | string | Numeric location ID |
| placeInfo.name | string | Place name |
| placeInfo.rating | number | Aggregate 1–5 rating |
| placeInfo.numberOfReviews | integer | Total review count on TripAdvisor |
| placeInfo.locationString | string | Human-readable city/state/country |
| placeInfo.latitude / longitude | number | Geo coordinates |
| placeInfo.address | string | Single-line address |
| placeInfo.addressObj | object | { street1, street2, city, state, postalcode, country } |
| placeInfo.website | string | Place's official website |
| placeInfo.webUrl | string | TripAdvisor URL |
| placeInfo.ratingHistogram | object | { count1, count2, count3, count4, count5 } |
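
For dashboards that track ratingHistogram over time, the histogram can be reduced to a mean rating and per-star shares. The sketch below is an illustration only; the helper names are hypothetical, and the sample values are the ones from the example row above.

```python
def histogram_mean(histogram: dict) -> float:
    """Average star rating implied by count1..count5."""
    total = sum(histogram[f"count{stars}"] for stars in range(1, 6))
    if total == 0:
        raise ValueError("histogram has no reviews")
    weighted = sum(stars * histogram[f"count{stars}"] for stars in range(1, 6))
    return weighted / total


def histogram_shares(histogram: dict) -> dict:
    """Fraction of reviews at each star level, keyed 1..5."""
    total = sum(histogram[f"count{stars}"] for stars in range(1, 6))
    if total == 0:
        raise ValueError("histogram has no reviews")
    return {stars: histogram[f"count{stars}"] / total for stars in range(1, 6)}


sample = {"count1": 412, "count2": 480, "count3": 1100, "count4": 2680, "count5": 4272}
print(round(histogram_mean(sample), 2))  # 4.11
```

Comparing these values between two runs of the same place gives a simple drift signal without storing every review.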

🚀 Examples

Single hotel, defaults

{
  "startUrls": [
    { "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html" }
  ],
  "maxItemsPerQuery": 50
}

Multiple hotels, English‑only, last 30 days

{
  "startUrls": [
    { "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html" },
    { "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d93589-Reviews-The_Manhattan_at_Times_Square_Hotel-New_York_City_New_York.html" }
  ],
  "maxItemsPerQuery": 100,
  "reviewsLanguages": ["en"],
  "lastReviewDate": "30 days"
}

Recent 3‑star reviews with full reviewer profiles

{
  "startUrls": [
    { "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html" }
  ],
  "maxItemsPerQuery": 50,
  "reviewRatings": ["3"],
  "reviewsLanguages": ["en"],
  "lastReviewDate": "2026-01-01",
  "scrapeReviewerInfo": true
}

Mix hotels and restaurants in one run

{
  "startUrls": [
    { "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html" },
    { "url": "https://www.tripadvisor.com/Restaurant_Review-g60763-d25324283-Reviews-Allora_Fifth_Ave-New_York_City_New_York.html" }
  ],
  "maxItemsPerQuery": 50,
  "reviewsLanguages": ["en"]
}

💻 Integrations

Python

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

# Start the actor and wait for the run to finish.
run = client.actor("muhamed-didovic/apify-tripadvisor").call(run_input={
    "startUrls": [
        {"url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html"},
    ],
    "maxItemsPerQuery": 50,
    "reviewsLanguages": ["en"],
    "lastReviewDate": "30 days",
})

# Stream the reviews from the run's default dataset.
for review in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{review['rating']}★ {review['title']} ({review['lang']})")
    print(f"  place: {review['placeInfo']['name']}  rating: {review['placeInfo']['rating']}")

JavaScript / Node

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Start the actor and wait for the run to finish.
const run = await client.actor('muhamed-didovic/apify-tripadvisor').call({
    startUrls: [
        { url: 'https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html' },
    ],
    maxItemsPerQuery: 50,
    scrapeReviewerInfo: true,
});

// Read the reviews from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((r) => {
    console.log(`${r.rating}★ ${r.title} by ${r.user?.username ?? 'anon'}`);
});

📈 Performance

| Metric | Value |
| --- | --- |
| Reviews per hotel page (GraphQL) | 10 |
| Concurrent review pages per hotel | up to 3 (auto-capped to 1 when many hotels run in parallel) |
| Anti-bot path | Mobile-Safari UA + ImpitHttpClient GraphQL fingerprint, single-shot DataDome detection |
| Failure visibility | Per-location rows in tripadvisor-failures named dataset |
| Output formats | JSON, CSV, Excel, RSS, XML, HTML (via Apify dataset views) |

💡 Tips for best results

  1. Start with one place and maxItemsPerQuery: 10 to validate the schema. Works the same for hotels, restaurants and attractions — confirm the fields you care about land in the dataset before launching big runs.
  2. Use server‑side filters. reviewRatings and reviewsLanguages are pushed into the GraphQL payload — much cheaper than fetching everything and filtering downstream.
  3. Prefer relative dates over absolute when running on a schedule. "30 days" always means "last 30 days from now"; an absolute 2026-01-01 will silently widen as time passes.
  4. Lower maxRequestRetries for fast feedback. The default 15 absorbs anti‑bot blocks but can hide a misconfigured proxy. Try 3 while iterating, raise it for production.
  5. Toggle scrapeReviewerInfo only when you need profiles. All review fields are kept in both modes; the only thing that changes is whether user is populated. Disabling it makes the dataset smaller and the run faster.
  6. Check the tripadvisor-failures dataset after each run. Empty = perfect run. Rows = location‑level issues you can replay.

❓ FAQ

Which TripAdvisor URLs are supported? Hotel_Review-…, Restaurant_Review-…, and Attraction_Review-… URLs on www.tripadvisor.com (anything that follows the …-g{geoId}-d{locationId}-Reviews-… slug pattern). They share the same GraphQL endpoint and review schema — only placeInfo.type differs (HOTEL / EATERY / ATTRACTION). Search/list pages and tripadvisor.co.* country domains are not supported — use .com.

How many reviews can I scrape per run? As many as your Apify plan and maxItems / maxItemsPerQuery allow. The actor paginates TripAdvisor's GraphQL API in batches of 10 per request, in parallel where possible.
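
For rough planning, the batch size of 10 and the listed price can be turned into an upper-bound estimate. This is a back-of-the-envelope sketch, not a billing formula: `estimate_run` is a hypothetical helper, the $0.45 figure is the "from" price quoted on the listing and may differ by plan, and real runs return fewer results when a place has fewer reviews than the cap or filters exclude some.

```python
import math

PAGE_SIZE = 10         # reviews per GraphQL request, per the answer above
PRICE_PER_1000 = 0.45  # USD, the "from" price on the listing (assumed flat)


def estimate_run(places: int, max_items_per_query: int, max_items: int = 10000) -> dict:
    """Upper-bound results, GraphQL requests and cost for a run."""
    if max_items_per_query <= 0:
        raise ValueError("unlimited per-place runs cannot be estimated up front")
    results = places * max_items_per_query
    if max_items:  # 0 means no global cap
        results = min(results, max_items)
    requests = places * math.ceil(max_items_per_query / PAGE_SIZE)
    cost = round(results / 1000 * PRICE_PER_1000, 4)
    return {"results": results, "requests": requests, "cost_usd": cost}


print(estimate_run(places=2, max_items_per_query=100))
# {'results': 200, 'requests': 20, 'cost_usd': 0.09}
```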

What does scrapeReviewerInfo actually change? The only field that changes is user. With true (default) the user object is populated with the reviewer's profile (username, display name, avatar, hometown, contributions, profile link). With false it's null. All other review fields (id, lang, subratings, photos, helpfulVotes, tripType, …) are emitted in both modes.

What date formats does lastReviewDate accept? Both absolute (2026-01-01, ISO YYYY-MM-DD) and relative durations: "22 days", "3 weeks", "6 months", "1 year" (singular and plural both work). Reviews older than the resolved date are dropped client‑side, with an early‑break optimisation that stops paginating once an entire page is older than the cutoff.

Why is lang sometimes different from the on‑site language toggle? TripAdvisor machine‑translates reviews. We always emit the source language (originalLanguage), not the displayed translation, so analytics groupings stay accurate.

Can I scrape multiple places in one run? Yes — pass any number of startUrls. Hotels, restaurants and attractions can be mixed in the same run; they're processed concurrently up to maxConcurrency. Intra‑place pagination is auto‑capped to 1 in that case so the GraphQL endpoint isn't hammered with concurrency × pages requests.

Where do failed locations end up? In a named dataset called tripadvisor-failures with rows like { url, locationId, locationType, reason, message, timestamp }. The reason field is one of invalid-location-id, html-blocked, datadome-block, reviews-fetch-failed, no-reviews-saved. The Apify run page surfaces a direct link.
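
Once the failure rows are downloaded, they can be grouped by reason and turned into a new input for a replay run. The sketch below works on plain dictionaries with the row shape described above; the helper names are hypothetical, and the choice of which reasons are worth retrying is an assumption, not documented behaviour.

```python
from collections import defaultdict

# Assumption: these reasons are likely transient; the others usually need a fixed URL.
RETRYABLE = {"html-blocked", "datadome-block", "reviews-fetch-failed"}


def summarize_failures(rows: list) -> dict:
    """Map each failure reason to the list of affected URLs."""
    by_reason = defaultdict(list)
    for row in rows:
        by_reason[row.get("reason", "unknown")].append(row["url"])
    return dict(by_reason)


def replay_input(rows: list) -> dict:
    """Build a startUrls block from retryable failures, without duplicates."""
    urls = []
    for row in rows:
        if row.get("reason") in RETRYABLE and row["url"] not in urls:
            urls.append(row["url"])
    return {"startUrls": [{"url": url} for url in urls]}
```

The returned dictionary can be merged with the original filters and passed as the input of a follow-up run.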

Is private content accessible? No. The actor only sees public TripAdvisor pages and the public review GraphQL endpoint.

📬 Support

🛠️ Additional services

  • Custom output schemas, one‑off datasets, scheduled exports.
  • Other platforms (Booking, Yelp, Google Maps, etc.) on request.
  • API integrations and automation pipelines.

Email: muhamed.didovic@gmail.com.

🔭 Explore more scrapers

If this actor was useful, browse other scrapers from memo23 on Apify.

This scraper targets publicly accessible TripAdvisor content for legitimate research, monitoring, and analytics. You are responsible for:

  • Complying with TripAdvisor's terms of service.
  • Respecting robots.txt and reasonable rate limits.
  • Using scraped data lawfully (privacy, copyright, GDPR/CCPA where applicable).
  • Obtaining any necessary permissions for commercial reuse of the data.