TripAdvisor Only $0.45💰 Search | Hotels | Restaurants |Reviews

Pricing

from $0.45 / 1,000 results

💰 $0.45 per 1,000 results. Scrape TripAdvisor hotel reviews: title, rating, language, text, dates, owner response, photos, sub-ratings, and optional reviewer profiles. Each review is enriched with place metadata (rating, address, geo, website, histogram). Filter by rating, language, date, and per-place limit.

Rating

5.0

(2)

Developer

Muhamed Didovic

Maintained by Community

Actor stats

0

Bookmarked

18

Total users

15

Monthly active users

3 days ago

Last modified

TripAdvisor Reviews Scraper

Turn TripAdvisor hotel pages into structured review datasets. Pull every public review for a property — title, rating, language, text, travel date, owner response, photos, sub‑ratings — already enriched with the host place's full metadata (address, geo, ranking, ratingHistogram). One run, one clean dataset.

✨ Why use this scraper?

Manually opening hotel pages and copying reviews? Stitching together separate "reviews" and "place details" scrapes? Getting blocked by DataDome the moment you scale?

  • 🏨 Reviews + place metadata in the same row. Every review already carries placeInfo (rating, address, lat/lng, ranking, histogram). No follow‑up enrichment.
  • 🎯 Server‑side filters wired through TripAdvisor's GraphQL. Star ratings, languages, per‑place limits and dates are pushed down to the API — you get back what you asked for.
  • 📅 Absolute or relative date cutoff. "2026-01-01" or "22 days", "3 weeks", "6 months", "1 year" — all valid for lastReviewDate.
  • 👤 Optional reviewer profiles. scrapeReviewerInfo controls whether each review's user object is populated with username, hometown, contributions, avatar and profile link, or left as null for a leaner dataset.
  • 🧩 Three output modes. Default flat is one row per review (good for tabular consumers). outputShape: "nested" collapses each place into a single row with reviews[] nested. scrapeReviews: false skips reviews entirely and emits a place‑only snapshot — fast, low‑cost.
  • 🛡️ Hardened anti‑bot path. Mobile‑Safari UA fallback through undici for HTML, real browser fingerprinting via ImpitHttpClient for the GraphQL endpoint, single‑shot DataDome detection.
  • 📑 Per‑location failure dataset. Skipped or blocked hotels land in a side dataset (tripadvisor-failures) instead of getting buried in logs.
  • Parallel pagination. Each hotel pages through GraphQL in concurrent batches (default 3), respecting your global and per‑place caps.

🎯 Use cases

| Team | What they build |
| --- | --- |
| Hotel ops | Daily review monitoring + owner‑response SLA tracking |
| Reputation managers | Multi‑property reputation dashboards with ratingHistogram drift over time |
| Market analysts | Competitive benchmarks across cities or chains using placeInfo.rating + numberOfReviews |
| Content / NLP teams | Multilingual review corpora for sentiment and topic models, filtered by language and rating |
| Travel media | Curated "best of" articles backed by recent verified reviews |
| Data teams | One‑shot dataset exports for BI, lake or warehouse ingestion (JSON, CSV, Excel) |

🔧 How it works (pipeline)

  1. Detect location ID from each *_Review-... URL (-d{id}-) — works for hotels, restaurants and attractions.
  2. Fetch the place HTML with a mobile‑Safari User‑Agent. Falls back to a direct undici fetch when DataDome blocks the desktop fingerprint, and to URL‑derived placeInfo when even that is blocked (reviews still come through over GraphQL).
  3. Extract placeInfo from the page's JSON‑LD + meta description (rating, review count, address, geo, ranking position).
  4. Page through GraphQL reviews at /data/graphql/ids in concurrent batches, with reviewRatings / reviewsLanguages / maxItems pushed into the filter payload.
  5. Apply lastReviewDate client‑side after each page, and break early once an entire page is older than the cutoff.
  6. Map and push each review enriched with placeInfo to the default dataset; skipped or blocked places go to tripadvisor-failures.
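Step 1 of the pipeline above can be sketched in a few lines of Python. This is only an illustration of the URL convention shown in this README, not the actor's own code; the function name parse_place_url and the exact regular expression are assumptions.

```python
import re
from typing import Optional

# Assumed URL convention: "<Kind>_Review-g<geoId>-d<locationId>-Reviews-<slug>.html"
_REVIEW_URL = re.compile(r"/(Hotel|Restaurant|Attraction)_Review-g(\d+)-d(\d+)-")


def parse_place_url(url: str) -> Optional[dict]:
    """Return kind, geoId and locationId for a *_Review URL, or None if it does not match."""
    match = _REVIEW_URL.search(url)
    if match is None:
        return None
    kind, geo_id, location_id = match.groups()
    return {"kind": kind, "geoId": geo_id, "locationId": location_id}
```

Listing and search URLs carry no -d{id}- segment, so this helper returns None for them; they would need the separate expansion step described under Supported inputs.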

📥 Supported inputs

Currently supported start URLs (mix and match in a single run):

| Pattern | Example |
| --- | --- |
| Hotel_Review-g{geoId}-d{locationId}-Reviews-{slug}.html | https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html |
| Restaurant_Review-g{geoId}-d{locationId}-Reviews-{slug}.html | https://www.tripadvisor.com/Restaurant_Review-g60763-d25324283-Reviews-Allora_Fifth_Ave-New_York_City_New_York.html |
| Attraction_Review-g{geoId}-d{locationId}-Reviews-{slug}.html | https://www.tripadvisor.com/Attraction_Review-g60763-d105123-Reviews-Statue_of_Liberty-New_York_City_New_York.html |
| Restaurants-g{geo}-…html (GEO restaurant hub; expands to each matching restaurant) | https://www.tripadvisor.com/Restaurants-g35805-Chicago_Illinois.html |
| Hotels-g{geo}-…-Hotels.html (GEO hotel hub; expands to each matching hotel) | https://www.tripadvisor.com/Hotels-g60763-New_York_City_New_York-Hotels.html |
| Attractions-g{geo}-Activities-…html (GEO attractions hub; expands to each matching attraction, optional -cNN- category) | https://www.tripadvisor.com/Attractions-g60763-Activities-New_York_City_New_York.html |
| FindRestaurants search URL (expands to each matching restaurant) | https://www.tripadvisor.com/FindRestaurants?geo=188673&establishmentTypes=10591&mealTypes=10597&broadened=false |

Place detail URLs share the same GraphQL endpoint and review schema — only placeInfo.type differs (HOTEL, EATERY, ATTRACTION). A FindRestaurants start URL is resolved via TripAdvisor’s listing GraphQL into many Restaurant_Review venues (paginated, up to 10 × 30 = 300 venues per search by default); then each venue is scraped like a standalone restaurant URL — maxItems applies per restaurant discovered from that listing (subject to your plan’s overall cap).

Not currently supported:

  • Search / hub pages other than the listing types in the table above.
  • tripadvisor.co.* country domains (use .com).
  • Non‑TripAdvisor hosts.

⚙️ Input parameters

Discovery (use searchQuery and/or startUrls)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| searchQuery | string | | Free-text location, e.g. "Chicago", "Brooklyn", "London". Resolved to a TripAdvisor geoId via the typeahead GraphQL endpoint, then expanded into venues based on the include* toggles below. Use alongside or instead of startUrls. |
| startUrls | array of { url } | [] | TripAdvisor Hotel_Review, Restaurant_Review, Attraction_Review, and FindRestaurants?… URLs. Listing URLs expand to discovered restaurants; plain place URLs scrape that place directly. Run in parallel up to maxConcurrency. |
| includeRestaurants | boolean | true | When searchQuery is set, include Restaurant_Review venues for the resolved geo. Wired through the existing FindRestaurants?geo=… expansion. |
| includeHotels | boolean | true | When searchQuery is set, include Hotel_Review venues. Wired through the Hotels-g{geo}-…-Hotels.html listing expander (paginated, up to ~300 hotels per query). |
| includeThingsToDo | boolean | true | When searchQuery is set, include Attraction_Review venues. Wired through the Attractions-g{geo}-Activities-…html listing expander (paginated, up to ~300 attractions per query). |
| includeNearby | boolean | false | When true, after each main place is scraped the actor expands up to 5 nearby venues from the page's nearby carousel as additional place-detail-only snapshot rows tagged with isNearbyResult: true. Reviews are NOT fetched for nearby venues. Depth capped at 1. |

One of searchQuery or startUrls should be provided. Empty input produces a clean no-op run.

Filters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| maxItems | integer | 50 | Max reviews to fetch per place / per URL. Applied to each entry in startUrls independently, e.g. maxItems: 50 with 3 URLs ⇒ up to 150 reviews total. 0 = unlimited (paginate to the end of each place). |
| reviewRatings | array | [] (all) | Star ratings to keep. Values: "1", "2", "3", "4", "5", or "ALL_REVIEW_RATINGS". Pushed down into the GraphQL filter payload. |
| reviewsLanguages | array | [] (all) | ISO 639‑1 codes (e.g. ["en", "de", "fr"]) or "ALL_REVIEW_LANGUAGES". Pushed down into the GraphQL filter payload. |
| lastReviewDate | string | | Skip reviews published before this date. Accepts an absolute date YYYY-MM-DD or a relative duration: "22 days", "2 weeks", "3 months", "1 year" (singular or plural). |
| scrapeReviewerInfo | boolean | true | When true, the user object on each review is populated with the reviewer's profile (username, display name, avatar, hometown, contributions, profile link). When false, user is null. The rest of the review fields (id, lang, helpfulVotes, tripType, subratings, photos, …) are emitted in both modes. |
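To make the two lastReviewDate formats concrete, here is a minimal Python sketch of how such a value could be resolved to a cutoff date. It is not the actor's implementation: the helper name resolve_cutoff is hypothetical, and treating a month as 30 days and a year as 365 days is an assumption made only for this example.

```python
import re
from datetime import date, timedelta
from typing import Optional

# Assumed day counts per unit; the actor may resolve months and years differently.
_UNIT_DAYS = {"day": 1, "week": 7, "month": 30, "year": 365}
_RELATIVE = re.compile(r"^\s*(\d+)\s+(day|week|month|year)s?\s*$", re.IGNORECASE)


def resolve_cutoff(value: str, today: Optional[date] = None) -> date:
    """Turn 'YYYY-MM-DD' or '<n> <unit>' into the earliest publish date to keep."""
    today = today or date.today()
    match = _RELATIVE.match(value)
    if match:
        amount, unit = int(match.group(1)), match.group(2).lower()
        return today - timedelta(days=amount * _UNIT_DAYS[unit])
    # Anything else is treated as an absolute ISO date; invalid input raises ValueError.
    return date.fromisoformat(value.strip())
```

Passing today explicitly keeps the helper deterministic, which is convenient when checking a schedule before launching a run.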

Migration note: maxItemsPerQuery was merged into maxItems. maxItems is now the per-place cap (it used to be a run-wide ceiling). Saved configs that still pass maxItemsPerQuery keep working: it's accepted as a deprecated alias and overrides maxItems for that run, with a warning logged.

Place vs reviews

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| scrapeReviews | boolean | true | When true (default), paginate reviews via GraphQL up to maxItems per place. When false, skip reviews entirely and emit one dataset row per start URL with { "placeDetailOnly": true, "placeInfo": … } parsed from the page HTML. Best for fast place snapshots. |
| includeReviewTags | boolean | true | When true (default), include placeInfo.reviewTags (theme phrases like "sushi: 14 reviews") on emitted rows when TripAdvisor embeds them. Set false to drop them for smaller payloads. |
| outputShape | string ("flat" or "nested") | "flat" | Controls the dataset shape when reviews are scraped. "flat" (default) keeps today's row‑per‑review layout, each row carrying placeInfo. "nested" collapses each place into a single row of the form { "placeDetailOnly": false, "placeInfo": …, "reviews": [...] }. No effect when scrapeReviews: false. |

Note on nested mode + billing: under PRICE_PER_DATASET_ITEM (the default Apify pricing model), nested mode bills once per place instead of once per review — so a place with 200 reviews charges 1 dataset item, not 200. If you switch the default, also review your actor's pricing config.
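The billing difference can be worked through with a small Python sketch. It assumes the listed "from $0.45 / 1,000 results" price applies to every dataset item; your actual rate may differ by plan, and both function names are hypothetical.

```python
PRICE_PER_1000_ITEMS = 0.45  # USD; assumed from the "from $0.45 / 1,000 results" listing


def billed_items(places: int, reviews_per_place: int, output_shape: str = "flat") -> int:
    """Dataset items produced: one per review in flat mode, one per place in nested mode."""
    if output_shape == "nested":
        return places
    return places * reviews_per_place


def estimated_cost(items: int) -> float:
    """Approximate charge in USD under per-dataset-item pricing."""
    return items * PRICE_PER_1000_ITEMS / 1000
```

For the example in the note, one place with 200 reviews yields 200 billed items in flat mode (about $0.09 at the assumed rate) and a single billed item in nested mode.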

Advanced

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| maxConcurrency | integer | 100 | Max start URLs (hotels) processed concurrently. |
| minConcurrency | integer | 1 | Crawler floor. |
| maxRequestRetries | integer | 15 (range 0–50) | Retries per request before giving up. Lower = surface failures fast, higher = absorb transient anti‑bot blocks. |
| proxy | object | { useApifyProxy: true, apifyProxyGroups: ["RESIDENTIAL"] } | Standard Apify proxy block. Apify Residential is strongly recommended for TripAdvisor. |

📊 Output overview

The actor emits one of three row shapes depending on scrapeReviews and outputShape. Default settings produce a flat row‑per‑review dataset enriched with the host place's placeInfo — the rest of this section walks the review schema first, then shows how nested and place‑only rows differ.

| Mode | scrapeReviews | outputShape | Rows per place | Row shape |
| --- | --- | --- | --- | --- |
| Flat reviews (default) | true | "flat" | up to maxItems | { ...review, placeInfo } (one row per review) |
| Nested reviews | true | "nested" | exactly 1 | { placeDetailOnly: false, placeInfo, reviews: [...] } |
| Place snapshot | false | (ignored) | exactly 1 | { placeDetailOnly: true, placeInfo } |
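A downstream consumer that receives rows from mixed runs may want to tell the three shapes apart. The Python sketch below infers the shape purely from the keys listed in the table; the function name row_shape is hypothetical, and it assumes flat review rows never carry a placeDetailOnly key.

```python
def row_shape(row: dict) -> str:
    """Classify a dataset row as 'snapshot', 'nested' or 'flat' from its keys."""
    if row.get("placeDetailOnly") is True:
        return "snapshot"
    if row.get("placeDetailOnly") is False and "reviews" in row:
        return "nested"
    # Assumption: anything else is a flat review row carrying its own placeInfo.
    return "flat"
```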

The full review schema below is emitted in both scrapeReviewerInfo modes — the only difference is whether the user object is populated (true) or set to null (false).

Review schema (both modes)

{
"id": "1003456789",
"url": "https://www.tripadvisor.com/ShowUserReviews-g60763-d208453-r1003456789-Hilton_New_York_Times_Square-New_York_City_New_York.html",
"title": "Great location, friendly staff",
"lang": "en",
"language": "en",
"originalLanguage": "en",
"locationId": "208453",
"publishedDate": "2026-03-14",
"publishedPlatform": "OTHER",
"rating": 5,
"helpfulVotes": 2,
"text": "We stayed three nights and...",
"travelDate": "2026-03",
"stayDate": "2026-03-14",
"tripType": "COUPLES",
"user": null,
"ownerResponse": {
"id": "987654",
"text": "Thank you for staying with us...",
"lang": "en",
"publishedDate": "2026-03-16",
"responder": "Hilton Times Square",
"connectionToSubject": "Manager"
},
"subratings": [
{ "name": "Service", "value": 5 },
{ "name": "Cleanliness", "value": 5 },
{ "name": "Value", "value": 4 },
{ "name": "Location", "value": 5 },
{ "name": "Rooms", "value": 4 },
{ "name": "Sleep Quality", "value": 5 }
],
"photos": [
{ "id": "812340000", "width": 4032, "height": 3024, "image": "https://media-cdn.tripadvisor.com/media/photo-o/30/68/0c/00/lobby.jpg" }
],
"placeInfo": {
"id": "208453",
"name": "Hilton New York Times Square",
"rating": 4.0,
"numberOfReviews": 8944,
"locationString": "New York City, New York",
"latitude": 40.756,
"longitude": -73.989,
"webUrl": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html",
"website": "https://www.hilton.com/...",
"path": "/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html",
"phone": "+12125551234",
"address": "234 W 42nd St, New York City, NY 10036",
"addressObj": {
"street1": "234 W 42nd St",
"city": "New York City",
"state": "NY",
"postalcode": "10036",
"country": "United States"
},
"ratingHistogram": { "count1": 412, "count2": 480, "count3": 1100, "count4": 2680, "count5": 4272 }
}
}

With scrapeReviewerInfo: true the only field that changes is user — it is replaced with the reviewer's profile object:

{
"user": {
"userId": "ABCDEF12345",
"name": "Traveler 123",
"contributions": { "totalContributions": 42 },
"username": "traveler123",
"userLocation": { "shortName": "Berlin", "name": "Berlin, Germany", "id": "187323" },
"avatar": { "id": "5567890", "width": 200, "height": 200, "image": "https://media-cdn.tripadvisor.com/media/photo-l/...jpg" },
"link": "www.tripadvisor.com/Profile/traveler123"
}
}

🗂️ Output fields

Review core

All fields below are emitted in both scrapeReviewerInfo modes.

| Field | Type | Description |
| --- | --- | --- |
| id | string | TripAdvisor review ID |
| url | string | Permalink to the review on TripAdvisor |
| title | string | Review headline |
| text | string | Review body |
| rating | integer (1–5) | Stars left by the reviewer |
| lang | string (ISO 639‑1) | Source language (preserves originalLanguage for machine‑translated reviews) |
| language | string (ISO 639‑1) | Current API language (may be the translated language) |
| originalLanguage | string (ISO 639‑1) | Source language before machine translation |
| publishedDate | string (ISO date) | When the review was published |
| publishedPlatform | string | Source platform (e.g. OTHER, MOBILE) |
| helpfulVotes | integer | Helpful‑vote count |
| travelDate | string (YYYY-MM) | Month/year of the stay |
| stayDate | string (YYYY-MM-DD) | Full stay date |
| tripType | string or null | COUPLES, FAMILY, BUSINESS, SOLO, FRIENDS, … |
| locationId | string | TripAdvisor location ID for the place |

Owner response

| Field | Type | Description |
| --- | --- | --- |
| ownerResponse.id | string | Response ID |
| ownerResponse.text | string | Response body |
| ownerResponse.lang | string | Response language |
| ownerResponse.publishedDate | string | Response date |
| ownerResponse.responder | string | Display name (e.g. property name or manager) |
| ownerResponse.connectionToSubject | string | Manager, Owner, etc. |

Subratings

subratings is an array of { name, value }:

| Field | Type | Description |
| --- | --- | --- |
| subratings[].name | string | Service, Cleanliness, Value, Location, Rooms, Sleep Quality, … |
| subratings[].value | integer (1–5) | Per‑aspect rating |

Photos

photos is an array of { id, width, height, image }:

| Field | Type | Description |
| --- | --- | --- |
| photos[].id | string | TripAdvisor photo ID |
| photos[].width / height | integer | Native pixel dimensions |
| photos[].image | string | Bare CDN URL (no ?w=... query) |

Reviewer (only when scrapeReviewerInfo: true)

| Field | Type | Description |
| --- | --- | --- |
| user.userId | string or null | Internal TripAdvisor user ID |
| user.name | string or null | Display name |
| user.username | string or null | URL handle |
| user.contributions.totalContributions | integer | Lifetime contribution count |
| user.userLocation | object or null | { shortName, name, id } resolved from hometown |
| user.avatar | object or null | { id, width, height, image } |
| user.link | string or null | Profile link without scheme (www.tripadvisor.com/Profile/...) |
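Because user.link is emitted without a scheme, consumers that need a clickable URL have to add one. A minimal Python sketch, assuming https is the right scheme and using a hypothetical helper name:

```python
from typing import Optional


def profile_url(user: Optional[dict]) -> Optional[str]:
    """Return a full https profile URL, or None when user or user.link is missing."""
    if not user or not user.get("link"):
        return None
    link = user["link"]
    if link.startswith(("http://", "https://")):
        return link  # already absolute, leave untouched
    return "https://" + link
```

The None checks cover both scrapeReviewerInfo: false (user is null) and profiles where the link itself is null.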

Place metadata (placeInfo)

| Field | Type | Description |
| --- | --- | --- |
| placeInfo.id | string | Numeric location ID |
| placeInfo.name | string | Place display name (hotel, restaurant, attraction) |
| placeInfo.rating | number | Aggregate 1–5 rating |
| placeInfo.numberOfReviews | integer | Total review count on TripAdvisor |
| placeInfo.locationString | string | Human‑readable city/state/country |
| placeInfo.latitude / longitude | number | Geo coordinates |
| placeInfo.address | string | Single‑line address |
| placeInfo.phone | string or undefined | Telephone when TA exposes it |
| placeInfo.path | string or undefined | Relative TripAdvisor path (e.g. /Restaurant_Review-g…-Reviews-…); often set from FindRestaurants listing GraphQL |
| placeInfo.webUrl | string | Full https://www.tripadvisor.com/… URL for the place |
| placeInfo.cuisines | string[] or undefined | Cuisine labels when present (listing or HTML) |
| placeInfo.priceLevel | string or undefined | Price band, e.g. $$ - $$$ |
| placeInfo.menuUrl | string or undefined | External menu / pub chain URL when present on the listing |
| placeInfo.addressObj | object | { street1, street2, city, state, postalcode, country } |
| placeInfo.website | string | Official business website (when known) |
| placeInfo.ratingHistogram | object | { count1, count2, count3, count4, count5 } |
| placeInfo.type | string | HOTEL, EATERY, ATTRACTION, … |
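For dashboards that track ratingHistogram over time, a plain arithmetic mean of the histogram is a convenient single number. The Python sketch below is an illustration only; the helper name is hypothetical, and the result need not equal placeInfo.rating, since this README does not describe how TripAdvisor derives that value.

```python
from typing import Optional


def histogram_mean(histogram: dict) -> Optional[float]:
    """Arithmetic mean star rating from { count1 .. count5 }, or None when empty."""
    total = sum(histogram.get(f"count{stars}", 0) for stars in range(1, 6))
    if total == 0:
        return None
    weighted = sum(stars * histogram.get(f"count{stars}", 0) for stars in range(1, 6))
    return weighted / total
```

With the sample histogram from the review schema (412, 480, 1100, 2680, 4272) the counts sum to 8944, matching numberOfReviews, and the mean comes out at roughly 4.11.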

When you start from a FindRestaurants?… URL, the actor first calls TripAdvisor’s listing GraphQL and seeds each review row’s placeInfo with whatever the listing card returns (name, relative path, lat/lon, cuisines, price band, review count, address, phone, menu URL). The detail HTML pass (when not blocked) can still refine or override overlapping fields.

Alternative output shapes

Set outputShape: "nested" to collapse a place's reviews into a single row:

{
"placeDetailOnly": false,
"placeInfo": {
"id": "208453",
"name": "Hilton New York Times Square",
"rating": 4.0,
"numberOfReviews": 8944,
"address": "234 W 42nd St, New York City, NY 10036",
"ratingHistogram": { "count1": 412, "count2": 480, "count3": 1100, "count4": 2680, "count5": 4272 }
},
"reviews": [
{ "id": "1003456789", "rating": 5, "title": "Great location, friendly staff", "text": "We stayed three nights...", "publishedDate": "2026-03-14", "lang": "en", "...": "..." },
{ "id": "1003456790", "rating": 4, "title": "Solid choice for Times Square", "text": "Rooms were quieter than expected...", "publishedDate": "2026-03-13", "lang": "en", "...": "..." }
]
}

Set scrapeReviews: false (regardless of outputShape) to skip reviews entirely and get a fast place snapshot. The snapshot row carries the full parsed placeInfo — description, amenities, room tips, neighborhood + ancestor + metro context, hotel class, offers, category review scores, cuisines + hours (restaurants), and more — pulled from JSON-LD and the page's embedded redux/apollo state:

{
"placeDetailOnly": true,
"placeInfo": {
"id": "208453",
"name": "Hilton New York Times Square",
"description": "Wake Up to the Best Views in Times Square. Perched above the energy and heart of the city...",
"rating": 4.0,
"rawRanking": 3.898643732070923,
"rankingPosition": 289,
"rankingDenominator": "520",
"rankingString": "#289 of 520 hotels in New York City",
"rankingSource": "HTML",
"numberOfReviews": 8944,
"hotelClass": "4.0",
"hotelClassAttribution": "Classified by Giata.",
"amenities": ["Wifi", "Pool", "Fitness center", "..."],
"categoryReviewScores": [
{ "categoryName": "Service", "score": 4.36, "roundedScore": 4.4 },
{ "categoryName": "Cleanliness", "score": 4.58, "roundedScore": 4.6 }
],
"neighborhoodLocations": [
{ "id": "15565670", "name": "Times Square" },
{ "id": "7102352", "name": "Midtown" }
],
"ancestorLocations": [
{ "id": "60763", "name": "New York City", "subcategory": "City" },
{ "id": "28953", "name": "New York", "abbreviation": "NY", "subcategory": "State" }
],
"nearestMetroStations": [
{ "name": "42nd St – Port Authority", "distance": 0.08, "lines": [{ "lineName": "A" }, { "lineName": "C" }] }
],
"offers": [{ "pricePerNight": 169, "vendor": "Booking.com" }],
"roomTips": [{ "id": "1055595676", "text": "Perfect Location in the Heart of Times Square." }],
"address": "234 W 42nd St, New York City, NY 10036",
"ratingHistogram": { "count1": 412, "count2": 480, "count3": 1100, "count4": 2680, "count5": 4272 },
"reviewTags": [ { "text": "times square", "reviews": 1988 } ]
}
}

Restaurant snapshots additionally surface cuisines, hours, dishes, mealTypes, dietaryRestrictions, features, menuWebUrl, establishmentTypes, and the open/closed flags (isClosed, isLongClosed, openNowText).

The same rich placeInfo is emitted at the top level when you use outputShape: "nested" — the nested reviews[] underneath carry the slim variant to avoid duplicating tens of fields per review.
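If a tabular consumer later needs one row per review again, a nested row can be expanded client-side. A minimal Python sketch, assuming the row layout shown above; the function name is hypothetical, and unlike the actor's own flat mode it attaches the full top-level placeInfo to every review.

```python
from typing import Dict, List


def flatten_nested(row: Dict) -> List[Dict]:
    """Expand one nested place row into one dict per review, each carrying placeInfo."""
    place_info = row.get("placeInfo")
    # Overwrites any slim placeInfo on the review with the top-level one.
    return [{**review, "placeInfo": place_info} for review in row.get("reviews", [])]
```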

Example placeInfo shaped like a listing card (values match what TripAdvisor often returns for a London pub — your dataset may use string id):

{
"id": "944622",
"name": "The Yacht",
"path": "/Restaurant_Review-g186338-d944622-Reviews-The_Yacht-London_England.html",
"webUrl": "https://www.tripadvisor.com/Restaurant_Review-g186338-d944622-Reviews-The_Yacht-London_England.html",
"latitude": 51.48476,
"longitude": -0.003814,
"menuUrl": "https://www.greeneking.co.uk/pubs/greater-london/yacht/menu?utm_source=exnet&utm_medium=locations&utm_campaign=UC_menu",
"cuisines": ["Bar", "British", "Pub"],
"priceLevel": "$$ - $$$",
"rating": 4,
"numberOfReviews": 401,
"address": "5 Crane St Greenwich, London SE10 9NP England",
"phone": "+44 20 8858 0175",
"type": "EATERY"
}

🚀 Examples

Single hotel, defaults

{
"startUrls": [
{ "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html" }
],
"maxItems": 50
}

Multiple hotels, English‑only, last 30 days

{
"startUrls": [
{ "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html" },
{ "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d93589-Reviews-The_Manhattan_at_Times_Square_Hotel-New_York_City_New_York.html" }
],
"maxItems": 100,
"reviewsLanguages": ["en"],
"lastReviewDate": "30 days"
}

Recent 3‑star reviews with full reviewer profiles

{
"startUrls": [
{ "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html" }
],
"maxItems": 50,
"reviewRatings": ["3"],
"reviewsLanguages": ["en"],
"lastReviewDate": "2026-01-01",
"scrapeReviewerInfo": true
}

Mix hotels and restaurants in one run

{
"startUrls": [
{ "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html" },
{ "url": "https://www.tripadvisor.com/Restaurant_Review-g60763-d25324283-Reviews-Allora_Fifth_Ave-New_York_City_New_York.html" }
],
"maxItems": 50,
"reviewsLanguages": ["en"]
}

Place snapshots only (no reviews)

Skip review pagination entirely — useful for bulk place metadata:

{
"startUrls": [
{ "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html" },
{ "url": "https://www.tripadvisor.com/Restaurant_Review-g60763-d25324283-Reviews-Allora_Fifth_Ave-New_York_City_New_York.html" }
],
"scrapeReviews": false
}

Each start URL produces exactly one dataset row of { "placeDetailOnly": true, "placeInfo": … }.

Nested output (one row per place, reviews under reviews[])

{
"startUrls": [
{ "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html" }
],
"maxItems": 200,
"outputShape": "nested",
"reviewsLanguages": ["en"]
}

The dataset will contain exactly one row per place with reviews nested under it — easier to consume when you want a single object per hotel/restaurant rather than joining N review rows back to a place.

Free-text search query

Resolve a city or neighborhood name to a geoId without pasting a URL — combines naturally with the include* toggles:

{
"searchQuery": "Chicago",
"includeRestaurants": true,
"includeHotels": false,
"includeThingsToDo": false,
"maxItems": 30,
"reviewsLanguages": ["en"]
}

The actor calls TripAdvisor's typeahead GraphQL, picks the first GEO/NEIGHBORHOOD result, and seeds the run with FindRestaurants?geo=<id> (when includeRestaurants), Hotels-g<id>-Hotels.html (when includeHotels), and/or Attractions-g<id>-Activities.html (when includeThingsToDo). All three listing types share the same -oa30- HTML pagination pipeline.
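The seeding step can be pictured with the following Python sketch. It only mirrors the three URL templates named above; the function and parameter names are hypothetical, and the real actor may append slugs or query parameters that are not shown here.

```python
from typing import List

BASE = "https://www.tripadvisor.com"


def seed_urls(geo_id: str, restaurants: bool = True, hotels: bool = True,
              things_to_do: bool = True) -> List[str]:
    """Build listing seed URLs for a resolved geoId according to the include* toggles."""
    urls = []
    if restaurants:
        urls.append(f"{BASE}/FindRestaurants?geo={geo_id}")
    if hotels:
        urls.append(f"{BASE}/Hotels-g{geo_id}-Hotels.html")
    if things_to_do:
        urls.append(f"{BASE}/Attractions-g{geo_id}-Activities.html")
    return urls
```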

FindRestaurants search (listing → many restaurants)

{
"startUrls": [
{
"url": "https://www.tripadvisor.com/FindRestaurants?geo=188673&establishmentTypes=10591&mealTypes=10597&broadened=false"
}
],
"maxItems": 50,
"reviewsLanguages": ["en"]
}

Each review row carries placeInfo seeded from the listing card (coordinates, cuisines, priceLevel, menuUrl, relative path, etc., when TripAdvisor returns them).

💻 Integrations

Python

from apify_client import ApifyClient

# Replace with your Apify API token.
client = ApifyClient("YOUR_API_TOKEN")

# Start the actor and wait for the run to finish.
run = client.actor("muhamed-didovic/apify-tripadvisor").call(run_input={
    "startUrls": [
        {"url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html"},
    ],
    "maxItems": 50,
    "reviewsLanguages": ["en"],
    "lastReviewDate": "30 days",
})

# Each dataset item is one review row carrying its placeInfo.
for review in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{review['rating']}★ {review['title']} ({review['lang']})")
    print(f"  place: {review['placeInfo']['name']}  rating: {review['placeInfo']['rating']}")

JavaScript / Node

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Start the actor and wait for the run to finish.
const run = await client.actor('muhamed-didovic/apify-tripadvisor').call({
    startUrls: [
        { url: 'https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html' },
    ],
    maxItems: 50,
    scrapeReviewerInfo: true,
});

// user is null when scrapeReviewerInfo is false, hence the optional chaining.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((r) => {
    console.log(`${r.rating}★ ${r.title} by ${r.user?.username ?? 'anon'}`);
});

📈 Performance

| Metric | Value |
| --- | --- |
| Reviews per hotel page (GraphQL) | 10 |
| Concurrent review pages per hotel | up to 3 (auto‑capped to 1 when many hotels run in parallel) |
| Anti‑bot path | Mobile‑Safari UA + ImpitHttpClient GraphQL fingerprint, single‑shot DataDome detection |
| Failure visibility | Per‑location rows in tripadvisor-failures named dataset |
| Output formats | JSON, CSV, Excel, RSS, XML, HTML (via Apify dataset views) |

💡 Tips for best results

  1. Start with one place and maxItems: 10 to validate the schema. Works the same for hotels, restaurants and attractions — confirm the fields you care about land in the dataset before launching big runs.
  2. Use server‑side filters. reviewRatings and reviewsLanguages are pushed into the GraphQL payload — much cheaper than fetching everything and filtering downstream.
  3. Prefer relative dates over absolute when running on a schedule. "30 days" always means "last 30 days from now"; an absolute 2026-01-01 will silently widen as time passes.
  4. Lower maxRequestRetries for fast feedback. The default 15 absorbs anti‑bot blocks but can hide a misconfigured proxy. Try 3 while iterating, raise it for production.
  5. Toggle scrapeReviewerInfo only when you need profiles. All review fields are kept in both modes; the only thing that changes is whether user is populated. Disabling it makes the dataset smaller and the run faster.
  6. Check the tripadvisor-failures dataset after each run. Empty = perfect run. Rows = location‑level issues you can replay.

❓ FAQ

Which TripAdvisor URLs are supported? Hotel_Review-…, Restaurant_Review-…, and Attraction_Review-… URLs on www.tripadvisor.com (anything that follows the …-g{geoId}-d{locationId}-Reviews-… slug pattern). You can also pass a FindRestaurants?geo=…&… search URL from the site’s restaurant finder, or one of the GEO hub URLs listed under Supported inputs; the actor expands it to individual places and scrapes reviews per venue. All place types share the same review GraphQL endpoint; only placeInfo.type differs. Use www.tripadvisor.com, not tripadvisor.co.* mirrors.

How many reviews can I scrape per run? As many as your Apify plan and maxItems allow. maxItems is per place (per Hotel_Review / Restaurant_Review / Attraction_Review URL, and per restaurant after a FindRestaurants listing is expanded). Rough upper bound: maxItems × total_places — where total_places counts every expanded restaurant from searches plus every direct detail URL (startUrls). Runs are still subject to platform limits (e.g. free-tier item caps). The actor paginates reviews in batches of 20 per GraphQL page, in parallel where possible.
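The rough upper bound above is simple arithmetic; as a Python sketch (hypothetical helper name, ignoring plan-level caps and any filters that reduce the count):

```python
from typing import Optional


def max_reviews(max_items: int, total_places: int) -> Optional[int]:
    """Upper bound on review rows for a run; None means no fixed bound (maxItems = 0)."""
    if max_items == 0:
        return None  # 0 = unlimited, bounded only by what each place actually has
    return max_items * total_places
```

For instance, maxItems: 50 with three direct place URLs gives at most 150 review rows.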

What does scrapeReviewerInfo actually change? The only field that changes is user. With true (default) the user object is populated with the reviewer's profile (username, display name, avatar, hometown, contributions, profile link). With false it's null. All other review fields (id, lang, subratings, photos, helpfulVotes, tripType, …) are emitted in both modes.

Do I need to know the URL to scrape a city? No. Pass searchQuery: "Chicago" (or any city/neighborhood name) and the actor resolves it to a geoId via TripAdvisor's typeahead, then expands into venues based on the includeRestaurants / includeHotels / includeThingsToDo toggles, each of which seeds the matching listing expander described under Input parameters. You can mix searchQuery and startUrls in the same run.

What's the difference between scrapeReviews: false and outputShape: "nested"? scrapeReviews: false skips all review pagination — you only get the place metadata (one fast row per URL with placeDetailOnly: true). outputShape: "nested" still scrapes reviews but groups them under a single dataset row per place ({ placeDetailOnly: false, placeInfo, reviews: [...] }). Pick false for cheap bulk place snapshots; pick nested when you want reviews + a place‑centric layout for downstream consumers.

What date formats does lastReviewDate accept? Both absolute (2026-01-01, ISO YYYY-MM-DD) and relative durations: "22 days", "3 weeks", "6 months", "1 year" (singular and plural both work). Reviews older than the resolved date are dropped client‑side, with an early‑break optimisation that stops paginating once an entire page is older than the cutoff.
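The client-side filter with early break can be sketched as a Python generator. This is an illustration under stated assumptions, not the actor's code: fetch_page is a hypothetical callable returning one page of review dicts, pages are assumed to arrive newest first, and publishedDate is assumed to be an ISO YYYY-MM-DD string as in the sample schema.

```python
from datetime import date
from typing import Callable, Dict, Iterator, List


def reviews_since(fetch_page: Callable[[int], List[Dict]], cutoff: date) -> Iterator[Dict]:
    """Yield reviews published on or after cutoff, stopping at the first fully stale page."""
    page = 0
    while True:
        reviews = fetch_page(page)
        if not reviews:
            return  # no more pages
        fresh = [r for r in reviews if date.fromisoformat(r["publishedDate"]) >= cutoff]
        yield from fresh
        if not fresh:
            return  # entire page is older than the cutoff: stop paginating
        page += 1
```

A page that mixes fresh and stale reviews is filtered but does not stop pagination; only a page with no fresh reviews does.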

Why is lang sometimes different from the on‑site language toggle? TripAdvisor machine‑translates reviews. We always emit the source language (originalLanguage), not the displayed translation, so analytics groupings stay accurate.

Can I scrape multiple places in one run? Yes — pass any number of startUrls. Hotels, restaurants and attractions can be mixed in the same run; they're processed concurrently up to maxConcurrency. Intra‑place pagination is auto‑capped to 1 in that case so the GraphQL endpoint isn't hammered with concurrency × pages requests.

Where do failed locations end up? In a named dataset called tripadvisor-failures with rows like { url, locationId, locationType, reason, message, timestamp }. The reason field is one of invalid-location-id, html-blocked, datadome-block, reviews-fetch-failed, no-reviews-saved. The Apify run page surfaces a direct link.
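Once the failure rows are exported, a quick tally by reason shows whether a run suffered from blocking or from bad input. A minimal Python sketch, assuming rows shaped like the example above; the helper name is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def failures_by_reason(rows: Iterable[Dict]) -> Counter:
    """Count failure rows per reason; rows without a reason are grouped as 'unknown'."""
    return Counter(row.get("reason", "unknown") for row in rows)
```

Many datadome-block or html-blocked entries would point at the proxy setup, while invalid-location-id points at the start URLs themselves.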

Is private content accessible? No. The actor only sees public TripAdvisor pages and the public review GraphQL endpoint.

📬 Support

🛠️ Additional services

  • Custom output schemas, one‑off datasets, scheduled exports.
  • Other platforms (Booking, Yelp, Google Maps, etc.) on request.
  • API integrations and automation pipelines.

Email: muhamed.didovic@gmail.com.

🔭 Explore more scrapers

If this actor was useful, browse other scrapers from memo23 on Apify.

This scraper targets publicly accessible TripAdvisor content for legitimate research, monitoring, and analytics. You are responsible for:

  • Complying with TripAdvisor's terms of service.
  • Respecting robots.txt and reasonable rate limits.
  • Using scraped data lawfully (privacy, copyright, GDPR/CCPA where applicable).
  • Obtaining any necessary permissions for commercial reuse of the data.