TripAdvisor Only $0.45💰 Search | Hotels | Restaurants |Reviews
Pricing
from $0.45 / 1,000 results
💰 $0.45 per 1,000 results. Scrape TripAdvisor hotel reviews: title, rating, language, text, dates, owner response, photos, sub-ratings, and optional reviewer profiles. Each review is enriched with place metadata (rating, address, geo, website, histogram). Filter by rating, language, date and per-place limit.
Developer
Muhamed Didovic
TripAdvisor Reviews Scraper
Turn TripAdvisor hotel pages into structured review datasets. Pull every public review for a property — title, rating, language, text, travel date, owner response, photos, sub‑ratings — already enriched with the host place's full metadata (address, geo, ranking, ratingHistogram). One run, one clean dataset.
✨ Why use this scraper?
Manually opening hotel pages and copying reviews? Stitching together separate "reviews" and "place details" scrapes? Getting blocked by DataDome the moment you scale?
- 🏨 Reviews + place metadata in the same row. Every review already carries `placeInfo` (rating, address, lat/lng, ranking, histogram). No follow-up enrichment.
- 🎯 Server-side filters wired through TripAdvisor's GraphQL. Star ratings, languages and per-place limits are pushed down to the API; the date cutoff is applied client-side after each page. You get back what you asked for.
- 📅 Absolute or relative date cutoff. `"2026-01-01"` or `"22 days"`, `"3 weeks"`, `"6 months"`, `"1 year"`: all valid for `lastReviewDate`.
- 👤 Optional reviewer profiles. `scrapeReviewerInfo` controls whether each review's `user` object carries the reviewer profile (username, hometown, contributions, avatar, profile link) or is left `null`.
- 🧩 Three output modes. Default `flat` is one row per review (good for tabular consumers). `outputShape: "nested"` collapses each place into a single row with `reviews[]` nested. `scrapeReviews: false` skips reviews entirely and emits a place-only snapshot (fast, low-cost).
- 🛡️ Hardened anti-bot path. Mobile-Safari UA fallback through `undici` for HTML, real browser fingerprinting via `ImpitHttpClient` for the GraphQL endpoint, single-shot DataDome detection.
- 📑 Per-location failure dataset. Skipped or blocked hotels land in a side dataset (`tripadvisor-failures`) instead of getting buried in logs.
- ⚡ Parallel pagination. Each hotel pages through GraphQL in concurrent batches (default 3), respecting your global and per-place caps.
🎯 Use cases
| Team | What they build |
|---|---|
| Hotel ops | Daily review monitoring + owner‑response SLA tracking |
| Reputation managers | Multi‑property reputation dashboards with ratingHistogram drift over time |
| Market analysts | Competitive benchmarks across cities or chains using placeInfo.rating + numberOfReviews |
| Content / NLP teams | Multilingual review corpora for sentiment and topic models, filtered by language and rating |
| Travel media | Curated "best of" articles backed by recent verified reviews |
| Data teams | One‑shot dataset exports for BI, lake or warehouse ingestion (JSON, CSV, Excel) |
🔧 How it works (pipeline)
- Detect location ID from each `*_Review-...` URL (`-d{id}-`); works for hotels, restaurants and attractions.
- Fetch the place HTML with a mobile-Safari User-Agent. Falls back to a direct `undici` fetch when DataDome blocks the desktop fingerprint, and to URL-derived `placeInfo` when even that is blocked (reviews still come through over GraphQL).
- Extract `placeInfo` from the page's JSON-LD + meta description (rating, review count, address, geo, ranking position).
- Page through GraphQL reviews at `/data/graphql/ids` in concurrent batches, with `reviewRatings` / `reviewsLanguages` / `maxItems` pushed into the filter payload.
- Apply `lastReviewDate` client-side after each page, and break early once an entire page is older than the cutoff.
- Map and push each review enriched with `placeInfo` to the default dataset; skipped or blocked places go to `tripadvisor-failures`.
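The first pipeline step can be sketched in a few lines. This is an illustration only, not the actor's own code: the regular expression and the `extract_location_id` helper are assumptions based on the `-d{id}-` pattern described above.

```python
import re
from typing import Optional

# Assumed pattern: the location ID is the run of digits between "-d" and the
# next "-" in a *_Review URL (Hotel_Review, Restaurant_Review, Attraction_Review).
_LOCATION_ID = re.compile(r"_Review-g\d+-d(\d+)-")


def extract_location_id(url: str) -> Optional[str]:
    """Return the location ID embedded in a *_Review URL, or None if absent."""
    match = _LOCATION_ID.search(url)
    return match.group(1) if match else None


hotel = (
    "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-"
    "Hilton_New_York_Times_Square-New_York_City_New_York.html"
)
hub = "https://www.tripadvisor.com/Hotels-g60763-New_York_City_New_York-Hotels.html"

print(extract_location_id(hotel))  # 208453
print(extract_location_id(hub))    # None (listing hubs carry no -d{id}- segment)
```

Listing and search URLs have no location ID of their own, which is why they need an expansion step before reviews can be fetched.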
📥 Supported inputs
Currently supported start URLs (mix and match in a single run):
| Pattern | Example |
|---|---|
| `Hotel_Review-g{geoId}-d{locationId}-Reviews-{slug}.html` | https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html |
| `Restaurant_Review-g{geoId}-d{locationId}-Reviews-{slug}.html` | https://www.tripadvisor.com/Restaurant_Review-g60763-d25324283-Reviews-Allora_Fifth_Ave-New_York_City_New_York.html |
| `Attraction_Review-g{geoId}-d{locationId}-Reviews-{slug}.html` | https://www.tripadvisor.com/Attraction_Review-g60763-d105123-Reviews-Statue_of_Liberty-New_York_City_New_York.html |
| `Restaurants-g{geo}-…html` (GEO restaurant hub; expands to each matching restaurant) | https://www.tripadvisor.com/Restaurants-g35805-Chicago_Illinois.html |
| `Hotels-g{geo}-…-Hotels.html` (GEO hotel hub; expands to each matching hotel) | https://www.tripadvisor.com/Hotels-g60763-New_York_City_New_York-Hotels.html |
| `Attractions-g{geo}-Activities-…html` (GEO attractions hub; expands to each matching attraction, optional `-cNN-` category) | https://www.tripadvisor.com/Attractions-g60763-Activities-New_York_City_New_York.html |
| `FindRestaurants` search URL (expands to each matching restaurant) | https://www.tripadvisor.com/FindRestaurants?geo=188673&establishmentTypes=10591&mealTypes=10597&broadened=false |
Place detail URLs share the same GraphQL endpoint and review schema — only placeInfo.type differs (HOTEL, EATERY, ATTRACTION). A FindRestaurants start URL is resolved via TripAdvisor’s listing GraphQL into many Restaurant_Review venues (paginated, up to 10 × 30 = 300 venues per search by default); then each venue is scraped like a standalone restaurant URL — maxItems applies per restaurant discovered from that listing (subject to your plan’s overall cap).
Not currently supported:
- Search / hub pages other than `FindRestaurants` (for example generic `Hotels-g…`, `Attractions-g…`).
- `tripadvisor.co.*` country domains (use `.com`).
- Non-TripAdvisor hosts.
⚙️ Input parameters
Discovery (use searchQuery and/or startUrls)
| Parameter | Type | Default | Description |
|---|---|---|---|
| `searchQuery` | string | (none) | Free-text location, e.g. "Chicago", "Brooklyn", "London". Resolved to a TripAdvisor geoId via the typeahead GraphQL endpoint, then expanded into venues based on the `include*` toggles below. Use alongside or instead of `startUrls`. |
| `startUrls` | array of `{ url }` | `[]` | TripAdvisor `Hotel_Review`, `Restaurant_Review`, `Attraction_Review`, and `FindRestaurants?…` URLs. Listing URLs expand to discovered restaurants; plain place URLs scrape that place directly. Run in parallel up to `maxConcurrency`. |
| `includeRestaurants` | boolean | `true` | When `searchQuery` is set, include `Restaurant_Review` venues for the resolved geo. Wired through the existing `FindRestaurants?geo=…` expansion. |
| `includeHotels` | boolean | `true` | When `searchQuery` is set, include `Hotel_Review` venues. Wired through the `Hotels-g{geo}-…-Hotels.html` listing expander (paginated, up to ~300 hotels per query). |
| `includeThingsToDo` | boolean | `true` | When `searchQuery` is set, include `Attraction_Review` venues. Wired through the `Attractions-g{geo}-Activities-…html` listing expander (paginated, up to ~300 attractions per query). |
| `includeNearby` | boolean | `false` | When `true`, after each main place is scraped the actor expands up to 5 nearby venues from the page's nearby carousel as additional place-detail-only snapshot rows tagged with `isNearbyResult: true`. Reviews are NOT fetched for nearby venues. Depth capped at 1. |

One of `searchQuery` or `startUrls` should be provided. Empty input produces a clean no-op run.
Filters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `maxItems` | integer | `50` | Max reviews to fetch per place / per URL. Applied to each entry in `startUrls` independently, e.g. `maxItems: 50` with 3 URLs ⇒ up to 150 reviews total. `0` = unlimited (paginate to the end of each place). |
| `reviewRatings` | array | `[]` (all) | Star ratings to keep. Values: `"1"`, `"2"`, `"3"`, `"4"`, `"5"`, or `"ALL_REVIEW_RATINGS"`. Pushed down into the GraphQL filter payload. |
| `reviewsLanguages` | array | `[]` (all) | ISO 639-1 codes (e.g. `["en", "de", "fr"]`) or `"ALL_REVIEW_LANGUAGES"`. Pushed down into the GraphQL filter payload. |
| `lastReviewDate` | string | (none) | Skip reviews published before this date. Accepts an absolute date `YYYY-MM-DD` or a relative duration: `"22 days"`, `"2 weeks"`, `"3 months"`, `"1 year"` (singular or plural). |
| `scrapeReviewerInfo` | boolean | `true` | When `true`, the `user` object on each review is populated with the reviewer's profile (username, display name, avatar, hometown, contributions, profile link). When `false`, `user` is `null`. The rest of the review fields (`id`, `lang`, `helpfulVotes`, `tripType`, `subratings`, `photos`, …) are emitted in both modes. |

Migration note: `maxItemsPerQuery` was merged into `maxItems`; `maxItems` is now the per-place cap (it used to be a run-wide ceiling). Saved configs that still pass `maxItemsPerQuery` keep working: it's accepted as a deprecated alias and overrides `maxItems` for that run, with a warning logged.
Place vs reviews
| Parameter | Type | Default | Description |
|---|---|---|---|
| `scrapeReviews` | boolean | `true` | When `true` (default), paginate reviews via GraphQL up to `maxItems` per place. When `false`, skip reviews entirely and emit one dataset row per start URL with `{ "placeDetailOnly": true, "placeInfo": … }` parsed from the page HTML. Best for fast place snapshots. |
| `includeReviewTags` | boolean | `true` | When `true` (default), include `placeInfo.reviewTags` (theme phrases like "sushi: 14 reviews") on emitted rows when TripAdvisor embeds them. Set `false` to drop them for smaller payloads. |
| `outputShape` | string (`"flat"` \| `"nested"`) | `"flat"` | Controls the dataset shape when reviews are scraped. `"flat"` (default) keeps today's row-per-review layout, each row carrying `placeInfo`. `"nested"` collapses each place into a single row of the form `{ "placeDetailOnly": false, "placeInfo": …, "reviews": [...] }`. No effect when `scrapeReviews: false`. |

Note on nested mode + billing: under `PRICE_PER_DATASET_ITEM` (the default Apify pricing model), nested mode bills once per place instead of once per review, so a place with 200 reviews charges `1` dataset item, not `200`. If you switch the default, also review your actor's pricing config.
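The billing difference is easy to see with a little arithmetic. The sketch below is a rough estimate only: it assumes every dataset row is billed at the listed "from $0.45 / 1,000 results" rate, and the helper names `billed_items` and `estimated_cost_usd` are made up for illustration. Check the actor's live pricing before relying on the numbers.

```python
def billed_items(places: int, reviews_per_place: int, output_shape: str = "flat") -> int:
    """Dataset rows produced (and billed) under per-dataset-item pricing."""
    if output_shape == "nested":
        return places  # one row per place, reviews nested inside it
    if output_shape == "flat":
        return places * reviews_per_place  # one row per review
    raise ValueError(f"unknown outputShape: {output_shape!r}")


def estimated_cost_usd(items: int, price_per_1000: float = 0.45) -> float:
    """Assumed flat rate; the real price may vary by plan."""
    return items * price_per_1000 / 1000


# Three places with 200 reviews each, in both shapes.
for shape in ("flat", "nested"):
    items = billed_items(3, 200, shape)
    print(shape, items, round(estimated_cost_usd(items), 5))
```

Under these assumptions the flat run bills 600 items and the nested run bills 3, even though both carry the same reviews.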
Advanced
| Parameter | Type | Default | Description |
|---|---|---|---|
| `maxConcurrency` | integer | `100` | Max start URLs (hotels) processed concurrently. |
| `minConcurrency` | integer | `1` | Crawler floor. |
| `maxRequestRetries` | integer | `15` (range 0–50) | Retries per request before giving up. Lower = surface failures fast, higher = absorb transient anti-bot blocks. |
| `proxy` | object | `{ useApifyProxy: true, apifyProxyGroups: ["RESIDENTIAL"] }` | Standard Apify proxy block. Apify Residential is strongly recommended for TripAdvisor. |
📊 Output overview
The actor emits one of three row shapes depending on scrapeReviews and outputShape. Default settings produce a flat row‑per‑review dataset enriched with the host place's placeInfo — the rest of this section walks the review schema first, then shows how nested and place‑only rows differ.
| Mode | scrapeReviews | outputShape | Rows per place | Row shape |
|---|---|---|---|---|
| Flat reviews (default) | true | "flat" | up to maxItems | { ...review, placeInfo } — one row per review |
| Nested reviews | true | "nested" | exactly 1 | { placeDetailOnly: false, placeInfo, reviews: [...] } |
| Place snapshot | false | (ignored) | exactly 1 | { placeDetailOnly: true, placeInfo } |
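If you already have a flat export and want the nested layout afterwards, the regrouping can be done client-side. This is a minimal sketch under stated assumptions: rows are grouped by `placeInfo.id`, and `placeInfo` is dropped from each nested review. The helper name `nest_flat_rows` is hypothetical, and the actor's own nested rows may keep a slimmer place reference inside each review.

```python
from typing import Any, Dict, List


def nest_flat_rows(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Group flat review rows into one row per place, keyed by placeInfo.id."""
    grouped: Dict[str, Dict[str, Any]] = {}
    for row in rows:
        place = row["placeInfo"]
        # Copy the review without its embedded place metadata.
        review = {key: value for key, value in row.items() if key != "placeInfo"}
        entry = grouped.setdefault(
            place["id"],
            {"placeDetailOnly": False, "placeInfo": place, "reviews": []},
        )
        entry["reviews"].append(review)
    # Dicts preserve insertion order, so places come out in first-seen order.
    return list(grouped.values())


flat_rows = [
    {"id": "1", "rating": 5, "placeInfo": {"id": "208453", "name": "Hilton New York Times Square"}},
    {"id": "2", "rating": 4, "placeInfo": {"id": "208453", "name": "Hilton New York Times Square"}},
    {"id": "3", "rating": 4, "placeInfo": {"id": "944622", "name": "The Yacht"}},
]
for place_row in nest_flat_rows(flat_rows):
    print(place_row["placeInfo"]["name"], len(place_row["reviews"]))
```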
The full review schema below is emitted in both scrapeReviewerInfo modes — the only difference is whether the user object is populated (true) or set to null (false).
Review schema (both modes)
```json
{
  "id": "1003456789",
  "url": "https://www.tripadvisor.com/ShowUserReviews-g60763-d208453-r1003456789-Hilton_New_York_Times_Square-New_York_City_New_York.html",
  "title": "Great location, friendly staff",
  "lang": "en",
  "language": "en",
  "originalLanguage": "en",
  "locationId": "208453",
  "publishedDate": "2026-03-14",
  "publishedPlatform": "OTHER",
  "rating": 5,
  "helpfulVotes": 2,
  "text": "We stayed three nights and...",
  "travelDate": "2026-03",
  "stayDate": "2026-03-14",
  "tripType": "COUPLES",
  "user": null,
  "ownerResponse": {
    "id": "987654",
    "text": "Thank you for staying with us...",
    "lang": "en",
    "publishedDate": "2026-03-16",
    "responder": "Hilton Times Square",
    "connectionToSubject": "Manager"
  },
  "subratings": [
    { "name": "Service", "value": 5 },
    { "name": "Cleanliness", "value": 5 },
    { "name": "Value", "value": 4 },
    { "name": "Location", "value": 5 },
    { "name": "Rooms", "value": 4 },
    { "name": "Sleep Quality", "value": 5 }
  ],
  "photos": [
    { "id": "812340000", "width": 4032, "height": 3024, "image": "https://media-cdn.tripadvisor.com/media/photo-o/30/68/0c/00/lobby.jpg" }
  ],
  "placeInfo": {
    "id": "208453",
    "name": "Hilton New York Times Square",
    "rating": 4.0,
    "numberOfReviews": 8944,
    "locationString": "New York City, New York",
    "latitude": 40.756,
    "longitude": -73.989,
    "webUrl": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html",
    "website": "https://www.hilton.com/...",
    "path": "/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html",
    "phone": "+12125551234",
    "address": "234 W 42nd St, New York City, NY 10036",
    "addressObj": {
      "street1": "234 W 42nd St",
      "city": "New York City",
      "state": "NY",
      "postalcode": "10036",
      "country": "United States"
    },
    "ratingHistogram": { "count1": 412, "count2": 480, "count3": 1100, "count4": 2680, "count5": 4272 }
  }
}
```
With scrapeReviewerInfo: true the only field that changes is user — it is replaced with the reviewer's profile object:
```json
{
  "user": {
    "userId": "ABCDEF12345",
    "name": "Traveler 123",
    "contributions": { "totalContributions": 42 },
    "username": "traveler123",
    "userLocation": { "shortName": "Berlin", "name": "Berlin, Germany", "id": "187323" },
    "avatar": { "id": "5567890", "width": 200, "height": 200, "image": "https://media-cdn.tripadvisor.com/media/photo-l/...jpg" },
    "link": "www.tripadvisor.com/Profile/traveler123"
  }
}
```
🗂️ Output fields
Review core
All fields below are emitted in both scrapeReviewerInfo modes.
| Field | Type | Description |
|---|---|---|
| `id` | string | TripAdvisor review ID |
| `url` | string | Permalink to the review on TripAdvisor |
| `title` | string | Review headline |
| `text` | string | Review body |
| `rating` | integer (1–5) | Stars left by the reviewer |
| `lang` | string (ISO 639-1) | Source language (preserves `originalLanguage` for machine-translated reviews) |
| `language` | string (ISO 639-1) | Current API language (may be the translated language) |
| `originalLanguage` | string (ISO 639-1) | Source language before machine translation |
| `publishedDate` | string (ISO date) | When the review was published |
| `publishedPlatform` | string | Source platform (e.g. `OTHER`, `MOBILE`) |
| `helpfulVotes` | integer | Helpful-vote count |
| `travelDate` | string (`YYYY-MM`) | Month/year of the stay |
| `stayDate` | string (`YYYY-MM-DD`) | Full stay date |
| `tripType` | string \| null | `COUPLES`, `FAMILY`, `BUSINESS`, `SOLO`, `FRIENDS`, … |
| `locationId` | string | TripAdvisor location ID for the place |
Owner response
| Field | Type | Description |
|---|---|---|
| `ownerResponse.id` | string | Response ID |
| `ownerResponse.text` | string | Response body |
| `ownerResponse.lang` | string | Response language |
| `ownerResponse.publishedDate` | string | Response date |
| `ownerResponse.responder` | string | Display name (e.g. property name or manager) |
| `ownerResponse.connectionToSubject` | string | `Manager`, `Owner`, etc. |
Subratings
subratings is an array of { name, value }:
| Field | Type | Description |
|---|---|---|
| `subratings[].name` | string | `Service`, `Cleanliness`, `Value`, `Location`, `Rooms`, `Sleep Quality`, … |
| `subratings[].value` | integer (1–5) | Per-aspect rating |
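For tabular consumers, the `{ name, value }` array is often easier to work with as columns. The following is a small sketch, not part of the actor: `subratings_to_columns` and the `subrating_` column prefix are illustrative choices.

```python
from typing import Any, Dict


def subratings_to_columns(review: Dict[str, Any], prefix: str = "subrating_") -> Dict[str, int]:
    """Turn a review's subratings array into a flat {column_name: value} dict."""
    columns: Dict[str, int] = {}
    # `or []` tolerates reviews where subratings is missing or null.
    for item in review.get("subratings") or []:
        column = prefix + item["name"].lower().replace(" ", "_")
        columns[column] = item["value"]
    return columns


sample_review = {
    "id": "1003456789",
    "subratings": [
        {"name": "Service", "value": 5},
        {"name": "Value", "value": 4},
        {"name": "Sleep Quality", "value": 5},
    ],
}
print(subratings_to_columns(sample_review))
```

Reviews without sub-ratings simply yield an empty dict, so the result can be merged into a row without extra checks.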
Photos
photos is an array of { id, width, height, image }:
| Field | Type | Description |
|---|---|---|
| `photos[].id` | string | TripAdvisor photo ID |
| `photos[].width` / `height` | integer | Native pixel dimensions |
| `photos[].image` | string | Bare CDN URL (no `?w=...` query) |
Reviewer (only when scrapeReviewerInfo: true)
| Field | Type | Description |
|---|---|---|
| `user.userId` | string \| null | Internal TripAdvisor user ID |
| `user.name` | string \| null | Display name |
| `user.username` | string \| null | URL handle |
| `user.contributions.totalContributions` | integer | Lifetime contribution count |
| `user.userLocation` | object \| null | `{ shortName, name, id }` resolved from hometown |
| `user.avatar` | object \| null | `{ id, width, height, image }` |
| `user.link` | string \| null | Profile link without scheme (`www.tripadvisor.com/Profile/...`) |
Place metadata (placeInfo)
| Field | Type | Description |
|---|---|---|
| `placeInfo.id` | string | Numeric location ID |
| `placeInfo.name` | string | Place display name (hotel, restaurant, attraction) |
| `placeInfo.rating` | number | Aggregate 1–5 rating |
| `placeInfo.numberOfReviews` | integer | Total review count on TripAdvisor |
| `placeInfo.locationString` | string | Human-readable city/state/country |
| `placeInfo.latitude` / `longitude` | number | Geo coordinates |
| `placeInfo.address` | string | Single-line address |
| `placeInfo.phone` | string \| undefined | Telephone when TripAdvisor exposes it |
| `placeInfo.path` | string \| undefined | Relative TripAdvisor path (e.g. `/Restaurant_Review-g…-Reviews-…`); often set from the `FindRestaurants` listing GraphQL |
| `placeInfo.webUrl` | string | Full `https://www.tripadvisor.com/…` URL for the place |
| `placeInfo.cuisines` | string[] \| undefined | Cuisine labels when present (listing or HTML) |
| `placeInfo.priceLevel` | string \| undefined | Price band, e.g. `$$ - $$$` |
| `placeInfo.menuUrl` | string \| undefined | External menu / pub chain URL when present on the listing |
| `placeInfo.addressObj` | object | `{ street1, street2, city, state, postalcode, country }` |
| `placeInfo.website` | string | Official business website (when known) |
| `placeInfo.ratingHistogram` | object | `{ count1, count2, count3, count4, count5 }` |
| `placeInfo.type` | string | `HOTEL`, `EATERY`, `ATTRACTION`, … |
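One use of `ratingHistogram` is to derive your own mean rating, for example to track drift between runs. The sketch below assumes the `count1` … `count5` keys shown above; `histogram_mean` is a hypothetical helper, not something the actor emits.

```python
from typing import Dict, Optional


def histogram_mean(histogram: Dict[str, int]) -> Optional[float]:
    """Weighted mean star rating from count1..count5, or None for an empty histogram."""
    total = sum(histogram.get(f"count{stars}", 0) for stars in range(1, 6))
    if total == 0:
        return None
    weighted = sum(stars * histogram.get(f"count{stars}", 0) for stars in range(1, 6))
    return weighted / total


sample_histogram = {"count1": 412, "count2": 480, "count3": 1100, "count4": 2680, "count5": 4272}
print(round(histogram_mean(sample_histogram), 2))  # 4.11
```

In the sample row the histogram counts sum to `numberOfReviews` (8944), while `placeInfo.rating` is `4.0`. The plain mean is therefore best treated as a separate, derived metric that need not equal the displayed rating.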
When you start from a FindRestaurants?… URL, the actor first calls TripAdvisor’s listing GraphQL and seeds each review row’s placeInfo with whatever the listing card returns (name, relative path, lat/lon, cuisines, price band, review count, address, phone, menu URL). The detail HTML pass (when not blocked) can still refine or override overlapping fields.
Alternative output shapes
Set outputShape: "nested" to collapse a place's reviews into a single row:
```json
{
  "placeDetailOnly": false,
  "placeInfo": {
    "id": "208453",
    "name": "Hilton New York Times Square",
    "rating": 4.0,
    "numberOfReviews": 8944,
    "address": "234 W 42nd St, New York City, NY 10036",
    "ratingHistogram": { "count1": 412, "count2": 480, "count3": 1100, "count4": 2680, "count5": 4272 }
  },
  "reviews": [
    { "id": "1003456789", "rating": 5, "title": "Great location, friendly staff", "text": "We stayed three nights...", "publishedDate": "2026-03-14", "lang": "en", "...": "..." },
    { "id": "1003456790", "rating": 4, "title": "Solid choice for Times Square", "text": "Rooms were quieter than expected...", "publishedDate": "2026-03-13", "lang": "en", "...": "..." }
  ]
}
```
Set scrapeReviews: false (regardless of outputShape) to skip reviews entirely and get a fast place snapshot. The snapshot row carries the full parsed placeInfo — description, amenities, room tips, neighborhood + ancestor + metro context, hotel class, offers, category review scores, cuisines + hours (restaurants), and more — pulled from JSON-LD and the page's embedded redux/apollo state:
```json
{
  "placeDetailOnly": true,
  "placeInfo": {
    "id": "208453",
    "name": "Hilton New York Times Square",
    "description": "Wake Up to the Best Views in Times Square. Perched above the energy and heart of the city...",
    "rating": 4.0,
    "rawRanking": 3.898643732070923,
    "rankingPosition": 289,
    "rankingDenominator": "520",
    "rankingString": "#289 of 520 hotels in New York City",
    "rankingSource": "HTML",
    "numberOfReviews": 8944,
    "hotelClass": "4.0",
    "hotelClassAttribution": "Classified by Giata.",
    "amenities": ["Wifi", "Pool", "Fitness center", "..."],
    "categoryReviewScores": [
      { "categoryName": "Service", "score": 4.36, "roundedScore": 4.4 },
      { "categoryName": "Cleanliness", "score": 4.58, "roundedScore": 4.6 }
    ],
    "neighborhoodLocations": [
      { "id": "15565670", "name": "Times Square" },
      { "id": "7102352", "name": "Midtown" }
    ],
    "ancestorLocations": [
      { "id": "60763", "name": "New York City", "subcategory": "City" },
      { "id": "28953", "name": "New York", "abbreviation": "NY", "subcategory": "State" }
    ],
    "nearestMetroStations": [
      { "name": "42nd St – Port Authority", "distance": 0.08, "lines": [{ "lineName": "A" }, { "lineName": "C" }] }
    ],
    "offers": [{ "pricePerNight": 169, "vendor": "Booking.com" }],
    "roomTips": [{ "id": "1055595676", "text": "Perfect Location in the Heart of Times Square." }],
    "address": "234 W 42nd St, New York City, NY 10036",
    "ratingHistogram": { "count1": 412, "count2": 480, "count3": 1100, "count4": 2680, "count5": 4272 },
    "reviewTags": [{ "text": "times square", "reviews": 1988 }]
  }
}
```
Restaurant snapshots additionally surface cuisines, hours, dishes, mealTypes, dietaryRestrictions, features, menuWebUrl, establishmentTypes, and the open/closed flags (isClosed, isLongClosed, openNowText).
The same rich `placeInfo` is emitted at the top level when you use `outputShape: "nested"`; the nested `reviews[]` underneath carry the slim variant to avoid duplicating tens of fields per review.
Example placeInfo shaped like a listing card (values match what TripAdvisor often returns for a London pub — your dataset may use string id):
```json
{
  "id": "944622",
  "name": "The Yacht",
  "path": "/Restaurant_Review-g186338-d944622-Reviews-The_Yacht-London_England.html",
  "webUrl": "https://www.tripadvisor.com/Restaurant_Review-g186338-d944622-Reviews-The_Yacht-London_England.html",
  "latitude": 51.48476,
  "longitude": -0.003814,
  "menuUrl": "https://www.greeneking.co.uk/pubs/greater-london/yacht/menu?utm_source=exnet&utm_medium=locations&utm_campaign=UC_menu",
  "cuisines": ["Bar", "British", "Pub"],
  "priceLevel": "$$ - $$$",
  "rating": 4,
  "numberOfReviews": 401,
  "address": "5 Crane St Greenwich, London SE10 9NP England",
  "phone": "+44 20 8858 0175",
  "type": "EATERY"
}
```
🚀 Examples
Single hotel, defaults
```json
{
  "startUrls": [
    { "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html" }
  ],
  "maxItems": 50
}
```
Multiple hotels, English‑only, last 30 days
```json
{
  "startUrls": [
    { "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html" },
    { "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d93589-Reviews-The_Manhattan_at_Times_Square_Hotel-New_York_City_New_York.html" }
  ],
  "maxItems": 100,
  "reviewsLanguages": ["en"],
  "lastReviewDate": "30 days"
}
```
Recent 3‑star reviews with full reviewer profiles
```json
{
  "startUrls": [
    { "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html" }
  ],
  "maxItems": 50,
  "reviewRatings": ["3"],
  "reviewsLanguages": ["en"],
  "lastReviewDate": "2026-01-01",
  "scrapeReviewerInfo": true
}
```
Mix hotels and restaurants in one run
```json
{
  "startUrls": [
    { "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html" },
    { "url": "https://www.tripadvisor.com/Restaurant_Review-g60763-d25324283-Reviews-Allora_Fifth_Ave-New_York_City_New_York.html" }
  ],
  "maxItems": 50,
  "reviewsLanguages": ["en"]
}
```
Place snapshots only (no reviews)
Skip review pagination entirely — useful for bulk place metadata:
```json
{
  "startUrls": [
    { "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html" },
    { "url": "https://www.tripadvisor.com/Restaurant_Review-g60763-d25324283-Reviews-Allora_Fifth_Ave-New_York_City_New_York.html" }
  ],
  "scrapeReviews": false
}
```
Each start URL produces exactly one dataset row of { "placeDetailOnly": true, "placeInfo": … }.
Nested output (one row per place, reviews under reviews[])
```json
{
  "startUrls": [
    { "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html" }
  ],
  "maxItems": 200,
  "outputShape": "nested",
  "reviewsLanguages": ["en"]
}
```
The dataset will contain exactly one row per place with reviews nested under it — easier to consume when you want a single object per hotel/restaurant rather than joining N review rows back to a place.
Free-text search query
Resolve a city or neighborhood name to a geoId without pasting a URL — combines naturally with the include* toggles:
```json
{
  "searchQuery": "Chicago",
  "includeRestaurants": true,
  "includeHotels": false,
  "includeThingsToDo": false,
  "maxItems": 30,
  "reviewsLanguages": ["en"]
}
```
The actor calls TripAdvisor's typeahead GraphQL, picks the first GEO/NEIGHBORHOOD result, and seeds the run with FindRestaurants?geo=<id> (when includeRestaurants), Hotels-g<id>-Hotels.html (when includeHotels), and/or Attractions-g<id>-Activities.html (when includeThingsToDo). All three listing types share the same -oa30- HTML pagination pipeline.
FindRestaurants search (listing → many restaurants)
```json
{
  "startUrls": [
    { "url": "https://www.tripadvisor.com/FindRestaurants?geo=188673&establishmentTypes=10591&mealTypes=10597&broadened=false" }
  ],
  "maxItems": 50,
  "reviewsLanguages": ["en"]
}
```
Each review row carries placeInfo seeded from the listing card (coordinates, cuisines, priceLevel, menuUrl, relative path, etc., when TripAdvisor returns them).
💻 Integrations
Python
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

# Start the actor and wait for the run to finish.
run = client.actor("muhamed-didovic/apify-tripadvisor").call(
    run_input={
        "startUrls": [
            {"url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html"},
        ],
        "maxItems": 50,
        "reviewsLanguages": ["en"],
        "lastReviewDate": "30 days",
    }
)

# Stream the flat review rows from the run's default dataset.
for review in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{review['rating']}★ {review['title']} ({review['lang']})")
    print(f"  place: {review['placeInfo']['name']}  rating: {review['placeInfo']['rating']}")
```
JavaScript / Node
```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Start the actor and wait for the run to finish.
const run = await client.actor('muhamed-didovic/apify-tripadvisor').call({
    startUrls: [
        { url: 'https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html' },
    ],
    maxItems: 50,
    scrapeReviewerInfo: true,
});

// Read the review rows from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((r) => {
    console.log(`${r.rating}★ ${r.title} by ${r.user?.username ?? 'anon'}`);
});
```
📈 Performance
| Metric | Value |
|---|---|
| Reviews per hotel page (GraphQL) | 10 |
| Concurrent review pages per hotel | up to 3 (auto‑capped to 1 when many hotels run in parallel) |
| Anti‑bot path | Mobile‑Safari UA + ImpitHttpClient GraphQL fingerprint, single‑shot DataDome detection |
| Failure visibility | Per‑location rows in tripadvisor-failures named dataset |
| Output formats | JSON, CSV, Excel, RSS, XML, HTML (via Apify dataset views) |
💡 Tips for best results
- Start with one place and `maxItems: 10` to validate the schema. Works the same for hotels, restaurants and attractions; confirm the fields you care about land in the dataset before launching big runs.
- Use server-side filters. `reviewRatings` and `reviewsLanguages` are pushed into the GraphQL payload, which is much cheaper than fetching everything and filtering downstream.
- Prefer relative dates over absolute when running on a schedule. `"30 days"` always means "last 30 days from now"; an absolute `2026-01-01` will silently widen as time passes.
- Lower `maxRequestRetries` for fast feedback. The default `15` absorbs anti-bot blocks but can hide a misconfigured proxy. Try `3` while iterating, raise it for production.
- Toggle `scrapeReviewerInfo` only when you need profiles. All review fields are kept in both modes; the only thing that changes is whether `user` is populated. Disabling it makes the dataset smaller and the run faster.
- Check the `tripadvisor-failures` dataset after each run. Empty = perfect run. Rows = location-level issues you can replay.
❓ FAQ
Which TripAdvisor URLs are supported?
Hotel_Review-…, Restaurant_Review-…, and Attraction_Review-… URLs on www.tripadvisor.com (anything that follows the …-g{geoId}-d{locationId}-Reviews-… slug pattern). You can also pass a FindRestaurants?geo=…&… search URL from the site’s restaurant finder — the actor expands it to individual restaurant places and scrapes reviews per venue. They share the same review GraphQL endpoint; only placeInfo.type differs. Other search hub pages (e.g. generic Hotels-g… lists) are not supported. Use www.tripadvisor.com — not tripadvisor.co.* mirrors.
How many reviews can I scrape per run?
As many as your Apify plan and maxItems allow. maxItems is per place (per Hotel_Review / Restaurant_Review / Attraction_Review URL, and per restaurant after a FindRestaurants listing is expanded). Rough upper bound: maxItems × total_places — where total_places counts every expanded restaurant from searches plus every direct detail URL (startUrls). Runs are still subject to platform limits (e.g. free-tier item caps). The actor paginates reviews in batches of 20 per GraphQL page, in parallel where possible.
What does scrapeReviewerInfo actually change?
The only field that changes is user. With true (default) the user object is populated with the reviewer's profile (username, display name, avatar, hometown, contributions, profile link). With false it's null. All other review fields (id, lang, subratings, photos, helpfulVotes, tripType, …) are emitted in both modes.
Do I need to know the URL to scrape a city?
No. Pass searchQuery: "Chicago" (or any city/neighborhood name) and the actor resolves it to a geoId via TripAdvisor's typeahead, then expands into venues based on the includeRestaurants / includeHotels / includeThingsToDo toggles. Today only includeRestaurants is wired through the listing expander; for hotels and attractions, paste the corresponding Hotel_Review-… / Attraction_Review-… URLs into startUrls directly. You can mix searchQuery and startUrls in the same run.
What's the difference between scrapeReviews: false and outputShape: "nested"?
scrapeReviews: false skips all review pagination — you only get the place metadata (one fast row per URL with placeDetailOnly: true). outputShape: "nested" still scrapes reviews but groups them under a single dataset row per place ({ placeDetailOnly: false, placeInfo, reviews: [...] }). Pick false for cheap bulk place snapshots; pick nested when you want reviews + a place‑centric layout for downstream consumers.
What date formats does lastReviewDate accept?
Both absolute (2026-01-01, ISO YYYY-MM-DD) and relative durations: "22 days", "3 weeks", "6 months", "1 year" (singular and plural both work). Reviews older than the resolved date are dropped client‑side, with an early‑break optimisation that stops paginating once an entire page is older than the cutoff.
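To make the two formats concrete, here is a minimal sketch of how a `lastReviewDate` value could be resolved to a cutoff date and applied to review rows. It is not the actor's implementation: `resolve_cutoff` and `keep_review` are hypothetical helpers, and treating a month as 30 days and a year as 365 days is an assumption, since the exact rounding is not documented here.

```python
import re
from datetime import date, timedelta
from typing import Any, Dict, Optional

_RELATIVE = re.compile(r"^\s*(\d+)\s+(day|week|month|year)s?\s*$", re.IGNORECASE)
# Assumed approximations for calendar units.
_APPROX_DAYS = {"day": 1, "week": 7, "month": 30, "year": 365}


def resolve_cutoff(value: str, today: Optional[date] = None) -> date:
    """Resolve "22 days" / "3 weeks" / "2026-01-01" to an absolute cutoff date."""
    today = today or date.today()
    match = _RELATIVE.match(value)
    if match:
        amount, unit = int(match.group(1)), match.group(2).lower()
        return today - timedelta(days=amount * _APPROX_DAYS[unit])
    return date.fromisoformat(value.strip())  # raises ValueError on bad input


def keep_review(review: Dict[str, Any], cutoff: date) -> bool:
    """True when the review was published on or after the cutoff."""
    return date.fromisoformat(review["publishedDate"]) >= cutoff


reference_day = date(2026, 3, 31)
cutoff = resolve_cutoff("22 days", today=reference_day)
print(cutoff)                                            # 2026-03-09
print(keep_review({"publishedDate": "2026-03-14"}, cutoff))  # True
```

Because a relative value is resolved against the current day on every run, it suits scheduled runs better than a fixed date.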
Why is lang sometimes different from the on‑site language toggle?
TripAdvisor machine‑translates reviews. We always emit the source language (originalLanguage), not the displayed translation, so analytics groupings stay accurate.
Can I scrape multiple places in one run?
Yes — pass any number of startUrls. Hotels, restaurants and attractions can be mixed in the same run; they're processed concurrently up to maxConcurrency. Intra‑place pagination is auto‑capped to 1 in that case so the GraphQL endpoint isn't hammered with concurrency × pages requests.
Where do failed locations end up?
In a named dataset called tripadvisor-failures with rows like { url, locationId, locationType, reason, message, timestamp }. The reason field is one of invalid-location-id, html-blocked, datadome-block, reviews-fetch-failed, no-reviews-saved. The Apify run page surfaces a direct link.
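Once the failure rows are exported, a short script can summarise them and build a replay list. This is a sketch under assumptions: the row fields follow the shape quoted above, `summarize_failures` is a made-up helper, and the choice of which reasons count as retryable is a guess rather than documented behaviour.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List, Tuple

# Assumption: blocks and fetch errors are transient and worth replaying.
RETRYABLE_REASONS = {"html-blocked", "datadome-block", "reviews-fetch-failed"}


def summarize_failures(rows: Iterable[Dict[str, Any]]) -> Tuple[Counter, List[Dict[str, str]]]:
    """Count failures per reason and build a startUrls list for a replay run."""
    rows = list(rows)
    by_reason = Counter(row.get("reason", "unknown") for row in rows)
    replay = [{"url": row["url"]} for row in rows if row.get("reason") in RETRYABLE_REASONS]
    return by_reason, replay


failure_rows = [
    {"url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html", "reason": "datadome-block"},
    {"url": "https://www.tripadvisor.com/Hotels-g60763-New_York_City_New_York-Hotels.html", "reason": "invalid-location-id"},
]
counts, replay_start_urls = summarize_failures(failure_rows)
print(dict(counts))
print(replay_start_urls)
```

The `replay_start_urls` list already has the `{ url }` shape expected by `startUrls`, so it can be passed to a follow-up run as is.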
Is private content accessible?
No. The actor only sees public TripAdvisor pages and the public review GraphQL endpoint.
📬 Support
- Bug reports / feature requests: use the Issues tab on the actor page.
- Direct contact: muhamed.didovic@gmail.com
- Author's website: muhamed-didovic.github.io
🛠️ Additional services
- Custom output schemas, one‑off datasets, scheduled exports.
- Other platforms (Booking, Yelp, Google Maps, etc.) on request.
- API integrations and automation pipelines.
Email: muhamed.didovic@gmail.com.
🔭 Explore more scrapers
If this actor was useful, browse other scrapers from memo23 on Apify.
⚖️ Legal & compliance
This scraper targets publicly accessible TripAdvisor content for legitimate research, monitoring, and analytics. You are responsible for:
- Complying with TripAdvisor's terms of service.
- Respecting `robots.txt` and reasonable rate limits.
- Using scraped data lawfully (privacy, copyright, GDPR/CCPA where applicable).
- Obtaining any necessary permissions for commercial reuse of the data.