Pricing

from $0.45 / 1,000 results

TripAdvisor Only $0.45💰 Search | Hotels | Restaurants | Reviews

💰 $0.45 per 1,000 results. Scrape TripAdvisor hotel reviews: title, rating, language, text, dates, owner response, photos, sub-ratings, and optional reviewer profiles. Each review is enriched with place metadata (rating, address, geo, website, histogram). Filter by rating, language, date and per-place limit.

Rating: 5.0 (2)
Developer: Muhamed Didovic
Last modified: 11 days ago

TripAdvisor Reviews Scraper

Turn TripAdvisor hotel pages into structured review datasets. Pull every public review for a property — title, rating, language, text, travel date, owner response, photos, sub‑ratings — already enriched with the host place's full metadata (address, geo, ranking, ratingHistogram). One run, one clean dataset.

✨ Why use this scraper?

Manually opening hotel pages and copying reviews? Stitching together separate "reviews" and "place details" scrapes? Getting blocked by DataDome the moment you scale?

🏨 Reviews + place metadata in the same row. Every review already carries placeInfo (rating, address, lat/lng, ranking, histogram). No follow‑up enrichment.
🎯 Server‑side filters wired through TripAdvisor's GraphQL. Star ratings, languages, per‑place limits and dates are pushed down to the API — you get back what you asked for.
📅 Absolute or relative date cutoff. "2026-01-01" or "22 days", "3 weeks", "6 months", "1 year" — all valid for lastReviewDate.
👤 Optional reviewer profiles. scrapeReviewerInfo (default true) populates each review's user object with username, hometown, contributions, avatar and profile link; set it to false for a leaner output where user is null.
🧩 Three output modes. Default flat is one row per review (good for tabular consumers). outputShape: "nested" collapses each place into a single row with reviews[] nested. scrapeReviews: false skips reviews entirely and emits a place‑only snapshot — fast, low‑cost.
🛡️ Hardened anti‑bot path. Mobile‑Safari UA fallback through undici for HTML, real browser fingerprinting via ImpitHttpClient for the GraphQL endpoint, single‑shot DataDome detection.
📑 Per‑location failure dataset. Skipped or blocked hotels land in a side dataset (tripadvisor-failures) instead of getting buried in logs.
⚡ Parallel pagination. Each hotel pages through GraphQL in concurrent batches (default 3), respecting your global and per‑place caps.

🎯 Use cases

Team	What they build
Hotel ops	Daily review monitoring + owner‑response SLA tracking
Reputation managers	Multi‑property reputation dashboards with ratingHistogram drift over time
Market analysts	Competitive benchmarks across cities or chains using `placeInfo.rating` + `numberOfReviews`
Content / NLP teams	Multilingual review corpora for sentiment and topic models, filtered by language and rating
Travel media	Curated "best of" articles backed by recent verified reviews
Data teams	One‑shot dataset exports for BI, lake or warehouse ingestion (JSON, CSV, Excel)

🔧 How it works (pipeline)

Detect location ID from each *_Review-... URL (-d{id}-) — works for hotels, restaurants and attractions.
Fetch the place HTML with a mobile‑Safari User‑Agent. Falls back to a direct undici fetch when DataDome blocks the desktop fingerprint, and to URL‑derived placeInfo when even that is blocked (reviews still come through over GraphQL).
Extract placeInfo from the page's JSON‑LD + meta description (rating, review count, address, geo, ranking position).
Page through GraphQL reviews at /data/graphql/ids in concurrent batches, with reviewRatings / reviewsLanguages / maxItems pushed into the filter payload.
Apply lastReviewDate client‑side after each page, and break early once an entire page is older than the cutoff.
Map and push each review enriched with placeInfo to the default dataset; skipped or blocked places go to tripadvisor-failures.
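
As a minimal sketch of the first pipeline step, the location ID can be pulled out of a place URL with a regular expression. The helper name `extract_location_id` and the exact pattern are illustrative assumptions based on the `-g{geoId}-d{locationId}-` slug described above, not the actor's actual code.

```python
import re
from typing import Optional

# Matches the "-d{locationId}-" segment that follows "_Review-g{geoId}".
LOCATION_ID_RE = re.compile(r"_Review-g\d+-d(\d+)-")


def extract_location_id(url: str) -> Optional[str]:
    """Return the numeric location ID as a string, or None when the URL has no such segment."""
    match = LOCATION_ID_RE.search(url)
    return match.group(1) if match else None


hotel_url = (
    "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-"
    "Hilton_New_York_Times_Square-New_York_City_New_York.html"
)
print(extract_location_id(hotel_url))  # 208453
```

Listing URLs such as `FindRestaurants?geo=…` carry no `-d{id}-` segment, so this helper returns `None` for them; they have to be expanded into place URLs first.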

📥 Supported inputs

Currently supported start URLs (mix and match in a single run):

Pattern	Example
`Hotel_Review-g{geoId}-d{locationId}-Reviews-{slug}.html`	`https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html`
`Restaurant_Review-g{geoId}-d{locationId}-Reviews-{slug}.html`	`https://www.tripadvisor.com/Restaurant_Review-g60763-d25324283-Reviews-Allora_Fifth_Ave-New_York_City_New_York.html`
`Attraction_Review-g{geoId}-d{locationId}-Reviews-{slug}.html`	`https://www.tripadvisor.com/Attraction_Review-g60763-d105123-Reviews-Statue_of_Liberty-New_York_City_New_York.html`
`Restaurants-g{geo}-…html` GEO restaurant hub — expands to each matching restaurant	`https://www.tripadvisor.com/Restaurants-g35805-Chicago_Illinois.html`
`Hotels-g{geo}-…-Hotels.html` GEO hotel hub — expands to each matching hotel	`https://www.tripadvisor.com/Hotels-g60763-New_York_City_New_York-Hotels.html`
`Attractions-g{geo}-Activities-…html` GEO attractions hub — expands to each matching attraction (optional `-cNN-` category)	`https://www.tripadvisor.com/Attractions-g60763-Activities-New_York_City_New_York.html`
`FindRestaurants` search URL — expands to each matching restaurant	`https://www.tripadvisor.com/FindRestaurants?geo=188673&establishmentTypes=10591&mealTypes=10597&broadened=false`

Place detail URLs share the same GraphQL endpoint and review schema — only placeInfo.type differs (HOTEL, EATERY, ATTRACTION). A FindRestaurants start URL is resolved via TripAdvisor’s listing GraphQL into many Restaurant_Review venues (paginated, up to 10 × 30 = 300 venues per search by default); then each venue is scraped like a standalone restaurant URL — maxItems applies per restaurant discovered from that listing (subject to your plan’s overall cap).

Not currently supported:

Search / hub pages other than the listing patterns in the table above (FindRestaurants, Restaurants-g…, Hotels-g…-Hotels.html, Attractions-g…-Activities-…).
tripadvisor.co.* country domains (use .com).
Non‑TripAdvisor hosts.

⚙️ Input parameters

Discovery (use `searchQuery` and/or `startUrls`)

Parameter	Type	Default	Description
`searchQuery`	string	—	Free-text location, e.g. `"Chicago"`, `"Brooklyn"`, `"London"`. Resolved to a TripAdvisor geoId via the typeahead GraphQL endpoint, then expanded into venues based on the include* toggles below. Use alongside or instead of `startUrls`.
`startUrls`	array of `{ url }`	`[]`	TripAdvisor `Hotel_Review`, `Restaurant_Review`, `Attraction_Review`, and `FindRestaurants?…` URLs. Listing URLs expand to discovered restaurants; plain place URLs scrape that place directly. Run in parallel up to `maxConcurrency`.
`includeRestaurants`	boolean	`true`	When `searchQuery` is set, include `Restaurant_Review` venues for the resolved geo. Wired through the existing `FindRestaurants?geo=…` expansion.
`includeHotels`	boolean	`true`	When `searchQuery` is set, include `Hotel_Review` venues. Wired through the `Hotels-g{geo}-…-Hotels.html` listing expander (paginated, up to ~300 hotels per query).
`includeThingsToDo`	boolean	`true`	When `searchQuery` is set, include `Attraction_Review` venues. Wired through the `Attractions-g{geo}-Activities-…html` listing expander (paginated, up to ~300 attractions per query).
`includeNearby`	boolean	`false`	When `true`, after each main place is scraped the actor expands up to 5 nearby venues from the page's nearby carousel as additional place-detail-only snapshot rows tagged with `isNearbyResult: true`. Reviews are NOT fetched for nearby venues. Depth capped at 1.

One of searchQuery or startUrls should be provided. Empty input produces a clean no-op run.
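
The discovery rules above can be captured in a small client-side helper. This is a sketch only: `build_discovery_input` is a hypothetical name, and raising on empty input is this example's own choice (the actor itself would simply finish as a no-op run).

```python
from typing import Any, Dict, List, Optional


def build_discovery_input(
    search_query: Optional[str] = None,
    start_urls: Optional[List[str]] = None,
    include_restaurants: bool = True,
    include_hotels: bool = True,
    include_things_to_do: bool = True,
) -> Dict[str, Any]:
    """Assemble the discovery part of the actor input (hypothetical helper)."""
    run_input: Dict[str, Any] = {}
    if search_query:
        run_input["searchQuery"] = search_query
        # The include* toggles only matter when searchQuery is set.
        run_input["includeRestaurants"] = include_restaurants
        run_input["includeHotels"] = include_hotels
        run_input["includeThingsToDo"] = include_things_to_do
    if start_urls:
        # startUrls is an array of { url } objects.
        run_input["startUrls"] = [{"url": url} for url in start_urls]
    if not run_input:
        # Fail locally instead of launching a run that would do nothing.
        raise ValueError("Provide searchQuery and/or startUrls")
    return run_input


chicago_restaurants = build_discovery_input(
    search_query="Chicago", include_hotels=False, include_things_to_do=False
)
```

The resulting dictionary can be merged with filter parameters such as `maxItems` before it is passed as the run input.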

Filters

Parameter	Type	Default	Description
`maxItems`	integer	`50`	Max reviews to fetch per place / per URL. Applied to each entry in `startUrls` independently — e.g. `maxItems: 50` with 3 URLs ⇒ up to 150 reviews total. `0` = unlimited (paginate to the end of each place).
`reviewRatings`	array	`[]` (all)	Star ratings to keep. Values: `"1"`, `"2"`, `"3"`, `"4"`, `"5"`, or `"ALL_REVIEW_RATINGS"`. Pushed down into the GraphQL filter payload.
`reviewsLanguages`	array	`[]` (all)	ISO 639‑1 codes (e.g. `["en", "de", "fr"]`) or `"ALL_REVIEW_LANGUAGES"`. Pushed down into the GraphQL filter payload.
`lastReviewDate`	string	—	Skip reviews published before this date. Accepts an absolute date `YYYY-MM-DD` or a relative duration: `"22 days"`, `"2 weeks"`, `"3 months"`, `"1 year"` (singular or plural).
`scrapeReviewerInfo`	boolean	`true`	When `true`, the `user` object on each review is populated with the reviewer's profile (username, display name, avatar, hometown, contributions, profile link). When `false`, `user` is `null`. The rest of the review fields (`id`, `lang`, `helpfulVotes`, `tripType`, `subratings`, `photos`, …) are emitted in both modes.

Migration note: maxItemsPerQuery was merged into maxItems — maxItems is now the per-place cap (it used to be a run-wide ceiling). Saved configs that still pass maxItemsPerQuery keep working: it's accepted as a deprecated alias and overrides maxItems for that run, with a warning logged.
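
To make the `lastReviewDate` formats concrete, here is a rough sketch of how an absolute or relative value could be resolved to a cutoff date. The function name and the 30-day month / 365-day year approximations are assumptions made for this example; the actor may use calendar arithmetic instead.

```python
import re
from datetime import date, timedelta

# "<number> <unit>" with an optional plural "s", e.g. "22 days" or "1 year".
_RELATIVE_RE = re.compile(r"^\s*(\d+)\s+(day|week|month|year)s?\s*$", re.IGNORECASE)
# Approximate unit lengths in days (assumption for this sketch).
_DAYS_PER_UNIT = {"day": 1, "week": 7, "month": 30, "year": 365}


def resolve_last_review_date(value: str, today: date) -> date:
    """Turn "YYYY-MM-DD" or a relative duration into a cutoff date."""
    match = _RELATIVE_RE.match(value)
    if match:
        amount = int(match.group(1))
        unit = match.group(2).lower()
        return today - timedelta(days=amount * _DAYS_PER_UNIT[unit])
    # Anything else must be an absolute ISO date; invalid input raises ValueError.
    return date.fromisoformat(value.strip())


today = date(2026, 3, 31)
print(resolve_last_review_date("22 days", today))     # 2026-03-09
print(resolve_last_review_date("2026-01-01", today))  # 2026-01-01
```

Passing `today` explicitly keeps the function deterministic, which makes scheduled runs easier to reason about and test.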

Place vs reviews

Parameter	Type	Default	Description
`scrapeReviews`	boolean	`true`	When `true` (default), paginate reviews via GraphQL up to `maxItems` per place. When `false`, skip reviews entirely and emit one dataset row per start URL with `{ "placeDetailOnly": true, "placeInfo": … }` parsed from the page HTML. Best for fast place snapshots.
`includeReviewTags`	boolean	`true`	When `true` (default), include `placeInfo.reviewTags` (theme phrases like `"sushi: 14 reviews"`) on emitted rows when TripAdvisor embeds them. Set `false` to drop them for smaller payloads.
`outputShape`	string (`"flat"` | `"nested"`)	`"flat"`	Controls the dataset shape when reviews are scraped. `"flat"` (default) keeps today's row‑per‑review layout, each row carrying `placeInfo`. `"nested"` collapses each place into a single row of the form `{ "placeDetailOnly": false, "placeInfo": …, "reviews": [...] }`. No effect when `scrapeReviews: false`.

Note on nested mode + billing: under PRICE_PER_DATASET_ITEM (the default Apify pricing model), nested mode bills once per place instead of once per review — so a place with 200 reviews charges 1 dataset item, not 200. If you switch the default, also review your actor's pricing config.
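
A quick worked example of the billing difference, assuming the listed starting price of $0.45 per 1,000 results applies unchanged (your actual rate may differ by plan). The function names are illustrative.

```python
from typing import List


def billed_items(reviews_per_place: List[int], output_shape: str = "flat") -> int:
    """Count dataset items: one per review in flat mode, one per place in nested mode."""
    if output_shape == "nested":
        return len(reviews_per_place)
    return sum(reviews_per_place)


def estimated_cost_usd(items: int, price_per_1000: float = 0.45) -> float:
    """Cost estimate under a per-dataset-item price (assumed rate)."""
    return items / 1000 * price_per_1000


one_place = [200]  # a single place with 200 reviews
flat_items = billed_items(one_place, "flat")      # 200 items
nested_items = billed_items(one_place, "nested")  # 1 item
print(round(estimated_cost_usd(flat_items), 4), round(estimated_cost_usd(nested_items), 6))
```

Under these assumptions the 200 flat rows cost about $0.09, while the single nested row costs a small fraction of a cent.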

Advanced

Parameter	Type	Default	Description
`maxConcurrency`	integer	`100`	Max start URLs (hotels) processed concurrently.
`minConcurrency`	integer	`1`	Crawler floor.
`maxRequestRetries`	integer	`15` (range `0`–`50`)	Retries per request before giving up. Lower = surface failures fast, higher = absorb transient anti‑bot blocks.
`proxy`	object	`{ useApifyProxy: true, apifyProxyGroups: ["RESIDENTIAL"] }`	Standard Apify proxy block. Apify Residential is strongly recommended for TripAdvisor.

📊 Output overview

The actor emits one of three row shapes depending on scrapeReviews and outputShape. Default settings produce a flat row‑per‑review dataset enriched with the host place's placeInfo — the rest of this section walks the review schema first, then shows how nested and place‑only rows differ.

Mode	`scrapeReviews`	`outputShape`	Rows per place	Row shape
Flat reviews (default)	`true`	`"flat"`	up to `maxItems`	`{ ...review, placeInfo }` — one row per review
Nested reviews	`true`	`"nested"`	exactly `1`	`{ placeDetailOnly: false, placeInfo, reviews: [...] }`
Place snapshot	`false`	(ignored)	exactly `1`	`{ placeDetailOnly: true, placeInfo }`
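
If you already have a flat dataset and want the nested layout downstream, the two shapes in the table can be related with a small client-side regrouping. This is a sketch under the assumption that every flat row carries `placeInfo.id`; it is not the actor's own implementation.

```python
from typing import Any, Dict, List


def nest_flat_rows(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Group row-per-review items into one row per place, keeping first-seen order."""
    places: Dict[str, Dict[str, Any]] = {}
    for row in rows:
        place_info = row["placeInfo"]
        # Everything except placeInfo is treated as the review itself.
        review = {key: value for key, value in row.items() if key != "placeInfo"}
        entry = places.setdefault(
            place_info["id"],
            {"placeDetailOnly": False, "placeInfo": place_info, "reviews": []},
        )
        entry["reviews"].append(review)
    return list(places.values())


flat = [
    {"id": "r1", "rating": 5, "placeInfo": {"id": "208453", "name": "Hotel A"}},
    {"id": "r2", "rating": 4, "placeInfo": {"id": "208453", "name": "Hotel A"}},
    {"id": "r3", "rating": 3, "placeInfo": {"id": "944622", "name": "Pub B"}},
]
nested = nest_flat_rows(flat)
print(len(nested), [len(place["reviews"]) for place in nested])
```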

The full review schema below is emitted in both scrapeReviewerInfo modes — the only difference is whether the user object is populated (true) or set to null (false).

Review schema (both modes)

{
    "id": "1003456789",
    "url": "https://www.tripadvisor.com/ShowUserReviews-g60763-d208453-r1003456789-Hilton_New_York_Times_Square-New_York_City_New_York.html",
    "title": "Great location, friendly staff",
    "lang": "en",
    "language": "en",
    "originalLanguage": "en",
    "locationId": "208453",
    "publishedDate": "2026-03-14",
    "publishedPlatform": "OTHER",
    "rating": 5,
    "helpfulVotes": 2,
    "text": "We stayed three nights and...",
    "travelDate": "2026-03",
    "stayDate": "2026-03-14",
    "tripType": "COUPLES",
    "user": null,
    "ownerResponse": {
        "id": "987654",
        "text": "Thank you for staying with us...",
        "lang": "en",
        "publishedDate": "2026-03-16",
        "responder": "Hilton Times Square",
        "connectionToSubject": "Manager"
    },
    "subratings": [
        { "name": "Service", "value": 5 },
        { "name": "Cleanliness", "value": 5 },
        { "name": "Value", "value": 4 },
        { "name": "Location", "value": 5 },
        { "name": "Rooms", "value": 4 },
        { "name": "Sleep Quality", "value": 5 }
    ],
    "photos": [
        { "id": "812340000", "width": 4032, "height": 3024, "image": "https://media-cdn.tripadvisor.com/media/photo-o/30/68/0c/00/lobby.jpg" }
    ],
    "placeInfo": {
        "id": "208453",
        "name": "Hilton New York Times Square",
        "rating": 4.0,
        "numberOfReviews": 8944,
        "locationString": "New York City, New York",
        "latitude": 40.756,
        "longitude": -73.989,
        "webUrl": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html",
        "website": "https://www.hilton.com/...",
        "path": "/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html",
        "phone": "+12125551234",
        "address": "234 W 42nd St, New York City, NY 10036",
        "addressObj": {
            "street1": "234 W 42nd St",
            "city": "New York City",
            "state": "NY",
            "postalcode": "10036",
            "country": "United States"
        },
        "ratingHistogram": { "count1": 412, "count2": 480, "count3": 1100, "count4": 2680, "count5": 4272 }
    }
}

With scrapeReviewerInfo: true the only field that changes is user — it is replaced with the reviewer's profile object:

{
    "user": {
        "userId": "ABCDEF12345",
        "name": "Traveler 123",
        "contributions": { "totalContributions": 42 },
        "username": "traveler123",
        "userLocation": { "shortName": "Berlin", "name": "Berlin, Germany", "id": "187323" },
        "avatar": { "id": "5567890", "width": 200, "height": 200, "image": "https://media-cdn.tripadvisor.com/media/photo-l/...jpg" },
        "link": "www.tripadvisor.com/Profile/traveler123"
    }
}

🗂️ Output fields

Review core

All fields below are emitted in both scrapeReviewerInfo modes.

Field	Type	Description
`id`	string	TripAdvisor review ID
`url`	string	Permalink to the review on TripAdvisor
`title`	string	Review headline
`text`	string	Review body
`rating`	integer (1–5)	Stars left by the reviewer
`lang`	string (ISO 639‑1)	Source language (preserves `originalLanguage` for machine‑translated reviews)
`language`	string (ISO 639‑1)	Current API language (may be the translated language)
`originalLanguage`	string (ISO 639‑1)	Source language before machine translation
`publishedDate`	string (ISO date)	When the review was published
`publishedPlatform`	string	Source platform (e.g. `OTHER`, `MOBILE`)
`helpfulVotes`	integer	Helpful‑vote count
`travelDate`	string (`YYYY-MM`)	Month/year of the stay
`stayDate`	string (`YYYY-MM-DD`)	Full stay date
`tripType`	string | null	`COUPLES`, `FAMILY`, `BUSINESS`, `SOLO`, `FRIENDS`, …
`locationId`	string	TripAdvisor location ID for the place

Owner response

Field	Type	Description
`ownerResponse.id`	string	Response ID
`ownerResponse.text`	string	Response body
`ownerResponse.lang`	string	Response language
`ownerResponse.publishedDate`	string	Response date
`ownerResponse.responder`	string	Display name (e.g. property name or manager)
`ownerResponse.connectionToSubject`	string	`Manager`, `Owner`, etc.

Subratings

subratings is an array of { name, value }:

Field	Type	Description
`subratings[].name`	string	`Service`, `Cleanliness`, `Value`, `Location`, `Rooms`, `Sleep Quality`, …
`subratings[].value`	integer (1–5)	Per‑aspect rating

Photos

photos is an array of { id, width, height, image }:

Field	Type	Description
`photos[].id`	string	TripAdvisor photo ID
`photos[].width` / `height`	integer	Native pixel dimensions
`photos[].image`	string	Bare CDN URL (no `?w=...` query)

Reviewer (only when `scrapeReviewerInfo: true`)

Field	Type	Description
`user.userId`	string | null	Internal TripAdvisor user ID
`user.name`	string | null	Display name
`user.username`	string | null	URL handle
`user.contributions.totalContributions`	integer	Lifetime contribution count
`user.userLocation`	object | null	`{ shortName, name, id }` resolved from hometown
`user.avatar`	object | null	`{ id, width, height, image }`
`user.link`	string | null	Profile link without scheme (`www.tripadvisor.com/Profile/...`)

Place metadata (`placeInfo`)

Field	Type	Description
`placeInfo.id`	string	Numeric location ID
`placeInfo.name`	string	Place display name (hotel, restaurant, attraction)
`placeInfo.rating`	number	Aggregate 1–5 rating
`placeInfo.numberOfReviews`	integer	Total review count on TripAdvisor
`placeInfo.locationString`	string	Human‑readable city/state/country
`placeInfo.latitude` / `longitude`	number	Geo coordinates
`placeInfo.address`	string	Single‑line address
`placeInfo.phone`	string | undefined	Telephone when TA exposes it
`placeInfo.path`	string | undefined	Relative TripAdvisor path (e.g. `/Restaurant_Review-g…-Reviews-…`); often set from FindRestaurants listing GraphQL
`placeInfo.webUrl`	string	Full `https://www.tripadvisor.com/…` URL for the place
`placeInfo.cuisines`	string[] | undefined	Cuisine labels when present (listing or HTML)
`placeInfo.priceLevel`	string | undefined	Price band, e.g. `$$ - $$$`
`placeInfo.menuUrl`	string | undefined	External menu / pub chain URL when present on the listing
`placeInfo.addressObj`	object	`{ street1, street2, city, state, postalcode, country }`
`placeInfo.website`	string	Official business website (when known)
`placeInfo.ratingHistogram`	object	`{ count1, count2, count3, count4, count5 }`
`placeInfo.type`	string	`HOTEL`, `EATERY`, `ATTRACTION`, …
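
As a worked example of the `ratingHistogram` field, the counts can be turned into a plain weighted mean. This sketch uses the illustrative histogram from the sample output; note that the result need not equal `placeInfo.rating`, since the displayed aggregate may be rounded or computed differently.

```python
from typing import Dict, Optional


def histogram_mean(histogram: Dict[str, int]) -> Optional[float]:
    """Weighted mean star rating from { count1..count5 }, or None when there are no reviews."""
    total = sum(histogram[f"count{stars}"] for stars in range(1, 6))
    if total == 0:
        return None
    weighted = sum(stars * histogram[f"count{stars}"] for stars in range(1, 6))
    return weighted / total


sample = {"count1": 412, "count2": 480, "count3": 1100, "count4": 2680, "count5": 4272}
# (412*1 + 480*2 + 1100*3 + 2680*4 + 4272*5) / 8944 = 36752 / 8944
print(round(histogram_mean(sample), 3))  # 4.109
```

Tracking this value across runs is one way to observe the histogram drift mentioned in the use cases.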

When you start from a FindRestaurants?… URL, the actor first calls TripAdvisor’s listing GraphQL and seeds each review row’s placeInfo with whatever the listing card returns (name, relative path, lat/lon, cuisines, price band, review count, address, phone, menu URL). The detail HTML pass (when not blocked) can still refine or override overlapping fields.

Alternative output shapes

Set outputShape: "nested" to collapse a place's reviews into a single row:

{
    "placeDetailOnly": false,
    "placeInfo": {
        "id": "208453",
        "name": "Hilton New York Times Square",
        "rating": 4.0,
        "numberOfReviews": 8944,
        "address": "234 W 42nd St, New York City, NY 10036",
        "ratingHistogram": { "count1": 412, "count2": 480, "count3": 1100, "count4": 2680, "count5": 4272 }
    },
    "reviews": [
        { "id": "1003456789", "rating": 5, "title": "Great location, friendly staff", "text": "We stayed three nights...", "publishedDate": "2026-03-14", "lang": "en", "...": "..." },
        { "id": "1003456790", "rating": 4, "title": "Solid choice for Times Square", "text": "Rooms were quieter than expected...", "publishedDate": "2026-03-13", "lang": "en", "...": "..." }
    ]
}

Set scrapeReviews: false (regardless of outputShape) to skip reviews entirely and get a fast place snapshot. The snapshot row carries the full parsed placeInfo — description, amenities, room tips, neighborhood + ancestor + metro context, hotel class, offers, category review scores, cuisines + hours (restaurants), and more — pulled from JSON-LD and the page's embedded redux/apollo state:

{
    "placeDetailOnly": true,
    "placeInfo": {
        "id": "208453",
        "name": "Hilton New York Times Square",
        "description": "Wake Up to the Best Views in Times Square. Perched above the energy and heart of the city...",
        "rating": 4.0,
        "rawRanking": 3.898643732070923,
        "rankingPosition": 289,
        "rankingDenominator": "520",
        "rankingString": "#289 of 520 hotels in New York City",
        "rankingSource": "HTML",
        "numberOfReviews": 8944,
        "hotelClass": "4.0",
        "hotelClassAttribution": "Classified by Giata.",
        "amenities": ["Wifi", "Pool", "Fitness center", "..."],
        "categoryReviewScores": [
            { "categoryName": "Service", "score": 4.36, "roundedScore": 4.4 },
            { "categoryName": "Cleanliness", "score": 4.58, "roundedScore": 4.6 }
        ],
        "neighborhoodLocations": [
            { "id": "15565670", "name": "Times Square" },
            { "id": "7102352", "name": "Midtown" }
        ],
        "ancestorLocations": [
            { "id": "60763", "name": "New York City", "subcategory": "City" },
            { "id": "28953", "name": "New York", "abbreviation": "NY", "subcategory": "State" }
        ],
        "nearestMetroStations": [
            { "name": "42nd St – Port Authority", "distance": 0.08, "lines": [{ "lineName": "A" }, { "lineName": "C" }] }
        ],
        "offers": [{ "pricePerNight": 169, "vendor": "Booking.com" }],
        "roomTips": [{ "id": "1055595676", "text": "Perfect Location in the Heart of Times Square." }],
        "address": "234 W 42nd St, New York City, NY 10036",
        "ratingHistogram": { "count1": 412, "count2": 480, "count3": 1100, "count4": 2680, "count5": 4272 },
        "reviewTags": [ { "text": "times square", "reviews": 1988 } ]
    }
}

Restaurant snapshots additionally surface cuisines, hours, dishes, mealTypes, dietaryRestrictions, features, menuWebUrl, establishmentTypes, and the open/closed flags (isClosed, isLongClosed, openNowText).

The same rich placeInfo is emitted at the top level when you use outputShape: "nested" — the nested reviews[] underneath carry the slim variant to avoid duplicating tens of fields per review.

Example placeInfo shaped like a listing card (values match what TripAdvisor often returns for a London pub — your dataset may use string id):

{
    "id": "944622",
    "name": "The Yacht",
    "path": "/Restaurant_Review-g186338-d944622-Reviews-The_Yacht-London_England.html",
    "webUrl": "https://www.tripadvisor.com/Restaurant_Review-g186338-d944622-Reviews-The_Yacht-London_England.html",
    "latitude": 51.48476,
    "longitude": -0.003814,
    "menuUrl": "https://www.greeneking.co.uk/pubs/greater-london/yacht/menu?utm_source=exnet&utm_medium=locations&utm_campaign=UC_menu",
    "cuisines": ["Bar", "British", "Pub"],
    "priceLevel": "$$ - $$$",
    "rating": 4,
    "numberOfReviews": 401,
    "address": "5 Crane St Greenwich, London SE10 9NP England",
    "phone": "+44 20 8858 0175",
    "type": "EATERY"
}

🚀 Examples

Single hotel, defaults

{
    "startUrls": [
        { "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html" }
    ],
    "maxItems": 50
}

Multiple hotels, English‑only, last 30 days

{
    "startUrls": [
        { "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html" },
        { "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d93589-Reviews-The_Manhattan_at_Times_Square_Hotel-New_York_City_New_York.html" }
    ],
    "maxItems": 100,
    "reviewsLanguages": ["en"],
    "lastReviewDate": "30 days"
}

Recent 3‑star reviews with full reviewer profiles

{
    "startUrls": [
        { "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html" }
    ],
    "maxItems": 50,
    "reviewRatings": ["3"],
    "reviewsLanguages": ["en"],
    "lastReviewDate": "2026-01-01",
    "scrapeReviewerInfo": true
}

Mix hotels and restaurants in one run

{
    "startUrls": [
        { "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html" },
        { "url": "https://www.tripadvisor.com/Restaurant_Review-g60763-d25324283-Reviews-Allora_Fifth_Ave-New_York_City_New_York.html" }
    ],
    "maxItems": 50,
    "reviewsLanguages": ["en"]
}

Place snapshots only (no reviews)

Skip review pagination entirely — useful for bulk place metadata:

{
    "startUrls": [
        { "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html" },
        { "url": "https://www.tripadvisor.com/Restaurant_Review-g60763-d25324283-Reviews-Allora_Fifth_Ave-New_York_City_New_York.html" }
    ],
    "scrapeReviews": false
}

Each start URL produces exactly one dataset row of { "placeDetailOnly": true, "placeInfo": … }.

Nested output (one row per place, reviews under `reviews[]`)

{
    "startUrls": [
        { "url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html" }
    ],
    "maxItems": 200,
    "outputShape": "nested",
    "reviewsLanguages": ["en"]
}

The dataset will contain exactly one row per place with reviews nested under it — easier to consume when you want a single object per hotel/restaurant rather than joining N review rows back to a place.

Free-text search query

Resolve a city or neighborhood name to a geoId without pasting a URL — combines naturally with the include* toggles:

{
    "searchQuery": "Chicago",
    "includeRestaurants": true,
    "includeHotels": false,
    "includeThingsToDo": false,
    "maxItems": 30,
    "reviewsLanguages": ["en"]
}

The actor calls TripAdvisor's typeahead GraphQL, picks the first GEO/NEIGHBORHOOD result, and seeds the run with FindRestaurants?geo=<id> (when includeRestaurants), Hotels-g<id>-Hotels.html (when includeHotels), and/or Attractions-g<id>-Activities.html (when includeThingsToDo). All three listing types share the same -oa30- HTML pagination pipeline.
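
The seeding step described above can be sketched as a function from a resolved geoId to listing URLs. The URL templates are taken from the description; the function name and the fixed host are assumptions, and the typeahead lookup itself is not shown.

```python
from typing import List

BASE_URL = "https://www.tripadvisor.com"


def seed_listing_urls(
    geo_id: str,
    include_restaurants: bool = True,
    include_hotels: bool = True,
    include_things_to_do: bool = True,
) -> List[str]:
    """Build the listing URLs a search query would be expanded into (hypothetical helper)."""
    urls: List[str] = []
    if include_restaurants:
        urls.append(f"{BASE_URL}/FindRestaurants?geo={geo_id}")
    if include_hotels:
        urls.append(f"{BASE_URL}/Hotels-g{geo_id}-Hotels.html")
    if include_things_to_do:
        urls.append(f"{BASE_URL}/Attractions-g{geo_id}-Activities.html")
    return urls


# Chicago's geoId as it appears in the Supported inputs table.
print(seed_listing_urls("35805", include_hotels=False, include_things_to_do=False))
```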

`FindRestaurants` search (listing → many restaurants)

{
    "startUrls": [
        {
            "url": "https://www.tripadvisor.com/FindRestaurants?geo=188673&establishmentTypes=10591&mealTypes=10597&broadened=false"
        }
    ],
    "maxItems": 50,
    "reviewsLanguages": ["en"]
}

Each review row carries placeInfo seeded from the listing card (coordinates, cuisines, priceLevel, menuUrl, relative path, etc., when TripAdvisor returns them).

💻 Integrations

Python

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("muhamed-didovic/apify-tripadvisor").call(run_input={
    "startUrls": [
        {"url": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html"},
    ],
    "maxItems": 50,
    "reviewsLanguages": ["en"],
    "lastReviewDate": "30 days",
})

for review in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{review['rating']}★  {review['title']}  ({review['lang']})")
    print(f"   place: {review['placeInfo']['name']}  rating: {review['placeInfo']['rating']}")

JavaScript / Node

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('muhamed-didovic/apify-tripadvisor').call({
    startUrls: [
        { url: 'https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html' },
    ],
    maxItems: 50,
    scrapeReviewerInfo: true,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((r) => {
    console.log(`${r.rating}★  ${r.title}  by ${r.user?.username ?? 'anon'}`);
});

📈 Performance

Metric	Value
Reviews per hotel page (GraphQL)	10
Concurrent review pages per hotel	up to 3 (auto‑capped to 1 when many hotels run in parallel)
Anti‑bot path	Mobile‑Safari UA + `ImpitHttpClient` GraphQL fingerprint, single‑shot DataDome detection
Failure visibility	Per‑location rows in `tripadvisor-failures` named dataset
Output formats	JSON, CSV, Excel, RSS, XML, HTML (via Apify dataset views)

💡 Tips for best results

Start with one place and maxItems: 10 to validate the schema. Works the same for hotels, restaurants and attractions — confirm the fields you care about land in the dataset before launching big runs.
Use server‑side filters. reviewRatings and reviewsLanguages are pushed into the GraphQL payload — much cheaper than fetching everything and filtering downstream.
Prefer relative dates over absolute when running on a schedule. "30 days" always means "last 30 days from now"; an absolute 2026-01-01 will silently widen as time passes.
Lower maxRequestRetries for fast feedback. The default 15 absorbs anti‑bot blocks but can hide a misconfigured proxy. Try 3 while iterating, raise it for production.
Toggle scrapeReviewerInfo only when you need profiles. All review fields are kept in both modes; the only thing that changes is whether user is populated. Disabling it makes the dataset smaller and the run faster.
Check the tripadvisor-failures dataset after each run. Empty = perfect run. Rows = location‑level issues you can replay.

❓ FAQ

Which TripAdvisor URLs are supported? Hotel_Review-…, Restaurant_Review-…, and Attraction_Review-… URLs on www.tripadvisor.com (anything that follows the …-g{geoId}-d{locationId}-Reviews-… slug pattern). You can also pass a FindRestaurants?geo=…&… search URL from the site's restaurant finder, or one of the Restaurants-g…, Hotels-g…-Hotels.html and Attractions-g…-Activities-… hub URLs listed under Supported inputs; the actor expands each listing to individual places and scrapes reviews per venue. All place types share the same review GraphQL endpoint; only placeInfo.type differs. Other search or hub pages are not supported. Use www.tripadvisor.com, not tripadvisor.co.* mirrors.

How many reviews can I scrape per run? As many as your Apify plan and maxItems allow. maxItems is per place (per Hotel_Review / Restaurant_Review / Attraction_Review URL, and per restaurant after a FindRestaurants listing is expanded). Rough upper bound: maxItems × total_places — where total_places counts every expanded restaurant from searches plus every direct detail URL (startUrls). Runs are still subject to platform limits (e.g. free-tier item caps). The actor paginates reviews in batches of 20 per GraphQL page, in parallel where possible.

What does scrapeReviewerInfo actually change? The only field that changes is user. With true (default) the user object is populated with the reviewer's profile (username, display name, avatar, hometown, contributions, profile link). With false it's null. All other review fields (id, lang, subratings, photos, helpfulVotes, tripType, …) are emitted in both modes.

Do I need to know the URL to scrape a city? No. Pass searchQuery: "Chicago" (or any city/neighborhood name) and the actor resolves it to a geoId via TripAdvisor's typeahead, then expands into venues based on the includeRestaurants / includeHotels / includeThingsToDo toggles, each wired through its own listing expander (see Input parameters). You can mix searchQuery and startUrls in the same run.

What's the difference between scrapeReviews: false and outputShape: "nested"? scrapeReviews: false skips all review pagination — you only get the place metadata (one fast row per URL with placeDetailOnly: true). outputShape: "nested" still scrapes reviews but groups them under a single dataset row per place ({ placeDetailOnly: false, placeInfo, reviews: [...] }). Pick false for cheap bulk place snapshots; pick nested when you want reviews + a place‑centric layout for downstream consumers.

What date formats does lastReviewDate accept? Both absolute (2026-01-01, ISO YYYY-MM-DD) and relative durations: "22 days", "3 weeks", "6 months", "1 year" (singular and plural both work). Reviews older than the resolved date are dropped client‑side, with an early‑break optimisation that stops paginating once an entire page is older than the cutoff.
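
The client-side cutoff with early break can be illustrated roughly as follows. This sketch assumes pages arrive newest first and that each review has an ISO `publishedDate`; the generator name is hypothetical and the real pagination code may differ.

```python
from datetime import date
from typing import Any, Dict, Iterable, Iterator, List

Review = Dict[str, Any]


def reviews_until_cutoff(pages: Iterable[List[Review]], cutoff: date) -> Iterator[Review]:
    """Yield reviews published on or after the cutoff; stop once a whole page is older."""
    for page in pages:
        fresh = [r for r in page if date.fromisoformat(r["publishedDate"]) >= cutoff]
        yield from fresh
        if page and not fresh:
            # Every review on this page predates the cutoff, so later pages are skipped.
            break


pages = [
    [{"id": "a", "publishedDate": "2026-03-14"}, {"id": "b", "publishedDate": "2026-01-02"}],
    [{"id": "c", "publishedDate": "2025-12-31"}, {"id": "d", "publishedDate": "2025-12-30"}],
    [{"id": "e", "publishedDate": "2025-11-01"}],
]
kept = [r["id"] for r in reviews_until_cutoff(pages, date(2026, 1, 1))]
print(kept)  # ['a', 'b']
```

Because `pages` can be a lazy iterable, breaking out of the loop means the remaining pages are never requested.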

Why is lang sometimes different from the on‑site language toggle? TripAdvisor machine‑translates reviews. We always emit the source language (originalLanguage), not the displayed translation, so analytics groupings stay accurate.

Can I scrape multiple places in one run? Yes. Pass any number of startUrls. Hotels, restaurants and attractions can be mixed in the same run; they're processed concurrently up to maxConcurrency. In that case, parallel pagination within each place is auto-capped to 1, so the GraphQL endpoint isn't hammered with concurrency × pages requests.

Where do failed locations end up? In a named dataset called tripadvisor-failures with rows like { url, locationId, locationType, reason, message, timestamp }. The reason field is one of invalid-location-id, html-blocked, datadome-block, reviews-fetch-failed, no-reviews-saved. The Apify run page surfaces a direct link.
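Once the failure rows are exported, a quick tally by reason shows whether a run mostly hit blocks or bad inputs. The sketch below works on rows already loaded as dicts; the reason values are the ones listed above, while the function name and the sample rows are invented for illustration.

```python
from collections import Counter

# Reason codes as listed in the FAQ answer above.
KNOWN_REASONS = {
    "invalid-location-id",
    "html-blocked",
    "datadome-block",
    "reviews-fetch-failed",
    "no-reviews-saved",
}


def summarize_failures(rows: list[dict]) -> tuple[Counter, list[str]]:
    """Count failures per reason and report any reason not in the known set."""
    counts = Counter(row.get("reason", "unknown") for row in rows)
    unexpected = sorted(set(counts) - KNOWN_REASONS)
    return counts, unexpected


# Invented sample rows, shaped like the tripadvisor-failures dataset.
failure_rows = [
    {"url": "https://example.com/a", "reason": "datadome-block"},
    {"url": "https://example.com/b", "reason": "datadome-block"},
    {"url": "https://example.com/c", "reason": "invalid-location-id"},
]

counts, unexpected = summarize_failures(failure_rows)
print(counts.most_common())  # [('datadome-block', 2), ('invalid-location-id', 1)]
print(unexpected)            # []
```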

Is private content accessible? No. The actor only sees public TripAdvisor pages and the public review GraphQL endpoint.

📬 Support

Bug reports / feature requests: use the Issues tab on the actor page.
Direct contact: muhamed.didovic@gmail.com
Author's website: muhamed-didovic.github.io

🛠️ Additional services

Custom output schemas, one‑off datasets, scheduled exports.
Other platforms (Booking, Yelp, Google Maps, etc.) on request.
API integrations and automation pipelines.

Email: muhamed.didovic@gmail.com.

🔭 Explore more scrapers

If this actor was useful, browse other scrapers from memo23 on Apify.

⚖️ Legal & compliance

This scraper targets publicly accessible TripAdvisor content for legitimate research, monitoring, and analytics. You are responsible for:

Complying with TripAdvisor's terms of service.
Respecting robots.txt and reasonable rate limits.
Using scraped data lawfully (privacy, copyright, GDPR/CCPA where applicable).
Obtaining any necessary permissions for commercial reuse of the data.
