Pricing

$19.99/month + usage

Tripadvisor Review Scraper With Keyword Search

🧭 Tripadvisor Review Scraper automates collection of reviews, ratings, dates, helpful votes & reviewer profiles from hotels, restaurants and attractions. 🔎 Pagination & filters included. 📊 Export CSV/JSON for sentiment, reputation tracking & competitor analysis. ⚡

Developer

ScrapAPI

Last modified: a day ago

What does this TripAdvisor review scraper do?

Given one or more hotel URLs, hotel names, or search terms, the actor:

Resolves each input to a real TripAdvisor hotel page (direct URL is fastest; a name or search term is auto-resolved via Google Search).
Pulls the hotel's own metadata — name, address, geo-coordinates, overall rating, rating histogram, property-level aspect scorecard, and current live ranking — from the hotel page.
Fetches reviews through TripAdvisor's internal review-list API, with full auto-pagination up to your requested count (no manual "next page" clicking, no scrolling).
Optionally runs a site-side keyword search: each keyword you supply is sent to TripAdvisor's own review-search index (not a local text grep), in every language you specify, and the actor reports TripAdvisor's own match count per keyword alongside the matching reviews.
Exports everything as structured JSON/CSV/Excel/XML/HTML, ready to plug into a spreadsheet, BI dashboard, or downstream NLP/sentiment pipeline.

Why keyword search matters for TripAdvisor reviews

Most TripAdvisor review scrapers hand back the newest (or highest-rated) N reviews and machine-translate everything to English by default — which hides two things a reputation-monitoring or due-diligence use case actually needs:

Whether a specific complaint exists at all, and how often — without you reading thousands of rows yourself.
What guests who wrote in their own language actually said, since an English-only keyword search misses reviews originally written in Spanish, Indonesian, Portuguese, etc.

This actor asks TripAdvisor's own search index directly, across the language(s) you choose, and surfaces the real match count TripAdvisor itself reports — instead of guessing from a sample of the newest reviews.

Key features

🔎 Keyword-in-review search — run your own list of concern words/phrases as site-side searches against TripAdvisor's review index, not a local grep over already-fetched reviews.
🌐 Keyword × language cross-product — pair each keyword with the ISO language code(s) it should be searched in, so a local-language complaint isn't invisible to an English-only sweep.
📊 TripAdvisor's own match count — keywordMatchCounts reports the site's real per-keyword, per-language hit count (not an estimate), so you know how common a concern is before pulling every matching row.
📝 Original wording, not silent auto-translation — toggle off machine translation to get the guest's real text and language, plus flags (originalLanguage, translationType, isMachineTranslated) so every row tells you whether it was translated.
🧳 Traveler-type filter — narrow results to Business, Couples, Family, Friends, or Solo trips.
🏅 Hotel aspect scorecard — the property-level Location / Rooms / Value / Cleanliness / Service / Sleep Quality scores TripAdvisor shows on the hotel page itself (not an average of the sampled reviews).
🏆 Live hotel ranking — the hotel's current position/out-of/category/area ranking as TripAdvisor displays it right now.
⭐ Star-rating filter, sort order (newest/oldest/most relevant/highest rating), and a display-language filter for the plain (non-keyword) feed.
👤 Optional full reviewer profile (name, username, location, avatar, contribution counts) per review.
📥 Bulk input — multiple hotel URLs/names in one run; auto-pagination up to 10,000 reviews per hotel.
📦 Standard Apify export formats: JSON, CSV, Excel, XML, HTML.

How to use it

Open the actor and paste one or more TripAdvisor hotel URLs into Hotel URLs, Names, or Search Terms (a hotel name or loose search term also works — it's auto-resolved).
Leave Concern Keywords to Search For at its default list, or replace it with your own words/phrases — or clear it entirely to turn keyword search off and get the plain newest/oldest/relevant/rating feed instead.
Add the ISO language code(s) you want each keyword searched in under Languages to Search In.
Set Max Reviews to Return per Hotel, pick a Result Order, and optionally narrow by Star Rating Filter or Traveler Type Filter.
Click Start. Results stream into the dataset as they're found — export as JSON, CSV, Excel, XML, or HTML when the run finishes.

Input

Field	Type	Description
`startUrls`	array (required)	TripAdvisor hotel URLs, hotel names, or search terms.
`reviewKeywords`	array	Words/phrases to search for inside reviews via TripAdvisor's own search index. Empty = keyword search off (default: `["dirty","noisy","rude","smell","broken","bed bugs"]`).
`keywordSearchLanguages`	array	ISO language codes each keyword is searched in (default: `["en"]`). Runs the full keyword × language cross-product.
`keepOriginalReviewText`	boolean	ON (default) = guest's real wording/language, no machine translation. OFF = TripAdvisor machine-translates every review to English.
`maxComments`	integer (1–10,000)	Reviews to return per hotel (or unique keyword-matching reviews, when keyword search is on). Default `10`.
`sortOrder`	enum	`newest` (default) / `oldest` / `relevant` / `rating` — order for the plain feed; keyword-search results are ordered by which sweep found them.
`reviewsLanguages`	enum	Display-language filter for the plain feed (default `English`; `ALL_REVIEW_LANGUAGES` keeps every locale).
`reviewRatings`	enum	`ALL_REVIEW_RATINGS` (default) / `POSITIVE` / `NEGATIVE` / `AVERAGE` / a specific star count 1–5.
`travelerTypes`	array	Keep only `BUSINESS`/`COUPLES`/`FAMILY`/`FRIENDS`/`SOLO` reviews. Empty (default) = keep all.
`scrapeReviewerInfo`	boolean	Include the full reviewer profile per review (default `true`).
`proxyConfiguration`	object	Apify Proxy config. Defaults to Residential — TripAdvisor is protected by DataDome and datacenter IPs get blocked quickly.

Example input

{
  "startUrls": [
    "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html"
  ],
  "reviewKeywords": ["bed bugs", "noisy", "rude"],
  "keywordSearchLanguages": ["en", "es"],
  "keepOriginalReviewText": true,
  "maxComments": 50,
  "travelerTypes": ["FAMILY", "COUPLES"],
  "proxyConfiguration": { "useApifyProxy": true, "apifyProxyGroups": ["RESIDENTIAL"] }
}
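The keyword × language cross-product determines how many search sweeps a run performs. The following is a minimal Python sketch for sizing a run before starting it. It assumes one sweep per keyword/language pair per hotel, which is how the limitations section describes it but is not a documented request count. The helper names `plan_sweeps` and `build_input` are hypothetical and are not part of the actor.

```python
from itertools import product


def plan_sweeps(keywords, languages):
    """Return every (keyword, language) pair a run would search (assumed 1 sweep each)."""
    return list(product(keywords, languages))


def build_input(start_urls, keywords, languages, max_comments=10):
    """Assemble a run input using the field names from the Input table."""
    if not start_urls:
        raise ValueError("startUrls is required")
    if not 1 <= max_comments <= 10_000:
        raise ValueError("maxComments must be between 1 and 10,000")
    return {
        "startUrls": list(start_urls),
        "reviewKeywords": list(keywords),
        "keywordSearchLanguages": list(languages),
        "maxComments": max_comments,
    }


run_input = build_input(
    ["https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html"],
    ["bed bugs", "noisy", "rude"],
    ["en", "es"],
    max_comments=50,
)
sweeps = plan_sweeps(run_input["reviewKeywords"], run_input["keywordSearchLanguages"])
print(len(sweeps))  # 3 keywords x 2 languages = 6 sweeps
```

Clearing `reviewKeywords` yields an empty sweep list, which corresponds to the plain feed with keyword search off.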

Output

Every dataset row is a single review. Base review/hotel fields are always present (whether or not keyword search is used); the keyword-search and analytics fields below are added on top.

Example output row

{
  "id": "1040504451",
  "url": "https://www.tripadvisor.com/ShowUserReviews-g60763-d208453-r1040504451-Hilton_New_York_Times_Square-New_York_City_New_York.html",
  "title": "Perfect Holidays at Hilton Times Square !!!!!",
  "lang": "en",
  "locationId": "208453",
  "publishedDate": "2025-11-27",
  "publishedPlatform": "OTHER",
  "rating": 5,
  "helpfulVotes": 0,
  "text": "We were for 2 weeks holidays in New York!! ...",
  "roomTip": null,
  "travelDate": "2025-11",
  "tripType": "FAMILY",
  "user": {
    "userId": "4381D233A5C57ADAF67693B272BEFE70",
    "name": "Dimitris T",
    "contributions": { "totalContributions": 2, "helpfulVotes": 0 },
    "username": "margaretmN8866NJ",
    "userLocation": "Thessaloniki, Greece",
    "avatar": "https://dynamic-media-cdn.tripadvisor.com/media/photo-o/1a/f6/de/5a/default-avatar-2020-36.jpg?w=100&h=100&s=1",
    "link": "www.tripadvisor.com/Profile/margaretmN8866NJ"
  },
  "ownerResponse": null,
  "subratings": [
    { "name": "Value", "value": 5 },
    { "name": "Rooms", "value": 5 },
    { "name": "Location", "value": 5 },
    { "name": "Cleanliness", "value": 5 },
    { "name": "Service", "value": 5 },
    { "name": "Sleep Quality", "value": 5 }
  ],
  "photos": [],
  "placeInfo": {
    "id": "208453",
    "name": "Hilton New York Times Square",
    "rating": 4.3,
    "numberOfReviews": 7879,
    "locationString": "New York City, New York",
    "latitude": 40.75665,
    "longitude": -73.988815,
    "webUrl": "https://www.tripadvisor.com/Hotel_Review-g60763-d208453-Reviews-Hilton_New_York_Times_Square-New_York_City_New_York.html",
    "website": "https://www.hilton.com/en/hotels/nyctshh-hilton-times-square/",
    "address": "234 West 42nd Street, New York City, NY 10036",
    "addressObj": { "street1": "234 West 42nd Street", "street2": "", "city": "New York City", "state": "NY", "country": "United States", "postalcode": "10036" },
    "ratingHistogram": { "count1": 267, "count2": 290, "count3": 704, "count4": 2568, "count5": 5064 }
  },
  "originalLanguage": "en",
  "translationType": null,
  "isMachineTranslated": false,
  "matchedKeywords": ["bed bugs"],
  "matchedKeywordLanguage": "en",
  "keywordMatchCounts": { "bed bugs": { "en": "20" } },
  "reviewsInSearchLanguage": { "en": 7970 },
  "matchedTravelerTypeFilter": true,
  "placeSubratings": { "location": 4.8, "rooms": 4.5, "value": 4.1, "cleanliness": 4.6, "service": 4.4, "sleepQuality": 4.6 },
  "placeRanking": { "position": 285, "outOf": 525, "category": "hotels", "area": "New York City" }
}

Output fields

Base review & hotel fields (present on every row, keyword search on or off): id, url, title, lang, locationId, publishedDate, publishedPlatform, rating, helpfulVotes, text, roomTip, travelDate, tripType, user (profile object, null if scrapeReviewerInfo is off), ownerResponse (management reply, null if none), subratings (per-review category scores), photos, placeInfo (hotel name/address/geo/rating/rating histogram).

Keyword search & analytics fields (new in this variant, added alongside the base fields):

Field	Description
`matchedKeywords`	Which keyword(s) from `reviewKeywords` produced this row. Empty array when keyword search is off.
`matchedKeywordLanguage`	Which `keywordSearchLanguages` value the matching sweep used.
`keywordMatchCounts`	TripAdvisor's own per-keyword, per-language match count for the run, e.g. `{"bed bugs": {"en": "20"}}`. Reported as the string `">=1000"` once a count hits the endpoint's reporting ceiling.
`reviewsInSearchLanguage`	Approximate pool size of reviews available in each requested search language — an availability figure, not a precise market-size count.
`originalLanguage`	The language the review was actually written in (before any translation).
`translationType`	`"MACHINE"` if TripAdvisor machine-translated the review, otherwise `null`.
`isMachineTranslated`	Convenience boolean, `true` when `translationType == "MACHINE"`.
`matchedTravelerTypeFilter`	Whether the row matches the `travelerTypes` filter. Always `true` when the filter is left empty.
`placeSubratings`	The hotel's own property-level aspect scorecard: `location`, `rooms`, `value`, `cleanliness`, `service`, `sleepQuality`.
`placeRanking`	The hotel's current live ranking: `position`, `outOf`, `category`, `area`.
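Because `keywordMatchCounts` reports counts as strings, and as `">=1000"` at the ceiling, downstream code needs to parse them before comparing. A minimal sketch, assuming the two string shapes shown in this document are the only ones that occur; the function names are hypothetical, and the `"noisy"` entry in the sample is made up for illustration.

```python
def parse_match_count(raw):
    """Split a reported count into (lower_bound, is_capped)."""
    text = str(raw).strip()
    if text.startswith(">="):
        # Ceiling reached: the number is only a lower bound.
        return int(text[2:]), True
    return int(text), False


def flatten_match_counts(keyword_match_counts):
    """Turn the nested keyword -> language -> count object into flat rows."""
    rows = []
    for keyword, by_language in keyword_match_counts.items():
        for language, raw in by_language.items():
            count, capped = parse_match_count(raw)
            rows.append(
                {"keyword": keyword, "language": language, "count": count, "capped": capped}
            )
    return rows


# "bed bugs" comes from the example output row; "noisy" is an invented sample.
sample = {"bed bugs": {"en": "20"}, "noisy": {"en": ">=1000"}}
for row in flatten_match_counts(sample):
    print(row)
```

Keeping the `capped` flag separate avoids treating a lower bound of 1000 as an exact count when ranking concerns.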

Known limitations & honest caveats

placeSubratings/placeRanking are parsed from the hotel page's minified CSS class structure, not from a stable API. If TripAdvisor redesigns that page, every field in both objects will silently return null. The GraphQL-based review fields are more resilient to change because they re-derive their query ID on every run.
travelerTypes is applied client-side because TripAdvisor's own traveler-segment filter does not currently filter results server-side. With a narrow traveler-type filter and a high maxComments, the actor may need to page well past your requested count before it finds enough matching reviews, which adds run time.
More keywords × more languages = more requests. Each keyword/language pair is its own search sweep against TripAdvisor; a long reviewKeywords list combined with several keywordSearchLanguages will materially increase run duration and usage compared to the plain feed.
keywordMatchCounts caps out at ">=1000" for very common words — the endpoint TripAdvisor exposes does not report an exact count above that ceiling.
publishedPlatform currently reports the constant "OTHER", and user.contributions.helpfulVotes currently reports 0 on every row — these come directly from TripAdvisor's own response fields being empty/absent for this endpoint, not from this actor. They're preserved as-is for compatibility with the plain review feed.
Avoiding bot detection is not guaranteed, even on Apify's own proxy pool. TripAdvisor runs DataDome; a residential proxy is used by default and is strongly recommended, but bot-detection behavior can vary by IP and over time.
oldest/newest sort order reflects TripAdvisor's own sort key, which on a small fraction of reviews can differ slightly from the displayed publishedDate (TripAdvisor's internal date of stay and publish date can diverge on some rows). Spot-check the order if strict chronology matters for your use case.
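The sort-order caveat can be spot-checked on an exported dataset. A minimal sketch, assuming every row carries `publishedDate` in the `YYYY-MM-DD` form shown in the example output; the function name and the sample dates are hypothetical.

```python
from datetime import date


def out_of_order_rows(rows, newest_first=True):
    """Return indexes of rows whose publishedDate breaks the expected order."""
    flagged = []
    previous = None
    for index, row in enumerate(rows):
        current = date.fromisoformat(row["publishedDate"])
        if previous is not None:
            # For a newest-first feed, a later date after an earlier one is a break.
            breaks = current > previous if newest_first else current < previous
            if breaks:
                flagged.append(index)
        previous = current
    return flagged


# Invented sample rows; only publishedDate is needed for the check.
rows = [
    {"publishedDate": "2025-11-27"},
    {"publishedDate": "2025-11-20"},
    {"publishedDate": "2025-11-25"},
    {"publishedDate": "2025-11-01"},
]
print(out_of_order_rows(rows))  # [2]
```

If the flagged list is non-empty and strict chronology matters, re-sorting the export by `publishedDate` locally is the simplest remedy.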

Use cases

Reputation & complaint monitoring — find out fast whether "bed bugs", "mold", "rude staff", or any concern you define shows up in a property's reviews, and how often, without reading every review yourself.
Hotel due diligence / acquisition research — pull a property's aspect scorecard, live ranking, and rating histogram alongside the raw review text.
Multi-language brand monitoring — search the same keyword across several languages to catch complaints that an English-only sweep would miss.
Competitor benchmarking — compare match counts and aspect scores for the same concern across several hotels.
Sentiment/NLP pipelines — export original (non-machine-translated) review text and language flags into a downstream analysis tool.
Segment-specific research — filter to Business, Family, Couples, Friends, or Solo travelers before analyzing complaint patterns.
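For competitor benchmarking, the rows from a multi-hotel run can be tallied per property. A minimal sketch, assuming the row shape from the example output (`placeInfo.name` and `matchedKeywords`); the function name and the sample hotels are hypothetical. Note that this counts the rows actually returned, which `maxComments` limits, so it is not the same figure as TripAdvisor's own totals in `keywordMatchCounts`.

```python
from collections import Counter, defaultdict


def keyword_hits_by_hotel(rows):
    """Count returned rows per hotel and matched keyword."""
    hits = defaultdict(Counter)
    for row in rows:
        hotel = row["placeInfo"]["name"]
        # matchedKeywords is an empty array when keyword search is off.
        for keyword in row.get("matchedKeywords") or []:
            hits[hotel][keyword] += 1
    return {hotel: dict(counter) for hotel, counter in hits.items()}


# Invented sample rows trimmed to the fields this tally reads.
rows = [
    {"placeInfo": {"name": "Hotel A"}, "matchedKeywords": ["noisy"]},
    {"placeInfo": {"name": "Hotel A"}, "matchedKeywords": ["noisy"]},
    {"placeInfo": {"name": "Hotel B"}, "matchedKeywords": ["rude"]},
    {"placeInfo": {"name": "Hotel B"}, "matchedKeywords": []},
]
print(keyword_hits_by_hotel(rows))
```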

Is it legal to scrape TripAdvisor reviews?

This actor accesses only publicly available TripAdvisor pages and endpoints — the same review data any visitor to the site can see. Use the collected data responsibly: for research, analytics, and reputation-monitoring purposes, respecting TripAdvisor's terms of service and applicable data-protection regulations (GDPR, CCPA) for any personal data in reviewer profiles.

FAQ

Does this replace the plain TripAdvisor review scraper? Yes — every base input and output field (URLs, review count, sort order, ratings, reviewer profiles, hotel info) still works exactly as before. Leave reviewKeywords empty to get the plain newest/oldest/relevant/rating feed with no keyword search involved.

Does keyword search grep the reviews locally, or search TripAdvisor's own index? It sends each keyword to TripAdvisor's own review-search index and reports TripAdvisor's own match count; it is not a local text search over already-downloaded reviews. Matching is stemmed/token-based (e.g. "bed bugs" can also match "bug bites"), which is why matchedKeywords tells you which sweep produced each row rather than asserting a literal substring match.

What happens if I turn off machine translation? text becomes the guest's original wording and lang reflects the language they actually wrote in, instead of TripAdvisor's default English machine-translation. originalLanguage, translationType, and isMachineTranslated are populated either way, so you always know whether a row was translated.
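Since the translation flags are present on every row, a downstream pipeline can separate original text from machine-translated text before analysis. A minimal sketch, assuming the `isMachineTranslated` and `originalLanguage` fields behave as described in the output table; the function names and sample rows are hypothetical.

```python
from collections import Counter


def split_by_translation(rows):
    """Return (original_rows, machine_translated_rows)."""
    originals, translated = [], []
    for row in rows:
        target = translated if row.get("isMachineTranslated") else originals
        target.append(row)
    return originals, translated


def language_breakdown(rows):
    """Count rows by the language each review was originally written in."""
    return dict(Counter(row.get("originalLanguage") for row in rows))


# Invented sample rows trimmed to the translation-related fields.
rows = [
    {"originalLanguage": "en", "translationType": None, "isMachineTranslated": False},
    {"originalLanguage": "es", "translationType": "MACHINE", "isMachineTranslated": True},
    {"originalLanguage": "es", "translationType": None, "isMachineTranslated": False},
]
originals, translated = split_by_translation(rows)
print(len(originals), len(translated))  # 2 1
print(language_breakdown(rows))
```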

What export formats are supported? JSON, CSV, Excel, XML, and HTML — Apify's standard dataset export formats.

Do I need a proxy or login? No login is required. A residential proxy is used by default and strongly recommended — TripAdvisor blocks datacenter IPs quickly.
