Tripadvisor Reviews Scraper

Pay $2.00 for 1,000 reviews

maxcopell/tripadvisor-reviews

Get and download reviews for chosen places on Tripadvisor. Extract the review text, URL, rating, date of travel, published date, basic reviewer info, owner's response, helpful votes, images, review language, place details. Download reviews in XML, JSON, CSV.

not getting output data for runs

Closed

xperium opened this issue
4 months ago

some actor run ids - eDpEF2RygBc7gPQ6z ltctjmRqI6gN6sUh8 QIgHOaEMAbc0o9pZr

these runs didn't give any results, there are many such runs. Please look into this at the earliest.

lukas.prusa

Hi Xperium, thanks for opening this issue and providing the relevant run IDs!

The output UI is currently being reworked, but the quick TL;DR is: just click on “All fields”, or better yet, view the output in JSON form.

You will find a new “error” field, which is not a part of the standard output. We output this field whenever a user input is incorrect and would otherwise lead to 0 results. In your case for example, you've provided a hotel which didn't have any relevant reviews for your lastReviewDate of 2024-07-17 at the time of the scrape.

The error result from your run looks like this:

[
  {
    "error": "No relevant reviews found for the provided location. Please check your review filtering options.",
    "input": "https://www.tripadvisor.in/Hotel_Review-g1945483-d1111763-Reviews-Heritage_Village_Resort_Spa_Manesar-Manesar_Gurgaon_District_Haryana.html",
    "placeInfo": {
      "id": "1111763",
      "name": "Heritage Village Resorts & Spa, Manesar",
      "rating": 4.5,
      "numberOfReviews": 2913,
      "locationString": "Gurugram (Gurgaon), Gurgaon District, Haryana",
      "latitude": 28.366858,
      "longitude": 76.93996,
      "webUrl": "https://www.tripadvisor.com/Hotel_Review-g297615-d1111763-Reviews-Heritage_Village_Resorts_Spa_Manesar-Gurugram_Gurgaon_Gurgaon_District_Haryana.html",
      "website": "https://www.heritagevillageresorts.com/heritage-village-resort-spa-manesar/",
      "address": "Naharpur Road Nh-8, Gurugram (Gurgaon) 122050 ... [trimmed]
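
When post-processing a run's dataset, items like the one above can be separated from real reviews by checking for the "error" key. This is only a minimal sketch, assuming the dataset was downloaded as a JSON list; the function name `split_error_items` and the second sample item (including its field names) are made up for illustration.

```python
import json

def split_error_items(items):
    """Separate items carrying an "error" field from ordinary review items."""
    errors = [item for item in items if "error" in item]
    reviews = [item for item in items if "error" not in item]
    return errors, reviews

# Sample data: the first item mirrors the error result above (shortened),
# the second is a hypothetical review item used only for illustration.
sample = json.loads("""
[
  {"error": "No relevant reviews found for the provided location. Please check your review filtering options.",
   "input": "https://www.tripadvisor.in/Hotel_Review-g1945483-d1111763-Reviews-Heritage_Village_Resort_Spa_Manesar-Manesar_Gurgaon_District_Haryana.html"},
  {"id": "hypothetical-review-id", "text": "Nice stay", "rating": 5}
]
""")

errors, reviews = split_error_items(sample)
for item in errors:
    # Report which input produced no results, and why.
    print(f'{item["input"]}: {item["error"]}')
```

A run that looks empty in the default output view would then show up as one or more entries in `errors` rather than as missing data.
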
lukas.prusa

We have some issues with users not being able to close issues, so I will leave this open just in case. Please close the issue if everything has been resolved for you :) Thanks!

xperium

4 months ago

Hi Lukas, thanks for the prompt reply. This is the URL: https://www.tripadvisor.in/Hotel_Review-g1945483-d1111763-Reviews-Heritage_Village_Resort_Spa_Manesar-Manesar_Gurgaon_District_Haryana.html. The last review date is 17/07/2024, and I can see there are reviews on the 17th, so why didn't we get any results?

lukas.prusa

Hmm, that's weird, I see the problem now, but it looks like a problem on TripAdvisor's side. Running the scraper today, it clearly extracts all of the relevant reviews. It's hard to tell now what happened there on the 19th, when you ran the scraper. But there should be nothing preventing the scraper from extracting the reviews, if they are on TripAdvisor.

My guess is that those reviews were somehow added a few days later than when they were actually posted? Maybe they were flagged by TripAdvisor, and they had to manually review them, afterward they approved them and the date stayed as the original one?

I've now manually run the scraper with the last review date for some hotels, and it worked as expected with no issues...

xperium

4 months ago

OK, I will monitor the runs for some time and will ping you here if I find something like this again; otherwise I will close the issue.

xperium

4 months ago

Hey, I have encountered the same issue with other runs: the reviews are there on the website, but I didn't get them in the results.

Actor run IDs: thMaB8sPJHAcBBndB, Ihgre351J6f70WF4g. You can use these two run IDs to debug.

Please see how we can overcome this issue.

lukas.prusa

Hi again, thanks for providing some runs where this issue occurred!

I've queried your profile for more runs for this hotel and found a few very interesting ones:

  • 2024-07-27T01:46:02.084Z (6OIqKHSeMmdp4sred): 3 reviews from 25th, no 26th nor 27th
  • 2024-07-27T13:45:59.870Z (7wk11Y09Flcic8V2D): 1x 26th, 1x 27th
  • 2024-07-28T01:45:58.419Z (OOkSqSgFKKNNhNFUJ): 2x 26th, 1x 27th

Clearly, between 2024-07-27T01:46:02.084Z and 2024-07-27T13:45:59.870Z, a review dated the 26th was added retroactively. Between 2024-07-27T13:45:59.870Z and 2024-07-28T01:45:58.419Z, another review dated the 26th was added the same way. On top of that, the second 26th review is placed further down on the website than the first one, meaning that it is treated as “older”.
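
The comparison between two consecutive runs can be sketched roughly as follows: a review is suspicious if it shows up only in the later run but is dated before the day of the earlier run. This is just an illustration under assumptions: the field names `id` and `publishedDate`, the review IDs, and the function name `late_appearing_reviews` are all hypothetical, and the sample data only mimics the first two runs listed above.

```python
from datetime import date, datetime

def late_appearing_reviews(earlier_run_at, earlier_reviews, later_reviews):
    """Return reviews seen only in the later run but dated before the earlier run's day."""
    seen_ids = {review["id"] for review in earlier_reviews}
    cutoff = earlier_run_at.date()
    return [
        review
        for review in later_reviews
        if review["id"] not in seen_ids
        and date.fromisoformat(review["publishedDate"]) < cutoff
    ]

# Earlier run (2024-07-27, 01:46 UTC): three reviews from the 25th.
run_a_at = datetime(2024, 7, 27, 1, 46, 2)
run_a = [
    {"id": "a1", "publishedDate": "2024-07-25"},
    {"id": "a2", "publishedDate": "2024-07-25"},
    {"id": "a3", "publishedDate": "2024-07-25"},
]
# Later run (2024-07-27, 13:45 UTC): one review from the 26th, one from the 27th.
run_b = [
    {"id": "b1", "publishedDate": "2024-07-26"},
    {"id": "b2", "publishedDate": "2024-07-27"},
]

late = late_appearing_reviews(run_a_at, run_a, run_b)
print([review["id"] for review in late])
```

Here only the review dated the 26th is flagged, since it should already have been visible when the earlier run started on the 27th.
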

I'm still convinced that this is a problem on TripAdvisor's side, and that the data we scrape is exactly what is found on the website at that time. If you really want to test this, you can set up a job that simply takes a snapshot of the hotel page at certain times, and later compare it with the scraped data. However, I think you would come to the same conclusion as I did.
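
Such a snapshot job could look roughly like the sketch below, which stores the page HTML in a file named after the UTC capture time. It is not a tested recipe: it assumes the site serves the page to a plain HTTP client (which may well not be the case without a browser or proxy), and the names `fetch_html` and `save_snapshot` are hypothetical.

```python
import urllib.request
from datetime import datetime, timezone
from pathlib import Path

def fetch_html(url):
    """Download a page as text (assumption: a plain HTTP request is not blocked)."""
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")

def save_snapshot(url, out_dir, fetch=fetch_html, now=None):
    """Store the page HTML in a file named after the UTC capture time."""
    now = now or datetime.now(timezone.utc)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"snapshot-{now.strftime('%Y%m%dT%H%M%SZ')}.html"
    path.write_text(fetch(url), encoding="utf-8")
    return path

# Example call, e.g. from a scheduler that runs shortly before each scrape:
# save_snapshot("<hotel page URL>", "snapshots")
```

The `fetch` parameter is injectable so that a different download method can be swapped in if the plain request gets blocked. The saved files can then be checked against the reviews returned by the run closest in time.
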

In fact, I've just tested it myself: creating a new review doesn't immediately add it to the website. This is my pending review: https://www.tripadvisor.com/ShowUserReviews-g297628-d26300342-r962574288-Bloom_Hub_ORR_Marathahalli-Bengaluru_Bangalore_District_Karnataka.html

I hope this helps, thanks!

lukas.prusa

Hi, I'm closing this as I believe this is a TripAdvisor issue and not ours. Please reopen it if you think it's still an issue, thanks!

Developer
Maintained by Apify
Actor metrics
  • 366 monthly users
  • 41 stars
  • 99.4% runs succeeded
  • 2.4 days response time
  • Created in Jan 2023
  • Modified 3 days ago