Tripadvisor Scraper

Pricing

$3.00 / 1,000 results

maxcopell/tripadvisor

Developed by

Maximillian Copelli

Maintained by Apify

This unofficial Tripadvisor API is a data extraction tool able to get data on hotels, restaurants, things to do, vacation rentals, attractions, tours, and public trips. Get pricing, contact details, amenities, awards, ratings, and more. Download your data in Excel, JSON, CSV, and other formats.

4.8 (9)

109

Monthly users

472

Runs succeeded

>99%

Response time

7.2 days

Last modified

23 days ago

Scraping with inputting startUrl of a specific hotel isn't working

Closed
agenthub opened this issue
10 months ago

Say for example:

"""
headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Firefox/102.0',
    'Content-type': 'application/json'
}

params = {
    "includeAttractions": False,
    "includeRestaurants": False,
    "includeHotels": True,
    "includeVacationRentals": False,
    "startUrls": [
        "https://www.tripadvisor.com/Hotel_Review-g297550-d302126-Reviews-Jaz_Makadi_Oasis_Resort-Makadi_Bay_Hurghada_Red_Sea_and_Sinai.html",
    ],
    "language": "en",
    "currency": "USD",
    "maxItems": 1,
    "endPage": 1,
    "extendOutputFunction": "($) => { return {} }",
    "customMapFunction": "(object) => { return {...object} }",
    "proxy": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"]
    }
}
"""

It's getting rejected with:

{'error': {'type': 'invalid-input', 'message': 'Input is not valid: Items in input.startUrls at positions [0] do not contain valid URLs'}}

Even though: "https://www.tripadvisor.com/Hotel_Review-g297550-d302126-Reviews-Jaz_Makadi_Oasis_Resort-Makadi_Bay_Hurghada_Red_Sea_and_Sinai.html"

is indeed a valid URL

lukas.prusa

Hi, thanks for opening this issue!

I think you are mistaking the input for this Actor. The input you provided is in a completely different format from the one this scraper takes.

The error you are seeing is expected: the provided start URL was not in a valid format.

This is the default input for the scraper:

{
    "currency": "USD",
    "includeAiReviewsSummary": false,
    "includeAttractions": true,
    "includeHotels": true,
    "includePriceOffers": false,
    "includeRestaurants": true,
    "includeTags": true,
    "includeVacationRentals": false,
    "language": "en",
    "locationFullName": "Chicago",
    "maxItemsPerQuery": 10
}

I hope this helps, thanks and happy scraping!

lukas.prusa

Sorry for the inconvenience, the automatically generated documentation has a slightly flawed UI there. The input for start URLs is an array of source objects, each with a "url" field, not an array of plain strings.

So in your case, you can simply use:

"startUrls": [
    {
        "url": "https://www.tripadvisor.com/Hotel_Review-g297550-d302126-Reviews-Jaz_Makadi_Oasis_Resort-Makadi_Bay_Hurghada_Red_Sea_and_Sinai.html"
    }
]

Also, you can try out the UI input schema, which nicely formats it for you and makes it easier to edit.

Thanks!
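For anyone building the input in Python, as in the original post, the fix above can be sketched as follows. This is a minimal illustration, not part of the Actor itself: `to_start_urls` is a hypothetical helper, the URL check is a local sanity check rather than the Actor's own validation, and the remaining fields are simply copied from the inputs quoted in this thread.

```python
import json
from urllib.parse import urlparse


def to_start_urls(urls):
    """Wrap plain URL strings as {"url": ...} objects (hypothetical helper).

    Entries that are already dicts with a "url" key are normalized too.
    """
    sources = []
    for entry in urls:
        url = entry["url"] if isinstance(entry, dict) else entry
        parsed = urlparse(url)
        # Local sanity check only; the Actor performs its own validation.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"Not an absolute http(s) URL: {url!r}")
        sources.append({"url": url})
    return sources


HOTEL_URL = (
    "https://www.tripadvisor.com/Hotel_Review-g297550-d302126-Reviews-"
    "Jaz_Makadi_Oasis_Resort-Makadi_Bay_Hurghada_Red_Sea_and_Sinai.html"
)

# Field names below are taken from the inputs quoted in this thread.
run_input = {
    "startUrls": to_start_urls([HOTEL_URL]),
    "includeAttractions": False,
    "includeRestaurants": False,
    "includeHotels": True,
    "includeVacationRentals": False,
    "language": "en",
    "currency": "USD",
}

print(json.dumps(run_input, indent=2))
```

The resulting dictionary can then be sent as the run input, for example with the `apify-client` package via `ApifyClient(token).actor("maxcopell/tripadvisor").call(run_input=run_input)`, assuming a valid API token.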

lukas.prusa

Awesome, I'm glad it helped.

Pricing

Pricing model

Pay per result 

This Actor is paid per result. You are not charged for Apify platform usage, only a fixed price per 1,000 items in the Actor's output dataset.

Price per 1,000 items

$3.00
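
As a rough illustration of the pay-per-result pricing above, the sketch below estimates the cost of a run from the number of items it returns. It assumes charges scale proportionally with the item count at $3.00 per 1,000 items; the exact billing granularity is not stated here, and `estimated_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Price quoted on this page: $3.00 per 1,000 items.
PRICE_PER_1000 = Decimal("3.00")


def estimated_cost(items: int) -> Decimal:
    """Estimate the charge in USD, assuming proportional per-item billing."""
    if items < 0:
        raise ValueError("items must be non-negative")
    return (PRICE_PER_1000 * items / 1000).quantize(Decimal("0.01"))


print(estimated_cost(2500))  # 7.50
print(estimated_cost(10))    # 0.03
```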