# Expedia Reviews Scraper (`shahidirfan/expedia-reviews-scraper`) Actor

Scrape hotel reviews, ratings & guest feedback from Expedia at scale. Extract review metadata for competitive analysis, sentiment tracking, and market research. Reliable data extraction with structured output.

- **URL**: https://apify.com/shahidirfan/expedia-reviews-scraper.md
- **Developed by:** [Shahid Irfan](https://apify.com/shahidirfan) (community)
- **Categories:** Travel, Automation, Developer tools
- **Stats:** 1 total users, 0 monthly users, 0.0% runs succeeded, NaN bookmarks
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor itself is free to use; you pay only for Apify platform usage, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see the Apify documentation as a [Markdown index](https://docs.apify.com/llms.txt) or as [Markdown full text](https://docs.apify.com/llms-full.txt).


# README

## Expedia Reviews Scraper

Extract guest reviews from Expedia hotel pages through Expedia's review GraphQL endpoints and build clean, analysis-ready datasets for research, monitoring, and reporting.

Collect review text, rating signals, traveler context, dates, and source metadata in a consistent output format designed for automation workflows.

---

### Features

- **API-based collection** — Calls Expedia's persisted review GraphQL operations directly instead of relying on browser rendering or HTML parsing.
- **Clean dataset output** — Removes empty fields so records contain only meaningful values.
- **Review metadata coverage** — Captures rating, title, review text, dates, traveler profile, helpful-vote signals, management replies, sentiments, and guest-photo URLs when present.
- **Duplicate protection** — Skips reviews that were already captured earlier in the same run.
- **Production-ready defaults** — Includes QA-friendly input defaults and proxy-ready configuration.
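
The clean-output and duplicate-protection features above can be pictured with a small sketch. This is not the Actor's source code; it assumes that "empty" means `None`, empty strings, and empty containers, and that duplicates are identified by `review_id`. The helper names are hypothetical.

```python
from typing import Any, Iterable, Iterator


def drop_empty_fields(record: dict[str, Any]) -> dict[str, Any]:
    """Return a copy without None, empty strings, or empty containers (0 is kept)."""
    return {key: value for key, value in record.items() if value not in (None, "", [], {})}


def dedupe_by_review_id(records: Iterable[dict[str, Any]]) -> Iterator[dict[str, Any]]:
    """Yield each record once, keyed on review_id (assumed to be the unique key)."""
    seen: set[str] = set()
    for record in records:
        review_id = record.get("review_id")
        if review_id is not None:
            if review_id in seen:
                continue  # already emitted earlier in this run
            seen.add(review_id)
        yield record
```

Applying `drop_empty_fields` to every record yielded by `dedupe_by_review_id` gives items in the shape described in the Output Data section.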

---

### Use Cases

#### Reputation Monitoring
Track guest sentiment over time and quickly detect recurring service issues for specific hotels.
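
As a rough illustration of tracking ratings over time, the sketch below groups dataset items by month. It assumes `published_date` is an ISO `YYYY-MM-DD` string, as in the sample output; the function name is hypothetical.

```python
from collections import defaultdict
from statistics import mean


def monthly_average_rating(items):
    """Map 'YYYY-MM' to the average rating of reviews published in that month."""
    buckets = defaultdict(list)
    for item in items:
        published = item.get("published_date")
        rating = item.get("rating")
        if not published or rating is None:
            continue  # optional fields may be missing from an item
        buckets[published[:7]].append(rating)
    return {month: round(mean(ratings), 2) for month, ratings in sorted(buckets.items())}
```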

#### Hospitality Benchmarking
Compare feedback quality across properties, markets, and traveler profiles to identify competitive gaps.

#### Review Intelligence
Create structured review datasets for dashboards, trend reporting, and qualitative analysis projects.

#### Operations Improvement
Use common positive and negative themes to prioritize property-level service improvements.
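
One possible way to surface such themes is to count `review_sentiments` labels separately for high and low ratings. The threshold of 7 is an assumption based on the sample output's rating of `8` suggesting a 10-point scale; adjust it to your data.

```python
from collections import Counter


def themes_by_polarity(items, threshold=7):
    """Count theme labels for reviews rated at/above vs. below the threshold."""
    result = {"positive": Counter(), "negative": Counter()}
    for item in items:
        rating = item.get("rating")
        if rating is None:
            continue
        bucket = "positive" if rating >= threshold else "negative"
        result[bucket].update(item.get("review_sentiments", []))
    return result
```

Calling `themes_by_polarity(items)["negative"].most_common(5)` then lists the five most frequent themes among low-rated reviews.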

---

### Input Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `startUrl` | String | Yes | `https://www.expedia.com/London-Hotels-London-Heathrow-Marriott-Hotel.h438504.Hotel-Information` | Expedia hotel page URL to extract reviews from |
| `hotelId` | String | No | None | Expedia property ID entered directly (digits only), for example `438504` |
| `results_wanted` | Integer | No | `20` | Maximum number of review records to save |
| `max_pages` | Integer | No | `8` | Maximum Expedia review API pages to request |
| `proxyConfiguration` | Object | No | `{"useApifyProxy": false}` | Proxy settings; a residential proxy is recommended for reliable runs |
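
A minimal sketch of building the input in Python before starting a run. The defaults and the minimum of `1` for both numeric fields come from the input schema; the `build_input` helper itself is hypothetical.

```python
def build_input(start_url, results_wanted=20, max_pages=8):
    """Build an Actor input dict, rejecting values below the schema minimum of 1."""
    if results_wanted < 1 or max_pages < 1:
        raise ValueError("results_wanted and max_pages must both be at least 1")
    return {
        "startUrl": start_url,
        "results_wanted": results_wanted,
        "max_pages": max_pages,
    }
```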

---

### Output Data

Each dataset item may contain:

| Field | Type | Description |
|-------|------|-------------|
| `hotel_id` | String | Expedia hotel identifier parsed from URL |
| `hotel_name` | String | Hotel name parsed from URL |
| `review_id` | String | Review identifier |
| `rating` | Number | Rating value |
| `title` | String | Review headline |
| `review_text` | String | Main review content |
| `published_date` | String | Date the review was published |
| `stay_date` | String | Travel or stay date |
| `traveler_type` | String | Traveler profile type |
| `traveler_name` | String | Reviewer display name |
| `traveler_location` | String | Reviewer location |
| `language` | String | Language code or label |
| `helpful_votes` | Number | Helpful vote count |
| `review_sentiments` | Array | Expedia review theme labels when present |
| `guest_photos` | Array | Guest-submitted photo URLs when available |
| `management_response_title` | String | Management reply heading |
| `response_text` | String | Management response text |
| `property_url` | String | Input property page URL |
| `source_url` | String | Expedia GraphQL endpoint used for extraction |
| `scraped_at` | String | ISO timestamp when item was saved |
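
For typed downstream code, the table above could be mirrored as a `TypedDict` with every key optional, since empty fields are omitted. This is a sketch, not a schema published by the Actor; the `parse_scraped_at` helper is hypothetical and assumes the `Z`-suffixed ISO format shown in the sample output.

```python
from datetime import datetime
from typing import List, TypedDict


class ReviewItem(TypedDict, total=False):
    """Dataset item shape; total=False because any field may be absent."""
    hotel_id: str
    hotel_name: str
    review_id: str
    rating: float
    title: str
    review_text: str
    published_date: str
    stay_date: str
    traveler_type: str
    traveler_name: str
    traveler_location: str
    language: str
    helpful_votes: int
    review_sentiments: List[str]
    guest_photos: List[str]
    management_response_title: str
    response_text: str
    property_url: str
    source_url: str
    scraped_at: str


def parse_scraped_at(item: ReviewItem) -> datetime:
    """Parse scraped_at into a timezone-aware datetime (assumes a trailing 'Z')."""
    return datetime.fromisoformat(item["scraped_at"].replace("Z", "+00:00"))
```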

---

### Usage Examples

#### Basic Run

```json
{
	"startUrl": "https://www.expedia.com/London-Hotels-London-Heathrow-Marriott-Hotel.h438504.Hotel-Information",
	"results_wanted": 20
}
````

#### Higher Volume Collection

```json
{
	"startUrl": "https://www.expedia.com/London-Hotels-London-Heathrow-Marriott-Hotel.h438504.Hotel-Information",
	"results_wanted": 120,
	"max_pages": 15
}
```

### Sample Output

```json
{
	"hotel_id": "438504",
	"hotel_name": "London Heathrow Marriott Hotel",
	"review_id": "5f2f4d86-6a2d-492a-a0d6-6f945db2f17b",
	"rating": 8,
	"title": "Comfortable stay near Heathrow",
	"review_text": "Clean room, friendly staff, and fast airport connections. Breakfast options were strong.",
	"published_date": "2026-03-14",
	"stay_date": "2026-03",
	"traveler_type": "Family",
	"traveler_name": "Verified traveler",
	"traveler_location": "Manchester",
	"language": "en",
	"helpful_votes": 3,
	"review_sentiments": ["Airport access", "Helpful staff"],
	"guest_photos": ["https://images.trvl-media.com/...jpg"],
	"management_response_title": "Response from Hotel Management",
	"response_text": "Thank you for your feedback and for staying with us.",
	"property_url": "https://www.expedia.com/London-Hotels-London-Heathrow-Marriott-Hotel.h438504.Hotel-Information",
	"source_url": "https://www.expedia.com/graphql",
	"scraped_at": "2026-04-05T11:36:24.511Z"
}
```

***

### Tips For Best Results

#### Use Stable Hotel URLs

- Prefer direct hotel page URLs with hotel identifiers.
- Confirm the page has visible guest reviews before large runs.
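
To check up front that a URL carries a hotel identifier, a pattern like the one below may help. It assumes identifiers appear as `.h<digits>.Hotel-Information`, as in the example URL; the Actor's own parsing may differ.

```python
import re
from typing import Optional

# Assumed URL shape, taken from the example: "...Hotel.h438504.Hotel-Information"
HOTEL_ID_PATTERN = re.compile(r"\.h(\d+)\.Hotel-Information")


def extract_hotel_id(url: str) -> Optional[str]:
    """Return the numeric hotel ID embedded in the URL, or None if absent."""
    match = HOTEL_ID_PATTERN.search(url)
    return match.group(1) if match else None
```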

#### Start Small, Then Scale

- Begin with `results_wanted: 20` to validate output quality.
- Increase volume after confirming your target property works as expected.

#### Use Residential Proxy

- Residential proxy improves reliability for protected travel pages.
- Keep retry pressure low when running very high-volume workloads.
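
A sketch of enabling residential proxy in the run input. The `useApifyProxy` and `apifyProxyGroups` keys follow the standard Apify proxy input format; whether the `RESIDENTIAL` group is available to your account is an assumption to verify.

```python
def with_residential_proxy(run_input):
    """Return a copy of the input with Apify residential proxy enabled."""
    updated = dict(run_input)  # shallow copy so the caller's dict is untouched
    updated["proxyConfiguration"] = {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"],
    }
    return updated
```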

#### API Coverage Notes

- The Actor combines Expedia's property review summary query with the paginated review overlay query.
- Optional fields vary by review, and the dataset omits empty values instead of filling them with nulls.

***

### Integrations

- **Google Sheets** — Move structured reviews into live analysis sheets.
- **Looker Studio / BI tools** — Visualize rating trends and traveler segments.
- **Airtable** — Build searchable review intelligence workspaces.
- **Webhooks** — Trigger downstream processing as soon as each run completes.
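
For the webhook integration, a receiving service needs the dataset ID of the finished run. The sketch below assumes Apify's default webhook payload template (`eventType` plus a `resource` object holding the run); a custom payload template would need different keys.

```python
from typing import Optional


def dataset_id_from_webhook(payload: dict) -> Optional[str]:
    """Return the run's default dataset ID for successful runs, else None."""
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    return payload.get("resource", {}).get("defaultDatasetId")
```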

#### Export Formats

- **JSON** — Best for APIs and engineering workflows.
- **CSV** — Best for spreadsheet and analyst workflows.
- **Excel** — Best for business-ready reporting.
- **XML** — Best for legacy integrations.
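
These formats can also be requested from the Apify dataset items endpoint through its `format` query parameter. A small sketch that builds the download URL; the helper name is hypothetical, and authentication (for example an `Authorization: Bearer <YOUR_API_TOKEN>` header) is left to the caller.

```python
from urllib.parse import urlencode

# Maps the format names listed above to the API's `format` values.
EXPORT_FORMATS = {"json": "json", "csv": "csv", "excel": "xlsx", "xml": "xml"}


def dataset_export_url(dataset_id, export_format="json"):
    """Build the dataset items URL for the requested export format."""
    key = export_format.lower()
    if key not in EXPORT_FORMATS:
        raise ValueError(f"Unsupported export format: {export_format}")
    query = urlencode({"format": EXPORT_FORMATS[key]})
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"
```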

***

### Frequently Asked Questions

#### How many reviews can I collect?

Collection volume depends on the property and available review depth. Increase `max_pages` and `results_wanted` for deeper collection.
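
The two limits interact: a run stops at whichever is reached first. The sketch below models that behaviour under assumptions; `fetch_page` is a hypothetical callable standing in for one review API request, and the real page size is not documented here.

```python
def collect_reviews(fetch_page, results_wanted=20, max_pages=8):
    """Request pages until a limit is hit or a page comes back empty."""
    collected = []
    for page_index in range(max_pages):
        page = fetch_page(page_index)
        if not page:
            break  # the property has no more reviews
        collected.extend(page)
        if len(collected) >= results_wanted:
            break  # enough reviews saved
    return collected[:results_wanted]
```

Under this model, raising `results_wanted` alone has no effect once `max_pages` is exhausted, which is why both values usually need to go up together.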

#### Why is my dataset empty?

The page may be protected or may not serve review payloads in that run context. Use a residential proxy and verify the hotel URL.

#### Does every item include all fields?

No. The Actor saves only non-empty values, so fields vary by the source review payload.
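
Because keys vary between items, tabular exports need a fixed column list. A minimal sketch using the standard library, where missing fields become empty cells; the function name and the chosen columns are illustrative.

```python
import csv
import io


def items_to_csv(items, columns):
    """Serialize items to CSV text, filling absent fields with empty strings."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="", extrasaction="ignore")
    writer.writeheader()
    for item in items:
        writer.writerow(item)
    return buffer.getvalue()
```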

#### Can I run this on multiple hotels?

Yes. Schedule separate runs per URL or orchestrate multi-run workflows with your preferred automation tool.
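
A sequential multi-hotel workflow might look like the sketch below. It reuses the `actor(...).call(run_input=...)` and `dataset(...).iterate_items()` calls shown in the Python example of the API section; the `scrape_hotels` helper is hypothetical, and error handling and concurrency are left out.

```python
ACTOR_ID = "shahidirfan/expedia-reviews-scraper"


def scrape_hotels(client, urls, results_wanted=20):
    """Run the Actor once per hotel URL and return the items keyed by URL."""
    results = {}
    for url in urls:
        run_input = {"startUrl": url, "results_wanted": results_wanted}
        run = client.actor(ACTOR_ID).call(run_input=run_input)
        if not run:
            results[url] = []  # the client returned no run information
            continue
        dataset = client.dataset(run["defaultDatasetId"])
        results[url] = list(dataset.iterate_items())
    return results
```

Pass an `ApifyClient("<YOUR_API_TOKEN>")` instance from the `apify_client` package as `client`.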

#### Is duplicate handling included?

Yes. The Actor applies duplicate protection across captured review records in each run.

#### Does this use a browser?

No. Review extraction is done with direct HTTP calls to Expedia's review GraphQL API.

***

### Support

For issues or feature requests, open a support thread in Apify Console.

#### Resources

- [Apify Documentation](https://docs.apify.com/)
- [Apify API Reference](https://docs.apify.com/api/v2)
- [Apify Scheduling](https://docs.apify.com/platform/schedules)

***

### Legal Notice

This Actor is designed for legitimate data collection and analysis workflows. You are responsible for compliance with website terms, local regulations, and responsible data usage practices.

# Actor input Schema

## `startUrl` (type: `string`):

Hotel page URL (Hotel-Information page recommended).

## `hotelId` (type: `string`):

Optional. Enter property ID directly (digits only), for example: 438504.

## `results_wanted` (type: `integer`):

Maximum number of review items to save.

## `max_pages` (type: `integer`):

Maximum number of Expedia review API pages to request.

## `proxyConfiguration` (type: `object`):

Residential proxy is strongly recommended for reliable extraction.

## Actor input object example

```json
{
  "startUrl": "https://www.expedia.com/London-Hotels-London-Heathrow-Marriott-Hotel.h438504.Hotel-Information",
  "results_wanted": 20,
  "max_pages": 8,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}
```

# Actor output Schema

## `overview` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrl": "https://www.expedia.com/London-Hotels-London-Heathrow-Marriott-Hotel.h438504.Hotel-Information",
    "results_wanted": 20,
    "max_pages": 8
};

// Run the Actor and wait for it to finish
const run = await client.actor("shahidirfan/expedia-reviews-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrl": "https://www.expedia.com/London-Hotels-London-Heathrow-Marriott-Hotel.h438504.Hotel-Information",
    "results_wanted": 20,
    "max_pages": 8,
}

# Run the Actor and wait for it to finish
run = client.actor("shahidirfan/expedia-reviews-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrl": "https://www.expedia.com/London-Hotels-London-Heathrow-Marriott-Hotel.h438504.Hotel-Information",
  "results_wanted": 20,
  "max_pages": 8
}' |
apify call shahidirfan/expedia-reviews-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=shahidirfan/expedia-reviews-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Expedia Reviews Scraper",
        "description": "Scrape hotel reviews, ratings & guest feedback from Expedia at scale. Extract review metadata for competitive analysis, sentiment tracking, and market research. Reliable data extraction with structured output.",
        "version": "1.0",
        "x-build-id": "4JPCHNxyY3G4jvgIL"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/shahidirfan~expedia-reviews-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-shahidirfan-expedia-reviews-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/shahidirfan~expedia-reviews-scraper/runs": {
            "post": {
                "operationId": "runs-sync-shahidirfan-expedia-reviews-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/shahidirfan~expedia-reviews-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-shahidirfan-expedia-reviews-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "startUrl": {
                        "title": "Expedia hotel URL",
                        "type": "string",
                        "description": "Hotel page URL (Hotel-Information page recommended)."
                    },
                    "hotelId": {
                        "title": "Expedia hotel ID",
                        "type": "string",
                        "description": "Optional. Enter property ID directly (digits only), for example: 438504."
                    },
                    "results_wanted": {
                        "title": "Maximum reviews",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Maximum number of review items to save.",
                        "default": 20
                    },
                    "max_pages": {
                        "title": "Maximum review pages",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Maximum number of Expedia review API pages to request.",
                        "default": 8
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Residential proxy is strongly recommended for reliable extraction.",
                        "default": {
                            "useApifyProxy": false
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
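
For projects without the Apify client, the `run-sync-get-dataset-items` path from the specification above can be called with the Python standard library. This sketch only builds the request; the token is passed as the `token` query parameter, as the specification describes, and the helper name is hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

API_BASE = "https://api.apify.com/v2"
RUN_SYNC_PATH = "/acts/shahidirfan~expedia-reviews-scraper/run-sync-get-dataset-items"


def build_run_request(token, run_input):
    """Build a POST request that runs the Actor and returns its dataset items."""
    url = f"{API_BASE}{RUN_SYNC_PATH}?{urlencode({'token': token})}"
    body = json.dumps(run_input).encode("utf-8")
    return Request(url, data=body, headers={"Content-Type": "application/json"}, method="POST")
```

Sending the request with `urllib.request.urlopen(build_run_request(token, run_input))` should block until the run finishes and return the dataset items in the response body.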
