Expedia Reviews Scraper
Pricing
Pay per usage
Scrape hotel reviews, ratings & guest feedback from Expedia at scale. Extract review metadata for competitive analysis, sentiment tracking, and market research. Reliable data extraction with structured output.
Developer
Shahid Irfan
Extract guest reviews from Expedia hotel pages through Expedia's review GraphQL endpoints and build clean, analysis-ready datasets for research, monitoring, and reporting.
Collect review text, rating signals, traveler context, dates, and source metadata in a consistent output format designed for automation workflows.
Features
- API-based collection — Calls Expedia's persisted review GraphQL operations directly instead of relying on browser rendering or HTML parsing.
- Clean dataset output — Removes empty fields so records contain only meaningful values.
- Review metadata coverage — Captures rating, title, review text, dates, traveler profile, helpful-vote signals, management replies, sentiments, and guest-photo URLs when present.
- Duplicate protection — Avoids repeated items across repeated review loads.
- Production-ready defaults — Includes QA-friendly input defaults and proxy-ready configuration.
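The "clean dataset output" and "duplicate protection" behaviors listed above can be illustrated with a minimal sketch. This is not the actor's actual code: the helper names `prune_empty` and `dedupe` are hypothetical, and the sketch assumes `review_id` is the field used to recognize repeated reviews.

```python
from typing import Any, Iterable, Iterator


def prune_empty(record: dict[str, Any]) -> dict[str, Any]:
    """Drop keys whose values carry no information (None, "", [], {}).

    Numeric zero is kept, so a real `helpful_votes: 0` survives.
    """
    return {k: v for k, v in record.items() if v not in (None, "", [], {})}


def dedupe(
    records: Iterable[dict[str, Any]], key: str = "review_id"
) -> Iterator[dict[str, Any]]:
    """Yield each record once, keyed on `key` (assumed to be the review ID)."""
    seen: set[Any] = set()
    for record in records:
        identifier = record.get(key)
        if identifier is None:
            # Without an ID there is nothing to compare against; keep the record.
            yield record
            continue
        if identifier in seen:
            continue
        seen.add(identifier)
        yield record


raw = [
    {"review_id": "a", "rating": 8, "title": "", "guest_photos": [], "helpful_votes": 0},
    {"review_id": "a", "rating": 8},  # same review returned by a later page load
    {"review_id": "b", "rating": None, "review_text": "ok"},
]
cleaned = [prune_empty(r) for r in dedupe(raw)]
```

With this approach, downstream code should read optional fields with `record.get(...)` rather than indexing, because pruned keys are absent and not null.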
Use Cases
Reputation Monitoring
Track guest sentiment over time and quickly detect recurring service issues for specific hotels.
Hospitality Benchmarking
Compare feedback quality across properties, markets, and traveler profiles to identify competitive gaps.
Review Intelligence
Create structured review datasets for dashboards, trend reporting, and qualitative analysis projects.
Operations Improvement
Use common positive and negative themes to prioritize property-level service improvements.
Input Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `startUrl` | String | Yes | `https://www.expedia.com/London-Hotels-London-Heathrow-Marriott-Hotel.h438504.Hotel-Information` | Expedia hotel page URL to extract reviews from |
| `results_wanted` | Integer | No | `20` | Maximum number of review records to save |
| `max_pages` | Integer | No | `8` | Maximum Expedia review API pages to request |
| `proxyConfiguration` | Object | No | Residential Apify Proxy | Proxy settings for reliable runs |
Output Data
Each dataset item may contain:
| Field | Type | Description |
|---|---|---|
| `hotel_id` | String | Expedia hotel identifier parsed from URL |
| `hotel_name` | String | Hotel name parsed from URL |
| `review_id` | String | Review identifier |
| `rating` | Number | Rating value |
| `title` | String | Review headline |
| `review_text` | String | Main review content |
| `published_date` | String | Date the review was published |
| `stay_date` | String | Travel or stay date |
| `traveler_type` | String | Traveler profile type |
| `traveler_name` | String | Reviewer display name |
| `traveler_location` | String | Reviewer location |
| `language` | String | Language code or label |
| `helpful_votes` | Number | Helpful vote count |
| `review_sentiments` | Array | Expedia review theme labels when present |
| `guest_photos` | Array | Guest-submitted photo URLs when available |
| `management_response_title` | String | Management reply heading |
| `response_text` | String | Management response text |
| `property_url` | String | Input property page URL |
| `source_url` | String | Expedia GraphQL endpoint used for extraction |
| `scraped_at` | String | ISO timestamp when item was saved |
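The table says `hotel_id` and `hotel_name` are parsed from the URL. How the actor does this is not documented here, so the following is only a sketch of one plausible approach. It assumes URLs follow the shape of the default `startUrl` (`/<City>-Hotels-<Hotel-Name>.h<id>.Hotel-Information`); the function name `parse_hotel_url` is hypothetical.

```python
import re
from urllib.parse import urlparse

# Assumed path shape: /<slug>.h<digits>.Hotel-Information
_HOTEL_PATH = re.compile(r"^/(?P<slug>[^/.]+)\.h(?P<hotel_id>\d+)\.Hotel-Information")


def parse_hotel_url(url: str) -> dict[str, str]:
    """Extract a hotel ID and a readable hotel name from an Expedia hotel URL."""
    match = _HOTEL_PATH.match(urlparse(url).path)
    if match is None:
        raise ValueError(f"Not a recognizable Expedia hotel URL: {url}")
    slug = match.group("slug")
    # Assumption: the slug reads "<City>-Hotels-<Hotel-Name>"; keep the part
    # after "-Hotels-" and fall back to the whole slug if the marker is missing.
    name_part = slug.split("-Hotels-", 1)[-1]
    return {
        "hotel_id": match.group("hotel_id"),
        "hotel_name": name_part.replace("-", " "),
    }


info = parse_hotel_url(
    "https://www.expedia.com/London-Hotels-London-Heathrow-Marriott-Hotel.h438504.Hotel-Information"
)
```

A name rebuilt from a slug loses punctuation and accents, so treat `hotel_name` as a label rather than an exact match key and join datasets on `hotel_id`.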
Usage Examples
Basic Run
{
  "startUrl": "https://www.expedia.com/London-Hotels-London-Heathrow-Marriott-Hotel.h438504.Hotel-Information",
  "results_wanted": 20
}
Higher Volume Collection
{
  "startUrl": "https://www.expedia.com/London-Hotels-London-Heathrow-Marriott-Hotel.h438504.Hotel-Information",
  "results_wanted": 120,
  "max_pages": 15
}
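The same inputs can be sent programmatically. The sketch below uses the `apify-client` Python package (`ApifyClient`, `actor(...).call(...)`, `dataset(...).iterate_items()`). The actor ID is a placeholder because this page does not state it, and the proxy object (`useApifyProxy` with the `RESIDENTIAL` group) is an assumption about how the "Residential Apify Proxy" default is expressed. `build_run_input` and `run_actor` are hypothetical helper names.

```python
from typing import Any

# Placeholder: replace with the real actor ID shown in Apify Console.
ACTOR_ID = "<username>/<actor-name>"


def build_run_input(
    start_url: str,
    results_wanted: int = 20,
    max_pages: int = 8,
    use_residential_proxy: bool = True,
) -> dict[str, Any]:
    """Assemble the actor input using the parameter names from the table above."""
    if results_wanted < 1 or max_pages < 1:
        raise ValueError("results_wanted and max_pages must be positive")
    run_input: dict[str, Any] = {
        "startUrl": start_url,
        "results_wanted": results_wanted,
        "max_pages": max_pages,
    }
    if use_residential_proxy:
        # Assumed shape of the residential proxy setting.
        run_input["proxyConfiguration"] = {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
        }
    return run_input


def run_actor(token: str, run_input: dict[str, Any]) -> list[dict[str, Any]]:
    """Start a run, wait for it to finish, and return the dataset items."""
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("The actor run did not return a result")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


higher_volume = build_run_input(
    "https://www.expedia.com/London-Hotels-London-Heathrow-Marriott-Hotel.h438504.Hotel-Information",
    results_wanted=120,
    max_pages=15,
)
```

Calling `run_actor(token, higher_volume)` requires a valid Apify API token and network access; `build_run_input` alone has no side effects and can be checked offline.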
Sample Output
{
  "hotel_id": "438504",
  "hotel_name": "London Heathrow Marriott Hotel",
  "review_id": "5f2f4d86-6a2d-492a-a0d6-6f945db2f17b",
  "rating": 8,
  "title": "Comfortable stay near Heathrow",
  "review_text": "Clean room, friendly staff, and fast airport connections. Breakfast options were strong.",
  "published_date": "2026-03-14",
  "stay_date": "2026-03",
  "traveler_type": "Family",
  "traveler_name": "Verified traveler",
  "traveler_location": "Manchester",
  "language": "en",
  "helpful_votes": 3,
  "review_sentiments": ["Airport access", "Helpful staff"],
  "guest_photos": ["https://images.trvl-media.com/...jpg"],
  "management_response_title": "Response from Hotel Management",
  "response_text": "Thank you for your feedback and for staying with us.",
  "property_url": "https://www.expedia.com/London-Hotels-London-Heathrow-Marriott-Hotel.h438504.Hotel-Information",
  "source_url": "https://www.expedia.com/graphql",
  "scraped_at": "2026-04-05T11:36:24.511Z"
}
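Records shaped like the sample above can feed simple aggregates. The following sketch averages `rating` per `traveler_type` and tolerates items where either field is absent. The function name and the `"Unknown"` bucket label are illustrative choices, not part of the actor.

```python
from collections import defaultdict
from statistics import mean
from typing import Any


def mean_rating_by_traveler_type(items: list[dict[str, Any]]) -> dict[str, float]:
    """Average the `rating` field per `traveler_type`, skipping unrated items."""
    buckets: dict[str, list[float]] = defaultdict(list)
    for item in items:
        rating = item.get("rating")
        if rating is None:
            continue  # the field is omitted when the source review has no rating
        buckets[item.get("traveler_type", "Unknown")].append(rating)
    return {kind: round(mean(values), 2) for kind, values in buckets.items()}


items = [
    {"review_id": "1", "rating": 8, "traveler_type": "Family"},
    {"review_id": "2", "rating": 10, "traveler_type": "Family"},
    {"review_id": "3", "rating": 6},          # no traveler_type in the payload
    {"review_id": "4", "title": "No score"},  # no rating in the payload
]
summary = mean_rating_by_traveler_type(items)
```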
Tips For Best Results
Use Stable Hotel URLs
- Prefer direct hotel page URLs with hotel identifiers.
- Confirm the page has visible guest reviews before large runs.
Start Small, Then Scale
- Begin with `results_wanted: 20` to validate output quality.
- Increase volume after confirming your target property works as expected.
Use a Residential Proxy
- A residential proxy improves reliability on protected travel pages.
- Keep retry pressure low when running very high-volume workloads.
API Coverage Notes
- The actor combines Expedia's property review summary query with the paginated review overlay query.
- Optional fields vary by review, and the dataset omits empty values instead of filling them with nulls.
Integrations
- Google Sheets — Move structured reviews into live analysis sheets.
- Looker Studio / BI tools — Visualize rating trends and traveler segments.
- Airtable — Build searchable review intelligence workspaces.
- Webhooks — Trigger downstream processing as soon as each run completes.
Export Formats
- JSON — Best for APIs and engineering workflows.
- CSV — Best for spreadsheet and analyst workflows.
- Excel — Best for business-ready reporting.
- XML — Best for legacy integrations.
Frequently Asked Questions
How many reviews can I collect?
Collection volume depends on the property and available review depth. Increase max_pages and results_wanted for deeper collection.
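To make the interaction between the two limits concrete, here is a sketch of a paginated collection loop that stops at whichever cap is reached first, or when the source runs out of reviews. It assumes the limits behave this way; `fetch_page` is a hypothetical stand-in for the actor's review API call, and the page size of 10 in the example is invented.

```python
from typing import Any, Callable


def collect_reviews(
    fetch_page: Callable[[int], list[dict[str, Any]]],
    results_wanted: int = 20,
    max_pages: int = 8,
) -> list[dict[str, Any]]:
    """Request pages until a limit is hit or a page comes back empty."""
    collected: list[dict[str, Any]] = []
    for page_index in range(max_pages):
        page = fetch_page(page_index)
        if not page:
            break  # the property has no more reviews
        collected.extend(page)
        if len(collected) >= results_wanted:
            break
    return collected[:results_wanted]


# Fake data source: 35 reviews served 10 per page (illustrative numbers).
ALL_REVIEWS = [{"review_id": str(i)} for i in range(35)]


def fake_fetch(page_index: int) -> list[dict[str, Any]]:
    return ALL_REVIEWS[page_index * 10 : (page_index + 1) * 10]


default_run = collect_reviews(fake_fetch)                     # stops at results_wanted
page_capped = collect_reviews(fake_fetch, 120, max_pages=2)   # stops at max_pages
exhausted = collect_reviews(fake_fetch, 120, max_pages=15)    # stops when reviews run out
```

Under these assumptions, raising only `results_wanted` has no effect once `max_pages` is the binding limit, which is why the two are increased together.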
Why is my dataset empty?
The page may be protected or may not serve review payloads in that run context. Use a residential proxy and verify that the hotel URL is correct.
Does every item include all fields?
No. The actor saves only non-empty values, so fields vary by the source review payload.
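Because keys differ from item to item, a tabular consumer needs the union of all keys before writing rows. The sketch below shows one way to flatten downloaded JSON items to CSV with the standard library. The function name `items_to_csv` and the `"; "` separator for array fields are illustrative choices.

```python
import csv
import io
from typing import Any


def items_to_csv(items: list[dict[str, Any]]) -> str:
    """Write items with varying keys to CSV, leaving missing cells blank."""
    # Build the column list as the union of keys, in first-seen order.
    fieldnames: list[str] = []
    for item in items:
        for key in item:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, restval="", lineterminator="\n"
    )
    writer.writeheader()
    for item in items:
        # Join array fields (e.g. review_sentiments) into a single cell.
        row = {
            key: "; ".join(map(str, value)) if isinstance(value, list) else value
            for key, value in item.items()
        }
        writer.writerow(row)
    return buffer.getvalue()


csv_text = items_to_csv(
    [
        {"review_id": "a", "rating": 8},
        {"review_id": "b", "review_sentiments": ["Airport access", "Helpful staff"]},
    ]
)
```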
Can I run this on multiple hotels?
Yes. Schedule separate runs per URL or orchestrate multi-run workflows with your preferred automation tool.
Is duplicate handling included?
Yes. The actor applies duplicate protection across captured review records in each run.
Does this use a browser?
No. Review extraction is done with direct HTTP calls to Expedia's review GraphQL API.
Support
For issues or feature requests, open a support thread in Apify Console.
Legal Notice
This actor is designed for legitimate data collection and analysis workflows. You are responsible for compliance with website terms, local regulations, and responsible data usage practices.