# Hostelworld Hotels Scraper (`shahidirfan/hostelworld-hotels-scraper`) Actor

Extract hostel listings from Hostelworld globally. Get prices, ratings, amenities, reviews, locations & booking links. Ideal for travel price comparison, competitive intelligence, booking aggregation & accommodation market research. Real-time, production-ready output.

- **URL**: https://apify.com/shahidirfan/hostelworld-hotels-scraper.md
- **Developed by:** [Shahid Irfan](https://apify.com/shahidirfan) (community)
- **Categories:** Travel, Developer tools, Automation
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor itself is free to use; you only pay for Apify platform usage, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Hostelworld Hotels Scraper

Extract complete accommodation listings from Hostelworld city pages and search URLs. Collect hotels, hostels, and mixed-property results with pricing, ratings, facilities, rooms, images, and location data in a structured dataset. Useful for travel research, accommodation comparison, pricing analysis, and hospitality market monitoring.

### Features

- **Clean and messy URL support** — Accepts standard city pages plus search-style URLs that include query parameters and city IDs
- **Hotels and hostels together** — Works with `/hotels/`, `/hostels/`, and mixed listing pages
- **Rich property coverage** — Captures property details, pricing layers, review scores, facilities, rooms, promotions, and image links
- **Automatic pagination** — Keeps fetching pages until your result limit or page cap is reached
- **Duplicate protection** — Prevents repeated properties from being added to the dataset
- **Smart internal defaults** — Reuses dates and guest counts from messy URLs when present, otherwise falls back internally
- **Failure diagnostics** — Detects broken listing responses and logs clear recovery signals instead of failing silently
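
The pagination and duplicate-protection behavior listed above can be pictured with a minimal sketch. This is not the Actor's actual code: `collect_properties` and the `fetch_page` callback are hypothetical names, and the fake pages stand in for real listing responses.

```python
from typing import Callable, Dict, List

Property = Dict[str, object]


def collect_properties(
    fetch_page: Callable[[int], List[Property]],  # hypothetical page fetcher
    results_wanted: int = 20,
    max_pages: int = 5,
) -> List[Property]:
    """Collect unique properties until a limit is hit or pages run out."""
    seen_ids = set()
    collected: List[Property] = []
    for page in range(1, max_pages + 1):
        items = fetch_page(page)
        if not items:  # the source ran out of pages
            break
        for item in items:
            pid = item.get("property_id")
            if pid in seen_ids:  # duplicate protection
                continue
            seen_ids.add(pid)
            collected.append(item)
            if len(collected) >= results_wanted:
                return collected
    return collected


# Fake source: two pages with one overlapping property, then nothing.
fake_pages = {
    1: [{"property_id": "1"}, {"property_id": "2"}],
    2: [{"property_id": "2"}, {"property_id": "3"}],
}
result = collect_properties(lambda p: fake_pages.get(p, []), results_wanted=10)
print([item["property_id"] for item in result])  # ['1', '2', '3']
```

The loop stops on whichever comes first: the result limit, the page cap, or an empty page.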

### Use Cases

#### Travel Planning
Compare accommodation options across cities before booking. Review price ranges, guest ratings, district information, and room availability in one dataset.

#### Market Intelligence
Track pricing and property positioning across destinations. Identify popular neighborhoods, promoted listings, and differences between hostels and hotels.

#### Hospitality Benchmarking
Analyze review breakdowns such as cleanliness, staff, facilities, and value for money. Use the data to compare property performance across markets.

#### Content and Comparison Sites
Build city guides, destination roundups, and accommodation comparison pages with structured listing data and direct property links.

#### Research and Analytics
Create datasets for tourism research, hospitality dashboards, or competitor monitoring with repeatable city-level collection.

---

### Input Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `startUrl` | String | Yes | — | Hostelworld city or search URL. Works with clean `/hostels/` and `/hotels/` links plus messy search URLs such as `/pwa/s?...&id=13`. |
| `results_wanted` | Integer | No | `20` | Maximum number of properties to collect. |
| `max_pages` | Integer | No | `5` | Safety cap on the number of listing pages to fetch. |
| `proxyConfiguration` | Object | No | `{"useApifyProxy": false}` | Optional Apify Proxy configuration. |
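
When building the input in code, it can help to apply the documented defaults and minimums before starting a run. The helper below is a sketch only: `build_input` is a hypothetical name, and the hostname check is an assumption based on the example URLs in this README, not a rule published by the Actor.

```python
from typing import Any, Dict, Optional
from urllib.parse import urlparse


def build_input(
    start_url: str,
    results_wanted: int = 20,
    max_pages: int = 5,
    proxy: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Return an input object using the defaults from the table above."""
    host = urlparse(start_url).hostname or ""
    # Assumption: only hostelworld.com URLs make sense as a start URL.
    if not host.endswith("hostelworld.com"):
        raise ValueError(f"Not a Hostelworld URL: {start_url}")
    # The input schema declares a minimum of 1 for both limits.
    if results_wanted < 1 or max_pages < 1:
        raise ValueError("results_wanted and max_pages must be at least 1")
    return {
        "startUrl": start_url,
        "results_wanted": results_wanted,
        "max_pages": max_pages,
        "proxyConfiguration": proxy if proxy is not None else {"useApifyProxy": False},
    }


actor_input = build_input("https://www.hostelworld.com/hostels/europe/spain/barcelona/")
print(actor_input["results_wanted"], actor_input["max_pages"])  # 20 5
```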

---

### Output Data

Each dataset item can contain:

| Field | Type | Description |
|-------|------|-------------|
| `property_id` | String | Hostelworld property ID |
| `property_name` | String | Property name |
| `property_type` | String | Property type such as hostel or hotel |
| `star_rating` | Number | Official star rating |
| `address` | String | Street address |
| `district` | String | Primary district or neighborhood |
| `districts` | Array | Additional district names when available |
| `description` | String | Property overview text |
| `images` | Array | Property image URLs |
| `image_count` | Number | Number of collected image URLs |
| `city` | String | City name |
| `country` | String | Country name |
| `latitude` | Number | Latitude |
| `longitude` | Number | Longitude |
| `distance_km` | Number | Distance from city center in kilometers |
| `rating_overall` | Number | Overall guest rating |
| `total_ratings` | Number | Number of guest ratings |
| `rating_security` | Number | Security rating |
| `rating_location` | Number | Location rating |
| `rating_staff` | Number | Staff rating |
| `rating_atmosphere` | Number | Atmosphere rating |
| `rating_cleanliness` | Number | Cleanliness rating |
| `rating_facilities` | Number | Facilities rating |
| `rating_value_for_money` | Number | Value for money rating |
| `price_from` | String | Lowest displayed price per night |
| `price_currency` | String | Currency used for pricing |
| `lowest_dorm_price` | String | Lowest dorm price per night |
| `lowest_private_price` | String | Lowest private room price per night |
| `average_price` | String | Average lowest nightly price |
| `average_price_original` | String | Original average nightly price before discount |
| `average_dorm_price` | String | Average dorm price |
| `average_private_price` | String | Average private room price |
| `free_cancellation` | Boolean | Whether free cancellation is available |
| `free_cancellation_until` | String | Free cancellation deadline when available |
| `is_promoted` | Boolean | Whether the property is promoted |
| `is_featured` | Boolean | Whether the property is featured |
| `is_new` | Boolean | Whether the property is newly listed |
| `very_popular` | Boolean | Whether the property is marked as very popular |
| `hostelworld_recommends` | Boolean | Whether the property is recommended |
| `facilities` | Array | Facility names |
| `facilities_count` | Number | Number of facilities collected |
| `rooms` | Array | Room summaries with type, capacity, ensuite flag, and price |
| `room_types_count` | Number | Number of room entries collected |
| `promotions` | Array | Promotion summaries |
| `discount_percent` | Number | Discount percentage when available |
| `property_url` | String | Direct property link |
| `search_url` | String | Canonical city search URL used for the run |
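
Because the price fields are strings and any field may be absent, downstream code usually needs a small conversion step before doing arithmetic. The sketch below uses `Decimal` for that; `price_of` is a hypothetical helper, and the item reuses values from the sample output in this README.

```python
from decimal import Decimal, InvalidOperation
from typing import Dict, Optional


def price_of(item: Dict[str, object], field: str) -> Optional[Decimal]:
    """Convert a string price field to Decimal, or None if missing or invalid."""
    raw = item.get(field)
    if raw is None:
        return None
    try:
        return Decimal(str(raw))
    except InvalidOperation:
        return None


item = {
    "property_id": "61557",
    "lowest_dorm_price": "33.41",
    "lowest_private_price": "217.34",
    "price_currency": "USD",
}

dorm = price_of(item, "lowest_dorm_price")
private = price_of(item, "lowest_private_price")
# Gap between the cheapest private room and the cheapest dorm bed.
spread = private - dorm if dorm is not None and private is not None else None
print(spread)  # 183.93
```

`Decimal` avoids the rounding noise that binary floats can introduce when summing or comparing currency amounts.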

---

### Usage Examples

#### Standard Hostel City Page

```json
{
  "startUrl": "https://www.hostelworld.com/hostels/europe/spain/barcelona/",
  "results_wanted": 20,
  "max_pages": 5
}
````

#### Hotel City Page

```json
{
  "startUrl": "https://www.hostelworld.com/hotels/europe/spain/barcelona/",
  "results_wanted": 50,
  "max_pages": 5
}
```

#### Messy Search URL

```json
{
  "startUrl": "https://www.hostelworld.com/pwa/s?q=New%20York,%20USA&country=USA&city=New%20York&type=city&id=13&from=2026-06-16&to=2026-06-19&guests=2&page=1",
  "results_wanted": 20,
  "max_pages": 3
}
```
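
To see which stay settings a search URL carries, the query string can be inspected with the standard library. This is an illustrative sketch: `stay_settings` is a hypothetical helper, and reading `from` and `to` as check-in and check-out dates is an assumption based on the parameter names in the example URL.

```python
from datetime import date
from typing import Dict, Optional
from urllib.parse import parse_qs, urlparse


def stay_settings(url: str) -> Dict[str, Optional[object]]:
    """Extract city, city ID, dates, and guest count from a search URL."""
    query = parse_qs(urlparse(url).query)

    def first(name: str) -> Optional[str]:
        values = query.get(name)
        return values[0] if values else None

    settings: Dict[str, Optional[object]] = {
        name: first(name) for name in ("city", "id", "from", "to", "guests")
    }
    if settings["from"] and settings["to"]:
        start = date.fromisoformat(str(settings["from"]))
        end = date.fromisoformat(str(settings["to"]))
        settings["nights"] = (end - start).days
    else:
        settings["nights"] = None
    return settings


messy = (
    "https://www.hostelworld.com/pwa/s?q=New%20York,%20USA&country=USA"
    "&city=New%20York&type=city&id=13&from=2026-06-16&to=2026-06-19&guests=2&page=1"
)
print(stay_settings(messy))
```

A clean city URL has no query string, so every value comes back as `None`; in that case the Actor uses its internal defaults.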

***

### Sample Output

```json
{
  "property_id": "61557",
  "property_name": "St Christopher's Inn Barcelona",
  "property_type": "HOSTEL",
  "address": "Carrer de Bergara, 3",
  "district": "Las Ramblas",
  "city": "Barcelona",
  "country": "Spain",
  "latitude": 41.3861073,
  "longitude": 2.16762,
  "distance_km": 0.22,
  "rating_overall": 85,
  "total_ratings": 15633,
  "rating_security": 90,
  "rating_location": 96,
  "rating_staff": 85,
  "rating_atmosphere": 82,
  "rating_cleanliness": 82,
  "rating_facilities": 80,
  "rating_value_for_money": 81,
  "price_from": "33.41",
  "price_currency": "USD",
  "lowest_dorm_price": "33.41",
  "lowest_private_price": "217.34",
  "free_cancellation": false,
  "facilities_count": 18,
  "room_types_count": 6,
  "property_url": "https://www.hostelworld.com/hostels/p/61557/st-christophers-inn-barcelona/",
  "search_url": "https://www.hostelworld.com/hostels/europe/spain/barcelona/"
}
```

***

### Tips for Best Results

#### Use Real City or Search URLs

- Copy URLs directly from Hostelworld city pages or search pages
- Both clean city links and query-based search URLs are supported
- If the URL already contains dates or guests, those values are reused automatically

#### Control Collection Size

- Start with `results_wanted: 20` for quick checks
- Increase the result limit for larger cities with many pages
- Use `max_pages` as a hard safety cap when testing new markets

#### Price Comparisons

- Keep the same type of source URL structure across comparison runs
- Different travel dates embedded in the source URL can change availability and rates significantly
- Compare similar stay setups when building market benchmarks
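
One way to compare two runs is to match items by `property_id` and look at the change in `price_from`. The sketch below assumes both runs used the same stay setup; `price_changes` is a hypothetical helper and the items are made-up illustration data.

```python
from decimal import Decimal
from typing import Dict, List

Item = Dict[str, str]


def price_changes(old_items: List[Item], new_items: List[Item]) -> Dict[str, Decimal]:
    """Return price_from deltas for properties that appear in both runs."""
    old_by_id = {
        item["property_id"]: item
        for item in old_items
        if "property_id" in item and "price_from" in item
    }
    changes: Dict[str, Decimal] = {}
    for item in new_items:
        old = old_by_id.get(item.get("property_id", ""))
        if old is None or "price_from" not in item:
            continue
        # Skip pairs priced in different currencies; they are not comparable.
        if old.get("price_currency") != item.get("price_currency"):
            continue
        changes[item["property_id"]] = Decimal(item["price_from"]) - Decimal(old["price_from"])
    return changes


old_run = [
    {"property_id": "61557", "price_from": "33.41", "price_currency": "USD"},
    {"property_id": "2", "price_from": "50.00", "price_currency": "USD"},
]
new_run = [
    {"property_id": "61557", "price_from": "36.00", "price_currency": "USD"},
    {"property_id": "2", "price_from": "45.00", "price_currency": "EUR"},
    {"property_id": "3", "price_from": "20.00", "price_currency": "USD"},
]
print(price_changes(old_run, new_run))  # {'61557': Decimal('2.59')}
```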

***

### Integrations

Connect your dataset with:

- **Google Sheets** — Review pricing and ratings in spreadsheets
- **Airtable** — Build searchable accommodation databases
- **Make** — Trigger downstream travel workflows
- **Zapier** — Send new run results into business automations
- **Webhooks** — Push fresh listing data to your own systems
- **Slack** — Notify teams when collection finishes

#### Export Formats

- **JSON** — For applications and automation
- **CSV** — For spreadsheets and flat-file analysis
- **Excel** — For business reporting
- **XML** — For system integrations

***

### Frequently Asked Questions

#### Does it work with both hotels and hostels?

Yes. The Actor accepts Hostelworld hotel pages, hostel pages, and mixed city listing URLs.

#### Can it handle messy search URLs?

Yes. Search-style URLs with query parameters, dates, guest counts, and direct city IDs are supported.

#### Will it collect all properties in a city?

It keeps collecting until it reaches `results_wanted`, the source runs out of pages, or `max_pages` is reached.

#### Why do some fields appear only on some properties?

Hostelworld does not expose every field for every listing. Missing fields are omitted instead of filled with empty placeholder values.
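
Since missing fields are omitted, flat exports need a fixed column list so that every row lines up. A minimal sketch with the standard `csv` module follows; the column selection and the second item are made up for illustration.

```python
import csv
import io

# Columns chosen for this example; any subset of the output fields works.
COLUMNS = ["property_id", "property_name", "star_rating"]

items = [
    # First item has no star_rating, as in the sample output.
    {"property_id": "61557", "property_name": "St Christopher's Inn Barcelona"},
    # Made-up item that does carry a star rating plus an extra field.
    {"property_id": "2", "property_name": "Example Hotel", "star_rating": 4, "city": "Barcelona"},
]

buffer = io.StringIO()
# restval fills omitted fields; extrasaction="ignore" drops unlisted fields.
writer = csv.DictWriter(buffer, fieldnames=COLUMNS, restval="", extrasaction="ignore")
writer.writeheader()
writer.writerows(items)

print(buffer.getvalue())
```

When reading items directly in code, `item.get("star_rating")` gives the same tolerance for absent fields.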

#### Can I override the dates embedded in a URL?

There is no separate input for overriding dates or guest counts. The Actor derives stay settings from the source URL when available and otherwise uses internal defaults. To use different values, change them in the URL itself.

#### What happens if the source changes?

The Actor retries listing requests with alternate browser-like headers, logs response-shape mismatches clearly, and probes the source URL for recovery signals before failing.

***

### Support

For issues or feature requests, contact support through the Apify Console.

#### Resources

- [Apify Documentation](https://docs.apify.com/)
- [API Reference](https://docs.apify.com/api/v2)
- [Scheduling Runs](https://docs.apify.com/schedules)

***

### Legal Notice

This Actor is designed for legitimate data collection purposes. Users are responsible for ensuring compliance with website terms of service and applicable laws. Use data responsibly and respect rate limits.

# Actor input Schema

## `startUrl` (type: `string`):

A Hostelworld city or search URL. Works with clean `/hostels/` and `/hotels/` links plus messy search URLs such as `/pwa/s?...&id=13`.

## `results_wanted` (type: `integer`):

Maximum number of properties to collect per run.

## `max_pages` (type: `integer`):

Safety cap on the number of listing pages to fetch per city.

## `proxyConfiguration` (type: `object`):

Optional Apify Proxy settings for network resilience.

## Actor input object example

```json
{
  "startUrl": "https://www.hostelworld.com/hostels/europe/spain/barcelona/",
  "results_wanted": 20,
  "max_pages": 5,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}
```

# Actor output Schema

## `overview` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrl": "https://www.hostelworld.com/hostels/europe/spain/barcelona/",
    "results_wanted": 20,
    "max_pages": 5
};

// Run the Actor and wait for it to finish
const run = await client.actor("shahidirfan/hostelworld-hotels-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrl": "https://www.hostelworld.com/hostels/europe/spain/barcelona/",
    "results_wanted": 20,
    "max_pages": 5,
}

# Run the Actor and wait for it to finish
run = client.actor("shahidirfan/hostelworld-hotels-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrl": "https://www.hostelworld.com/hostels/europe/spain/barcelona/",
  "results_wanted": 20,
  "max_pages": 5
}' |
apify call shahidirfan/hostelworld-hotels-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=shahidirfan/hostelworld-hotels-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Hostelworld Hotels Scraper",
        "description": "Extract hostel listings from Hostelworld globally. Get prices, ratings, amenities, reviews, locations & booking links. Ideal for travel price comparison, competitive intelligence, booking aggregation & accommodation market research. Real-time, production-ready output.",
        "version": "0.0",
        "x-build-id": "1awfflmeZE1JwTpjj"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/shahidirfan~hostelworld-hotels-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-shahidirfan-hostelworld-hotels-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/shahidirfan~hostelworld-hotels-scraper/runs": {
            "post": {
                "operationId": "runs-sync-shahidirfan-hostelworld-hotels-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/shahidirfan~hostelworld-hotels-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-shahidirfan-hostelworld-hotels-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "startUrl"
                ],
                "properties": {
                    "startUrl": {
                        "title": "City URL",
                        "type": "string",
                        "description": "A Hostelworld city or search URL. Works with clean /hostels/ and /hotels/ links plus messy search URLs such as /pwa/s?...&id=13."
                    },
                    "results_wanted": {
                        "title": "Maximum results",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Maximum number of properties to collect per run.",
                        "default": 20
                    },
                    "max_pages": {
                        "title": "Maximum pages",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Safety cap on the number of listing pages to fetch per city.",
                        "default": 5
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Optional Apify Proxy settings for network resilience.",
                        "default": {
                            "useApifyProxy": false
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
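
For projects that cannot use a client library, the `run-sync-get-dataset-items` path from the specification above can be called with plain HTTP. The sketch below only builds the request with the Python standard library; `build_sync_request` is a hypothetical helper, and sending the request is left as a comment because it needs a real token and network access.

```python
import json
from typing import Any, Dict
from urllib.parse import urlencode
from urllib.request import Request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/shahidirfan~hostelworld-hotels-scraper/run-sync-get-dataset-items"


def build_sync_request(token: str, actor_input: Dict[str, Any]) -> Request:
    """Build a POST request that runs the Actor and returns dataset items."""
    # The specification passes the token as a query parameter.
    url = f"{API_BASE}{ACTOR_PATH}?{urlencode({'token': token})}"
    body = json.dumps(actor_input).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_sync_request(
    "<YOUR_API_TOKEN>",
    {
        "startUrl": "https://www.hostelworld.com/hostels/europe/spain/barcelona/",
        "results_wanted": 20,
        "max_pages": 5,
    },
)
print(request.get_method(), request.full_url.split("?")[0])

# To execute the run (requires a valid token), something like:
#   from urllib.request import urlopen
#   with urlopen(request) as response:
#       items = json.load(response)
```

The synchronous endpoint waits for the run to finish, so for long collections the asynchronous `/runs` path may be the safer choice.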
