Hong Kong Transit Scraper - MTR, KMB, Citybus, Light Rail
Pricing
Pay per event
Hong Kong public transit data via the official HKSAR open APIs: MTR (10 lines), KMB and Citybus bus networks, and MTR Light Rail. Bilingual English / Traditional Chinese station names, real-time train and bus ETAs, full route catalogues.
Developer
BowTiedRaccoon
Hong Kong Transit Scraper
Scrapes Hong Kong's public transit network from the official HKSAR open data programme. Returns the full route and station catalogue plus real-time arrival ETAs across MTR (heavy rail), MTR Light Rail, KMB (Kowloon Motor Bus), and Citybus, all normalized into a single bilingual schema.
Hong Kong Transit Scraper Features
- Returns MTR heavy rail across all 10 lines — Island, Tsuen Wan, Kwun Tong, Tseung Kwan O, Tung Chung, Airport Express, East Rail, Tuen Ma, South Island, and Disneyland Resort.
- Returns MTR Light Rail (Tuen Mun / Yuen Long) routes 505 through 761P with bilingual stop names.
- Returns KMB — ~700 unique bus routes, ~6,000 stops, with Octopus stop IDs and lat/long coordinates.
- Returns Citybus — ~400 routes covering Hong Kong Island and cross-harbour services.
- Real-time ETAs for every operator: MTR Next Train, MTR Light Rail Next Train, KMB stop ETA, Citybus stop ETA.
- Bilingual English / Traditional Chinese station names, route termini, and ETA destinations.
- Pure JSON / CSV API scraping — no headless browser, no captcha plumbing, no proxy required.
Who Uses Hong Kong Transit Data?
- Travel-tech apps — power MTR / bus journey planners and arrival-board widgets for Hong Kong's 20M+ annual inbound visitors.
- MaaS startups — feed multimodal trip planners with normalized rail + bus data across all four operators.
- Tourism analytics — measure transit accessibility for hotels, attractions, and conference venues across Hong Kong Island, Kowloon, and the New Territories.
- Logistics & delivery — map last-mile coverage against bus stops and MTR exits.
- Internal dashboards — populate dropdowns and validate user-typed origin / destination stations against the real HKSAR transit catalogue.
How Hong Kong Transit Scraper Works
- Validates the input mode and operator. Rejects unsupported combinations (e.g. route_stops on MTR; heavy rail uses station_list instead).
- Pulls the right HKSAR open-data endpoint for the chosen (operator, mode):
  - MTR catalogue → opendata.mtr.com.hk/data/mtr_lines_and_stations.csv
  - MTR Light Rail catalogue → opendata.mtr.com.hk/data/light_rail_routes_and_stops.csv
  - MTR Next Train → rt.data.gov.hk/v1/transport/mtr/getSchedule.php
  - MTR Light Rail Next Train → rt.data.gov.hk/v1/transport/mtr/lrnt/getSchedule
  - KMB → data.etabus.gov.hk/v1/transport/kmb/...
  - Citybus → rt.data.gov.hk/v2/transport/citybus/...
- Normalizes each operator's response into a single bilingual schema with consistent field names.
- Resolves bus stop IDs to lat/long + bilingual names with bounded fan-out concurrency.
- Emits one flat record per route, station, route-stop, or live ETA, capped by maxItems.
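The first two steps in the list above can be sketched as a small lookup. This is a minimal illustration, not the scraper's actual code: the function name `resolve_endpoint` and the table names are hypothetical, the bus entries are base prefixes only (the source elides the rest of the path), and serving every non-ETA rail mode from the catalogue CSV is an assumption.

```python
MODES = {"route_list", "station_list", "route_stops", "stop_eta"}
OPERATORS = {"mtr", "mtr_lr", "kmb", "citybus"}
BUS_OPERATORS = {"kmb", "citybus"}

# Hosts and paths copied from the list above. The bus entries are base
# prefixes only, because the source elides the rest of the path.
RAIL_ENDPOINTS = {
    ("mtr", "catalogue"): "https://opendata.mtr.com.hk/data/mtr_lines_and_stations.csv",
    ("mtr_lr", "catalogue"): "https://opendata.mtr.com.hk/data/light_rail_routes_and_stops.csv",
    ("mtr", "eta"): "https://rt.data.gov.hk/v1/transport/mtr/getSchedule.php",
    ("mtr_lr", "eta"): "https://rt.data.gov.hk/v1/transport/mtr/lrnt/getSchedule",
}
BUS_BASE_URLS = {
    "kmb": "https://data.etabus.gov.hk/v1/transport/kmb/",
    "citybus": "https://rt.data.gov.hk/v2/transport/citybus/",
}


def resolve_endpoint(operator: str, mode: str) -> str:
    """Validate an (operator, mode) pair and return its endpoint or base URL."""
    if operator not in OPERATORS:
        raise ValueError(f"unknown operator: {operator!r}")
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "route_stops" and operator not in BUS_OPERATORS:
        raise ValueError(f"route_stops is bus-only; use station_list for {operator}")
    if operator in BUS_OPERATORS:
        return BUS_BASE_URLS[operator]
    # Assumption: every non-ETA rail mode is served from the catalogue CSV.
    kind = "eta" if mode == "stop_eta" else "catalogue"
    return RAIL_ENDPOINTS[(operator, kind)]
```

Rejecting the pair before any request is made keeps a mistyped input from spending a run.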
All endpoints are unauthenticated and operated by the Hong Kong government or a participating operator. No API key, no proxy, no auth headers required.
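The stop-resolution step mentions bounded fan-out concurrency. One common way to get that behaviour is a semaphore around each lookup, sketched below. The names `resolve_stops` and `fake_fetch_stop` and the limit of 8 are illustrative assumptions; the scraper's real limit and HTTP client are not documented here.

```python
import asyncio


async def resolve_stops(stop_ids, fetch_stop, limit=8):
    """Resolve stop IDs concurrently with at most `limit` lookups in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def resolve_one(stop_id):
        # The semaphore blocks here once `limit` lookups are already running.
        async with semaphore:
            return await fetch_stop(stop_id)

    # gather() returns results in the same order as the input IDs.
    return await asyncio.gather(*(resolve_one(s) for s in stop_ids))


async def fake_fetch_stop(stop_id):
    """Stand-in for an HTTP lookup; a real version would call the stop endpoint."""
    await asyncio.sleep(0)
    return {"station_code": stop_id, "latitude": None, "longitude": None}


if __name__ == "__main__":
    stops = asyncio.run(resolve_stops(["A", "B", "C"], fake_fetch_stop, limit=2))
    print([s["station_code"] for s in stops])
```

The cap keeps a route with many stops from opening one connection per stop at the same moment.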
Input
{"mode": "route_list", "operator": "kmb", "maxItems": 15}
| Field | Type | Default | Description |
|---|---|---|---|
| mode | string | route_list | One of route_list, station_list, route_stops, stop_eta. See modes below. |
| operator | string | mtr | One of mtr, mtr_lr, kmb, citybus. |
| route | string | 1A | Bus route number (KMB / Citybus) or MTR line code (TKL, ISL, etc.). Required for route_stops, optional for stop_eta to filter. |
| direction | string | outbound | Direction. outbound / inbound for buses, UP / DOWN for MTR. Used by route_stops. |
| stop_id | string | (none) | Operator-specific stop identifier. Required for stop_eta. KMB uses 16-character hex (A3ADFCDF8487ADB9); Citybus uses 6-digit numeric (001027); MTR uses 3-letter station codes (CEN, TST); Light Rail uses 3-letter stop codes. |
| mtr_line | string | (none) | Required for stop_eta on MTR. The MTR Next Train API needs both station and line code. |
| maxItems | integer | 15 | Maximum records to emit. Default is 15 to keep test runs fast. Set higher for full inventory dumps. |
| proxyConfiguration | object | no proxy | HK Open Data APIs are public, so no proxy is required. Honoured if you opt in. |
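The required-field rules in the table can be checked before a run. The sketch below is a hypothetical pre-flight helper, not part of the scraper: `check_input` and `DEFAULTS` are illustrative names, the defaults are copied from the table, and the positive-integer rule for maxItems is an assumption.

```python
# Defaults copied from the input table; fields without a default are omitted.
DEFAULTS = {
    "mode": "route_list",
    "operator": "mtr",
    "route": "1A",
    "direction": "outbound",
    "maxItems": 15,
}


def check_input(actor_input: dict) -> list:
    """Return a list of problems found in the input; empty means it looks valid."""
    cfg = {**DEFAULTS, **actor_input}
    problems = []
    if cfg["mode"] == "stop_eta":
        if not cfg.get("stop_id"):
            problems.append("stop_id is required for stop_eta")
        if cfg["operator"] == "mtr" and not cfg.get("mtr_line"):
            problems.append("mtr_line is required for stop_eta on MTR")
    if cfg["mode"] == "route_stops" and not cfg.get("route"):
        problems.append("route is required for route_stops")
    # Assumption: maxItems must be a positive integer.
    if not isinstance(cfg["maxItems"], int) or cfg["maxItems"] < 1:
        problems.append("maxItems must be a positive integer")
    return problems
```

For example, `check_input({"mode": "stop_eta", "operator": "mtr", "stop_id": "TIK"})` reports the missing mtr_line, while adding `"mtr_line": "TKL"` returns an empty list.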
Modes
| Mode | What it returns |
|---|---|
| route_list | Every route operated by the chosen operator. |
| station_list | Full station / stop inventory with bilingual names and (for buses) lat/long coordinates. |
| route_stops | Ordered stop sequence for a single bus route. Bus-only; for MTR use station_list. |
| stop_eta | Real-time arrival ETAs at a single stop. |
Run examples
KMB route catalogue:
{"mode": "route_list", "operator": "kmb", "maxItems": 1000}
MTR Tseung Kwan O Line, live arrivals at Tiu Keng Leng:
{"mode": "stop_eta", "operator": "mtr", "stop_id": "TIK", "mtr_line": "TKL"}
KMB bus 1A, full outbound stop list with bilingual names + lat/long:
{"mode": "route_stops", "operator": "kmb", "route": "1A", "direction": "outbound", "maxItems": 100}
Citybus stop ETAs at Central (Macao Ferry):
{"mode": "stop_eta", "operator": "citybus", "stop_id": "001027"}
Hong Kong Transit Scraper Output Fields
Every record carries a record_type. Fields that do not apply to that record type are set to an empty string or null.
| record_type | Mode | Operator |
|---|---|---|
| route | route_list | all |
| station | station_list | all |
| route_stop | route_stops | KMB, Citybus |
| eta | stop_eta | all |
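Because every record shares one flat schema, downstream code usually groups by record_type and drops the fields that do not apply. A minimal sketch, assuming unused fields arrive as empty strings or null as described above; `compact` and `group_by_type` are illustrative helper names, not part of the scraper.

```python
from collections import defaultdict


def compact(record: dict) -> dict:
    """Drop fields that are an empty string or None; keep 0 and other values."""
    return {key: value for key, value in record.items() if value not in ("", None)}


def group_by_type(records: list) -> dict:
    """Group compacted records by their record_type field."""
    groups = defaultdict(list)
    for record in records:
        groups[record["record_type"]].append(compact(record))
    return dict(groups)
```

Zero is kept on purpose, so an eta_minutes of 0 (a vehicle arriving now) is not mistaken for a missing value.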
Route record example (KMB)
{"record_type": "route", "operator": "KMB", "service_type": "bus", "route_number": "1A", "route_origin_en": "STAR FERRY", "route_origin_zh": "尖沙咀碼頭", "route_dest_en": "SAU MAU PING (CENTRAL)", "route_dest_zh": "中秀茂坪", "direction": "inbound", "service_class": "1", "source_url": "https://data.etabus.gov.hk/v1/transport/kmb/route/", "scraped_at": "2026-05-02T16:56:14.723Z"}
Station record example (MTR)
{"record_type": "station", "operator": "MTR", "service_type": "metro", "line_code": "TKL", "line_name": "Tseung Kwan O Line", "direction": "DT", "station_code": "NOP", "station_name_en": "North Point", "station_name_zh": "北角", "sequence": 1, "source_url": "https://opendata.mtr.com.hk/data/mtr_lines_and_stations.csv", "scraped_at": "2026-05-02T16:56:14.723Z"}
ETA record example (MTR live)
{"record_type": "eta", "operator": "MTR", "service_type": "metro", "line_code": "TKL", "line_name": "Tseung Kwan O Line", "station_code": "TIK", "direction": "UP", "eta_seq": 1, "eta_time": "2026-05-03 00:36:28", "eta_minutes": 3, "eta_destination_en": "POA", "platform": "3", "source_url": "https://rt.data.gov.hk/v1/transport/mtr/getSchedule.php?line=TKL&sta=TIK", "scraped_at": "2026-05-02T16:56:14.723Z"}
Field reference
| Field | Type | Description |
|---|---|---|
| record_type | string | route, station, route_stop, or eta. |
| operator | string | MTR, MTR_LR, KMB, or CTB. |
| service_type | string | metro / light_rail / bus / airport_express. |
| route_number | string | Bus route number or MTR line code. |
| route_origin_en / route_origin_zh | string | Origin terminus (English / Traditional Chinese). |
| route_dest_en / route_dest_zh | string | Destination terminus (English / Traditional Chinese). |
| direction | string | outbound / inbound for buses; UP / DOWN for MTR ETAs. MTR station records carry the catalogue's own direction code (DT in the station example). |
| service_class | string | KMB service variant. 1 is the primary route; higher numbers are alternates. |
| line_code / line_name | string | MTR line identifier and full name. |
| station_code | string | Operator-specific stop identifier. |
| station_name_en / station_name_zh | string | Stop / station name. |
| sequence | number | Stop sequence on a route (1-based). |
| latitude / longitude | number | WGS84 coordinates. Buses only; the MTR catalogue does not publish coordinates. |
| eta_seq | number | ETA index (1, 2, 3 for the next three arrivals). |
| eta_time | string | ETA timestamp in Hong Kong Time (HKT, UTC+8). The MTR example uses YYYY-MM-DD HH:MM:SS rather than strict ISO 8601. |
| eta_minutes | number | Minutes to next arrival. |
| eta_destination_en / eta_destination_zh | string | Arriving service's destination. |
| platform | string | MTR platform number. |
| remarks_en / remarks_zh | string | Operator-published remark (e.g. "Bus is full", "Service ended"). |
| data_timestamp | string | Operator-published data timestamp. |
| source_url | string | Source endpoint the record was derived from. |
| scraped_at | string | ISO 8601 timestamp when the record was scraped. |
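The two timestamp fields use different conventions in the examples: scraped_at is UTC with a Z suffix, while the MTR eta_time is a wall-clock string. A minimal sketch for putting them on the same footing, assuming eta_time is Hong Kong Time in the YYYY-MM-DD HH:MM:SS shape of the MTR example (other operators may format it differently); `parse_eta_time` and `minutes_until` are illustrative names.

```python
from datetime import datetime, timedelta, timezone

# Hong Kong Time is a fixed UTC+8 offset with no daylight saving.
HKT = timezone(timedelta(hours=8), "HKT")


def parse_eta_time(eta_time: str) -> datetime:
    """Parse an MTR-style eta_time string as a timezone-aware HKT datetime."""
    return datetime.strptime(eta_time, "%Y-%m-%d %H:%M:%S").replace(tzinfo=HKT)


def minutes_until(eta_time: str, now: datetime) -> int:
    """Whole minutes from `now` (timezone-aware) to the ETA, floored at zero."""
    delta = parse_eta_time(eta_time) - now
    return max(0, int(delta.total_seconds() // 60))
```

Recomputing the minutes at read time is useful when records are consumed some time after scraped_at, since the stored eta_minutes reflects the moment of the scrape.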
FAQ
How do I scrape Hong Kong transit data?
Pick a mode (route_list, station_list, route_stops, stop_eta) and an operator (mtr, mtr_lr, kmb, citybus), then run. The scraper hits the right HKSAR open-data endpoint, normalizes the response, and emits flat records.
Does Hong Kong Transit Scraper need an API key?
No. The HKSAR open-data programme publishes these endpoints for free public use. No registration, no token, no rate-limit headers.
Does the scraper return fares?
Not in v1. Neither KMB nor Citybus exposes fares via the open APIs, and MTR fares are a separate closed download. Fare modes are out of scope for this version.
What about Star Ferry, First Ferry, TurboJET, or HK Tramways?
Out of scope. Star Ferry, First Ferry, and TurboJET publish PDF schedules (no structured API). HK Tramways has no public data feed at all. Adding them would require HTML scraping that is materially more work than the API-only surface.
How do I find a stop ID for stop_eta?
Run station_list mode for the operator first. The station_code field on each record is the ID you pass to stop_eta. For MTR, also note the line_code — stop_eta on MTR needs both.
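The two-step lookup can be sketched as a helper that turns a station record into a stop_eta input. The mapping from output operator values (MTR, MTR_LR, KMB, CTB) back to input values (mtr, mtr_lr, kmb, citybus) is an assumption inferred from the two lists in this document, and `eta_input_from_station` is a hypothetical name.

```python
# Assumed mapping from the operator value on output records to the input value.
OPERATOR_INPUT = {"MTR": "mtr", "MTR_LR": "mtr_lr", "KMB": "kmb", "CTB": "citybus"}


def eta_input_from_station(record: dict) -> dict:
    """Build a stop_eta input from a record produced by station_list mode."""
    actor_input = {
        "mode": "stop_eta",
        "operator": OPERATOR_INPUT[record["operator"]],
        "stop_id": record["station_code"],
    }
    # MTR heavy rail needs the line code as well as the station code.
    if record["operator"] == "MTR":
        actor_input["mtr_line"] = record["line_code"]
    return actor_input
```

Applied to the North Point station example, this yields a stop_eta input with stop_id NOP and mtr_line TKL.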
How fresh is the data?
ETAs are live (~30 second refresh). Route and stop catalogues are refreshed daily by KMB and Citybus, quarterly by MTR.
How many records does a full run produce?
Per operator at full inventory: KMB ~6,000 stops or ~1,600 route-direction-variants; Citybus ~3,000 stops or ~400 routes; MTR ~270 station rows (99 stations, one row per line and direction, so interchange stations appear more than once); Light Rail ~70 stops. Set maxItems accordingly; the default of 15 keeps test runs fast.
Need More Features?
Need ferry coverage, fare data, journey-planning A→B routing, or a GTFS-RT export? File an issue or get in touch.
Why Use Hong Kong Transit Scraper?
- Free APIs, low cost — pay-per-event pricing, ~$0.0008 per record at the default coefficient. A full Hong Kong network catalogue dump costs less than a single Octopus tap.
- Bilingual & normalized — every record carries both English and Traditional Chinese names. Field names are consistent across MTR, KMB, Citybus, and Light Rail.
- Stable — no headless browser, no captcha solver, no scraping-the-DOM heuristics. Just public HKSAR open-data endpoints over HTTPS.
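To size maxItems against a budget, the per-record price quoted above can be multiplied by the approximate inventory sizes from the FAQ. This is a rough sketch only: it assumes the ~$0.0008 figure applies to every emitted record and that no other events are billed, which the listing does not spell out. `estimate_cost_usd` is an illustrative name.

```python
PRICE_PER_RECORD_USD = 0.0008  # approximate figure quoted in the text

# Approximate full-inventory station_list sizes taken from the FAQ.
APPROX_STATION_ROWS = {"kmb": 6000, "citybus": 3000, "mtr": 270, "mtr_lr": 70}


def estimate_cost_usd(record_count: int, price: float = PRICE_PER_RECORD_USD) -> float:
    """Estimate the per-record charge for a run, rounded to four decimals."""
    return round(record_count * price, 4)


if __name__ == "__main__":
    for operator, rows in APPROX_STATION_ROWS.items():
        print(operator, estimate_cost_usd(rows))
```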