Hong Kong Transit Scraper - MTR, KMB, Citybus, Light Rail

Pricing

Pay per event

Hong Kong public transit data via the official HKSAR open APIs: MTR (10 lines), KMB and Citybus bus networks, and MTR Light Rail. Bilingual English / Traditional Chinese station names, real-time train and bus ETAs, full route catalogues.

Developer

BowTiedRaccoon

Maintained by Community

4 days ago

Last modified

Hong Kong Transit Scraper

Scrapes Hong Kong's public transit network from the official HKSAR open data programme. Returns the full route and station catalogue plus real-time arrival ETAs across MTR (heavy rail), MTR Light Rail, KMB (Kowloon Motor Bus), and Citybus, all normalized into a single bilingual schema.


Hong Kong Transit Scraper Features

  • Returns MTR heavy rail across all 10 lines — Island, Tsuen Wan, Kwun Tong, Tseung Kwan O, Tung Chung, Airport Express, East Rail, Tuen Ma, South Island, and Disneyland Resort.
  • Returns MTR Light Rail (Tuen Mun / Yuen Long) routes 505 through 761P with bilingual stop names.
  • Returns KMB — ~700 unique bus routes, ~6,000 stops, with Octopus stop IDs and lat/long coordinates.
  • Returns Citybus — ~400 routes covering Hong Kong Island and cross-harbour services.
  • Real-time ETAs for every operator: MTR Next Train, MTR Light Rail Next Train, KMB stop ETA, Citybus stop ETA.
  • Bilingual English / Traditional Chinese station names, route termini, and ETA destinations.
  • Pure JSON / CSV API scraping — no headless browser, no captcha plumbing, no proxy required.

Who Uses Hong Kong Transit Data?

  • Travel-tech apps — power MTR / bus journey planners and arrival-board widgets for Hong Kong's 20M+ annual inbound visitors.
  • MaaS startups — feed multimodal trip planners with normalized rail + bus data across all four operators.
  • Tourism analytics — measure transit accessibility for hotels, attractions, and conference venues across Hong Kong Island, Kowloon, and the New Territories.
  • Logistics & delivery — map last-mile coverage against bus stops and MTR exits.
  • Internal dashboards — populate dropdowns and validate user-typed origin / destination stations against the real HKSAR transit catalogue.

How Hong Kong Transit Scraper Works

  1. Validates the input mode and operator. Rejects unsupported combinations (e.g. route_stops on MTR — heavy rail uses station_list instead).
  2. Pulls the right HKSAR open-data endpoint for the chosen (operator, mode):
    • MTR catalogue → opendata.mtr.com.hk/data/mtr_lines_and_stations.csv
    • MTR Light Rail catalogue → opendata.mtr.com.hk/data/light_rail_routes_and_stops.csv
    • MTR Next Train → rt.data.gov.hk/v1/transport/mtr/getSchedule.php
    • MTR Light Rail Next Train → rt.data.gov.hk/v1/transport/mtr/lrnt/getSchedule
    • KMB → data.etabus.gov.hk/v1/transport/kmb/...
    • Citybus → rt.data.gov.hk/v2/transport/citybus/...
  3. Normalizes each operator's response into a single bilingual schema with consistent field names.
  4. Resolves bus stop IDs to lat/long + bilingual names with bounded fan-out concurrency.
  5. Emits one flat record per route, station, route-stop, or live ETA, capped by maxItems.

All endpoints are unauthenticated and operated by the Hong Kong government or a participating operator. No API key, no proxy, no auth headers required.
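Steps 4 and 5 above can be illustrated with a minimal sketch, assuming an asyncio-based fan-out. The function names, the concurrency limit of 5, and the stub resolver are hypothetical and not taken from the actor's source; the stub stands in for the per-stop HTTP lookup and makes no network calls.

```python
import asyncio


async def resolve_stop(stop_id):
    # Hypothetical stand-in for the per-stop lookup; sleeps instead of doing HTTP.
    await asyncio.sleep(0.01)
    return {"station_code": stop_id, "station_name_en": f"Stop {stop_id}"}


async def resolve_all(stop_ids, max_items=15, concurrency=5):
    # Cap the work up front so at most max_items records are produced (step 5).
    wanted = list(stop_ids)[:max_items]
    semaphore = asyncio.Semaphore(concurrency)

    async def bounded(stop_id):
        # At most `concurrency` lookups hold the semaphore at once (step 4).
        async with semaphore:
            return await resolve_stop(stop_id)

    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*(bounded(s) for s in wanted))


records = asyncio.run(resolve_all([f"{i:06d}" for i in range(40)], max_items=15))
print(len(records))
```

The semaphore keeps the number of in-flight requests fixed no matter how many stops a route has, which is one common way to stay polite toward a public endpoint.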


Input

{
  "mode": "route_list",
  "operator": "kmb",
  "maxItems": 15
}

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| mode | string | route_list | One of route_list, station_list, route_stops, stop_eta. See modes below. |
| operator | string | mtr | One of mtr, mtr_lr, kmb, citybus. |
| route | string | 1A | Bus route number (KMB / Citybus) or MTR line code (TKL, ISL, etc.). Required for route_stops, optional for stop_eta to filter. |
| direction | string | outbound | Direction. outbound / inbound for buses, UP / DOWN for MTR. Used by route_stops. |
| stop_id | string | | Operator-specific stop identifier. Required for stop_eta. KMB uses 16-character hex (A3ADFCDF8487ADB9); Citybus uses 6-digit numeric (001027); MTR uses 3-letter station codes (CEN, TST); Light Rail uses 3-letter stop codes. |
| mtr_line | string | | Required for stop_eta on MTR. The MTR Next Train API needs both station and line code. |
| maxItems | integer | 15 | Maximum records to emit. Default is 15 to keep test runs fast. Set higher for full inventory dumps. |
| proxyConfiguration | object | no proxy | HK Open Data APIs are public, so no proxy is required. Honoured if you opt in. |

Modes

| Mode | What it returns |
| --- | --- |
| route_list | Every route operated by the chosen operator. |
| station_list | Full station / stop inventory with bilingual names and (for buses) lat/long coordinates. |
| route_stops | Ordered stop sequence for a single bus route. Bus-only; for MTR use station_list. |
| stop_eta | Real-time arrival ETAs at a single stop. |
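
The input rules described above can be written down as a small pre-flight check. This is a sketch under stated assumptions, not the actor's own validation code: the function name and error strings are hypothetical, and only the rules stated in this document are checked (route_stops is bus-only, stop_eta needs stop_id, and stop_eta on MTR also needs mtr_line).

```python
BUS_OPERATORS = {"kmb", "citybus"}
OPERATORS = BUS_OPERATORS | {"mtr", "mtr_lr"}
MODES = {"route_list", "station_list", "route_stops", "stop_eta"}


def validate_input(run_input):
    """Return a list of problems; an empty list means the input looks valid."""
    # Defaults mirror the input table: mode=route_list, operator=mtr.
    mode = run_input.get("mode", "route_list")
    operator = run_input.get("operator", "mtr")
    problems = []
    if mode not in MODES:
        problems.append(f"unknown mode: {mode}")
    if operator not in OPERATORS:
        problems.append(f"unknown operator: {operator}")
    if mode == "route_stops" and operator not in BUS_OPERATORS:
        problems.append("route_stops is bus-only; use station_list instead")
    if mode == "stop_eta":
        if not run_input.get("stop_id"):
            problems.append("stop_eta requires stop_id")
        if operator == "mtr" and not run_input.get("mtr_line"):
            problems.append("stop_eta on mtr requires mtr_line")
    return problems


print(validate_input({"mode": "stop_eta", "operator": "mtr", "stop_id": "TIK"}))
```

Running a check like this before starting a run avoids paying for a run that the actor would reject in its first step.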

Run examples

KMB route catalogue:

{
  "mode": "route_list",
  "operator": "kmb",
  "maxItems": 1000
}

MTR Tseung Kwan O Line — live arrivals at Tiu Keng Leng:

{
  "mode": "stop_eta",
  "operator": "mtr",
  "stop_id": "TIK",
  "mtr_line": "TKL"
}

KMB bus 1A — full outbound stop list with bilingual names + lat/long:

{
  "mode": "route_stops",
  "operator": "kmb",
  "route": "1A",
  "direction": "outbound",
  "maxItems": 100
}

Citybus stop ETAs at Central (Macao Ferry):

{
  "mode": "stop_eta",
  "operator": "citybus",
  "stop_id": "001027"
}

Hong Kong Transit Scraper Output Fields

Every record carries a record_type. The unused fields for that record type are empty strings or null.

| record_type | Mode | Operator |
| --- | --- | --- |
| route | route_list | all |
| station | station_list | all |
| route_stop | route_stops | KMB, Citybus |
| eta | stop_eta | all |
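
Because unused fields arrive as empty strings or null, a consumer may want to drop them before storing or comparing records. The helper below is a hypothetical sketch of that clean-up step and is not part of the actor; note that it keeps legitimate falsy values such as an eta_minutes of 0.

```python
def compact(record):
    # Drop only None and empty strings; keep 0, False, and other real values.
    return {k: v for k, v in record.items() if v is not None and v != ""}


sample = {
    "record_type": "eta",
    "operator": "MTR",
    "eta_minutes": 0,
    "route_number": "",
    "latitude": None,
}
print(compact(sample))
```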

Route record example (KMB)

{
  "record_type": "route",
  "operator": "KMB",
  "service_type": "bus",
  "route_number": "1A",
  "route_origin_en": "STAR FERRY",
  "route_origin_zh": "尖沙咀碼頭",
  "route_dest_en": "SAU MAU PING (CENTRAL)",
  "route_dest_zh": "中秀茂坪",
  "direction": "inbound",
  "service_class": "1",
  "source_url": "https://data.etabus.gov.hk/v1/transport/kmb/route/",
  "scraped_at": "2026-05-02T16:56:14.723Z"
}

Station record example (MTR)

{
  "record_type": "station",
  "operator": "MTR",
  "service_type": "metro",
  "line_code": "TKL",
  "line_name": "Tseung Kwan O Line",
  "direction": "DT",
  "station_code": "NOP",
  "station_name_en": "North Point",
  "station_name_zh": "北角",
  "sequence": 1,
  "source_url": "https://opendata.mtr.com.hk/data/mtr_lines_and_stations.csv",
  "scraped_at": "2026-05-02T16:56:14.723Z"
}

ETA record example (MTR live)

{
  "record_type": "eta",
  "operator": "MTR",
  "service_type": "metro",
  "line_code": "TKL",
  "line_name": "Tseung Kwan O Line",
  "station_code": "TIK",
  "direction": "UP",
  "eta_seq": 1,
  "eta_time": "2026-05-03 00:36:28",
  "eta_minutes": 3,
  "eta_destination_en": "POA",
  "platform": "3",
  "source_url": "https://rt.data.gov.hk/v1/transport/mtr/getSchedule.php?line=TKL&sta=TIK",
  "scraped_at": "2026-05-02T16:56:14.723Z"
}

Field reference

| Field | Type | Description |
| --- | --- | --- |
| record_type | string | route, station, route_stop, or eta. |
| operator | string | MTR, MTR_LR, KMB, or CTB. |
| service_type | string | metro / light_rail / bus / airport_express. |
| route_number | string | Bus route number or MTR line code. |
| route_origin_en / route_origin_zh | string | Origin terminus (English / Traditional Chinese). |
| route_dest_en / route_dest_zh | string | Destination terminus. |
| direction | string | outbound / inbound for buses; UP / DOWN for MTR. |
| service_class | string | KMB service variant. 1 is the primary route; higher numbers are alternates. |
| line_code / line_name | string | MTR line identifier and full name. |
| station_code | string | Operator-specific stop identifier. |
| station_name_en / station_name_zh | string | Stop / station name. |
| sequence | number | Stop sequence on a route (1-based). |
| latitude / longitude | number | WGS84 coordinates. Buses only; the MTR catalogue does not publish coordinates. |
| eta_seq | number | ETA index (1, 2, 3 for the next three arrivals). |
| eta_time | string | ETA timestamp (ISO 8601 / HKT). |
| eta_minutes | number | Minutes to next arrival. |
| eta_destination_en / eta_destination_zh | string | Arriving service's destination. |
| platform | string | MTR platform number. |
| remarks_en / remarks_zh | string | Operator-published remark (e.g. "Bus is full", "Service ended"). |
| data_timestamp | string | Operator-published data timestamp. |
| source_url | string | Source endpoint the record was derived from. |
| scraped_at | string | ISO 8601 timestamp when the record was scraped. |
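
The relationship between eta_time and eta_minutes can be sketched as follows. This is an illustration under assumptions: it parses the "YYYY-MM-DD HH:MM:SS" form shown in the MTR ETA example, reads it as Hong Kong Time (UTC+8), and rounds down to whole minutes. Whether the actor rounds the same way, and whether bus ETAs use the same timestamp format, is not confirmed here. The function name and the reference time are made up for the example.

```python
from datetime import datetime, timedelta, timezone

# Hong Kong Time is a fixed UTC+8 offset with no daylight saving.
HKT = timezone(timedelta(hours=8))


def minutes_until(eta_time, now):
    # Parse the local timestamp and attach the HKT offset.
    eta = datetime.strptime(eta_time, "%Y-%m-%d %H:%M:%S").replace(tzinfo=HKT)
    seconds = (eta - now).total_seconds()
    # Whole minutes, never negative for arrivals that have already passed.
    return max(0, int(seconds // 60))


# Illustrative reference time: 16:33:00 UTC on 2 May is 00:33:00 HKT on 3 May.
now = datetime(2026, 5, 2, 16, 33, 0, tzinfo=timezone.utc)
print(minutes_until("2026-05-03 00:36:28", now))  # prints 3
```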

FAQ

How do I scrape Hong Kong transit data?

Pick a mode (route_list, station_list, route_stops, stop_eta) and an operator (mtr, mtr_lr, kmb, citybus), then run. The scraper hits the right HKSAR open-data endpoint, normalizes the response, and emits flat records.

Does Hong Kong Transit Scraper need an API key?

No. The HKSAR open-data programme publishes these endpoints for free public use. No registration, no token, no rate-limit headers.

Does the scraper return fares?

Not in v1. Neither KMB nor Citybus expose fares via the open APIs, and MTR fares are a separate closed download. Fare modes are out of scope for this version.

What about Star Ferry, First Ferry, TurboJET, or HK Tramways?

Out of scope. Star Ferry, First Ferry, and TurboJET publish PDF schedules (no structured API). HK Tramways has no public data feed at all. Adding them would require HTML scraping that is materially more work than the API-only surface.

How do I find a stop ID for stop_eta?

Run station_list mode for the operator first. The station_code field on each record is the ID you pass to stop_eta. For MTR, also note the line_code: stop_eta on MTR needs both the station code and the line code.
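
The two-step lookup can be sketched as a helper that turns a station_list record into a stop_eta input. The helper name and the mapping from output operator codes (MTR, MTR_LR, KMB, CTB) to input operator values (mtr, mtr_lr, kmb, citybus) are assumptions made for illustration; both sets of values appear in this document, but the pairing is inferred.

```python
# Assumed pairing of output operator codes with input operator values.
OPERATOR_INPUT = {"MTR": "mtr", "MTR_LR": "mtr_lr", "KMB": "kmb", "CTB": "citybus"}


def eta_input_for(stations, name_en):
    """Build a stop_eta input from the first station record matching name_en."""
    for rec in stations:
        if rec.get("record_type") != "station":
            continue
        if rec.get("station_name_en") != name_en:
            continue
        run_input = {
            "mode": "stop_eta",
            "operator": OPERATOR_INPUT[rec["operator"]],
            "stop_id": rec["station_code"],
        }
        # MTR heavy rail needs the line code as well as the station code.
        if rec["operator"] == "MTR":
            run_input["mtr_line"] = rec["line_code"]
        return run_input
    return None


stations = [{
    "record_type": "station",
    "operator": "MTR",
    "line_code": "TKL",
    "station_code": "NOP",
    "station_name_en": "North Point",
}]
print(eta_input_for(stations, "North Point"))
```

Because an interchange station appears once per line in the catalogue, taking the first match picks one line arbitrarily; filter on line_code as well if a specific line matters.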

How fresh is the data?

ETAs are live (~30 second refresh). Route and stop catalogues are refreshed daily by KMB and Citybus, quarterly by MTR.

How many records does a full run produce?

Per operator at full inventory: KMB ~6,000 stops or ~1,600 route-direction-variants; Citybus ~3,000 stops or ~400 routes; MTR ~270 station rows (the catalogue lists each station once per line and direction, so the 99 stations repeat across lines and directions); Light Rail ~70 stops. Set maxItems accordingly; the default of 15 keeps test runs fast.


Need More Features?

Need ferry coverage, fare data, journey-planning A→B routing, or a GTFS-RT export? File an issue or get in touch.

Why Use Hong Kong Transit Scraper?

  • Free APIs, low cost — pay-per-event pricing, ~$0.0008 per record at the default coefficient. A full Hong Kong network catalogue dump costs less than a single Octopus tap.
  • Bilingual & normalized — every record carries both English and Traditional Chinese names. Field names are consistent across MTR, KMB, Citybus, and Light Rail.
  • Stable — no headless browser, no captcha solver, no scraping-the-DOM heuristics. Just public HKSAR open-data endpoints over HTTPS.