Hong Kong Jockey Club (HKJC) Comprehensive Racing Data Scraper

The definitive data solution for Hong Kong horse racing. Effortlessly extract comprehensive datasets including Pre-Race Analysis, Race Results, Barrier Trials, Veterinary History, and Official Incident Reports. Engineered for high-speed performance and clean, developer-friendly JSON output.

Pricing

from $15.00 / 1,000 results

Developer

Alaricus

Maintained by Community

What does the HKJC Comprehensive Racing Data Scraper do?

The HKJC Comprehensive Racing Data Scraper is an all-in-one intelligence tool designed to extract high-fidelity data from the official Hong Kong Jockey Club platform. It provides professional-grade access to the world’s most lucrative horse racing ecosystem, covering everything from historical race results to detailed veterinary history.

This Actor is built for professional handicappers, data scientists, and betting syndicate developers who require structured, normalized JSON data to power their predictive models and racing dashboards in 2026.

Key Features

  • Multi-Service Architecture: Access five distinct data streams within a single Actor:

    • Race Results: Comprehensive finishing data, sectional times, and margins.
    • Pre-Race Analysis: Advanced predictive metrics including 400m sectional rankings, closing momentum profiles, "Fatigue" scores, and full historical sectional times for upcoming races.
    • Incident Reports: Official steward reports and racing incidents for performance analysis.
    • Barrier Trials: Morning trial performance data and results.
    • Veterinary Records: Complete health history including lameness records, surgeries, and "passed" dates.
  • One-Click Bulk Extraction: Features "Scrape ALL available dates" toggles to automatically harvest years of historical data without manual date entry.

  • Targeted Veterinary Search: Specific lookup mode for "Today's Declared Starters" or deep-history search by Horse Brand Numbers (e.g., K458).

Why scrape HKJC data?

Hong Kong horse racing is a data-driven sport. This tool provides the "source of truth" data necessary for:

  • Algorithmic Betting: Feed clean, historical datasets into ELO or machine learning models.
  • Horse Health Monitoring: Track veterinary history and "passed" trials to identify horses returning from injury.
  • Jockey & Trainer Analysis: Monitor weight allowances and performance trends across Sha Tin and Happy Valley.
  • Steward Insights: Extract incident reports to find "unlucky" runners who faced interference but didn't place.

How to use the Scraper

  1. Select Service Mode: Choose between Results, Pre-Race Analysis, Reports, Trials, or Veterinary.
  2. Configure Date Logic:
    • Bulk Mode: Check the "Scrape ALL available dates" toggle to automatically discover and extract every date currently listed on the HKJC dropdowns (typically 150-200+ dates).
    • Targeted Mode: Provide a specific list of dates in YYYY-MM-DD format.
  3. Advanced Filters (Optional):
    • Race Numbers: Leave empty for all races, or specify (e.g., 1, 8, 10) for targeted race analysis.
    • Brand Numbers: For Veterinary mode, enter specific IDs (e.g., H095) to get full lifetime medical history.
  4. Click Run: The scraper handles the navigation, dropdown selection, and table extraction.
  5. Download Data: Export your structured dataset in JSON, CSV, or Excel.

Input Parameters

The parameters are organized into logical sections. Only the fields related to your selected Mode will be processed.

🛠 Input Configuration & Mode Mapping

The scraper is highly modular. Depending on the mode you select, different input parameters become active. Use the table below to configure your JSON input or UI settings.

| Service Mode (mode) | Parameter Key | Type | Description | Options / Format |
| --- | --- | --- | --- | --- |
| Race Results (result) | all_result_dates | Boolean | Toggle bulk extraction. | true / false |
| Race Results (result) | result_dates | Array | Target specific dates. | ["YYYY-MM-DD"] |
| Race Results (result) | result_race_numbers | Array | Target specific races. | [1, 5] (Empty = ALL) |
| Pre-Race Analysis (prerace_analysis) | prerace_race_numbers | Array | Target specific upcoming races. | [1, 5] (Empty = ALL) |
| Incident Reports (report) | all_report_dates | Boolean | Toggle bulk extraction. | true / false |
| Incident Reports (report) | report_dates | Array | Target dates for reports. | ["YYYY-MM-DD"] |
| Barrier Trials (barrier_trial) | all_barrier_dates | Boolean | Toggle bulk extraction. | true / false |
| Barrier Trials (barrier_trial) | barrier_dates | Array | Target dates for trials. | ["YYYY-MM-DD"] |
| Veterinary Records (veterinary) | get_starters | Boolean | Scrape upcoming runners. | true / false |
| Veterinary Records (veterinary) | get_full_database | Boolean | Enable history search. | true / false |
| Veterinary Records (veterinary) | brand_numbers | Array | Target Horse IDs. | ["K458", "H095"] |

Pro Tip: When using the API, make sure the parameters you send match your mode. For example, if mode is set to report, the scraper will ignore result_race_numbers.
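If you build the JSON input in code, a small guard can catch mismatched keys before you start a run. The sketch below is only an illustration: the mode and key names are copied from the table above, while `MODE_KEYS` and `build_input` are hypothetical helper names, not part of the Actor.

```python
# Allowed parameter keys per mode, copied from the input table.
MODE_KEYS = {
    "result": {"all_result_dates", "result_dates", "result_race_numbers"},
    "prerace_analysis": {"prerace_race_numbers"},
    "report": {"all_report_dates", "report_dates"},
    "barrier_trial": {"all_barrier_dates", "barrier_dates"},
    "veterinary": {"get_starters", "get_full_database", "brand_numbers"},
}


def build_input(mode, **params):
    """Return an Actor input dict, rejecting keys that do not belong to `mode`."""
    if mode not in MODE_KEYS:
        raise ValueError(f"unknown mode: {mode!r}")
    unexpected = set(params) - MODE_KEYS[mode]
    if unexpected:
        raise ValueError(f"{sorted(unexpected)} not used by mode {mode!r}")
    return {"mode": mode, **params}


if __name__ == "__main__":
    print(build_input("report", report_dates=["2026-03-15"]))
```

Failing loudly here is a design choice: the Actor itself silently ignores keys from other modes, which makes typos easy to miss.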


Input and Output Examples

📊 PRE-RACE ANALYSIS MODE

Example Input

  1. The "Upcoming Card" Analysis: Automatically pulls upcoming race data and calculates momentum and fatigue profiles based on historical sectional times.
{
  "mode": "prerace_analysis"
}
  2. Targeted "Feature Race" Analysis: Specify an exact race number to pull historical form metrics just for that race.
{
  "mode": "prerace_analysis",
  "prerace_race_numbers": [7]
}

Example Output

The output would be a list of dictionaries, such as the following:

{
"mode": "prerace_analysis",
"race_no": 1,
"date": "2026-04-26",
"racecourse": "SHA TIN",
"track": "TURF",
"course": "A Course",
"race_class": "Class 4",
"rating_range": "060-040",
"distance": 1200,
"race_name": "FWD INSURANCE ACT PRIVATE HANDICAP",
"post_time": "12:30 PM",
"horse_name": "BEAUTY GEMINI",
"brand_no": "L129",
"runner_no": 12,
"rank": null,
"fitness_rating": 1,
"weight": 122,
"body_weight": 1124,
"jockey": "Y L Chung",
"trainer": "A S Cruz",
"horse_comments": "",
"draw": 4,
"age": 3,
"horse_url": "https://racing.hkjc.com/en-us/local/information/horse?horseid=HK_2025_L129",
"horse_id": "L129",
"last_run_date": "2026-01-25",
"last_run_energy": 70,
"last_run_wide": "",
"last_run_venue": "SHA TIN",
"last_run_course": "A+3",
"last_run_distance": 1000,
"last_run_going": "GOOD",
"last_run_finish_pos": 12,
"last_run_draw": 4,
"last_run_weight": 127,
"last_run_body_weight": 1110,
"last_run_jockey": "A Badel",
"last_run_odds": 118.0,
"last_run_total_starters": 14,
"last_run_pace": "Good",
"last_run_comments": "Pace Good; Jumped awkwardly, positioned 1 to 2 lengths behind the leader on the middle track, peaked 300M, weakened thereafter.",
"last_run_race_url": "https://racing.hkjc.com/en-us/local/information/localresults?racedate=2026/01/25&Racecourse=ST&RaceNo=1",
"last_run_race_no": 1,
"last_run_track_type": "TURF",
"last_run_class": 4,
"last_run_lbw": 10.25,
"last_run_lbw_raw": "10-1/4",
"last_run_gear_raw": "TT",
"last_run_gear": [
{
"equipment": "TONGUE_TIE",
"action": "RETAINED"
}
],
"last_run_finish_time": 58.37,
"last_run_sec1_pos": 5,
"last_run_sec1_margin": 1.5,
"last_run_sec1_margin_raw": "1-1/2",
"last_run_sec1_time_rank": 5,
"last_run_sec2_pos": 9,
"last_run_sec2_margin": 2.0,
"last_run_sec2_margin_raw": "2",
"last_run_sec2_time_rank": 11,
"last_run_sec3_pos": 12,
"last_run_sec3_margin": 10.25,
"last_run_sec3_margin_raw": "10-1/4",
"last_run_sec3_time_rank": 12,
"last_run_position_change": -3,
"last_run_closing_profile": "Faded",
"last_run_sec1_time_total": 13.16,
"last_run_sec1_time_split1": null,
"last_run_sec1_time_split2": null,
"last_run_sec2_time_total": 21.13,
"last_run_sec2_time_split1": 10.25,
"last_run_sec2_time_split2": 10.88,
"last_run_sec3_time_total": 24.08,
"last_run_sec3_time_split1": 11.46,
"last_run_sec3_time_split2": 12.62,
"last_run_final_200m_fatigue": 1.16,
"last_run_fatigue_profile": "Collapsing",
"2nd_last_run_date": "2026-01-01",
"2nd_last_run_energy": 82,
"2nd_last_run_wide": "2W3W",
"2nd_last_run_venue": "SHA TIN",
"2nd_last_run_course": "B+2",
"2nd_last_run_distance": 1200,
"2nd_last_run_going": "GOOD",
"2nd_last_run_finish_pos": 14,
"2nd_last_run_draw": 7,
"2nd_last_run_weight": 128,
"2nd_last_run_body_weight": 1099,
"2nd_last_run_jockey": "Z Purton",
"2nd_last_run_odds": 5.0,
"2nd_last_run_total_starters": 14,
"2nd_last_run_pace": "Good to slow",
"2nd_last_run_comments": "Pace Good to slow; (2W3W) Sluggish start, settled in rear of mid-field, forced to steady on heels 550m out, left behind home straight.",
"2nd_last_run_race_url": "https://racing.hkjc.com/en-us/local/information/localresults?racedate=2026/01/01&Racecourse=ST&RaceNo=2",
"2nd_last_run_race_no": 2,
"2nd_last_run_track_type": "TURF",
"2nd_last_run_class": 4,
"2nd_last_run_lbw": 5.0,
"2nd_last_run_lbw_raw": "5",
"2nd_last_run_gear_raw": "TT1",
"2nd_last_run_gear": [
{
"equipment": "TONGUE_TIE",
"action": "FIRST_TIME"
}
],
"2nd_last_run_finish_time": 70.98,
"2nd_last_run_sec1_pos": 10,
"2nd_last_run_sec1_margin": 4.0,
"2nd_last_run_sec1_margin_raw": "4",
"2nd_last_run_sec1_time_rank": 10,
"2nd_last_run_sec2_pos": 12,
"2nd_last_run_sec2_margin": 4.25,
"2nd_last_run_sec2_margin_raw": "4-1/4",
"2nd_last_run_sec2_time_rank": 12,
"2nd_last_run_sec3_pos": 14,
"2nd_last_run_sec3_margin": 5.0,
"2nd_last_run_sec3_margin_raw": "5",
"2nd_last_run_sec3_time_rank": 13,
"2nd_last_run_position_change": -2,
"2nd_last_run_closing_profile": "Faded",
"2nd_last_run_sec1_time_total": 24.62,
"2nd_last_run_sec1_time_split1": null,
"2nd_last_run_sec1_time_split2": null,
"2nd_last_run_sec2_time_total": 22.69,
"2nd_last_run_sec2_time_split1": 11.3,
"2nd_last_run_sec2_time_split2": 11.39,
"2nd_last_run_sec3_time_total": 23.67,
"2nd_last_run_sec3_time_split1": 11.38,
"2nd_last_run_sec3_time_split2": 12.29,
"2nd_last_run_final_200m_fatigue": 0.91,
"2nd_last_run_fatigue_profile": "Collapsing"
}
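Because each past run is flattened into prefixed keys (`last_run_...`, `2nd_last_run_...`), it can be handy to regroup them per run before modelling. The sketch below assumes that older runs continue the same pattern (`3rd_last_run_`, `4th_last_run_`, and so on), which the example above does not show; `split_runs` is a hypothetical helper name.

```python
import re

# Matches "last_run_<field>" and "<n>(st|nd|rd|th)_last_run_<field>".
# The ordinal pattern beyond "2nd" is an assumption, not confirmed by the example.
_RUN_KEY = re.compile(r"^(?:(\d+)(?:st|nd|rd|th)_)?last_run_(.+)$")


def split_runs(record):
    """Split a flat pre-race record into (runner_fields, {run_index: run_fields})."""
    runner, runs = {}, {}
    for key, value in record.items():
        match = _RUN_KEY.match(key)
        if match is None:
            runner[key] = value  # meeting and runner metadata
            continue
        index = int(match.group(1) or 1)  # bare "last_run_" is run 1
        runs.setdefault(index, {})[match.group(2)] = value
    return runner, runs


if __name__ == "__main__":
    sample = {
        "horse_name": "BEAUTY GEMINI",
        "last_run_date": "2026-01-25",
        "2nd_last_run_date": "2026-01-01",
    }
    print(split_runs(sample))
```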

🧠 Understanding Pre-Race Predictive Metrics

The Pre-Race Analysis mode is designed to give you an edge by automatically fetching historical sectional times for up to 7 of a horse's most recent runs (fewer if the horse has not raced that often) and calculating derived metrics from them. This allows you to evaluate how a horse ran its past races, rather than just where it finished.

Here are the most powerful fields to look out for:

1. Sectional Time Ranks (secX_time_rank)

Instead of manually comparing raw sectional times, the scraper calculates how fast a horse ran a specific section relative to the other horses in that exact same race.

  • Example: "last_run_sec3_time_rank": 1 means the horse ran the fastest final section of anyone in that race, regardless of its overall finishing position.
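To make the idea concrete, here is a minimal sketch of ranking one section's times within a single race. It is not the Actor's actual code: the runner labels and times are made up, and the tie rule (tied runners share the better rank) is an assumption.

```python
def section_time_ranks(times):
    """Map runner -> rank of its sectional time within the race (1 = fastest).

    `times` maps a runner label to that runner's time for one section.
    Tied runners share the same rank (assumed tie rule).
    """
    ordered = sorted(times.values())
    return {runner: ordered.index(t) + 1 for runner, t in times.items()}


if __name__ == "__main__":
    # Illustrative final-section times in seconds, not real HKJC data.
    print(section_time_ranks({"A": 22.75, "B": 22.39, "C": 23.10}))
```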

2. Position Change & Closing Profiles

This metric measures how many horses a runner passed (or was passed by) in the critical final 400m stretch: Position at 400m - Position at Finish.

| closing_profile | Value Range (position_change) | Description |
| --- | --- | --- |
| Strong Closer | > +4 | Passed 5 or more horses in the final 400m. A massive engine late in the race. |
| Closer | +2 to +4 | Made up solid ground in the straight. |
| Even | -1 to +1 | Held its position to the line. |
| Faded | -4 to -2 | Lost ground and was passed by a few horses. |
| Quitter | < -4 | Dropped 5 or more positions in the final 400m. Hit a wall. |
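The thresholds above translate directly into a small classifier. This is a sketch of the documented rule rather than the Actor's source; `closing_profile` is a hypothetical function name.

```python
def closing_profile(pos_at_400m, pos_at_finish):
    """Return (position_change, label) using the documented thresholds."""
    change = pos_at_400m - pos_at_finish
    if change > 4:
        label = "Strong Closer"
    elif change >= 2:
        label = "Closer"
    elif change >= -1:
        label = "Even"
    elif change >= -4:
        label = "Faded"
    else:
        label = "Quitter"
    return change, label


if __name__ == "__main__":
    # Last run in the example output: 9th at the 400m, 12th at the finish.
    print(closing_profile(9, 12))
```

The example record's last run (9th at the 400m mark, 12th at the finish) gives a change of -3, which lands in the "Faded" band and matches the `last_run_closing_profile` value shown earlier.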

3. Momentum Indicator: Final 200m Fatigue

In Hong Kong, the final 400m is split into two 200m halves (split1 and split2). The Fatigue metric is the difference between these halves: Split 2 - Split 1. A positive value means the horse decelerated over the final 200m; a negative value means it accelerated.

| fatigue_profile | Value (final_200m_fatigue) | Description |
| --- | --- | --- |
| Sprinting | < 0.0s | The horse accelerated! The final 200m was run faster than the previous 200m. Elite finish. |
| Sustained | 0.0s to 0.3s | Maintained momentum perfectly through the line with minimal deceleration. |
| Tiring | 0.3s to 0.7s | Standard deceleration. The horse was working hard but starting to slow. |
| Collapsing | > 0.7s | The horse completely emptied out and hit a physical wall before the finish line. |
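As a sketch, the fatigue calculation and its bands could look like the function below. The bands share their edges at 0.3s and 0.7s, so this example assumes exactly 0.3 counts as "Sustained" and exactly 0.7 as "Tiring"; the Actor may treat those edge values differently. `fatigue_profile` is a hypothetical function name.

```python
def fatigue_profile(split1, split2):
    """Return (final_200m_fatigue, label) for the two final 200m splits."""
    fatigue = round(split2 - split1, 2)  # positive = slowing down
    if fatigue < 0.0:
        label = "Sprinting"
    elif fatigue <= 0.3:  # edge value handling is an assumption
        label = "Sustained"
    elif fatigue <= 0.7:  # edge value handling is an assumption
        label = "Tiring"
    else:
        label = "Collapsing"
    return fatigue, label


if __name__ == "__main__":
    # Last run in the example output: splits of 11.46s and 12.62s.
    print(fatigue_profile(11.46, 12.62))
```

With the example record's splits (11.46 and 12.62), the result is 1.16 and "Collapsing", the same values as `last_run_final_200m_fatigue` and `last_run_fatigue_profile`.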

Pro Tip: Combine these metrics to find hidden gems! A horse might finish 8th and look bad on traditional form guides, but if it had a position_change of +6 (Strong Closer) and a fatigue_profile of "Sprinting", it was flying home late and might be a massive value in its next start!
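One way to apply this tip to a downloaded dataset is a simple filter over the pre-race records. The field names come from the example output above; the `min_finish_pos` cut-off and the `hidden_gems` name are assumptions made for illustration.

```python
def hidden_gems(records, min_finish_pos=4):
    """Runners that finished out of the places last time but closed strongly.

    `min_finish_pos` is an illustrative cut-off: 4 means "finished 4th or worse".
    """
    return [
        r for r in records
        if r.get("last_run_closing_profile") == "Strong Closer"
        and r.get("last_run_fatigue_profile") == "Sprinting"
        and (r.get("last_run_finish_pos") or 0) >= min_finish_pos
    ]


if __name__ == "__main__":
    # Made-up records using the field names from the example output.
    runners = [
        {"horse_name": "A", "last_run_finish_pos": 8,
         "last_run_closing_profile": "Strong Closer",
         "last_run_fatigue_profile": "Sprinting"},
        {"horse_name": "B", "last_run_finish_pos": 12,
         "last_run_closing_profile": "Faded",
         "last_run_fatigue_profile": "Collapsing"},
    ]
    print([r["horse_name"] for r in hidden_gems(runners)])
```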

🏁 RESULT MODE

Example Input

  1. The "Historical Harvest" (Bulk Mode): Best for users building a new database who want to scrape every race result currently listed on the HKJC website in one go.
{
  "mode": "result",
  "all_result_dates": true
}
  2. The "Targeted Handicapper" (Specific Races): Ideal for analyzing specific races on a given day (e.g., just the Feature Race or the first leg of a Triple Trio).
{
  "mode": "result",
  "result_dates": ["2026-03-15"],
  "result_race_numbers": [1, 5]
}
  3. The "Full Meeting" Scrape: The standard way to get every result from a specific race day. By omitting result_race_numbers, the scraper automatically processes all races for that date.
{
  "mode": "result",
  "result_dates": ["2026-03-15", "2026-03-18"]
}

Example Output

The output would be a list of dictionaries, such as the following:

{
"race_number": 2,
"season_race": 516,
"venue": "SHA TIN",
"date": "2026-03-15",
"race_title": "SOUTH WALL HANDICAP",
"going": "GOOD TO FIRM",
"track_type": "TURF",
"track_type_status": "C+3 Course",
"class": "Class 5",
"distance": 1200,
"max_rating": 40,
"min_rating": 0,
"prize": 875000,
"race_time": [
23.96,
46.47,
69.22
],
"race_sectional_time": [
{
"main": 23.96,
"subs": []
},
{
"main": 22.51,
"subs": [
11.13,
11.38
]
},
{
"main": 22.75,
"subs": [
11.21,
11.54
]
}
],
"place": 9,
"status": "FINISHED",
"horse_number": 11,
"horse_name": "VERBIER",
"horse_id": "J187",
"horse_url": "https://racing.hkjc.com/en-us/local/information/horse?horseid=HK_2023_J187",
"gear": [
{
"equipment": "BLINKERS",
"action": "RETAINED"
},
{
"equipment": "TONGUE_TIE",
"action": "RETAINED"
}
],
"jockey_name": "L Ferraris",
"jockey_id": "FEL",
"jockey_url": "https://racing.hkjc.com/en-us/local/information/jockeyprofile?jockeyid=FEL&Season=Current",
"trainer_name": "C Fownes",
"trainer_id": "FC",
"trainer_url": "https://racing.hkjc.com/en-us/local/information/trainerprofile?trainerid=FC&Season=Current",
"actual_weight": 125,
"declared_horse_weight": 1261,
"draw": 8,
"length_behind_winner_raw": "5-3/4",
"length_behind_winner": 5.75,
"running_position": [
11,
12,
9
],
"finish_time_raw": "1:10.14",
"finish_time": 70.14,
"win_odds": 15.0,
"sectional_time": [
{
"position": 11,
"margin_raw": "6-1/2",
"margin": 6.5,
"time": 25.0,
"sub_splits": []
},
{
"position": 12,
"margin_raw": "5-3/4",
"margin": 5.75,
"time": 22.39,
"sub_splits": [
10.89,
11.5
]
},
{
"position": 9,
"margin_raw": "5-3/4",
"margin": 5.75,
"time": 22.75,
"sub_splits": [
11.13,
11.62
]
}
],
"comment": "Waited with towards rear, dropped to last place top of home straight, no impression."
}
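In this example, `race_time` holds the leader's cumulative times and `race_sectional_time` holds the per-section differences. The sketch below shows how the two relate and adds a quick consistency check on a runner's own sections. It reflects the example record only and is not a guarantee about every race; both function names are hypothetical.

```python
def sectionals_from_cumulative(race_time):
    """Turn cumulative times [t1, t2, ...] into per-section times."""
    sections, previous = [], 0.0
    for total in race_time:
        sections.append(round(total - previous, 2))
        previous = total
    return sections


def sections_match_finish(record):
    """True if the runner's section times add up to its finish_time."""
    total = sum(s["time"] for s in record["sectional_time"])
    return round(total, 2) == record["finish_time"]


if __name__ == "__main__":
    # Cumulative race times from the example record.
    print(sectionals_from_cumulative([23.96, 46.47, 69.22]))
```

A check like `sections_match_finish` is a cheap way to spot parsing problems before loading results into a model.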

📋 REPORT MODE

Example Input

  1. The "Historical Archive" (Bulk Reports): Automatically finds and extracts every Steward's Incident Report currently available in the HKJC dropdown archive. This is ideal for building a comprehensive database of racing interference and track incidents.
{
  "mode": "report",
  "all_report_dates": true
}
  2. Targeted Meeting Reports: Scrapes official incident reports for a specific date (or set of dates). Use this to analyze the steward's findings immediately after a race meeting has concluded.
{
  "mode": "report",
  "report_dates": ["2026-03-15", "2026-03-18"]
}

Example Output

The output would be a dictionary, such as the following:

{
"date": "2026-03-15",
"start_time": "12:30",
"venue": "SHA TIN",
"track_type": "MIXED",
"track_type_status": "AWT / TURF C+3 COURSE",
"going_details": [
{
"status": "GOOD",
"races": [
1,
5,
8
]
},
{
"status": "GOOD TO FIRM",
"races": [
2,
3,
4,
6,
7,
9,
10,
11
]
}
],
"penetrometer": [
{
"value": 2.71,
"time": "08:00 am"
},
{
"value": 2.71,
"time": "11:30 am"
}
],
"clegg_hammer": [
{
"value": 9.06,
"time": "08:00 am"
},
{
"value": 9.08,
"time": "11:30 am"
}
],
"stewards": [
{
"name": "Dr Henry H L Chan",
"position": "Chairman"
},
{
"name": "Mr Benedict Sin",
"position": "Steward"
},
{
"name": "Mr M Van Gestel",
"position": "Chief Stipendiary Steward"
},
{
"name": "Mr T Bailey",
"position": "Stipendiary Steward"
},
{
"name": "Mr T Vassallo",
"position": "Stipendiary Steward"
},
{
"name": "Mr K C Y Kwok",
"position": "Stipendiary Steward"
},
{
"name": "Mr J C H Ho",
"position": "Stipendiary Steward"
}
],
"race_number": 2,
"season_race": 516,
"race_title": "SOUTH WALL HANDICAP",
"section": 1,
"race_class": "Class 5",
"distance": 1200,
"place": 11,
"dead_heat": false,
"horse_number": 1,
"horse": {
"name": "WINNING CIGAR",
"id": "K422",
"url": "https://racing.hkjc.com/en-us/local/information/horse?horseid=HK_2024_K422"
},
"drawn": 7,
"jockey": {
"name": "H Bentley",
"id": "BHW",
"url": "https://racing.hkjc.com/en-us/local/information/jockeyprofile?jockeyid=BHW&Season=Current",
"allowance": 0
},
"incident": "Jumped only fairly. Between the 300 Metres and the 200 Metres had difficulty obtaining clear running."
}
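For the "unlucky runner" use case, a simple keyword scan over the `incident` text is one possible starting point. The phrase list below is purely an assumption chosen for illustration (only "clear running" appears in the example above); tune it against real reports before relying on it. `faced_interference` and `unlucky_runners` are hypothetical names.

```python
# Illustrative phrases only; this list is an assumption, not an HKJC vocabulary.
INTERFERENCE_PHRASES = ("clear running", "checked", "hampered", "bumped", "crowded")


def faced_interference(report):
    """True if the incident text contains any of the interference phrases."""
    text = (report.get("incident") or "").lower()
    return any(phrase in text for phrase in INTERFERENCE_PHRASES)


def unlucky_runners(reports, worse_than=3):
    """Names of horses that finished worse than `worse_than` and met trouble."""
    return [
        r["horse"]["name"]
        for r in reports
        if (r.get("place") or 0) > worse_than and faced_interference(r)
    ]


if __name__ == "__main__":
    example = {
        "place": 11,
        "horse": {"name": "WINNING CIGAR"},
        "incident": "Jumped only fairly. Between the 300 Metres and the 200 "
                    "Metres had difficulty obtaining clear running.",
    }
    print(unlucky_runners([example]))
```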

🏇 BARRIER TRIAL MODE

Example Input

  1. Bulk Trial Discovery: Fetches every available morning barrier trial result from the archive. Essential for tracking long-term horse preparation patterns and fitness cycles.
{
  "mode": "barrier_trial",
  "all_barrier_dates": true
}
  2. Recent Form Check: Scrapes results for the most recent morning trials. This helps in identifying horses that showed strong speed or "finished strongly" in non-betting trials before their next race.
{
  "mode": "barrier_trial",
  "barrier_dates": ["2026-04-02"]
}

Example Output

The output would be a dictionary, such as the following:

{
"batch_number": 2,
"venue": "CONGHUA",
"track_type": "ALL WEATHER TRACK",
"distance": 1200,
"going": "GOOD",
"overall_time": 70.88,
"sectional_times": [
24.9,
22.7,
23.2
],
"video_url": "https://racing.hkjc.com/contentAsset/videoplayer_v4/video-player-iframe_v4.html?type=brts&date=20260402&rc=ch&no=02&lang=eng&rf=http://racing.hkjc.com/en-us/local/information/btresult?Date=2026/04/02&pageid=racing/local&jumpTime=47",
"date": "2026-04-02",
"horse_name": "BEAUTY CRESCENT",
"horse_id": "H334",
"horse_url": "https://racing.hkjc.com/en-us/local/information/horse?horseid=HK_2022_H334",
"jockey": "M Kellady",
"trainer": "A S Cruz",
"draw": 2,
"gear": [
{
"equipment": "BLINKERS",
"action": "RETAINED"
}
],
"lbw_raw": "8-1/4L",
"lbw": 8.25,
"running_position": [
4,
4,
4
],
"finish_time_raw": "1.12.21",
"finish_time": 72.21,
"result": "",
"comment": "Limited response when asked; not as expected."
}
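The `_raw` fields keep the original HKJC strings next to the parsed numbers. If you ever need to re-derive the numeric values yourself, the conversion could be sketched as follows. The formats handled here are inferred from the examples in this README ("8-1/4L", "5-3/4", "1.12.21"); anything else, such as non-numeric margin codes, simply returns None. Both function names are hypothetical.

```python
from fractions import Fraction


def parse_margin(raw):
    """'8-1/4L' -> 8.25, '5-3/4' -> 5.75, '2' -> 2.0; None if not numeric."""
    text = raw.strip().rstrip("L")
    whole, _, frac = text.partition("-")
    try:
        if "/" in whole:  # a bare fraction such as "3/4"
            return float(Fraction(whole))
        value = Fraction(int(whole))
        if frac:
            value += Fraction(frac)
        return float(value)
    except (ValueError, ZeroDivisionError):
        return None


def parse_trial_time(raw):
    """'1.12.21' (minutes.seconds.hundredths, assumed format) -> 72.21."""
    minutes, seconds, hundredths = raw.split(".")
    return round(int(minutes) * 60 + int(seconds) + int(hundredths) / 100, 2)


if __name__ == "__main__":
    print(parse_margin("8-1/4L"), parse_trial_time("1.12.21"))
```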

🏥 VETERINARY MODE

Example Input

  1. Targeted Horse Search (By Brand Number): The fastest way to get the full medical history for specific horses. Simply provide the unique HKJC Brand Numbers.
{
  "mode": "veterinary",
  "brand_numbers": ["K458", "H095"]
}
  2. Full Veterinary Database Export: This is the "Power User" mode. It triggers the scraper to crawl the entire HKJC veterinary database to extract health records for every horse currently registered in the system. Expect more than 700 horses, each with multiple reports.
{
  "mode": "veterinary",
  "get_full_database": true
}
  3. "Today's Starters" Health Check: Automatically identifies all horses declared for the next upcoming race meeting and retrieves their medical history. Perfect for last-minute filtering of runners with recent health issues.
{
  "mode": "veterinary",
  "get_starters": true
}

Example Output

The output would be a dictionary, such as the following:

{
"horse_name": "SETANTA",
"horse_id": "G095",
"vet_reports": [
{
"record_date": "2021-09-06",
"record_details": "Castration.",
"passing_date": null
},
{
"record_date": "2025-08-11",
"record_details": "Eight years of age or above at season end.",
"passing_date": "2025-09-02"
}
]
}
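A common follow-up is measuring how long a horse took to be passed after a veterinary record. The sketch below works on one `vet_reports` entry and assumes both dates are ISO formatted as in the example; a missing `passing_date` is returned as None rather than interpreted. `days_to_pass` is a hypothetical helper name.

```python
from datetime import date


def days_to_pass(report):
    """Days between record_date and passing_date, or None if not passed."""
    if not report.get("passing_date"):
        return None
    recorded = date.fromisoformat(report["record_date"])
    passed = date.fromisoformat(report["passing_date"])
    return (passed - recorded).days


if __name__ == "__main__":
    print(days_to_pass({"record_date": "2025-08-11", "passing_date": "2025-09-02"}))
```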

💰 Pricing: Pay-Per-Event (PPE)

This Actor uses a transparent Pay-Per-Event pricing model. You only pay for the individual records pushed to the dataset. For Results, Reports, and Trials, data is denormalized so that each horse's performance or incident is a unique, billable record.

  • Price per 1,000 results: $15.00
  • Price per single record (1 Horse): $0.015

📊 Cost Examples:

| Service Mode | Estimated Records | Estimated Cost |
| --- | --- | --- |
| Full Race Meeting (Results) | ~150 horse results | ~$2.25 |
| Full Meeting Incident Reports | ~140-150 reports | ~$2.10 – $2.25 |
| Morning Barrier Trials | ~50-100 horses | $0.75 – $1.50 |
| Lifetime Veterinary History | 1 Horse (Full History) | $0.015 |

Record Volume Warning: The total number of records (and thus the cost) depends on the specific event. For example, a Barrier Trial morning may have only 50 horses, while a peak season trial day could have over 100 horses. Similarly, race cards vary between 8 and 11 races per meeting.
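Since the price is a flat rate per record, a cost estimate is a single multiplication. The snippet below is a minimal sketch using the $15.00 per 1,000 results rate stated above; the record counts you pass in are your own estimates, and `estimate_cost` is a hypothetical helper name.

```python
PRICE_PER_1000_RESULTS = 15.00  # USD, from the pricing section


def estimate_cost(records):
    """Estimated cost in USD for a given number of dataset records."""
    return round(records * PRICE_PER_1000_RESULTS / 1000, 3)


if __name__ == "__main__":
    for count in (1, 50, 100, 150):
        print(count, estimate_cost(count))
```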


💡 Pro Tip: Data Export

Because the data is denormalized (flattened), you can export these results directly to Excel or CSV from the Apify platform. Every row is a complete record containing both the horse's specific data and the meeting's metadata (Date, Venue, etc.), making it ready for immediate analysis in Pandas, Excel, or Google Sheets.
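If you work from the JSON export instead, note that some fields (for example `gear` or `running_position`) are still lists. One way to write such records to CSV yourself is to JSON-encode those cells, as in this sketch; `records_to_csv` is a hypothetical helper and only uses the Python standard library.

```python
import csv
import io
import json


def records_to_csv(records):
    """Serialize records to CSV text, JSON-encoding list and dict values."""
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)  # keep first-seen column order
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow({
            key: json.dumps(value) if isinstance(value, (list, dict)) else value
            for key, value in record.items()
        })
    return buffer.getvalue()


if __name__ == "__main__":
    print(records_to_csv([
        {"horse_name": "VERBIER", "place": 9, "running_position": [11, 12, 9]},
    ]))
```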

🛠 Troubleshooting

  • Date Formats: Ensure dates are YYYY-MM-DD. If the UI shows a validation error, double-check your format.
  • Empty Results: If you search for "Today's Starters" before the HKJC has officially declared the meeting (usually 48 hours before), the result list will be empty.
  • Race Numbers: If you enter [12] for a meeting that only has 9 races, no data will be returned for that specific race number.
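When dates are generated by a script, you can catch format problems before the run starts. A minimal sketch (`invalid_dates` is a hypothetical helper, not part of the Actor) that flags anything not strictly in YYYY-MM-DD form:

```python
from datetime import datetime


def invalid_dates(dates):
    """Return the entries that are not valid, zero-padded YYYY-MM-DD dates."""
    bad = []
    for value in dates:
        try:
            parsed = datetime.strptime(value, "%Y-%m-%d")
        except ValueError:
            bad.append(value)
            continue
        # strptime accepts "2026-3-15"; the round trip rejects missing padding.
        if parsed.strftime("%Y-%m-%d") != value:
            bad.append(value)
    return bad


if __name__ == "__main__":
    print(invalid_dates(["2026-03-15", "2026-3-15", "15/03/2026"]))
```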

✉️ Feedback & Custom Work

I am committed to the ongoing development of this Actor. Future updates and feature additions (such as Dividend/Payout scraping or Weather analysis) will be prioritized based on user reviews, demand, and specific requests.

  • Support: I actively monitor the performance of this tool to ensure it adapts to HKJC website updates. If you find any data inconsistencies or parsing errors, please open an issue in the Actor's "Issues" tab.
  • Reviews: If this data powers your winning model or saves you hours of manual work, please leave a 5-star review! Your feedback directly influences the development roadmap.
  • Customization: Need a specialized data extractor, a custom database integration, or a unique "Actor-as-a-Product" solution? I am available for custom Actor development.

Contact me: bd.pascari@gmail.com