Baseball Savant Statcast Pitch Data Scraper
Pricing
Pay per event
Scrapes pitch-level Statcast data from Baseball Savant: exit velocity, launch angle, xwOBA, barrel rate, spin rate, and more. Outputs one row per tracked pitch event. Optionally fetches expected-stats leaderboards (xBA, xwOBA, barrel%). The canonical dataset for modern sabermetric analysis.
Developer
BowTiedRaccoon
Maintained by Community
Extracts pitch-level Statcast data from Baseball Savant (baseballsavant.mlb.com). One row per tracked pitch event: exit velocity, launch angle, xwOBA, barrel flag, spin rate, and more. Optionally also pulls the expected-stats leaderboard for per-player season aggregates.
This is the dataset Python analysts currently pull via pybaseball. The scraper handles the large CSV downloads, chunking, and field coercion so you don't have to.
What It Does
- Fetches Statcast pitch-level data from the statcast_search CSV endpoint
- Covers batter data, pitcher data, or both, configurable per run
- Filters by team abbreviation (e.g., NYY, LAD) to keep individual requests manageable
- Optionally pulls the expected-stats leaderboard (xBA, xwOBA per player, season aggregates)
- Returns clean, typed JSON with consistent field names
Use Cases
- Sabermetric analysts building pitch-mix models, exit velocity distributions, or barrel-rate dashboards
- Fantasy/DFS players tracking recent Statcast trends to identify undervalued hitters or pitchers
- Sports betting models that use xwOBA and spin rate as input features
- Researchers studying pitch design, swing mechanics, or predictive batting metrics
- Data engineers building baseball datasets without managing throttling or CSV parsing
How It Works
- Input your target season, player type (batter/pitcher/both), and optionally a team abbreviation
- The scraper fetches the Baseball Savant CSV endpoint with a browser user agent
- Parses the CSV row by row (RFC 4180-compliant, handles BOM and quoted fields)
- Coerces all numeric fields and returns one record per pitch event
- Optionally fetches the expected-stats leaderboard as a second pass
No proxies required. Baseball Savant is an open, public endpoint run by MLB.
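The parsing and coercion steps above can be sketched as follows. This is a minimal illustration, not the actor's actual implementation: the function names, the sets of numeric columns, and the sample rows are assumptions made for the example, and no network request is made.

```python
import csv
import io

# Columns treated as numeric in this sketch (an assumption; the actor's real list may differ).
INT_FIELDS = {"launch_angle", "balls", "strikes", "zone"}
FLOAT_FIELDS = {"release_speed", "launch_speed"}


def coerce(name, value):
    """Convert a raw CSV cell to int, float, str, or None for an empty cell."""
    if value == "":
        return None
    if name in INT_FIELDS:
        return int(float(value))  # tolerate values such as "28.0"
    if name in FLOAT_FIELDS:
        return float(value)
    return value


def parse_statcast_csv(text):
    """Parse CSV text into one dict per pitch event."""
    text = text.lstrip("\ufeff")  # drop a leading byte-order mark if present
    # newline="" leaves line endings untouched so the csv module handles them itself
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return [{k: coerce(k, v) for k, v in row.items()} for row in reader]


# Hypothetical two-row sample with a BOM, a quoted field, and empty cells.
sample = (
    "\ufeffplayer_name,pitch_type,release_speed,launch_speed,launch_angle,balls,strikes\r\n"
    '"Judge, Aaron",FF,95.2,109.8,28,1,1\r\n'
    '"Judge, Aaron",SL,86.4,,,1,2\r\n'
)
rows = parse_statcast_csv(sample)
```

Empty cells become None rather than 0, so a pitch that was not put in play is distinguishable from a batted ball with a measured value of zero.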
Input
| Field | Type | Default | Description |
|---|---|---|---|
| season | integer | 2024 | MLB season year to scrape |
| playerType | string | batter | Whether to fetch batter, pitcher, or both |
| mode | string | statcast | Data mode: statcast (pitch-level), leaderboard (season aggregates), or both |
| teamAbbrev | string | (all teams) | Optional 3-letter team abbreviation to restrict the query (e.g. NYY, LAD, BOS) |
| minPa | integer | 50 | Minimum plate appearances for the leaderboard mode |
| maxItems | integer | 15 | Maximum records to return. Set to 0 for unlimited (full season = 700k+ rows for all teams) |
Example — 2024 NYY batters, statcast mode:
{"season": 2024,"playerType": "batter","mode": "statcast","teamAbbrev": "NYY","maxItems": 1000}
Example — full league expected stats leaderboard:
{"season": 2024,"playerType": "batter","mode": "leaderboard","minPa": 300,"maxItems": 0}
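When inputs are generated programmatically, it can help to check them against the table above before starting a run. The helper below is a hypothetical sketch: `build_input` is not part of the actor, and the validation rules are inferred only from the documented fields, defaults, and allowed values.

```python
# Defaults and allowed values copied from the input table.
DEFAULTS = {
    "season": 2024,
    "playerType": "batter",
    "mode": "statcast",
    "teamAbbrev": None,  # None stands for "all teams" in this sketch
    "minPa": 50,
    "maxItems": 15,
}
ALLOWED = {
    "playerType": {"batter", "pitcher", "both"},
    "mode": {"statcast", "leaderboard", "both"},
}


def build_input(**overrides):
    """Merge overrides into the documented defaults and validate the result."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")
    merged = {**DEFAULTS, **overrides}
    for field, allowed in ALLOWED.items():
        if merged[field] not in allowed:
            raise ValueError(f"{field} must be one of {sorted(allowed)}")
    for field in ("season", "minPa", "maxItems"):
        if not isinstance(merged[field], int) or merged[field] < 0:
            raise ValueError(f"{field} must be a non-negative integer")
    # Leave teamAbbrev out entirely when querying all teams.
    return {k: v for k, v in merged.items() if v is not None}


nyy_input = build_input(teamAbbrev="NYY", maxItems=1000)
leaderboard_input = build_input(mode="leaderboard", minPa=300, maxItems=0)
```

The two calls reproduce the example inputs shown above, with the unspecified fields filled in from the defaults.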
Output
Statcast mode
One record per pitch event. Large seasons can return 700,000+ rows for all teams.
{"player_name": "Judge, Aaron","batter_id": 592450,"pitcher_id": 519242,"game_date": "2024-09-28","pitch_type": "FF","release_speed": 95.2,"launch_speed": 109.8,"launch_angle": 28,"estimated_ba": 0.732,"estimated_woba": 0.891,"barrel": 1,"hit_distance": 423,"spin_rate": 2241,"events": "home_run","description": "hit_into_play","zone": 5,"balls": 1,"strikes": 1,"source": "statcast"}
| Field | Type | Description |
|---|---|---|
| player_name | string | Batter name (Last, First) |
| batter_id | integer | MLBAM batter player ID |
| pitcher_id | integer | MLBAM pitcher player ID |
| game_date | string | Date of the game (YYYY-MM-DD) |
| pitch_type | string | Pitch type abbreviation (FF = four-seam fastball, SL = slider, CH = changeup, etc.) |
| release_speed | number | Pitch release speed in mph |
| launch_speed | number | Exit velocity off the bat in mph |
| launch_angle | number | Launch angle in degrees |
| estimated_ba | number | Expected batting average (xBA) based on exit velocity and launch angle |
| estimated_woba | number | Expected weighted on-base average (xwOBA) |
| barrel | integer | Barrel flag: 1 if the batted ball meets the barrel threshold, 0 otherwise |
| hit_distance | number | Projected hit distance in feet |
| spin_rate | number | Pitch spin rate in RPM |
| events | string | Plate appearance result (single, home_run, strikeout, walk, etc.) |
| description | string | Pitch result (called_strike, ball, swinging_strike, hit_into_play, etc.) |
| zone | integer | Strike zone location (1-9 in zone, 11-14 outside) |
| balls | integer | Ball count at time of pitch |
| strikes | integer | Strike count at time of pitch |
| source | string | Always statcast in this mode |
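As an illustration of working with these fields, the sketch below aggregates pitch-level records into an average exit velocity and a barrel rate. The function name and sample records are made up for the example, and the choice to count only rows whose description is hit_into_play is an assumption about how one might define a batted ball here.

```python
from statistics import mean


def summarize_batted_balls(records):
    """Aggregate pitch-level rows into simple batted-ball metrics."""
    # Keep only balls put in play that have a measured exit velocity.
    batted = [
        r for r in records
        if r.get("description") == "hit_into_play" and r.get("launch_speed") is not None
    ]
    if not batted:
        return None
    barrels = sum(1 for r in batted if r.get("barrel") == 1)
    return {
        "batted_balls": len(batted),
        "avg_exit_velocity": round(mean(r["launch_speed"] for r in batted), 1),
        "barrel_rate": round(barrels / len(batted), 3),
    }


# Hypothetical records using the field names from the table above.
records = [
    {"description": "hit_into_play", "launch_speed": 109.8, "barrel": 1},
    {"description": "hit_into_play", "launch_speed": 90.2, "barrel": 0},
    {"description": "foul", "launch_speed": 80.0, "barrel": 0},
    {"description": "called_strike", "launch_speed": None, "barrel": None},
]
summary = summarize_batted_balls(records)
```

Because barrel is a per-pitch flag, any barrel rate has to be computed downstream like this; the denominator you choose changes the number, so state it alongside the result.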
Leaderboard mode
One record per player per season. Typically 250-500 rows depending on the minimum plate appearances filter.
{"player_name": "Duran, Jarren","batter_id": 680776,"pitcher_id": null,"game_date": null,"pitch_type": null,"release_speed": null,"launch_speed": null,"launch_angle": null,"estimated_ba": 0.267,"estimated_woba": 0.34,"barrel": null,"hit_distance": null,"spin_rate": null,"events": null,"description": null,"zone": null,"balls": null,"strikes": null,"source": "leaderboard"}
The leaderboard shares the same schema as statcast mode — pitch-specific fields are null for leaderboard rows. estimated_ba and estimated_woba reflect season aggregates.
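Since both modes emit the same schema, a combined run can be separated on the source field. The sketch below builds a lookup of season xwOBA from the leaderboard rows and attaches it to each pitch row. The function name and the `season_xwoba` key are hypothetical, introduced only for this example.

```python
def attach_season_xwoba(records):
    """Return statcast rows with the batter's season xwOBA added where available."""
    # Leaderboard rows carry the season aggregate in estimated_woba.
    season = {
        r["batter_id"]: r["estimated_woba"]
        for r in records
        if r["source"] == "leaderboard"
    }
    return [
        {**r, "season_xwoba": season.get(r["batter_id"])}
        for r in records
        if r["source"] == "statcast"
    ]


# Hypothetical mixed output from a run with mode set to both.
mixed = [
    {"batter_id": 592450, "estimated_woba": 0.891, "pitch_type": "FF", "source": "statcast"},
    {"batter_id": 680776, "estimated_woba": 0.120, "pitch_type": "SL", "source": "statcast"},
    {"batter_id": 680776, "estimated_woba": 0.34, "pitch_type": None, "source": "leaderboard"},
]
pitches = attach_season_xwoba(mixed)
```

Pitch rows for batters who fall below the minPa threshold have no leaderboard row, so their `season_xwoba` is None in this sketch.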
🔍 FAQ
How do I scrape statcast data from Baseball Savant?
Baseball Savant Statcast Pitch Data Scraper hits the public statcast_search/csv endpoint directly. Set your season, filter by team if needed, and set maxItems to cap the result set for testing.
How much does this actor cost to run?
Baseball Savant Statcast Pitch Data Scraper uses pay-per-event pricing at the standard rate. A full team season (~25,000 pitch events) is inexpensive. A full league season (700,000+ rows) costs more — consider filtering by team or adding a maxItems cap if you only need a sample.
What data can I get from Baseball Savant statcast?
Baseball Savant Statcast Pitch Data Scraper returns pitch velocity, launch angle, exit velocity, xBA, xwOBA, barrel flag, spin rate, zone, count, and game context: the core Statcast fields analysts use for modern baseball modeling.
Can I filter by team?
Set teamAbbrev to any standard 3-letter MLB abbreviation (NYY, LAD, BOS, HOU, etc.) to pull only pitches involving that team's batters or pitchers. Useful when you don't need the full league dataset.
Does this actor need proxies?
Baseball Savant Statcast Pitch Data Scraper does not require proxies. The endpoint is public and accessible with a standard browser user agent.
Why Use Baseball Savant Statcast Pitch Data Scraper?
- No setup overhead — pybaseball requires a Python environment, rate limiting awareness, and manual CSV handling. This actor returns structured JSON in one click.
- Team-level filtering — pull a specific team's pitch data instead of waiting for a full-league 17 MB CSV to download.
- Two modes in one — pitch-level statcast detail and per-player expected-stats leaderboard from the same actor with the same output schema.
Need More Features?
Need additional Statcast fields, date-range filtering, or a specific leaderboard type? File an issue or get in touch.