Baseball Savant Statcast Pitch Data Scraper

Pricing

Pay per event

Scrapes pitch-level Statcast data from Baseball Savant: exit velocity, launch angle, xwOBA, barrel rate, spin rate, and more. Outputs one row per tracked pitch event. Optionally fetches expected-stats leaderboards (xBA, xwOBA, barrel%). The canonical dataset for modern sabermetric analysis.

Developer

BowTiedRaccoon

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: a day ago

Extracts pitch-level Statcast data from Baseball Savant (baseballsavant.mlb.com). One row per tracked pitch event: exit velocity, launch angle, xwOBA, barrel rate, spin rate, and more. Optionally also pulls the expected-stats leaderboard for per-player season aggregates.

This is the dataset Python analysts currently pull via pybaseball. The scraper handles the large CSV downloads, chunking, and field coercion so you don't have to.
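The chunking mentioned above is not described in detail on this page. One common approach, and the one assumed in this sketch, is to split a date span into short windows and request each window separately so that no single CSV grows too large. The function name, the 7-day window, and the example dates are illustrative assumptions, not the actor's actual settings.

```python
from datetime import date, timedelta


def date_windows(start, end, days=7):
    """Split the inclusive range [start, end] into consecutive windows of at most `days` days."""
    if days < 1:
        raise ValueError("days must be at least 1")
    windows = []
    cursor = start
    while cursor <= end:
        # The last window is shortened so it never runs past `end`.
        window_end = min(cursor + timedelta(days=days - 1), end)
        windows.append((cursor, window_end))
        cursor = window_end + timedelta(days=1)
    return windows


# Hypothetical example span; real season dates vary by year.
for window_start, window_end in date_windows(date(2024, 4, 1), date(2024, 4, 20)):
    print(window_start.isoformat(), "to", window_end.isoformat())
```

Each window could then be fetched and parsed independently, with the results concatenated afterwards.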

What It Does

  • Fetches Statcast pitch-level data from the statcast_search CSV endpoint
  • Covers batter data, pitcher data, or both — configurable per run
  • Filters by team abbreviation (e.g., NYY, LAD) to keep individual requests manageable
  • Optionally pulls the expected-stats leaderboard (xBA, xwOBA per player, season aggregates)
  • Returns clean, typed JSON with consistent field names
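For readers curious what a request to the statcast_search CSV endpoint might look like, here is a minimal sketch using only the Python standard library. The query parameter names (`all`, `type`, `hfSea`, `player_type`, `team`) are assumptions modeled on the public search form and are not confirmed by this actor; the function name and user agent string are hypothetical. The code only builds the request object and sends nothing over the network.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://baseballsavant.mlb.com/statcast_search/csv"


def build_statcast_request(season, player_type="batter", team=None):
    """Build (but do not send) a request for pitch-level CSV data."""
    # Assumed parameter names; the real endpoint may expect more or different ones.
    params = {
        "all": "true",
        "type": "details",
        "hfSea": f"{season}|",
        "player_type": player_type,
        "team": team or "",
    }
    url = f"{BASE_URL}?{urlencode(params)}"
    # Illustrative browser-like user agent string.
    headers = {"User-Agent": "Mozilla/5.0 (compatible; example-client)"}
    return Request(url, headers=headers)


request = build_statcast_request(2024, team="NYY")
print(request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the actual download.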

Use Cases

  • Sabermetric analysts building pitch-mix models, exit velocity distributions, or barrel-rate dashboards
  • Fantasy/DFS players tracking real-time Statcast trends to identify undervalued hitters or pitchers
  • Sports betting models that use xwOBA and spin rate as input features
  • Researchers studying pitch design, swing mechanics, or predictive batting metrics
  • Data engineers building baseball datasets without managing throttling or CSV parsing

How It Works

  1. Input your target season, player type (batter/pitcher/both), and optionally a team abbreviation
  2. The scraper fetches the Baseball Savant CSV endpoint with a browser user agent
  3. Parses the CSV row by row (RFC 4180-compliant, handles BOM and quoted fields)
  4. Coerces all numeric fields and returns one record per pitch event
  5. Optionally fetches the expected-stats leaderboard as a second pass
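Steps 3 and 4 above can be sketched with Python's standard `csv` module, which handles quoted fields such as "Judge, Aaron". This is an illustration under stated assumptions, not the actor's actual code: the function names are hypothetical, and the coercion rule (empty cell becomes `None`, then try integer, then float) is one plausible choice.

```python
import csv
import io


def coerce(value):
    """Convert a CSV cell to None, int, or float; leave other text unchanged."""
    if value is None:
        return None
    text = value.strip()
    if text == "":
        return None
    try:
        return int(text)
    except ValueError:
        pass
    try:
        return float(text)
    except ValueError:
        return text


def parse_statcast_csv(raw_text, max_items=0):
    """Parse CSV text into one dict per row; max_items=0 means no limit."""
    # Drop a leading byte order mark so the first header name stays clean.
    cleaned = raw_text.lstrip("\ufeff")
    reader = csv.DictReader(io.StringIO(cleaned))
    records = []
    for row in reader:
        records.append({key: coerce(cell) for key, cell in row.items()})
        if max_items and len(records) >= max_items:
            break
    return records


# Tiny inline sample with a BOM, a quoted name, and empty cells.
sample = (
    '\ufeff"player_name","launch_speed","launch_angle","events"\n'
    '"Judge, Aaron",109.8,28,home_run\n'
    '"Judge, Aaron",,,\n'
)
records = parse_statcast_csv(sample)
print(records[0])
```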

No proxies required. Baseball Savant is a public site run by MLB, and its CSV endpoint is openly accessible.

Input

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| season | integer | 2024 | MLB season year to scrape |
| playerType | string | batter | Whether to fetch batter, pitcher, or both |
| mode | string | statcast | Data mode: statcast (pitch-level), leaderboard (season aggregates), or both |
| teamAbbrev | string | (all teams) | Optional 3-letter team abbreviation to restrict the query (e.g. NYY, LAD, BOS) |
| minPa | integer | 50 | Minimum plate appearances for the leaderboard mode |
| maxItems | integer | 15 | Maximum records to return. Set to 0 for unlimited (full season = 700k+ rows for all teams) |

Example — 2024 NYY batters, statcast mode:

{
  "season": 2024,
  "playerType": "batter",
  "mode": "statcast",
  "teamAbbrev": "NYY",
  "maxItems": 1000
}

Example — full league expected stats leaderboard:

{
  "season": 2024,
  "playerType": "batter",
  "mode": "leaderboard",
  "minPa": 300,
  "maxItems": 0
}
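When assembling these inputs programmatically, a small helper can apply the documented defaults and reject unsupported values before a run starts. The sketch below is hypothetical: the helper name and its validation rules are assumptions, while the field names, allowed values, and defaults come from the input table above.

```python
import json

ALLOWED_PLAYER_TYPES = {"batter", "pitcher", "both"}
ALLOWED_MODES = {"statcast", "leaderboard", "both"}


def build_input(season=2024, player_type="batter", mode="statcast",
                team_abbrev=None, min_pa=50, max_items=15):
    """Return an input dict using the field names and defaults from the table."""
    if player_type not in ALLOWED_PLAYER_TYPES:
        raise ValueError(f"playerType must be one of {sorted(ALLOWED_PLAYER_TYPES)}")
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {sorted(ALLOWED_MODES)}")
    if max_items < 0 or min_pa < 0:
        raise ValueError("maxItems and minPa must not be negative")
    run_input = {
        "season": season,
        "playerType": player_type,
        "mode": mode,
        "minPa": min_pa,
        "maxItems": max_items,
    }
    # Leaving teamAbbrev out means "all teams".
    if team_abbrev:
        run_input["teamAbbrev"] = team_abbrev.upper()
    return run_input


nyy_input = build_input(team_abbrev="nyy", max_items=1000)
print(json.dumps(nyy_input, indent=2))
```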

Output

Statcast mode

One record per pitch event. Large seasons can return 700,000+ rows for all teams.

{
  "player_name": "Judge, Aaron",
  "batter_id": 592450,
  "pitcher_id": 519242,
  "game_date": "2024-09-28",
  "pitch_type": "FF",
  "release_speed": 95.2,
  "launch_speed": 109.8,
  "launch_angle": 28,
  "estimated_ba": 0.732,
  "estimated_woba": 0.891,
  "barrel": 1,
  "hit_distance": 423,
  "spin_rate": 2241,
  "events": "home_run",
  "description": "hit_into_play",
  "zone": 5,
  "balls": 1,
  "strikes": 1,
  "source": "statcast"
}

| Field | Type | Description |
| --- | --- | --- |
| player_name | string | Batter name (Last, First) |
| batter_id | integer | MLBAM batter player ID |
| pitcher_id | integer | MLBAM pitcher player ID |
| game_date | string | Date of the game (YYYY-MM-DD) |
| pitch_type | string | Pitch type abbreviation (FF = four-seam fastball, SL = slider, CH = changeup, etc.) |
| release_speed | number | Pitch release speed in mph |
| launch_speed | number | Exit velocity off the bat in mph |
| launch_angle | number | Launch angle in degrees |
| estimated_ba | number | Expected batting average (xBA) based on exit velocity and launch angle |
| estimated_woba | number | Expected weighted on-base average (xwOBA) |
| barrel | integer | Barrel flag: 1 if the batted ball meets the barrel threshold, 0 otherwise |
| hit_distance | number | Projected hit distance in feet |
| spin_rate | number | Pitch spin rate in RPM |
| events | string | Plate appearance result (single, home_run, strikeout, walk, etc.) |
| description | string | Pitch result (called_strike, ball, swinging_strike, hit_into_play, etc.) |
| zone | integer | Strike zone location (1-9 in zone, 11-14 outside) |
| balls | integer | Ball count at time of pitch |
| strikes | integer | Strike count at time of pitch |
| source | string | Always statcast in this mode |
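As an example of working with these records, the sketch below computes average exit velocity and barrel rate over balls put in play. It assumes the records are already loaded as a list of dicts with the fields in the table; the function name and the sample values are made up for illustration, and counting only rows whose description is hit_into_play is one possible definition of a batted ball.

```python
def batted_ball_summary(records):
    """Summarize balls in play: count, average exit velocity, and barrel rate."""
    batted = [
        r for r in records
        if r.get("description") == "hit_into_play" and r.get("launch_speed") is not None
    ]
    if not batted:
        return None
    count = len(batted)
    barrels = sum(1 for r in batted if r.get("barrel") == 1)
    return {
        "batted_balls": count,
        "avg_exit_velocity": sum(r["launch_speed"] for r in batted) / count,
        "barrel_rate": barrels / count,
    }


# Made-up sample rows using a subset of the documented fields.
sample_pitches = [
    {"description": "hit_into_play", "launch_speed": 109.8, "barrel": 1},
    {"description": "hit_into_play", "launch_speed": 90.2, "barrel": 0},
    {"description": "called_strike", "launch_speed": None, "barrel": None},
]
summary = batted_ball_summary(sample_pitches)
print(summary)
```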

Leaderboard mode

One record per player per season. Typically 250-500 rows depending on the minimum plate appearances filter.

{
  "player_name": "Duran, Jarren",
  "batter_id": 680776,
  "pitcher_id": null,
  "game_date": null,
  "pitch_type": null,
  "release_speed": null,
  "launch_speed": null,
  "launch_angle": null,
  "estimated_ba": 0.267,
  "estimated_woba": 0.34,
  "barrel": null,
  "hit_distance": null,
  "spin_rate": null,
  "events": null,
  "description": null,
  "zone": null,
  "balls": null,
  "strikes": null,
  "source": "leaderboard"
}

The leaderboard shares the same schema as statcast mode — pitch-specific fields are null for leaderboard rows. estimated_ba and estimated_woba reflect season aggregates.
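Because both modes share one schema, a run with mode set to both yields a mixed dataset that can be separated on the source field. The following is a minimal sketch, assuming the records are loaded as a list of dicts; the helper names are hypothetical and the row labeled "Example, Player" is invented for the demonstration.

```python
def split_by_source(records):
    """Group records by their `source` value (statcast or leaderboard)."""
    groups = {"statcast": [], "leaderboard": []}
    for record in records:
        groups.setdefault(record.get("source"), []).append(record)
    return groups


def top_by_xwoba(leaderboard_rows, n=10):
    """Return up to n leaderboard rows, highest estimated_woba first."""
    rated = [r for r in leaderboard_rows if r.get("estimated_woba") is not None]
    return sorted(rated, key=lambda r: r["estimated_woba"], reverse=True)[:n]


mixed = [
    {"player_name": "Judge, Aaron", "estimated_woba": 0.891, "source": "statcast"},
    {"player_name": "Duran, Jarren", "estimated_woba": 0.34, "source": "leaderboard"},
    # Invented row for illustration only.
    {"player_name": "Example, Player", "estimated_woba": 0.41, "source": "leaderboard"},
]
groups = split_by_source(mixed)
best = top_by_xwoba(groups["leaderboard"], n=1)
print(best[0]["player_name"])
```

Keeping the two groups apart matters for aggregation: averaging estimated_woba across pitch rows and season rows together would mix per-pitch and per-season values.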

🔍 FAQ

How do I scrape statcast data from Baseball Savant?
Baseball Savant Statcast Pitch Data Scraper hits the public statcast_search/csv endpoint directly. Set your season, filter by team if needed, and set maxItems to cap the result set for testing.

How much does this actor cost to run?
Baseball Savant Statcast Pitch Data Scraper uses pay-per-event pricing at the standard rate. A full team season (~25,000 pitch events) is inexpensive. A full league season (700,000+ rows) costs more, so consider filtering by team or adding a maxItems cap if you only need a sample.

What data can I get from Baseball Savant statcast?
Baseball Savant Statcast Pitch Data Scraper returns pitch velocity, launch angle, exit velocity, xBA, xwOBA, barrel flag, spin rate, zone, count, and game context: the core Statcast fields analysts use for modern baseball modeling.

Can I filter by team?
Set teamAbbrev to any standard 3-letter MLB abbreviation (NYY, LAD, BOS, HOU, etc.) to pull only pitches involving that team's batters or pitchers. Useful when you don't need the full league dataset.

Does this actor need proxies?
Baseball Savant Statcast Pitch Data Scraper does not require proxies. The endpoint is public and accessible with a standard browser user agent.

Why Use Baseball Savant Statcast Pitch Data Scraper?

  • No setup overhead — pybaseball requires a Python environment, rate limiting awareness, and manual CSV handling. This actor returns structured JSON in one click.
  • Team-level filtering — pull a specific team's pitch data instead of waiting for a full-league 17 MB CSV to download.
  • Two modes in one — pitch-level statcast detail and per-player expected-stats leaderboard from the same actor with the same output schema.

Need More Features?

Need additional Statcast fields, date-range filtering, or a specific leaderboard type? File an issue or get in touch.