# MLB StatsAPI Baseball Data Scraper (`parseforge/mlb-statsapi-scraper`) Actor

Tap the official MLB StatsAPI for teams, players, schedules, venues, leagues, divisions, and seasons. Filter by sport id and season year to build rosters, standings, or game calendars. Useful for fantasy tools, baseball analytics, and historical season research.

- **URL**: https://apify.com/parseforge/mlb-statsapi-scraper.md
- **Developed by:** [ParseForge](https://apify.com/parseforge) (community)
- **Categories:** Sports, Automation, Integrations
- **Stats:** 7 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $7.50 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the higher your subscription plan, the lower the price.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
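
As a rough worked example of the per-result pricing above, the sketch below estimates a run's charge from the listed rate of $7.50 per 1,000 results. The function name `estimate_cost_usd` is hypothetical, and the sketch assumes one charged event per result at a flat rate. The rate that applies to your account depends on your subscription plan, so treat the figure as an estimate rather than a billing calculation.

```python
# Listed "from" rate in USD per 1,000 results; the rate for your plan may differ.
RATE_USD_PER_1000 = 7.50


def estimate_cost_usd(results: int, rate_per_1000: float = RATE_USD_PER_1000) -> float:
    """Estimate the charge for a number of results at a flat per-1,000 rate."""
    if results < 0:
        raise ValueError("results must be non-negative")
    return results * rate_per_1000 / 1000


# Estimate for one row per MLB club (30 results).
print(f"{estimate_cost_usd(30):.3f}")
```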

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

![ParseForge Banner](https://github.com/ParseForge/apify-assets/blob/ad35ccc13ddd068b9d6cba33f323962e39aed5b2/banner.jpg?raw=true)

## ⚾ MLB StatsAPI Public Scraper

> 🚀 **Export MLB teams, players, schedule, and venues in seconds, straight from the public StatsAPI used by MLB.com.**

> 🕒 **Last updated:** 2026-06-05 · **📊 14 fields** per record · All 30 MLB clubs · Player rosters and bios · Schedule, venues, leagues, divisions

The MLB StatsAPI Public Scraper turns the [statsapi.mlb.com](https://statsapi.mlb.com/api/v1/) public endpoint into a clean dataset. It calls the public API with whichever filters you supply, then flattens each record into one row.

Coverage spans all 30 MLB clubs, player rosters and bios, schedules, venues, leagues, and divisions. Each row carries the most useful identifiers, names, scores, and timestamps the upstream feed exposes.

| 🎯 Target Audience | 💡 Primary Use Cases |
|---|---|
| ⚾ Baseball analysts | Pull rosters and schedules at scale |
| 🏟️ Fantasy MLB builders | Power daily-fantasy and season leagues |
| 📰 Sports journalists | Verify roster and schedule data fast |
| 🎓 Sabermetrics students | Mine official MLB datasets |
| 🤖 ML engineers | Build pitcher-batter prediction sets |
| 👩‍💻 Developers | Mirror StatsAPI into your own store |

### 📋 What the MLB StatsAPI Public Scraper does

- Calls MLB StatsAPI endpoints for teams, players, schedule, and venues.
- Flattens nested venue, league, and division blocks into top-level fields.
- Supports sportId, season, and date range filters.
- Casts numeric ids cleanly.
- Surfaces upstream errors as a clean error record.

> 💡 **Why it matters.** The MLB StatsAPI powers MLB.com itself. This Actor turns its sprawling endpoints into a flat dataset.

### 🎬 Full Demo

_🚧 Coming soon._

### ⚙️ Input

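<table>
<tr><th>Field</th><th>Type</th><th>Required</th><th>Description</th></tr>
<tr><td><code>endpoint</code></td><td>string</td><td>No</td><td>MLB StatsAPI endpoint to call: <code>teams</code>, <code>people</code>, <code>schedule</code>, <code>venues</code>, <code>leagues</code>, <code>divisions</code>, or <code>seasons</code>. Defaults to <code>teams</code>.</td></tr>
<tr><td><code>maxItems</code></td><td>integer</td><td>No</td><td>Free users 10, paid up to 1,000,000. Prefill is 10.</td></tr>
<tr><td><code>sportId</code></td><td>integer</td><td>No</td><td>MLB sport id. 1 = MLB. Defaults to 1.</td></tr>
<tr><td><code>season</code></td><td>integer</td><td>No</td><td>Optional season year.</td></tr>
</table>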

**Example 1, All MLB teams.**
```json
{ "endpoint": "teams", "sportId": 1, "maxItems": 30 }
````

**Example 2, 2024 schedule.**

```json
{ "endpoint": "schedule", "season": 2024, "maxItems": 100 }
```

> ⚠️ **Good to Know.** StatsAPI is public and unauthenticated. Heavy use can trigger rate limits.
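
The input fields can be checked locally before a run is started. The sketch below assembles an input object and rejects values outside the Actor's input schema (the endpoint list and the 1 to 1,000,000 range for `maxItems`). The helper `build_input` is hypothetical and is not part of the Actor or the Apify client.

```python
from typing import Optional

# Endpoint values listed in the Actor's input schema.
ENDPOINTS = {"teams", "people", "schedule", "venues", "leagues", "divisions", "seasons"}


def build_input(endpoint: str = "teams", max_items: int = 10,
                sport_id: int = 1, season: Optional[int] = None) -> dict:
    """Assemble an input object, rejecting values the schema would not accept."""
    if endpoint not in ENDPOINTS:
        raise ValueError(f"unknown endpoint: {endpoint!r}")
    if not 1 <= max_items <= 1_000_000:
        raise ValueError("maxItems must be between 1 and 1,000,000")
    run_input = {"endpoint": endpoint, "maxItems": max_items, "sportId": sport_id}
    if season is not None:  # season is optional, so it is only sent when set
        run_input["season"] = season
    return run_input


# Same request as Example 2, with the sportId default filled in.
print(build_input("schedule", max_items=100, season=2024))
```

Validating locally keeps a typo in `endpoint` from costing a run; the resulting dictionary can be passed as the run input to either client shown in the API section.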

### 📊 Output

| Field | Type | Description |
|---|---|---|
| 🆔 `id` | number | Primary id. |
| 🏷️ `name` | string | Name or full name. |
| 🔤 `abbreviation` | string | Short code when present. |
| 📍 `venue` | string | Venue name when present. |
| 🏟️ `league` | string | League name. |
| 🧭 `division` | string | Division name when present. |
| 🌍 `locationName` | string | Location. |
| 📅 `firstYearOfPlay` | string | First year of play when present. |
| 🔗 `link` | string | StatsAPI link. |
| 🆎 `endpoint` | string | Which endpoint was called. |
| 📅 `season` | number | Season when applicable. |
| 📦 `raw` | object | Full upstream record. |
| 🕒 `scrapedAt` | string | When fetched. |
| ❌ `error` | string | Set when upstream returned an error. |

**Sample record.**

```json
{
  "id": 0,
  "name": "",
  "abbreviation": "",
  "venue": "",
  "league": "",
  "division": "",
  "locationName": "",
  "firstYearOfPlay": "",
  "link": "",
  "endpoint": "",
  "season": 0,
  "raw": {},
  "scrapedAt": "2026-06-05T12:00:00.000Z",
  "error": null
}
```
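
To make the mapping concrete, here is a minimal sketch of how a nested upstream record could be flattened into the fields listed above. It assumes a team-like record whose `venue`, `league`, and `division` blocks are objects carrying a `name` key. The Actor's real implementation is not shown in this README, and `flatten_team` and the sample record are hypothetical.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def _name(block: Any) -> Optional[str]:
    """Return the `name` of a nested block, or None when the block is missing."""
    return block.get("name") if isinstance(block, dict) else None


def flatten_team(raw: dict, endpoint: str = "teams", season: Optional[int] = None) -> dict:
    """Map one nested upstream record onto the flat output fields."""
    return {
        "id": int(raw["id"]) if raw.get("id") is not None else None,  # cast numeric id
        "name": raw.get("name"),
        "abbreviation": raw.get("abbreviation"),
        "venue": _name(raw.get("venue")),
        "league": _name(raw.get("league")),
        "division": _name(raw.get("division")),
        "locationName": raw.get("locationName"),
        "firstYearOfPlay": raw.get("firstYearOfPlay"),
        "link": raw.get("link"),
        "endpoint": endpoint,
        "season": season,
        "raw": raw,  # full upstream record kept alongside the flat fields
        "scrapedAt": datetime.now(timezone.utc).isoformat(),
        "error": None,
    }


# Hypothetical record shaped like the assumption described above.
sample = {
    "id": "999",
    "name": "Example Club",
    "abbreviation": "EXC",
    "venue": {"id": 1, "name": "Example Park"},
    "league": {"id": 2, "name": "Example League"},
    "link": "/api/v1/teams/999",
}
row = flatten_team(sample, season=2024)
print(row["venue"], row["division"])
```

Missing blocks come back as `None` rather than raising, which matches the "when present" wording in the field table.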

### ✨ Why choose this Actor

| | |
|---|---|
| 🆓 | Public MLB StatsAPI endpoint, no scraping tricks needed. |
| 🧹 | Flattens nested upstream payloads into one row per record. |
| 🔢 | Casts numeric fields cleanly for spreadsheet imports. |
| 🛟 | Surfaces upstream errors as clean rows. |
| 🔌 | Filters exposed for the most common slicing needs. |
| 💾 | Push to dataset for spreadsheet, warehouse, or webhook export. |

### 📈 How it compares to alternatives

| Approach | Setup | Pagination | Flattening | Export formats |
|---|---|---|---|---|
| Raw `curl` | 5 min | manual | none | manual |
| DIY Python script | 30 min | yes | partial | code |
| **This Actor** | 5 seconds | yes | yes | 7 formats |

### 🚀 How to use

1. Click **Try for free**.
2. Pick your filters from the schema above.
3. Click **Start**. Your dataset is ready in seconds.

### 💼 Business use cases

**📊 Analytics.** Mirror MLB StatsAPI into a warehouse for dashboards.

**🏢 Internal tooling.** Mirror the data into private apps without writing client code.

**📰 Journalism.** Verify and bulk-fetch records for stories.

**🤖 Machine learning.** Build training sets from a known canonical source.

### 🔌 Automating MLB StatsAPI Public Scraper

- **Make / Zapier**: schedule a daily run.
- **Cron schedule**: native Apify scheduler.
- **Webhooks**: POST on completion.
- **Warehouse pipe**: native integrations move datasets straight into BigQuery, Snowflake, or Postgres.

### 🌟 Beyond business use cases

**🎓 Education.** Teach API integration with a clean dataset.

**🧪 Personal research.** Track the data you care about.

**🤝 Non-profit and open data.** Power public dashboards.

**🧰 Tinkering and prototyping.** Spin up a feed for side projects in seconds.

### 🤖 Ask an AI assistant about this scraper

Drop this README into ChatGPT, Claude, or any AI assistant and ask it to design a pipeline. The input fields, schema, and examples above contain everything an LLM needs.

### ❓ Frequently Asked Questions

**❓ Do I need an API key?** No. The endpoint is public.

**❓ Pagination?** Yes, handled automatically where the upstream supports it.

**❓ Rate limits?** The upstream sets the rate limit. The Actor surfaces upstream errors cleanly.

**❓ Schema stability?** Core fields are stable. Optional fields are passed through when present.

**❓ Real-time?** Yes, every run hits the live endpoint.

**❓ Spreadsheet export?** Yes, via the Apify dataset UI.

**❓ Scheduling?** Yes, via the Apify scheduler.

**❓ Public data only?** Yes.

**❓ Free trial?** Yes, $5 free credit on signup covers many runs.

**❓ Webhook integration?** Yes, native Apify webhooks fire on run completion.
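
Because upstream failures arrive as rows with `error` set rather than as exceptions, downstream code may want to separate them before loading data. The sketch below shows one way to do that, assuming `error` is null or absent on good rows, as in the sample record. The helper `split_rows` and the example rows are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def split_rows(items: Iterable[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Separate usable rows from rows whose `error` field is set."""
    good: List[Dict] = []
    failed: List[Dict] = []
    for item in items:
        # A truthy `error` marks a row that reports an upstream failure.
        (failed if item.get("error") else good).append(item)
    return good, failed


# Hypothetical rows standing in for dataset items.
items = [
    {"id": 1, "name": "Example Club", "error": None},
    {"id": None, "name": None, "error": "upstream returned an error"},
]
good, failed = split_rows(items)
print(len(good), len(failed))
```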

### 🔌 Integrate with any app

Apify ships native integrations with Make, Zapier, Slack, Discord, Google Drive, Google Sheets, Gmail, Airbyte, Keboola, Telegram, GitHub, and any REST API or webhook endpoint. Trigger runs from a calendar event, a form submission, a cron job, or pipe results straight into BigQuery, Snowflake, or a Postgres warehouse.

### 🔗 Recommended Actors

| Actor | What it does |
|---|---|
| [ParseForge Alpha Vantage Public Scraper](https://apify.com/parseforge/alpha-vantage-public-scraper) | Public stock, FX, and crypto market data. |
| [ParseForge OurAirports Scraper](https://apify.com/parseforge/ourairports-scraper) | Global airport database. |
| [ParseForge Civitai Models Scraper](https://apify.com/parseforge/civitai-models-scraper) | Public Civitai model catalogue. |
| [ParseForge Hugging Face Spaces Scraper](https://apify.com/parseforge/huggingface-spaces-scraper) | Public Hugging Face Spaces metadata. |

> 💡 **Pro Tip.** Browse the complete [ParseForge collection](https://apify.com/parseforge) for 900+ production-grade scrapers across business intelligence, real estate, e-commerce, sports, finance, and public records.

***

**Disclaimer.** This Actor scrapes only publicly available data. ParseForge is not affiliated with, endorsed by, or sponsored by any of the third-party services referenced. Users are responsible for complying with the target site's terms of service and applicable law. [Create a free account with $5 credit](https://console.apify.com/sign-up?fpr=vmoqkp).

# Actor input Schema

## `endpoint` (type: `string`):

MLB StatsAPI endpoint to call.

## `maxItems` (type: `integer`):

Free users are limited to 10 items (preview). Paid users can collect up to 1,000,000 items.

## `sportId` (type: `integer`):

MLB sport id. 1 = MLB.

## `season` (type: `integer`):

Optional season year.

## Actor input object example

```json
{
  "endpoint": "teams",
  "maxItems": 10,
  "sportId": 1
}
```

# Actor output Schema

## `results` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "maxItems": 10
};

// Run the Actor and wait for it to finish
const run = await client.actor("parseforge/mlb-statsapi-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "maxItems": 10 }

# Run the Actor and wait for it to finish
run = client.actor("parseforge/mlb-statsapi-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "maxItems": 10
}' |
apify call parseforge/mlb-statsapi-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=parseforge/mlb-statsapi-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "MLB StatsAPI Baseball Data Scraper",
        "description": "Tap the official MLB StatsAPI for teams, players, schedules, venues, leagues, divisions, and seasons. Filter by sport id and season year to build rosters, standings, or game calendars. Useful for fantasy tools, baseball analytics, and historical season research.",
        "version": "0.1",
        "x-build-id": "TCADJEsGjbekcm20C"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/parseforge~mlb-statsapi-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-parseforge-mlb-statsapi-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/parseforge~mlb-statsapi-scraper/runs": {
            "post": {
                "operationId": "runs-sync-parseforge-mlb-statsapi-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/parseforge~mlb-statsapi-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-parseforge-mlb-statsapi-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "endpoint": {
                        "title": "Endpoint",
                        "enum": [
                            "teams",
                            "people",
                            "schedule",
                            "venues",
                            "leagues",
                            "divisions",
                            "seasons"
                        ],
                        "type": "string",
                        "description": "MLB StatsAPI endpoint to call.",
                        "default": "teams"
                    },
                    "maxItems": {
                        "title": "Max Items",
                        "minimum": 1,
                        "maximum": 1000000,
                        "type": "integer",
                        "description": "Free users are limited to 10 items (preview). Paid users can collect up to 1,000,000 items."
                    },
                    "sportId": {
                        "title": "Sport id",
                        "type": "integer",
                        "description": "MLB sport id. 1 = MLB.",
                        "default": 1
                    },
                    "season": {
                        "title": "Season",
                        "type": "integer",
                        "description": "Optional season year."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
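
For projects without an Apify client, the `run-sync-get-dataset-items` path in the specification above can be called with any HTTP client. The sketch below only builds the POST request with Python's standard library and does not send it. The helper `build_request` is hypothetical; the URL, the `token` query parameter, and the JSON body follow the specification.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/parseforge~mlb-statsapi-scraper/run-sync-get-dataset-items"


def build_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build the POST request from the spec: token in the query, input as JSON body."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{BASE_URL}{ACTOR_PATH}?{query}",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_request("<YOUR_API_TOKEN>", {"endpoint": "teams", "sportId": 1, "maxItems": 10})
print(req.get_method(), req.full_url)
# To send it, pass `req` to urllib.request.urlopen and parse the response body as JSON.
```

Keeping request construction separate from sending makes the call easy to inspect or log before any paid run is triggered.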
