# MLB Top 100 Prospects Scraper (`parseforge/mlb-prospects-scraper`) Actor

Pull records from multiple MLB sources in a single run and get a unified, normalized result set. Pull names, identifiers, dates, descriptions, status flags and source links per record. Built for research, lead generation and intelligence pipelines. Run on demand or on a recurring schedule.

- **URL**: https://apify.com/parseforge/mlb-prospects-scraper.md
- **Developed by:** [ParseForge](https://apify.com/parseforge) (community)
- **Categories:** Other, News
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor itself is free to use; you pay only for Apify platform usage, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

![ParseForge Banner](https://github.com/ParseForge/apify-assets/blob/ad35ccc13ddd068b9d6cba33f323962e39aed5b2/banner.jpg?raw=true)

## ⚾ MLB Prospects Scraper

> 🚀 **Get MLB prospects data in seconds.** Get MLB.com top prospect rankings: player name, rank, team, position, height, weight, bats, throws and prospect grade by year. Structured data for baseball scouting and fantasy analytics.

> 🕒 **Last updated** 2026-05-27 · **📊 17 fields** per record · **Top 100 + team-by-team prospect lists** · **MLB (all 30 teams)**

The MLB Prospects Scraper extracts structured records from MLB.com Prospects. Every record captures the canonical fields you would expect from the upstream source - ready for analytics, dashboards, BI tooling, or further enrichment.

The dataset spans MLB (all 30 teams) and exposes the same data the official source publishes - normalised, paginated, and exportable.

**Who uses this data?**

| Audience | Use Case |
|---|---|
| Researchers | Build longitudinal datasets for analysis |
| Compliance teams | Monitor regulated entities and filings |
| Journalists | Investigate public records at scale |
| Data analysts | Power BI dashboards and reports |
| Academic researchers | Run quantitative studies on public data |
| Product teams | Embed live MLB prospects records into apps |

### 📋 What the MLB Prospects Scraper does

- Queries MLB.com Prospects for the latest records
- Supports targeted filtering via the input parameters (see below)
- Returns structured records with 17 fields, no scraping artifacts
- Handles pagination automatically up to your `maxItems` limit
- Cleans HTML entities and normalises dates and identifiers
- Delivers results via Apify dataset export, in formats such as JSON or CSV

> 💡 **Why it matters:** Public-data sources rarely offer bulk export. This Actor turns the live source into a queryable dataset.

### 🎬 Full Demo

_🚧 Coming soon_

### ⚙️ Input

<table>
<thead>
<tr><th>Field</th><th>Type</th><th>Required</th><th>Description</th></tr>
</thead>
<tbody>
<tr><td><b>maxItems</b></td><td>integer</td><td>No</td><td>Free users: 10. Paid users: optional, max 1,000,000.</td></tr>
<tr><td><b>year</b></td><td>string</td><td>No</td><td>Four-digit year (e.g. 2024).</td></tr>
</tbody>
</table>

**Example 1: Default run (preview)**

```json
{
  "maxItems": 10
}
```

**Example 2: Targeted query**

```json
{
  "maxItems": 5,
  "year": "2024"
}
```

> ⚠️ **Good to Know:** Free users are limited to 10 items per run. Upgrade to a paid plan to unlock up to 1,000,000 items.
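
The limits above can be checked on the client before a run is started. The following is a minimal sketch, not part of the Actor: `build_input` and its `paid_plan` flag are hypothetical names, the lower bound of 1 comes from the OpenAPI schema further down, and how the Actor itself handles out-of-range values is not documented here.

```python
import re

FREE_PLAN_MAX_ITEMS = 10          # free-plan cap stated in this README
PAID_PLAN_MAX_ITEMS = 1_000_000   # paid-plan cap stated in this README


def build_input(max_items=10, year=None, paid_plan=False):
    """Return an input dict for the Actor, raising ValueError on bad values.

    Hypothetical helper: only the field names `maxItems` and `year`
    come from the Actor's input schema.
    """
    limit = PAID_PLAN_MAX_ITEMS if paid_plan else FREE_PLAN_MAX_ITEMS
    is_int = isinstance(max_items, int) and not isinstance(max_items, bool)
    if not is_int or not 1 <= max_items <= limit:
        raise ValueError(f"maxItems must be an integer between 1 and {limit}")

    actor_input = {"maxItems": max_items}
    if year is not None:
        year = str(year)  # the schema types `year` as a string
        if not re.fullmatch(r"[0-9]{4}", year):
            raise ValueError("year must be a four-digit string such as '2024'")
        actor_input["year"] = year
    return actor_input
```

Calling `build_input(5, 2024)` produces the same object as Example 2, with the year converted to a string.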

### 📊 Output

| Field | Type | Description |
|---|---|---|
| 🖼 Photo `imageUrl` | string | - |
| ⚾ Name `title` | string | - |
| 🔗 URL `url` | string | - |
| 🔑 ID `id` | string | - |
| 🏆 Rank `rank` | string | - |
| 📅 Year `year` | string | - |
| ⚾ Position `position` | string | - |
| 🏟️ Team `team` | string | - |
| 👤 First `firstName` | string | - |
| 👤 Last `lastName` | string | - |
| 🎂 DOB `birthDate` | string | - |
| 📏 Height `height` | string | - |
| ⚖️ Weight `weight` | string | - |
| 🏏 Bats `batSide` | string | - |
| 🤾 Throws `pitchHand` | string | - |
| 🕒 Collected `scrapedAt` | string | - |
| ❌ Error `error` | string | Error message if record could not be retrieved |
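
Downstream code usually wants to separate usable records from ones that carry an `error` message, and to order prospects by rank. The sketch below uses only the field names from the table above; the helper names are hypothetical, and it assumes `rank` holds a numeric string, which this README does not confirm.

```python
# The 17 documented output fields, in table order.
FIELDS = [
    "imageUrl", "title", "url", "id", "rank", "year", "position", "team",
    "firstName", "lastName", "birthDate", "height", "weight",
    "batSide", "pitchHand", "scrapedAt", "error",
]


def split_records(items):
    """Separate usable records from ones that carry an error message."""
    ok, failed = [], []
    for item in items:
        # Keep only documented fields; anything missing becomes None.
        record = {name: item.get(name) for name in FIELDS}
        (failed if record["error"] else ok).append(record)
    return ok, failed


def sort_by_rank(records):
    """Sort by numeric rank; records whose rank cannot be parsed go last."""
    def key(record):
        try:
            return (0, int(record["rank"]))  # assumption: rank is like "12"
        except (TypeError, ValueError):
            return (1, 0)
    return sorted(records, key=key)
```

Converting `rank` before sorting matters because the field is typed as a string, so a plain sort would place "10" before "2".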

### ✨ Why choose this Actor

| Feature | Detail |
|---|---|
| 🌐 No login required | Public data only - no credentials needed |
| 🔍 Targeted filtering | Search and filter via input parameters |
| 📊 Top 100 + team-by-team prospect lists | Comprehensive coverage of MLB (all 30 teams) |
| 🧹 Clean output | Dates normalised, HTML stripped, arrays flattened |
| ⚡ Fast | Direct source access without browser overhead |
| 🔄 Auto-pagination | Retrieves results up to your maxItems |
| 💾 4 export formats | All available from the Apify dataset |
| 🛡️ Retry logic | Multi-attempt retry with backoff for reliability |

### 📈 How it compares to alternatives

| Method | Speed | Scale | Structured Output | Free |
|---|---|---|---|---|
| **This Actor** | Fast | 1,000,000 records | Yes (17 fields) | 10 free / unlimited paid |
| Manual site search | Slow | Limited per session | No | Yes |
| Bulk data access | Slow setup | Variable | Partial | Variable |
| Custom API script | Variable | Unlimited | Requires dev work | Dev cost |

### 🚀 How to use

1. **[Create a free Apify account](https://console.apify.com/sign-up?fpr=vmoqkp)** - includes $5 free credit
2. Open the **MLB Prospects Scraper** Actor page
3. Configure your input - set filters or leave defaults
4. Set `maxItems` (10 for a quick preview, higher for bulk extraction)
5. Click **Run** and wait for the dataset to populate
6. Access your results in the run's dataset and export them in a format such as **JSON or CSV**

### 💼 Business use cases

#### Compliance and Due Diligence
Teams use MLB prospects data to verify entities, monitor regulatory status, and feed downstream pipelines.

#### Market Research
Analysts map MLB (all 30 teams) to understand market structure, competitive activity, or regulatory trends.

#### Lead Generation
Sales teams enrich CRM records with MLB prospects data to identify prospects and qualify leads.

#### Investigative Journalism
Reporters use bulk extracts to find patterns, anomalies, and stories hidden in public records.

### 🔌 Automating MLB Prospects Scraper

Connect this Actor to your existing workflows using Apify integrations:

- **Make (Integromat)** - trigger a run on a schedule and push results to Google Sheets or a database
- **Zapier** - automatically export new records to Airtable, Notion, or your CRM
- **Slack** - get notified when a monitored entity has new activity
- **Webhooks** - receive real-time notifications when a run completes
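
For the webhook route, the receiving service typically needs the dataset ID of the finished run. The sketch below assumes the webhook uses Apify's default payload layout, with an `eventType` field and the run object under `resource`; a custom payload template would change this. `dataset_id_from_webhook` is a hypothetical helper name.

```python
import json


def dataset_id_from_webhook(body):
    """Return the run's dataset ID from a webhook body, or None.

    Assumes the default payload shape: {"eventType": ..., "resource": {...run...}}.
    Only successful runs are accepted; every other event yields None.
    """
    payload = json.loads(body)
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")
```

The returned ID can then be passed to the dataset calls shown in the API section.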

### 🌟 Beyond business use cases

#### Academic Research
Study trends, behaviours, and patterns in MLB (all 30 teams) records across time.

#### Civic Tech and Transparency
Civic-tech projects build dashboards on top of bulk extracts to surface public-interest insights.

#### Education
Educators use real MLB prospects data to teach data analysis and domain knowledge.

#### Personal Research
Individuals - researchers, hobbyists, family historians - use bulk exports to answer questions that no single web search can.

### 🤖 Ask an AI assistant about this scraper

Paste a few sample records into ChatGPT, Claude, or another AI assistant and ask it to summarise the dataset, explain fields, identify patterns, or suggest filter combinations.

### ❓ Frequently Asked Questions

**🔍 What does this Actor do?**
It extracts MLB prospects records from MLB.com Prospects into a clean, queryable dataset.

**📊 How many records are available?**
Top 100 + team-by-team prospect lists.

**🔑 Do I need an account or API key?**
No. This Actor uses public sources that require no authentication.

**📅 How up-to-date is the data?**
The Actor queries the source live on every run.

**🔍 Can I filter by specific fields?**
Yes - see the Input section above for all supported filters.

**⚡ How fast is the scraper?**
A typical 10-item preview completes in under 30 seconds.

**📄 What format is the output?**
Records are stored in Apify's dataset storage and can be exported in formats such as JSON or CSV.

**🏆 Why are some fields null for certain records?**
Some fields are optional at the source. The Actor returns null rather than fabricating values.

**📋 Can I run this on a schedule?**
Yes - use Apify's built-in scheduler or trigger via Make / Zapier / Webhooks.

**💰 Is there a cost to use this Actor?**
Free users receive 10 records per run. [Create a paid account](https://console.apify.com/sign-up?fpr=vmoqkp) to unlock up to 1,000,000 records per run.

**🌍 Does it cover MLB (all 30 teams)?**
Yes - see the coverage line above.

**🛡️ Is this Actor compliant?**
The Actor accesses only publicly available data, in line with the source's published terms of service.

### 🔌 Integrate with any app

Export data directly from the Apify platform to:

**Spreadsheets & Databases**
Google Sheets - Microsoft Excel - Airtable - Notion - PostgreSQL - MySQL - MongoDB

**Automation & Workflows**
Make (Integromat) - Zapier - n8n - Pipedream - Activepieces

**Cloud Storage**
AWS S3 - Google Cloud Storage - Azure Blob Storage - Dropbox

**APIs & Webhooks**
REST API - Webhooks - Apify API

### 🔗 Recommended Actors

| Actor | Description |
|---|---|
| [OurAirports Global Airport Database Scraper](https://apify.com/parseforge/ourairports-scraper) | Get worldwide airport data including ICAO/IATA codes |
| [FINRA BrokerCheck Scraper](https://apify.com/parseforge/finra-brokercheck-scraper) | Extract broker and firm registration data from FINRA |
| [Hacker News Stories Scraper](https://apify.com/parseforge/hackernews-stories-scraper) | Pull live Hacker News top stories with score and comments |

> 💡 **Pro Tip:** Browse the complete [ParseForge collection](https://apify.com/parseforge) for more public-data scrapers.

---

Need help? Visit the [Apify Discord community](https://discord.gg/jyEM2PRvMU) or open a support ticket.

**Disclaimer:** This Actor accesses publicly available data from MLB.com Prospects in compliance with the source's terms of service. Data is provided as-is for informational purposes. Verify all records against the official source before relying on them for legal or business decisions.

# Actor input Schema

## `maxItems` (type: `integer`):

Free users: 10. Paid users: optional, max 1,000,000.

## `year` (type: `string`):

Four-digit year (e.g. 2024).

## Actor input object example

```json
{
  "maxItems": 10,
  "year": "2024"
}
````

# Actor output Schema

## `results` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "maxItems": 10,
    "year": "2024"
};

// Run the Actor and wait for it to finish
const run = await client.actor("parseforge/mlb-prospects-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "maxItems": 10,
    "year": "2024",
}

# Run the Actor and wait for it to finish
run = client.actor("parseforge/mlb-prospects-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
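
Items fetched as in the Python example above can also be written to CSV locally with the standard library. This is a minimal sketch: `items_to_csv` is a hypothetical helper, and the column list is whatever subset of the documented output fields the caller chooses.

```python
import csv
import io


def items_to_csv(items, fieldnames):
    """Serialize dataset items to CSV text with the given columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        extrasaction="ignore",  # drop keys that are not in the column list
        restval="",             # leave missing fields empty
    )
    writer.writeheader()
    for item in items:
        # null values in the dataset become empty cells
        writer.writerow({k: ("" if v is None else v) for k, v in item.items()})
    return buffer.getvalue()
```

For example, `items_to_csv(items, ["rank", "title", "team", "position"])` yields text that can be written to a file opened with `newline=""`.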

## CLI example

```bash
echo '{
  "maxItems": 10,
  "year": "2024"
}' |
apify call parseforge/mlb-prospects-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=parseforge/mlb-prospects-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "MLB Top 100 Prospects Scraper",
        "description": "Pull records from multiple Mlb sources in a single run and get a unified, normalized result set. Pull names, identifiers, dates, descriptions, status flags and source links per record. Built for research, lead generation and intelligence pipelines. Run on demand or on a recurring schedule and fee.",
        "version": "0.1",
        "x-build-id": "GZH4ivwcoxzFapDa8"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/parseforge~mlb-prospects-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-parseforge-mlb-prospects-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/parseforge~mlb-prospects-scraper/runs": {
            "post": {
                "operationId": "runs-sync-parseforge-mlb-prospects-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/parseforge~mlb-prospects-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-parseforge-mlb-prospects-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "maxItems": {
                        "title": "Max Items",
                        "minimum": 1,
                        "maximum": 1000000,
                        "type": "integer",
                        "description": "Free users: 10. Paid users: optional, max 1,000,000."
                    },
                    "year": {
                        "title": "Prospect Year",
                        "type": "string",
                        "description": "Four-digit year (e.g. 2024)."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
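
For projects that cannot use a client library, the `run-sync-get-dataset-items` path from the specification above can be called directly. The sketch below uses only the Python standard library; the helper names are hypothetical, the token is passed as a query parameter as the specification declares, and the response is assumed to be a JSON body, which the specification itself does not describe.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/parseforge~mlb-prospects-scraper/run-sync-get-dataset-items"


def build_request(token, actor_input):
    """Build the POST request that starts a run and returns its dataset items."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_and_fetch(token, actor_input, timeout=300):
    """Send the request and decode the body (assumed to be JSON)."""
    request = build_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping `build_request` separate from the network call makes the URL and body easy to inspect before anything is sent.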
