Mistral AI Models Scraper

Scrape all Mistral AI models — API identifiers, context window, capabilities, categories, and deprecation info from docs.mistral.ai.

Pricing: Pay per event
Rating: 0.0 (0)
Developer: Stas Persiianenko (Maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 4 days ago

Extract a complete, structured list of all Mistral AI models — including API identifiers, context window sizes, capabilities, open-weight status, deprecation info, and model categories — straight from docs.mistral.ai.

🤖 What does it do?

This actor scrapes the official Mistral AI documentation and returns a structured dataset of every Mistral model — from flagship frontier models like Mistral Large 3 and Magistral to specialized tools like Codestral, Voxtral, and legacy models. For each model you get:

  • 🆔 API identifier — the exact string to use in your API calls (e.g. mistral-small-2603)
  • 📅 Version — release date code (e.g. 26.03)
  • 🪟 Context window — how many tokens the model can process
  • 🏷️ Latest alias — the mistral-large-latest style pointer alias
  • 📂 Category — Featured / Generalist / Specialist / Other / Legacy
  • 🔓 Open-weight status — whether model weights are publicly available
  • ⚠️ Deprecation info — deprecation date, retirement date, and recommended replacement

The actor combines two scraping strategies: it parses the React Server Component (RSC) payload embedded in the overview page for comprehensive legacy model data, and fetches individual model card pages for active model details. No Playwright, no browser — pure HTTP.

👥 Who is it for?

AI developers and engineers comparing Mistral models for API integration who need to know which model ID to call and what the context limits are.

LLM cost analysts tracking Mistral's model lineup to calculate token costs and choose the right tier for their workloads.

AI researchers monitoring new releases, deprecations, and open-weight model availability from Mistral AI.

DevOps and MLOps teams maintaining API integrations who need programmatic access to the current model list to keep configurations up to date.

AI comparison tools that aggregate model specs across providers (Groq, DeepInfra, Fireworks, Together AI, etc.) to give users a unified view.

💡 Why use this scraper?

Mistral doesn't provide a public unauthenticated REST API to list all models. Their /v1/models endpoint requires an API key. This actor fetches the same data that's publicly visible on the documentation website — no API key needed, no rate limits to worry about.

You get all 59+ models in one clean dataset: current models, deprecated models (with their replacement recommendations), retired models, and everything in between. Scheduling the actor daily keeps your tooling automatically in sync when Mistral releases a new model or retires an old one.

📊 Data you will extract

| Field | Description | Example |
| --- | --- | --- |
| `modelId` | Primary API identifier | mistral-small-2603 |
| `modelName` | Human-readable name | Mistral Small 4 |
| `description` | Short model description | Our powerful hybrid model... |
| `version` | Release version code | 26.03 |
| `apiIdentifiers` | All API name aliases (comma-separated) | mistral-small-2603, mistral-small-latest |
| `latestAlias` | The -latest pointer alias | mistral-small-latest |
| `category` | Model category | Generalist |
| `section` | Section on the docs page | Frontier Models |
| `isOpenWeight` | Whether model weights are public | true |
| `contextLength` | Context window size | 256k |
| `inputCapabilities` | Supported input types | text, image |
| `outputCapabilities` | Supported output types | text |
| `features` | Supported API features | function-calling, structured-outputs |
| `status` | Active / Deprecated / Retired | Active |
| `deprecationDate` | When deprecation starts | March 31, 2026 |
| `retirementDate` | When model is retired | April 30, 2026 |
| `replacementModel` | Recommended replacement | Mistral Nemo 12B |
| `modelUrl` | Link to model card | https://docs.mistral.ai/models/... |
| `scrapedAt` | ISO timestamp of scrape | 2026-04-26T09:00:00.000Z |
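For example, since `apiIdentifiers` is comma-separated, a consumer of the dataset can split it back into individually callable model IDs (the item below is a trimmed sample shaped like the fields above):

```python
# A trimmed dataset item with the comma-separated apiIdentifiers field
item = {
    "modelId": "mistral-small-2603",
    "apiIdentifiers": "mistral-small-2603, mistral-small-latest",
    "latestAlias": "mistral-small-latest",
}

# Split the aliases into a list of model IDs usable in API calls
aliases = [a.strip() for a in item["apiIdentifiers"].split(",")]
print(aliases)  # ['mistral-small-2603', 'mistral-small-latest']
```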

💰 How much does it cost to scrape Mistral AI models?

This is a very lightweight actor. It makes approximately 60 HTTP requests (one overview page + one model card per model). No proxies needed. No browser rendering.

| Tier | Active models only (~23) | All models (~59) |
| --- | --- | --- |
| Free | ~$0.012 | ~$0.023 |
| Bronze | ~$0.011 | ~$0.020 |
| Diamond | ~$0.007 | ~$0.009 |

The $0.005 start fee covers the overview page fetch. Each model extracted costs a fraction of a cent. A daily scheduled run costs under $1/month.

ℹ️ You can run this actor on Apify's Free plan — the default input will complete well within the free compute limits. Start by clicking Try for free on the actor's Store page.
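As a rough illustration of that arithmetic — the per-model event fee below is an assumed placeholder inferred from the tier table, not an official rate; check the actor's pricing tab for exact numbers:

```python
START_FEE = 0.005       # per-run start fee, as described above
PER_MODEL_FEE = 0.0003  # assumed per-model event fee, inferred from the tier table

def estimated_run_cost(model_count: int) -> float:
    """Rough cost of one run that extracts `model_count` models."""
    return START_FEE + model_count * PER_MODEL_FEE

# Daily runs over a 30-day month, scraping all ~59 models
monthly = 30 * estimated_run_cost(59)
print(f"~${monthly:.2f}/month")  # ~$0.68/month
```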

🚀 How to use this actor

Step 1 — Open the actor

Go to Mistral AI Models Scraper on Apify Store.

Step 2 — Configure input

The actor works with zero configuration. Click Start to run with defaults.

To exclude deprecated/retired legacy models, uncheck Include deprecated/legacy models.

Step 3 — Run and download

Click Start; a typical run completes in under 60 seconds. Download your data as JSON, CSV, or Excel from the Dataset tab.

Step 4 — Schedule for freshness

Use Apify's scheduling to run daily or weekly, and your downstream tooling always has the latest Mistral model list.

⚙️ Input parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `includeDeprecated` | Boolean | `true` | Include legacy, deprecated, and retired models in output |
| `maxConcurrency` | Integer | `5` | Parallel model card page fetches (1–20) |
| `maxRequestRetries` | Integer | `3` | Retry attempts for failed HTTP requests |

📤 Output example

```json
{
  "modelId": "mistral-small-2603",
  "modelName": "Mistral Small 4",
  "description": "Our powerful hybrid model unifying instruct, reasoning, and coding capabilities in a single model. 119B parameters with 6.5B active.",
  "version": "26.03",
  "apiIdentifiers": "mistral-small-2603, mistral-small-latest",
  "latestAlias": "mistral-small-latest",
  "category": "Generalist",
  "section": "Frontier Models",
  "isOpenWeight": true,
  "contextLength": "256k",
  "inputCapabilities": null,
  "outputCapabilities": null,
  "features": null,
  "status": "Active",
  "deprecationDate": null,
  "retirementDate": null,
  "replacementModel": null,
  "modelUrl": "https://docs.mistral.ai/models/model-cards/mistral-small-4-0-26-03",
  "scrapedAt": "2026-04-26T09:05:22.373Z"
}
```

🧠 Tips and tricks

  • Filter to active models only — set includeDeprecated: false to get just the 23 currently active models. This is faster and cheaper.
  • Finding the right model ID — the apiIdentifiers field contains all valid API names. Use the versioned ID (e.g. mistral-small-2603) for stable integrations; use the latestAlias (e.g. mistral-small-latest) if you always want the newest version.
  • Checking for deprecations — sort or filter by status to find models entering deprecation. The replacementModel field tells you where to migrate.
  • Context window comparisons — filter for models where contextLength equals 256k to find all long-context options.
  • Scheduling for monitoring — schedule daily runs and use the scrapedAt timestamp to compare consecutive runs for changes.
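A few of these tips combined in Python, using inline sample rows in place of real dataset items (the `replacementModel` value mirrors the example used elsewhere in this README):

```python
# Sample rows standing in for real dataset items
models = [
    {"modelName": "Mistral Small 4", "status": "Active",
     "contextLength": "256k", "replacementModel": None},
    {"modelName": "Mistral 7B", "status": "Deprecated",
     "contextLength": "32k", "replacementModel": "Mistral Nemo 12B"},
]

# Long-context comparison: all active 256k models
long_context = [m for m in models
                if m["status"] == "Active" and m["contextLength"] == "256k"]

# Deprecation check: map each deprecated model to its replacement
migrations = {m["modelName"]: m["replacementModel"]
              for m in models if m["status"] == "Deprecated"}

print([m["modelName"] for m in long_context])  # ['Mistral Small 4']
print(migrations)  # {'Mistral 7B': 'Mistral Nemo 12B'}
```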

🔗 Integrations

🤖 Build a model comparison tool

Run this actor alongside Groq Models Scraper and DeepInfra Models Scraper. Push all datasets into a single database and build an always-fresh cross-provider model catalog that your team can query by context window, feature support, or cost.

📊 Feed into a Google Sheet

Use Apify's Google Sheets integration to automatically push updated model data to a spreadsheet. Share it with your team so everyone knows which Mistral model IDs are active.

🔔 Alert on model deprecations

Use Apify's scheduling + webhooks to run this actor daily. Compare the latest output against the previous run (use the Apify Dataset API to fetch the last N runs). Fire a Slack or email notification whenever a model's status changes from Active to Deprecated.
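The run-to-run comparison reduces to a small diff over two snapshots. Here, inline sample lists stand in for the two most recent dataset downloads:

```python
def newly_deprecated(previous: list[dict], latest: list[dict]) -> list[str]:
    """Model IDs whose status flipped from Active to Deprecated between runs."""
    prev_status = {m["modelId"]: m["status"] for m in previous}
    return [m["modelId"] for m in latest
            if m["status"] == "Deprecated"
            and prev_status.get(m["modelId"]) == "Active"]

# Sample snapshots standing in for two consecutive runs' datasets
prev = [{"modelId": "open-mistral-7b", "status": "Active"}]
curr = [{"modelId": "open-mistral-7b", "status": "Deprecated"}]
print(newly_deprecated(prev, curr))  # ['open-mistral-7b']
```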

🛠️ Keep API configs in sync

Integrate this actor into your CI/CD pipeline. Before deploying, fetch the current model list and validate that your configured model IDs still exist and are not deprecated. Fail the build if a model is found to be retiring within 30 days.
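One possible shape for that CI check — `CONFIGURED_IDS` is an illustrative assumption, and the date parsing follows the "April 30, 2026" format shown in the field table:

```python
from datetime import datetime, timedelta

# Model IDs your services call — example values
CONFIGURED_IDS = {"mistral-small-2603"}

def validate(models: list[dict], today: datetime) -> list[str]:
    """Return failure messages for missing, inactive, or soon-retiring models."""
    by_id = {m["modelId"]: m for m in models}
    errors = []
    for model_id in sorted(CONFIGURED_IDS):
        m = by_id.get(model_id)
        if m is None:
            errors.append(f"{model_id}: not in Mistral's catalog")
        elif m["status"] != "Active":
            errors.append(f"{model_id}: status is {m['status']}")
        elif m.get("retirementDate"):
            # Dates appear in the dataset as e.g. "April 30, 2026"
            retires = datetime.strptime(m["retirementDate"], "%B %d, %Y")
            if retires - today < timedelta(days=30):
                errors.append(f"{model_id}: retires on {m['retirementDate']}")
    return errors

catalog = [{"modelId": "mistral-small-2603", "status": "Active",
            "retirementDate": None}]
print(validate(catalog, datetime(2026, 4, 26)))  # []
```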

🔌 API usage

Node.js

```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('automation-lab/mistral-models-scraper').call({
    includeDeprecated: true,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(`Scraped ${items.length} Mistral AI models`);

items.filter(m => m.status === 'Active').forEach(m => {
    console.log(`${m.modelName}: ${m.apiIdentifiers} (${m.contextLength})`);
});
```

Python

```python
from apify_client import ApifyClient

client = ApifyClient(token="YOUR_API_TOKEN")

run = client.actor("automation-lab/mistral-models-scraper").call(run_input={
    "includeDeprecated": True
})

items = client.dataset(run["defaultDatasetId"]).list_items().items
active = [m for m in items if m["status"] == "Active"]
print(f"Found {len(active)} active Mistral models")
for model in active:
    print(f"{model['modelName']}: {model['apiIdentifiers']}")
```

cURL

```bash
# Start the actor
curl -X POST "https://api.apify.com/v2/acts/automation-lab~mistral-models-scraper/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"includeDeprecated": true}'

# Fetch results (replace RUN_ID with the run ID from the response above)
curl "https://api.apify.com/v2/actor-runs/RUN_ID/dataset/items?token=YOUR_API_TOKEN"
```

🤖 MCP (Model Context Protocol) integration

Use this actor directly inside Claude, Cursor, VS Code, or any MCP-compatible AI assistant to query Mistral model data in natural language.

Claude Code / CLI setup

```bash
claude mcp add --transport http apify "https://mcp.apify.com?tools=automation-lab/mistral-models-scraper"
```

Claude Desktop / Cursor / VS Code (JSON config)

```json
{
  "mcpServers": {
    "apify": {
      "type": "http",
      "url": "https://mcp.apify.com?tools=automation-lab/mistral-models-scraper",
      "headers": {
        "Authorization": "Bearer YOUR_APIFY_TOKEN"
      }
    }
  }
}
```

Example prompts to try

  • "List all active Mistral models with their API identifiers and context windows"
  • "Which Mistral models are being deprecated in 2026 and what should I migrate to?"
  • "Find all open-weight Mistral models I can run locally"
  • "Compare Mistral Small 4 and Mistral Large 3 context window sizes"
  • "Which Mistral models support function calling?"

⚖️ Legality

This actor scrapes publicly available information from docs.mistral.ai — the official Mistral AI documentation website. The data is the same as what you'd see visiting the page in a browser. No authentication is required or bypassed. The actor respects the site's server by using reasonable concurrency limits.

Always review the Mistral AI Terms of Service and Privacy Policy before using scraped data in commercial products.

❓ FAQ

What models does this actor scrape?

All models listed on the Mistral AI models overview page, including active frontier models, specialist models, other models, and the full legacy/deprecated history. As of April 2026, that's 59+ models.

Does this include API pricing data?

Pricing data for legacy models is partially available in the underlying RSC data, but is not currently included in the output schema. The output focuses on model identification, capabilities, and lifecycle data. For pricing, check docs.mistral.ai directly.

Why do some models have null for contextLength?

Some specialist models (audio transcription, OCR, TTS) don't have a traditional token context window and don't display one on their model cards. For those, contextLength will be null.

The actor returned fewer than 59 models — what happened?

Mistral regularly adds new models to their catalog. If a new model card page returns an error on first fetch, the actor will retry up to maxRequestRetries times. If the page structure changes significantly, the actor may skip some models. Check the actor logs for warnings about failed fetches.

I need the context window in tokens, not "128k"

The actor returns the context length as displayed on the Mistral docs page (e.g. 128k, 256k, 32k). To convert: 128k = 128,000 tokens. No rounding or conversion is applied.