Pricing

Pay per event

Artificial Analysis AI Model Benchmark Scraper

Scrapes LLM benchmark scores, pricing, and performance data from Artificial Analysis, an independent evaluator of AI models.

Developer

BowTiedRaccoon

Last modified

13 days ago

What this actor does

Extracts structured data for ~370 AI language models from Artificial Analysis, including:

Benchmark scores: Quality index, MMLU-Pro, GPQA Diamond, HumanEval, LiveCodeBench, MATH-500, MMMU-Pro, and more
Pricing: Input, output, and blended cost per million tokens
Performance: Median throughput (tokens/sec) and time-to-first-token latency
Provider info: All hosting providers, cheapest provider by blended price
Model metadata: Creator/lab, release date, parameter count, context window, license, open-weight status

All data is extracted in a single request to the /models page, which serves the full model dataset inline as a React Server Component payload. No per-model crawling needed.
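As a rough illustration of the single-request approach, the sketch below pulls inline React Server Component text out of an HTML string and scans it for an array of model records. It assumes the page embeds its payload the way Next.js App Router pages typically do, as `self.__next_f.push([1,"..."])` script chunks, and that each chunk is a JSON-compatible string literal. The key name `slug` and both function names are hypothetical; the actor's actual parsing logic is not documented here.

```python
import json
import re

# Assumed chunk format (typical of Next.js App Router pages, not confirmed
# for this site): <script>self.__next_f.push([1,"...escaped text..."])</script>
CHUNK_RE = re.compile(r'self\.__next_f\.push\(\[1,("(?:[^"\\]|\\.)*")\]\)')


def extract_rsc_text(html: str) -> str:
    """Decode every inline chunk and join them into one payload string."""
    return "".join(json.loads(m.group(1)) for m in CHUNK_RE.finditer(html))


def find_record_arrays(payload: str, required_key: str) -> list:
    """Return every JSON array of objects in which all objects have required_key.

    required_key is a hypothetical field name; inspect the real payload first.
    """
    decoder = json.JSONDecoder()
    found = []
    pos = payload.find("[{")
    while pos != -1:
        try:
            value, end = decoder.raw_decode(payload, pos)
        except json.JSONDecodeError:
            # Not valid JSON at this offset; try the next candidate.
            pos = payload.find("[{", pos + 1)
            continue
        is_match = (
            isinstance(value, list)
            and bool(value)
            and all(isinstance(v, dict) and required_key in v for v in value)
        )
        if is_match:
            found.append(value)
            pos = payload.find("[{", end)
        else:
            # Valid JSON but not the shape we want; look inside it as well.
            pos = payload.find("[{", pos + 1)
    return found
```

Joining the chunks before scanning matters because a single JSON value can be split across several script tags.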

Use cases

Model selection: Compare cost-vs-quality trade-offs across providers
Price monitoring: Track pricing changes across OpenAI, Anthropic, Google, Meta, and 40+ hosting providers
Research and benchmarking: Import baseline scores into your own evaluation pipeline
Cost optimization: Find the cheapest or fastest provider for a given quality target
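The cost-optimization use case can be sketched as a small filter over the dataset items. The field names `aa_quality_index` and `price_blended_usd_per_million` come from the output schema in this README; the function name and all records other than `claude-4-opus` are made up for illustration.

```python
from typing import Optional


def cheapest_meeting_quality(records: list, min_quality: float) -> Optional[dict]:
    """Return the lowest blended-price record whose quality index is at least min_quality."""
    candidates = [
        r for r in records
        if r.get("aa_quality_index") is not None
        and r.get("price_blended_usd_per_million") is not None
        and r["aa_quality_index"] >= min_quality
    ]
    # default=None covers the case where no model meets the target.
    return min(
        candidates,
        key=lambda r: r["price_blended_usd_per_million"],
        default=None,
    )


# Illustrative records only; the "hypothetical-*" entries are not real models.
EXAMPLE_RECORDS = [
    {"model_slug": "claude-4-opus", "aa_quality_index": 57.4, "price_blended_usd_per_million": 30},
    {"model_slug": "hypothetical-small", "aa_quality_index": 41.0, "price_blended_usd_per_million": 0.5},
    {"model_slug": "hypothetical-mid", "aa_quality_index": 55.0, "price_blended_usd_per_million": 4.0},
    {"model_slug": "hypothetical-unpriced", "aa_quality_index": 60.0, "price_blended_usd_per_million": None},
]

best = cheapest_meeting_quality(EXAMPLE_RECORDS, min_quality=50)
print(best["model_slug"] if best else "no model meets the target")
```

Records with a null score or price are skipped rather than treated as zero, since many fields in the dataset can be null.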

Input

Field	Type	Required	Default	Description
`maxItems`	integer	Yes	10	Maximum number of model records to return. Set to a large number (e.g. 500) to retrieve all models.
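A minimal sketch of running the actor from Python with the `apify-client` package might look like the following. The `actor_id` argument is a placeholder, since this page does not state the actor's full store ID, and both helper names are hypothetical.

```python
def build_run_input(max_items: int = 10) -> dict:
    """Build the actor input; maxItems is the only field documented above."""
    if max_items < 1:
        raise ValueError("max_items must be a positive integer")
    return {"maxItems": max_items}


def fetch_models(token: str, actor_id: str, max_items: int = 500) -> list:
    """Run the actor and return its dataset items.

    actor_id is a placeholder: copy the real ID from the actor's store page.
    """
    # Imported lazily so build_run_input works without the third-party package
    # (pip install apify-client).
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=build_run_input(max_items))
    if run is None:
        raise RuntimeError("actor run did not return a result")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

With a valid API token, `fetch_models(token, actor_id)` would return a list of dictionaries shaped like the example record in the Output section.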

Output

Each dataset item represents one AI model. Example record:

{
  "model_slug": "claude-4-opus",
  "model_name": "Claude 4 Opus",
  "provider": "Anthropic",
  "release_date": "2025-05-22",
  "parameter_count": null,
  "context_window_tokens": 200000,
  "aa_quality_index": 57.4,
  "mmlu_pro_score": 0.812,
  "gpqa_diamond_score": 0.738,
  "humaneval_score": 0.921,
  "math_score": 84.1,
  "chatbot_arena_elo": null,
  "aider_polyglot_score": null,
  "livecodebench_score": 0.703,
  "mmmu_score": null,
  "benchmark_breakdown": "{\"agentic_index\":45.2,\"coding_index\":68.1,...}",
  "price_input_usd_per_million": 15,
  "price_output_usd_per_million": 75,
  "price_blended_usd_per_million": 30,
  "throughput_tokens_per_second": 58.3,
  "latency_first_token_ms": 1204,
  "hosting_providers": "[\"Anthropic\",\"Amazon Bedrock\",\"Google Vertex AI\"]",
  "cheapest_provider": "Amazon Bedrock",
  "fastest_provider": null,
  "license": "proprietary",
  "is_open_weight": false,
  "profile_url": "https://artificialanalysis.ai/models/claude-4-opus",
  "scraped_at": "2026-05-31T08:00:00.000Z"
}

Notes on specific fields:

chatbot_arena_elo and aider_polyglot_score are always null — these metrics are not tracked by Artificial Analysis and would require separate scrapers from Chatbot Arena and Aider.chat.
benchmark_breakdown is a JSON string containing additional sub-benchmarks (agentic_index, coding_index, math_index, HLE, AIME-2025, IFBench, SciCode, LCR, Omniscience).
hosting_providers is a JSON string array of all providers offering this model.
fastest_provider is always null — per-provider throughput breakdown is not available on the listing page.
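Because `benchmark_breakdown` and `hosting_providers` arrive as JSON strings, consumers will usually want to decode them before use. A minimal sketch, with a hypothetical helper name and a sample record that includes only the two breakdown keys shown in the example above:

```python
import json

# Fields this README documents as JSON-encoded strings.
JSON_STRING_FIELDS = ("benchmark_breakdown", "hosting_providers")


def decode_json_fields(record: dict) -> dict:
    """Return a copy of record with its JSON-string fields parsed."""
    decoded = dict(record)
    for field in JSON_STRING_FIELDS:
        raw = decoded.get(field)
        if isinstance(raw, str):
            try:
                decoded[field] = json.loads(raw)
            except json.JSONDecodeError:
                pass  # keep the raw string if it is not valid JSON
    return decoded


sample = {
    "model_slug": "claude-4-opus",
    "benchmark_breakdown": "{\"agentic_index\":45.2,\"coding_index\":68.1}",
    "hosting_providers": "[\"Anthropic\",\"Amazon Bedrock\",\"Google Vertex AI\"]",
}
parsed = decode_json_fields(sample)
print(parsed["hosting_providers"])
print(parsed["benchmark_breakdown"]["coding_index"])
```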

Notes

The actor makes a single HTTP request to https://artificialanalysis.ai/models. No proxy required.
The full dataset (~370 models) is available in one request. Use maxItems: 500 to get everything.
Prices and benchmarks on Artificial Analysis update frequently — run the actor periodically for up-to-date data.
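For periodic runs, one simple way to spot pricing updates is to compare two snapshots keyed by `model_slug`. The sketch below is an illustration only; the function name is hypothetical, and the default field is the blended price from the output schema.

```python
def price_changes(previous: list, current: list,
                  field: str = "price_blended_usd_per_million") -> list:
    """List models whose price in `field` differs between two snapshots."""
    old_prices = {r["model_slug"]: r.get(field) for r in previous}
    changes = []
    for record in current:
        slug = record["model_slug"]
        old, new = old_prices.get(slug), record.get(field)
        # Skip new models and null prices; report only real differences.
        if old is not None and new is not None and old != new:
            changes.append({"model_slug": slug, "old": old, "new": new})
    return changes
```

Models that appear in only one snapshot are ignored here; tracking additions and removals would need a separate comparison of the slug sets.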
