LMArena LLM Leaderboard Scraper

Pricing

Pay per event

Scrape the LMArena (Chatbot Arena) ELO leaderboard — ranks, ratings, vote counts, and confidence intervals across all arena variants (text, code, vision, document, image, video, and more). Returns one row per model per leaderboard variant.

Developer

BowTiedRaccoon

Maintained by Community

Scrape the LMArena (Chatbot Arena) ELO leaderboard, one of the most widely cited blind-evaluation rankings for large language models. Major model launches frequently cite their Chatbot Arena ELO position. This actor returns one row per model per leaderboard variant, covering all 11 arena categories in a single run.

What You Get

Each record contains:

  • Leaderboard variant — arena slug (text, code, vision, document, text-to-image, image-edit, image-to-code, search, text-to-video, image-to-video, video-to-video)
  • Rank — current position with 95% confidence interval bounds (rank_upper, rank_lower)
  • Model name and organization
  • ELO score with confidence interval low/high
  • Vote count — number of blind battles used to compute this rating
  • License — Proprietary, Apache-2.0, etc.
  • Context window tokens
  • Input/output pricing (per million tokens, where publicly available)
  • Profile URL — link to official model announcement
  • Scraped at — ISO 8601 timestamp

Usage

Basic run (all arenas, default 10 records)

No configuration needed — just run with default settings to get a sample of the leaderboard.

Full leaderboard

Set maxItems to a high number (e.g., 1000) to get all 900+ entries across all arenas.

Filter by arena

Use the arenas input to limit results to specific variants:

{
  "maxItems": 500,
  "arenas": ["text", "code", "vision"]
}

Available arena slugs:

  • text — overall text/conversation benchmark (the main leaderboard)
  • code — coding tasks
  • vision — image understanding
  • document — document Q&A
  • text-to-image — image generation
  • image-edit — image editing
  • image-to-code — screenshot-to-code
  • search — search tasks
  • text-to-video — video generation
  • image-to-video — video generation from an input image
  • video-to-video — video editing

Input Schema

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| maxItems | integer | 10 | Maximum total records to return |
| arenas | array | [] (all) | Arena slugs to include. Empty = return all arenas |
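
As a rough illustration of the two input fields, the sketch below builds an input object in Python and rejects arena slugs that are not in the list above before a run is started. The helper name build_run_input is hypothetical and not part of the actor, and the check that maxItems is at least 1 is an assumption, since the schema only states the type and default.

```python
import json

# Arena slugs copied from the "Available arena slugs" list.
KNOWN_ARENAS = {
    "text", "code", "vision", "document", "text-to-image", "image-edit",
    "image-to-code", "search", "text-to-video", "image-to-video",
    "video-to-video",
}


def build_run_input(arenas=None, max_items=10):
    """Return an input dict for the actor (hypothetical helper).

    Defaults mirror the schema: maxItems=10, arenas=[] (all arenas).
    """
    arenas = list(arenas or [])  # an empty list means "all arenas"
    unknown = sorted(set(arenas) - KNOWN_ARENAS)
    if unknown:
        raise ValueError(f"unknown arena slugs: {unknown}")
    # Assumption: the actor expects a positive integer here.
    if isinstance(max_items, bool) or not isinstance(max_items, int) or max_items < 1:
        raise ValueError("maxItems must be a positive integer")
    return {"maxItems": max_items, "arenas": arenas}


# Reproduces the "Filter by arena" example input.
print(json.dumps(build_run_input(["text", "code", "vision"], 500), indent=2))
```

Validating slugs locally is only a convenience; how the actor itself treats an unrecognized slug is not documented here.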

Output Sample

{
  "leaderboard_variant": "text",
  "rank": 1,
  "rank_upper": 1,
  "rank_lower": 4,
  "model_name": "claude-opus-4-6-thinking",
  "organization": "Anthropic",
  "elo_score": 1502.17,
  "elo_confidence_low": 1497.91,
  "elo_confidence_high": 1506.43,
  "vote_count": 34186,
  "license": "Proprietary",
  "context_window_tokens": 1000000,
  "input_price_per_million": 5,
  "output_price_per_million": 25,
  "profile_url": "https://www.anthropic.com/news/claude-opus-4-6",
  "scraped_at": "2026-06-01T15:05:33.265Z"
}
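
The fields in a record like this can be combined in a few simple ways. The Python sketch below checks whether two models' ELO confidence intervals overlap (a rough signal that their ratings may not be clearly separated) and estimates the price of one request from the per-million-token fields. The function names and the second record are made up for illustration, and treating a missing price as None is an assumption about how unavailable pricing is represented.

```python
def intervals_overlap(a, b):
    """True if the ELO confidence intervals of two records overlap."""
    return (a["elo_confidence_low"] <= b["elo_confidence_high"]
            and b["elo_confidence_low"] <= a["elo_confidence_high"])


def request_cost(record, input_tokens, output_tokens):
    """Estimated price of one request, in the currency of the price fields.

    Returns None when either price is missing (assumed representation).
    """
    price_in = record.get("input_price_per_million")
    price_out = record.get("output_price_per_million")
    if price_in is None or price_out is None:
        return None
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Values taken from the output sample.
sample = {
    "model_name": "claude-opus-4-6-thinking",
    "elo_confidence_low": 1497.91,
    "elo_confidence_high": 1506.43,
    "input_price_per_million": 5,
    "output_price_per_million": 25,
}
# Hypothetical second record, invented for this example.
other = {
    "model_name": "example-model",
    "elo_confidence_low": 1490.0,
    "elo_confidence_high": 1498.5,
    "input_price_per_million": None,
    "output_price_per_million": None,
}

print(intervals_overlap(sample, other))  # intervals share the 1497.91-1498.5 range
print(request_cost(sample, 2000, 500))   # 2000 input + 500 output tokens
print(request_cost(other, 2000, 500))    # pricing unavailable
```

Overlapping intervals are only a heuristic; they do not replace a proper significance test on the underlying votes.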

Technical Notes

  • Single HTTP request — all arena data is embedded in the page's server-side rendered response (~4.6 MB). No pagination, no API calls.
  • No proxy required — the site serves datacenter IPs without blocking.
  • Fast — typical run completes in under 30 seconds.
  • The leaderboard updates daily as new arena battles are processed.

Use Cases

  • Track ELO rankings over time for specific models or organizations
  • Compare models across different task categories (code vs. vision vs. overall)
  • Feed into model-routing or evaluation pipelines
  • Monitor competitive positioning for newly launched models
  • Research and academic benchmarking datasets
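
For the tracking use case, one possible approach is to store the output of each run and compare ranks between two snapshots. The Python sketch below keys rows on (leaderboard_variant, model_name), which follows from the actor returning one row per model per variant. The function name rank_changes and the snapshot records are hypothetical.

```python
def rank_changes(previous, current):
    """Map (leaderboard_variant, model_name) to the rank delta between runs.

    A negative delta means the model moved up (toward rank 1). Models that
    appear in only one snapshot, or whose rank is unchanged, are skipped.
    """
    before = {
        (r["leaderboard_variant"], r["model_name"]): r["rank"] for r in previous
    }
    changes = {}
    for r in current:
        key = (r["leaderboard_variant"], r["model_name"])
        if key in before and r["rank"] != before[key]:
            changes[key] = r["rank"] - before[key]
    return changes


# Made-up snapshots from two runs, for illustration only.
yesterday = [
    {"leaderboard_variant": "text", "model_name": "model-a", "rank": 1},
    {"leaderboard_variant": "text", "model_name": "model-b", "rank": 2},
    {"leaderboard_variant": "code", "model_name": "model-a", "rank": 3},
]
today = [
    {"leaderboard_variant": "text", "model_name": "model-a", "rank": 2},
    {"leaderboard_variant": "text", "model_name": "model-b", "rank": 1},
    {"leaderboard_variant": "code", "model_name": "model-a", "rank": 3},
    {"leaderboard_variant": "code", "model_name": "model-c", "rank": 5},
]

for (variant, model), delta in sorted(rank_changes(yesterday, today).items()):
    print(f"{variant:5s} {model:8s} {delta:+d}")
```

The same keying works for comparing elo_score or vote_count across runs; the scraped_at field can be used to order stored snapshots.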