ARC Prize Leaderboard Scraper

Scrapes ARC Prize leaderboard data (ARC-AGI-1/2/3 benchmarks) for all AI models including scores, costs, providers, and rankings

Developer: Stas Persiianenko (maintained by Community)

Extract AI model rankings, scores, costs, and metadata from the ARC Prize leaderboard, covering the ARC-AGI-1, ARC-AGI-2, and ARC-AGI-3 benchmark datasets.

📖 What does it do?

The ARC Prize Leaderboard Scraper extracts structured benchmark data from arcprize.org/leaderboard, the official leaderboard for the ARC Prize competition, the leading benchmark for measuring progress toward artificial general intelligence (AGI).

Give it a list of benchmark versions (v1, v2, v3) and it returns every model's performance data including scores, costs per task, provider information, model types, and release dates.

What you get for each leaderboard entry:

  • Model name and display label
  • Provider/organization name
  • ARC-AGI benchmark version and dataset
  • Score (0–1 accuracy)
  • Cost per task (v1/v2) or total evaluation cost (v3)
  • Model type (Base LLM, CoT, Custom, etc.)
  • Model group/family
  • Release date

👥 Who is it for?

🔬 AI researchers and academics

Tracking the frontier of AI capabilities? Use this actor to collect time-series data on how different model families progress on ARC-AGI benchmarks without manually scraping the leaderboard.

📊 Data scientists and analysts

Building dashboards comparing LLM capabilities, cost-efficiency frontiers, or vendor performance? Get structured, queryable JSON output for all models across all benchmark versions.

🤖 AI product teams and investors

Monitor competitor model performance on the hardest reasoning benchmarks, track cost-efficiency trends, or build automated capability-tracking pipelines.

📰 AI journalists and content creators

Writing about AGI progress? Pull fresh leaderboard data programmatically to power articles, newsletters, or automated reports.

🏫 Educators and course creators

Teaching AI capabilities and limitations? Use live leaderboard data in lectures, assignments, and demos.


🚀 Why use it?

  • Direct JSON endpoint: ARC Prize exposes clean public JSON endpoints, so no HTML parsing or browser automation is needed
  • All benchmark versions: covers ARC-AGI-1, ARC-AGI-2, and ARC-AGI-3 in a single run
  • Structured output: fully typed fields, consistent schema across versions
  • Always fresh: fetches live data from arcprize.org on every run
  • Low cost: pure HTTP requests, runs in under 30 seconds with minimal compute

📊 Data fields extracted

| Field | Type | Description |
| --- | --- | --- |
| version | string | Benchmark version: v1, v2, or v3 |
| datasetId | string | Internal dataset identifier (e.g., v1_Semi_Private) |
| datasetDisplayName | string | Human-readable dataset name (ARC-AGI-1, ARC-AGI-2, ARC-AGI-3) |
| modelId | string | Unique model identifier |
| modelDisplayName | string | Human-readable model name |
| modelType | string or null | Model type: Base LLM, CoT, Custom, CoT + Synthesis, etc. |
| modelGroup | string or null | Model family/group name |
| providerId | string | Provider identifier (e.g., Anthropic, OpenAI) |
| providerDisplayName | string | Human-readable provider name |
| score | number | Accuracy score (0–1, where 1.0 = 100%) |
| costPerTask | number or null | Cost in USD per task solved (v1/v2); null for v3 |
| totalCost | number or null | Total evaluation cost in USD (v3); null for v1/v2 |
| modelReleaseDate | string or null | Model release date (ISO 8601) |
| display | boolean | Whether this entry is shown on the public leaderboard |
| resultsUrl | string | URL to detailed results (if available) |
| leaderboardUrl | string | URL to the ARC Prize leaderboard |

💰 How much does it cost?

This scraper uses Pay-Per-Event (PPE) pricing: you pay a small one-time fee when a run starts, plus a per-entry fee for each entry actually extracted.

| What you pay for | Cost |
| --- | --- |
| Run started (one-time) | $0.005 |
| Per leaderboard entry extracted | $0.0005 |

Example costs:

  • All 3 benchmark versions (~300 entries total): ~$0.155
  • One benchmark version (~100–150 entries): ~$0.055–0.080
  • Monthly monitoring (weekly runs, all versions, about 4 runs): ~$0.62/month
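
The example costs follow directly from the two prices in the table. As a rough sketch (the function names are illustrative and not part of the actor), the same arithmetic in Python:

```python
# Prices copied from the pricing table above, in USD.
RUN_START_FEE = 0.005
PER_ENTRY_FEE = 0.0005


def estimate_run_cost(entries: int) -> float:
    """Estimated cost of one run that extracts `entries` leaderboard rows."""
    if entries < 0:
        raise ValueError("entries must be non-negative")
    return round(RUN_START_FEE + entries * PER_ENTRY_FEE, 4)


def estimate_monthly_cost(entries_per_run: int, runs_per_month: int) -> float:
    """Estimated monthly cost of a recurring schedule."""
    return round(estimate_run_cost(entries_per_run) * runs_per_month, 4)


if __name__ == "__main__":
    print(estimate_run_cost(300))         # all three versions, ~300 entries
    print(estimate_monthly_cost(300, 4))  # weekly schedule, about 4 runs a month
```

Entry counts change as the leaderboard grows, so treat the result as an estimate rather than a quote.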

⚙️ Input configuration

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| datasets | array | ["v1","v2","v3"] | Which benchmark versions to scrape |
| includeHidden | boolean | false | Include entries not shown on public leaderboard |
| maxRequestRetries | integer | 3 | Retry attempts for failed HTTP requests |

📋 Example input

Scrape all three ARC-AGI benchmark leaderboards:

{
    "datasets": ["v1", "v2", "v3"],
    "includeHidden": false
}

Scrape only ARC-AGI-2 including hidden entries:

{
    "datasets": ["v2"],
    "includeHidden": true
}
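
If you build the input in code, it can help to validate it before starting a run. The helper below is a hypothetical sketch: `build_input` is not part of the actor or the Apify SDK, it only mirrors the parameters and defaults listed in the input configuration table.

```python
from typing import Any, Iterable

ALLOWED_VERSIONS = ("v1", "v2", "v3")  # versions listed in the input table


def build_input(
    datasets: Iterable[str] = ALLOWED_VERSIONS,
    include_hidden: bool = False,
    max_request_retries: int = 3,
) -> dict[str, Any]:
    """Return an actor input dict, rejecting unknown benchmark versions."""
    selected = list(dict.fromkeys(datasets))  # drop duplicates, keep order
    if not selected:
        raise ValueError("datasets must contain at least one version")
    unknown = [d for d in selected if d not in ALLOWED_VERSIONS]
    if unknown:
        raise ValueError(f"unknown benchmark versions: {unknown}")
    if max_request_retries < 0:
        raise ValueError("max_request_retries must be non-negative")
    return {
        "datasets": selected,
        "includeHidden": bool(include_hidden),
        "maxRequestRetries": int(max_request_retries),
    }


if __name__ == "__main__":
    print(build_input(["v2"], include_hidden=True))
```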

📤 Example output

{
    "version": "v1",
    "datasetId": "v1_Semi_Private",
    "datasetDisplayName": "ARC-AGI-1",
    "modelId": "Claude 3.7",
    "modelDisplayName": "Claude 3.7",
    "modelType": "Base LLM",
    "modelGroup": null,
    "providerId": "Anthropic",
    "providerDisplayName": "Anthropic",
    "score": 0.136,
    "costPerTask": 0.058,
    "totalCost": null,
    "modelReleaseDate": "2025-02-24T00:00:00.000Z",
    "display": true,
    "resultsUrl": "",
    "leaderboardUrl": "https://arcprize.org/leaderboard"
}
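
Because `score` is a 0–1 fraction and the cost lives in either `costPerTask` or `totalCost` depending on the version, downstream code usually normalizes both. A minimal sketch, assuming entries shaped like the example above (the helper names are illustrative):

```python
from typing import Any, Optional


def score_percent(entry: dict[str, Any]) -> float:
    """Convert the 0-1 accuracy score to a percentage."""
    return round(entry["score"] * 100, 2)


def reported_cost(entry: dict[str, Any]) -> tuple[Optional[str], Optional[float]]:
    """Return (kind, usd): 'per_task' for v1/v2 entries, 'total' for v3 entries."""
    if entry.get("costPerTask") is not None:
        return "per_task", entry["costPerTask"]
    if entry.get("totalCost") is not None:
        return "total", entry["totalCost"]
    return None, None  # no cost reported for this entry


if __name__ == "__main__":
    sample = {"version": "v1", "score": 0.136, "costPerTask": 0.058, "totalCost": None}
    print(score_percent(sample), reported_cost(sample))
```

The two cost kinds are not directly comparable, so keep the kind alongside the number rather than merging them into one column.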

🛠️ How to use

Follow these steps to get leaderboard data from the Apify Store:

  1. Open the actor: go to ARC Prize Leaderboard Scraper on the Apify Store and click Try for free.
  2. Configure input: in the Input tab, choose which benchmark versions (v1, v2, v3) to scrape and whether to include hidden entries.
  3. Run: click Start and wait for the actor to finish (typically under 30 seconds).
  4. Download results: go to the Dataset tab to view extracted entries. Export as JSON, CSV, or JSONL using the Export button, or fetch via the Apify API.
  5. Schedule recurring runs (optional): click Schedule to run automatically (e.g., weekly) and always have fresh leaderboard data.
  6. Connect to downstream tools: use the Apify integrations to send data to Google Sheets, Slack, Webhooks, or any HTTP endpoint after each run.
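
If you fetch items through the API instead of the Export button, a few lines of Python produce a comparable CSV. This is a sketch under the assumption that items look like the example output; the column selection is an arbitrary choice, not something the actor prescribes.

```python
import csv
import io
from typing import Any, Iterable, Sequence

# Illustrative subset of the documented fields.
COLUMNS = ("version", "modelDisplayName", "providerDisplayName",
           "score", "costPerTask", "totalCost")


def items_to_csv(items: Iterable[dict[str, Any]],
                 columns: Sequence[str] = COLUMNS) -> str:
    """Serialize dataset items to CSV text, writing null values as empty cells."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(columns),
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for item in items:
        writer.writerow({c: "" if item.get(c) is None else item[c] for c in columns})
    return buf.getvalue()


if __name__ == "__main__":
    rows = [{"version": "v1", "modelDisplayName": "Claude 3.7",
             "providerDisplayName": "Anthropic", "score": 0.136,
             "costPerTask": 0.058, "totalCost": None, "display": True}]
    print(items_to_csv(rows))
```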

🔗 Integrations

Connect ARC Prize Leaderboard Scraper to your existing tools and workflows:

Google Sheets

Use the Apify → Google Sheets integration to automatically append fresh leaderboard data to a spreadsheet after each run. Ideal for building live-updating dashboards or sharing data with your team.

Slack notifications

Trigger a Slack message whenever the leaderboard changes (e.g., a new model enters the top 10). Wire up the Apify → Slack integration in the Integrations tab.
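
Deciding whether a run is worth a notification means comparing it with the previous run. One possible approach, sketched here with hypothetical helper names, is to diff the top-N model IDs of two snapshots and notify only when the list of newcomers is non-empty:

```python
from typing import Any


def top_n_ids(items: list[dict[str, Any]], version: str, n: int = 10) -> list[str]:
    """Model IDs of the n best-scoring publicly displayed entries for one version."""
    rows = [i for i in items if i["version"] == version and i.get("display", True)]
    rows.sort(key=lambda i: i["score"], reverse=True)
    return [i["modelId"] for i in rows[:n]]


def new_top_entries(previous: list[dict[str, Any]],
                    current: list[dict[str, Any]],
                    version: str, n: int = 10) -> list[str]:
    """Model IDs that are in the current top n but were not in the previous top n."""
    before = set(top_n_ids(previous, version, n))
    return [m for m in top_n_ids(current, version, n) if m not in before]


if __name__ == "__main__":
    old = [{"version": "v2", "modelId": "a", "score": 0.20},
           {"version": "v2", "modelId": "b", "score": 0.10}]
    new = old + [{"version": "v2", "modelId": "c", "score": 0.25}]
    print(new_top_entries(old, new, "v2", n=2))
```

Sending the actual Slack message is left to the Apify integration or your own code.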

Webhooks

After each run completes, fire a webhook to any HTTP endpoint: your own backend, a Zapier/Make workflow, or an n8n automation. Configure it in the actor's Integrations tab → Webhook.
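
On the receiving side, the endpoint typically needs the dataset ID of the finished run. The sketch below assumes Apify's default webhook payload, with an `eventType` string and a `resource` object carrying the run's `defaultDatasetId`. If you customize the payload template, adjust the keys accordingly. The function name is hypothetical.

```python
from typing import Any, Optional


def dataset_id_from_webhook(payload: dict[str, Any]) -> Optional[str]:
    """Return the run's default dataset ID for successful runs, else None.

    Assumes the default payload shape: {"eventType": ..., "resource": {...}}.
    """
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None  # ignore failed, aborted, or timed-out runs
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")


if __name__ == "__main__":
    example = {"eventType": "ACTOR.RUN.SUCCEEDED",
               "resource": {"defaultDatasetId": "example-dataset-id"}}
    print(dataset_id_from_webhook(example))
```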

Apify API + Python / Node.js

Embed leaderboard scraping in your own data pipeline using the Apify Python client or Node.js client. See the API usage examples section below.

Make (Integromat) / Zapier

Use Apify's native Make and Zapier connectors to route leaderboard data into spreadsheets, databases, or notification services without writing code.

AI agents via MCP

Expose live benchmark data to Claude, Cursor, or VS Code AI features. See the MCP integration section below.


🔧 Technical details

  • Architecture: Pure HTTP, no browser needed
  • Source: arcprize.org/media/data/leaderboard/{v1,v2,v3}.json
  • Typical runtime: < 30 seconds
  • Memory: 256 MB
  • Rate limits: Public JSON endpoints, no rate limiting observed
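
For reference, the source pattern above expands to one URL per benchmark version. The sketch below assumes the `https://` scheme (the source line omits it) and makes no assumption about the structure of the raw files, which is not documented here. `source_url` and `fetch_raw` are illustrative names.

```python
import json
from typing import Any
from urllib.request import urlopen

# Assumption: the endpoints are served over HTTPS.
BASE_URL = "https://arcprize.org/media/data/leaderboard"
VERSIONS = ("v1", "v2", "v3")


def source_url(version: str) -> str:
    """Build the JSON URL for one benchmark version."""
    if version not in VERSIONS:
        raise ValueError(f"unknown benchmark version: {version!r}")
    return f"{BASE_URL}/{version}.json"


def fetch_raw(version: str, timeout: float = 30.0) -> Any:
    """Download and parse one raw leaderboard file (performs a network request)."""
    with urlopen(source_url(version), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    for v in VERSIONS:
        print(source_url(v))
```

The actor adds the normalized schema, retries, and dataset storage on top of these raw files.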

🤖 MCP integration (Claude, Cursor, VS Code)

Use this actor as a live data source inside AI coding assistants via the Apify MCP server.

Claude Code (terminal)

Install the Apify MCP server into Claude Code with one command:

$ claude mcp add apify -- npx -y @apify/mcp-server@latest

Then set your API token:

$ export APIFY_API_KEY=your_apify_api_token

In any Claude conversation you can then ask:

  • "Run the arcprize-leaderboard-scraper actor and show me the top 10 models by score on ARC-AGI-2."
  • "Show me all models from Anthropic on ARC-AGI-1 with their scores, sorted by score descending."
  • "Which models have scores above 50% on ARC-AGI-2? Run the scraper and filter the results."
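
If you prefer to post-process results yourself rather than ask an assistant, the same questions reduce to a filter and a sort over the dataset items. A minimal sketch, assuming items shaped like the example output (`filter_entries` is a hypothetical helper):

```python
from typing import Any, Optional


def filter_entries(items: list[dict[str, Any]],
                   version: Optional[str] = None,
                   provider: Optional[str] = None,
                   min_score: Optional[float] = None) -> list[dict[str, Any]]:
    """Keep entries matching every given criterion, sorted by score descending."""
    rows = []
    for item in items:
        if version is not None and item["version"] != version:
            continue
        if provider is not None and item["providerDisplayName"] != provider:
            continue
        if min_score is not None and item["score"] <= min_score:
            continue  # "above 50%" means strictly greater than 0.5
        rows.append(item)
    return sorted(rows, key=lambda i: i["score"], reverse=True)


if __name__ == "__main__":
    data = [
        {"version": "v2", "providerDisplayName": "Anthropic", "score": 0.10},
        {"version": "v2", "providerDisplayName": "OpenAI", "score": 0.60},
        {"version": "v1", "providerDisplayName": "Anthropic", "score": 0.30},
    ]
    print(filter_entries(data, version="v2")[:10])         # top 10 on ARC-AGI-2
    print(filter_entries(data, version="v2", min_score=0.5))
```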

Claude Desktop

Add the Apify MCP server to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": ["-y", "@apify/mcp-server@latest"],
            "env": {
                "APIFY_API_KEY": "your_apify_api_token"
            }
        }
    }
}

Cursor / VS Code

Add to your MCP settings (.cursor/mcp.json or VS Code MCP config):

{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": ["-y", "@apify/mcp-server@latest"],
            "env": {
                "APIFY_API_KEY": "your_apify_api_token"
            }
        }
    }
}

Once configured, your AI assistant can call run_actor with actor ID automation-lab/arcprize-leaderboard-scraper and input like {"datasets": ["v1","v2","v3"]} to fetch live leaderboard data mid-conversation.


🤔 FAQ: Frequently asked questions

What is ARC-AGI? The Abstraction and Reasoning Corpus for Artificial General Intelligence (ARC-AGI) is a benchmark created by François Chollet to measure general reasoning abilities: tasks that require pattern recognition and abstract reasoning rather than knowledge retrieval.

What's the difference between ARC-AGI-1, -2, and -3?

  • ARC-AGI-1: Original 2019 benchmark; many top models now exceed 85% accuracy
  • ARC-AGI-2: Harder 2025 version; current best models score under 30%
  • ARC-AGI-3: Hardest 2025/2026 version; frontier models score under 1%

Why are some entries hidden? Hidden entries (display: false) include superseded models, internal test runs, or entries the ARC Prize team chose not to highlight. Enable includeHidden: true to see them.

How often is the leaderboard updated? arcprize.org updates their JSON files when new evaluation results are submitted. Run this actor regularly (e.g., weekly via Apify schedules) to track changes over time.
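
Each run returns only the current state of the leaderboard, so tracking changes over time means storing a timestamp with every snapshot. One simple approach, sketched below, appends timestamped items to a JSON Lines file. The `scrapedAt` key is a hypothetical addition, not a field the actor outputs.

```python
import json
from datetime import datetime, timezone
from typing import Any, Optional


def snapshot_lines(items: list[dict[str, Any]],
                   scraped_at: Optional[datetime] = None) -> list[str]:
    """Serialize items as JSON lines, each tagged with one snapshot timestamp."""
    stamp = (scraped_at or datetime.now(timezone.utc)).isoformat()
    return [json.dumps({**item, "scrapedAt": stamp}, sort_keys=True) for item in items]


def append_snapshot(path: str, items: list[dict[str, Any]],
                    scraped_at: Optional[datetime] = None) -> None:
    """Append one snapshot to a JSONL history file."""
    with open(path, "a", encoding="utf-8") as fh:
        for line in snapshot_lines(items, scraped_at):
            fh.write(line + "\n")


if __name__ == "__main__":
    when = datetime(2025, 1, 1, tzinfo=timezone.utc)
    print(snapshot_lines([{"modelId": "m", "score": 0.5}], when))
```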

Why does v3 use totalCost instead of costPerTask? ARC-AGI-3 reports total evaluation cost rather than per-task cost, reflecting the full infrastructure cost of a complete evaluation run.


💻 API usage examples

You can trigger this actor programmatically via the Apify API or SDKs.

Node.js (ApifyClient):

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Start the actor and wait for the run to finish.
const run = await client.actor('automation-lab/arcprize-leaderboard-scraper').call({
    datasets: ['v1', 'v2', 'v3'],
    includeHidden: false,
});

// Read the extracted entries from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);

Python (ApifyClient):

from apify_client import ApifyClient

client = ApifyClient('YOUR_API_TOKEN')

# Start the actor and wait for the run to finish.
run = client.actor('automation-lab/arcprize-leaderboard-scraper').call(run_input={
    'datasets': ['v1', 'v2', 'v3'],
    'includeHidden': False,
})

# Read the extracted entries from the run's default dataset.
items = client.dataset(run['defaultDatasetId']).list_items().items
print(items)

cURL:

curl -X POST \
    "https://api.apify.com/v2/acts/automation-lab~arcprize-leaderboard-scraper/runs?token=YOUR_API_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{"datasets":["v1","v2","v3"],"includeHidden":false}'


⚖️ Legality and terms of use

This actor accesses publicly available JSON endpoints on arcprize.org, the same data that powers the public leaderboard website. No authentication is required, and the data is intentionally made public for research and benchmarking transparency.

  • No login, credentials, or bypassing of access controls is involved
  • The data is publicly published by the ARC Prize organization
  • Usage should comply with arcprize.org's terms of service
  • Do not use for commercial redistribution of the data without permission from the ARC Prize Foundation