Cloudflare Workers AI Models Scraper

Deprecated

Scrapes the full catalog of AI models available on Cloudflare Workers AI, including model IDs, task categories, descriptions, context lengths, and capabilities.

Pricing: Pay per event

Developer: Stas Persiianenko

Last modified: 8 days ago

What does Cloudflare Workers AI Models Scraper do?

Cloudflare Workers AI Models Scraper extracts the complete catalog of AI models available on Cloudflare Workers AI — no API key, no login, and no coding required. Run the actor and get structured data for every model including model IDs, task categories, context lengths, async support, and hosting details.

Cloudflare Workers AI is Cloudflare's edge-hosted AI inference platform, running 97+ models globally across text generation, image generation, speech recognition, embeddings, translation, and more. The actor fetches their official models catalog using a single HTTP request, parses the server-rendered data, and outputs clean JSON records ready to export or integrate into any pipeline.

Use this actor to monitor Cloudflare's model catalog changes, track when new models are added or deprecated, or build comparisons across edge AI providers.

Who is it for?

🤖 AI developers building on Cloudflare Workers

Find the exact model ID before integrating Cloudflare Workers AI into your application
Verify context lengths and async support for planning request budgets
Automate checks for new model releases or deprecations

📊 ML researchers and data scientists

Track which models Cloudflare adds to their edge inference platform over time
Compare model availability across Cloudflare, Groq, DeepInfra, and OpenRouter
Build datasets for cross-provider model capability analysis

💰 Platform teams and FinOps

Monitor model availability changes that might affect your production stack
Filter models by task category to discover alternatives for your use case
Track deprecated models before they break your integration

🏢 AI product managers and strategists

Understand what AI capabilities Cloudflare offers for edge deployment
Identify emerging model categories being added to the platform
Build dashboards comparing Cloudflare Workers AI to other inference providers

Why use Cloudflare Workers AI Models Scraper?

No API key required — Cloudflare's models catalog is fully public
Single HTTP request — fetches all 97+ models in one call with zero JS rendering
Zero proxy cost — no browser automation or residential proxies needed
Covers all model types — text generation, image, embedding, speech, translation, object detection, image classification, and more
Task category filter — narrow results to a specific category like "Text Generation" or "Text-to-Image"
Deprecated model filter — optionally include or exclude deprecated models
Pay-per-event pricing — pay only for models extracted, not idle compute time
Schedule and automate — run daily or weekly to track catalog changes over time
Export anywhere — JSON, CSV, Excel, Google Sheets, or push via API and webhook

What data does it extract?

Field	Type	Description
`modelId`	string	API identifier (e.g. `@cf/meta/llama-3.3-70b-instruct`)
`displayName`	string	Human-readable model name
`description`	string	Full model description
`taskName`	string	Task category (e.g. Text Generation, Text-to-Image)
`taskDescription`	string	Description of the task category
`contextLength`	number | null	Context window size in tokens (null if not applicable)
`maxOutputTokens`	number | null	Maximum output tokens per request (null if not applicable)
`supportsAsync`	boolean	Whether the model supports asynchronous execution
`hosting`	string	Hosting type (e.g. `hosted`)
`tags`	string[]	Model tags (e.g. `deprecated`, `recommended`)
`sourceUrl`	string	Source URL of the Cloudflare Workers AI models catalog

How much does it cost to scrape Cloudflare Workers AI models?

🟢 Very cheap. Cloudflare's catalog currently lists 97 models. At $0.002 per model extracted, a full run costs approximately $0.19 plus the $0.005 start fee — under $0.20 total.

Run type	Models	Estimated cost
One-time full scrape	~97 models	~$0.20
Weekly monitoring (52 runs/year)	~97 models	~$10/year
Daily monitoring (365 runs/year)	~97 models	~$73/year
Filtered (Text Generation only)	~60 models	~$0.13/run

All Apify platform compute costs are included. No proxy fees since the page is accessible without proxies.
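
The estimates in the table can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes the two prices quoted above ($0.005 start fee, $0.002 per model) are still current; the function names are illustrative and not part of the actor.

```python
# Prices quoted in the section above; treat them as a snapshot, not a guarantee.
START_FEE_USD = 0.005
PER_MODEL_USD = 0.002


def estimate_run_cost(models: int) -> float:
    """Estimated cost in USD of one run that extracts `models` records."""
    return START_FEE_USD + models * PER_MODEL_USD


def estimate_yearly_cost(models: int, runs_per_year: int) -> float:
    """Estimated cost in USD of `runs_per_year` identical runs."""
    return estimate_run_cost(models) * runs_per_year


if __name__ == "__main__":
    print(f"full run:  ${estimate_run_cost(97):.3f}")
    print(f"weekly:    ${estimate_yearly_cost(97, 52):.2f}/year")
    print(f"daily:     ${estimate_yearly_cost(97, 365):.2f}/year")
    print(f"60 models: ${estimate_run_cost(60):.3f}")
```

Because the per-model charge dominates the start fee, a `taskCategory` filter reduces cost roughly in proportion to the number of models it excludes.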

How to use Cloudflare Workers AI Models Scraper

Step 1: Open the actor on Apify Store.

Step 2: Click Try for free. No configuration is needed: by default the actor scrapes all non-deprecated models.

Step 3: Click Start and wait 5–15 seconds.

Step 4: View results in the Dataset tab. Export to JSON, CSV, or Excel.

Step 5 (optional): Set the Task category filter to narrow down results (e.g. Text Generation to get only LLMs).

Step 6 (optional): Schedule recurring runs in the Schedules tab to monitor catalog changes over time.

Input parameters

Parameter	Type	Default	Description
`taskCategory`	string	`""` (all)	Filter by task category name. Case-insensitive exact match. Examples: `Text Generation`, `Text-to-Image`, `Text Embeddings`, `Speech-to-Text`, `Text-to-Speech`, `Translation`, `Object Detection`
`includeDeprecated`	boolean	`false`	Include deprecated models in results. Deprecated models are tagged with `deprecated` and excluded by default.
`maxRequestRetries`	integer	`3`	Number of retry attempts if the Cloudflare models page fails to load.

No mandatory input is required. Clicking Start with defaults returns all non-deprecated models.
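
To make the filter semantics concrete, here is a minimal client-side sketch of the documented behaviour: a case-insensitive exact match on the task category, with deprecated models excluded unless requested. It is an illustration of the rules in the table, not the actor's own source code, and `matches_input` is a hypothetical helper name.

```python
from typing import Any, Dict


def matches_input(
    record: Dict[str, Any],
    task_category: str = "",
    include_deprecated: bool = False,
) -> bool:
    """Return True if `record` would pass the documented input filters."""
    # Deprecated models carry a `deprecated` tag and are dropped by default.
    if not include_deprecated and "deprecated" in record.get("tags", []):
        return False
    # An empty category means "all"; otherwise compare whole names, ignoring case.
    if task_category:
        return record.get("taskName", "").casefold() == task_category.casefold()
    return True


if __name__ == "__main__":
    sample = {"taskName": "Text Generation", "tags": []}
    print(matches_input(sample, "text generation"))  # whole-name match, case ignored
    print(matches_input(sample, "Text"))             # partial names do not match
```

Because the match is exact, a value such as `Text` selects nothing; use the full category name, for example `Text Generation`.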

Output example

{
  "modelId": "@cf/meta/llama-3.3-70b-instruct",
  "displayName": "@cf/meta/llama-3.3-70b-instruct",
  "description": "Llama 3.3 is a 70B parameter multilingual large language model...",
  "taskName": "Text Generation",
  "taskDescription": "Family of generative text models, such as large language models (LLM)...",
  "contextLength": 131072,
  "maxOutputTokens": null,
  "supportsAsync": true,
  "hosting": "hosted",
  "tags": [],
  "sourceUrl": "https://developers.cloudflare.com/workers-ai/models/"
}

Embedding model example:

{
  "modelId": "@cf/baai/bge-base-en-v1.5",
  "displayName": "@cf/baai/bge-base-en-v1.5",
  "description": "BAAI general embedding (Base) model that transforms any given text into a 768-dimensional vector",
  "taskName": "Text Embeddings",
  "taskDescription": "Feature extraction models transform raw data into numerical features...",
  "contextLength": 153600,
  "maxOutputTokens": null,
  "supportsAsync": true,
  "hosting": "hosted",
  "tags": [],
  "sourceUrl": "https://developers.cloudflare.com/workers-ai/models/"
}
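
When consuming these records in typed code, it can help to write the shape down once. The sketch below mirrors the field table as a `TypedDict` and shows one way to handle the nullable `contextLength`; the class and helper names are assumptions made for illustration.

```python
import json
from typing import List, Optional, TypedDict


class ModelRecord(TypedDict):
    """One dataset item, mirroring the field table above."""
    modelId: str
    displayName: str
    description: str
    taskName: str
    taskDescription: str
    contextLength: Optional[int]
    maxOutputTokens: Optional[int]
    supportsAsync: bool
    hosting: str
    tags: List[str]
    sourceUrl: str


def context_label(record: ModelRecord) -> str:
    """Human-readable context window; null (None) means not applicable."""
    ctx = record["contextLength"]
    return "n/a" if ctx is None else f"{ctx:,} tokens"


# Record copied from the first output example above.
RAW = """{
  "modelId": "@cf/meta/llama-3.3-70b-instruct",
  "displayName": "@cf/meta/llama-3.3-70b-instruct",
  "description": "Llama 3.3 is a 70B parameter multilingual large language model...",
  "taskName": "Text Generation",
  "taskDescription": "Family of generative text models, such as large language models (LLM)...",
  "contextLength": 131072,
  "maxOutputTokens": null,
  "supportsAsync": true,
  "hosting": "hosted",
  "tags": [],
  "sourceUrl": "https://developers.cloudflare.com/workers-ai/models/"
}"""

record: ModelRecord = json.loads(RAW)
print(record["modelId"], context_label(record))
```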

Tips and best practices

💡 Schedule for catalog monitoring — Cloudflare frequently adds new models and deprecates old ones. Schedule weekly runs and compare datasets to track changes automatically.

💡 Use taskCategory to filter — Set taskCategory to Text Generation to get only LLMs, or Text Embeddings to find all embedding models. This reduces cost when you only need a subset.

💡 Set includeDeprecated to true for audits — When auditing your integrations, enable deprecated model inclusion to see the full catalog including models being phased out.

💡 Combine with other model scrapers — Pair this actor with Groq Models Scraper or OpenRouter Models Scraper to build a comprehensive cross-provider model catalog.

💡 Check supportsAsync for production planning — Async execution support matters for batch inference workloads. The actor has no input for this field, so filter the output for records where supportsAsync is true when planning high-throughput use cases.
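
The first tip suggests comparing datasets between scheduled runs. One possible way to do that is to key both snapshots by `modelId` and compare the sets. In this sketch, `diff_snapshots` is a hypothetical helper and the model IDs in the demo are made up.

```python
from typing import Any, Dict, List


def diff_snapshots(
    old: List[Dict[str, Any]], new: List[Dict[str, Any]]
) -> Dict[str, List[str]]:
    """Compare two runs' items and report catalog changes by modelId."""
    old_by_id = {m["modelId"]: m for m in old}
    new_by_id = {m["modelId"]: m for m in new}

    def is_deprecated(model: Dict[str, Any]) -> bool:
        return "deprecated" in model.get("tags", [])

    return {
        "added": sorted(new_by_id.keys() - old_by_id.keys()),
        "removed": sorted(old_by_id.keys() - new_by_id.keys()),
        # Present in both runs, but only tagged deprecated in the newer one.
        "newly_deprecated": sorted(
            model_id
            for model_id in old_by_id.keys() & new_by_id.keys()
            if is_deprecated(new_by_id[model_id])
            and not is_deprecated(old_by_id[model_id])
        ),
    }


if __name__ == "__main__":
    # Made-up IDs for illustration only.
    last_week = [{"modelId": "@cf/example/a", "tags": []}]
    this_week = [
        {"modelId": "@cf/example/a", "tags": ["deprecated"]},
        {"modelId": "@cf/example/b", "tags": []},
    ]
    print(diff_snapshots(last_week, this_week))
```

`newly_deprecated` is only meaningful when both runs used `includeDeprecated: true`. With the default setting, a model that becomes deprecated simply shows up under `removed`.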

Integrations

🔄 Zapier / Make.com — Trigger downstream workflows when new models appear in Cloudflare's catalog. Connect model data to Slack notifications, Airtable, or Google Sheets automatically.

📊 Google Sheets — Export results directly to a Google Sheet for team-visible model tracking dashboards. Use Apify's native Google Sheets integration or schedule the actor and auto-push results.

🤖 AI platforms — Feed model catalog data into your own model selection tools or cost calculators to power real-time model availability checks in your app.

🗄️ Data pipelines — Push results to PostgreSQL, BigQuery, or Snowflake via Apify webhooks for long-term catalog trend analysis.

📬 Webhook notifications — Set up Apify webhooks to POST results to your endpoint whenever a run completes, enabling real-time model catalog updates in your infrastructure.
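
On the receiving end of a webhook, the main job is usually to find the dataset that the finished run produced. The sketch below assumes Apify's default webhook payload, in which `eventType` names the event and `resource` holds the run object; if you customise the payload template, adjust the keys accordingly. The function name and the IDs in the demo are hypothetical.

```python
import json
from typing import Optional


def dataset_id_from_webhook(body: bytes) -> Optional[str]:
    """Return the run's default dataset ID for a successful run, else None."""
    payload = json.loads(body)
    # Assumption: default payload template with `eventType` and `resource` keys.
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")


if __name__ == "__main__":
    # Made-up payload for illustration only.
    demo = json.dumps(
        {
            "eventType": "ACTOR.RUN.SUCCEEDED",
            "resource": {"id": "RUN_ID", "defaultDatasetId": "DATASET_ID"},
        }
    ).encode()
    print(dataset_id_from_webhook(demo))
```

The returned ID can then be passed to `client.dataset(...)` to load the items into a database or warehouse.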

API usage

Node.js (Apify client)

import { ApifyClient } from 'apify-client';

// Authenticate with your Apify API token.
const client = new ApifyClient({ token: 'YOUR_APIFY_TOKEN' });

// Start the actor and wait for the run to finish.
const run = await client.actor('automation-lab/cloudflare-workers-ai-scraper').call({
    taskCategory: 'Text Generation',
});

// Read the extracted records from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(`Extracted ${items.length} Cloudflare Workers AI models`);
for (const m of items) {
    console.log(`${m.modelId}: context=${m.contextLength}`);
}

Python

from apify_client import ApifyClient

# Authenticate with your Apify API token.
client = ApifyClient("YOUR_APIFY_TOKEN")

# Start the actor and wait for the run to finish.
run = client.actor("automation-lab/cloudflare-workers-ai-scraper").call(
    run_input={"taskCategory": "Text Generation"}
)

# Read the extracted records from the run's default dataset.
items = client.dataset(run["defaultDatasetId"]).list_items().items
print(f"Extracted {len(items)} Cloudflare Workers AI models")
for m in items:
    print(f"{m['modelId']}: contextLength={m.get('contextLength')}")

cURL

# Start the actor
curl -X POST \
  "https://api.apify.com/v2/acts/automation-lab~cloudflare-workers-ai-scraper/runs?token=YOUR_APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"taskCategory": "Text Generation"}'

# Get results once the run has finished (replace DATASET_ID with the defaultDatasetId from the start response)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_APIFY_TOKEN"
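
Apify can export datasets to CSV directly, but if you prefer to control the columns yourself, items fetched with either client above can be written out with the standard library. This is a sketch under stated assumptions: the column order and the `;` separator for `tags` are choices made here, not something the actor prescribes.

```python
import csv
import io
from typing import Any, Dict, Iterable

# Columns to export, in order; names follow the field table in this document.
FIELDS = [
    "modelId",
    "taskName",
    "contextLength",
    "maxOutputTokens",
    "supportsAsync",
    "hosting",
    "tags",
]


def items_to_csv(items: Iterable[Dict[str, Any]]) -> str:
    """Serialise dataset items to CSV text, ignoring fields not in FIELDS."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for item in items:
        row = dict(item)
        # Flatten the tag list into one cell; None values become empty cells.
        row["tags"] = ";".join(item.get("tags", []))
        writer.writerow(row)
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up record for illustration only.
    demo = [{"modelId": "@cf/example/demo", "taskName": "Text Generation",
             "contextLength": None, "tags": ["recommended"]}]
    print(items_to_csv(demo))
```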

Use with Claude AI (MCP)

You can use Cloudflare Workers AI Models Scraper directly inside Claude via the Apify MCP server.

Claude Code (terminal)

claude mcp add --transport http apify "https://mcp.apify.com?tools=automation-lab/cloudflare-workers-ai-scraper"

Claude Desktop / Cursor / VS Code

Add to your MCP config:

{
  "mcpServers": {
    "apify": {
      "type": "http",
      "url": "https://mcp.apify.com?tools=automation-lab/cloudflare-workers-ai-scraper",
      "headers": {
        "Authorization": "Bearer YOUR_APIFY_TOKEN"
      }
    }
  }
}

Example prompts to use with Claude:

"Use the Cloudflare Workers AI scraper to list all available text generation models and their context lengths."
"Scrape Cloudflare Workers AI models and tell me which embedding models support async execution."
"Run the Cloudflare Workers AI scraper and compare the model types available to what Groq offers."

Legality and terms of service

This actor scrapes publicly accessible data from Cloudflare's developer documentation (developers.cloudflare.com/workers-ai/models/). The page is fully public, requires no authentication, and contains no personal data.

Scraping publicly available developer documentation for informational, research, and competitive intelligence purposes is generally accepted under fair use principles. Always review Cloudflare's Terms of Service before using the data commercially.

The actor does not log in, bypass access controls, or scrape any user-specific or private data.

FAQ

Q: How often should I run this actor to stay up to date? A: Cloudflare adds new models and deprecates old ones every few weeks. A weekly schedule is sufficient for most use cases. For tighter monitoring, run daily.

Q: Why are some fields null for certain models? A: Not every field applies to every model type. contextLength and maxOutputTokens are populated only for models where they apply and are null otherwise; image generation and speech models, for example, have no context length.

Q: The actor returned fewer models than I expected. What happened? A: If you set includeDeprecated: false (the default), deprecated models are excluded. Enable includeDeprecated: true to see the full catalog including deprecated entries.

Q: I'm getting an error or empty results. What should I do? A: Cloudflare occasionally updates their documentation page structure. If the actor returns 0 items or throws an error, please open an issue in the actor's review section. You can also try increasing maxRequestRetries to 5 to handle transient network issues.

Q: Can I filter by multiple task categories? A: Currently the actor filters by a single task category at a time. To get models from multiple categories, run the actor multiple times with different taskCategory values, or omit the filter to get all models and filter the output yourself.

Q: Can I get historical data to track model additions? A: Yes — schedule recurring runs and each run's dataset captures a snapshot. Compare datasets across runs to track when new models are added or deprecated.
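
For the multi-category question above, the "filter the output yourself" route can be a one-liner over an unfiltered run. The sketch below keeps the same case-insensitive exact matching that `taskCategory` is documented to use; `filter_by_categories` is a hypothetical helper and the demo records are made up.

```python
from typing import Any, Dict, Iterable, List


def filter_by_categories(
    items: Iterable[Dict[str, Any]], categories: Iterable[str]
) -> List[Dict[str, Any]]:
    """Keep items whose taskName equals any of `categories`, ignoring case."""
    wanted = {category.casefold() for category in categories}
    return [m for m in items if m.get("taskName", "").casefold() in wanted]


if __name__ == "__main__":
    # Made-up records for illustration only.
    catalog = [
        {"modelId": "@cf/example/llm", "taskName": "Text Generation"},
        {"modelId": "@cf/example/embed", "taskName": "Text Embeddings"},
        {"modelId": "@cf/example/img", "taskName": "Text-to-Image"},
    ]
    picked = filter_by_categories(catalog, ["text generation", "Text Embeddings"])
    print([m["modelId"] for m in picked])
```

A single unfiltered run is billed for every model extracted, so this approach trades a slightly higher per-run cost for fewer runs.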

🔗 Groq Models Scraper — All Groq LLM models with pricing and speed benchmarks
🔗 OpenRouter Models Scraper — All AI models available on OpenRouter with pricing and capability metadata
🔗 DeepInfra Models Scraper — Models and pricing from the DeepInfra platform
🔗 Fireworks AI Scraper — Models and pricing from Fireworks AI
🔗 Replicate Scraper — Models and pricing from the Replicate platform
