AWS Bedrock Models Scraper

Deprecated

Scrapes the full AWS Bedrock model catalog from the official AWS documentation. Returns all foundation models available on Amazon Bedrock with provider name, model name, model card URL, and optionally detailed specs from each model card page.

Pricing

Pay per usage

Developer

Stas Persiianenko

Last modified

6 days ago

What does AWS Bedrock Models Scraper do?

AWS Bedrock Models Scraper extracts the complete catalog of foundation models available on Amazon Bedrock — no AWS account, no API key, and no coding required. Run the actor and get structured data for every model including provider name, model ID, context window size, supported modalities, model lifecycle status, launch date, EOL date, knowledge cutoff, and reasoning support.

The actor fetches AWS's official Bedrock documentation using lightweight HTTP requests with Cheerio-based HTML parsing. No browser automation, no Playwright, no proxies required. Every model is returned as a clean JSON record ready to export to CSV, Google Sheets, or any downstream pipeline.

Use this actor to track all available Bedrock foundation models, monitor when new providers or models are added, compare model capabilities across providers, or automate catalog sync for AI platform selection tooling.

Who is it for?

🤖 AWS developers and cloud architects

Find the exact model ID needed for InvokeModel API calls without manually browsing docs
Verify context window sizes, supported modalities, and API endpoints before integrating
Automate checks for new model releases or lifecycle changes (Active → Deprecated)

📊 ML engineers and AI platform teams

Compare capabilities across all Bedrock providers: Anthropic, Amazon, Meta, Mistral, Cohere, DeepSeek, and more
Track which models support reasoning, streaming, and tool calling
Build model selector tooling that stays current with the live Bedrock catalog

💰 FinOps and cost optimization teams

Identify models with the largest context windows for long-document workflows
Track EOL dates to plan model migrations before deprecation
Compare reasoning-capable models across providers for agentic workload planning

🏢 Enterprise AI strategists

Monitor when Amazon adds new providers or removes old ones from Bedrock
Track model lifecycle status changes across the full catalog
Build governance dashboards showing approved models and their availability status

Why use AWS Bedrock Models Scraper?

No AWS credentials required — scrapes the public documentation page only
Covers all providers — AI21 Labs, Amazon Nova, Anthropic Claude, Cohere, DeepSeek, Google, Meta Llama, MiniMax, Mistral, Moonshot, NVIDIA, OpenAI, Qwen, Stability AI, TwelveLabs, Writer, Z.AI, and more
Deep model details — context window, max output tokens, input/output modalities, model ID, lifecycle status, launch date, EOL date, knowledge cutoff, and reasoning support
Lightweight HTTP scraping — no browser automation, no proxy required
Pay-per-event pricing — pay only for models extracted, not idle compute
Schedule and automate — run weekly to track catalog changes and new model additions
Export anywhere — JSON, CSV, Excel, Google Sheets, or push via API and webhook

What data does it extract?

Field	Type	Description
`provider`	string	Provider name (e.g. `Anthropic`, `Meta`, `Amazon`)
`modelName`	string	Display name (e.g. `Claude Sonnet 4.6`, `Llama 3.3 70B Instruct`)
`modelId`	string	API model ID for `InvokeModel` calls (e.g. `anthropic.claude-sonnet-4-6`)
`contextWindow`	string	Maximum context window size (e.g. `1M tokens`, `128K tokens`)
`maxOutputTokens`	string	Maximum output tokens per request (e.g. `64K`)
`inputModalities`	string[]	Supported input types (e.g. `["Text", "Image"]`)
`outputModalities`	string[]	Supported output types (e.g. `["Text"]`)
`modelLifecycle`	string	Lifecycle status (`Active`, `Legacy`, etc.)
`modelLaunchDate`	string	Date model became available (e.g. `Feb 17, 2026`)
`modelEolDate`	string	End-of-life date or `N/A` if not scheduled
`knowledgeCutoff`	string	Training data cutoff date (e.g. `Aug 2025`)
`reasoningSupported`	boolean	Whether the model supports extended reasoning/thinking
`modelCardUrl`	string	Direct URL to the AWS documentation model card
`scrapedAt`	string	ISO timestamp of when data was collected
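
Because `contextWindow` and `maxOutputTokens` are returned as strings, numeric comparisons need a conversion step. The following is a minimal sketch, not part of the actor: `parse_token_count` is a hypothetical helper, and it assumes decimal multipliers (K = 1,000, M = 1,000,000), which the listing does not confirm.

```python
import re
from typing import Optional

# Assumption: "K" means 1,000 and "M" means 1,000,000 (decimal, not 1024-based).
_MULTIPLIERS = {"": 1, "K": 1_000, "M": 1_000_000}

# Leading number (optionally with thousands separators or a decimal part),
# followed by an optional K/M suffix, e.g. "128K tokens", "1M tokens", "64K".
_TOKEN_RE = re.compile(r"^\s*(\d[\d,]*(?:\.\d+)?)\s*([KkMm]?)")


def parse_token_count(value: Optional[str]) -> Optional[int]:
    """Convert a token-count string to an integer, or None if it cannot be parsed."""
    if not value:
        return None
    match = _TOKEN_RE.match(value)
    if match is None:
        return None
    number = float(match.group(1).replace(",", ""))
    return int(round(number * _MULTIPLIERS[match.group(2).upper()]))
```

With this helper, a filter such as "context window over 100K tokens" becomes `(parse_token_count(m.get("contextWindow")) or 0) > 100_000`.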

How much does it cost to scrape AWS Bedrock models?

🟢 Very cheap. The Bedrock catalog currently lists ~105 models across 16+ providers. At $0.001 per model (BRONZE tier), a full run costs approximately $0.105 in per-model charges plus the $0.005 start fee, or about $0.11 in total.

Run type	Models	Mode	Estimated cost (BRONZE)
Catalog only (no detail pages)	~105 models	`scrapeModelDetails: false`	~$0.11
Full scrape with model details	~105 models	`scrapeModelDetails: true`	~$0.11
Weekly monitoring (52 runs/year)	~105 models	Full	~$5.72/year
Daily monitoring (365 runs/year)	~105 models	Full	~$40/year

Higher Apify subscription tiers receive discounts (SILVER: $0.00078/model, GOLD: $0.0006/model, up to DIAMOND: $0.00028/model).

All Apify platform compute costs are included. No proxy fees since the page is publicly accessible.
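
The arithmetic behind these estimates can be sketched as below. The function names are hypothetical, the defaults are the BRONZE figures quoted above ($0.001 per model, $0.005 start fee), and the sketch assumes the start fee is charged once per run.

```python
def estimate_run_cost(models: int, per_model: float = 0.001, start_fee: float = 0.005) -> float:
    """Cost of one run in USD: one start fee plus a per-model charge."""
    return start_fee + models * per_model


def estimate_yearly_cost(
    models: int,
    runs_per_year: int,
    per_model: float = 0.001,
    start_fee: float = 0.005,
) -> float:
    """Cost of a recurring schedule in USD, assuming every run extracts `models` records."""
    return runs_per_year * estimate_run_cost(models, per_model, start_fee)


# One full run over ~105 models at BRONZE rates comes to about $0.11.
print(f"{estimate_run_cost(105):.2f}")
```

To estimate another tier, pass its per-model rate, for example `estimate_run_cost(105, per_model=0.0006)` for GOLD.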

How to use AWS Bedrock Models Scraper

Step 1: Open the actor on Apify Store.

Step 2: Click Try for free — no configuration needed. By default the actor scrapes all models and their detail pages.

Step 3: Click Start and wait 1–3 minutes (detail page scraping fetches ~70 pages).

Step 4: View results in the Dataset tab. Export to JSON, CSV, or Excel.

Step 5 (optional): Disable Scrape model details for a faster catalog-only run (returns provider/name/URL only without individual model specs).

Step 6 (optional): Schedule recurring runs in the Schedules tab to track catalog changes over time.

Input parameters

Parameter	Type	Default	Description
`scrapeModelDetails`	boolean	`true`	Fetch each model's detail page for full specs (model ID, context window, modalities, lifecycle, etc.). Set to `false` for a fast catalog-only list.
`maxRequestRetries`	integer	`3`	Number of retry attempts for failed HTTP requests

The actor requires no mandatory input. Clicking Start with defaults returns the full model catalog with all available details.

Output example

{
  "provider": "Anthropic",
  "modelName": "Claude Sonnet 4.6",
  "modelCardUrl": "https://docs.aws.amazon.com/bedrock/latest/userguide/model-card-anthropic-claude-sonnet-4-6.html",
  "modelId": "anthropic.claude-sonnet-4-6",
  "contextWindow": "1M tokens",
  "maxOutputTokens": "64K",
  "inputModalities": ["Image", "Text"],
  "outputModalities": ["Text"],
  "modelLifecycle": "Active",
  "modelLaunchDate": "Feb 17, 2026",
  "modelEolDate": "N/A",
  "knowledgeCutoff": "Aug 2025",
  "reasoningSupported": true,
  "scrapedAt": "2026-04-28T08:00:00.000Z"
}

Catalog-only mode example (scrapeModelDetails: false):

{
  "provider": "Meta",
  "modelName": "Llama 3.3 70B Instruct",
  "modelCardUrl": "https://docs.aws.amazon.com/bedrock/latest/userguide/model-card-meta-llama-3-3-70b-instruct.html",
  "modelId": null,
  "contextWindow": null,
  "maxOutputTokens": null,
  "inputModalities": [],
  "outputModalities": [],
  "modelLifecycle": null,
  "modelLaunchDate": null,
  "modelEolDate": null,
  "knowledgeCutoff": null,
  "reasoningSupported": null,
  "scrapedAt": "2026-04-28T08:00:00.000Z"
}
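
Since the same record shape is returned in both modes, with detail fields set to null in catalog-only mode, downstream code may find it convenient to normalize records into a typed structure. This is an illustrative sketch only: the `BedrockModel` class and its `has_details` heuristic are assumptions, not something the actor provides.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class BedrockModel:
    # Catalog fields, present in both modes.
    provider: str
    model_name: str
    model_card_url: str
    scraped_at: str
    # Detail fields, null or empty when scrapeModelDetails is false.
    model_id: Optional[str] = None
    context_window: Optional[str] = None
    max_output_tokens: Optional[str] = None
    input_modalities: List[str] = field(default_factory=list)
    output_modalities: List[str] = field(default_factory=list)
    model_lifecycle: Optional[str] = None
    model_launch_date: Optional[str] = None
    model_eol_date: Optional[str] = None
    knowledge_cutoff: Optional[str] = None
    reasoning_supported: Optional[bool] = None

    @property
    def has_details(self) -> bool:
        """Heuristic: treat a record with a model ID as a detailed record."""
        return self.model_id is not None

    @classmethod
    def from_record(cls, record: Dict[str, Any]) -> "BedrockModel":
        """Build a BedrockModel from one dataset item (camelCase keys)."""
        return cls(
            provider=record["provider"],
            model_name=record["modelName"],
            model_card_url=record["modelCardUrl"],
            scraped_at=record.get("scrapedAt", ""),
            model_id=record.get("modelId"),
            context_window=record.get("contextWindow"),
            max_output_tokens=record.get("maxOutputTokens"),
            input_modalities=list(record.get("inputModalities") or []),
            output_modalities=list(record.get("outputModalities") or []),
            model_lifecycle=record.get("modelLifecycle"),
            model_launch_date=record.get("modelLaunchDate"),
            model_eol_date=record.get("modelEolDate"),
            knowledge_cutoff=record.get("knowledgeCutoff"),
            reasoning_supported=record.get("reasoningSupported"),
        )
```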

Tips and best practices

💡 Use catalog-only mode for fast lookups — Set scrapeModelDetails: false to get the full provider/model name list with URLs in seconds. Ideal for deduplication checks or when you only need to know which models exist.

💡 Schedule for catalog change detection — AWS adds new models and providers to Bedrock frequently. Schedule weekly runs and compare datasets to detect new additions or lifecycle changes automatically.

💡 Filter by reasoningSupported: true for agentic workloads — Use this field to quickly identify models that support extended reasoning/thinking mode for complex multi-step tasks.

💡 Use modelId directly in AWS SDK calls — The extracted modelId (e.g. anthropic.claude-sonnet-4-6) is the value you pass to InvokeModel or Converse API calls without needing to navigate documentation.

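💡 Track modelEolDate for migration planning — Monitor when models enter end-of-life to proactively migrate workloads. "N/A" means no end-of-life date is scheduled yet. A null value means the field was not collected, either because scrapeModelDetails was disabled or because the model card does not list a date.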

💡 Combine with pricing data — AWS Bedrock pricing is on a separate page. Pair this catalog data with pricing lookups to build a full cost/capability comparison across providers.
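
The reasoning filter and the EOL tracking tips above can be sketched as post-processing over the dataset items. The helper names are hypothetical, and the date parsing assumes the `Feb 17, 2026` format shown in the output example and an English locale for month abbreviations. Values in any other format are skipped rather than guessed.

```python
from datetime import date, datetime
from typing import Any, Dict, Iterable, List, Optional

Record = Dict[str, Any]


def parse_doc_date(value: Optional[str]) -> Optional[date]:
    """Parse dates such as 'Feb 17, 2026'; return None for null, 'N/A', or other formats."""
    if not value or value.strip().upper() == "N/A":
        return None
    try:
        return datetime.strptime(value.strip(), "%b %d, %Y").date()
    except ValueError:
        return None


def reasoning_models(items: Iterable[Record]) -> List[Record]:
    """Keep only records explicitly flagged as supporting reasoning."""
    return [m for m in items if m.get("reasoningSupported") is True]


def models_due_for_migration(items: Iterable[Record], today: date, within_days: int = 180) -> List[Record]:
    """Records whose EOL date has passed or falls within `within_days` of `today`."""
    due = []
    for m in items:
        eol = parse_doc_date(m.get("modelEolDate"))
        if eol is not None and (eol - today).days <= within_days:
            due.append(m)
    return due
```

Passing `today` explicitly rather than calling `date.today()` inside the function keeps the check reproducible across scheduled runs.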

Integrations

🔄 Zapier / Make.com — Trigger downstream workflows when new providers or models appear. Connect Bedrock model data to Slack notifications, Airtable records, or Google Sheets automatically.

📊 Google Sheets — Export results directly to a Google Sheet for team-visible model catalogs and capability matrices. Use Apify's native Google Sheets integration or schedule the actor and auto-push results.

🤖 Model selector tooling — Feed model catalog data into your own AI platform selection tools. Use modelId, contextWindow, and inputModalities to power dynamic model pickers in internal apps.

🗄️ Data pipelines — Push results to PostgreSQL, BigQuery, or Snowflake via Apify webhooks for long-term catalog trend analysis and model lifecycle tracking.

📬 Webhook notifications — Set up Apify webhooks to POST results to your endpoint whenever a run completes, enabling real-time catalog sync in your infrastructure.
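
On the receiving side, a webhook endpoint mainly needs to find the dataset ID of the finished run. The sketch below is an assumption-laden illustration: it assumes Apify's default webhook payload, with an `eventType` of `ACTOR.RUN.SUCCEEDED` and a `resource` object carrying `defaultDatasetId`. If you configure a custom payload template, adjust the keys accordingly. `dataset_id_from_webhook` is a hypothetical helper name.

```python
import json
from typing import Optional


def dataset_id_from_webhook(body: bytes) -> Optional[str]:
    """Return the run's dataset ID from a webhook request body, or None if not applicable."""
    try:
        payload = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return None
    if not isinstance(payload, dict):
        return None
    # Assumption: only successful runs should trigger a catalog sync.
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    resource = payload.get("resource")
    if not isinstance(resource, dict):
        return None
    return resource.get("defaultDatasetId")
```

The returned ID can then be used with the dataset items endpoint shown in the API usage section.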

API usage

Node.js (Apify client)

import { ApifyClient } from 'apify-client';

// Replace with your own Apify API token.
const client = new ApifyClient({ token: 'YOUR_APIFY_TOKEN' });

// Start the actor and wait for the run to finish.
const run = await client.actor('automation-lab/aws-bedrock-models-scraper').call({
    scrapeModelDetails: true,
});

// Read the extracted records from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(`Extracted ${items.length} Bedrock models`);
items.forEach(m => {
    console.log(`${m.provider} — ${m.modelName} (${m.modelId}): context ${m.contextWindow}`);
});

Python

from apify_client import ApifyClient

# Replace with your own Apify API token.
client = ApifyClient("YOUR_APIFY_TOKEN")

# Start the actor and wait for the run to finish.
run = client.actor("automation-lab/aws-bedrock-models-scraper").call(run_input={
    "scrapeModelDetails": True
})

# Read the extracted records from the run's default dataset.
items = client.dataset(run["defaultDatasetId"]).list_items().items
print(f"Extracted {len(items)} Bedrock models")
for m in items:
    print(f"{m['provider']} — {m['modelName']} ({m.get('modelId')}): context {m.get('contextWindow')}")

cURL

# Start the actor
curl -X POST \
  "https://api.apify.com/v2/acts/automation-lab~aws-bedrock-models-scraper/runs?token=YOUR_APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"scrapeModelDetails": true}'

# Get results (replace DATASET_ID with the run's defaultDatasetId)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_APIFY_TOKEN"
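
The two cURL calls above start a run and then fetch its dataset separately. As an alternative sketch that uses only the Python standard library, the Apify API's `run-sync-get-dataset-items` endpoint starts the run, waits for it, and returns the items in one request. The function names are hypothetical, and the 330-second client timeout is an assumption chosen to sit above the 1–3 minute run time quoted in this document.

```python
import json
import urllib.request
from typing import Any, Dict, List

API_BASE = "https://api.apify.com/v2"
ACTOR_ID = "automation-lab~aws-bedrock-models-scraper"


def build_sync_request(token: str, run_input: Dict[str, Any]) -> urllib.request.Request:
    """Build the POST request that runs the actor and returns its dataset items."""
    url = f"{API_BASE}/acts/{ACTOR_ID}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Token sent as a header instead of a query parameter.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_and_fetch(token: str, run_input: Dict[str, Any], timeout: float = 330.0) -> List[Dict[str, Any]]:
    """Send the request and decode the JSON array of dataset items."""
    request = build_sync_request(token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())


# Example call (requires a valid token and network access):
# items = run_and_fetch("YOUR_APIFY_TOKEN", {"scrapeModelDetails": False})
```

Separating request construction from sending makes the request easy to inspect or log before any network call is made.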

Use with Claude AI (MCP)

You can use AWS Bedrock Models Scraper directly inside Claude via the Apify MCP server.

Claude Code (terminal)

claude mcp add --transport http apify "https://mcp.apify.com?tools=automation-lab/aws-bedrock-models-scraper"

Claude Desktop / Cursor / VS Code

Add to your MCP config:

{
  "mcpServers": {
    "apify": {
      "type": "http",
      "url": "https://mcp.apify.com?tools=automation-lab/aws-bedrock-models-scraper",
      "headers": {
        "Authorization": "Bearer YOUR_APIFY_TOKEN"
      }
    }
  }
}

Example prompts to use with Claude:

"Use the AWS Bedrock Models Scraper to list all models that support reasoning and have a context window over 100K tokens."
"Scrape the Bedrock catalog and show me all Anthropic Claude models currently in Active lifecycle."
"Run the Bedrock scraper and find all models with image input modality support."

Legality and terms of service

This actor scrapes publicly accessible data from the AWS Bedrock documentation (docs.aws.amazon.com). The page is publicly accessible without authentication and contains no personal data — only product/model specifications.

Scraping publicly available documentation for informational, research, and competitive intelligence purposes is generally accepted under fair use principles. Always review the AWS Terms of Service before using the data commercially.

The actor does not log in, bypass access controls, or scrape any user-specific or private AWS data.

FAQ

Q: How often should I run this actor to stay current? A: AWS adds new models and providers to Bedrock every few weeks. A weekly schedule is sufficient for most use cases. Set up a schedule with email or webhook notifications to catch additions automatically.

Q: Why are some fields null? A: When scrapeModelDetails is disabled, all detail fields are null (only catalog data is returned). With details enabled, a field may be null if AWS's documentation page doesn't include that information for a specific model (common for newer or preview models).

Q: How long does a full scrape take? A: With scrapeModelDetails: true, the actor fetches ~70+ individual model card pages sequentially. Expect 1–3 minutes for a complete run depending on network latency.

Q: Can I scrape just one provider's models? A: Not directly — the actor scrapes the full catalog. You can filter results after the run using the provider field to focus on a specific provider (e.g. filter where provider === "Anthropic").

Q: The actor returned empty results or threw an error. What should I do? A: AWS periodically updates their documentation structure. If the actor returns 0 items or throws an error about "No models found," please open an issue in the actor's review section. You can also try increasing maxRequestRetries to 5 to handle transient network issues.

Q: Can I get a historical view of which models were available on a given date? A: Yes — schedule recurring runs and each run's dataset includes the scrapedAt timestamp. Query across multiple dataset exports to build a timeline of catalog changes.
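
Comparing two runs to build such a timeline can be sketched as a keyed diff. This is a minimal illustration with a hypothetical `diff_catalogs` helper. It keys records on `modelCardUrl` because that field is present in both catalog-only and detailed runs. Note that comparing a catalog-only run against a detailed run would report every lifecycle as changed from null.

```python
from typing import Any, Dict, Iterable, List, Optional, Tuple

Record = Dict[str, Any]
LifecycleChange = Tuple[str, Optional[str], Optional[str]]


def diff_catalogs(old: Iterable[Record], new: Iterable[Record]) -> Dict[str, list]:
    """Report models added, removed, or with a changed lifecycle between two runs."""
    old_by_url = {r["modelCardUrl"]: r for r in old}
    new_by_url = {r["modelCardUrl"]: r for r in new}

    added: List[Record] = [r for url, r in new_by_url.items() if url not in old_by_url]
    removed: List[Record] = [r for url, r in old_by_url.items() if url not in new_by_url]

    lifecycle_changes: List[LifecycleChange] = []
    for url, new_record in new_by_url.items():
        old_record = old_by_url.get(url)
        if old_record is None:
            continue
        before = old_record.get("modelLifecycle")
        after = new_record.get("modelLifecycle")
        if before != after:
            lifecycle_changes.append((url, before, after))

    return {"added": added, "removed": removed, "lifecycle_changes": lifecycle_changes}
```

Running this over consecutive weekly exports yields one change set per week, which can be stored alongside each run's `scrapedAt` value.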

🔗 Groq Models Scraper — All AI models on Groq with pricing and speed benchmarks
🔗 OpenRouter Models Scraper — Full OpenRouter model catalog with pricing and capability metadata
🔗 Mistral Models Scraper — Mistral AI model catalog with pricing and context window data
🔗 Together AI Models Scraper — Together AI model catalog with pricing and availability
🔗 DeepInfra Models Scraper — DeepInfra model catalog with pricing and throughput data
