AWS Bedrock Models Scraper
Status: Deprecated
Scrapes the full AWS Bedrock model catalog from the official AWS documentation. Returns all foundation models available on Amazon Bedrock with provider name, model name, model card URL, and optionally detailed specs from each model card page.
Pricing: Pay per usage
Developer: Stas Persiianenko
Maintained by Community
What does AWS Bedrock Models Scraper do?
AWS Bedrock Models Scraper extracts the complete catalog of foundation models available on Amazon Bedrock — no AWS account, no API key, and no coding required. Run the actor and get structured data for every model including provider name, model ID, context window size, supported modalities, model lifecycle status, launch date, EOL date, knowledge cutoff, and reasoning support.
The actor fetches AWS's official Bedrock documentation using lightweight HTTP requests with Cheerio-based HTML parsing. No browser automation, no Playwright, no proxies required. Every model is returned as a clean JSON record ready to export to CSV, Google Sheets, or any downstream pipeline.
Use this actor to track all available Bedrock foundation models, monitor when new providers or models are added, compare model capabilities across providers, or automate catalog sync for AI platform selection tooling.
Who is it for?
🤖 AWS developers and cloud architects
- Find the exact model ID needed for InvokeModel API calls without manually browsing docs
- Verify context window sizes, supported modalities, and API endpoints before integrating
- Automate checks for new model releases or lifecycle changes (Active → Deprecated)
📊 ML engineers and AI platform teams
- Compare capabilities across all Bedrock providers: Anthropic, Amazon, Meta, Mistral, Cohere, DeepSeek, and more
- Track which models support reasoning, streaming, and tool calling
- Build model selector tooling that stays current with the live Bedrock catalog
💰 FinOps and cost optimization teams
- Identify models with the largest context windows for long-document workflows
- Track EOL dates to plan model migrations before deprecation
- Compare reasoning-capable models across providers for agentic workload planning
🏢 Enterprise AI strategists
- Monitor when Amazon adds new providers or removes old ones from Bedrock
- Track model lifecycle status changes across the full catalog
- Build governance dashboards showing approved models and their availability status
Why use AWS Bedrock Models Scraper?
- No AWS credentials required — scrapes the public documentation page only
- Covers all providers — AI21 Labs, Amazon Nova, Anthropic Claude, Cohere, DeepSeek, Google, Meta Llama, MiniMax, Mistral, Moonshot, NVIDIA, OpenAI, Qwen, Stability AI, TwelveLabs, Writer, Z.AI, and more
- Deep model details — context window, max output tokens, input/output modalities, model ID, lifecycle status, launch date, EOL date, knowledge cutoff, and reasoning support
- Lightweight HTTP scraping — no browser automation, no proxy required
- Pay-per-event pricing — pay only for models extracted, not idle compute
- Schedule and automate — run weekly to track catalog changes and new model additions
- Export anywhere — JSON, CSV, Excel, Google Sheets, or push via API and webhook
What data does it extract?
| Field | Type | Description |
|---|---|---|
| provider | string | Provider name (e.g. Anthropic, Meta, Amazon) |
| modelName | string | Display name (e.g. Claude Sonnet 4.6, Llama 3.3 70B Instruct) |
| modelId | string | API model ID for InvokeModel calls (e.g. anthropic.claude-sonnet-4-6) |
| contextWindow | string | Maximum context window size (e.g. 1M tokens, 128K tokens) |
| maxOutputTokens | string | Maximum output tokens per request (e.g. 64K) |
| inputModalities | string[] | Supported input types (e.g. ["Text", "Image"]) |
| outputModalities | string[] | Supported output types (e.g. ["Text"]) |
| modelLifecycle | string | Lifecycle status (Active, Legacy, etc.) |
| modelLaunchDate | string | Date model became available (e.g. Feb 17, 2026) |
| modelEolDate | string | End-of-life date or N/A if not scheduled |
| knowledgeCutoff | string | Training data cutoff date (e.g. Aug 2025) |
| reasoningSupported | boolean | Whether the model supports extended reasoning/thinking |
| modelCardUrl | string | Direct URL to the AWS documentation model card |
| scrapedAt | string | ISO timestamp of when data was collected |
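Because `contextWindow` and `maxOutputTokens` arrive as display strings rather than numbers, comparing models usually needs a small conversion step. The following is a minimal sketch, not part of the actor: `parse_token_count` is a hypothetical helper, and it assumes that `K` means 1,000 and `M` means 1,000,000, which the source does not state.

```python
import re
from typing import Optional

# Assumption: K = 1,000 and M = 1,000,000 (decimal multipliers).
_UNITS = {"": 1, "K": 1_000, "M": 1_000_000}

# Leading number (optionally with thousands separators or a decimal part),
# followed by an optional K/M suffix. Anything after that is ignored.
_TOKEN_PATTERN = re.compile(r"^\s*(\d[\d,]*(?:\.\d+)?)\s*([KkMm]?)")


def parse_token_count(value: Optional[str]) -> Optional[int]:
    """Convert strings such as '1M tokens', '128K tokens', or '64K' to an int.

    Returns None for null fields or values that do not start with a number.
    """
    if not value:
        return None
    match = _TOKEN_PATTERN.match(value)
    if match is None:
        return None
    number = float(match.group(1).replace(",", ""))
    unit = match.group(2).upper()
    return int(round(number * _UNITS[unit]))
```

With numeric values in hand, records can be sorted or filtered by context size, for example keeping only items where `parse_token_count(item["contextWindow"])` exceeds a chosen threshold.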
How much does it cost to scrape AWS Bedrock models?
🟢 Very cheap. The Bedrock catalog currently lists ~105 models across 16+ providers. At $0.001 per model (BRONZE tier), a full run costs about $0.105 in per-model charges plus the $0.005 start fee, or roughly $0.11 in total.
| Run type | Models | Mode | Estimated cost (BRONZE) |
|---|---|---|---|
| Catalog only (no detail pages) | ~105 models | scrapeModelDetails: false | ~$0.11 |
| Full scrape with model details | ~105 models | scrapeModelDetails: true | ~$0.11 |
| Weekly monitoring (52 runs/year) | ~105 models | Full | ~$5.75/year |
| Daily monitoring (365 runs/year) | ~105 models | Full | ~$40/year |
Higher Apify subscription tiers receive discounts (SILVER: $0.00078/model, GOLD: $0.0006/model, up to DIAMOND: $0.00028/model).
All Apify platform compute costs are included. No proxy fees since the page is publicly accessible.
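The pricing arithmetic above can be sketched as a small calculator. This is an illustration only: `estimate_run_cost` and `estimate_yearly_cost` are hypothetical helpers, the per-model prices are the tier figures quoted in this section, and the sketch assumes the $0.005 start fee is the same on every tier, which the source does not confirm.

```python
# Per-model prices quoted above, in USD.
PRICE_PER_MODEL = {
    "BRONZE": 0.001,
    "SILVER": 0.00078,
    "GOLD": 0.0006,
    "DIAMOND": 0.00028,
}

# Assumption: the start fee is flat across tiers.
START_FEE = 0.005


def estimate_run_cost(model_count: int, tier: str = "BRONZE") -> float:
    """Estimated cost of one run: start fee plus a per-model charge."""
    return round(START_FEE + model_count * PRICE_PER_MODEL[tier], 6)


def estimate_yearly_cost(model_count: int, runs_per_year: int, tier: str = "BRONZE") -> float:
    """Estimated cost of a recurring schedule over one year."""
    return round(runs_per_year * estimate_run_cost(model_count, tier), 6)


if __name__ == "__main__":
    for tier in PRICE_PER_MODEL:
        print(tier, estimate_run_cost(105, tier), estimate_yearly_cost(105, 52, tier))
```

Since the charge is per model extracted, the estimate does not change between catalog-only and full-detail runs, which matches the table above.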
How to use AWS Bedrock Models Scraper
Step 1: Open the actor on Apify Store.
Step 2: Click Try for free — no configuration needed. By default the actor scrapes all models and their detail pages.
Step 3: Click Start and wait 1–3 minutes (detail page scraping fetches ~70 pages).
Step 4: View results in the Dataset tab. Export to JSON, CSV, or Excel.
Step 5 (optional): Disable Scrape model details for a faster catalog-only run (returns provider/name/URL only without individual model specs).
Step 6 (optional): Schedule recurring runs in the Schedules tab to track catalog changes over time.
Input parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| scrapeModelDetails | boolean | true | Fetch each model's detail page for full specs (model ID, context window, modalities, lifecycle, etc.). Set to false for a fast catalog-only list. |
| maxRequestRetries | integer | 3 | Number of retry attempts for failed HTTP requests |
The actor requires no mandatory input. Clicking Start with defaults returns the full model catalog with all available details.
Output example
```json
{
  "provider": "Anthropic",
  "modelName": "Claude Sonnet 4.6",
  "modelCardUrl": "https://docs.aws.amazon.com/bedrock/latest/userguide/model-card-anthropic-claude-sonnet-4-6.html",
  "modelId": "anthropic.claude-sonnet-4-6",
  "contextWindow": "1M tokens",
  "maxOutputTokens": "64K",
  "inputModalities": ["Image", "Text"],
  "outputModalities": ["Text"],
  "modelLifecycle": "Active",
  "modelLaunchDate": "Feb 17, 2026",
  "modelEolDate": "N/A",
  "knowledgeCutoff": "Aug 2025",
  "reasoningSupported": true,
  "scrapedAt": "2026-04-28T08:00:00.000Z"
}
```
Catalog-only mode example (scrapeModelDetails: false):
```json
{
  "provider": "Meta",
  "modelName": "Llama 3.3 70B Instruct",
  "modelCardUrl": "https://docs.aws.amazon.com/bedrock/latest/userguide/model-card-meta-llama-3-3-70b-instruct.html",
  "modelId": null,
  "contextWindow": null,
  "maxOutputTokens": null,
  "inputModalities": [],
  "outputModalities": [],
  "modelLifecycle": null,
  "modelLaunchDate": null,
  "modelEolDate": null,
  "knowledgeCutoff": null,
  "reasoningSupported": null,
  "scrapedAt": "2026-04-28T08:00:00.000Z"
}
```
Tips and best practices
💡 Use catalog-only mode for fast lookups — Set scrapeModelDetails: false to get the full provider/model name list with URLs in seconds. Ideal for deduplication checks or when you only need to know which models exist.
💡 Schedule for catalog change detection — AWS adds new models and providers to Bedrock frequently. Schedule weekly runs and compare datasets to detect new additions or lifecycle changes automatically.
💡 Filter by reasoningSupported: true for agentic workloads — Use this field to quickly identify models that support extended reasoning/thinking mode for complex multi-step tasks.
💡 Use modelId directly in AWS SDK calls — The extracted modelId (e.g. anthropic.claude-sonnet-4-6) is the value you pass to InvokeModel or Converse API calls without needing to navigate documentation.
💡 Track modelEolDate for migration planning — Monitor when models enter end-of-life to proactively migrate workloads. Null or "N/A" means no deprecation is scheduled yet.
💡 Combine with pricing data — AWS Bedrock pricing is on a separate page. Pair this catalog data with pricing lookups to build a full cost/capability comparison across providers.
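The migration-planning tip can be sketched as a short filter over the dataset items. This is a minimal example under stated assumptions: `parse_catalog_date` and `models_reaching_eol` are hypothetical helpers, and the code assumes `modelEolDate` uses the same `Feb 17, 2026` format shown for `modelLaunchDate`, which the source does not confirm. Values that do not parse are treated as "no date known".

```python
from datetime import date, datetime
from typing import Optional


def parse_catalog_date(value: Optional[str]) -> Optional[date]:
    """Parse a date such as 'Feb 17, 2026'; return None for null, 'N/A', or other formats."""
    if not value or value.strip().upper() == "N/A":
        return None
    try:
        # Assumption: English month abbreviations, as in the output example.
        return datetime.strptime(value.strip(), "%b %d, %Y").date()
    except ValueError:
        return None


def models_reaching_eol(items: list, today: date, within_days: int = 180) -> list:
    """Return (days_left, modelName, modelId) tuples for models whose EOL date
    falls within `within_days` of `today` (already-passed dates included),
    sorted by days remaining."""
    upcoming = []
    for item in items:
        eol = parse_catalog_date(item.get("modelEolDate"))
        if eol is None:
            continue
        days_left = (eol - today).days
        if days_left <= within_days:
            upcoming.append((days_left, item.get("modelName"), item.get("modelId")))
    return sorted(upcoming, key=lambda entry: entry[0])
```

Passing `today` explicitly keeps the function deterministic; in a scheduled job it would typically be `date.today()`.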
Integrations
🔄 Zapier / Make.com — Trigger downstream workflows when new providers or models appear. Connect Bedrock model data to Slack notifications, Airtable records, or Google Sheets automatically.
📊 Google Sheets — Export results directly to a Google Sheet for team-visible model catalogs and capability matrices. Use Apify's native Google Sheets integration or schedule the actor and auto-push results.
🤖 Model selector tooling — Feed model catalog data into your own AI platform selection tools. Use modelId, contextWindow, and inputModalities to power dynamic model pickers in internal apps.
🗄️ Data pipelines — Push results to PostgreSQL, BigQuery, or Snowflake via Apify webhooks for long-term catalog trend analysis and model lifecycle tracking.
📬 Webhook notifications — Set up Apify webhooks to POST results to your endpoint whenever a run completes, enabling real-time catalog sync in your infrastructure.
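For spreadsheet or SQL targets, the array fields usually need flattening before loading. The sketch below shows one way to do that with the standard library; `records_to_csv` is a hypothetical helper, and joining modalities with a semicolon is an arbitrary choice made for this example, not something the actor does.

```python
import csv
import io

# Column order follows the field table in this document.
CSV_FIELDS = [
    "provider", "modelName", "modelId", "contextWindow", "maxOutputTokens",
    "inputModalities", "outputModalities", "modelLifecycle", "modelLaunchDate",
    "modelEolDate", "knowledgeCutoff", "reasoningSupported", "modelCardUrl",
    "scrapedAt",
]


def records_to_csv(items: list) -> str:
    """Serialize dataset items to CSV text, one row per model.

    List-valued modality fields are joined with ';'. Null values become
    empty cells.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=CSV_FIELDS)
    writer.writeheader()
    for item in items:
        row = {field: item.get(field) for field in CSV_FIELDS}
        for key in ("inputModalities", "outputModalities"):
            row[key] = ";".join(item.get(key) or [])
        writer.writerow(row)
    return buffer.getvalue()
```

The returned string can be written to a file or handed to a loader for PostgreSQL, BigQuery, or Snowflake; those loading steps are outside this sketch.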
API usage
Node.js (Apify client)
```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_APIFY_TOKEN' });

// Start the actor and wait for the run to finish
const run = await client.actor('automation-lab/aws-bedrock-models-scraper').call({
    scrapeModelDetails: true,
});

// Read the results from the run's default dataset
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(`Extracted ${items.length} Bedrock models`);
items.forEach(m => {
    console.log(`${m.provider} — ${m.modelName} (${m.modelId}): context ${m.contextWindow}`);
});
```
Python
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")

# Start the actor and wait for the run to finish
run = client.actor("automation-lab/aws-bedrock-models-scraper").call(
    run_input={"scrapeModelDetails": True}
)

# Read the results from the run's default dataset
items = client.dataset(run["defaultDatasetId"]).list_items().items
print(f"Extracted {len(items)} Bedrock models")
for m in items:
    print(f"{m['provider']} — {m['modelName']} ({m.get('modelId')}): context {m.get('contextWindow')}")
```
cURL
```bash
# Start the actor
curl -X POST \
  "https://api.apify.com/v2/acts/automation-lab~aws-bedrock-models-scraper/runs?token=YOUR_APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"scrapeModelDetails": true}'

# Get results (replace DATASET_ID with the run's defaultDatasetId)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_APIFY_TOKEN"
```
Use with Claude AI (MCP)
You can use AWS Bedrock Models Scraper directly inside Claude via the Apify MCP server.
Claude Code (terminal)
```bash
claude mcp add --transport http apify "https://mcp.apify.com?tools=automation-lab/aws-bedrock-models-scraper"
```
Claude Desktop / Cursor / VS Code
Add to your MCP config:
```json
{
  "mcpServers": {
    "apify": {
      "type": "http",
      "url": "https://mcp.apify.com?tools=automation-lab/aws-bedrock-models-scraper",
      "headers": {
        "Authorization": "Bearer YOUR_APIFY_TOKEN"
      }
    }
  }
}
```
Example prompts to use with Claude:
- "Use the AWS Bedrock Models Scraper to list all models that support reasoning and have a context window over 100K tokens."
- "Scrape the Bedrock catalog and show me all Anthropic Claude models currently in Active lifecycle."
- "Run the Bedrock scraper and find all models with image input modality support."
Legality and terms of service
This actor scrapes publicly accessible data from the AWS Bedrock documentation (docs.aws.amazon.com). The page is publicly accessible without authentication and contains no personal data — only product/model specifications.
Scraping publicly available documentation for informational, research, and competitive intelligence purposes is generally accepted under fair use principles. Always review AWS Terms of Service before using the data commercially.
The actor does not log in, bypass access controls, or scrape any user-specific or private AWS data.
FAQ
Q: How often should I run this actor to stay current?
A: AWS adds new models and providers to Bedrock every few weeks. A weekly schedule is sufficient for most use cases. Set up a schedule with email or webhook notifications to catch additions automatically.
Q: Why are some fields null?
A: When scrapeModelDetails is disabled, all detail fields are null (only catalog data is returned). With details enabled, a field may be null if AWS's documentation page doesn't include that information for a specific model (common for newer or preview models).
Q: How long does a full scrape take?
A: With scrapeModelDetails: true, the actor fetches roughly 70 individual model card pages sequentially. Expect 1–3 minutes for a complete run depending on network latency.
Q: Can I scrape just one provider's models?
A: Not directly — the actor scrapes the full catalog. You can filter results after the run using the provider field to focus on a specific provider (e.g. filter where provider === "Anthropic").
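Post-run filtering by provider can be sketched in a few lines. The helpers below are hypothetical and operate on the list of dataset items; matching is done case-insensitively as a convenience, which is a choice made for this example.

```python
from collections import defaultdict


def filter_by_provider(items: list, provider: str) -> list:
    """Keep only records whose provider matches, ignoring case and outer whitespace."""
    wanted = provider.strip().casefold()
    return [
        item for item in items
        if (item.get("provider") or "").strip().casefold() == wanted
    ]


def group_by_provider(items: list) -> dict:
    """Map each provider name to the list of its records."""
    grouped = defaultdict(list)
    for item in items:
        # Records without a provider are collected under "Unknown".
        grouped[item.get("provider") or "Unknown"].append(item)
    return dict(grouped)
```

For example, `filter_by_provider(items, "Anthropic")` returns only the Anthropic records, and `group_by_provider(items)` gives a per-provider breakdown for a capability matrix.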
Q: The actor returned empty results or threw an error. What should I do?
A: AWS periodically updates their documentation structure. If the actor returns 0 items or throws an error about "No models found," please open an issue in the actor's review section. You can also try increasing maxRequestRetries to 5 to handle transient network issues.
Q: Can I get a historical view of which models were available on a given date?
A: Yes — schedule recurring runs and each run's dataset includes the scrapedAt timestamp. Query across multiple dataset exports to build a timeline of catalog changes.
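Building a timeline amounts to comparing consecutive snapshots. The sketch below is one possible approach, not something the actor provides: `diff_snapshots` is a hypothetical helper, and it assumes `modelCardUrl` is a stable unique key per model (it is the one identifier present in both catalog-only and full-detail records).

```python
def diff_snapshots(previous: list, current: list, key: str = "modelCardUrl") -> dict:
    """Compare two dataset exports and report additions, removals,
    and lifecycle status changes.

    Assumption: `key` uniquely identifies a model across runs.
    """
    prev = {item[key]: item for item in previous}
    curr = {item[key]: item for item in current}

    added = [curr[k]["modelName"] for k in curr if k not in prev]
    removed = [prev[k]["modelName"] for k in prev if k not in curr]
    lifecycle_changes = [
        (curr[k]["modelName"], prev[k].get("modelLifecycle"), curr[k].get("modelLifecycle"))
        for k in curr
        if k in prev and prev[k].get("modelLifecycle") != curr[k].get("modelLifecycle")
    ]
    return {
        "added": added,
        "removed": removed,
        "lifecycle_changes": lifecycle_changes,
    }
```

Running this over each pair of consecutive weekly exports, labelled with their `scrapedAt` timestamps, yields a change log of the catalog. Note that comparing a catalog-only snapshot against a full-detail one would report spurious lifecycle changes, because the catalog-only fields are null.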
Related scrapers
- 🔗 Groq Models Scraper — All AI models on Groq with pricing and speed benchmarks
- 🔗 OpenRouter Models Scraper — Full OpenRouter model catalog with pricing and capability metadata
- 🔗 Mistral Models Scraper — Mistral AI model catalog with pricing and context window data
- 🔗 Together AI Models Scraper — Together AI model catalog with pricing and availability
- 🔗 DeepInfra Models Scraper — DeepInfra model catalog with pricing and throughput data