NVIDIA NGC Model Catalog Scraper
Pricing
Pay per event
Scrape 900+ GPU-optimized AI/ML models from the NVIDIA NGC catalog. Filter by keyword, application category, or framework. Returns model name, publisher, framework, precision, version, size, labels, and catalog URL.
Developer: Stas Persiianenko
What does it do?
The NVIDIA NGC Model Catalog Scraper extracts structured data from NVIDIA's NGC catalog — the official repository of GPU-optimized AI/ML models published by NVIDIA and its partners. With 900+ pre-trained models across every major AI domain, the NGC catalog is the definitive source for production-ready deep learning models optimized for NVIDIA hardware.
This actor fetches every model's key metadata: name, publisher, application category, ML framework, precision type, model format, version, file size, labels, description, and catalog URL — all delivered as clean structured JSON, ready to integrate with your pipelines.
No API key or authentication required. The actor calls NVIDIA's public REST API directly.
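To make the "public REST API" point concrete, here is a minimal sketch of how paginated requests to the list endpoint could be constructed. The endpoint path is the one named in the Legality section below; the query-parameter names (`page-number`, `page-size`) are assumptions for illustration, not a confirmed API contract.

```python
import urllib.parse

# Endpoint mentioned in this document's Legality section.
BASE_URL = "https://api.ngc.nvidia.com/v2/models"

def build_page_url(page: int, page_size: int = 25) -> str:
    """Build a paginated request URL for the (assumed) list endpoint."""
    params = urllib.parse.urlencode({"page-number": page, "page-size": page_size})
    return f"{BASE_URL}?{params}"

# Example: URL for the second page of results
second_page = build_page_url(2)
```

No authentication header is attached anywhere, which is the point: the catalog list is readable as a guest.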
Who is it for?
🔬 AI/ML Researchers who need to audit the NGC catalog for models in their domain (NLP, computer vision, speech, healthcare) or track when new models are published.
🏗️ MLOps Engineers who want to automate model discovery, maintain an internal registry of available NVIDIA models, or set up scheduled monitoring for new additions to the catalog.
📊 Data Scientists building model comparison dashboards, benchmarking frameworks, or exploring what pre-trained models are available for their use case before committing to training from scratch.
🧑💼 Product Managers & Technical Writers at AI companies who need up-to-date competitor model intelligence or want to document which NVIDIA models are available for their product.
🤖 AI Automation Engineers who want to feed the NGC catalog into AI agents, RAG pipelines, or knowledge bases that need to reason about available GPU-optimized models.
Why use it?
The NVIDIA NGC catalog doesn't offer an export feature. You can browse models in the web UI one by one, but there's no CSV download, no bulk API explorer, and no way to filter the full catalog programmatically without writing your own API client.
This actor paginates through the full catalog (38+ pages), applies client-side filtering by keyword, category, and framework, and normalizes the raw API response into clean, flat JSON suitable for spreadsheets, databases, or downstream AI pipelines, all in under a minute.
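The normalization step can be pictured as a small nested-to-flat transform. The raw field names used here (a nested `latestVersion` object with `id` and `sizeBytes`) are assumptions about the API response shape, shown only to illustrate the kind of flattening the actor performs:

```python
def normalize(raw: dict) -> dict:
    """Flatten one (hypothetical) raw API record into the actor's output shape."""
    version = raw.get("latestVersion") or {}  # assumed nested object
    size_bytes = version.get("sizeBytes", 0)
    return {
        "name": raw.get("name", ""),
        "displayName": raw.get("displayName", ""),
        "latestVersion": version.get("id", ""),
        "latestVersionSizeBytes": size_bytes,
        # Derived convenience field: bytes -> mebibytes, rounded to 2 decimals
        "latestVersionSizeMb": round(size_bytes / 1024 / 1024, 2),
        "labels": raw.get("labels", []),
    }
```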
What data does it extract?
| Field | Description | Example |
|---|---|---|
| `name` | Model slug identifier | `bertlargeuncased` |
| `displayName` | Human-readable model name | BERT Large Uncased |
| `publisher` | Publisher organization | NVIDIA, Meta, MONAI |
| `orgName` | NGC organization name | `nvidia` |
| `teamName` | Team within the org | `nemo`, `riva`, `tao` |
| `application` | Application category | Speech To Text, Classification |
| `framework` | ML framework | PyTorch with NeMo, TensorRT |
| `precision` | Model precision | FP32, FP16, AMP, OTHER |
| `modelFormat` | Model format | SavedModel, TLT, RIVA, Bundle |
| `latestVersion` | Latest version string | `1.0.0`, `deployable_v2.0` |
| `latestVersionSizeBytes` | Model file size in bytes | 1248444838 |
| `latestVersionSizeMb` | Model file size in MB | 1190.61 |
| `labels` | Tags and keywords | `["NLP", "BERT", "PyTorch"]` |
| `shortDescription` | Brief model description | BERT Large Uncased trained on... |
| `isPublic` | Whether the model is public | `true` |
| `canGuestDownload` | Whether guests can download | `true` |
| `logoUrl` | Logo image URL | https://... |
| `builtBy` | Who built the model | aiapps, NVIDIA |
| `catalogUrl` | Direct link to the model page | https://catalog.ngc.nvidia.com/... |
| `createdDate` | Model creation date (ISO 8601) | 2021-03-10T03:31:51.797Z |
| `updatedDate` | Last update date (ISO 8601) | 2024-11-12T17:56:32.338Z |
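The `createdDate` and `updatedDate` fields are ISO 8601 timestamps with a trailing `Z` (UTC), so filtering by recency in post-processing is straightforward. A small sketch (the `.replace("Z", ...)` step keeps it compatible with Python versions before 3.11, where `fromisoformat` does not accept `Z`):

```python
from datetime import datetime, timezone

def parse_ngc_date(value: str) -> datetime:
    """Parse the catalog's ISO 8601 timestamps (trailing 'Z' = UTC)."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

# Example: was this record updated after Jan 1, 2024?
updated = parse_ngc_date("2024-11-12T17:56:32.338Z")
cutoff = datetime(2024, 1, 1, tzinfo=timezone.utc)
is_recent = updated > cutoff
```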
How much does it cost to scrape the NVIDIA NGC catalog?
The actor uses pay-per-event pricing — you only pay for the models you actually extract. There's a small one-time start fee per run, plus a per-model charge.
Typical costs:
- 20 models (single keyword search): ~$0.025
- 100 models (one category): ~$0.11
- Full catalog (~926 models): ~$0.94
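As a back-of-envelope estimator, the figures above are consistent with a flat start fee plus a per-model charge. The rates below are rough numbers inferred from those examples, not published prices; check the actor's pricing tab for current rates.

```python
# ASSUMED rates, reverse-engineered from the example costs above.
START_FEE_USD = 0.005   # assumed one-time fee per run
PER_MODEL_USD = 0.001   # assumed charge per extracted model

def estimate_cost(model_count: int) -> float:
    """Rough cost estimate for a single run extracting `model_count` models."""
    return round(START_FEE_USD + model_count * PER_MODEL_USD, 3)

# estimate_cost(20)  ~ $0.025, matching the single-keyword example above
# estimate_cost(926) ~ $0.93, close to the ~$0.94 full-catalog example
```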
All models are retrieved via NVIDIA's public REST API — no browser, no proxy required. Runs complete in seconds to a few minutes depending on result count.
Free plan estimate
New Apify accounts include free monthly compute credits. At typical pricing, you can scrape hundreds of NGC models per month within the free tier.
How to use this actor
Step 1: Configure your search
Open the actor and fill in the Search keyword field (optional). For example, type bert to find all BERT-related models, or leave it blank to retrieve the full catalog.
Step 2: Apply category or framework filters (optional)
- Application category: filter to a specific domain like `Speech To Text`, `Classification`, `Object Detection`, or `Healthcare`.
- ML Framework: filter to a specific framework like `PyTorch`, `NeMo`, `TensorRT`, `MONAI`, or `TAO Toolkit`.
Both filters are case-insensitive substring matches.
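"Case-insensitive substring match" means `speech` matches `Speech To Text` and `nemo` matches `PyTorch with NeMo`. A minimal sketch of that matching rule (an empty filter matches everything):

```python
def matches(record: dict, application: str = "", framework: str = "") -> bool:
    """Case-insensitive substring match on the two filterable fields."""
    app_ok = application.lower() in record.get("application", "").lower()
    fw_ok = framework.lower() in record.get("framework", "").lower()
    return app_ok and fw_ok
```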
Step 3: Set your result limit
Set Max results to the number of models you want. Use a large number (e.g. 10000) to retrieve all matching models without a cap.
Step 4: Run and download
Click Start and wait for the run to complete (usually under 60 seconds). Download results as JSON, CSV, or Excel from the Dataset tab.
Input parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `searchQuery` | String | `""` | Filter by keyword (searches name, display name, description) |
| `application` | String | `""` | Filter by application category (e.g. Classification, Speech To Text) |
| `framework` | String | `""` | Filter by ML framework (e.g. PyTorch, NeMo, TensorRT) |
| `maxResults` | Integer | `100` | Maximum number of models to return |
| `maxRequestRetries` | Integer | `3` | Retry attempts for failed API requests |
Output example
```json
{
  "name": "bertlargeuncased",
  "displayName": "Bertlargeuncased",
  "publisher": "NVIDIA",
  "orgName": "nvidia",
  "teamName": "nemo",
  "application": "OTHER",
  "framework": "PyTorch with NeMo",
  "precision": "AMP",
  "modelFormat": "SavedModel",
  "latestVersion": "1.0.0rc1",
  "latestVersionSizeBytes": 1248444838,
  "latestVersionSizeMb": 1190.61,
  "labels": ["NLP", "Natural Language Processing", "BERT", "Bertlargeuncased"],
  "shortDescription": "BERT Large Uncased trained on English Wikipedia and BookCorpus",
  "isPublic": true,
  "canGuestDownload": true,
  "logoUrl": "https://assets.nvidiagrid.net/ngc/logos/Nemo.png",
  "builtBy": "",
  "catalogUrl": "https://catalog.ngc.nvidia.com/orgs/nvidia/models/bertlargeuncased",
  "createdDate": "2021-03-10T03:31:51.797Z",
  "updatedDate": "2023-04-04T19:23:11.786Z"
}
```
Tips & tricks
🔍 Combine filters for precision: Use searchQuery: "conformer" + framework: "NeMo" + application: "Speech" to narrow down to exactly the models you need.
📅 Monitor for new models: Schedule this actor to run weekly and compare the output against your previous snapshot. New models show up with a recent createdDate.
📊 Size-aware budgeting: Use latestVersionSizeMb to estimate download storage requirements before pulling models. A typical PyTorch model ranges from 50 MB to 10+ GB.
🏷️ Use labels for discovery: The labels field contains NVIDIA's own taxonomy. Search for "NSPECT" IDs to find models that have been inspected by NVIDIA's security team.
⚡ Fast runs with filters: Using keyword or category filters reduces both run time and cost since the actor stops paginating once it hits your maxResults limit.
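The "monitor for new models" tip above boils down to a snapshot diff keyed on the stable `name` slug. A minimal sketch:

```python
def new_models(previous: list, current: list) -> list:
    """Return records present in the current snapshot but not the previous one,
    keyed on the stable `name` slug."""
    seen = {m["name"] for m in previous}
    return [m for m in current if m["name"] not in seen]
```

Store last week's dataset as JSON, load both snapshots, and alert on a non-empty result.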
Integrations
Export to Google Sheets for team collaboration
Run the actor → click Export to Google Sheets in the dataset view → share the sheet with your team. Ideal for ML teams maintaining a shared model registry.
Scheduled model monitoring with webhooks
Set up a weekly schedule → configure a webhook to POST results to Slack or email when the run completes. Your team gets notified when new NVIDIA models are available.
Feed into a RAG knowledge base
Use the Apify API to retrieve the dataset JSON → chunk model descriptions → embed with OpenAI → store in Pinecone or Weaviate. Your AI assistant can now answer "which NVIDIA NeMo models support speech synthesis in French?"
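The chunking step in that pipeline can be as simple as rendering each record into one embeddable text passage. A sketch using the output fields documented above (the exact chunk format is a design choice, not part of the actor):

```python
def to_chunk(model: dict) -> str:
    """Turn one catalog record into a plain-text chunk for embedding."""
    labels = ", ".join(model.get("labels", []))
    return (
        f"{model.get('displayName', '')} ({model.get('framework', '')}): "
        f"{model.get('shortDescription', '')} Labels: {labels}. "
        f"URL: {model.get('catalogUrl', '')}"
    )
```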
CI/CD model validation pipeline
Integrate with GitHub Actions: run the actor before deployment → verify your selected model ID still exists in the catalog → fail the pipeline if the model was deprecated.
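The validation step itself is a one-line membership check over the scraped dataset; wire its result into your pipeline's exit code. A sketch:

```python
import sys

def model_available(dataset_items: list, model_name: str) -> bool:
    """Fail-fast check: does the scraped catalog still contain our model?"""
    return any(item.get("name") == model_name for item in dataset_items)

def gate(dataset_items: list, model_name: str) -> int:
    """Return a CI exit code: 0 if the model is still listed, 1 otherwise."""
    return 0 if model_available(dataset_items, model_name) else 1
```

In GitHub Actions, call `sys.exit(gate(items, "your-model-slug"))` after fetching the dataset so a deprecated model fails the job.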
API usage
Node.js
```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('automation-lab/nvidia-ngc-scraper').call({
    searchQuery: 'bert',
    framework: 'PyTorch',
    maxResults: 50,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(`Found ${items.length} NVIDIA NGC models`);
```
Python
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("automation-lab/nvidia-ngc-scraper").call(run_input={
    "searchQuery": "bert",
    "framework": "PyTorch",
    "maxResults": 50,
})

items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
print(f"Found {len(items)} NVIDIA NGC models")
```
cURL
```shell
curl -X POST \
  "https://api.apify.com/v2/acts/automation-lab~nvidia-ngc-scraper/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"searchQuery": "bert", "framework": "PyTorch", "maxResults": 50}'
```
Using with AI assistants (MCP)
You can connect this actor to Claude, Cursor, VS Code, and other AI tools via the Apify MCP server. This lets your AI assistant query the NVIDIA NGC catalog on your behalf.
Claude Code (CLI)
```shell
claude mcp add --transport http apify "https://mcp.apify.com?tools=automation-lab/nvidia-ngc-scraper"
```
Claude Desktop / Cursor / VS Code
Add to your MCP config file (claude_desktop_config.json or equivalent):
```json
{
  "mcpServers": {
    "apify": {
      "type": "http",
      "url": "https://mcp.apify.com?tools=automation-lab/nvidia-ngc-scraper",
      "headers": {
        "Authorization": "Bearer YOUR_API_TOKEN"
      }
    }
  }
}
```
Example prompts for your AI assistant
- "Find all NVIDIA NGC models that use the NeMo framework for speech recognition"
- "List all classification models in the NGC catalog updated after 2024"
- "What NVIDIA models are available for object detection with FP16 precision?"
- "Show me the 10 largest NGC models by file size"
Legality
This actor accesses NVIDIA's publicly available NGC catalog API (api.ngc.nvidia.com/v2/models). All data extracted is publicly accessible without authentication. Use of the NGC catalog data is subject to NVIDIA's Terms of Service. This actor is not affiliated with or endorsed by NVIDIA Corporation.
Always ensure your use of the extracted data complies with applicable terms of service and data usage policies.
FAQ
Q: Does this actor require an NVIDIA API key?
A: No. The NGC model catalog's list endpoint is publicly accessible without any authentication. The actor fetches data using NVIDIA's public REST API.
Q: How many models are available in the NGC catalog?
A: At the time of writing, there are 926+ models. The catalog grows regularly as NVIDIA and partners publish new models. The actor fetches a live count from the API and paginates through all results.
Q: Can I filter by publisher (e.g., only Meta or MONAI models)?
A: Currently, filtering is available by search keyword, application category, and ML framework. Publisher filtering can be applied by using a keyword that matches the publisher name (e.g., searchQuery: "meta" will find models published by Meta).
Q: The actor returned fewer results than expected. Why?
A: If you applied filters, the result count reflects how many models matched your filters — not the total catalog size. Try broadening your filters or removing them to retrieve more results. Also check that maxResults is set high enough.
Q: I'm getting errors on some pages. What should I do?
A: The actor automatically retries failed requests (default: 3 retries with backoff). If errors persist, try increasing maxRequestRetries to 5. Transient errors from the NVIDIA API are usually self-resolving within seconds.
Related scrapers
Explore more AI/ML data scrapers from automation-lab:
- Hugging Face Model Scraper — scrape models, datasets, and spaces from Hugging Face Hub
- arXiv Paper Scraper — extract research papers and abstracts from arXiv