# Caterpillar Catalog Scraper — CAT Equipment Specs (`rastriq/caterpillar-equipment-catalog-scraper`) Actor

Scrape the official Caterpillar equipment catalog. Extract model details, specifications, operating weight, engine data, dimensions, and performance specs for CAT excavators, dozers, loaders, and more. Build a comprehensive CAT equipment database.

- **URL**: https://apify.com/rastriq/caterpillar-equipment-catalog-scraper.md
- **Developed by:** [Rastriq — Structured data from the world](https://apify.com/rastriq) (community)
- **Categories:** Other, Developer tools
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

from $2.50 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, built for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server that can be used as a website, an API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# MacOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

### What data can you scrape from Caterpillar — CAT Equipment Specs?

- **Model identification** — official name, category, subcategory, product ID, source URL, product image
- **Full spec sheet** — engine power, operating weight, hydraulic flow, bucket capacity, dimensions, ground pressure, travel speed, service capacities, and every other OEM specification
- **Dual units** — values include both imperial and metric where CAT provides them (e.g., `270 hp 201 kW`)
- **Configuration variants** — when a model has GC, XE, or other variants, each gets its own spec set
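
Because dual-unit values arrive as a single string, downstream code usually needs to split them. The following is a minimal sketch, assuming values follow the `<number> <unit> <number> <unit>` pattern seen in `270 hp 201 kW`. The helper name `split_dual_units` is hypothetical and not part of the Actor; values that contain no numbers simply yield an empty list.

```python
import re

# Matches "<number> <unit>" pairs such as "270 hp" or "201 kW".
# Assumption: numbers may use thousands separators or a decimal point,
# and a unit is a whitespace-free token that does not start with a digit.
_PAIR = re.compile(r"(-?\d[\d,]*(?:\.\d+)?)\s*([^\d\s][^\s]*)")


def split_dual_units(spec_value: str) -> list[tuple[float, str]]:
    """Split a spec value like '270 hp 201 kW' into (number, unit) pairs."""
    pairs = []
    for number, unit in _PAIR.findall(spec_value):
        # Drop thousands separators before converting to float.
        pairs.append((float(number.replace(",", "")), unit))
    return pairs


print(split_dual_units("270 hp 201 kW"))
```

The function makes no claim about which pair is imperial and which is metric; it only preserves the order in which the values appear in the string.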

---

### Output: Caterpillar — CAT Equipment Specs data structure

The Actor returns one structured record per specification value: each record pairs a model (and its configuration variant, where one exists) with a single spec section, spec name, and spec value.

#### Example output (one record)

```json
{
  "modelName": "336 Excavator",
  "category": "excavators",
  "subcategory": "medium-excavators",
  "specSection": "Engine",
  "specName": "Net Power",
  "specValue": "270 hp 201 kW",
  "configName": "336 GC",
  "productId": "119820",
  "url": "https://www.cat.com/en_US/products/new/equipment/excavators/medium-excavators/119820.html",
  "description": "The Cat 336 delivers best-in-class...",
  "imageUrl": "https://www.cat.com/content/dam/cat/..."
}
````
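
Since each record holds a single spec, a common first step is to regroup the flat rows per model. The sketch below nests rows as model, then configuration, then section, using only the field names shown in the example record. The function name `group_specs` and the `"default"` fallback for a missing `configName` are assumptions made for illustration.

```python
def group_specs(rows):
    """Nest flat spec rows as model -> config -> section -> {spec name: value}."""
    models = {}
    for row in rows:
        # Assumption: configName may be absent or empty for models without variants.
        config = row.get("configName") or "default"
        section = (
            models.setdefault(row["modelName"], {})
            .setdefault(config, {})
            .setdefault(row["specSection"], {})
        )
        section[row["specName"]] = row["specValue"]
    return models


rows = [
    {
        "modelName": "336 Excavator",
        "configName": "336 GC",
        "specSection": "Engine",
        "specName": "Net Power",
        "specValue": "270 hp 201 kW",
    }
]
print(group_specs(rows))
```

Plain dictionaries keep the result JSON-serializable, which is convenient if the grouped data is written back to a file or database.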

***

### 🚀 Quick start

1. Click **Start** with the default input to test with a small sample.
2. Open the **Output** tab to preview results.
3. Export as CSV / Excel / JSON, or connect via API.

***

### How to scrape Caterpillar — CAT Equipment Specs — input options

| Field | Description | Default |
|-------|-------------|---------|
| **Equipment category** | Choose a specific equipment type, or leave on 'All' to get the full catalog. | — |
| **Max models** | Limit the number of equipment models returned. Set to 0 for all available (~332 for English catalog). 💡 Use 5–10 for y... | `0` |
| **Catalog language** | Language and region of the CAT catalog. Different locales may have different models available. | `en_US` |
| **Force fresh scrape** | By default, results come from our **pre-built database** (updated monthly). Enable this only if you need the absolute... | `false` |
| **Parallel workers** | Concurrent HTTP requests when scraping fresh data. Higher = faster but may trigger rate limiting. | `10` |

***

### 🔍 How it works

This Actor uses **HTTP requests, the BeautifulSoup HTML parser, schema.org JSON-LD extraction, and sitemap-based discovery** to extract data from the Caterpillar catalog. It navigates search results or catalog pages, extracts structured data from each listing, and normalizes the output into a consistent schema.
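
To illustrate what schema.org JSON-LD extraction involves, here is a minimal sketch that collects every `<script type="application/ld+json">` block from an HTML page. It is not the Actor's code: it swaps BeautifulSoup for Python's standard-library `html.parser` to stay dependency-free, and the class name, function name, and sample markup are all invented for the example.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.documents.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed blocks instead of failing the whole page.


def extract_json_ld(html: str) -> list:
    """Return all JSON-LD documents embedded in the given HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.documents


# Hypothetical page fragment, not real cat.com markup.
sample = (
    '<html><head><script type="application/ld+json">'
    '{"@type": "Product", "name": "336 Excavator"}'
    "</script></head></html>"
)
print(extract_json_ld(sample))
```

Reading embedded JSON-LD tends to be more stable than scraping visible markup, because the structured block is less likely to change with layout updates.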

***

### How much does it cost to scrape Caterpillar — CAT Equipment Specs?

This Actor uses **Pay-Per-Event** pricing — you pay only for results delivered, not for compute time.

| Plan | What you get |
|------|-------------|
| **Free tier** | $5/month of platform credits — enough for thousands of results |
| **Paid plans** | Scale to tens of thousands of results per run |
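
For rough budgeting, the listed starting price of $2.50 per 1,000 results can be turned into a quick estimate. This is a sketch under the assumption of a flat per-result rate; the page quotes the price as "from", so actual charges may differ, and the function name `estimate_cost_usd` is hypothetical.

```python
def estimate_cost_usd(results: int, price_per_1000: float = 2.50) -> float:
    """Estimate the cost of a run, assuming a flat price per 1,000 results."""
    if results < 0:
        raise ValueError("results must be non-negative")
    return results * price_per_1000 / 1000


# Example: how many results fit into $5 of credits at the listed rate?
print(estimate_cost_usd(2000))
```

Keep in mind that each dataset record is a single spec row, so one equipment model can account for many results.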

***

### 🔄 Integrations & scheduling

- **Schedule** daily/weekly runs from the Apify Console for automated data collection.
- Push results to **Google Sheets, Slack, Zapier, Make, webhooks** or any database.
- Fetch datasets via the **Apify REST API** or the official JavaScript/Python clients.
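
When fetching a dataset over the REST API, the export format is selected with the `format` query parameter of the dataset items endpoint. The sketch below only builds the URL and performs no network call; the helper name `dataset_items_url` and the dataset ID `abc123` are placeholders, and the format list is limited to the formats this page mentions.

```python
from typing import Optional
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"
# Formats mentioned on this page; the API may accept others as well.
FORMATS = {"json", "csv", "xlsx", "xml", "html"}


def dataset_items_url(dataset_id: str, fmt: str = "json", limit: Optional[int] = None) -> str:
    """Build the URL for downloading dataset items in the requested format."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    params = {"format": fmt}
    if limit is not None:
        params["limit"] = limit
    return f"{API_BASE}/datasets/{dataset_id}/items?{urlencode(params)}"


print(dataset_items_url("abc123", "csv", limit=100))
```

The API token is deliberately left out of the URL here; it can be supplied separately, for example in an `Authorization: Bearer` header.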

***

### Is it legal to scrape Caterpillar — CAT Equipment Specs?

This Actor collects only **publicly available** data. It does not log in, bypass paywalls, or access private information. You are responsible for using the extracted data in compliance with the site's Terms of Service and applicable data protection laws (including GDPR where relevant).

***

### ❓ FAQ

**Can I access the data via API?**
Yes. Every run stores its dataset on Apify. Fetch it via REST API or use the official JavaScript/Python clients.

**What export formats are supported?**
JSON, CSV, Excel (XLSX), XML, and HTML table. You can also push data directly to Google Sheets or any webhook endpoint.

**Do I need proxies?**
Residential proxies are recommended for best results. The default proxy configuration is pre-set.

### Related Actors from Rastriq

- [Mascus Scraper](https://apify.com/rastriq/mascus-scraper)
- [Machinerytrader Scraper](https://apify.com/rastriq/machinerytrader-discovery)
- [Machineryzone Scraper](https://apify.com/rastriq/machineryzone-scraper)
- [Machineseeker Scraper](https://apify.com/rastriq/machineseeker-scraper)

# Actor input Schema

## `filterCategory` (type: `string`):

Choose a specific equipment type, or leave on 'All' to get the full catalog.

## `maxProducts` (type: `integer`):

Limit the number of equipment models returned. Set to 0 for all available (~332 for English catalog). 💡 Use 5–10 for your first test run.

## `locale` (type: `string`):

Language and region of the CAT catalog. Different locales may have different models available.

## `forceRefresh` (type: `boolean`):

By default, results come from our **pre-built database** (updated monthly). Enable this only if you need the absolute latest data; it will take longer.

## `maxWorkers` (type: `integer`):

Concurrent HTTP requests when scraping fresh data. Higher = faster but may trigger rate limiting.

## Actor input object example

```json
{
  "filterCategory": "",
  "maxProducts": 10,
  "locale": "en_US",
  "forceRefresh": false,
  "maxWorkers": 10
}
```
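
Validating input on the client side catches mistakes before a run is started. The following sketch merges overrides into the defaults and checks the numeric bounds stated in the OpenAPI specification further down (`maxProducts` at least 0, `maxWorkers` between 1 and 30). The helper `build_input` is hypothetical, and it does not check the `filterCategory` or `locale` enums.

```python
# Defaults as listed in the input schema.
DEFAULTS = {
    "filterCategory": "",
    "maxProducts": 0,
    "locale": "en_US",
    "forceRefresh": False,
    "maxWorkers": 10,
}


def build_input(**overrides):
    """Merge overrides into the defaults and check the documented bounds."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")
    run_input = {**DEFAULTS, **overrides}
    if run_input["maxProducts"] < 0:
        raise ValueError("maxProducts must be >= 0")
    if not 1 <= run_input["maxWorkers"] <= 30:
        raise ValueError("maxWorkers must be between 1 and 30")
    return run_input


print(build_input(maxProducts=10, filterCategory="excavators"))
```

The returned dictionary can be passed as the run input to either client shown in the API section.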

# Actor output Schema

## `equipmentSpecs` (type: `string`):

All scraped models with complete technical specifications. Each row is one spec: model, category, section, spec name, spec value.

## `errors` (type: `string`):

List of product pages that failed to scrape, if any.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "maxProducts": 10
};

// Run the Actor and wait for it to finish
const run = await client.actor("rastriq/caterpillar-equipment-catalog-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "maxProducts": 10 }

# Run the Actor and wait for it to finish
run = client.actor("rastriq/caterpillar-equipment-catalog-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "maxProducts": 10
}' |
apify call rastriq/caterpillar-equipment-catalog-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=rastriq/caterpillar-equipment-catalog-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Caterpillar Catalog Scraper — CAT Equipment Specs",
        "description": "Scrape the official Caterpillar equipment catalog. Extract model details, specifications, operating weight, engine data, dimensions, and performance specs for CAT excavators, dozers, loaders, and more. Build a comprehensive CAT equipment database.",
        "version": "0.1",
        "x-build-id": "FQNHfcCQf4Iw0CCqP"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/rastriq~caterpillar-equipment-catalog-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-rastriq-caterpillar-equipment-catalog-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/rastriq~caterpillar-equipment-catalog-scraper/runs": {
            "post": {
                "operationId": "runs-sync-rastriq-caterpillar-equipment-catalog-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/rastriq~caterpillar-equipment-catalog-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-rastriq-caterpillar-equipment-catalog-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "filterCategory": {
                        "title": "Equipment category",
                        "enum": [
                            "",
                            "articulated-trucks",
                            "asphalt-pavers",
                            "backhoe-loaders",
                            "cold-planers",
                            "compactors",
                            "dozers",
                            "drills",
                            "electric-rope-shovels",
                            "excavators",
                            "forest-machines",
                            "hydraulic-mining-shovels",
                            "landfill-compactors",
                            "large-mining-trucks",
                            "material-handlers",
                            "motor-graders",
                            "off-highway-trucks",
                            "pipelayers",
                            "road-reclaimers",
                            "scrapers",
                            "skid-steer-loaders",
                            "small-dozers",
                            "telehandlers",
                            "track-loaders",
                            "underground-hard-rock",
                            "underground-mining",
                            "utility-vehicles",
                            "wheel-dozers",
                            "wheel-excavators",
                            "wheel-loaders",
                            "wheel-tractor-scrapers"
                        ],
                        "type": "string",
                        "description": "Choose a specific equipment type, or leave on 'All' to get the full catalog.",
                        "default": ""
                    },
                    "maxProducts": {
                        "title": "Max models",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Limit the number of equipment models returned. Set to 0 for all available (~332 for English catalog).<br><br>💡 Use 5–10 for your first test run.",
                        "default": 0
                    },
                    "locale": {
                        "title": "Catalog language",
                        "enum": [
                            "en_US",
                            "es_ES",
                            "es_MX",
                            "es_LA",
                            "fr_FR",
                            "de_DE",
                            "it_IT",
                            "pt_BR",
                            "zh_CN",
                            "ja_JP",
                            "ko_KR",
                            "ru_RU",
                            "ar_SA",
                            "tr_TR",
                            "pl_PL",
                            "nl_NL",
                            "sv_SE",
                            "fi_FI",
                            "nb_NO",
                            "da_DK",
                            "cs_CZ",
                            "hu_HU",
                            "ro_RO",
                            "id_ID",
                            "th_TH",
                            "vi_VN"
                        ],
                        "type": "string",
                        "description": "Language and region of the CAT catalog. Different locales may have different models available.",
                        "default": "en_US"
                    },
                    "forceRefresh": {
                        "title": "Force fresh scrape",
                        "type": "boolean",
                        "description": "By default, results come from our <b>pre-built database</b> (updated monthly). Enable this only if you need the absolute latest data — it will take longer.",
                        "default": false
                    },
                    "maxWorkers": {
                        "title": "Parallel workers",
                        "minimum": 1,
                        "maximum": 30,
                        "type": "integer",
                        "description": "Concurrent HTTP requests when scraping fresh data. Higher = faster but may trigger rate limiting.",
                        "default": 10
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
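
For projects that call the REST API directly, the `run-sync-get-dataset-items` path above can be exercised with the standard library alone. This is a sketch rather than an official client: it builds the POST request as the specification describes (token in the query string, JSON input in the body) and leaves the actual send commented out, since that requires a real token and starts a billable run. The function name `build_sync_request` is an assumption.

```python
import json
import urllib.request
from urllib.parse import urlencode

ACTOR_PATH = "rastriq~caterpillar-equipment-catalog-scraper"


def build_sync_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build a POST request that runs the Actor and returns its dataset items."""
    query = urlencode({"token": token})
    url = f"https://api.apify.com/v2/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_sync_request("<YOUR_API_TOKEN>", {"maxProducts": 10})
print(request.get_method(), request.full_url)

# Sending is left out on purpose; uncomment with a real token to run the Actor.
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)
```

The synchronous endpoint holds the connection open until the run finishes, so for long scrapes the asynchronous `/runs` path may be the better fit.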
