# Arizona ROC Contractor License Scraper (`haketa/az-roc-contractor-license-scraper`) Actor

Scrape Arizona Registrar of Contractors (AZ ROC) license records. Search by license number, company name, qualifying party, city, license type and classification. Returns status, bond info, complaint history and full license details.

- **URL**: https://apify.com/haketa/az-roc-contractor-license-scraper.md
- **Developed by:** [Haketa](https://apify.com/haketa) (community)
- **Categories:** Developer tools, Automation, Other
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $6.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Since this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## 🏗 Arizona ROC Contractor License Scraper

Scrape the **Arizona Registrar of Contractors (AZ ROC)** public license database with full pagination support. Search by license number, company name, qualifying party, city, license type, or classification. Returns complete license details including bond status, complaint history, and personnel information.

---

### Table of Contents

- [Overview](#overview)
- [What Data Is Returned](#what-data-is-returned)
- [Search Modes](#search-modes)
- [Full Pagination Support](#full-pagination-support)
- [Input Parameters](#input-parameters)
- [Output Schema](#output-schema)
- [Usage Examples](#usage-examples)
- [Proxy Recommendations](#proxy-recommendations)
- [Cost & Performance](#cost--performance)
- [Limitations](#limitations)
- [Technical Architecture](#technical-architecture)
- [Changelog](#changelog)

---

### Overview

The [AZ ROC public portal](https://azroc.my.site.com/AZRoc/s/contractor-search) is built on Salesforce Experience Cloud with Lightning Web Components (LWC). It returns server-rendered HTML with a paginated results table. This Actor:

1. Navigates to the search page using a headless Chromium browser (Playwright).
2. Fills in the search term and optional Advanced Search filters.
3. **Automatically paginates through all result pages** until the last page is reached or the configured limit is hit.
4. For each result, optionally visits the individual contractor detail page to collect the full record.
5. Pushes all records to the Apify Dataset.

---

### What Data Is Returned

Each scraped record contains:

| Field | Description |
|---|---|
| `licenseNumber` | AZ ROC six-digit license number |
| `licenseType` | RESIDENTIAL, COMMERCIAL, DUAL, or Specialty Dual |
| `licenseStatus` | ACTIVE, SUSPENDED, EXPIRED, REVOKED, or CANCELLED |
| `businessName` | Registered business / company name |
| `dbaName` | Doing Business As name (if applicable) |
| `qualifyingParty` | Individual responsible for the trade qualifications |
| `personnel` | Array of all personnel with name and position |
| `entityType` | Legal entity type (e.g. Corporation, LLC) |
| `primaryClassification` | Primary classification code (e.g. `B-1`, `R-11`, `C-37`) |
| `classificationDesc` | Description of the primary classification |
| `classifications` | All classification codes and descriptions |
| `city` / `state` / `zip` | Business location |
| `phone` | Business phone number |
| `issuedDate` | License originally issued date (YYYY-MM-DD) |
| `renewedThroughDate` | License valid through date (YYYY-MM-DD) |
| `bondType` | Type of surety bond |
| `bondStatus` | ACTIVE or INACTIVE |
| `bondAmount` | Surety bond dollar amount |
| `bondCompany` | Surety bond company name |
| `bondNumber` | Bond policy / certificate number |
| `bondEffectiveDate` | Bond start date (YYYY-MM-DD) |
| `bondExpirationDate` | Bond expiration date (YYYY-MM-DD) |
| `openCases` | Number of currently open complaint cases |
| `disciplinedCases` | Number of disciplined complaint cases |
| `resolvedCases` | Number of resolved / settled cases |
| `complaintCount` | Total complaints (portal shows last 2 years) |
| `complaints` | Array of individual complaint records |
| `profileUrl` | Direct URL to the contractor's ROC detail page |
| `scrapedAt` | ISO 8601 timestamp of when the record was scraped |

---

### Search Modes

You can combine multiple search modes in a single run. Each mode enqueues an independent search job:

#### 1. License Number Lookup (fastest)
Direct lookup by AZ ROC license number. Leading zeros are added automatically (e.g. `12345` → `012345`).

```json
{
  "licenseNumbers": ["123456", "789012"]
}
````
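
The zero-padding rule can be pictured with a short Python sketch. The helper name `normalize_license_number` is hypothetical, and the six-digit width is an assumption taken from the `licenseNumber` field description; the Actor's own implementation may differ.

```python
def normalize_license_number(raw: str, width: int = 6) -> str:
    """Left-pad a license number with zeros to a fixed width (assumed to be 6)."""
    digits = raw.strip()
    if not digits.isdigit():
        raise ValueError(f"license number must contain only digits: {raw!r}")
    return digits.zfill(width)


print(normalize_license_number("12345"))   # 012345
print(normalize_license_number("123456"))  # 123456
```

Since the Actor pads automatically, a helper like this is only useful on the caller's side, for example to match returned records against a local list of license numbers.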

#### 2. Company / Business Name Search

Partial name matching is supported.

```json
{
  "companyNames": ["Acme Plumbing", "Desert HVAC"]
}
```

#### 3. Qualifying Party Search

Search by the name of the individual responsible for the license.

```json
{
  "qualifyingPartyNames": ["John Smith", "Maria Garcia"]
}
```

#### 4. City Search

Return all contractors registered in a given Arizona city. Works best with additional filters. For large cities (Phoenix, Tucson), pagination will collect thousands of records.

```json
{
  "cities": ["Scottsdale", "Mesa"],
  "licenseStatus": "ACTIVE",
  "licenseType": "RESIDENTIAL"
}
```
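
One way to picture how the four search modes expand into independent jobs is the Python sketch below. It mirrors the `buildJobs()` step named in the Technical Architecture diagram, but the `SearchJob` type, its field names, and the `kind` labels are hypothetical; the Actor's actual job structure is not documented here.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class SearchJob:
    kind: str  # hypothetical label: which search mode produced the job
    term: str  # the value submitted to the portal search


# Input key -> hypothetical job kind.
MODE_KEYS = {
    "licenseNumbers": "license",
    "companyNames": "company",
    "qualifyingPartyNames": "qualifyingParty",
    "cities": "city",
}


def build_jobs(actor_input: Dict[str, Any]) -> List[SearchJob]:
    """Create one job per non-empty term across all four search modes."""
    jobs: List[SearchJob] = []
    for key, kind in MODE_KEYS.items():
        for term in actor_input.get(key) or []:
            term = term.strip()
            if term:
                jobs.append(SearchJob(kind=kind, term=term))
    return jobs


jobs = build_jobs({"licenseNumbers": ["123456"], "cities": ["Scottsdale", "Mesa"]})
print(len(jobs))  # 3
```

In this sketch, filters such as `licenseStatus` and `licenseType` are not part of the job itself; they would be applied to every job in the run.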

---

### Full Pagination Support

**Version 1.1** adds complete, automatic pagination through all result pages.

#### How it works

The AZ ROC portal displays results in a table with `10`, `25`, or `50` rows per page and a **Next Page** / **Previous Page** button. Previous versions of the scraper only ever read the first page of results, which could mean missing dozens or hundreds of matching records.

The updated `performSearch()` function now:

1. Sets the items-per-page selector to `50` (configurable via `resultsPerPage`) immediately after the first results load.
2. Collects all rows from the current page.
3. Detects whether the **Next Page** button is present and enabled using multiple CSS selector fallbacks:
   - `button[title="Next Page"]`
   - `button[aria-label="Next Page"]`
   - `button.nextPage`
   - `a[title="Next Page"]`
   - `li.next button`
   - `button:has-text("Next")`
4. Before clicking Next, captures a **staleness marker** (the text content of the first row).
5. Clicks the Next button and polls the DOM until the first row changes, indicating the new page has rendered.
6. Repeats until either no Next button is found (last page) or `maxResultsPerSearch` is reached.

This approach is resilient to network delays and Salesforce's asynchronous re-rendering.
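
The loop above can be sketched in Python as follows. This is a minimal illustration of the staleness-marker idea, not the Actor's `performSearch()` code: the function names, the callback-based design, and the `FakePortal` stand-in are all assumptions made so the example runs without a browser.

```python
import time
from typing import Callable, List


def wait_for_change(read_marker: Callable[[], str], previous: str,
                    timeout_s: float = 10.0, poll_s: float = 0.05) -> bool:
    """Poll until the staleness marker differs from its previous value."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if read_marker() != previous:
            return True
        time.sleep(poll_s)
    return False


def collect_all_rows(read_rows: Callable[[], List[str]],
                     click_next: Callable[[], bool],
                     max_results: int = 0) -> List[str]:
    """Collect rows page by page; max_results == 0 means no cap."""
    collected: List[str] = []
    while True:
        rows = read_rows()
        collected.extend(rows)
        if max_results and len(collected) >= max_results:
            return collected[:max_results]
        marker = rows[0] if rows else ""  # first row's text is the staleness marker
        # click_next() returns False when no enabled Next button exists.
        if not click_next():
            return collected
        if not wait_for_change(lambda: (read_rows() or [""])[0], marker):
            return collected  # page never re-rendered; stop rather than loop forever


class FakePortal:
    """In-memory stand-in for the results table, used only for this demo."""

    def __init__(self, pages: List[List[str]]) -> None:
        self.pages, self.index = pages, 0

    def read_rows(self) -> List[str]:
        return self.pages[self.index]

    def click_next(self) -> bool:
        if self.index + 1 >= len(self.pages):
            return False
        self.index += 1
        return True


portal = FakePortal([["A", "B"], ["C", "D"], ["E"]])
print(collect_all_rows(portal.read_rows, portal.click_next))  # ['A', 'B', 'C', 'D', 'E']
```

A known weakness of this kind of marker is that two consecutive pages starting with identical first-row text would look unchanged, so the sketch treats a timeout as the end of pagination instead of retrying.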

#### Stopping early

Set `maxResultsPerSearch` to a positive integer to cap the number of records collected per search query. Set to `0` (default) for unlimited — the scraper will traverse every page.

---

### Input Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `licenseNumbers` | string\[] | `[]` | AZ ROC license numbers to look up |
| `companyNames` | string\[] | `[]` | Business names to search |
| `qualifyingPartyNames` | string\[] | `[]` | Qualifying party names to search |
| `cities` | string\[] | `[]` | Arizona city names to search |
| `licenseType` | string | `ALL` | Filter: `ALL`, `RESIDENTIAL`, `COMMERCIAL`, `DUAL` |
| `licenseStatus` | string | `ALL` | Filter: `ALL`, `ACTIVE`, `SUSPENDED`, `EXPIRED`, `REVOKED`, `CANCELLED` |
| `licenseClassification` | string | `""` | Filter by classification code (e.g. `B-1`, `C-37`) |
| `maxResultsPerSearch` | integer | `0` | Max records per search job (0 = unlimited, paginates all pages) |
| `resultsPerPage` | integer | `50` | Items per page: `10`, `25`, or `50` |
| `scrapeDetailPage` | boolean | `true` | Visit each contractor's detail page for full data |
| `scrapeComplaints` | boolean | `true` | Include complaint history (requires `scrapeDetailPage`) |
| `proxyConfiguration` | object | Apify Residential | Proxy settings |
| `maxConcurrency` | integer | `3` | Maximum parallel browser tabs (1–10) |
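
The constraints in the table can be checked on the caller's side before a run is started. The Python sketch below is a hypothetical pre-flight check (`check_input` is not part of the Actor or the Apify client); it only encodes rules stated in the table above, using the documented defaults for missing keys.

```python
from typing import Any, Dict, List

VALID_PAGE_SIZES = {10, 25, 50}


def check_input(actor_input: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in the input; empty means none were found."""
    problems: List[str] = []
    if actor_input.get("resultsPerPage", 50) not in VALID_PAGE_SIZES:
        problems.append("resultsPerPage must be 10, 25, or 50")
    if not 1 <= actor_input.get("maxConcurrency", 3) <= 10:
        problems.append("maxConcurrency must be between 1 and 10")
    if actor_input.get("maxResultsPerSearch", 0) < 0:
        problems.append("maxResultsPerSearch must be 0 (unlimited) or positive")
    # Complaint history is read from the detail page, so it needs scrapeDetailPage.
    if actor_input.get("scrapeComplaints", True) and not actor_input.get("scrapeDetailPage", True):
        problems.append("scrapeComplaints requires scrapeDetailPage")
    return problems


for problem in check_input({"resultsPerPage": 20, "scrapeDetailPage": False}):
    print(problem)
```

How the Actor itself reacts to out-of-range values is not documented here, so a check like this is best treated as a convenience rather than a guarantee.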

#### Example: Full city scrape with all filters

```json
{
  "cities": ["Phoenix"],
  "licenseType": "COMMERCIAL",
  "licenseStatus": "ACTIVE",
  "licenseClassification": "B-1",
  "maxResultsPerSearch": 0,
  "resultsPerPage": 50,
  "scrapeDetailPage": true,
  "scrapeComplaints": true,
  "maxConcurrency": 5,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
```

#### Example: Batch license lookup

```json
{
  "licenseNumbers": ["100001", "200002", "300003"],
  "scrapeDetailPage": true,
  "scrapeComplaints": false,
  "maxConcurrency": 3
}
```

---

### Output Schema

Records are saved to the Apify Dataset. Each record is a single JSON object whose fields are scalar values, except for `personnel`, `classifications`, and `complaints`, which are arrays of objects.

#### Sample record

```json
{
  "licenseNumber": "123456",
  "licenseType": "Specialty Dual",
  "licenseStatus": "ACTIVE",
  "businessName": "Desert Star HVAC LLC",
  "dbaName": null,
  "qualifyingParty": "John Michael Smith",
  "personnel": [
    { "name": "John Michael Smith", "position": "Qualifying Party" },
    { "name": "Jane Smith", "position": "Member/Manager" }
  ],
  "entityType": "Limited Liability Company",
  "primaryClassification": "CR-39",
  "classificationDesc": "Air Conditioning and Refrigeration",
  "classifications": [
    { "code": "CR-39", "description": "Air Conditioning and Refrigeration" }
  ],
  "city": "Scottsdale",
  "state": "AZ",
  "zip": "85251",
  "phone": "(480) 555-0100",
  "issuedDate": "2010-03-15",
  "renewedThroughDate": "2026-03-31",
  "bondType": "Contractor License Bond",
  "bondStatus": "ACTIVE",
  "bondAmount": "$ 15,000",
  "bondCompany": "Western Surety Company",
  "bondNumber": "123456789",
  "bondEffectiveDate": "2024-04-01",
  "bondExpirationDate": "2026-03-31",
  "openCases": 0,
  "disciplinedCases": 0,
  "resolvedCases": 1,
  "complaintCount": 1,
  "complaints": [
    {
      "complaintId": "2023-00001",
      "type": "Workmanship",
      "outcome": "Resolved"
    }
  ],
  "profileUrl": "https://azroc.my.site.com/AZRoc/s/contractor-search?licenseId=a0o8y0000007D05AAE",
  "scrapedAt": "2024-11-15T14:32:01.000Z"
}
```
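
Because `bondAmount` and the date fields arrive as strings, downstream code usually needs to convert them before comparing or sorting. The Python sketch below shows one way to do that; the helper names are hypothetical, and the `$ 15,000` format is assumed from the sample record above and may not cover every value the portal shows.

```python
import re
from datetime import date
from typing import Optional


def parse_bond_amount(text: Optional[str]) -> Optional[int]:
    """Convert a string such as '$ 15,000' to whole dollars; None if no digits."""
    if not text:
        return None
    # Drop any cents part, then keep digits only.
    digits = re.sub(r"[^\d]", "", text.split(".")[0])
    return int(digits) if digits else None


def parse_iso_date(text: Optional[str]) -> Optional[date]:
    """Parse a YYYY-MM-DD string; None for missing values."""
    return date.fromisoformat(text) if text else None


record = {"bondAmount": "$ 15,000", "renewedThroughDate": "2026-03-31", "dbaName": None}
print(parse_bond_amount(record["bondAmount"]))       # 15000
print(parse_iso_date(record["renewedThroughDate"]))  # 2026-03-31
```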

---

### Proxy Recommendations

The AZ ROC portal is hosted on Salesforce and uses Cloudflare / Akamai bot protection. To avoid detection and rate limiting:

- **Residential proxies** (Apify Residential) are strongly recommended for production runs.
- Datacenter proxies may trigger CAPTCHA challenges or return empty results.
- For development and testing, running without a proxy may work, but it is not reliable for large runs.

Configure via the `proxyConfiguration` input parameter.

---

### Cost & Performance

| Scenario | Approx. records/hr | Notes |
|---|---|---|
| License number lookups | 300–500 | No pagination, direct detail page |
| Company name search | 200–400 | Small result sets, no pagination needed |
| City search (small city) | 150–300 | 1–3 pages typically |
| City search (Phoenix ACTIVE) | 80–150 | Many pages + detail pages |

Performance depends on proxy speed, portal response times, and `maxConcurrency`.
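
For rough planning, the throughput ranges in the table can be turned into a run-time and cost estimate. The Python sketch below is only arithmetic on the figures quoted in this document: it assumes a flat price of $6.00 per 1,000 results (the listed "from" price, which may differ by plan and event type) and treats the records-per-hour range as a best and worst case. The function names are hypothetical.

```python
from typing import Tuple


def estimate_hours(records: int, rate_low: int, rate_high: int) -> Tuple[float, float]:
    """Return (best case, worst case) run time in hours for a records/hr range."""
    return records / rate_high, records / rate_low


def estimate_cost(records: int, usd_per_1000: float = 6.00) -> float:
    """Result-based cost at an assumed flat per-1,000 price, rounded to cents."""
    return round(records * usd_per_1000 / 1000, 2)


# Small-city search: 150-300 records/hr according to the table above.
fast, slow = estimate_hours(1200, rate_low=150, rate_high=300)
print(fast, slow)           # 4.0 8.0
print(estimate_cost(1200))  # 7.2
```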

#### Cost tips

- Set `scrapeDetailPage: false` to collect list-only data — much faster and cheaper.
- Set `scrapeComplaints: false` to skip complaint parsing if not needed.
- Set `resultsPerPage: 50` (default) to minimize page loads.
- Use `maxResultsPerSearch` to cap large open-ended searches.

---

### Limitations

- **Complaint history**: The AZ ROC portal only displays complaints from the **prior two years**. Older complaints are not accessible via the public portal.
- **No official API**: The portal does not expose a public API. All data is parsed from server-rendered HTML and may break if Salesforce updates the DOM structure.
- **Pagination cap**: The portal may limit total results per search to 500–1000 records depending on the query. Use narrow filters (classification, city, status) to stay within limits.
- **Bot protection**: Salesforce bot protection may challenge scrapers. Residential proxies and human-like delays are used to mitigate this.

---

### Technical Architecture

```
Actor Input
    │
    ▼
buildJobs()
    │  Creates one job per search term / license number / city / qualifying party
    ▼
PlaywrightCrawler (Crawlee)
    │
    ├─► SEARCH requests (one per job)
    │       │
    │       ├─ Navigate to SEARCH_URL
    │       ├─ Fill search input
    │       ├─ Apply Advanced Search filters (if any)
    │       ├─ Set items-per-page to 50
    │       │
    │       └─► performSearch() — PAGINATION LOOP
    │               │
    │               ├─ parseResultsPage()  ←── Page 1
    │               ├─ clickNext() + waitForPageChange()
    │               ├─ parseResultsPage()  ←── Page 2
    │               ├─ clickNext() + waitForPageChange()
    │               ├─ ...
    │               └─ Return all collected rows
    │
    └─► DETAIL requests (one per unique licenseId)
            │
            ├─ Navigate to profileUrl
            ├─ parseDetailPage()
            │     ├─ CONTRACTOR section → businessName, phone, status
            │     ├─ LICENSE section    → type, classification, dates
            │     ├─ PERSONNEL section  → qualifying party, all personnel
            │     ├─ COMPLAINT section  → case counts + complaint IDs
            │     └─ BOND section       → bond status, amount, company
            └─ Actor.pushData(record)
```

#### Key design decisions

- **Staleness-based pagination**: Instead of relying on unreliable network events, the scraper captures the first row's text content before clicking Next and polls until it changes. This works reliably across Salesforce's asynchronous LWC re-rendering.
- **Multiple Next-button selectors**: The Salesforce portal's CSS classes vary between versions. Multiple fallback selectors ensure pagination detection remains robust.
- **Regex-based HTML parsing**: Section content is extracted with targeted regex patterns rather than a full DOM parser, which is faster and avoids issues with Salesforce's deeply nested shadow DOM.
- **Deduplication via `seen` Set**: Records are deduplicated by `licenseId` or `licenseNumber` across all search jobs to prevent duplicate dataset entries when multiple search terms match the same contractor.
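
The deduplication decision can be illustrated with a short Python sketch. It follows the description above (a `seen` set keyed by `licenseId`, falling back to `licenseNumber`), but the function name and the choice to keep records that have neither key are assumptions, not the Actor's confirmed behavior.

```python
from typing import Any, Dict, Iterable, Iterator, Set


def dedupe_records(records: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Yield each contractor once, keyed by licenseId or, failing that, licenseNumber."""
    seen: Set[str] = set()
    for record in records:
        key = record.get("licenseId") or record.get("licenseNumber")
        if key is None:
            yield record  # nothing to key on; keep the record rather than drop it
            continue
        if key in seen:
            continue  # already emitted by an earlier search job
        seen.add(key)
        yield record


rows = [
    {"licenseNumber": "123456", "businessName": "Desert Star HVAC LLC"},  # company search hit
    {"licenseNumber": "123456", "businessName": "Desert Star HVAC LLC"},  # same contractor, city search hit
    {"licenseNumber": "789012", "businessName": "Acme Plumbing"},
]
print(len(list(dedupe_records(rows))))  # 2
```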

---

### Changelog

#### v1.1.0

- ✅ **Full pagination support** — `performSearch()` now loops through all result pages automatically.
- ✅ Added `resultsPerPage` input parameter (10 / 25 / 50).
- ✅ `maxResultsPerSearch` default changed to `0` (unlimited).
- ✅ Staleness-based page-change detection for reliable Salesforce LWC pagination.
- ✅ Multiple CSS selector fallbacks for Next button detection.
- ✅ Bond expiration date now extracted from detail page.
- ✅ All classification codes extracted into `classifications` array.

#### v1.0.0

- Initial release with single-page search, detail page scraping, and bond/complaint parsing.

# Actor input Schema

## `licenseNumbers` (type: `array`):

One or more ROC license numbers to look up directly (e.g. '123456'). This is the fastest and most precise search method. Leading zeros are handled automatically.

## `companyNames` (type: `array`):

Business names to search. Partial matches are supported (e.g. 'Acme Plumbing'). Each name triggers a separate search and may return multiple results across multiple pages.

## `qualifyingPartyNames` (type: `array`):

Full name of the qualifying party (the individual responsible for the license). Format: 'First Last' (e.g. 'John Smith'). Partial names are accepted.

## `cities` (type: `array`):

Arizona city names to search by location (e.g. 'Phoenix', 'Tucson', 'Scottsdale'). Can be combined with license type or classification filters. For large cities pagination will be used to collect all results.

## `licenseType` (type: `string`):

Filter results by license type. 'ALL' returns all types. 'RESIDENTIAL' covers home construction/repair. 'COMMERCIAL' covers non-residential projects. 'DUAL' covers both.

## `licenseStatus` (type: `string`):

Filter by license status. 'ALL' returns every record regardless of standing. 'ACTIVE' returns only contractors currently authorized to work.

## `licenseClassification` (type: `string`):

Optional classification code to filter results (e.g. 'B-1', 'R-11', 'C-37'). Leave blank to include all classifications.

## `maxResultsPerSearch` (type: `integer`):

Maximum number of contractor records to return per individual search query (across all pages). Set to 0 for unlimited — the scraper will paginate through all available result pages.

## `resultsPerPage` (type: `integer`):

Number of results to request per page from the AZ ROC portal. Valid values are 10, 25, or 50. Higher values mean fewer page loads and faster runs.

## `scrapeDetailPage` (type: `boolean`):

When enabled, the scraper opens each contractor's detail page to collect complete information including bond details, all license classifications, complaint history and qualifying party details. Increases run time and cost but provides richer data.

## `scrapeComplaints` (type: `boolean`):

When enabled (requires Scrape Full Detail Page), fetches the complaint history for each contractor. Only complaints from the prior two years are shown on the AZ ROC portal.

## `proxyConfiguration` (type: `object`):

Proxy settings for browser requests to the AZ ROC portal. Apify residential proxies are recommended for production runs to avoid detection by Salesforce bot protection.

## `maxConcurrency` (type: `integer`):

Maximum number of browser tabs running in parallel. Higher values speed up bulk runs but increase memory and proxy usage. Recommended: 2-5 for residential proxies.

## Actor input object example

```json
{
  "licenseNumbers": [],
  "companyNames": [],
  "qualifyingPartyNames": [],
  "cities": [],
  "licenseType": "ALL",
  "licenseStatus": "ALL",
  "licenseClassification": "",
  "maxResultsPerSearch": 0,
  "resultsPerPage": 50,
  "scrapeDetailPage": true,
  "scrapeComplaints": true,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  },
  "maxConcurrency": 3
}
```

# Actor output Schema

## `licenseNumber` (type: `string`):

AZ ROC six-digit license number

## `licenseType` (type: `string`):

RESIDENTIAL, COMMERCIAL, DUAL, or Specialty Dual

## `licenseStatus` (type: `string`):

ACTIVE, SUSPENDED, EXPIRED, REVOKED, or CANCELLED

## `businessName` (type: `string`):

Registered business / company name

## `qualifyingParty` (type: `string`):

Individual responsible for holding the trade qualifications

## `dbaName` (type: `string`):

Doing business as name (if different from registered name)

## `classifications` (type: `string`):

All license classification codes and descriptions

## `primaryClassification` (type: `string`):

Primary classification code (e.g. B-1, R-11, C-37)

## `classificationDesc` (type: `string`):

Description of the primary classification

## `entityType` (type: `string`):

Legal entity type (e.g. Corporation, LLC, Sole Proprietor)

## `personnel` (type: `string`):

Array of personnel objects with name and position

## `city` (type: `string`):

Business city

## `state` (type: `string`):

Business state (typically AZ)

## `zip` (type: `string`):

Business ZIP code

## `phone` (type: `string`):

Business phone number as listed on ROC record

## `mailingAddress` (type: `string`):

Full mailing address

## `issuedDate` (type: `string`):

Date the license was originally issued (YYYY-MM-DD)

## `renewedThroughDate` (type: `string`):

Date the license is renewed / valid through (YYYY-MM-DD)

## `expirationDate` (type: `string`):

License expiration date (YYYY-MM-DD)

## `bondType` (type: `string`):

Type of surety bond (e.g. Contractor License Bond)

## `bondStatus` (type: `string`):

ACTIVE or INACTIVE — inactive bond means contractor cannot legally work

## `bondAmount` (type: `string`):

Surety bond dollar amount

## `bondCompany` (type: `string`):

Name of the surety bond company

## `bondNumber` (type: `string`):

Bond policy / certificate number

## `bondEffectiveDate` (type: `string`):

Bond effective start date (YYYY-MM-DD)

## `bondExpirationDate` (type: `string`):

Bond expiration date (YYYY-MM-DD)

## `openCases` (type: `string`):

Number of currently open complaint cases

## `disciplinedCases` (type: `string`):

Number of disciplined complaint cases

## `resolvedCases` (type: `string`):

Number of resolved / settled complaint cases

## `complaintCount` (type: `string`):

Total complaints on file (prior 2 years shown on portal)

## `complaints` (type: `string`):

Array of complaint objects with type, ID, outcome and status

## `profileUrl` (type: `string`):

Direct URL to the contractor's ROC detail page

## `scrapedAt` (type: `string`):

ISO 8601 timestamp of when the record was scraped

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "licenseNumbers": [],
    "companyNames": [],
    "qualifyingPartyNames": [],
    "cities": [],
    "proxyConfiguration": {
        "useApifyProxy": true,
        "apifyProxyGroups": [
            "RESIDENTIAL"
        ]
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("haketa/az-roc-contractor-license-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "licenseNumbers": [],
    "companyNames": [],
    "qualifyingPartyNames": [],
    "cities": [],
    "proxyConfiguration": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"],
    },
}

# Run the Actor and wait for it to finish
run = client.actor("haketa/az-roc-contractor-license-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "licenseNumbers": [],
  "companyNames": [],
  "qualifyingPartyNames": [],
  "cities": [],
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  }
}' |
apify call haketa/az-roc-contractor-license-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=haketa/az-roc-contractor-license-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Arizona ROC Contractor License Scraper",
        "description": "Scrape Arizona Registrar of Contractors (AZ ROC) license records. Search by license number, company name, qualifying party, city, license type and classification. Returns status, bond info, complaint history and full license details.",
        "version": "0.0",
        "x-build-id": "RU1rmFVZHGJ9P891X"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/haketa~az-roc-contractor-license-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-haketa-az-roc-contractor-license-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/haketa~az-roc-contractor-license-scraper/runs": {
            "post": {
                "operationId": "runs-sync-haketa-az-roc-contractor-license-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/haketa~az-roc-contractor-license-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-haketa-az-roc-contractor-license-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "licenseNumbers": {
                        "title": "License Numbers",
                        "type": "array",
                        "description": "One or more ROC license numbers to look up directly (e.g. '123456'). This is the fastest and most precise search method. Leading zeros are handled automatically.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "companyNames": {
                        "title": "Company / Business Names",
                        "type": "array",
                        "description": "Business names to search. Partial matches are supported (e.g. 'Acme Plumbing'). Each name triggers a separate search and may return multiple results across multiple pages.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "qualifyingPartyNames": {
                        "title": "Qualifying Party Names",
                        "type": "array",
                        "description": "Full name of the qualifying party (the individual responsible for the license). Format: 'First Last' (e.g. 'John Smith'). Partial names are accepted.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "cities": {
                        "title": "Cities",
                        "type": "array",
                        "description": "Arizona city names to search by location (e.g. 'Phoenix', 'Tucson', 'Scottsdale'). Can be combined with license type or classification filters. For large cities, pagination is used to collect all results.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "licenseType": {
                        "title": "License Type",
                        "enum": [
                            "ALL",
                            "RESIDENTIAL",
                            "COMMERCIAL",
                            "DUAL"
                        ],
                        "type": "string",
                        "description": "Filter results by license type. 'ALL' returns all types. 'RESIDENTIAL' covers home construction/repair. 'COMMERCIAL' covers non-residential projects. 'DUAL' covers both.",
                        "default": "ALL"
                    },
                    "licenseStatus": {
                        "title": "License Status Filter",
                        "enum": [
                            "ALL",
                            "ACTIVE",
                            "SUSPENDED",
                            "EXPIRED",
                            "REVOKED",
                            "CANCELLED"
                        ],
                        "type": "string",
                        "description": "Filter by license status. 'ALL' returns every record regardless of standing. 'ACTIVE' returns only contractors currently authorized to work.",
                        "default": "ALL"
                    },
                    "licenseClassification": {
                        "title": "License Classification",
                        "type": "string",
                        "description": "Optional classification code to filter results (e.g. 'B-1', 'R-11', 'C-37'). Leave blank to include all classifications.",
                        "default": ""
                    },
                    "maxResultsPerSearch": {
                        "title": "Max Results per Search Query",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Maximum number of contractor records to return per individual search query (across all pages). Set to 0 for unlimited — the scraper will paginate through all available result pages.",
                        "default": 0
                    },
                    "resultsPerPage": {
                        "title": "Results Per Page",
                        "minimum": 10,
                        "maximum": 50,
                        "type": "integer",
                        "description": "Number of results to request per page from the AZ ROC portal. Valid values are 10, 25, or 50. Higher values mean fewer page loads and faster runs.",
                        "default": 50
                    },
                    "scrapeDetailPage": {
                        "title": "Scrape Full Detail Page",
                        "type": "boolean",
                        "description": "When enabled, the scraper opens each contractor's detail page to collect complete information including bond details, all license classifications, complaint history and qualifying party details. Increases run time and cost but provides richer data.",
                        "default": true
                    },
                    "scrapeComplaints": {
                        "title": "Include Complaint History",
                        "type": "boolean",
                        "description": "When enabled (requires Scrape Full Detail Page), fetches the complaint history for each contractor. Only complaints from the prior two years are shown on the AZ ROC portal.",
                        "default": true
                    },
                    "proxyConfiguration": {
                        "title": "Proxy Configuration",
                        "type": "object",
                        "description": "Proxy settings for browser requests to the AZ ROC portal. Apify residential proxies are recommended for production runs to avoid detection by Salesforce bot protection."
                    },
                    "maxConcurrency": {
                        "title": "Max Concurrent Browsers",
                        "minimum": 1,
                        "maximum": 10,
                        "type": "integer",
                        "description": "Maximum number of browser tabs running in parallel. Higher values speed up bulk runs but increase memory and proxy usage. Recommended: 2-5 for residential proxies.",
                        "default": 3
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
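
The `inputSchema` above lists enums and numeric bounds that can be checked locally before a run is started. The following is a minimal sketch of such a check using only the Python standard library. The helper name `validate_run_input` is hypothetical and is not part of the Actor or of any Apify library; the field names, enum values, and limits are copied from the schema.

```python
from typing import Any, Optional

# Enum values copied from the inputSchema above.
LICENSE_TYPES = {"ALL", "RESIDENTIAL", "COMMERCIAL", "DUAL"}
LICENSE_STATUSES = {"ALL", "ACTIVE", "SUSPENDED", "EXPIRED", "REVOKED", "CANCELLED"}


def validate_run_input(run_input: dict[str, Any]) -> list[str]:
    """Hypothetical helper: return a list of problems (empty list = none found)."""
    problems: list[str] = []

    # Array-of-string search fields.
    for key in ("licenseNumbers", "companyNames", "qualifyingPartyNames", "cities"):
        value = run_input.get(key, [])
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            problems.append(f"{key} must be an array of strings")

    # Enum fields; both default to "ALL" in the schema.
    if run_input.get("licenseType", "ALL") not in LICENSE_TYPES:
        problems.append("licenseType must be one of " + ", ".join(sorted(LICENSE_TYPES)))
    if run_input.get("licenseStatus", "ALL") not in LICENSE_STATUSES:
        problems.append("licenseStatus must be one of " + ", ".join(sorted(LICENSE_STATUSES)))

    def check_int(key: str, default: int, minimum: int, maximum: Optional[int] = None) -> None:
        value = run_input.get(key, default)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            problems.append(f"{key} must be an integer")
        elif value < minimum or (maximum is not None and value > maximum):
            upper = "unbounded" if maximum is None else str(maximum)
            problems.append(f"{key} must be between {minimum} and {upper}")

    # Defaults and bounds copied from the schema.
    check_int("maxResultsPerSearch", default=0, minimum=0)
    check_int("resultsPerPage", default=50, minimum=10, maximum=50)
    check_int("maxConcurrency", default=3, minimum=1, maximum=10)

    # Boolean flags.
    for key in ("scrapeDetailPage", "scrapeComplaints"):
        if not isinstance(run_input.get(key, True), bool):
            problems.append(f"{key} must be a boolean")

    return problems


if __name__ == "__main__":
    example_input = {
        "cities": ["Phoenix"],
        "licenseType": "RESIDENTIAL",
        "licenseStatus": "ACTIVE",
        "resultsPerPage": 50,
    }
    print(validate_run_input(example_input))
```

If the returned list is empty, the same dictionary could then be passed as `run_input` to the official Python client, for example `client.actor("haketa/az-roc-contractor-license-scraper").call(run_input=example_input)`. Note that this sketch only enforces the schema's 10 to 50 range for `resultsPerPage`; the description states that the portal accepts 10, 25, or 50.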

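The `runsResponseSchema` in the definition above nests all run metadata under a top-level `data` object. As a rough sketch, the fields most often needed after starting a run (run ID, status, default dataset ID, and total cost) could be pulled out as shown below. The function name `summarize_run` and the sample ID values are hypothetical; the key names and the `READY` and `0.00005` examples come from the schema.

```python
from typing import Any


def summarize_run(response: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: pick a few fields out of a run response body."""
    # All run metadata sits under the top-level "data" key.
    data = response.get("data", {})
    return {
        "run_id": data.get("id"),
        "status": data.get("status"),
        # Scraped records are stored in the run's default dataset.
        "dataset_id": data.get("defaultDatasetId"),
        "total_usd": data.get("usageTotalUsd"),
    }


if __name__ == "__main__":
    # Sample body shaped like runsResponseSchema; the IDs are made up.
    sample_response = {
        "data": {
            "id": "example-run-id",
            "status": "READY",
            "defaultDatasetId": "example-dataset-id",
            "usageTotalUsd": 0.00005,
        }
    }
    print(summarize_run(sample_response))
```

Using `.get()` throughout means a missing field yields `None` rather than raising an exception, which may be convenient because the schema does not mark any of these properties as required.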