# Tech Stack From Job Posts Scraper (`coregent/tech-stack-from-job-posts-scraper`) Actor

Extract public job posts from Greenhouse, Lever, Ashby, and public career pages and detect the technologies, tools, and cloud platforms companies are hiring for - no login or cookies.

- **URL**: https://apify.com/coregent/tech-stack-from-job-posts-scraper.md
- **Developed by:** [Delowar Munna](https://apify.com/coregent) (community)
- **Categories:** Jobs, Developer tools, Automation
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $1.80 / 1,000 job-results

This Actor is paid per event. You are not charged for the Apify platform usage, but only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see the Apify documentation as a [Markdown index](https://docs.apify.com/llms.txt) or as [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Tech Stack From Job Posts Scraper

![Tech Stack From Job Posts Scraper](https://raw.githubusercontent.com/coregentdevspace/tech-stack-from-job-posts-scraper-assets/main/thumbnail-tech-stack-from-job-posts-scraper.png)

Turn **public job posts into technology-demand data**. Point this actor at a **Greenhouse**, **Lever**, or **Ashby** board (or a public career page) and it returns clean, flat, CSV-ready rows — each enriched with the **technologies, tools, cloud platforms, frameworks, databases, and AI/ML stacks** the company is hiring for, plus a transparent `tech_signal_score`.

**No login, no cookies, no session tokens.** The actor reads each ATS's **public JSON API** over HTTP, so it stays fast and cost-predictable. You pay one flat event per unique job row that passes your filters.

### ✨ Why this scraper

- **Hiring-intent, not just job rows** — generic job scrapers return listings; this turns each description into structured technology signals for B2B sales, GTM research, and recruiting.
- **JSON-first, no browser** — reads Greenhouse / Lever / Ashby public APIs directly. One request per board returns every posting _with its description_.
- **Transparent detection** — a local, in-code keyword dictionary (no AI, no paid API). You can see and extend exactly what is matched.
- **Flat 29-field output** — no nested objects; drops straight into Sheets, Excel, or a CRM.
- **Pay-Per-Event** — one flat `job-result` event per saved unique job. Duplicates and filtered rows are never charged.

---

### 🚀 Quick start — sample inputs

#### Example 1 — ATS boards + technology filter

```json
{
    "startUrls": ["https://jobs.lever.co/example-company", "https://boards.greenhouse.io/examplecompany"],
    "maxResults": 200,
    "technologyKeywords": ["Snowflake", "dbt", "Kubernetes"],
    "requireTechnologyMatch": true,
    "technologyCategories": ["cloud", "data", "devops"],
    "keywordFilter": "data engineer",
    "locationFilter": "Australia",
    "remoteFilter": "any",
    "minTechSignalScore": 30,
    "includeDescriptionText": true,
    "dedupe": true,
    "proxyConfiguration": { "useApifyProxy": true }
}
````

#### Example 2 — single Ashby board, all jobs, custom residential proxy via your own provider

```json
{
    "startUrls": ["https://jobs.ashbyhq.com/example"],
    "maxResults": 500,
    "includeDescriptionText": true,
    "dedupe": true,
    "proxyConfiguration": {
        "useApifyProxy": false,
        "proxyUrls": ["http://user:pass@proxy.iproyal.com:12321"]
    }
}
```

> Supported ATS URLs use each platform's public JSON API. Other career pages are read via `schema.org/JobPosting` structured data when present, or by following an embedded Greenhouse/Lever/Ashby board. Pages with neither are reported as `unsupported_inputs`.

> Apify Residential proxy is blocked; if you need residential routing, supply your own provider via `proxyConfiguration.proxyUrls`. See **🚦 Proxy policy** below.

***

### 📦 Output

The dataset has one view: **Jobs & detected tech stack** — a 29-column flat table.

![Tech Stack From Job Posts Scraper — Jobs & detected tech stack table view](https://raw.githubusercontent.com/coregentdevspace/tech-stack-from-job-posts-scraper-assets/main/tech-stack-from-job-posts-scraper-output-jobs-n-detected-tech-stack-table-view.png)

#### Output fields (29)

`source_input`, `source_platform`, `company_name`, `company_domain`, `job_id`, `job_title`, `department`, `location`, `remote_type`, `employment_type`, `posted_at`, `job_url`, `apply_url`, `description_text`, `detected_technologies`, `technology_categories`, `languages`, `frameworks`, `cloud_platforms`, `databases`, `devops_tools`, `data_ai_tools`, `business_tools`, `tech_signal_score`, `tech_signal_label`, `reason_tags`, `matched_user_keywords`, `raw_description_length`, `scraped_at`.

#### Sample record — Jobs & detected tech stack

(Real run output; `description_text` is truncated here for readability.)

```json
{
    "source_input": "https://jobs.ashbyhq.com/ramp",
    "source_platform": "ashby",
    "company_name": "Ramp",
    "company_domain": "ramp.com",
    "job_id": "41696f51-7b29-4e12-b528-46c2f6c4f5f7",
    "job_title": "Senior Data Scientist, Growth",
    "department": "Data",
    "location": "New York, NY (HQ)",
    "remote_type": "remote",
    "employment_type": "Full-time",
    "posted_at": "2026-02-02T14:46:45.488Z",
    "job_url": "https://jobs.ashbyhq.com/ramp/41696f51-7b29-4e12-b528-46c2f6c4f5f7",
    "apply_url": "https://jobs.ashbyhq.com/ramp/41696f51-7b29-4e12-b528-46c2f6c4f5f7/application",
    "description_text": "About Ramp Ramp is building the smart infrastructure for finance teams, embedded in the transaction flow of every dollar a business spends. We automate how over...",
    "detected_technologies": "Airflow, BigQuery, Dagster, dbt, Fivetran, Git, Looker, NumPy, Pandas, Prefect, Python, Redshift, scikit-learn, Snowflake, SQL",
    "technology_categories": "ai_ml, analytics, data, database, language, other",
    "languages": "Python, SQL",
    "frameworks": "",
    "cloud_platforms": "",
    "databases": "BigQuery, Redshift, Snowflake",
    "devops_tools": "",
    "data_ai_tools": "Airflow, Dagster, dbt, Fivetran, NumPy, Pandas, Prefect, scikit-learn",
    "business_tools": "Looker",
    "tech_signal_score": 90,
    "tech_signal_label": "high",
    "reason_tags": "multiple_technologies, data_stack, ai_ml_stack, crm_or_marketing_stack, user_keyword_match, engineering_role",
    "matched_user_keywords": "dbt, Snowflake",
    "raw_description_length": 6118,
    "scraped_at": "2026-06-05T00:50:33.198Z"
}
```
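Because every field is flat, multi-value columns such as `detected_technologies` arrive as comma-joined strings. The following is a minimal sketch of turning them back into lists, assuming the `", "` separator seen in the sample record above. The helper name `split_flat_field` is hypothetical and not part of the actor.

```python
from typing import List, Optional


def split_flat_field(value: Optional[str]) -> List[str]:
    """Split a comma-joined output field into a list of trimmed names."""
    if not value:
        return []
    return [part.strip() for part in value.split(",") if part.strip()]


# Values copied from the sample record above.
row = {
    "languages": "Python, SQL",
    "frameworks": "",
    "databases": "BigQuery, Redshift, Snowflake",
}
parsed = {key: split_flat_field(text) for key, text in row.items()}
print(parsed)
```

This assumes no technology name itself contains a comma, which holds for the sample record but is not guaranteed by the README.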

***

### 🎯 Tech signal score

Transparent rule-based score (0–100) computed from the detected technologies and the role title — no AI, no external enrichment.

| Signal                                                  |             Points |
| ------------------------------------------------------- | -----------------: |
| Each unique detected technology                         | +10 (capped at 50) |
| At least two technology categories detected             |                +10 |
| Cloud platform or DevOps tool detected                  |                +10 |
| Database / data / AI-ML tool detected                   |                +10 |
| Engineering / data / IT / security / product role title |                +10 |
| A user-supplied `technologyKeywords` term matched       |                +10 |

Score is capped at 100. **Labels**: `high` (60–100) · `medium` (30–59) · `low` (1–29) · `none` (0).

`reason_tags` explains the score — e.g. `multiple_technologies`, `cloud_mentioned`, `devops_stack`, `data_stack`, `ai_ml_stack`, `crm_or_marketing_stack`, `user_keyword_match`, `engineering_role`.
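The scoring table can be read as a small pure function. The sketch below is an illustration of the rules as documented here, not the actor's source code. The mapping of table rows to category names (`cloud`/`devops` and `database`/`data`/`ai_ml`) is an assumption based on the `technologyCategories` values, and the function name is hypothetical.

```python
from typing import Set, Tuple

# Assumption: these category names correspond to the two "tool detected" rows.
CLOUD_OR_DEVOPS = {"cloud", "devops"}
DATA_LIKE = {"database", "data", "ai_ml"}


def tech_signal_score(
    technologies: Set[str],
    categories: Set[str],
    engineering_role: bool,
    user_keyword_matched: bool,
) -> Tuple[int, str]:
    """Apply the documented point rules and return (score, label)."""
    score = min(10 * len(technologies), 50)  # +10 each, capped at 50
    if len(categories) >= 2:
        score += 10
    if categories & CLOUD_OR_DEVOPS:
        score += 10
    if categories & DATA_LIKE:
        score += 10
    if engineering_role:
        score += 10
    if user_keyword_matched:
        score += 10
    score = min(score, 100)

    if score >= 60:
        label = "high"
    elif score >= 30:
        label = "medium"
    elif score >= 1:
        label = "low"
    else:
        label = "none"
    return score, label


# Inputs taken from the sample record in the Output section.
sample_technologies = set(
    "Airflow, BigQuery, Dagster, dbt, Fivetran, Git, Looker, NumPy, Pandas, "
    "Prefect, Python, Redshift, scikit-learn, Snowflake, SQL".split(", ")
)
sample_categories = set("ai_ml, analytics, data, database, language, other".split(", "))
score, label = tech_signal_score(sample_technologies, sample_categories, True, True)
print(score, label)  # 90 high
```

Under these assumptions the sketch reproduces the sample record's score: 50 (technology cap) + 10 (two or more categories) + 10 (data tooling) + 10 (role title) + 10 (user keyword) = 90, with no cloud or DevOps points.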

***

### ⚙️ Filters

| Filter                   | Effect                                                                               |
| ------------------------ | ------------------------------------------------------------------------------------ |
| `keywordFilter`          | Case-insensitive substring on title + department + description.                      |
| `requireTechnologyMatch` | Keep only rows with at least one detected technology.                                |
| `technologyCategories`   | Keep rows with at least one detected technology in the selected categories.          |
| `locationFilter`         | Case-insensitive substring on location; jobs with no location are excluded when set. |
| `remoteFilter`           | `any` / `remote` / `hybrid` / `onsite` against the derived `remote_type`.            |
| `minTechSignalScore`     | Keep rows with `tech_signal_score` ≥ threshold.                                      |
| `dedupe`                 | Drop duplicates by platform job ID, canonical job URL, and title/company/location.   |

Filters are applied **before** any dataset push or event charge.
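As a rough illustration of how these filters combine, the sketch below applies them to one flat output row. It follows the table above, but the function name and the exact order of checks are assumptions, and `dedupe` is left out because it needs state across rows.

```python
def passes_filters(row: dict, filters: dict) -> bool:
    """Return True if a flat output row survives every configured filter."""
    keyword = (filters.get("keywordFilter") or "").lower()
    if keyword:
        haystack = " ".join(
            row.get(field) or ""
            for field in ("job_title", "department", "description_text")
        ).lower()
        if keyword not in haystack:
            return False

    location = (filters.get("locationFilter") or "").lower()
    # A row with no location never contains the filter text, so it is excluded.
    if location and location not in (row.get("location") or "").lower():
        return False

    remote = filters.get("remoteFilter", "any")
    if remote != "any" and row.get("remote_type") != remote:
        return False

    if row.get("tech_signal_score", 0) < filters.get("minTechSignalScore", 0):
        return False

    if filters.get("requireTechnologyMatch") and not row.get("detected_technologies"):
        return False

    wanted = set(filters.get("technologyCategories") or [])
    if wanted:
        found = {
            name.strip()
            for name in (row.get("technology_categories") or "").split(",")
            if name.strip()
        }
        if not wanted & found:
            return False
    return True


# Fields copied from the sample record in the Output section.
sample_row = {
    "job_title": "Senior Data Scientist, Growth",
    "department": "Data",
    "location": "New York, NY (HQ)",
    "remote_type": "remote",
    "detected_technologies": "Airflow, BigQuery, dbt, Snowflake",
    "technology_categories": "ai_ml, analytics, data, database, language, other",
    "tech_signal_score": 90,
}
print(passes_filters(sample_row, {"keywordFilter": "data scientist", "minTechSignalScore": 30}))
```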

***

### 💰 Pricing

**Pay-Per-Event**. One flat event per saved row (the per-event price is configured in Apify Console):

| Event        | Charged when                                                                                 |
| ------------ | -------------------------------------------------------------------------------------------- |
| `job-result` | Once per unique job row that passed all filters and was successfully written to the dataset. |

Your bill is simply `results_saved × price_per_event`. The actor honors the user-configured per-run spending cap (Apify `eventChargeLimitReached`): it caps how many results it collects up-front to what the limit can pay for, and stops cleanly the moment the cap is reached during charging.

Not charged: duplicates, rows filtered out, rows missing a title/durable identifier, and failed or blocked requests.
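The billing formula and the up-front cap are simple arithmetic. The sketch below assumes the listed "from" price of $1.80 per 1,000 `job-result` events; your actual per-event price may differ by plan. The function names are hypothetical.

```python
from decimal import Decimal

# Assumption: the "from $1.80 / 1,000 job-results" price shown in the listing.
PRICE_PER_EVENT = Decimal("1.80") / 1000


def estimate_cost(results_saved: int, price_per_event: Decimal = PRICE_PER_EVENT) -> Decimal:
    """results_saved x price_per_event, using exact decimal arithmetic."""
    return results_saved * price_per_event


def affordable_results(
    spend_cap: Decimal,
    max_results: int,
    price_per_event: Decimal = PRICE_PER_EVENT,
) -> int:
    """How many rows a spending cap can pay for, never above maxResults."""
    return min(max_results, int(spend_cap // price_per_event))


print(estimate_cost(200))                         # cost of 200 saved rows in USD
print(affordable_results(Decimal("1.00"), 1000))  # rows a $1.00 cap can cover
```

At the assumed price, 200 saved rows cost $0.36, and a $1.00 cap covers at most 555 rows.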

***

### 🚦 Proxy policy

Use **Apify Datacenter** proxy or **no proxy** for normal runs — both work reliably for the public ATS JSON APIs at this actor's conservative concurrency.

**Apify Residential proxy is not supported.** The actor will fail at startup if `proxyConfiguration.apifyProxyGroups` includes `RESIDENTIAL`. Reason: in pay-per-event actors, residential bandwidth is billed per GB to the developer, not the run user, so a single bandwidth-heavy run could exceed the per-result event revenue.

If you genuinely need residential routing, supply your own residential provider via the proxy editor's **Custom proxy URLs** field — that traffic goes through your provider, not Apify, and is unaffected:

```
http://user:pass@proxy.iproyal.com:12321
http://user:pass@proxy.brightdata.com:22225
http://user:pass@proxy.oxylabs.io:7777
```
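If you build inputs programmatically, you can catch the unsupported configuration before starting a run. This is a client-side sketch of the rule described above; the function name and error message are hypothetical and do not reflect the actor's own startup check.

```python
def validate_proxy_configuration(proxy_configuration: dict) -> None:
    """Raise ValueError if the input requests the Apify RESIDENTIAL group."""
    groups = proxy_configuration.get("apifyProxyGroups") or []
    if any(str(group).upper() == "RESIDENTIAL" for group in groups):
        raise ValueError(
            "Apify Residential proxy is not supported; "
            "use Datacenter, no proxy, or your own proxyUrls."
        )


# Accepted: default Apify Proxy, and a custom provider via proxyUrls.
validate_proxy_configuration({"useApifyProxy": True})
validate_proxy_configuration(
    {"useApifyProxy": False, "proxyUrls": ["http://user:pass@proxy.iproyal.com:12321"]}
)
```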

***

### 📊 Run summary

After each run, a `RUN_SUMMARY` entry is written to the key-value store:

```json
{
    "inputs_total": 3,
    "successful_inputs": 3,
    "failed_inputs": 0,
    "unsupported_inputs": 0,
    "raw_results_found": 420,
    "results_saved": 200,
    "duplicates_removed": 12,
    "filtered_out": 208,
    "charged_events": 200,
    "blocked_requests": 0,
    "retry_count": 1,
    "source_platform_counts": { "greenhouse": 120, "lever": 50, "ashby": 30 },
    "technology_counts": { "Python": 88, "AWS": 61, "Kubernetes": 44 },
    "category_counts": { "language": 180, "cloud": 110, "devops": 70 },
    "runtime_seconds": 41,
    "scraped_at": "2026-06-04T06:00:00.000Z"
}
```

`charged_events` equals the number of successfully saved unique rows.
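A short sanity check over the summary can flag unexpected billing or accounting. The first check follows directly from the statement above. The second (`raw_results_found` equals saved plus duplicates plus filtered) holds in the example summary but is only an assumption about other runs. The function name is hypothetical.

```python
def check_run_summary(summary: dict) -> dict:
    """Return named boolean checks over a RUN_SUMMARY-shaped dict."""
    accounted = (
        summary["results_saved"]
        + summary["duplicates_removed"]
        + summary["filtered_out"]
    )
    return {
        # Documented: one charged event per saved unique row.
        "charged_matches_saved": summary["charged_events"] == summary["results_saved"],
        # Assumption: observed in the example summary only.
        "rows_accounted_for": accounted == summary["raw_results_found"],
    }


# Numbers copied from the example RUN_SUMMARY above.
example_summary = {
    "raw_results_found": 420,
    "results_saved": 200,
    "duplicates_removed": 12,
    "filtered_out": 208,
    "charged_events": 200,
}
print(check_run_summary(example_summary))
```

With the official Python client, the record can typically be read via `client.key_value_store(run["defaultKeyValueStoreId"]).get_record("RUN_SUMMARY")`, whose `value` key holds the summary object.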

***

### 🚧 Limitations (V1)

- **Public data only**: no login, cookies, or member-only content. The actor reads each ATS's public JSON board API.
- **Supported sources**: Greenhouse, Lever, and Ashby via public APIs; other career pages only when they expose `schema.org/JobPosting` JSON-LD or embed one of those boards.
- **Technology detection** uses a transparent local dictionary — it detects common languages, frameworks, cloud platforms, databases, DevOps, data/AI, analytics, CRM, security, and mobile tooling, not every niche tool.
- **`company_domain`** is best-effort and is often `null` for ATS-hosted boards (no company website is exposed).
- **No** recruiter/contact extraction, email enrichment, company-site crawling, or AI scoring.
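To make the dictionary-based detection concrete, here is a minimal sketch of alias matching with word-boundary rules. The four dictionary entries, their aliases, and the function names are hypothetical; the actor's real dictionary and matching rules are not shown in this README.

```python
import re
from typing import Dict, List, Pattern

# Hypothetical mini-dictionary: canonical name -> lowercase aliases.
ALIASES: Dict[str, List[str]] = {
    "Kubernetes": ["kubernetes", "k8s"],
    "PostgreSQL": ["postgresql", "postgres"],
    "Java": ["java"],
    "C++": ["c++"],
}


def compile_alias(alias: str) -> Pattern[str]:
    # Lookarounds instead of \b so aliases ending in punctuation ("c++") still work.
    return re.compile(r"(?<!\w)" + re.escape(alias) + r"(?!\w)", re.IGNORECASE)


PATTERNS = {
    name: [compile_alias(alias) for alias in aliases]
    for name, aliases in ALIASES.items()
}


def detect_technologies(text: str) -> List[str]:
    """Return canonical names whose aliases appear as whole tokens in text."""
    found = [
        name
        for name, patterns in PATTERNS.items()
        if any(pattern.search(text) for pattern in patterns)
    ]
    return sorted(found, key=str.lower)


print(detect_technologies("We run K8s and Postgres; JavaScript and C++ are a plus."))
```

The boundary rule is what keeps `Java` from matching inside `JavaScript`, while the aliases map `K8s` and `Postgres` to their canonical names.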

***

### ❓ FAQ

**Do I need an account or cookies?** No. The actor only uses public ATS JSON endpoints.

**Which ATS platforms are supported?** Greenhouse, Lever, and Ashby via their public APIs. Generic career pages work when they expose JSON-LD `JobPosting` data or embed one of those boards.

**How are technologies detected?** A local, transparent keyword dictionary with aliases and word-boundary rules runs against the job title, department, and description. Add your own terms with `technologyKeywords`.

**Can I export to CSV?** Yes — every field is flat (no nested objects). Use Apify's CSV / Excel export, or call the dataset API with `format=csv`.
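For the CSV export mentioned in the last answer, the dataset items endpoint of the Apify API accepts a `format` query parameter. The sketch below only builds the URL; the dataset ID and token are placeholders, and the helper name is hypothetical.

```python
from urllib.parse import urlencode


def dataset_csv_url(dataset_id: str, token: str) -> str:
    """Build a dataset-items URL that asks the Apify API for CSV output."""
    query = urlencode({"format": "csv", "token": token})
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"


# Placeholders: use run["defaultDatasetId"] and your own API token.
print(dataset_csv_url("<DATASET_ID>", "<YOUR_API_TOKEN>"))
```

Fetching that URL with any HTTP client returns the rows as CSV, with one column per output field.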

***

### 🛠️ Technical notes

- **Stack**: Node.js 22 · Apify SDK 3 · Crawlee `CheerioCrawler` (HTTP + JSON) · native fetch. No browser.
- **Endpoints**: Greenhouse `boards-api`, Lever `v0/postings`, Ashby `posting-api/job-board` (all public, no auth).
- **Concurrency**: `min=1`, `max=5` (conservative; tune after real runs).
- **Memory**: 1 GB min · 2 GB default · 4 GB max.
- **Proxy**: Apify Proxy enabled by default; custom configs accepted; Apify Residential rejected at startup.
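The endpoint names above suggest a direct mapping from a board URL to a public JSON API URL. The sketch below shows one plausible mapping based on each platform's public job-board API. The exact hosts, paths, and query parameters the actor uses are assumptions, and the function name is hypothetical.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse

# Assumed URL templates, keyed by board host: (platform, API URL template).
API_TEMPLATES = {
    "boards.greenhouse.io": (
        "greenhouse",
        "https://boards-api.greenhouse.io/v1/boards/{slug}/jobs?content=true",
    ),
    "jobs.lever.co": ("lever", "https://api.lever.co/v0/postings/{slug}?mode=json"),
    "jobs.ashbyhq.com": ("ashby", "https://api.ashbyhq.com/posting-api/job-board/{slug}"),
}


def board_api_url(start_url: str) -> Optional[Tuple[str, str]]:
    """Map a board URL to (platform, api_url); None for anything else."""
    parsed = urlparse(start_url)
    template = API_TEMPLATES.get(parsed.netloc.lower())
    slug = parsed.path.strip("/").split("/")[0]
    if template is None or not slug:
        return None
    platform, api_url = template
    return platform, api_url.format(slug=slug)


for url in (
    "https://jobs.lever.co/mistral",
    "https://boards.greenhouse.io/gitlab",
    "https://jobs.ashbyhq.com/ramp",
    "https://example.com/careers",
):
    print(url, "->", board_api_url(url))
```

URLs that return `None` here correspond to the generic career-page path, where the actor falls back to JSON-LD or embedded boards.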

# Actor input Schema

## `startUrls` (type: `array`):

Public job, company career, or ATS board URLs to process (Greenhouse, Lever, Ashby, or a public career page). Example: https://jobs.lever.co/example or https://boards.greenhouse.io/example. No login, cookies, or tokens.

## `maxResults` (type: `integer`):

Maximum number of saved unique job rows across the whole run (not per URL). Range 1-10000.

## `technologyKeywords` (type: `array`):

Optional extra technology terms to detect (and optionally require). Added to the built-in dictionary. Example: Snowflake, dbt, Kubernetes. Max 100 terms.

## `requireTechnologyMatch` (type: `boolean`):

Keep only jobs where at least one technology (built-in or user-supplied) was detected. Rows with no detected technology are filtered out and not charged.

## `technologyCategories` (type: `array`):

Keep only jobs with at least one detected technology in these categories. Leave empty for all.

## `keywordFilter` (type: `string`):

Optional text filter applied to job title + department + description (case-insensitive). Leave empty for no keyword filter. Max 200 chars.

## `locationFilter` (type: `string`):

Optional location filter applied to the job's location text (case-insensitive). Jobs with no location are excluded when this is set. Max 100 chars.

## `remoteFilter` (type: `string`):

Keep only jobs of this work mode (derived from title/location/description). 'Any' keeps all.

## `minTechSignalScore` (type: `integer`):

Keep only jobs whose `tech_signal_score` is greater than or equal to this threshold (0-100). 0 keeps all.

## `includeDescriptionText` (type: `boolean`):

Save the full visible job description in `description_text`. Detection always uses the description internally; turning this off only reduces dataset size.

## `dedupe` (type: `boolean`):

Remove duplicate job posts by platform job ID, canonical job URL, and title/company/location so you are not charged for duplicates.

## `proxyConfiguration` (type: `object`):

Apify Proxy configuration. Defaults to Apify Proxy enabled. Apify Residential is NOT supported and will fail the run at startup; if you need residential routing, supply your own provider via Custom proxy URLs (proxyUrls).

## Actor input object example

```json
{
  "startUrls": [
    "https://jobs.lever.co/mistral",
    "https://boards.greenhouse.io/gitlab",
    "https://jobs.ashbyhq.com/ramp"
  ],
  "maxResults": 100,
  "technologyKeywords": [
    "Snowflake",
    "dbt",
    "Kubernetes"
  ],
  "requireTechnologyMatch": false,
  "technologyCategories": [],
  "keywordFilter": "",
  "locationFilter": "",
  "remoteFilter": "any",
  "minTechSignalScore": 0,
  "includeDescriptionText": true,
  "dedupe": true,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
```

# Actor output Schema

## `overview` (type: `string`):

Flat 29-field table of every job row pushed to the dataset, including provenance, company/job fields, the detected technologies and category columns, and the tech-signal score.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        "https://jobs.lever.co/mistral",
        "https://boards.greenhouse.io/gitlab",
        "https://jobs.ashbyhq.com/ramp"
    ],
    "technologyKeywords": [
        "Snowflake",
        "dbt",
        "Kubernetes"
    ],
    "proxyConfiguration": {
        "useApifyProxy": true
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("coregent/tech-stack-from-job-posts-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": [
        "https://jobs.lever.co/mistral",
        "https://boards.greenhouse.io/gitlab",
        "https://jobs.ashbyhq.com/ramp",
    ],
    "technologyKeywords": [
        "Snowflake",
        "dbt",
        "Kubernetes",
    ],
    "proxyConfiguration": { "useApifyProxy": True },
}

# Run the Actor and wait for it to finish
run = client.actor("coregent/tech-stack-from-job-posts-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    "https://jobs.lever.co/mistral",
    "https://boards.greenhouse.io/gitlab",
    "https://jobs.ashbyhq.com/ramp"
  ],
  "technologyKeywords": [
    "Snowflake",
    "dbt",
    "Kubernetes"
  ],
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}' |
apify call coregent/tech-stack-from-job-posts-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=coregent/tech-stack-from-job-posts-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Tech Stack From Job Posts Scraper",
        "description": "Extract public job posts from Greenhouse, Lever, Ashby, and public career pages and detect the technologies, tools, and cloud platforms companies are hiring for - no login or cookies.",
        "version": "1.0",
        "x-build-id": "705MSbe1HT0yslclx"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/coregent~tech-stack-from-job-posts-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-coregent-tech-stack-from-job-posts-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/coregent~tech-stack-from-job-posts-scraper/runs": {
            "post": {
                "operationId": "runs-sync-coregent-tech-stack-from-job-posts-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/coregent~tech-stack-from-job-posts-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-coregent-tech-stack-from-job-posts-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "startUrls": {
                        "title": "Start URLs",
                        "type": "array",
                        "description": "Public job, company career, or ATS board URLs to process (Greenhouse, Lever, Ashby, or a public career page). Example: https://jobs.lever.co/example or https://boards.greenhouse.io/example. No login, cookies, or tokens.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxResults": {
                        "title": "Max results",
                        "minimum": 1,
                        "maximum": 10000,
                        "type": "integer",
                        "description": "Maximum number of saved unique job rows across the whole run (not per URL). Range 1-10000.",
                        "default": 100
                    },
                    "technologyKeywords": {
                        "title": "Technology keywords",
                        "type": "array",
                        "description": "Optional extra technology terms to detect (and optionally require). Added to the built-in dictionary. Example: Snowflake, dbt, Kubernetes. Max 100 terms.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "requireTechnologyMatch": {
                        "title": "Require a technology match",
                        "type": "boolean",
                        "description": "Keep only jobs where at least one technology (built-in or user-supplied) was detected. Rows with no detected technology are filtered out and not charged.",
                        "default": false
                    },
                    "technologyCategories": {
                        "title": "Technology categories",
                        "type": "array",
                        "description": "Keep only jobs with at least one detected technology in these categories. Leave empty for all.",
                        "items": {
                            "type": "string",
                            "enum": [
                                "language",
                                "framework",
                                "cloud",
                                "database",
                                "data",
                                "devops",
                                "ai_ml",
                                "analytics",
                                "crm",
                                "security",
                                "mobile",
                                "other"
                            ],
                            "enumTitles": [
                                "Language",
                                "Framework",
                                "Cloud",
                                "Database",
                                "Data",
                                "DevOps",
                                "AI / ML",
                                "Analytics",
                                "CRM",
                                "Security",
                                "Mobile",
                                "Other"
                            ]
                        },
                        "default": []
                    },
                    "keywordFilter": {
                        "title": "Keyword filter",
                        "type": "string",
                        "description": "Optional text filter applied to job title + department + description (case-insensitive). Leave empty for no keyword filter. Max 200 chars.",
                        "default": ""
                    },
                    "locationFilter": {
                        "title": "Location filter",
                        "type": "string",
                        "description": "Optional location filter applied to the job's location text (case-insensitive). Jobs with no location are excluded when this is set. Max 100 chars.",
                        "default": ""
                    },
                    "remoteFilter": {
                        "title": "Remote type filter",
                        "enum": [
                            "any",
                            "remote",
                            "hybrid",
                            "onsite"
                        ],
                        "type": "string",
                        "description": "Keep only jobs of this work mode (derived from title/location/description). 'Any' keeps all.",
                        "default": "any"
                    },
                    "minTechSignalScore": {
                        "title": "Minimum tech signal score",
                        "minimum": 0,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Keep only jobs whose tech_signal_score is greater than or equal to this threshold (0-100). 0 keeps all.",
                        "default": 0
                    },
                    "includeDescriptionText": {
                        "title": "Include description text",
                        "type": "boolean",
                        "description": "Save the full visible job description in description_text. Detection always uses the description internally; turning this off only reduces dataset size.",
                        "default": true
                    },
                    "dedupe": {
                        "title": "Deduplicate jobs",
                        "type": "boolean",
                        "description": "Remove duplicate job posts by platform job ID, canonical job URL, and title/company/location so you are not charged for duplicates.",
                        "default": true
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Apify Proxy configuration. Defaults to Apify Proxy enabled. Apify Residential is NOT supported and will fail the run at startup; if you need residential routing, supply your own provider via Custom proxy URLs (proxyUrls).",
                        "default": {
                            "useApifyProxy": true
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
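
The filter fields near the end of the input schema above carry constraints (an enum for `remoteFilter`, a 0-100 range for `minTechSignalScore`, and no Apify Residential proxy) that can be checked locally before a run starts. The following is a minimal sketch of such a check, not part of the Actor itself. The helper name `build_filter_options` is hypothetical, and detecting residential routing through an `apifyProxyGroups` list containing `RESIDENTIAL` is an assumption based on the usual Apify proxy input shape. The returned dictionary covers only the fields shown in this part of the schema, so it would need to be merged with the Actor's other input fields.

```python
from typing import Any, Dict, Optional

# Allowed values copied from the "remoteFilter" enum in the schema above.
REMOTE_FILTER_VALUES = {"any", "remote", "hybrid", "onsite"}


def build_filter_options(
    remote_filter: str = "any",
    min_tech_signal_score: int = 0,
    include_description_text: bool = True,
    dedupe: bool = True,
    proxy_configuration: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: return a partial Actor input or raise ValueError."""
    if remote_filter not in REMOTE_FILTER_VALUES:
        raise ValueError(f"remoteFilter must be one of {sorted(REMOTE_FILTER_VALUES)}")

    # The schema declares an integer between 0 and 100 (bool is rejected explicitly,
    # because bool is a subclass of int in Python).
    if isinstance(min_tech_signal_score, bool) or not isinstance(min_tech_signal_score, int):
        raise ValueError("minTechSignalScore must be an integer")
    if not 0 <= min_tech_signal_score <= 100:
        raise ValueError("minTechSignalScore must be between 0 and 100")

    # Schema default: Apify Proxy enabled.
    proxy = {"useApifyProxy": True} if proxy_configuration is None else dict(proxy_configuration)

    # Assumption: residential routing is requested via apifyProxyGroups.
    groups = proxy.get("apifyProxyGroups") or []
    if any(str(group).upper() == "RESIDENTIAL" for group in groups):
        raise ValueError("Apify Residential proxy is not supported by this Actor")

    return {
        "remoteFilter": remote_filter,
        "minTechSignalScore": min_tech_signal_score,
        "includeDescriptionText": include_description_text,
        "dedupe": dedupe,
        "proxyConfiguration": proxy,
    }
```

Calling `build_filter_options(remote_filter="remote", min_tech_signal_score=40)` yields a dictionary that can be merged into the run input passed to the API client. Failing early on an unsupported value avoids starting a run that the Actor would reject at startup.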
