# Indeed Hiring Intelligence (`ryanclinton/indeed-hiring-intelligence`) Actor

Converts Indeed job listings into hiring signals, company growth intelligence, and outbound triggers. Detects engineering-expansion, executive-hiring, and geo-expansion before the market notices. Country-verified, salary-parsed, and PPE-billed per decision, not per scraped row.

- **URL**: https://apify.com/ryanclinton/indeed-hiring-intelligence.md
- **Developed by:** [Ryan Clinton](https://apify.com/ryanclinton) (community)
- **Categories:** Jobs, Lead generation
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is billed by platform usage. The Actor itself is free to use; you pay only for Apify platform usage, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Indeed Hiring Intelligence

![Indeed Hiring Intelligence — growth signals, not just jobs. $0.02 per signal, $0.05 per company](https://apifyforge.com/readme-assets/ryanclinton-indeed-hiring-intelligence/hero.png)

Turn Indeed job listings into growth signals, outbound triggers, and company intelligence. Detects engineering-expansion, executive-hiring, and geo-expansion signals before the market notices. Not an Indeed scraper — a hiring intelligence layer built on top of one.

Most Indeed actors extract jobs. This actor detects organizational growth.

### What are hiring signals?

Hiring signals are detectable patterns in company job postings that indicate:

- organizational expansion (engineering-expansion)
- GTM scaling (sales-expansion)
- geographic growth (geo-expansion)
- leadership restructuring (executive-hiring)
- platform-vendor evaluation cycles (engineering-expansion plus first SRE role)
- compliance buildout in legal, security, and privacy hiring (compliance-buildout)

This actor converts Indeed job listings into structured signals scored by confidence, freshness, and actionability window. Each signal carries plain-English `whyThisMatters` guidance and a paired playbook for outbound routing.

![Intelligence stack — 8-layer pipeline from Indeed listing to outbound playbook](https://apifyforge.com/readme-assets/ryanclinton-indeed-hiring-intelligence/intelligence-layers.png)

### How this actor differs from existing Indeed scrapers

| Existing Indeed actors | This actor |
|---|---|
| Extract jobs | Detect growth |
| Return rows | Return signals |
| Stateless runs | Temporal intelligence (firstSeen, decay, delta) |
| Best-effort scraping | Verified extraction integrity (TrustLayer) |
| Commodity data | Sales intelligence |
| Manual analysis required | Opinionated actions (recommendedAction + playbooks) |
| Charged per row | Charged per decision |

### Who uses this

- **SDR agencies** detect companies entering buying cycles before competitors notice
- **Recruiters** identify hard-to-fill roles via `repostDetected` and expanding teams via `hiringMomentum`
- **PE and VC firms** monitor portfolio-company growth and geographic expansion
- **HR tech vendors** consume the signal stream as buyer-intent data
- **Competitive intelligence teams** track per-competitor hiring momentum across scheduled runs
- **AI sales tooling** (Clay, Apollo, Common Room patterns) map signals and playbooks cleanly to enrichment columns

### Best for

- SDR agencies doing trigger-based prospecting on engineering-led companies
- recruiters targeting scaling engineering teams and hard-to-fill roles
- PE and VC firms monitoring portfolio-company growth and expansion vectors
- HR tech platforms consuming hiring-intent data as event streams
- AI sales tooling (Clay, Apollo, Common Room) using signals as enrichment columns
- competitive intelligence teams tracking per-competitor hiring momentum

### Use this actor when you need

- hiring intent signals from Indeed for B2B outbound
- company-level hiring aggregation, not per-job rows
- recruiter-pain detection via repost flags and days-open tracking
- outbound sales triggers from organizational expansion patterns
- engineering-expansion monitoring across a target list
- geo-expansion alerts when companies enter new countries
- hiring momentum scoring with confidence ranking
- labor market telemetry sourced from primary public data

### Questions this actor can answer

- Which companies are rapidly expanding engineering teams?
- Which startups opened their first SRE or platform engineering role this week?
- Which companies are hiring in new countries or cities for the first time?
- Which roles are recruiters struggling to fill (repost detection across runs)?
- Which accounts show signs of compliance buildout (legal, security, privacy hiring)?
- Which companies increased hiring velocity in the last 30 days?
- Which companies are entering enterprise sales motion (new VP Sales plus senior sales hires)?
- Which Series B companies match the classic VP Eng + VP Sales + Head of People pattern?
- Which companies should my SDRs call tomorrow morning?

### Compared to traditional Indeed scrapers

Traditional Indeed scrapers focus on:

- extracting raw job rows
- maximizing result count per query
- low-cost bulk collection
- per-row pricing

This actor focuses on:

- hiring signals derived from organizational patterns
- per-company aggregation with momentum scoring
- temporal hiring intelligence across runs (firstSeenAt, daysOpen, repostDetected)
- outbound workflows with playbook routing
- extraction confidence scoring (TrustLayer)
- recruiter-pain detection
- charged per detected decision, not per scraped row

![Sample output — company, signal type, severity, confidence, freshness, paired playbook](https://apifyforge.com/readme-assets/ryanclinton-indeed-hiring-intelligence/output-table.png)

### Hiring signals, company intelligence, and Indeed job data extracted

Four record types in one dataset, discriminated by `recordType`:

1. **Signal** — the product. Each signal is a detected pattern across a company's hiring. Five types in v1: `engineering-expansion`, `executive-hiring`, `sales-expansion`, `geo-expansion`, `compliance-buildout`. Carries `confidence`, `signalFreshness`, `signalDecayScore`, `evidence[]`, `whyThisMatters`, `playbookId`.

2. **Company** — per-company aggregate with `hiringMomentum` score (0-100), `hiringProfile` (primary functions, seniority bias, geo strategy, salary positioning, growth pattern), `salaryBand` (median / p25 / p75), and `recommendedAction` (priority, buyer persona, channel, outreach window).

3. **Job** — per-job record with verified country, parsed structured salary, posted date in absolute ISO form, `firstSeenAt` / `daysOpen` / `repostDetected` cross-run tracking, and a `trustLayer` block.

4. **Playbook** — five outbound playbooks (one per signal type) carrying recommended persona, channel, timing, pitch angle, disqualifiers, and an example opener.

#### Sample signal record

```json
{
  "recordType": "signal",
  "company": "Acme Corp",
  "signalType": "engineering-expansion",
  "severity": "high",
  "confidence": 0.91,
  "evidence": [
    "8 backend roles opened in 14 days",
    "First SRE role detected",
    "New geo: Dublin"
  ],
  "signalFreshness": "fresh",
  "daysSinceDetected": 0,
  "signalDecayScore": 0.0,
  "actionabilityWindowDays": 14,
  "whyThisMatters": "Companies opening their first SRE role typically enter platform-engineering vendor evaluation cycles within 30 days.",
  "playbookId": "devtools-expansion-vp-eng"
}
````

#### Sample company intelligence record

```json
{
  "recordType": "company",
  "company": "Acme Corp",
  "openRolesInRun": 12,
  "roleMix": { "engineering": 7, "sales": 2, "product": 1, "ops": 2 },
  "salaryBand": { "median": 75000, "p25": 62000, "p75": 95000, "currency": "GBP", "sampleSize": 8 },
  "hiringMomentum": {
    "tier": "P1-HIGH-GROWTH",
    "score": 78,
    "scoreVelocity": "+12",
    "signals": ["12 active postings", "5 new postings in last 7 days", "3 senior-plus engineering roles open simultaneously"]
  },
  "lifecycleStage": "scaling",
  "hiringProfile": {
    "primaryFunctions": ["engineering", "data"],
    "seniorityBias": "senior-heavy",
    "geoStrategy": "distributed-europe",
    "salaryPositioning": "premium",
    "growthPattern": "aggressive-expansion"
  },
  "recommendedAction": {
    "priority": "P1",
    "reason": "Rapid engineering expansion (5 new roles in 7 days) plus premium compensation bands.",
    "idealBuyerPersona": "VP Engineering",
    "recommendedChannel": "linkedin",
    "outreachWindowDays": 10
  }
}
```

![Charged per decision, detects growth not jobs, temporal intelligence, TrustLayer verified](https://apifyforge.com/readme-assets/ryanclinton-indeed-hiring-intelligence/feature-callouts.png)

### Reliability features

- Country-filter verification at extraction time
- Structured salary parsing with confidence scoring (12+ Indeed salary formats)
- Confidence-scored extraction per job (TrustLayer, 0 to 1)
- Hard runtime budget with Apify-deadline auto-clamp
- Partial-run truncation handling (`truncated` and `truncatedReason` on the SUMMARY record)
- Zero-result billing protection — empty queries are never charged
- Low-confidence noise gating — jobs below `extractionConfidence` 0.5 are never charged
- Stable cross-run state via KV store (firstSeenAt persistence per Indeed jobId)
- Cheerio crawler with automatic Playwright fallback on 403 or zero-card pages
- Residential proxy default for anti-bot evasion

### Extraction integrity: TrustLayer confidence scoring

Every job record carries an `extractionConfidence` 0-1 score and a `verification` flags object. The score is computed deterministically from six components:

- Required fields present (title + company + location + url): 0.30
- Salary parsed to structured form: 0.20
- Description length plausible: 0.15
- Country axis matches input filter: 0.20
- Posted date parsed to absolute ISO: 0.10
- Job ID stable: 0.05

Jobs scoring below 0.5 are not charged. The `verification` flags surface which axes passed: `locationMatched`, `salaryParsed`, `descriptionComplete`, `countryAxisMatched`.
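
The weighted sum above can be sketched in a few lines. This is a minimal illustration, not the actor's source: the weights and the 0.5 charge threshold come from the list above, while the check names (`required_fields_present` and so on) and both function names are hypothetical.

```python
# Weights copied from the six components listed above.
# The key names are illustrative; the actor's internal names are not documented here.
WEIGHTS = {
    "required_fields_present": 0.30,  # title + company + location + url
    "salary_parsed": 0.20,
    "description_plausible": 0.15,
    "country_axis_matched": 0.20,
    "posted_date_parsed": 0.10,
    "job_id_stable": 0.05,
}
CHARGE_THRESHOLD = 0.5  # jobs scoring below this are not charged


def extraction_confidence(checks: dict[str, bool]) -> float:
    """Sum the weights of the checks that passed; missing checks count as failed."""
    score = sum((w for name, w in WEIGHTS.items() if checks.get(name, False)), 0.0)
    return round(score, 2)


def is_chargeable(checks: dict[str, bool]) -> bool:
    """A job is chargeable when its score reaches the threshold."""
    return extraction_confidence(checks) >= CHARGE_THRESHOLD


if __name__ == "__main__":
    card = {"required_fields_present": True, "country_axis_matched": True}
    print(extraction_confidence(card), is_chargeable(card))
```

Because the score is a plain sum of fixed weights, the same inputs always give the same score, which is what makes the charge gate predictable.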

Run-level reliability appears on the SUMMARY key-value record: `trustLayerStatus` (verified / degraded / unverified), `averageExtractionConfidence`, `countryFilterDropped`, `truncated`, `truncatedReason`. Downstream consumers can branch on `WHERE summary.truncated = true` to detect partial runs.

### Verified filters: country, date, and salary type

Three filters the dominant Indeed scraper does not offer:

- **Country verified at extraction time.** Every record is dropped if the parsed location does not match the input country. The run summary surfaces the `countryFilterDropped` count and `countryFilterIntegrity`.
- **Posted within N days.** Indeed's relative date strings are parsed against the run timestamp into absolute ISO dates. Records older than the window are dropped.
- **Salary type filter.** Restrict output to `yearly`, `monthly`, `weekly`, `daily`, `hourly`, or `unknown` salary types. Defaults to including all.
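
To make the posted-date filter concrete, here is a rough sketch of parsing relative date strings against a run timestamp. It handles only the three forms quoted in the input schema ('Today', '3 days ago', '30+ days ago'). The function names are hypothetical, and two behaviours are assumptions: treating '30+' as exactly 30 days, and dropping strings the sketch cannot parse.

```python
import re
from datetime import date, datetime, timezone, timedelta
from typing import Optional


def parse_relative_posted(text: str, run_ts: datetime) -> Optional[str]:
    """Convert a relative posted-date string into an absolute ISO date (YYYY-MM-DD).

    Returns None for strings this sketch does not recognise.
    """
    cleaned = text.strip().lower()
    if cleaned == "today":
        days = 0
    else:
        match = re.fullmatch(r"(\d+)\+? days? ago", cleaned)
        if match is None:
            return None
        # Assumption: "30+ days ago" is treated as exactly 30 days (a lower bound on age).
        days = int(match.group(1))
    return (run_ts - timedelta(days=days)).date().isoformat()


def within_window(text: str, run_ts: datetime, posted_within_days: int) -> bool:
    """Apply a postedWithinDays-style filter; 0 disables it."""
    if posted_within_days == 0:
        return True
    iso = parse_relative_posted(text, run_ts)
    if iso is None:
        return False  # assumption: unparseable dates are dropped in this sketch
    age_days = (run_ts.date() - date.fromisoformat(iso)).days
    return age_days <= posted_within_days


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(parse_relative_posted("3 days ago", now), within_window("30+ days ago", now, 14))
```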

### Hard runtime budget — no runaway Indeed crawls

`maxRuntimeSeconds` is a hard kill switch. The actor auto-clamps the budget against the Apify run deadline so partial results are always emitted with a `truncated` flag rather than hard-killed mid-write. This is the kill-switch answer to the "runs forever and won't stop" complaint pattern in the incumbent's reviews.
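
One way to picture the auto-clamp is as taking the smaller of the requested budget and the time left before the platform deadline. The sketch below assumes that shape. `SAFETY_MARGIN_SECONDS` and the function name are hypothetical, and how the actor actually reads the run deadline is not described here.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical reserve so partial results can be flushed before the deadline.
SAFETY_MARGIN_SECONDS = 30


def effective_budget_seconds(max_runtime_seconds: int, now: datetime, run_deadline: datetime) -> int:
    """Clamp the requested budget to the time remaining before the run deadline."""
    remaining = int((run_deadline - now).total_seconds()) - SAFETY_MARGIN_SECONDS
    return max(0, min(max_runtime_seconds, remaining))


if __name__ == "__main__":
    start = datetime.now(timezone.utc)
    # A 1800 s budget on a run that the platform will stop after 600 s.
    print(effective_budget_seconds(1800, start, start + timedelta(seconds=600)))
```

Stopping at the clamped budget rather than at the platform deadline is what leaves time to write the `truncated` flag.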

### Use cases: lead generation, recruiting, and competitive intelligence

- **SDR agencies and AI sales tooling** (Clay, Apollo, Common Room patterns) — trigger-based prospecting via the signal record stream. Each signal carries an evidence array and a playbook ID for routing.
- **Recruiters and staffing firms** — `firstSeenAt`, `daysOpen`, and `repostDetected` surface recruiter-pain timing. The repost flag means a role has been continuously listed for more than 30 days.
- **PE and VC firms** — `hiringProfile.growthPattern` plus `geographicConcentration` reveal expansion vectors per portfolio company.
- **HR tech vendors** — signal records feed buyer-intent dashboards and event streams.
- **Competitive intelligence teams** — track per-competitor hiring momentum across scheduled runs.

### Pricing for hiring signals, company intelligence, and Indeed job extraction

Pay-per-event. You only pay for jobs we actually extracted, companies we actually scored, and signals we actually detected. Never for zero-result queries or low-confidence noise.

| Event | Price | When charged |
|---|---|---|
| `job-extracted` | $0.005 | Per job pushed; skipped when `trustLayer.extractionConfidence < 0.5` |
| `company-intelligence` | $0.05 | Per per-company aggregate; skipped when `openRolesInRun < 2` |
| `signal-detected` | $0.02 | Per signal emitted; skipped when `confidence < 0.6` or `signalDecayScore >= 1.0` |

Apify platform compute and proxy are billed separately by Apify.

The seven charge-gating rules are encoded in `src/pricing.ts`:

1. Never charge `job-extracted` for jobs with `extractionConfidence < 0.5`.
2. Never charge `job-extracted` for queries that returned zero records.
3. Never charge any tier if the runtime budget was hit AND under 10 percent of `maxResultsPerQuery` was returned across all queries.
4. Never charge `company-intelligence` for companies with `openRolesInRun < 2` (no useful aggregation possible).
5. Never charge `signal-detected` for signals with `confidence < 0.6`.
6. Never charge `signal-detected` for signals with `signalDecayScore >= 1.0` (aged out of actionability window).
7. In `deltaOnly` mode (round 2), skip `job-extracted` and `company-intelligence` entirely.
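
For budgeting, the per-record rules can be approximated client-side. The sketch below is not the contents of `src/pricing.ts`. It models rules 1 and 4 to 7 using the prices in the table above, and leaves out rule 3, which needs run-level data. Rule 2 holds trivially because a zero-result query produces no records. Function names are hypothetical, and the record shapes follow the sample records in this README.

```python
# Prices from the table above, in thousandths of a dollar to avoid float drift.
PRICES_MILLI_USD = {"job-extracted": 5, "company-intelligence": 50, "signal-detected": 20}
EVENT_FOR = {"job": "job-extracted", "company": "company-intelligence", "signal": "signal-detected"}


def is_charged(record: dict, delta_only: bool = False) -> bool:
    """Apply the per-record gating rules to one dataset record."""
    kind = record.get("recordType")
    if kind == "job":
        if delta_only:  # rule 7
            return False
        return record.get("trustLayer", {}).get("extractionConfidence", 0.0) >= 0.5  # rule 1
    if kind == "company":
        if delta_only:  # rule 7
            return False
        return record.get("openRolesInRun", 0) >= 2  # rule 4
    if kind == "signal":
        return (
            record.get("confidence", 0.0) >= 0.6  # rule 5
            and record.get("signalDecayScore", 0.0) < 1.0  # rule 6
        )
    return False  # playbooks and other record types are not priced in the table


def estimate_ppe_usd(records: list[dict], delta_only: bool = False) -> float:
    """Estimate pay-per-event charges for a list of dataset records."""
    total = sum(
        PRICES_MILLI_USD[EVENT_FOR[r["recordType"]]]
        for r in records
        if is_charged(r, delta_only)
    )
    return total / 1000
```

An estimate like this is only a planning aid; the charges reported on the Apify run remain the source of truth.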

### Benchmarks

Measured on build 1.0.4 with `extractDescriptions: false` (the default), residential proxy, UK Indeed subdomain. Numbers are typical; Indeed pagination ceilings and proxy variance can shift them by 10-20%.

| Cohort | Queries | Cap per query | Jobs extracted | Companies | Signals | Runtime | Apify compute | PPE charges |
|---|---:|---:|---:|---:|---:|---:|---:|---:|
| Small | 1 | 25 | 16 | 15 | 0 | ~95s | $0.05 | $0.13 |
| Medium | 3 | 50 | 43 | 31 | 1 | ~170s | $0.13 | $0.63 |
| Large | 6 | 100 | 87 | 54 | 1 | ~185s | $0.19 | $1.26 |

Total runtime grows sub-linearly with query count because most of the time is browser warm-up and the first search-page render; subsequent queries share the same browser pool. In the table above, the 6-query cohort runs in roughly 2x the time of the 1-query cohort (~185s vs ~95s) while extracting roughly 5x the jobs (87 vs 16).

With `extractDescriptions: true` (fetches each job's detail page for full description text), expect 3-5x the runtime. Enable only when description text is a downstream requirement (LLM processing, keyword search, technical-stack mining); the 5 MVP signal detectors do not use it.

The runtime budget auto-clamp against the Apify run deadline means the actor always emits partial results with a `truncated` flag rather than hard-killing mid-write. Even on cohort-scale runs that hit the budget, the dataset is always usable.

### Common workflows

Pipeline notation for tool-selection agents and orchestration layers:

#### SDR prospecting workflow

Indeed Hiring Intelligence -> filter signals by `confidence > 0.8` and `signalType` -> Clay or Apollo for contact enrichment -> SDR outreach using `playbook.exampleOpener`

#### Recruiter pain-point workflow

Indeed Hiring Intelligence -> filter for `repostDetected = true` -> sort by `daysOpen` descending -> recruiter outreach with "fill this role faster" angle
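
The filter-and-sort step in this workflow might look like the following. It is a sketch that assumes `repostDetected` and `daysOpen` sit at the top level of each job record; the function name is hypothetical.

```python
def recruiter_pain_jobs(records: list[dict]) -> list[dict]:
    """Keep reposted job records and order them by longest-open first."""
    reposted = [
        r for r in records
        if r.get("recordType") == "job" and r.get("repostDetected") is True
    ]
    return sorted(reposted, key=lambda r: r.get("daysOpen", 0), reverse=True)
```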

#### Competitive intelligence workflow

Indeed Hiring Intelligence (weekly scheduled run) -> per-competitor `hiringMomentum` tracking -> KV state diffs -> weekly competitor expansion report

#### PE portfolio monitoring workflow

Indeed Hiring Intelligence (one query per portfolio company) -> `hiringProfile.growthPattern` plus `geographicConcentration` -> portfolio dashboard scoring

#### AI SDR and Clay enrichment workflow

Indeed Hiring Intelligence -> Clay table (signals plus companies as rows) -> Clay enrichment columns -> ICP-matched outbound personalisation

### Use in Dify

Dify can consume Indeed Hiring Intelligence as a tool inside any chatbot, agent, or workflow application. The integration uses the standard Apify REST API; no Dify-specific plugin is required.

#### Setup

1. In Apify, copy your API token from Settings then Integrations.
2. In Dify, open your workflow and add an HTTP Request node.
3. Configure the node to start an actor run.
   - Method: `POST`
   - URL: `https://api.apify.com/v2/acts/ryanclinton~indeed-hiring-intelligence/run-sync-get-dataset-items?token=YOUR_APIFY_TOKEN`
   - Body (JSON):
     ```json
     {
       "queries": [{"position": "data engineer", "location": "London"}],
       "country": "UK",
       "postedWithinDays": 14,
       "outputTier": "intelligence"
     }
     ```
   - The `run-sync-get-dataset-items` endpoint blocks until the run completes and returns the dataset directly, which fits Dify's synchronous node flow.
4. Add a Code or Variable Aggregator node that filters the dataset to `recordType = 'signal'` and the desired `confidence` threshold.
5. Pass the filtered signals into an LLM node for natural-language follow-up.

#### Filter by record type before the LLM

The dataset emits four record types in one stream (`signal`, `company`, `job`, `playbook`). Filter before passing into the LLM so the prompt context stays lean and tokens stay cheap:

- `recordType = 'signal'` for trigger conditions and outbound timing
- `recordType = 'company'` for account-level prioritisation with `hiringMomentum.score`
- `recordType = 'playbook'` for messaging templates per signal type
- `recordType = 'job'` only when the agent needs raw posting URLs to cite
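
A filter of this kind could be written as a small Python function, for example inside a Dify Code node. The sketch assumes the node calls a `main` function and exposes the returned dict as output variables; `items` and `min_confidence` are hypothetical input names, and the strict `> 0.8` cut-off mirrors the threshold used in the workflows above.

```python
def main(items: list, min_confidence: float = 0.8) -> dict:
    """Split the mixed dataset into the three arrays the LLM prompt needs."""
    signals = [
        r for r in items
        if r.get("recordType") == "signal" and r.get("confidence", 0.0) > min_confidence
    ]
    companies = [r for r in items if r.get("recordType") == "company"]
    playbooks = [r for r in items if r.get("recordType") == "playbook"]
    # Job records are left out on purpose to keep the prompt context small.
    return {"signals": signals, "companies": companies, "playbooks": playbooks}
```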

#### Example agent prompt

After filtering signals into a `signals` array variable, pipe them into a Dify LLM node with a prompt like:

```
You are a B2B sales SDR. The following Indeed Hiring Intelligence records
describe companies hiring in the target market. Summarise the top 3 P1 targets
by hiringMomentum.score and explain why each one is a buying signal. For each
target, recommend the persona and channel from the paired playbook record.

Signals: {{#signals#}}
Companies: {{#companies#}}
Playbooks: {{#playbooks#}}
```

The agent produces a ranked outbound brief in natural language. Dify's variable syntax (`{{#node-id.field#}}` in newer versions) substitutes the filtered arrays at runtime.

#### Async runs for long-running flows

For Dify chatbots that need to respond in under 5 seconds, do not block on the synchronous endpoint. Instead, fire the actor in async mode and let Dify pick up results later:

1. POST to `https://api.apify.com/v2/acts/ryanclinton~indeed-hiring-intelligence/runs?token=...` (without `-sync-`) to start the run and return immediately with a `runId`.
2. Configure the actor to fire a webhook to a Dify webhook node on run completion (set up in Apify under Schedules / Integrations).
3. The Dify webhook node fetches `https://api.apify.com/v2/datasets/{run.defaultDatasetId}/items?clean=1` and forwards the dataset into the chat session.

This pattern is the right shape for scheduled monitoring, daily SDR briefs, or any workflow where the user does not need to wait for the actor run to complete.
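
As a small aid for wiring the two HTTP calls above, the URLs can be built like this. The helper names are hypothetical; the endpoints are the ones quoted in steps 1 and 3, and no request is sent by this sketch.

```python
from urllib.parse import quote, urlencode

API_BASE = "https://api.apify.com/v2"
ACTOR_ID = "ryanclinton~indeed-hiring-intelligence"


def start_run_url(token: str) -> str:
    """URL for the asynchronous run endpoint (step 1); send the input JSON as the POST body."""
    return f"{API_BASE}/acts/{ACTOR_ID}/runs?{urlencode({'token': token})}"


def dataset_items_url(dataset_id: str) -> str:
    """URL for fetching cleaned dataset items once the run has finished (step 3)."""
    return f"{API_BASE}/datasets/{quote(dataset_id, safe='')}/items?clean=1"
```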

#### Cost note for Dify users

Each Indeed Hiring Intelligence run incurs Apify PPE charges (jobs, companies, signals) plus platform compute. The benchmarks table above shows typical costs. Plan the schedule cadence and `maxResultsPerQuery` accordingly. The seven charge-gating rules ensure zero-result queries and low-confidence noise are not charged.

### Example workflow

How an SDR agency would actually use this on a Monday morning:

1. Run weekly on Sunday night against UK fintech companies.
2. Filter dataset to `recordType = signal` and `confidence > 0.8`.
3. Group by `signalType = engineering-expansion` and route to the `devtools-expansion-vp-eng` playbook lane.
4. Send the top 20 companies (by `hiringMomentum.score`) into Clay or Apollo for contact enrichment.
5. Hand the enriched list to SDRs Monday morning. The `playbook.exampleOpener` is the first-message template; `playbook.recommendedTiming` is the SLA on outreach.
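
Steps 2 to 4 of the SDR list can be sketched as a single ranking function. It assumes signal and company records can be joined on the `company` name, as in the sample records earlier; the function name and its parameters are hypothetical.

```python
def top_targets(
    records: list[dict],
    signal_type: str = "engineering-expansion",
    min_confidence: float = 0.8,
    limit: int = 20,
) -> list[str]:
    """Return company names with a qualifying signal, ranked by hiringMomentum.score."""
    # Look up each company's momentum score from its company record.
    scores = {
        r["company"]: r.get("hiringMomentum", {}).get("score", 0)
        for r in records
        if r.get("recordType") == "company"
    }
    # Companies that have at least one high-confidence signal of the requested type.
    matched = {
        r["company"]
        for r in records
        if r.get("recordType") == "signal"
        and r.get("signalType") == signal_type
        and r.get("confidence", 0.0) > min_confidence
    }
    # Highest score first; ties are broken alphabetically for a stable order.
    ranked = sorted(matched, key=lambda name: (-scores.get(name, 0), name))
    return ranked[:limit]
```

The resulting list of names is what would be handed to Clay or Apollo for contact enrichment in step 4.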

Or for a recruiter targeting Series B engineering teams:

1. Run daily against your target city.
2. Filter `recordType = signal` and `signalType = engineering-expansion`.
3. Sort companies so that those with `repostDetected = true` (recruiter-pain accounts) come first, then by `hiringMomentum.score` descending.
4. Top of the list is your call sheet for the week.

### Example input — scraping Indeed for hiring signals

```json
{
  "queries": [
    { "position": "data engineer", "location": "London" },
    { "position": "site reliability engineer", "location": "Manchester" }
  ],
  "country": "UK",
  "postedWithinDays": 14,
  "salaryTypes": [],
  "maxResultsPerQuery": 100,
  "maxRuntimeSeconds": 1800,
  "outputTier": "intelligence",
  "proxyConfiguration": { "useApifyProxy": true, "groups": ["RESIDENTIAL"] }
}
```

### Output tiers — from raw Indeed jobs to full sales intelligence

| Tier | Records emitted | Use case |
|---|---|---|
| `commodity` | Jobs only | Migration from existing Indeed scrapers, raw data dumps |
| `intelligence` | Jobs + companies + signals + playbooks (default) | Outbound prospecting, recruiter timing, account scoring |
| `watchtower` | Currently same as `intelligence`. Watchlists, delta-only mode, event stream, and triggers ship in round 2. | Future scheduled monitoring |

### Five hiring signal types detected from Indeed data

| signalType | What it means | Detection bar |
|---|---|---|
| `engineering-expansion` | Scaling engineering org | 5+ new engineering roles in 30 days, engineering share rising |
| `executive-hiring` | Leadership restructuring | Any new C-level, VP, Head-of role |
| `sales-expansion` | GTM scaling | 3+ new sales roles, or new VP Sales |
| `geo-expansion` | New country or city presence | First-seen postings in a country or city for that company |
| `compliance-buildout` | Legal / security / regulatory surge | 2+ legal, security, compliance, or privacy roles |

Each signal type carries a hard-coded `actionabilityWindowDays` value driving the decay model: 14 days default, 30 for executive hiring, 60 for compliance buildout. Signals aged out of their window are not billed.
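
The exact decay formula is not given here, so the following is only one plausible reading: a linear ramp from 0.0 at detection to 1.0 at the end of the window. The window values come from the paragraph above; the linear shape and the function names are assumptions.

```python
# Window lengths from the paragraph above; every other signal type uses the default.
ACTIONABILITY_WINDOW_DAYS = {"executive-hiring": 30, "compliance-buildout": 60}
DEFAULT_WINDOW_DAYS = 14


def actionability_window_days(signal_type: str) -> int:
    return ACTIONABILITY_WINDOW_DAYS.get(signal_type, DEFAULT_WINDOW_DAYS)


def signal_decay_score(signal_type: str, days_since_detected: int) -> float:
    """ASSUMPTION: linear decay, clamped to the range 0.0 to 1.0."""
    window = actionability_window_days(signal_type)
    return round(min(1.0, max(0, days_since_detected) / window), 2)


def is_billable(signal_type: str, days_since_detected: int) -> bool:
    """Signals whose decay score has reached 1.0 are aged out and not billed."""
    return signal_decay_score(signal_type, days_since_detected) < 1.0
```

Under this reading, the sample signal shown earlier (`daysSinceDetected: 0`, 14-day window) has a decay score of 0.0, which matches its `signalDecayScore`.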

Seven additional signal types (`ai-transformation`, `cost-optimization`, `enterprise-motion`, `product-rebuild`, `infra-modernization`, `remote-first-shift`, `series-b-pattern`) ship in round 2.

### This actor is NOT ideal for

- bulk archival scraping of millions of Indeed jobs at the lowest possible cost
- real-time contact enrichment (use `website-contact-scraper` or `waterfall-contact-enrichment`)
- phone or email discovery on individual people
- generic data warehousing of Indeed listings as cold storage
- jobs-only raw output with no intelligence layer (use `outputTier=commodity` if you must, but cheaper Indeed scrapers exist for raw rows)
- one-shot Indeed API replacement workflows where you only need raw rows

### What this Indeed actor does NOT do

- **Does not run real-time webhooks on signals.** Watchlists, trigger DSL, and event-stream firing ship in round 2.
- **Does not push tickets to Jira / Linear / GitHub.** Downstream consumers handle that. Each signal carries a `playbookId` for routing.
- **Does not source from Glassdoor, LinkedIn, Crunchbase, or Google Jobs.** This is Indeed-sourced. A multi-source orchestrator ships separately in round 3.
- **Does not compute predictive or ML-trained scores.** Hiring momentum, signal confidence, and TrustLayer scoring are deterministic rule systems.
- **Does not return contact emails or phone numbers.** Output is companies + signals + playbooks, not contacts. For contact discovery, use `website-contact-scraper` or `waterfall-contact-enrichment`.
- **Does not replace Apollo, Clearbit, or Common Room.** This is one signal source for those platforms, not a replacement.

### Compose with other job market actors

- `job-market-intelligence` for macro labor context
- `h1b-visa-intelligence` for US specialized hiring signals complementing engineering-expansion
- `company-deep-research` for full dossier on the P1 targets these signals surface

### Also useful for teams searching for

- Indeed jobs API alternatives
- Indeed integration for Dify
- Indeed integration for n8n
- Indeed integration for Make
- hiring intent data
- company growth signals
- sales triggers from job postings
- SDR prospecting signals
- recruiting intelligence API
- job posting analytics
- company hiring tracking
- account expansion detection
- labor market telemetry
- hiring momentum monitoring
- outbound signal enrichment
- Clay hiring signals integration
- Apollo hiring intent enrichment
- company growth detection
- engineering hiring momentum API
- SDR buying signals from Indeed
- B2B sales intelligence from public job data
- recruiter signal data
- hiring pattern detection

### Stable enum vocabulary and output schema

- `recordType`: `job` / `company` / `signal` / `playbook` / `error`
- `signalType`: `engineering-expansion` / `executive-hiring` / `sales-expansion` / `geo-expansion` / `compliance-buildout`
- `severity`: `low` / `medium` / `high`
- `signalFreshness`: `fresh` (0-7 days) / `recent` (8-21) / `aging` (22-60) / `stale` (over 60)
- `hiringMomentum.tier`: `P1-HIGH-GROWTH` (75+) / `P2-EXPANDING` (55-74) / `P3-STEADY` (35-54) / `P4-QUIET` (under 35)
- `recommendedAction.priority`: `P1` / `P2` / `P3` / `P4`
- `recommendedChannel`: `linkedin` / `email` / `phone` / `multi-channel`
- `failureType`: `invalid-input` / `parse-error`
- `trustLayerStatus`: `verified` / `degraded` / `unverified`

These enums are additive and stable within v1. New values may be introduced; existing values will not be renamed or repurposed.
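
For consumers that need to reproduce the bucketing, the freshness and tier boundaries listed in this section can be expressed as two lookup functions. This is an illustrative sketch with hypothetical function names; it treats day 60 as `aging`.

```python
def signal_freshness(days_since_detected: int) -> str:
    """Map a signal's age in days to its signalFreshness bucket."""
    if days_since_detected <= 7:
        return "fresh"
    if days_since_detected <= 21:
        return "recent"
    if days_since_detected <= 60:  # assumption: day 60 still counts as aging
        return "aging"
    return "stale"


def momentum_tier(score: float) -> str:
    """Map a 0-100 hiringMomentum score to its tier."""
    if score >= 75:
        return "P1-HIGH-GROWTH"
    if score >= 55:
        return "P2-EXPANDING"
    if score >= 35:
        return "P3-STEADY"
    return "P4-QUIET"
```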

# Actor input Schema

## `queries` (type: `array`):

List of Indeed search queries. Each entry can be either a plain string ('data engineer London') or an object with explicit position and location: { "position": "data engineer", "location": "London" }. Country is taken from the top-level country field.

## `country` (type: `string`):

Two-letter country code. Selects the Indeed subdomain (e.g. UK -> uk.indeed.com, DE -> de.indeed.com). Every extracted job is verified against this country at parse time — records whose parsed location does not match are dropped and counted in countryFilterDropped.

## `postedWithinDays` (type: `integer`):

Drop jobs posted more than this many days ago. Indeed's relative date strings ('Today', '3 days ago', '30+ days ago') are parsed against the run timestamp. Set to 0 to disable.

## `salaryTypes` (type: `array`):

Only return jobs whose parsed salary matches one of these types. Jobs with unparseable salary are tagged 'unknown' and included only when 'unknown' is in this list. Leave empty to disable the filter entirely.
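
The filter semantics described for `salaryTypes` reduce to a short predicate. A minimal sketch, with a hypothetical function name, where `None` stands for a salary that could not be parsed:

```python
from typing import Optional


def passes_salary_filter(parsed_salary_type: Optional[str], salary_types: list[str]) -> bool:
    """Return True when a job should be kept under the salaryTypes filter."""
    if not salary_types:
        return True  # an empty list disables the filter entirely
    effective = parsed_salary_type or "unknown"  # unparseable salaries are tagged 'unknown'
    return effective in salary_types
```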

## `maxResultsPerQuery` (type: `integer`):

Hard cap on jobs extracted per query. Indeed paginates 15 jobs per page. Default 100 = roughly 7 pages per query.

## `maxRuntimeSeconds` (type: `integer`):

Hard runtime budget in seconds. Actor stops cleanly when reached and emits partial results with hitRuntimeBudget=true on the run summary, so you always get usable output rather than a hard-killed run.

## `outputTier` (type: `string`):

Preset bundle controlling which record types are emitted. 'commodity' = jobs only. 'intelligence' = jobs + companies + signals (recommended). 'watchtower' = currently treated identically to 'intelligence' (delta + event stream features ship in round 2).

## `includeJobs` (type: `boolean`):

Push individual job records to the dataset. Disable when you only want aggregated company intelligence and signal records.

## `includeCompanyIntelligence` (type: `boolean`):

Push per-company aggregate records with hiringMomentum score, hiringProfile, and recommendedAction.

## `includeSignals` (type: `boolean`):

Run signal detection (engineering-expansion, executive-hiring, sales-expansion, geo-expansion, compliance-buildout) across the run's company data and push signal records.

## `extractDescriptions` (type: `boolean`):

Fetch each job's detail page to extract the full description text. Off by default because the 5 MVP signal detectors run on title, company, location, salary, and posted-date (all card-level data), so detail-page fetches are pure runtime overhead for the intelligence pipeline. Enable only if you specifically need full description text for downstream LLM processing or keyword search. Enabling adds 3-5x to runtime.

## `proxyConfiguration` (type: `object`):

Apify proxy configuration. RESIDENTIAL group is the default and recommended setting; DATACENTER is used as a fallback when residential fails.

## Actor input object example

```json
{
  "queries": [
    {
      "position": "data engineer",
      "location": "London"
    },
    {
      "position": "site reliability engineer",
      "location": "Manchester"
    }
  ],
  "country": "UK",
  "postedWithinDays": 14,
  "salaryTypes": [],
  "maxResultsPerQuery": 100,
  "maxRuntimeSeconds": 1800,
  "outputTier": "intelligence",
  "includeJobs": true,
  "includeCompanyIntelligence": true,
  "includeSignals": true,
  "extractDescriptions": false,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "groups": [
      "RESIDENTIAL"
    ]
  }
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "queries": [
        {
            "position": "data engineer",
            "location": "London"
        },
        {
            "position": "site reliability engineer",
            "location": "Manchester"
        }
    ],
    "proxyConfiguration": {
        "useApifyProxy": true,
        "groups": [
            "RESIDENTIAL"
        ]
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("ryanclinton/indeed-hiring-intelligence").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "queries": [
        {
            "position": "data engineer",
            "location": "London",
        },
        {
            "position": "site reliability engineer",
            "location": "Manchester",
        },
    ],
    # Listed as required in the input schema; selects the Indeed subdomain
    "country": "UK",
    "proxyConfiguration": {
        "useApifyProxy": True,
        "groups": ["RESIDENTIAL"],
    },
}

# Run the Actor and wait for it to finish
run = client.actor("ryanclinton/indeed-hiring-intelligence").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "queries": [
    {
      "position": "data engineer",
      "location": "London"
    },
    {
      "position": "site reliability engineer",
      "location": "Manchester"
    }
  ],
  "country": "UK",
  "proxyConfiguration": {
    "useApifyProxy": true,
    "groups": [
      "RESIDENTIAL"
    ]
  }
}' |
apify call ryanclinton/indeed-hiring-intelligence --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=ryanclinton/indeed-hiring-intelligence",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Indeed Hiring Intelligence",
        "description": "Converts Indeed job listings into hiring signals, company growth intelligence, and outbound triggers. Detects engineering-expansion, executive-hiring, and geo-expansion before the market notices. Country-verified, salary-parsed, PPE-billed per decision: not per scraped row.",
        "version": "1.0",
        "x-build-id": "c2Jz1nDWTvrm6AY20"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/ryanclinton~indeed-hiring-intelligence/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-ryanclinton-indeed-hiring-intelligence",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/ryanclinton~indeed-hiring-intelligence/runs": {
            "post": {
                "operationId": "runs-sync-ryanclinton-indeed-hiring-intelligence",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/ryanclinton~indeed-hiring-intelligence/run-sync": {
            "post": {
                "operationId": "run-sync-ryanclinton-indeed-hiring-intelligence",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "queries",
                    "country"
                ],
                "properties": {
                    "queries": {
                        "title": "Search queries",
                        "type": "array",
                        "description": "List of Indeed search queries. Each entry can be either a plain string ('data engineer London') or an object with explicit position and location: { \"position\": \"data engineer\", \"location\": \"London\" }. Country is taken from the top-level country field.",
                        "default": [
                            {
                                "position": "software engineer",
                                "location": "London"
                            }
                        ]
                    },
                    "country": {
                        "title": "Country",
                        "enum": [
                            "US",
                            "UK",
                            "CA",
                            "AU",
                            "DE",
                            "FR",
                            "IT",
                            "ES",
                            "NL",
                            "IE",
                            "IN",
                            "SG",
                            "JP",
                            "BR",
                            "MX",
                            "ZA",
                            "AE",
                            "PL",
                            "SE"
                        ],
                        "type": "string",
                        "description": "Two-letter country code. Selects the Indeed subdomain (e.g. UK -> uk.indeed.com, DE -> de.indeed.com). Every extracted job is verified against this country at parse time — records whose parsed location does not match are dropped and counted in countryFilterDropped.",
                        "default": "UK"
                    },
                    "postedWithinDays": {
                        "title": "Posted within (days)",
                        "minimum": 0,
                        "maximum": 90,
                        "type": "integer",
                        "description": "Drop jobs posted more than this many days ago. Indeed's relative date strings ('Today', '3 days ago', '30+ days ago') are parsed against the run timestamp. Set to 0 to disable.",
                        "default": 14
                    },
                    "salaryTypes": {
                        "title": "Salary types",
                        "type": "array",
                        "description": "Only return jobs whose parsed salary matches one of these types. Jobs with unparseable salary are tagged 'unknown' and included only when 'unknown' is in this list. Leave empty to disable the filter entirely.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxResultsPerQuery": {
                        "title": "Max results per query",
                        "minimum": 1,
                        "maximum": 2000,
                        "type": "integer",
                        "description": "Hard cap on jobs extracted per query. Indeed paginates 15 jobs per page. Default 100 = roughly 7 pages per query.",
                        "default": 100
                    },
                    "maxRuntimeSeconds": {
                        "title": "Max runtime (seconds)",
                        "minimum": 60,
                        "maximum": 3600,
                        "type": "integer",
                        "description": "Hard runtime budget in seconds. Actor stops cleanly when reached and emits partial results with hitRuntimeBudget=true on the run summary, so you always get usable output rather than a hard-killed run.",
                        "default": 1800
                    },
                    "outputTier": {
                        "title": "Output tier",
                        "enum": [
                            "commodity",
                            "intelligence",
                            "watchtower"
                        ],
                        "type": "string",
                        "description": "Preset bundle controlling which record types are emitted. 'commodity' = jobs only. 'intelligence' = jobs + companies + signals (recommended). 'watchtower' = currently treated identically to 'intelligence' (delta + event stream features ship in round 2).",
                        "default": "intelligence"
                    },
                    "includeJobs": {
                        "title": "Include job records",
                        "type": "boolean",
                        "description": "Push individual job records to the dataset. Disable when you only want aggregated company intelligence and signal records.",
                        "default": true
                    },
                    "includeCompanyIntelligence": {
                        "title": "Include company intelligence",
                        "type": "boolean",
                        "description": "Push per-company aggregate records with hiringMomentum score, hiringProfile, and recommendedAction.",
                        "default": true
                    },
                    "includeSignals": {
                        "title": "Include signals",
                        "type": "boolean",
                        "description": "Run signal detection (engineering-expansion, executive-hiring, sales-expansion, geo-expansion, compliance-buildout) across the run's company data and push signal records.",
                        "default": true
                    },
                    "extractDescriptions": {
                        "title": "Extract job descriptions (slower)",
                        "type": "boolean",
                        "description": "Fetch each job's detail page to extract the full description text. Off by default because the 5 MVP signal detectors run on title, company, location, salary, and posted-date (all card-level data), so detail-page fetches are pure runtime overhead for the intelligence pipeline. Enable only if you specifically need full description text for downstream LLM processing or keyword search. Enabling adds 3-5x to runtime.",
                        "default": false
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Apify proxy configuration. RESIDENTIAL group is the default and recommended setting; DATACENTER is used as a fallback when residential fails.",
                        "default": {
                            "useApifyProxy": true,
                            "groups": [
                                "RESIDENTIAL"
                            ]
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
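
The specification above can also be exercised without a client library. The sketch below uses only the Python standard library to build a `POST` request for the `run-sync-get-dataset-items` path, passing the token as the `token` query parameter as the spec describes. The function names and the 300-second timeout are illustrative assumptions, and the response body is assumed to be a JSON array of dataset items.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"  # from the spec's "servers" entry
ACTOR_PATH = "ryanclinton~indeed-hiring-intelligence"  # tilde form used in the spec's paths


def build_sync_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST to run-sync-get-dataset-items."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_and_fetch_items(token: str, run_input: dict, timeout: float = 300.0) -> list:
    """Send the request and decode the response.

    Assumption: the body is a JSON array of dataset items. The timeout
    value is an arbitrary choice for this sketch.
    """
    request = build_sync_request(token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call would look like `run_and_fetch_items("<YOUR_API_TOKEN>", {"queries": ["data engineer London"], "country": "UK"})`. Because the Actor's default `maxRuntimeSeconds` is 1800, a synchronous call may not suit larger runs; the `/runs` path in the same specification starts a run and returns its metadata, including `defaultDatasetId`, without waiting for completion.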
