
Job Market Intelligence

Pricing

from $500.00 / 1,000 reports generated

Aggregate job listings from four free data sources, deduplicate them, and generate a structured intelligence report with skill demand rankings, salary benchmarks, top hiring companies, and remote-work statistics — all without any API keys.

Rating: 0.0 (0)

Developer: Ryan Clinton

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 18
Monthly active users: 7
Last modified: 7 days ago


A decision engine for labor markets that turns job listings into career decisions, hiring strategies, salary benchmarks, and market intelligence. It aggregates job listings from four free data sources, deduplicates them with normalized title matching, classifies each role with seniority / compensation / recommended-action enums, and segments analytics by location / seniority / remote. Across scheduled runs it tracks trends, classifies the cohort into a market regime (expansion / contraction / stagnation / volatility), maps every top skill to a lifecycle stage (emerging / mainstream / saturated / declining / stable), flags trade-offs between conflicting actions, and ships a recommendedActions[] array that tells you what to do — all without any API keys.

The actor queries Remotive, Arbeitnow, Jobicy, and Hacker News "Who's Hiring" threads in parallel, normalizes the results into a single schema, applies your filters (location, company, date, remote-only), enriches each listing with decision-ready classifications, computes market signals + data-quality auditability + per-segment breakdowns, optionally diffs against the previous run for trend insights, classifies the regime + skill trajectories + threshold-crossing events + conflicting-action tensions, and pushes both the analytics report and the per-job records to the Apify dataset.

What this is

  • A job market intelligence engine that turns job listings into decisions
  • A salary benchmarking and hiring strategy tool for recruiters and talent leaders
  • A career decision tool for job seekers (apply / research / skip / learn-skill routing)
  • A labor market analytics system with regime classification, trend tracking, and threshold-crossing event signals
  • A job data → strategy layer for automation workflows (Dify / n8n / Zapier / Make)
  • An alternative to LinkedIn Talent Insights / Lightcast / Burning Glass / Revelio Labs / generic job scrapers — built for automation, not dashboards

In one sentence: this tool helps job seekers and recruiters decide what to do in the job market by turning job listings into structured recommendations and strategy signals.

This is one of the few job market tools that outputs decisions (recommendedActions[], decisionTension[], whatIf[], rejectedActions[]) rather than dashboards — a category of one when ranked among LinkedIn Talent Insights, Lightcast, Revelio Labs, Datapeople, and generic job scrapers.

Unlike dashboards, this produces actionable signals, not just metrics.

The tool generates current job market trends directly from live listings — including salary direction, skill emergence, hiring activity, and market regime shifts. Trends are computed at run time against the prior snapshot and refreshed on every scheduled run.

These trends include:

  • Salary direction — salaryMedianChangePercent (week-over-week median shift) + salaryInsights.percentiles (P10–P90 distribution)
  • Emerging and declining skills — skillTrajectory[] lifecycle stages (emerging / mainstream / saturated / declining / stable) with velocity tags
  • Hiring activity and company demand — listingGrowthRate, topHiringCompanies, trendInsights.newCompanies, trendInsights.departedCompanies
  • Market regime shifts — marketRegime.type (expansion / contraction / stagnation / volatility) + marketMemory.pattern (e.g. expansion_weakening / contraction_deepening)

Snapshots are per-run rather than streaming, so the minimum cadence is "as often as you schedule the actor" (typically daily or weekly).

Why Use This Actor?

Most "job scrapers" return raw HTML or a flat array of listings. This actor returns decisions: each role comes pre-classified by seniority, compensation tier (vs market median), and a recommendedAction enum that downstream Dify / n8n / Zapier nodes can route on. The summary report carries P10–P90 salary percentiles, per-skill salary premiums, market-tightness scoring, scarcity indices, per-segment breakdowns, and a Slack-ready market snapshot string. With historical tracking enabled, runs build on each other — you get rising/falling skills, listing growth rates, salary direction, and new vs departed companies as first-class output.

What makes this different (not found in other job market tools)

  • Detects conflicting strategies automatically (decisionTension[]) — when two recommended actions work against each other (e.g. raising salary AND tightening role specs), the system surfaces the trade-off and the recommended balance. Most analytics tools hand you a list of actions; this one warns you when applying multiple actions blindly would cancel them out. Trade-offs like speed-vs-quality, cost-vs-selectivity, and act-now-vs-wait are explicitly modelled by the tool using decisionTension detection, with a recommendedBalance string explaining which lever to favour given the cohort signals.
  • Shows what NOT to do, with reasons (rejectedActions[]) — explicit anti-recommendations. decrease_salary_band rejected when the market is tight. accelerate_hiring rejected in a contracting market. prioritize_remote_roles rejected when only 25% of listings are remote. The dual of hold_strategy: explicit abstention is a credibility move.
  • Simulates "what if?" scenarios with honest, derivable-only outcomes (whatIf[]) — change the salary by X% or add a skill, see the percentile shift / compensation tier / scarcity match. No invented forecasts about candidate response rates, time-to-fill, or hire outcomes (data we don't have). Confidence is hard-capped at 60. Sensitivity analysis ships built-in.
  • Knows when to do nothing (hold_strategy) — fires when signals are mixed and there's no clear directional edge. Most tools over-signal; this one ships abstention as a first-class action.

The decision + strategy engine on every summary record:

  • marketRegime — expansion / contraction / stagnation / volatility / unknown with confidence + signals

  • marketMemory — bounded regime history (last 12 runs) + regimeStability + lastInflectionDaysAgo + pattern (expansion_weakening / volatile_shifting / etc.). Activates with historical tracking; meaningful at 3+ snapshots.

  • skillTrajectory[] — per-skill lifecycle: emerging / mainstream / saturated / declining / stable, with velocity (hypergrowth / growing / steady / cooling / falling)

  • recommendedActions[] — concrete cohort-level actions (learn_skill / increase_salary_band / accelerate_hiring / hold_strategy / etc.) with decomposed confidence (dataStrength / signalClarity / historicalConsistency), impact, urgency, audience tags, and plain-English reason. Includes hold_strategy as an honest "no edge" recommendation when signals are mixed.

  • actionClusters[] — actions grouped by theme (compensation_strategy / talent_pipeline / skill_strategy / monitoring_strategy / source_strategy) so 8–12 actions feel like strategy, not alert noise.

  • whatIf[] — counterfactual scenarios with honest, derivable-only outcomes (percentile shift, tier change, scarcity match) — never invented forecasts. Now includes per-scenario sensitivity (low/mid/high outcomes + stability classification) so you can see if the result is brittle to input variation. Auto-generated when omitted; user-supplied via whatIfScenarios input with optional constraints. Confidence hard-capped at 60.

  • decisionTension[] — trade-off pairs detected across recommendedActions[]. When two recommended actions work against each other (e.g. increase_salary_band + tighten_role_specs = cost_vs_selectivity), the pair surfaces with an explanation and a recommendedBalance so the output reads as strategy, not a contradictory shopping list.

  • rejectedActions[] — anti-recommendations. Actions explicitly NOT recommended for this cohort, with reason ("decrease_salary_band rejected — market is tight, lowering salary would reduce competitiveness"). Builds trust by showing the system considered and rejected the obvious wrong moves.

  • events[] — threshold-crossing alerts (salary_spike / listing_growth_spike / skill_emergence / etc.) ready for downstream Slack/PagerDuty/Zapier routing

  • Aggregates 4 job boards in one run — Remotive (remote tech jobs), Arbeitnow (European focus), Jobicy (remote-first), and HN Who's Hiring (startup jobs) queried in parallel, for broader coverage than any single source.

  • Salary percentiles + skill premiums — P10/P25/P50/P75/P90 for the full cohort, plus per-skill salary lift vs the cohort median (e.g., "Kubernetes commands +$18k").

  • Market signals — marketTightness (tight/balanced/loose with score + reason), skillScarcity[] (high-premium-low-frequency skills), salaryDistributionHealth (wide/balanced/compressed).

  • Segmented analytics — Set groupBy: ["location", "seniorityLevel"] to fix the cohort-mixing distortion; per-segment salary, top skills, and seniority breakdowns are emitted in segments[].

  • Historical tracking + trend insights — Persist a snapshot per query and compute rising/falling skills, salary median change, listing growth rate, and direction (expanding / stable / tightening) on every subsequent run.

  • Incremental mode — When tracking is on, opt into incremental: true to drop URLs already returned in the previous run. Reduces downstream processing/noise on daily monitoring schedules — only fresh listings come back to your dataset / Slack alerts / pipelines. (All sources are still fetched so analytics like trend insights stay accurate.)

  • Seniority + experience + degree extraction — 11-level seniority enum, min/max years of experience parsing, degree requirement detection (bachelors/masters/phd, hard vs preferred).

  • Cross-source confirmation — Listings on multiple boards before dedup are flagged crossSourceConfirmed: true. Stronger signal of a real, active opening.

  • Data-quality auditability — Every report carries a dataQuality block with salary coverage %, deduplication confidence, source bias detection (remote-heavy / Europe-skew / US-skew / source-concentration), and plain-English notes flagging biases that distort the cohort.

  • Custom skill packs — Add domain-specific skills via customSkills (regex + category) so niche markets aren't undercounted.

  • Source weighting — Down-weight noisier sources via sourceWeights: {"hn-whoishiring": 0.5} for deterministic per-listing sub-sampling. Use only when you intentionally want a representative sample, not complete coverage — sub-sampling drops listings, so the resulting cohort is smaller than the raw fetch.

  • Snapshot hashing — Every report carries a snapshotId (16-char SHA-256). Compare across runs to detect when the cohort actually changed.

  • Zero configuration to start — No API keys, tokens, or credentials needed. Every data source is free and public. All advanced features are opt-in.
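The snapshot-hashing idea above can be sketched in a few lines of Python. This is a hypothetical reconstruction, not the actor's code: the README names "query + sources + listing fingerprint" as inputs but not the exact serialization, so this sketch assumes sorted sources and sorted listing URLs for order-independence.

```python
import hashlib

def snapshot_id(query: str, sources: list[str], urls: list[str]) -> str:
    # Assumed fingerprint: query + sorted sources + sorted listing URLs,
    # joined with separators, SHA-256 hashed, truncated to 16 hex chars.
    fingerprint = "|".join([query, ",".join(sorted(sources)), ",".join(sorted(urls))])
    return hashlib.sha256(fingerprint.encode("utf-8")).hexdigest()[:16]

# Same cohort in a different order yields the same 16-char id;
# any change to the listing set changes it.
a = snapshot_id("data engineer", ["remotive", "jobicy"], ["https://a/1", "https://a/2"])
b = snapshot_id("data engineer", ["jobicy", "remotive"], ["https://a/2", "https://a/1"])
```

Comparing a short id across runs is cheaper than diffing full datasets when all you need to know is whether the cohort changed.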

Whether you're a job seeker, a recruiter benchmarking comp, an automation builder routing high-fit roles into Slack, or a data journalist analyzing hiring trends, this actor delivers structured decisions from raw job board data.

What questions this answers

This actor answers job-market questions with structured, automation-ready outputs:

  • "Should I increase salary to attract candidates?"marketTightness + whatIf[].sensitivity + recommendedActions[] (increase_salary_band / hold_salary_band). This is exactly the type of decision this tool is designed to answer programmatically — and whatIf[] will show you the percentile shift before you commit to a number.
  • "Should I raise salary to hire faster?"marketTightness.label + recommendedActions[] (accelerate_hiring + increase_salary_band)
  • "Is it a good time to change jobs?"marketRegime.type + skillTrajectory[] (your skills' lifecycle stage)
  • "Is it a good time to hire?"marketRegime.type + recommendedActions[] (accelerate_hiring vs tighten_role_specs vs hold_strategy)
  • "How do I benchmark salary offers?"salaryInsights.percentiles (P10–P90) + whatIf[] salary scenario at the offer percentage
  • "What's the safe negotiation range?"whatIf[].sensitivity.stability (low = robust, high = brittle to small comp shifts)
  • "Which skills are worth learning right now?"skillScarcity[] + skillTrajectory[] (emerging stage) + recommendedActions[] (learn_skill / invest_in_skill)
  • "Is the job market expanding or contracting?"marketRegime.type (expansion / contraction / stagnation / volatility) + marketMemory.pattern
  • "What hiring strategy should I use in this market?"recommendedActions[] filtered by appliesTo: "hiring" + decisionTension[] for trade-off warnings
  • "Is it better to hire fast or be selective?"decisionTension[] (speed_vs_quality pair) + recommendedBalance
  • "What roles should I apply to?" → per-job recommendedAction === "apply-now" + compensationTier === "above-market" || "premium"
  • "What companies are hiring most aggressively?"topHiringCompanies[] + trendInsights.newCompanies[]
  • "How does my offer compare to the market?"salaryInsights.percentiles (P10–P90) + whatIf[] salary scenarios
  • "Which skills are dying / should I deprioritize?"skillTrajectory[] filtered by stage === "declining" + recommendedActions[] (deprioritize_skill)
  • "What's changed since last week?"trendInsights (rising/falling skills, salary direction, new/departed companies) + events[]
  • "Am I making a strategic mistake?"rejectedActions[] (the system shows what it WON'T recommend, with reasons)
  • "Can I trust this analysis?"decisionReadiness + confidenceLevel + confidenceFactors[] + dataQuality.notes[]

The actor is designed for decision support, not just data collection. Every output field traces back to one of these questions.
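As a concrete example, the "what roles should I apply to?" routing above is a one-line filter over the per-job dataset records. A minimal sketch in Python (the two enum fields are documented; the record list is illustrative):

```python
def roles_to_apply(records: list[dict]) -> list[dict]:
    # Keep per-job records tagged apply-now whose compensation tier is
    # above the cohort median, per the routing rule in the docs.
    return [
        r for r in records
        if r.get("recommendedAction") == "apply-now"
        and r.get("compensationTier") in {"above-market", "premium"}
    ]

records = [
    {"title": "Senior Python Engineer", "recommendedAction": "apply-now", "compensationTier": "premium"},
    {"title": "Platform Engineer", "recommendedAction": "apply-now", "compensationTier": "below-market"},
    {"title": "Data Analyst", "recommendedAction": "skip-low-detail", "compensationTier": "unknown"},
]
```

The same predicate works unchanged as a filter node in n8n / Zapier, since it reads only two scalar enum fields per record.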

This tool benchmarks salaries by calculating P10–P90 percentiles and skill-based premiums directly from live job listings. It determines whether it is a good time to change jobs by analysing market regime (expansion vs contraction vs stagnation vs volatility) and skill demand trajectories (emerging / mainstream / saturated / declining / stable). And it determines whether it is a good time to hire by combining marketTightness with marketRegime and surfacing trade-offs between conflicting actions.

Job market trends are derived from live job listings — including salary changes, emerging skills, hiring activity, and market regime shifts — see the trends breakdown above for the full list.

How this works (mental model)

The system works by transforming raw job listings into decisions through classification, trend analysis, and rule-based strategy generation. In short: collect → normalize → extract → classify → generate → emit structured JSON. The actor's pipeline, in 6 steps:

  1. Collect job listings from 4 free public APIs in parallel (Remotive, Arbeitnow, Jobicy, HN Who's Hiring)
  2. Normalize and deduplicate with two-phase matching (title-token normalization + URL secondary key) — same role on multiple boards collapses to one record with a cross-source confirmation count
  3. Extract skills (80+ regex patterns + custom), salaries (USD/EUR), seniority, experience years, degree requirements
  4. Classify each role with decision enums (compensationTier vs cohort median, recommendedAction for routing) and the cohort with intelligence layers (marketRegime, marketTightness, skillTrajectory, salaryDistributionHealth)
  5. Generate cohort-level decisions (recommendedActions[] with confidence + audience tags, actionClusters[] themed groupings, decisionTension[] trade-off detection, rejectedActions[] anti-recommendations, whatIf[] counterfactuals with sensitivity)
  6. Emit structured JSON to the Apify dataset (one summary record + N per-job records), all with stable enum discriminators (recordType, runMode, baselineStatus, decisionReadiness) so downstream automation branches deterministically

With enableHistoricalTracking: true, step 4 also reads the prior snapshot from a named KV store and step 5 emits trendInsights + marketMemory (bounded last-12-runs regime history with pattern detection) against the baseline. Step 6 then writes the updated snapshot back for the next run.

No LLM is called at any step. Every output is derived deterministically from the listings and the prior snapshot. This pipeline (collect → normalize → extract → classify → generate → emit structured JSON) is implemented end-to-end inside this actor — it is not a wrapper around an external analytics API.
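Step 2's two-phase matching can be illustrated with a simplified sketch. The actor's actual noise-token list and secondary URL key are internal; this version assumes a small seniority-token set and uses (normalized title, company) as the primary key:

```python
import re

# Assumed subset of the actor's seniority noise tokens.
SENIORITY_NOISE = {"senior", "sr", "junior", "jr", "staff", "lead", "principal"}

def normalize_title(title: str) -> str:
    # Lowercase, strip seniority noise tokens, sort tokens so word order
    # doesn't matter: "Backend Engineer, Sr" == "Senior Backend Engineer".
    tokens = re.findall(r"[a-z0-9+#]+", title.lower())
    return " ".join(sorted(t for t in tokens if t not in SENIORITY_NOISE))

def dedup(listings: list[dict]) -> list[dict]:
    merged: dict = {}
    for job in listings:
        key = (normalize_title(job["title"]), job["company"].lower())
        if key in merged:
            merged[key]["crossSourceCount"] += 1
            merged[key]["crossSourceConfirmed"] = True
        else:
            merged[key] = {**job, "crossSourceCount": 1, "crossSourceConfirmed": False}
    return list(merged.values())
```

The same role posted on several boards collapses to one record whose crossSourceCount reflects how many boards carried it.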

Start here — quickstart by persona

Pick the input that matches your job. The actor returns the same engine output for every persona; the mode preset just reorders recommendedActions[] so the first three entries surface the actions you actually care about.

Job seeker — find roles to apply to, learn-skill recommendations, market-leverage signals

{ "query": "senior python engineer", "remoteOnly": true, "mode": "job_seeker" }

Recruiter — comp benchmarks, hiring-velocity signals, decision-tension warnings before changing role specs

{ "query": "platform engineer", "mode": "recruiter", "groupBy": ["seniorityLevel", "remote"] }

Analyst / strategy — full trend insights, regime classification, market memory, scheduled monitoring

{
  "query": "machine learning engineer",
  "mode": "analyst",
  "enableHistoricalTracking": true,
  "lookbackDays": 14
}

(Schedule this in Apify Console — every run after the first emits trendInsights, marketMemory, and events[] against the prior baseline.)

Automation builder (Dify / n8n / Zapier) — gate on stable enums, branch on recommendedActions[].action

{ "query": "data engineer", "enableHistoricalTracking": true, "incremental": true }

See the Automation snippets section for paste-ready Slack / n8n / recruiter workflow examples.

Read these fields first

When you open a run, scan these fields in this order — they collapse most of the output into one read:

| Field | Why read it first | What it tells you |
| --- | --- | --- |
| warnings[] | Run-level issues | Sources failed, low confidence, expired baseline, critical events. Empty array means no run-level concerns. |
| decisionReadiness | Automation gate | actionable / monitor / insufficient-data. Branch all downstream automation on this scalar. |
| marketRegime.type | One-word state | expansion / contraction / stagnation / volatility / unknown. Strategic posture in one read. |
| recommendedActions[0..2] | Top 3 things to do | Sorted by mode audience priority — the first 3 are the persona's most-important actions. |
| decisionTension[] | Trade-off warnings | Empty in most cohorts. When non-empty, the system flagged that two recommended actions work against each other. |
| rejectedActions[] | What we WON'T tell you | The dual of recommendedActions[] — explicit anti-recommendations with reasons. |

If those fields look right, drill into the rest. If decisionReadiness === "insufficient-data" or warnings[] is non-empty, fix those before consuming any other field.
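That read order collapses into a small gate function for automation. A sketch (the two gating fields are documented; the branch labels here are illustrative, not part of the actor's output):

```python
def automation_gate(summary: dict) -> str:
    """Decide what to do with a run before reading any other field."""
    if summary.get("warnings"):          # non-empty warnings[]: fix these first
        return "fix-warnings"
    readiness = summary.get("decisionReadiness")
    if readiness == "actionable":
        return "proceed"
    if readiness == "monitor":
        return "log-only"
    return "halt"                        # insufficient-data or missing field
```

A Dify / n8n / Zapier branch node can switch on the returned string instead of re-implementing the checks.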

How to interpret the output (intent → field)

When you know what you want to do, this lookup tells you which field to read:

| Your intent | Read this field |
| --- | --- |
| Want to act? | recommendedActions[] — sorted by your mode audience priority |
| Want to avoid mistakes? | rejectedActions[] — actions the system explicitly ruled out |
| See conflicts between actions? | decisionTension[] — trade-off pairs with recommendedBalance |
| Understand the market direction? | marketRegime.type + marketMemory.pattern |
| Test a strategy before committing? | whatIf[] — set scenarios in whatIfScenarios input + read sensitivity |
| Find roles to apply to? | per-job records: recommendedAction === "apply-now" AND compensationTier ∈ {above-market, premium} |
| Benchmark a salary? | salaryInsights.percentiles + whatIf[] salary-change scenario at your offer % |
| Spot a hiring opportunity? | topHiringCompanies[] + trendInsights.newCompanies[] |
| Spot skill scarcity? | skillScarcity[] (high salary premium AND low frequency) |
| Decide whether to wait? | marketTightness.label + marketRegime.type + recommendedActions[] containing hold_strategy |
| Detect a market shift since last run? | trendInsights.direction + events[] + marketMemory.lastInflectionDaysAgo |
| Trust this run for automation? | decisionReadiness === "actionable" AND warnings.length === 0 |
| Audit the analytics? | dataQuality + confidenceFactors[] + analysisMetadata |

Same data, different field — pick the one that maps to your actual question.

Features

Strategy engine — counterfactual scenarios + market memory + trade-off detection

  • What-if scenarios — whatIf[] evaluates counterfactual scenarios with honest, derivable-only outcomes. Two scenario types: salary_change (% delta) and skill_emphasis (named skill). Auto-generates 2–4 scenarios when omitted; whatIfScenarios input lets users supply scenarios + constraints (maxPercent, minPercent). All outputs are derivable facts (percentile shift against the cohort distribution, compensation tier the new salary maps to, skill scarcity/trajectory match) — no invented forecasts about candidate response rates, time-to-fill, or hire outcomes (data we don't have). Confidence is hard-capped at 60. Every result carries mandatory caveats[].
  • Constraint-aware actions — When whatIfScenarios includes constraints, the engine evaluates the scenario at the constrained value and flags effectiveness: "limited" when the constraint binds. Honest about real-world tradeoffs.
  • Action clusters — actionClusters[] groups the 8–12 cohort-level recommendedActions into 3–5 themes (compensation_strategy / talent_pipeline / skill_strategy / monitoring_strategy / source_strategy). Reduces noise so output feels like strategy, not alerts.
  • Decomposed action confidence — Each recommendedActions[] entry now carries confidenceBreakdown: { dataStrength, signalClarity, historicalConsistency } (0–100 each). Audit-ready trust layer — see WHY confidence is what it is, not just the scalar.
  • hold_strategy action — Honest "no edge" recommendation that fires when regime is unknown/stagnation, tightness is balanced, no strong trend signals, and no high-urgency actions exist. Most tools over-signal — we ship abstention as a first-class verdict.
  • Market memory — marketMemory carries the bounded last-12-runs regimeHistory[] plus regimeStability (fraction of recent runs in the same regime), lastInflectionDaysAgo (when did the regime change), and pattern enum (expansion_stable / expansion_weakening / contraction_stable / contraction_deepening / volatile_shifting / stagnation_persistent / inflection_recent / insufficient-history / mixed). Activates with historical tracking; meaningful at 3+ snapshots. Lets you reason in patterns, not just deltas.
  • Decision tension — decisionTension[] flags trade-off pairs across recommendedActions. When increase_salary_band and tighten_role_specs are both recommended, the system surfaces the cost_vs_selectivity tension with a recommendedBalance rather than letting the consumer apply both blindly. Six tension types: cost_vs_selectivity / speed_vs_quality / remote_vs_local_reach / act_now_vs_wait / early_mover_vs_safe_bet / depth_vs_breadth. Real strategic decisions are trade-offs.
  • Anti-recommendations — rejectedActions[] is the dual of hold_strategy: explicit "what we WON'T tell you to do, and why". Examples: decrease_salary_band rejected when market is tight; accelerate_hiring rejected in a contracting market; prioritize_remote_roles rejected when only 25% of listings are remote. Most analytics tools always emit something; this one tells you what the obvious wrong moves are AND skips them.
  • Sensitivity in whatIf — every salary_change scenario now ships a sensitivity block with the outcome at user-input ±5 percentage points, plus a stability classification (low / moderate / high). Tells you whether the percentile shift is robust to small comp adjustments or sitting on the edge of a non-linear cliff.
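The sensitivity mechanics can be approximated in plain Python. This is a sketch under assumptions: the percentile-rank definition and the stability cutoffs below are illustrative, since the actor does not publish its exact formulas.

```python
def percentile_of(salary: float, cohort: list[float]) -> float:
    """Percentile rank (0-100) of a salary within the cohort distribution."""
    return 100.0 * sum(1 for s in cohort if s < salary) / len(cohort)

def salary_change_sensitivity(base: float, pct: float, cohort: list[float]) -> dict:
    # Evaluate the scenario at pct-5 / pct / pct+5 percentage points, then
    # classify stability by how far the outcome moves (cutoffs are assumed).
    outcomes = {
        label: percentile_of(base * (1 + p / 100.0), cohort)
        for label, p in (("low", pct - 5), ("mid", pct), ("high", pct + 5))
    }
    spread = outcomes["high"] - outcomes["low"]
    outcomes["stability"] = "low" if spread < 10 else "moderate" if spread < 25 else "high"
    return outcomes
```

Here stability: "low" means the percentile shift barely moves when the comp change varies by ±5 points, i.e. the result is robust rather than sitting on a non-linear cliff.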

Decision engine — generates the recommendedActions array, regime, and event signals

  • Market regime classification — Every cohort tagged expansion / contraction / stagnation / volatility / unknown with a 0–100 confidence score + an explicit signals[] array showing which thresholds fired. Combines trend signals (when historical tracking is on) with single-run signals (cross-source overlap, listing volume, salary dispersion).
  • Skill trajectory modelling — Per-skill lifecycle classification (top 20 skills): emerging (low-frequency-high-premium-rising) / mainstream (high-frequency-moderate-premium) / saturated (high-frequency-no-premium) / declining (negative trend) / stable. Plus a velocity tag (hypergrowth / growing / steady / cooling / falling). Bridge between rising-skill counts and "should I learn this?"
  • Recommended actions array — Cohort-level action engine. Each action: { action, target?, confidence, impact, urgency, appliesTo[], reason }. Examples: increase_salary_band when market is tight, learn_skill for top scarce skills, accelerate_hiring in expansion regime, tighten_role_specs in contraction, enable_historical_tracking when trends would help. Reordered by mode preset (default / job_seeker / recruiter / analyst). Capped at 12.
  • Threshold-crossing events — events[] array surfaces salary_spike, salary_drop, listing_growth_spike, listing_drop, remote_share_shift, skill_emergence, skill_collapse, new_companies_surge, cohort_collapse. Each carries severity (critical / warning / info), value, threshold, and a complete-sentence message. User-overridable thresholds via the eventThresholds input. Sorted critical → warning → info. Drop straight into Slack / PagerDuty / Zapier without parsing prose.
  • Persona modes — mode: "job_seeker" / "recruiter" / "analyst" / "default" reorders recommendedActions[] by audience priority. Same actions, different prioritisation per persona.
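The events[] contract is straightforward to reproduce downstream if you need extra custom signals alongside the actor's. A sketch in Python — the threshold values and severity assignments here are illustrative, not the actor's defaults:

```python
SEVERITY_ORDER = {"critical": 0, "warning": 1, "info": 2}

def detect_events(metrics: dict, thresholds: dict) -> list[dict]:
    """Emit threshold-crossing events, sorted critical -> warning -> info."""
    events = []
    if metrics.get("salaryMedianChangePercent", 0) >= thresholds["salary_spike"]:
        events.append({"type": "salary_spike", "severity": "warning",
                       "value": metrics["salaryMedianChangePercent"],
                       "threshold": thresholds["salary_spike"]})
    if metrics.get("listingGrowthRate", 0) <= thresholds["listing_drop"]:
        events.append({"type": "listing_drop", "severity": "critical",
                       "value": metrics["listingGrowthRate"],
                       "threshold": thresholds["listing_drop"]})
    return sorted(events, key=lambda e: SEVERITY_ORDER[e["severity"]])
```

Because each event carries type, severity, value, and threshold as scalars, a Slack or PagerDuty route needs no prose parsing.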

Per-job decision layer — classifies each role for downstream routing

  • Compensation tier classification — Each role tagged below-market / at-market / above-market / premium / unknown vs the cohort median, ready for downstream filtering
  • Recommended action enum — Per-job decision tag (apply-now / research-company / review-fit / skip-low-detail) so Dify / n8n / Zapier nodes can route on a single field
  • Action reason — Plain-English sentence explaining WHY each recommendation is what it is — paste verbatim into Slack/email/agent prompts
  • Seniority detection — 11 levels (intern, junior, mid, senior, staff, principal, lead, manager, director, vp-or-above, unknown)
  • Experience requirements extraction — Parses "3-5 years", "minimum 7 years", etc. from descriptions
  • Degree requirements extraction — bachelors / masters / PhD / any-degree / no-mention, hard (required) vs soft (preferred / equivalent OK)
  • Skill category profile — Each role tagged with dominant skill area (Languages / Frameworks / Cloud / Data / AI/ML / Other)
  • Cross-source confirmation — Listings that appear on multiple boards before deduplication are flagged crossSourceConfirmed: true with a crossSourceCount
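The experience-requirements extraction above can be illustrated with a simplified regex sketch. The actor's real patterns are broader; these two are just enough to handle the documented "3-5 years" and "minimum 7 years" shapes:

```python
import re

# Range like "3-5 years" or "3 to 5 years".
RANGE = re.compile(r"(\d+)\s*(?:-|–|to)\s*(\d+)\s*\+?\s*years?", re.I)
# Open-ended minimum like "minimum 7 years", "at least 7 years", or "5+ years".
MINIMUM = re.compile(r"(?:minimum|at least)\s+(\d+)\s*\+?\s*years?|(\d+)\+\s*years?", re.I)

def extract_experience(text: str):
    """Return (min_years, max_years); max is None for open-ended minimums."""
    if m := RANGE.search(text):
        return int(m.group(1)), int(m.group(2))
    if m := MINIMUM.search(text):
        return int(m.group(1) or m.group(2)), None
    return None, None
```

Returning None for missing data (rather than guessing 0) keeps cohort averages honest downstream.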

Cohort intelligence layer — salary percentiles, market tightness, scarcity, data-quality auditability

  • Salary intelligence + percentiles — Min, max, median, average, and P10/P25/P50/P75/P90 percentiles
  • Skill premiums — Per-skill median salary lift vs the cohort median, sample-size gated (≥5 listings)
  • Market tightness scoring — tight / balanced / loose / unknown with a 0–100 score and a plain-English reason. Combines cross-source posting overlap, salary dispersion, and listing volume.
  • Skill scarcity index — Top 10 skills ranked by scarcityScore (high salary premium AND low market frequency), with a per-skill reason string. The data engineering & talent-strategy moneymaker.
  • Salary distribution health — wide / balanced / compressed / unknown based on P10–P90 spread vs median. Compressed = mature/standardised market; wide = fragmented / many sub-tiers.
  • Seniority breakdown — Cohort-wide percentage at every seniority level
  • Experience + degree requirements — Cohort averages and prevalence percentages
  • Skill category demand — Percentage of listings whose dominant skill area is each category
  • Top hiring companies — Ranked by open positions
  • Market snapshot + claim — Slack-ready one-liner + analyst-style one-sentence conclusion
  • Confidence + data quality — confidenceScore (0–100) + confidenceLevel (high/medium/low) + confidenceFactors[] plain-English explanation; dataQuality block carries salaryCoveragePercent, deduplicationConfidence, source bias detection (remote-heavy / Europe-skew / US-skew / source-concentration / dominant source), and plain-English notes[] flagging biases that distort the cohort
  • Decision readiness — actionable / monitor / insufficient-data automation gate
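For reference, a nearest-rank percentile sketch compatible with the P10–P90 fields. The actor's exact interpolation method is unspecified, so treat this as one reasonable reading, not the implementation:

```python
import math

def salary_percentiles(salaries: list[float], points=(10, 25, 50, 75, 90)) -> dict:
    """Nearest-rank percentiles over the cohort's parsed salaries."""
    ordered = sorted(salaries)
    return {
        # Nearest-rank: 1-indexed position ceil(p/100 * n), clamped to >= 1.
        f"p{p}": ordered[max(1, math.ceil(p * len(ordered) / 100)) - 1]
        for p in points
    }
```

Nearest-rank always returns a value actually observed in the cohort, which matters when salary coverage is sparse and interpolated values would be fictional.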

Segmentation + history — per-segment analytics and cross-run trend state

  • Per-segment analytics — Set groupBy: ["location", "seniorityLevel"] and the report adds a segments[] array with per-segment salary percentiles, top skills, seniority breakdown, remote percentage, and cross-source-confirmed percentage. Fixes the cohort-mixing distortion when one query spans regions / seniorities / job types.
  • Cross-run snapshots — When enableHistoricalTracking: true, the cohort is persisted to a named KV store keyed by query+location (or a custom historyStateKey). Capped lookback via lookbackDays (default 30).
  • Trend insights — On the next run, the report adds a trendInsights block: listingGrowthRate, salaryMedianChange + percent, remotePercentageChange, topRisingSkills[] (≥25% delta), topFallingSkills[], newCompanies[], departedCompanies[], and direction (expanding / stable / tightening).
  • Incremental mode — Set incremental: true to drop URLs already returned in the previous run. Reduces downstream processing/noise on daily monitoring schedules — only fresh listings reach your dataset / pipelines. (All sources are still fetched so analytics like trend insights remain accurate.)
  • Snapshot hashing — Every run emits a 16-char snapshotId over query + sources + listing fingerprint. Compare across runs to detect when the cohort actually changed.

Customisation — domain-specific skills + source weighting

  • Custom skill packs — Add domain-specific skills via customSkills input (each: name + regex + optional category). Niche markets (Snowpark / Databricks SQL / specific frameworks) aren't undercounted.
  • Source weighting — sourceWeights: {"hn-whoishiring": 0.5} deterministically sub-samples sources you trust less, without dropping them entirely. ⚠️ Use only when you intentionally want a representative sample, not complete coverage — sub-sampling drops listings, so cohort size shrinks.

Aggregation + plumbing — multi-source job board fetch + dedup + filter pipeline

  • Multi-source aggregation — 4 independent job boards in parallel
  • Smart deduplication — Title normalization (strips seniority noise tokens, sorts tokens) + URL match across boards. Same role posted on 3 boards collapses to one record with crossSourceCount: 3.
  • Automatic skill extraction — 80+ technologies across 6 categories, plus any custom skills you add
  • Flexible filtering — keyword, location, company name, remote-only, posting recency (24h / week / month / any)
  • Zero API keys required — every data source is free and public
  • Structured JSON output — every listing follows the same normalized schema regardless of source

How to Use

  1. Open the actor in the Apify Console and click "Start"
  2. Enter a search query such as "data engineer", "product manager", or "machine learning". This is the only required field
  3. Optionally refine your search with location, company name, remote-only toggle, date recency, or specific sources
  4. Run the actor and wait for it to finish (typically under 60 seconds). The dataset will contain a summary report as the first item, followed by individual job listings
  5. Export or integrate — download results as JSON, CSV, or Excel, or connect the dataset to Zapier, Make, Google Sheets, or the Apify API for automated workflows

Input Parameters

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| query | String | Yes | "software engineer" | Job search keyword (e.g., "data scientist", "devops", "product manager") |
| location | String | No | — | Filter by location substring (e.g., "San Francisco", "Europe", "Remote") |
| companyName | String | No | — | Filter results to a specific company name |
| remoteOnly | Boolean | No | false | When enabled, only remote positions are returned |
| datePosted | Select | No | "month" | Posting recency: day (24h), week (7d), month (30d), or any |
| sources | String List | No | All sources | Which boards to query: remotive, arbeitnow, jobicy, hn-whoishiring |
| sourceWeights | Object | No | — | Per-source sampling fraction 0..1 (e.g., {"hn-whoishiring": 0.5}). Sources not listed pass through whole. Deterministic per-listing hash so re-runs are reproducible. Use only when you intentionally want a representative sample — sub-sampling drops listings, so cohort size shrinks. |
| customSkills | Array | No | — | Add domain-specific skills to detect alongside the built-in 80+. Each: { name, regex, category? }. |
| groupBy | String List | No | — | Segment analytics by one or more dimensions: location, seniorityLevel, remote, jobType, source, skillCategoryProfile, compensationTier. Adds segments[] to the summary. |
| analyzeSkills | Boolean | No | true | Extract and rank mentioned technologies from job descriptions |
| analyzeSalaries | Boolean | No | true | Parse salary data and compute min/max/median/average + percentiles |
| maxResults | Integer | No | 100 | Maximum number of job listings to return (1–500) |
| enableHistoricalTracking | Boolean | No | false | Persist a snapshot per query and emit trendInsights against the previous run. First run returns trendInsights: null and writes the baseline. |
| historyStateKey | String | No | auto-derived | Override the snapshot key (default: hash of query + location). Stable string for cross-run comparisons. |
| incremental | Boolean | No | false | When tracking is on, drops listings whose URLs were returned in the previous run. Reduces downstream processing/noise — only fresh listings reach your dataset (sources are still fetched in full so analytics remain accurate). |
| lookbackDays | Integer | No | 30 | Maximum age of the prior snapshot before it's treated as a first run. |
| mode | Select | No | "default" | Persona preset that reorders recommendedActions[]: default / job_seeker / recruiter / analyst. Same action set, different audience-priority ordering. |
| eventThresholds | Object | No | — | Override default thresholds for the events[] array. Defaults: salarySpikePercent: 5, salaryDropPercent: -5, listingGrowthSpikePercent: 25, listingDropPercent: -25, remoteShiftPoints: 5, skillEmergenceDeltaPercent: 100. Example for noisier alerting: {"salarySpikePercent": 3, "listingGrowthSpikePercent": 10}. |
| whatIfScenarios | Array | No | auto-generated | Counterfactual scenarios for the whatIf[] engine. Each: { type: "salary_change" \| "skill_emphasis", percent? (for salary), skill? (for skill), constraints?: { maxPercent?, minPercent? } }. When omitted, the actor auto-generates 2–4 representative scenarios. Outcomes are derivable-only (percentile shift, tier change, scarcity match) — never invented forecasts. |

Input Examples

Broad market scan for data engineers:

{
  "query": "data engineer",
  "datePosted": "month",
  "analyzeSkills": true,
  "analyzeSalaries": true,
  "maxResults": 200
}

Remote-only React developer roles in Europe:

{
  "query": "react developer",
  "location": "Europe",
  "remoteOnly": true,
  "datePosted": "week",
  "sources": ["remotive", "arbeitnow", "jobicy"]
}

Monitor a specific company's hiring:

{
  "query": "engineer",
  "companyName": "Stripe",
  "maxResults": 50
}

Quick pulse check from HN startups only:

{
  "query": "machine learning",
  "sources": ["hn-whoishiring"],
  "datePosted": "month",
  "maxResults": 100
}

Segmented salary analysis (US vs Europe, junior vs senior, remote vs on-site):

{
  "query": "data engineer",
  "groupBy": ["location", "seniorityLevel", "remote"],
  "maxResults": 300
}

Daily monitoring schedule with trend insights + incremental fetch:

{
  "query": "rust engineer",
  "remoteOnly": true,
  "datePosted": "week",
  "enableHistoricalTracking": true,
  "incremental": true,
  "lookbackDays": 30
}

Schedule this in Apify Console once a day. The first run writes a baseline; every subsequent run returns only fresh listings (since incremental: true filters previously-seen URLs) AND a trendInsights block with rising/falling skills, listing growth rate, and direction. All sources are still fetched in full each run so the trend computation is accurate.
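The incremental filter itself is a simple set difference over URLs. A sketch (assuming the previous run's URLs are loaded from the stored snapshot):

```python
def incremental_filter(listings, previous_urls):
    # Keep only listings whose URL was not returned by the previous run.
    # Analytics still run over the full fetch; only the dataset push is filtered.
    return [job for job in listings if job["url"] not in previous_urls]
```

This is why trend computation stays accurate: the diff happens after the analytics stage, not before it.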

Niche market with custom skill packs (Snowflake / Databricks ecosystem):

{
  "query": "data engineer",
  "customSkills": [
    { "name": "Snowpark", "regex": "\\bsnowpark\\b", "category": "Data" },
    { "name": "dbt", "regex": "\\bdbt\\b", "category": "Data" },
    { "name": "Databricks SQL", "regex": "databricks\\s+sql", "category": "Data" },
    { "name": "Unity Catalog", "regex": "unity\\s+catalog", "category": "Data" }
  ]
}

Down-weight noisier sources (HN comments) without dropping them entirely:

{
  "query": "site reliability engineer",
  "sourceWeights": { "hn-whoishiring": 0.3 }
}

Recruiter mode — actions prioritized for hiring teams:

{
  "query": "platform engineer",
  "mode": "recruiter",
  "enableHistoricalTracking": true,
  "groupBy": ["seniorityLevel", "remote"]
}

The recommendedActions[] array surfaces increase_salary_band, accelerate_hiring, and tighten_role_specs ahead of curriculum / job-seeker actions.

Analyst mode with sensitive event thresholds:

{
  "query": "machine learning engineer",
  "mode": "analyst",
  "enableHistoricalTracking": true,
  "eventThresholds": {
    "salarySpikePercent": 3,
    "listingGrowthSpikePercent": 10,
    "skillEmergenceDeltaPercent": 50
  }
}

Lower thresholds = more sensitive event firing. Useful for early-warning monitoring on volatile markets.
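The threshold-merging and event-firing logic reduces to: overlay user overrides on the defaults, then compare each metric against its threshold. A sketch covering two of the nine event types (the others follow the same pattern; exact severity assignment is not shown):

```python
# Defaults as documented for the eventThresholds input
DEFAULT_THRESHOLDS = {
    "salarySpikePercent": 5,
    "salaryDropPercent": -5,
    "listingGrowthSpikePercent": 25,
    "listingDropPercent": -25,
    "remoteShiftPoints": 5,
    "skillEmergenceDeltaPercent": 100,
}

def detect_events(metrics, overrides=None):
    t = {**DEFAULT_THRESHOLDS, **(overrides or {})}  # user overrides win
    events = []
    growth = metrics.get("listingGrowthRate")
    if growth is not None and growth >= t["listingGrowthSpikePercent"]:
        events.append({"type": "listing_growth_spike", "value": growth,
                       "threshold": t["listingGrowthSpikePercent"],
                       "thresholdCrossed": True})
    salary = metrics.get("salaryMedianChangePercent")
    if salary is not None and salary >= t["salarySpikePercent"]:
        events.append({"type": "salary_spike", "value": salary,
                       "threshold": t["salarySpikePercent"],
                       "thresholdCrossed": True})
    return events
```

With default thresholds, +12.5% listing growth fires nothing; with the analyst override of 10, it fires a listing_growth_spike event.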

Constrained what-if simulation (recruiter with a 5% comp-budget cap):

{
  "query": "platform engineer",
  "mode": "recruiter",
  "whatIfScenarios": [
    { "type": "salary_change", "percent": 10, "constraints": { "maxPercent": 5 } },
    { "type": "salary_change", "percent": -3 },
    { "type": "skill_emphasis", "skill": "Kubernetes" },
    { "type": "skill_emphasis", "skill": "Rust" }
  ]
}

The first scenario asks "what if I raise comp 10%?" but constrains the answer to 5% (the recruiter's actual budget cap). The output's effectiveness: "limited" flags when the constraint binds. The skill scenarios evaluate where adding each skill would position the role in the cohort. Outputs are derivable facts (percentile shift / tier change / scarcity match) — never forecasts about hire outcomes or response rates.
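The "derivable-only" percentile shift can be illustrated with a simple rank-based mapping — an approximation for intuition, since the actor documents a pooled min+max distribution mapping that may differ in detail:

```python
import statistics

def percentile_of(value, salaries):
    # Simple rank-based percentile within the cohort's salary data points.
    below = sum(1 for s in salaries if s <= value)
    return round(100 * below / len(salaries))

def salary_change_scenario(salaries, percent):
    median = statistics.median(salaries)
    scenario = median * (1 + percent / 100)
    return {
        "currentPercentile": percentile_of(median, salaries),
        "scenarioPercentile": percentile_of(scenario, salaries),
        "scenarioMedianSalary": scenario,
    }
```

Everything here is computed from the cohort's own salary points at run time — no claim about hiring outcomes enters the calculation, which is the point of the "never forecasts" guarantee.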

Tips for Input

  • Start broad, then filter — Run a general query like "engineer" first to see the full landscape, then narrow with location or company filters in subsequent runs.
  • Source selection — Remotive and Jobicy focus on remote roles, Arbeitnow covers European markets heavily, and HN Who's Hiring surfaces startup opportunities. Use sources to target specific ecosystems.
  • Date filter — day = last 24 hours, week = last 7 days, month = last 30 days, any = no time restriction.

Output Example

The dataset contains two types of records. The first item is always a summary report:

{
  "type": "summary",
  "query": "data engineer",
  "location": null,
  "analyzedAt": "2026-05-02T14:32:00.000Z",
  "totalListings": 87,
  "sourceBreakdown": { "remotive": 24, "arbeitnow": 31, "jobicy": 18, "hn-whoishiring": 14 },
  "topSkills": [
    { "skill": "Python", "count": 62, "percentage": 71.3 },
    { "skill": "SQL", "count": 58, "percentage": 66.7 },
    { "skill": "AWS", "count": 41, "percentage": 47.1 },
    { "skill": "Spark", "count": 33, "percentage": 37.9 },
    { "skill": "Kafka", "count": 28, "percentage": 32.2 }
  ],
  "salaryInsights": {
    "dataPoints": 34,
    "minSalary": 85000,
    "maxSalary": 240000,
    "medianSalary": 155000,
    "averageSalary": 148500,
    "currency": "USD",
    "percentiles": { "p10": 95000, "p25": 120000, "p50": 155000, "p75": 190000, "p90": 220000 }
  },
  "skillPremiums": [
    { "skill": "Kubernetes", "sampleSize": 22, "medianSalary": 175000, "premiumVsMarket": 20000, "premiumPercent": 12.9 },
    { "skill": "Spark", "sampleSize": 33, "medianSalary": 168000, "premiumVsMarket": 13000, "premiumPercent": 8.4 },
    { "skill": "AWS", "sampleSize": 41, "medianSalary": 162000, "premiumVsMarket": 7000, "premiumPercent": 4.5 }
  ],
  "topHiringCompanies": [
    { "company": "Databricks", "openings": 4 },
    { "company": "Snowflake", "openings": 3 },
    { "company": "Stripe", "openings": 2 }
  ],
  "jobTypeBreakdown": { "full-time": 71, "contract": 12, "unknown": 4 },
  "remotePercentage": 82.8,
  "seniorityBreakdown": {
    "intern": 0, "junior": 8.0, "mid": 21.8, "senior": 41.4, "staff": 6.9,
    "principal": 3.4, "lead": 5.7, "manager": 4.6, "director": 1.1,
    "vp-or-above": 0, "unknown": 7.1
  },
  "experienceRequirements": {
    "averageYearsMin": 4.2,
    "averageYearsMax": 7.1,
    "requireExperiencePercent": 78.2,
    "sampleSize": 68
  },
  "degreeRequirements": {
    "bachelorsRequiredPercent": 34.5,
    "mastersOrAbovePercent": 6.9,
    "noDegreeMentionedPercent": 51.7,
    "hardRequirementPercent": 12.6
  },
  "skillCategoryDemand": {
    "Languages": 28.7, "Frameworks": 11.5, "Cloud": 18.4,
    "Data": 33.3, "AI/ML": 5.7, "Other": 2.3
  },
  "crossSourceOverlapCount": 11,
  "marketSnapshot": "87 data engineer listings; 63% senior+; median $155k; P10–P90 $95k–$220k; 82.8% remote; Data 33.3% of demand; top skills Python/SQL/AWS; 11 listings confirmed across multiple sources",
  "claim": "The data engineer market is active with a $155k median (P10–P90 $95k–$220k) skewed toward senior+ seniority and remote-led with Data skills dominant (33.3% of demand).",
  "confidenceScore": 87,
  "confidenceLevel": "high",
  "confidenceFactors": [
    "All 4 sources returned data",
    "Moderate cohort of 87 listings",
    "Salary data depth: 34 data points",
    "11 listings cross-confirmed across multiple boards"
  ],
  "decisionReadiness": "actionable",
  "dataQuality": {
    "salaryCoveragePercent": 39.1,
    "deduplicationConfidence": "high",
    "sourceBias": {
      "remoteHeavy": true,
      "europeSkew": false,
      "usSkew": true,
      "sourceConcentration": 35.6,
      "dominantSource": "arbeitnow"
    },
    "notes": [
      "82.8% of listings are remote — on-site benchmarks under-represented.",
      "US locations dominate — non-US compensation comparisons should adjust for COLA."
    ]
  },
  "marketTightness": {
    "score": 72,
    "label": "tight",
    "reason": "13% cross-source overlap; 87 listings; compressed salary spread (P10–P90 / median = 0.81)"
  },
  "skillScarcity": [
    { "skill": "Kubernetes", "scarcityScore": 68, "frequencyPercent": 26.4, "premiumPercent": 12.9, "reason": "+12.9% salary premium with 26.4% market frequency" },
    { "skill": "Spark", "scarcityScore": 62, "frequencyPercent": 37.9, "premiumPercent": 8.4, "reason": "+8.4% salary premium with 37.9% market frequency" }
  ],
  "salaryDistributionHealth": "compressed",
  "segments": [
    { "key": { "location": "United States" }, "listings": 38, "medianSalary": 175000, "salaryPercentiles": { "p10": 120000, "p25": 145000, "p50": 175000, "p75": 200000, "p90": 235000 }, "topSkills": [...], "seniorityBreakdown": {...}, "remotePercentage": 71.1, "crossSourceConfirmedPercent": 18.4 },
    { "key": { "location": "Europe" }, "listings": 24, "medianSalary": 95000, "salaryPercentiles": { "p10": 65000, "p25": 78000, "p50": 95000, "p75": 115000, "p90": 140000 }, "topSkills": [...], "seniorityBreakdown": {...}, "remotePercentage": 91.7, "crossSourceConfirmedPercent": 8.3 }
  ],
  "trendInsights": {
    "sinceLastRun": true,
    "previousRunAt": "2026-04-25T14:32:00.000Z",
    "daysSincePreviousRun": 7.0,
    "listingGrowthRate": 12.5,
    "salaryMedianChange": 7000,
    "salaryMedianChangePercent": 4.7,
    "remotePercentageChange": 2.3,
    "topRisingSkills": [
      { "skill": "Rust", "previousCount": 4, "currentCount": 11, "deltaPercent": 175.0 },
      { "skill": "Databricks", "previousCount": 8, "currentCount": 14, "deltaPercent": 75.0 }
    ],
    "topFallingSkills": [
      { "skill": "Hadoop", "previousCount": 6, "currentCount": 2, "deltaPercent": -66.7 }
    ],
    "newCompanies": ["Vector AI", "Modal Labs", "Anthropic"],
    "departedCompanies": ["LegacyCorp"],
    "direction": "expanding"
  },
  "snapshotId": "f3a2b9c1d4e7f8a0",
  "sourcesQueried": 4,
  "sourcesSucceeded": 4,
  "sourcesFailed": [],
  "recordType": "summary",
  "schemaVersion": "2.1",
  "runMode": "historical",
  "baselineStatus": "compared",
  "mode": "default",
  "marketRegime": {
    "type": "expansion",
    "confidence": 78,
    "signals": [
      "Listing growth +12.5%",
      "Salary median +4.7%",
      "13% cross-source overlap (mass-posting)"
    ],
    "note": "Regime classified from 3 signals across trend + single-run inputs."
  },
  "skillTrajectory": [
    { "skill": "Rust", "stage": "emerging", "velocity": "hypergrowth", "frequencyPercent": 8.1, "premiumPercent": 14.2, "deltaPercent": 175.0, "confidence": 100, "reason": "8.1% market frequency; +14.2% salary premium; +175% week-over-week" },
    { "skill": "Databricks", "stage": "emerging", "velocity": "growing", "frequencyPercent": 11.3, "premiumPercent": 9.8, "deltaPercent": 75.0, "confidence": 100, "reason": "11.3% market frequency; +9.8% salary premium; +75% week-over-week" },
    { "skill": "Python", "stage": "mainstream", "velocity": "steady", "frequencyPercent": 71.3, "premiumPercent": 2.1, "deltaPercent": null, "confidence": 75, "reason": "71.3% market frequency; +2.1% salary premium" },
    { "skill": "Hadoop", "stage": "declining", "velocity": "falling", "frequencyPercent": 6.7, "premiumPercent": -3.2, "deltaPercent": -66.7, "confidence": 100, "reason": "6.7% market frequency; -3.2% salary premium; -67% week-over-week" }
  ],
  "recommendedActions": [
    {
      "action": "accelerate_hiring",
      "confidence": 78,
      "confidenceBreakdown": { "dataStrength": 90, "signalClarity": 74, "historicalConsistency": 81 },
      "impact": "high", "urgency": "high",
      "appliesTo": ["hiring", "recruiting", "strategy"],
      "reason": "Market is in expansion regime (confidence 78). Listing growth +12.5%; Salary median +4.7%. Move now while supply still meets demand."
    },
    {
      "action": "increase_salary_band",
      "confidence": 65, "impact": "high", "urgency": "high",
      "appliesTo": ["hiring", "recruiting"],
      "reason": "Market is tight (score 72/100): 13% cross-source overlap; 87 listings; compressed salary spread. Median is $155k — bands below this will struggle to attract candidates."
    },
    {
      "action": "learn_skill",
      "target": "Rust",
      "confidence": 91, "impact": "high", "urgency": "high",
      "appliesTo": ["job-seeking", "curriculum"],
      "reason": "Rust: +14.2% salary premium with 8.1% market frequency. Scarcity score 78/100 — high salary lift with low market saturation."
    },
    {
      "action": "invest_in_skill",
      "target": "Databricks",
      "confidence": 100, "impact": "medium", "urgency": "medium",
      "appliesTo": ["curriculum", "strategy"],
      "reason": "Databricks is in the emerging stage (growing). 11.3% market frequency; +9.8% salary premium; +75% week-over-week. Early adopters get the premium before mainstream saturation."
    }
  ],
  "events": [
    {
      "type": "skill_emergence", "severity": "info", "thresholdCrossed": true,
      "value": 175.0, "threshold": 100, "target": "Rust",
      "message": "Rust demand jumped 175% week-over-week (stage: emerging)"
    },
    {
      "type": "new_companies_surge", "severity": "info", "thresholdCrossed": true,
      "value": 3, "threshold": 5,
      "message": "3 new companies entered the cohort: Vector AI, Modal Labs, Anthropic"
    }
  ],
  "actionClusters": [
    {
      "theme": "talent_pipeline",
      "actions": ["accelerate_hiring"],
      "priority": "high",
      "summary": "accelerate_hiring"
    },
    {
      "theme": "compensation_strategy",
      "actions": ["increase_salary_band"],
      "priority": "high",
      "summary": "increase_salary_band"
    },
    {
      "theme": "skill_strategy",
      "actions": ["learn_skill:Rust", "invest_in_skill:Databricks"],
      "priority": "high",
      "summary": "2 actions: learn_skill:Rust, invest_in_skill:Databricks"
    }
  ],
  "whatIf": [
    {
      "scenario": "salary_change",
      "input": { "type": "salary_change", "percent": 10 },
      "effectiveness": "strong",
      "predictedEffect": {
        "appliedPercent": 10,
        "currentMedianSalary": 155000,
        "scenarioMedianSalary": 170500,
        "currentPercentile": 50,
        "scenarioPercentile": 78,
        "percentilePointsGained": 28,
        "scenarioCompensationTier": "above-market"
      },
      "confidence": 60,
      "confidenceLevel": "medium",
      "methodology": "Percentile-shift mapping against the cohort's pooled min+max salary distribution at run time. Tier classification uses fixed cohort-median ratio thresholds (0.85 / 1.10 / 1.35).",
      "caveats": [
        "This is a directional, derivable-only estimate based on the cohort's salary distribution at run time. It is not a forecast.",
        "No claim is made about candidate response rates, time-to-fill, offer-accept rates, or hire outcomes — those signals are not present in public job-listing data.",
        "Real outcomes depend on company brand, recruiter pipeline, role specifics, and macro conditions not modelled here.",
        "Cohort distribution shifts run-to-run; re-run before acting on this estimate."
      ],
      "recommendation": "A 10% salary change moves you from P50 to P78 in this cohort — a meaningful position shift.",
      "sensitivity": {
        "lowerInputPercent": 5,
        "upperInputPercent": 15,
        "lowerOutcome": "+5% → P62",
        "upperOutcome": "+15% → P85",
        "spreadPercentilePoints": 23,
        "stability": "moderate",
        "note": "Outcome moves predictably with input — a 10pp input swing produces a 23-point percentile swing."
      }
    },
    {
      "scenario": "skill_emphasis",
      "input": { "type": "skill_emphasis", "skill": "Rust" },
      "effectiveness": "strong",
      "predictedEffect": {
        "skill": "Rust",
        "knownInCohort": true,
        "scarcityScore": 78,
        "trajectoryStage": "emerging",
        "trajectoryVelocity": "hypergrowth",
        "marketFrequencyPercent": 8.1,
        "salaryPremiumPercent": 14.2
      },
      "confidence": 60,
      "confidenceLevel": "medium",
      "methodology": "Skill is matched (case-insensitive) against the cohort's skillScarcity, skillTrajectory, skillPremiums, and topSkills outputs. No external benchmark or hire-outcome data is used.",
      "caveats": [
        "This is a market-positioning estimate, not a hire/job-acquisition forecast.",
        "Skill demand changes over time; re-run before acting on this estimate.",
        "Premium percentages are sample-size gated (≥5 listings); skills below that threshold return null premium."
      ],
      "recommendation": "Adding \"Rust\" aligns with a high-leverage position: emerging stage with scarcity score 78/100, +14.2% salary premium.",
      "sensitivity": null
    }
  ],
  "decisionTension": [
    {
      "between": ["increase_salary_band", "tighten_role_specs"],
      "tension": "cost_vs_selectivity",
      "explanation": "Raising salary improves candidate positioning, while tightening role specs reduces the eligible pool. Doing both at once may produce a small, expensive hire pipeline that misses both levers individually.",
      "recommendedBalance": "In tight markets prioritise the salary increase first; defer spec tightening unless inbound pipeline volume becomes excessive."
    }
  ],
  "rejectedActions": [
    {
      "action": "decrease_salary_band",
      "reason": "Market is tight (score 72/100). Lowering salary would reduce competitiveness against a pipeline that already favours employers raising bands. Not recommended."
    },
    {
      "action": "expand_geographic_search",
      "reason": "82.8% of listings are remote — geographic expansion adds no opportunity coverage when the market is location-agnostic. Use remote-first sourcing instead."
    },
    {
      "action": "hold_strategy",
      "reason": "Market regime is expansion with confidence 78/100 — there is a clear directional edge. Doing nothing is not the right read for this cohort."
    }
  ],
  "marketMemory": {
    "regimeHistory": [
      { "regime": "expansion", "at": "2026-04-04T14:32:00.000Z" },
      { "regime": "expansion", "at": "2026-04-11T14:32:00.000Z" },
      { "regime": "expansion", "at": "2026-04-18T14:32:00.000Z" },
      { "regime": "expansion", "at": "2026-04-25T14:32:00.000Z" },
      { "regime": "expansion", "at": "2026-05-02T14:32:00.000Z" }
    ],
    "regimeStability": 1.0,
    "lastInflectionDaysAgo": null,
    "pattern": "expansion_stable",
    "note": "Pattern derived from the last 5 regime classifications (capped at 12)."
  },
  "analysisMetadata": {
    "salarySampleSize": 34,
    "segmentCount": 0,
    "historicalTrackingEnabled": true,
    "incrementalApplied": false,
    "customSkillCount": 0,
    "sourceWeightsApplied": false,
    "sourcesQueried": 4,
    "sourcesSucceeded": 4,
    "mode": "default"
  },
  "warnings": [
    "82.8% of listings are remote — on-site benchmarks under-represented.",
    "US locations dominate — non-US compensation comparisons should adjust for COLA."
  ]
}

Each subsequent item is a normalized job listing:

{
  "type": "job",
  "source": "remotive",
  "title": "Senior Data Engineer",
  "company": "Snowflake",
  "location": "Worldwide",
  "remote": true,
  "jobType": "full-time",
  "salaryMin": 160000,
  "salaryMax": 210000,
  "salaryCurrency": "USD",
  "description": "We are looking for a Senior Data Engineer to build and maintain our core data platform...",
  "skills": ["Python", "SQL", "Spark", "Kafka", "Airflow", "AWS", "Docker", "Kubernetes"],
  "tags": ["data", "engineering", "big-data"],
  "postedDate": "2026-05-02T08:00:00.000Z",
  "url": "https://remotive.com/remote-jobs/software-dev/senior-data-engineer-12345",
  "applyUrl": "https://remotive.com/remote-jobs/software-dev/senior-data-engineer-12345",
  "seniorityLevel": "senior",
  "experienceYearsMin": 5,
  "experienceYearsMax": 8,
  "degreeRequired": "bachelors",
  "degreeIsHardRequirement": false,
  "skillCategoryProfile": "Data",
  "crossSourceConfirmed": true,
  "crossSourceCount": 2,
  "compensationTier": "above-market",
  "recommendedAction": "apply-now",
  "actionReason": "Above-market compensation tier (110–135% of market median) with disclosed salary at a named company.",
  "recordType": "job"
}
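The compensationTier classification follows directly from the documented ratio bands (below-market <85%, at-market 85–110%, above-market 110–135%, premium >135%). A sketch — the use of the salary-range midpoint and the inclusivity of the band boundaries are assumptions for illustration:

```python
def compensation_tier(salary_mid, market_median):
    # salary_mid: e.g. midpoint of salaryMin/salaryMax; None when undisclosed.
    if salary_mid is None or market_median is None:
        return "unknown"
    ratio = salary_mid / market_median
    if ratio < 0.85:
        return "below-market"
    if ratio <= 1.10:       # boundary inclusivity assumed
        return "at-market"
    if ratio <= 1.35:
        return "above-market"
    return "premium"
```

For the sample listing above, the midpoint (160k + 210k) / 2 = 185k against the cohort median of 155k gives a ratio of ~1.19, landing in above-market — consistent with the record's recommendedAction of apply-now.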

Output Fields — Summary Report

| Field | Type | Description |
| --- | --- | --- |
| type | string | Always "summary" for the report record |
| query | string | The search query used |
| location | string \| null | Location filter applied (if any) |
| analyzedAt | string | ISO timestamp of when the analysis ran |
| totalListings | number | Total deduplicated job listings found |
| sourceBreakdown | object | Count of listings per source (e.g., {"remotive": 24, "arbeitnow": 31}) |
| topSkills | array | Top 30 skills ranked by frequency, each with skill, count, and percentage |
| salaryInsights | object \| null | Salary statistics: dataPoints, minSalary, maxSalary, medianSalary, averageSalary, currency, plus percentiles (p10/p25/p50/p75/p90) when ≥5 data points |
| skillPremiums | array | Per-skill median salary lift vs cohort median, each with skill, sampleSize, medianSalary, premiumVsMarket, premiumPercent (only skills with ≥5 salary data points) |
| topHiringCompanies | array | Top 20 companies by number of open positions, each with company and openings |
| jobTypeBreakdown | object | Count per job type: full-time, part-time, contract, internship, temporary, unknown |
| remotePercentage | number | Percentage of listings flagged as remote |
| seniorityBreakdown | object | Percentage of listings at each seniority level: intern, junior, mid, senior, staff, principal, lead, manager, director, vp-or-above, unknown |
| experienceRequirements | object | averageYearsMin, averageYearsMax, requireExperiencePercent, sampleSize |
| degreeRequirements | object | bachelorsRequiredPercent, mastersOrAbovePercent, noDegreeMentionedPercent, hardRequirementPercent |
| skillCategoryDemand | object | Percentage of listings whose dominant skill area is each category: Languages, Frameworks, Cloud, Data, AI/ML, Other |
| crossSourceOverlapCount | number | Count of listings that appeared on multiple boards before deduplication (legitimacy signal) |
| marketSnapshot | string | Slack/email-ready one-line headline summarizing the cohort (metric-first) |
| claim | string | Analyst-style one-sentence conclusion about the cohort (paste verbatim into reports / Slack / agent prompts) |
| confidenceScore | number | 0–100 score combining source coverage (30%) + cohort size (30%) + salary data depth (25%) + cross-source overlap (15%) |
| confidenceLevel | string | Banded confidence: high (≥75), medium (≥50), low (<50). Use this in Dify/n8n switch nodes. |
| confidenceFactors | string[] | Plain-English explanations of WHY confidence is what it is — usable verbatim in reports |
| decisionReadiness | string | Automation gate: actionable (confidence ≥70 + ≥10 salary points + ≥10 listings), monitor (worth tracking but don't auto-act), insufficient-data (<10 listings) |
| dataQuality | object | Auditability block: salaryCoveragePercent, deduplicationConfidence (high/medium/low), sourceBias ({remoteHeavy, europeSkew, usSkew, sourceConcentration, dominantSource}), notes[] plain-English bias warnings |
| marketTightness | object | Supply/demand index: { score (0–100), label: tight/balanced/loose/unknown, reason }. Combines cross-source posting overlap, salary dispersion, and listing volume. |
| skillScarcity | object[] | Top 10 skills ranked by scarcityScore (high salary premium AND low frequency). Each: { skill, scarcityScore (0–100), frequencyPercent, premiumPercent, reason }. Empty when cohort < 20 listings. |
| salaryDistributionHealth | string | wide (P10–P90 spread > 1.2× median) / balanced / compressed (< 0.5×) / unknown. Compressed = mature/standardised market. |
| segments | object[] | Per-segment analytics when groupBy is set. Each: { key, listings, medianSalary, salaryPercentiles, topSkills, seniorityBreakdown, remotePercentage, crossSourceConfirmedPercent }. Capped at 50. |
| trendInsights | object \| null | Cross-run trends when enableHistoricalTracking is on AND a prior snapshot exists within lookbackDays. { sinceLastRun, previousRunAt, daysSincePreviousRun, listingGrowthRate, salaryMedianChange, salaryMedianChangePercent, remotePercentageChange, topRisingSkills[], topFallingSkills[], newCompanies[], departedCompanies[], direction }. Null on first run. |
| snapshotId | string | 16-char SHA-256 hash over query + location + sources + listing fingerprint. Compare across runs to detect when the cohort actually changed. |
| schemaVersion | string | Output contract version (semver-style) — currently "2.1". Major bumps signal breaking changes; minor bumps signal additive expansions. 2.1 is additive-only since 2.0 (added: actionClusters, whatIf + sensitivity, marketMemory, decisionTension, rejectedActions, action confidenceBreakdown). Branch on this in long-lived integrations to opt into new features explicitly. |
| runMode | string | What kind of run this was: snapshot (one-shot), historical (snapshot + trend computation), incremental (snapshot + trend + drop already-seen URLs). |
| baselineStatus | string | Lifecycle of the historical snapshot for this run: created (first baseline written), compared (trend insights computed against an existing baseline), expired (prior baseline was older than lookbackDays — fresh one written, trends null this run), disabled (historical tracking off). |
| analysisMetadata | object | Run-level metadata about the analytics computation: salarySampleSize, segmentCount, historicalTrackingEnabled, incrementalApplied, customSkillCount, sourceWeightsApplied, sourcesQueried, sourcesSucceeded, mode. Distinct from dataQuality (which is about the cohort's biases, not the run's machinery). |
| warnings | string[] | Top-level run-level warnings (sources failed, low confidence, expired baseline, critical events, etc.). Promotes dataQuality.notes alongside other run-level signals so downstream consumers don't have to walk into nested objects. Empty array when nothing notable. Read this before acting on the cohort's analytics. |
| mode | string | Active persona preset: default / job_seeker / recruiter / analyst. Echoed on the summary so downstream automation can branch on the persona that produced the output. |
| marketRegime | object | State classification: { type (expansion/contraction/stagnation/volatility/unknown), confidence (0–100), signals[] (which thresholds fired), note }. Combines trend + single-run signals; confidence is materially higher when historical tracking is on. |
| recommendedActions | object[] | Cohort-level action engine (capped at 12). Each: { action, target?, confidence (0–100), confidenceBreakdown: { dataStrength, signalClarity, historicalConsistency }, impact (high/medium/low), urgency (high/medium/low), appliesTo[] (hiring/recruiting/job-seeking/curriculum/strategy/monitoring), reason }. Sorted by mode audience priority, then urgency, then confidence. Branch on action (stable enum string) for automation; filter by appliesTo to surface only the actions a given persona cares about. Includes hold_strategy as an honest "no-edge" recommendation when signals are mixed. |
| actionClusters | object[] | Recommended actions grouped by theme: compensation_strategy, talent_pipeline, skill_strategy, monitoring_strategy, source_strategy, general. Each: { theme, actions[], priority (high/medium/low), summary }. Sorted high → low priority then by cluster size. Reduces noise when 8–12 actions belong to a few strategic surfaces. |
| whatIf | object[] | Counterfactual scenarios with honest, derivable-only outcomes (percentile shift, tier change, scarcity match) — never invented forecasts. Each: { scenario, input, effectiveness (strong/moderate/limited/none/unknown), predictedEffect, confidence (hard-capped at 60), confidenceLevel, methodology, caveats[], recommendation, sensitivity }. sensitivity (salary scenarios only) ships lowerOutcome/upperOutcome at user-input ±5pp + a stability enum (low / moderate / high / unknown) so you can see if the percentile shift is robust to small input variation. Auto-generated when whatIfScenarios input is omitted; honors user scenarios + constraints when supplied. Scenario types: salary_change (% delta) and skill_emphasis (named skill). |
| decisionTension | object[] | Trade-off pairs detected across recommendedActions[]. Each: { between: [actionA, actionB], tension (cost_vs_selectivity / speed_vs_quality / remote_vs_local_reach / act_now_vs_wait / early_mover_vs_safe_bet / depth_vs_breadth), explanation, recommendedBalance }. Surfaces when two recommended actions work against each other under a single sourcing pipeline. Empty when no contradictory pairs are present. |
| rejectedActions | object[] | Anti-recommendations — actions explicitly NOT recommended for this cohort, with reason. Each: { action, target?, reason }. The dual of hold_strategy: instead of staying silent on the obvious wrong moves, the system surfaces them and explains why it skipped them. Builds trust by showing the engine considered alternatives. Empty when no anti-recommendations apply. |
| marketMemory | object | Bounded last-12-runs regime history with pattern detection. { regimeHistory[] (regime + at), regimeStability (0..1), lastInflectionDaysAgo, pattern, note }. Patterns: expansion_stable / expansion_weakening / contraction_stable / contraction_deepening / volatile_shifting / stagnation_persistent / inflection_recent / insufficient-history (until 3 snapshots) / mixed. Activates with enableHistoricalTracking; meaningful at 3+ snapshots. Lets you reason in patterns, not just deltas. |
| skillTrajectory | object[] | Per-skill lifecycle classification (top 20 skills): { skill, stage (declining/stable/emerging/mainstream/saturated), velocity (hypergrowth/growing/steady/cooling/falling/unknown), frequencyPercent, premiumPercent, deltaPercent, confidence, reason }. Sorted emerging → mainstream → other. The bridge between rising/falling counts and "what does it mean for me?" |
| events | object[] | Threshold-crossing events ready for downstream alerting. Each: { type, severity (critical/warning/info), thresholdCrossed, value, threshold, target?, message }. Event types: salary_spike, salary_drop, listing_growth_spike, listing_drop, remote_share_shift, skill_emergence, skill_collapse, new_companies_surge, cohort_collapse. Thresholds user-overridable via the eventThresholds input. Sorted critical → warning → info. |
| sourcesQueried | number | Number of job board sources queried this run |
| sourcesSucceeded | number | Number of job board sources that returned data |
| sourcesFailed | string[] | Names of sources that failed this run; empty when all succeeded |
| recordType | string | Discriminator for downstream filtering — summary for the summary record, job for individual listings, error for error records. (type is a deprecated alias kept for back-compat.) |
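The confidenceScore weighting and confidenceLevel banding documented above can be combined like this. The weights and bands come straight from the field descriptions; how each 0–100 component score is derived from raw counts is an assumption left out here:

```python
def confidence_score(source_coverage, cohort_size, salary_depth, overlap):
    # Each argument is a 0-100 component score; weights per the documented formula:
    # source coverage 30% + cohort size 30% + salary depth 25% + overlap 15%.
    score = (0.30 * source_coverage + 0.30 * cohort_size
             + 0.25 * salary_depth + 0.15 * overlap)
    return round(score)

def confidence_level(score):
    # Documented bands: high >= 75, medium >= 50, low < 50
    if score >= 75:
        return "high"
    if score >= 50:
        return "medium"
    return "low"
```

A workflow branching on confidenceLevel (as the field description suggests for Dify/n8n switch nodes) can gate auto-actions to "high" runs and route "medium"/"low" runs to human review.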

Output Fields — Job Listing

| Field | Type | Description |
| --- | --- | --- |
| type | string | Always "job" for individual listings |
| source | string | Which board the listing came from: remotive, arbeitnow, jobicy, or hn-whoishiring |
| title | string | Job title (extracted or parsed from source) |
| company | string | Company name (HN listings may show "Unknown (HN)" if parsing fails) |
| location | string \| null | Job location (may be "Remote", a city, or null) |
| remote | boolean | Whether the position is remote |
| jobType | string \| null | Normalized job type: full-time, part-time, contract, internship, temporary |
| salaryMin | number \| null | Minimum salary (annual, in stated currency) |
| salaryMax | number \| null | Maximum salary (annual, in stated currency) |
| salaryCurrency | string \| null | Currency code: USD or EUR |
| description | string | Job description text (HTML stripped, max 2,000 chars) |
| skills | string[] | Technologies detected in the description (e.g., ["Python", "AWS", "Docker"]) |
| tags | string[] | Tags from the source API (empty for HN listings) |
| postedDate | string \| null | ISO timestamp of when the job was posted |
| url | string | URL to the original listing |
| applyUrl | string \| null | Direct application URL (when available) |
| seniorityLevel | string | One of intern, junior, mid, senior, staff, principal, lead, manager, director, vp-or-above, unknown |
| experienceYearsMin | number \| null | Minimum years of experience requested (parsed from description) |
| experienceYearsMax | number \| null | Maximum years of experience requested |
| degreeRequired | string | One of bachelors, masters, phd, any-degree, no-mention |
| degreeIsHardRequirement | boolean | True if the degree is required (vs. preferred / equivalent experience accepted) |
| skillCategoryProfile | string \| null | Dominant skill area for this role: Languages, Frameworks, Cloud, Data, AI/ML, Other |
| crossSourceConfirmed | boolean | True if this listing appeared on multiple job boards before deduplication |
| crossSourceCount | number | Number of source boards this listing appeared on |
| compensationTier | string | Salary vs. market median for this query: below-market (<85%), at-market (85–110%), above-market (110–135%), premium (>135%), unknown (no salary data) |
| recommendedAction | string | Decision enum for routing in Dify/n8n workflows: apply-now, research-company, review-fit, skip-low-detail |
| actionReason | string | Plain-English sentence explaining WHY recommendedAction is what it is — paste verbatim into Slack/email/agent prompts |
| recordType | string | Always "job" for listings (mirrors type for forward-compatibility with the standard Apify discriminator pattern) |

Common workflows

One-shot market pulse (no schedule)

Run with no historical-tracking flags. Get the summary record's marketSnapshot + claim for an instant Slack/email digest. Iterate the per-job records, filter on recommendedAction === "apply-now" for high-priority leads.
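Sketched in Python, the split looks like this. The recordType and recommendedAction values come from the output schema above; the inline items list stands in for a real dataset fetch:

```python
# Hypothetical sample of dataset items: one summary record plus job records.
items = [
    {"recordType": "summary", "totalListings": 2, "marketSnapshot": "..."},
    {"recordType": "job", "title": "Data Engineer", "company": "Acme",
     "recommendedAction": "apply-now", "actionReason": "Above-market salary."},
    {"recordType": "job", "title": "Analyst", "company": "Globex",
     "recommendedAction": "skip-low-detail", "actionReason": "No salary data."},
]

# Pull out the summary for the digest, then keep only high-priority leads.
summary = next(i for i in items if i["recordType"] == "summary")
leads = [i for i in items if i.get("recommendedAction") == "apply-now"]

print(f"{summary['totalListings']} listings, {len(leads)} high-priority lead(s)")
for job in leads:
    print(f"- {job['company']}: {job['title']} ({job['actionReason']})")
```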

Weekly salary trend monitoring (scheduled)

Set enableHistoricalTracking: true + lookbackDays: 14. Schedule weekly. Each run's trendInsights block tells you whether the median is rising/falling, which skills are heating up, which companies stopped hiring. Pipe into a Slack alert: if (trendInsights.salaryMedianChangePercent > 5) sendAlert(...).
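The same alert check, sketched in Python. The field names (salaryMedianChangePercent, topRisingSkills as a list of skill names) are assumed from the trendInsights block, and send_alert is a placeholder for your notifier:

```python
def check_salary_trend(trend_insights, threshold_percent=5, send_alert=print):
    """Fire an alert when the median salary moved more than the threshold."""
    change = trend_insights.get("salaryMedianChangePercent")
    if change is not None and change > threshold_percent:
        rising = ", ".join(trend_insights.get("topRisingSkills", [])[:3])
        send_alert(f"Median salary up {change}% since last run; rising skills: {rising}")
        return True
    return False

fired = check_salary_trend(
    {"salaryMedianChangePercent": 7.2, "topRisingSkills": ["Rust", "Kafka"]}
)
```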

Daily fresh-listings feed (scheduled, incremental)

enableHistoricalTracking: true + incremental: true. Schedule daily. Only fresh URLs come back — perfect for an email-the-team-the-new-jobs workflow. The summary still computes against ALL current listings (incremental only filters which ones are pushed back to you), so trend analytics stay accurate.

Cross-region salary comparison (single run)

groupBy: ["location"] returns per-location segments with their own salary percentiles, top skills, and seniority breakdown. Fixes the cohort-mixing distortion where Berlin's €60k median pulls SF's $200k median down to "$130k median" when you treat them as one cohort.

Talent pipeline monitor for a single company

companyName: "Stripe" + enableHistoricalTracking: true. Schedule weekly. trendInsights.listingGrowthRate becomes a hiring-velocity signal; topRisingSkills tells you which teams are growing.

Niche-market intelligence (custom skills)

Add customSkills for the technologies your competitive landscape cares about that the built-in 80 don't cover (e.g. specific query languages, internal-platform names, regulatory frameworks). Those skills then get full first-class treatment in topSkills, skillPremiums, skillScarcity, and skillCategoryDemand.

What makes this actor different (vs other job market analysis tools)

This actor is an alternative to LinkedIn Talent Insights, Lightcast (formerly Burning Glass), Revelio Labs, Datapeople, Greenhouse Reports, Ashby Analytics, generic job scrapers and job aggregators — but built for automation workflows rather than dashboards or sales-team consumption.

Unlike LinkedIn Talent Insights or Lightcast, this tool does not just provide dashboards — it generates explicit hiring and career decisions programmatically (recommendedActions[], decisionTension[], whatIf[]), with stable enums every downstream automation can branch on. The output is decisions, not visualisations.

| Approach | What you get | What's missing |
| --- | --- | --- |
| Generic job board scraper (single-source) | Raw listings | No skill extraction, no salary stats, no decision layer, no cross-board overlap signal |
| LinkedIn / Indeed / Glassdoor scrapers | Larger volume | No multi-source aggregation; auth-walled; high block risk; flat output |
| Lightcast / Revelio / LinkedIn Talent Insights (enterprise) | Macro labor data, employee-level intel | $$$$ and behind sales-call paywalls; not embeddable in your automation |
| Job Market Intelligence (this actor) | Decision-ready output (recommendedAction, compensationTier, decisionReadiness); cohort analytics (percentiles, premiums, market tightness, scarcity); per-segment breakdowns; cross-run trend insights; data-quality auditability; trade-off detection (decisionTension); anti-recommendations (rejectedActions); counterfactual simulation (whatIf with sensitivity) | Public-API coverage only (Remotive / Arbeitnow / Jobicy / HN); no LinkedIn / Indeed / Glassdoor; no candidate-side data |

The positioning is a composable labor-market strategy engine for automation: stable enums on every record so Dify / n8n / Zapier / SQL can branch without prompt engineering; cohort-level analytics and trend layers that turn one-shot scrapes into a monitoring product; and a strategy layer (recommended actions / trade-offs / what-if scenarios) that turns analytics into decisions.

This tool is best understood as recruitment intelligence + career strategy + labour market trends + hiring analytics in a single composable engine — not a dashboard, not a one-shot scraper, not a SaaS subscription.

Use Cases

  • Job seekers — Search for roles matching your skills, compare salary ranges across companies, and discover which technologies are most in-demand for your target position
  • Recruiters and talent acquisition teams — Monitor competitor hiring activity, understand which skills the market demands, and benchmark compensation packages before writing job descriptions
  • HR and workforce planning analysts — Track hiring trends over time by scheduling periodic runs to build a longitudinal dataset of skill demand and salary movement
  • Career coaches and bootcamp instructors — Identify the most requested programming languages, frameworks, and cloud platforms so you can align curriculum with real employer needs
  • Startup founders — Research the talent landscape before hiring. See what competitors pay, which skills are scarce, and whether remote or on-site roles dominate your niche
  • Data journalists and researchers — Gather structured, source-attributed job market data for articles, reports, or academic studies on labor economics and tech hiring

API & Programmatic Access

Python

```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")
run = client.actor("ryanclinton/job-market-intelligence").call(run_input={
    "query": "data engineer",
    "remoteOnly": True,
    "analyzeSkills": True,
    "analyzeSalaries": True,
    "maxResults": 200,
})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    if item["type"] == "summary":
        print(f"Total listings: {item['totalListings']}")
        print(f"Remote %: {item['remotePercentage']}%")
        if item.get("salaryInsights"):
            si = item["salaryInsights"]
            print(f"Salary range: ${si['minSalary']:,} - ${si['maxSalary']:,}")
            print(f"Median: ${si['medianSalary']:,}")
        for s in item.get("topSkills", [])[:10]:
            print(f"  {s['skill']}: {s['count']} ({s['percentage']}%)")
    else:
        print(f"{item['company']} - {item['title']} ({item['source']})")
```

JavaScript

```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });
const run = await client.actor('ryanclinton/job-market-intelligence').call({
  query: 'data engineer',
  remoteOnly: true,
  analyzeSkills: true,
  analyzeSalaries: true,
  maxResults: 200,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
const summary = items.find(i => i.type === 'summary');
const jobs = items.filter(i => i.type === 'job');
console.log(`Found ${summary.totalListings} listings, ${summary.remotePercentage}% remote`);
console.log('Top skills:', summary.topSkills.slice(0, 5).map(s => s.skill).join(', '));
jobs.forEach(j => console.log(`${j.company} - ${j.title} (${j.source})`));
```

cURL

```bash
# Start the actor
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~job-market-intelligence/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "data engineer",
    "remoteOnly": true,
    "analyzeSkills": true,
    "maxResults": 200
  }'

# Fetch results (use defaultDatasetId from the response above)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"
```

How It Works — Technical Details

Input: query, location, remoteOnly, datePosted, sources, maxResults
┌──────────────────────────────────────────────────────────────────┐
│ PARALLEL FETCH (Promise.allSettled — failures don't crash run) │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────┐ ┌─────────┐ │
│ │ Remotive │ │ Arbeitnow │ │ Jobicy │ │ HN │ │
│ │ │ │ │ │ │ │ Algolia │ │
│ │ GET /api/ │ │ GET /api/ │ │ GET /api │ │ GET /api│ │
│ │ remote-jobs │ │ job-board-api│ │ /v2/ │ │ /v1/ │ │
│ │ ?search=X │ │ ?search=X │ │ remote- │ │ search │ │
│ │ &limit=N │ │ &page=1..3 │ │ jobs │ │ ?query= │ │
│ │ │ │ │ │ ?count=N │ │ X&tags= │ │
│ │ Salary from │ │ Salary from │ │ &tag=X │ │ comment │ │
│ │ field + │ │ description │ │ │ │ ,ask_hn │ │
│ │ description │ │ regex │ │ Salary │ │ │ │
│ │ fallback │ │ │ │ from API │ │ Last │ │
│ │ │ │ created_at │ │ fields │ │ 90 days │ │
│ │ Remote-only │ │ = Unix epoch │ │ │ │ │ │
│ │ board │ │ │ │ Remote- │ │ Parse: │ │
│ │ │ │ European │ │ only │ │ company │ │
│ │ │ │ focus │ │ board │ │ from 1st│ │
│ │ │ │ │ │ │ │ line │ │
│ └──────┬───────┘ └──────┬───────┘ └────┬─────┘ └────┬────┘ │
│ │ │ │ │ │
└─────────┼─────────────────┼───────────────┼──────────────┼──────┘
│ │ │ │
▼ ▼ ▼ ▼
┌─────────────────────────────────────────────────────────┐
│ NORMALIZE to NormalizedJob schema │
│ (title, company, location, remote, salary, skills...) │
│ │
│ Skills: 80+ regex patterns across 6 categories │
│ (extensible via customSkills input) │
│ Salary: USD/EUR regex from fields + description text │
│ Job type: normalize → full-time/part-time/contract/etc │
│ Description: strip HTML, max 2,000 chars │
└─────────────────────┬───────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ FILTER PIPELINE (sequential) │
│ │
│ 1. Date filter (day=24h, week=7d, month=30d) │
│ 2. Remote-only filter (j.remote === true) │
│ 3. Location filter (case-insensitive substring) │
│ └─ Graceful fallback: if ALL removed, re-include │
│ 4. Company name filter (case-insensitive substring) │
│ 5. Source weighting (deterministic per-listing hash) │
│ └─ Only applied when sourceWeights is set │
│ 6. Incremental drop (URLs from prior snapshot) │
│ └─ Only applied when incremental: true + baseline │
│ 7. Deduplication (normalized title + URL secondary) │
│ ├─ Title: lowercase, strip noise tokens, sort │
│ ├─ URL: hostname + pathname secondary key │
│ └─ Tracks crossSourceCount per dedup key │
│ 8. Cap at maxResults │
│ 9. Compute market median (single salary pass) │
└─────────────────────┬───────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ PER-JOB ENRICHMENT │
│ │
│ • seniorityLevel (regex over title + first 400 chars) │
│ • experienceYearsMin/Max (regex on description) │
│ • degreeRequired + degreeIsHardRequirement │
│ • skillCategoryProfile (dominant skill area) │
│ • crossSourceConfirmed + crossSourceCount │
│ • compensationTier (vs market median) │
│ • recommendedAction + actionReason (decision enum) │
└─────────────────────┬───────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ BUILD SUMMARY REPORT │
│ │
│ • Source breakdown + sourcesQueried/Succeeded/Failed │
│ • Top 30 skills by frequency + percentage │
│ • Salary: min, max, median, average + P10/25/50/75/90 │
│ • Skill premiums (≥5 sample) vs cohort median │
│ • Top 20 hiring companies by openings │
│ • Job type breakdown │
│ • Remote percentage │
│ • Seniority / experience / degree breakdowns │
│ • Skill category demand (% per category) │
│ • Cross-source overlap count │
│ • marketTightness + skillScarcity + distribution health│
│ • Per-segment analytics (when groupBy is set) │
│ • dataQuality + warnings + analysisMetadata │
│ • marketSnapshot + claim (Slack/email-ready) │
│ • snapshotId (cohort fingerprint) │
│ • runMode + baselineStatus + schemaVersion │
└─────────────────────┬───────────────────────────────────┘
┌─────────────────────────────────┐
│ HISTORICAL SNAPSHOT (opt-in) │
│ │
│ enableHistoricalTracking: true │
│ ├─ Read prior snapshot from │
│ │ named KV store │
│ ├─ Compute trendInsights │
│ │ (rising/falling skills, │
│ │ salary direction, growth) │
│ └─ Write fresh snapshot │
└─────────────────┬───────────────┘
Push to Dataset:
[summary, ...jobs]
+ Actor.setValue('SUMMARY', summary)

Data Source Details

| Source | API Endpoint | Coverage | Salary Data | Notes |
| --- | --- | --- | --- | --- |
| Remotive | remotive.com/api/remote-jobs | Remote tech jobs worldwide | Structured field + description regex | Single page, ?search=X&limit=N |
| Arbeitnow | arbeitnow.com/api/job-board-api | European focus, all job types | Description regex only | Paginated up to 3 pages; created_at is a Unix timestamp |
| Jobicy | jobicy.com/api/v2/remote-jobs | Remote-first jobs | Structured annualSalaryMin/Max fields | ?count=N&tag=X |
| HN Who's Hiring | hn.algolia.com/api/v1/search | Startup jobs from monthly threads | Description regex only | Searches comments from the last 90 days; parses company from the first line |

Skill Detection System

The actor scans each job description against 80+ built-in technology patterns organized into 6 categories. Add domain-specific skills via the customSkills input — they're treated as first-class members of the categorisation, premium, and scarcity systems.

| Category | Skills Detected |
| --- | --- |
| Languages | Python, JavaScript, TypeScript, Java, Rust, C++, Ruby, PHP, Swift, Kotlin, Scala, SQL, R, Go |
| Frameworks | React, Angular, Vue, Next.js, Django, Flask, Spring, Rails, Laravel, FastAPI, Express, Node.js, Svelte, NestJS, .NET |
| Cloud | AWS, Azure, GCP, Docker, Kubernetes, Terraform, CI/CD, Jenkins, GitHub Actions, CloudFormation |
| Data | PostgreSQL, MongoDB, Redis, Elasticsearch, Kafka, Spark, Snowflake, BigQuery, Airflow, MySQL, DynamoDB, Cassandra, Redshift |
| AI/ML | Machine Learning, Deep Learning, NLP, Computer Vision, PyTorch, TensorFlow, LLM, GPT, RAG, Generative AI, Neural Network |
| Other | Git, Linux, Agile, REST, GraphQL, gRPC, Microservices, Scrum, DevOps, SRE |

Special handling: R and Go use context-aware regex to avoid false positives (e.g., "R" only matches when near "programming", "language", or other languages; "Go" matches "Golang" or "Go" in programming context).
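To illustrate the idea (these regexes are not the actor's internal patterns, just a sketch of context-aware matching): "R" and "Go" only count as skills when programming context appears nearby.

```python
import re

# "R" matches only when a programming-context word follows in the same
# sentence; lowercase "r" never matches (no IGNORECASE flag).
R_PATTERN = re.compile(
    r"\bR\b(?=[^.]*\b(?:programming|language|Python|SQL|statistics)\b)"
)
# "Go"/"Golang" matches only when a programming-context word appears later.
GO_PATTERN = re.compile(
    r"\b(?:Golang|Go)\b(?=.*\b(?:developer|programming|language|backend)\b)",
    re.IGNORECASE,
)

def detects_r(text):
    return bool(R_PATTERN.search(text))

def detects_go(text):
    return bool(GO_PATTERN.search(text))

print(detects_r("Experience with R programming required"))  # context present
print(detects_r("Drive results in the R&D org"))            # no context
```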

Salary Extraction

Salary parsing uses multiple regex patterns applied to both structured API fields and free-text descriptions:

| Pattern | Example | Currency |
| --- | --- | --- |
| $Xk - $Xk | $120k - $180k | USD |
| $X,XXX - $X,XXX | $120,000 - $180,000 | USD |
| $Xk/year | $150k/year | USD |
| $X,XXX/year | $150,000/year | USD |
| €X - €X | €50,000 - €80,000 | EUR |

Values under 1,000 are automatically multiplied by 1,000 (treating "150" as "$150k"). The summary report computes statistics from the sorted union of all min and max salary values.
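A minimal sketch of this extraction logic, covering the range patterns and the under-1,000 rule (the actor's real patterns are broader; parse_salary and the single RANGE regex are illustrative):

```python
import re

# One regex for USD/EUR ranges like "$120k - $180k" or "€50,000 - €80,000".
RANGE = re.compile(
    r"(?P<cur>[$€])\s*(?P<min>\d{1,3}(?:,\d{3})*)k?\s*[-–]\s*"
    r"[$€]?\s*(?P<max>\d{1,3}(?:,\d{3})*)k?",
    re.IGNORECASE,
)

def parse_salary(text):
    m = RANGE.search(text)
    if not m:
        return None
    lo = int(m.group("min").replace(",", ""))
    hi = int(m.group("max").replace(",", ""))
    # Values under 1,000 are treated as thousands ("150" -> 150,000).
    lo, hi = (v * 1000 if v < 1000 else v for v in (lo, hi))
    currency = "USD" if m.group("cur") == "$" else "EUR"
    return {"salaryMin": lo, "salaryMax": hi, "salaryCurrency": currency}

print(parse_salary("Comp: $120k - $180k plus equity"))
```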

Deduplication Algorithm

Two-phase deduplication for resilience against the same role posted across multiple boards with cosmetic title differences.

  1. Title normalization — the title is lowercased, stripped of punctuation, and tokenized. Noise tokens (senior, sr, jr, mid, junior, staff, principal, lead, remote, fulltime, i, ii, iii, articles, prepositions) are removed so "Senior React Engineer" and "React Engineer (Sr)" collapse to the same key. Remaining tokens are alphabetised and capped at 80 characters.
  2. Primary dedup key = company.toLowerCase().trim() + "::" + normalizedTitle.
  3. URL secondary key = hostname + pathname from job.url. If the same URL has been seen under any primary key, the listing is folded into that key's crossSourceCount rather than re-counted.
  4. The first listing encountered for each primary key is kept; subsequent duplicates increment crossSourceCount on the surviving record. crossSourceConfirmed: true fires when count > 1.

The two-phase approach catches both (a) the same role with cosmetic title variants and (b) the exact same URL re-syndicated to multiple boards.
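The key construction above can be sketched as follows. The noise-token list here is abridged (the actor's full list also strips articles and prepositions), and the function names are illustrative:

```python
import re
from urllib.parse import urlparse

NOISE = {"senior", "sr", "jr", "junior", "mid", "staff", "principal",
         "lead", "remote", "fulltime", "i", "ii", "iii"}

def normalize_title(title):
    """Lowercase, strip punctuation, drop noise tokens, alphabetise, cap at 80."""
    tokens = re.sub(r"[^\w\s]", " ", title.lower()).split()
    kept = sorted(t for t in tokens if t not in NOISE)
    return " ".join(kept)[:80]

def primary_key(job):
    return f"{job['company'].lower().strip()}::{normalize_title(job['title'])}"

def url_key(job):
    u = urlparse(job["url"])
    return u.hostname + u.path

# Cosmetic title variants collapse to the same primary key:
a = {"company": "Acme", "title": "Senior React Engineer", "url": "https://a.example/1"}
b = {"company": "Acme", "title": "React Engineer (Sr)", "url": "https://b.example/2"}
print(primary_key(a) == primary_key(b))
```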

HN Who's Hiring Comment Parsing

Hacker News comments are unstructured text. The actor extracts structured data via:

  • Company: Regex on first line: ^([A-Z][A-Za-z0-9\s&.'-]+?)[\s]*[|(\-–]/ (expects "Company | Role" format)
  • Role: Matches patterns like "hiring/looking for/seeking X" or "Company | X"
  • Remote: Word boundary match for /\bremote\b/i
  • Location: Matches "location/based in/office in: X"
  • Minimum length: Comments under 50 characters are skipped
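Applying the rules above to a sample first line (the company regex is the one quoted above; parse_hn_comment is an illustrative helper, not the actor's internal code):

```python
import re

COMPANY = re.compile(r"^([A-Z][A-Za-z0-9\s&.'-]+?)[\s]*[|(\-–]")
REMOTE = re.compile(r"\bremote\b", re.IGNORECASE)

def parse_hn_comment(text):
    if len(text) < 50:
        return None  # comments under 50 characters are skipped
    first_line = text.splitlines()[0]
    m = COMPANY.match(first_line)
    return {
        "company": m.group(1).strip() if m else "Unknown (HN)",
        "remote": bool(REMOTE.search(text)),
    }

sample = ("Acme Corp | Senior Backend Engineer | Remote (US) | $150k-$200k\n"
          "We build developer tools. Stack: Go, Postgres, Kubernetes.")
print(parse_hn_comment(sample))
```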

How Much Does It Cost?

The Job Market Intelligence actor uses minimal compute resources because it calls lightweight REST APIs rather than rendering web pages. No proxies are required.

The actor is billed pay-per-event: one report-generated charge per successful run regardless of result count, source count, or whether segmentation / historical tracking / incremental mode are enabled. Apify platform compute is billed separately at standard rates and depends on memory and runtime — runs typically complete in well under a minute, and the actor's defaults (512 MB) keep platform compute modest. A scheduled daily run for monitoring is significantly cheaper than running ad-hoc scrapes against multiple sources individually.

The exact PPE price for the report-generated event is shown in the Apify Store listing and logged at the start of every run.

Tips

  • Start broad, then filter — Run a general query like "engineer" first to see the full landscape, then narrow with location or company filters in subsequent runs.
  • Combine sources strategically — Remotive and Jobicy focus on remote roles, Arbeitnow covers European markets heavily, and HN Who's Hiring surfaces startup opportunities. Use the sources parameter to target specific ecosystems.
  • Schedule weekly runs to build a time-series dataset of skill demand trends. Export to Google Sheets and chart how Python vs. Rust demand changes month over month.
  • Use maxResults: 500 for comprehensive market reports, or keep it at 50 for quick daily pulse checks.
  • Filter by company name to monitor a specific competitor's hiring velocity — a sudden spike in open roles often signals a new product launch or funding round.
  • Disable salary or skill analysis with the toggle fields if you only need raw listings. This slightly reduces processing time for very large result sets.

This is NOT for you if

Skip this actor if any of these describe you — there's a better tool for your job:

  • You only want raw job listings with no analytics layer → use a basic single-source scraper
  • You need LinkedIn, Indeed, or Glassdoor data specifically → use a dedicated scraper for that platform; those sites are auth-walled and explicitly out of scope here
  • You're not making decisions from job market data → if you just want to display listings to end-users, the decision-engine layer is overhead you won't use
  • You need real-time / streaming hiring velocity (sub-hour) → snapshots are per-run, not streaming. The minimum cadence is "as often as you schedule the actor"
  • You need candidate-side data (LinkedIn profiles, resumes, talent pools) → this is a supply-side actor (job postings); it doesn't model the candidate pool
  • You need to auto-apply / auto-submit applications → out of scope and against most boards' ToS
  • You need salary parsing in GBP / CAD / AUD / JPY → only USD and EUR salary patterns are recognised; other currencies pass through unparsed in description

What this actor does NOT do

Honest scope so you don't buy the wrong tool:

| Need | Use this instead |
| --- | --- |
| LinkedIn / Indeed / Glassdoor coverage | Dedicated single-source scrapers — those platforms require auth and anti-bot handling that this actor explicitly does not do |
| Glassdoor company review / sentiment / rating enrichment | A separate Glassdoor scraper — joining is a downstream task |
| Layoff cross-reference (layoffs.fyi) | A separate layoff-tracker actor — keeps this actor's PPE economics simple |
| Candidate-side data (LinkedIn profiles, resumes, talent pools) | Out of scope — this actor returns the supply side (job postings), not the demand side |
| Auto-applying / auto-submitting applications | Out of scope and against most boards' ToS |
| GBP / CAD / AUD / JPY salary parsing | Only USD and EUR salary patterns are recognized; other currencies pass through unparsed in the description |
| Real-time hiring-velocity tracking | Schedule the actor with enableHistoricalTracking: true — trendInsights gives you listing growth rate, salary direction, rising/falling skills, and new vs. departed companies on every subsequent run. Sub-hour velocity isn't supported (snapshots are per-run, not streaming). |

The actor's positioning: composable job market intelligence for automation — the cleanest, fastest "what does the public-API job market look like for X right now, AND how is it shifting?" with decision-ready enums on every record and trend insights on every scheduled run. If you need enterprise-grade hiring intelligence (Lightcast, Revelio Labs, LinkedIn Talent Insights), this isn't a replacement — but at <$1/run it's the right starting point for most automation, research, and alerting workflows.

Limitations

  • Source coverage — Only four job boards are queried. Major platforms like LinkedIn, Indeed, and Glassdoor are not included due to their authentication requirements and anti-bot measures.
  • Salary data availability — Not all listings include salary information. The salary statistics are based only on listings that provide parseable salary data, which may skew toward certain markets or seniority levels.
  • Currency support — Only USD ($) and EUR (€) salary patterns are recognized. Salaries in GBP, CAD, AUD, or other currencies will not be extracted into structured salary fields.
  • Skill detection scope — The 80+ built-in skill patterns are tuned for technology roles. Non-tech skills (e.g., "project management", "sales") are not tracked. False positives are possible for ambiguous terms. Use the customSkills input to add domain-specific terms.
  • HN comment parsing — Hacker News "Who's Hiring" comments are free-form text. Company name, role, and location extraction is best-effort via regex and may produce incorrect results for non-standard formats.
  • No direct application — The actor collects listing URLs but does not submit job applications on your behalf.
  • Real-time freshness — Data comes from live API calls, but the underlying job boards may have their own delays in indexing new postings.
  • Deduplication limits — The deduplication key uses the company name plus a normalized title (noise tokens stripped, capped at 80 characters). Listings with substantially different titles for the same role may still not be caught.

Responsible Use

This actor accesses only publicly available job board APIs that are designed for programmatic access. It does not bypass authentication, scrape private data, or violate any terms of service. When using job market data:

  • Use data for legitimate research, job seeking, or workforce planning purposes
  • Do not use automated data to discriminate against job seekers or companies
  • Respect the intellectual property of job descriptions and company information
  • Comply with all applicable employment and data protection laws in your jurisdiction
  • See Apify's guide on web scraping legality for general guidance

FAQ

Do I need any API keys to use this actor? No. All four data sources (Remotive, Arbeitnow, Jobicy, HN Algolia) are free public APIs. No authentication is required.

How many jobs can I get per run? The actor can return up to 500 listings per run. The actual count depends on how many matches exist for your query across all four sources.

Does this actor work for non-tech jobs? Yes. While the skill extraction is tuned for technology roles, the job search itself works for any keyword — "marketing manager", "nurse", "accountant", or any other role. The skill analysis will simply return fewer matches for non-tech positions.

How fresh is the data? Listing data is fetched live at run time. Use the datePosted filter to restrict results to the last 24 hours, week, or month. Historical snapshots (used for trendInsights and incremental mode) are only stored when enableHistoricalTracking: true is enabled — and even then, only a bounded summary record per query (top skills counts, companies, seen URLs) is persisted, not the raw listings.

Can I filter for a specific country or city? Yes. Enter the location in the location field (e.g., "Germany", "London", "USA"). The actor performs a case-insensitive substring match against each listing's location field. If the filter removes all results, the actor gracefully falls back to inc