# Job Posting Drift Intelligence Actor (`muhammad-bilal/job-posting-drift-intelligence-actor`)

Job Posting Drift Intelligence monitors job listings over time and detects meaningful changes like salary updates, remote/onsite shifts, seniority inflation, and requirement changes. Turn static job posts into actionable job lifecycle intelligence.

- **URL**: https://apify.com/muhammad-bilal/job-posting-drift-intelligence-actor.md
- **Developed by:** [Muhammad Bilal](https://apify.com/muhammad-bilal) (community)
- **Categories:** Jobs, AI, Automation
- **Stats:** 3 total users, 0 monthly users, 100.0% runs succeeded, 1 bookmark
- **User rating**: 5.00 out of 5 stars

## Pricing

from $6.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## 🕵️ Job Posting Drift Intelligence Actor

Competition-grade Job Intelligence system for detecting and analyzing meaningful changes in job postings over time.

Built with Apify SDK, CheerioCrawler, and TypeScript.

### 🎯 Overview
Job Posting Drift Intelligence Actor is a production-grade Apify Actor that monitors job postings, captures structured snapshots, and intelligently detects meaningful changes in job listings such as salary, remote status, seniority, and requirements. Built with enterprise security, scalability, and extensibility in mind.

### Key Capabilities
- ✅ **Structured Change Detection** - Salary, remote status, seniority, and requirements tracking
- ✅ **Intelligent Normalization** - Converts raw HTML to structured job data
- ✅ **Semantic Diff Engine** - Detects meaningful job posting changes
- ✅ **Optional AI Analysis** - LLM-powered change explanation (OpenAI-compatible)
- ✅ **Persistent Snapshots** - SHA-256 hashed storage with Apify Key-Value Store
- ✅ **Backward Compatible** - Works as a simple scraper or an advanced intelligence system
- ✅ **Cloud-Safe** - No hardcoded secrets, graceful failures, input validation

### 🚨 Why Job Posting Drift Intelligence Actor?
Job postings change silently — salary adjustments, remote policy updates, seniority requirements, or benefit changes often go unnoticed until they impact recruitment, compensation strategies, or competitive positioning.

Job Posting Drift Intelligence Actor automatically monitors job listings and detects:

📄 **Content changes** (description updates, requirement modifications)

💰 **Compensation changes** (salary range adjustments, benefit updates)

🏢 **Work arrangement changes** (remote/on-site policy shifts)

👥 **Role evolution** (seniority level changes, title modifications)

You get actionable job market intelligence, not raw HTML diffs.

### 🎯 Who is this for?
**Recruitment teams** monitoring competitor salary ranges and requirements

**HR departments** tracking job posting compliance and consistency

**Compensation analysts** monitoring market salary trends

**Talent acquisition teams** detecting hiring urgency signals

**Job platforms** providing drift intelligence to job seekers and employers

**Market researchers** analyzing job market dynamics and trends

### ⚙️ How it works (4 steps)
1. Provide one or more job posting URLs to monitor.
2. Configure AI analysis and sensitivity settings.
3. Run the Actor → receive structured job drift results.
4. Each result includes change detection, AI insights, and metadata.

### 💰 Pricing example (transparent)
Monitoring 100 job postings ≈ $0.15

Detecting changes across 100 postings ≈ $0.45

AI analysis for 50 changed postings ≈ $0.30

No monthly fees — pay only for what you use
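To make the arithmetic in this example explicit, here is a minimal sketch that derives per-unit rates from the three figures above and totals a run. The rates are inferred from this example only and are not an official price list; `estimate_cost` and the `RATE_*` constants are hypothetical names. Actual charges depend on the event prices shown on the Actor's Store page.

```python
# Per-unit rates inferred from the example figures above
# (assumption: they scale linearly; this is not an official price list).
RATE_MONITOR = 0.15 / 100   # USD per monitored posting
RATE_DETECT = 0.45 / 100    # USD per posting checked for changes
RATE_AI = 0.30 / 50         # USD per changed posting analyzed by AI


def estimate_cost(monitored: int, checked: int, ai_analyzed: int) -> float:
    """Return an approximate cost in USD, rounded to cents."""
    total = (
        monitored * RATE_MONITOR
        + checked * RATE_DETECT
        + ai_analyzed * RATE_AI
    )
    return round(total, 2)


# The scenario from the example: 100 monitored, 100 checked, 50 AI analyses.
print(estimate_cost(100, 100, 50))
```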

### 🚀 Quick Start

#### Local Development
```bash
# Install dependencies
npm install

# Build TypeScript
npm run build

# Run Actor locally (preserves snapshots between runs)
npm start

# Or use Apify CLI (clears storage each run)
apify run
````

#### Apify Platform

```bash
# Log in to the Apify platform
apify login

# Push to Apify cloud
apify push
```

### Input Configuration

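Provide the input as JSON, either through the Apify Console input form or, for local runs, in `storage/key_value_stores/default/INPUT.json` (the `.actor/input_schema.json` file defines the schema, not the values):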

```json
{
  "jobUrls": "https://example.com/job1\nhttps://example.com/job2",
  "enableSemanticAnalysis": true,
  "runLabel": "weekly-monitoring"
}
```
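Because `jobUrls` is a single string rather than an array, callers that hold a list of URLs need to join it first. A minimal sketch, assuming you build the input in Python before passing it to the client (`build_input` is a hypothetical helper, not part of the Actor):

```python
import json
from typing import List, Optional


def build_input(urls: List[str], enable_ai: bool = False,
                run_label: Optional[str] = None) -> dict:
    """Assemble the Actor input; jobUrls is one newline-separated string."""
    actor_input = {
        "jobUrls": "\n".join(u.strip() for u in urls if u.strip()),
        "enableSemanticAnalysis": enable_ai,
    }
    # runLabel is optional, so only include it when provided.
    if run_label:
        actor_input["runLabel"] = run_label
    return actor_input


example = build_input(
    ["https://example.com/job1", "https://example.com/job2"],
    enable_ai=True,
    run_label="weekly-monitoring",
)
print(json.dumps(example, indent=2))
```

The resulting dictionary can be passed as `run_input` to the Python client shown in the API section.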

### 📊 Output Format

Each monitored job posting produces structured JSON:

```json
{
  "url": "https://example.com/job1",
  "title": "Senior Software Engineer",
  "company": "Tech Corp",
  "changed": true,
  "salaryChanged": true,
  "remoteStatusChanged": false,
  "seniorityChanged": true,
  "requirementsChanged": true,
  "aiSummary": "Job title upgraded from Software Engineer to Senior Software Engineer with salary increase from $100k-120k to $120k-150k",
  "aiSeverity": "high",
  "aiCategory": "salary",
  "checkedAt": "2025-12-20T14:00:00.000Z"
}
```

#### Field Descriptions

| Field | Type | Description |
|-------|------|-------------|
| `url` | string | Job posting URL |
| `title` | string or null | Job title |
| `company` | string or null | Company name |
| `changed` | boolean | True if any meaningful changes detected |
| `salaryChanged` | boolean | True if salary range changed |
| `remoteStatusChanged` | boolean | True if remote status changed |
| `seniorityChanged` | boolean | True if seniority level changed |
| `requirementsChanged` | boolean | True if requirements changed |
| `aiSummary` | string or null | AI-generated change summary |
| `aiSeverity` | string or null | Change severity (low/medium/high) |
| `aiCategory` | string or null | Primary change category |
| `checkedAt` | string | ISO timestamp of check |
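For consumers of the dataset, the table above can be mirrored as a typed record. The sketch below is one possible client-side shape, not code from the Actor; `DriftRecord`, `changed_flags`, and `needs_review` are hypothetical names, and the sample values come from the output example above.

```python
from typing import List, Optional, TypedDict


class DriftRecord(TypedDict):
    """Shape of one dataset item, following the field table above."""
    url: str
    title: Optional[str]
    company: Optional[str]
    changed: bool
    salaryChanged: bool
    remoteStatusChanged: bool
    seniorityChanged: bool
    requirementsChanged: bool
    aiSummary: Optional[str]
    aiSeverity: Optional[str]
    aiCategory: Optional[str]
    checkedAt: str


FLAG_FIELDS = ("salaryChanged", "remoteStatusChanged",
               "seniorityChanged", "requirementsChanged")


def changed_flags(record: DriftRecord) -> List[str]:
    """Return the names of the specific change flags that are true."""
    return [name for name in FLAG_FIELDS if record.get(name)]


def needs_review(records: List[DriftRecord]) -> List[DriftRecord]:
    """Keep only changed postings; AI fields may be null, so they are not required."""
    return [r for r in records if r.get("changed")]


sample: DriftRecord = {
    "url": "https://example.com/job1", "title": "Senior Software Engineer",
    "company": "Tech Corp", "changed": True, "salaryChanged": True,
    "remoteStatusChanged": False, "seniorityChanged": True,
    "requirementsChanged": True, "aiSummary": None, "aiSeverity": "high",
    "aiCategory": "salary", "checkedAt": "2025-12-20T14:00:00.000Z",
}
print(changed_flags(sample))
```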

### ⚙️ Configuration Options

#### jobUrls (required)

Job posting URLs to monitor. One URL per line or comma-separated.

#### enableSemanticAnalysis (default: false)

Enable AI-powered change explanation using OpenAI. Requires `OPENAI_API_KEY` environment variable.

#### runLabel (optional)

Label for grouping related runs in the dataset.

### 🔒 Security & Best Practices

#### API Keys

Never hardcode API keys. Use environment variables:

```bash
# Local development
export OPENAI_API_KEY="sk-..."

# Apify platform
# Set in Actor → Settings → Environment Variables
```
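As an illustration of reading the key from the environment instead of hardcoding it, the sketch below returns `None` and logs a warning when the key is missing, so a caller can fall back to null AI fields. It is not the Actor's own code; `resolve_ai_key` and `empty_ai_fields` are hypothetical helpers.

```python
import logging
import os
from typing import Mapping, Optional

logger = logging.getLogger("drift-intelligence")


def resolve_ai_key(enable_semantic_analysis: bool,
                   env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the OpenAI key, or None when AI is disabled or the key is missing."""
    if not enable_semantic_analysis:
        return None
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        # Degrade gracefully: warn and continue without AI analysis.
        logger.warning("OPENAI_API_KEY is not set; AI fields will be null.")
        return None
    return key


def empty_ai_fields() -> dict:
    """AI fields to emit when no analysis was performed."""
    return {"aiSummary": None, "aiSeverity": None, "aiCategory": None}
```

Passing `env` explicitly keeps the function easy to test without touching the real environment.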

#### Input Validation

All inputs are validated:

- URLs are normalized and validated
- AI analysis gracefully fails without API key
- Missing fields have safe defaults

#### Graceful Failures

- Missing API keys → Warning + null AI results
- Malformed HTML → Logged + continues processing
- Network errors → Retry mechanism with fallbacks
- Invalid URLs → Skipped with error logging
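The URL handling described above (newline- or comma-separated input, normalization, skipping invalid entries) could look roughly like the following. This is a sketch under assumptions: the Actor's actual normalization rules are not documented here, and `parse_job_urls` is a hypothetical name. Note that splitting on commas would also break URLs that contain a literal comma.

```python
import re
from typing import List, Tuple
from urllib.parse import urlsplit, urlunsplit


def parse_job_urls(raw: str) -> Tuple[List[str], List[str]]:
    """Split on newlines/commas, normalize, dedupe; return (valid, rejected)."""
    valid: List[str] = []
    rejected: List[str] = []
    seen = set()
    for candidate in re.split(r"[\n,]+", raw):
        candidate = candidate.strip()
        if not candidate:
            continue
        parts = urlsplit(candidate)
        # Only absolute http(s) URLs with a host are accepted.
        if parts.scheme.lower() not in ("http", "https") or not parts.netloc:
            rejected.append(candidate)
            continue
        # Assumed normalization: lowercase scheme and host, drop the fragment.
        normalized = urlunsplit(
            (parts.scheme.lower(), parts.netloc.lower(), parts.path, parts.query, "")
        )
        if normalized not in seen:
            seen.add(normalized)
            valid.append(normalized)
    return valid, rejected


valid, rejected = parse_job_urls(
    "https://example.com/job1\nhttps://example.com/job2, not-a-url"
)
print(valid, rejected)
```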

### 🏗️ Architecture

#### Core Components

```
src/
├── main.ts              # Orchestrates the entire workflow
├── crawler/             # HTML fetching and extraction
├── normalizer/          # Raw text → structured job data
├── diff/                # Change detection logic
├── intelligence/        # AI analysis integration
└── storage/             # Snapshot persistence
```

#### Storage Strategy

**Key-Value Store** (`job-snapshots`)

- Keys: `job-snapshot-{sha256hash}`
- Stores complete job snapshots
- Persistent across runs

**Dataset** (`default`)

- One record per job URL per run
- Structured JSON with change detection
- Queryable and exportable
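The `job-snapshot-{sha256hash}` key format can be illustrated with a short sketch. What exactly gets hashed is not stated above; the example assumes the key is derived from the job URL, and `snapshot_key` is a hypothetical helper.

```python
import hashlib


def snapshot_key(job_url: str) -> str:
    """Build a Key-Value Store key of the form job-snapshot-{sha256hash}.

    Assumption: the hash input is the trimmed job URL.
    """
    digest = hashlib.sha256(job_url.strip().encode("utf-8")).hexdigest()
    return f"job-snapshot-{digest}"


key = snapshot_key("https://example.com/job1")
print(key)
```

Because the digest is deterministic, the same URL maps to the same key on every run, which is what allows a later run to find the previous snapshot.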

### 🧪 Testing & Verification

#### Test Change Detection

```bash
# First run - establishes baseline
npm run full-test

# Check output shows no changes (first run)
# Output: "changed": false

# Modify test data to simulate changes
# Second run - detects changes
npm run full-test

# Output: "changed": true with specific change flags
```
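The two-run behavior above (no changes on the baseline run, specific flags on later runs) can be sketched as a field-by-field comparison against the previous snapshot. This is an illustration, not the Actor's `computeDiff()`; the snapshot field names (`salary`, `remote`, `seniority`, `requirements`) and `compute_flags` are assumptions.

```python
from typing import Optional

# Hypothetical snapshot field -> output flag mapping.
TRACKED = {
    "salary": "salaryChanged",
    "remote": "remoteStatusChanged",
    "seniority": "seniorityChanged",
    "requirements": "requirementsChanged",
}


def compute_flags(previous: Optional[dict], current: dict) -> dict:
    """Compare two normalized snapshots; a missing previous snapshot is a baseline."""
    flags = {flag: False for flag in TRACKED.values()}
    if previous is not None:
        for field, flag in TRACKED.items():
            flags[flag] = previous.get(field) != current.get(field)
    # "changed" is true if any specific flag is true.
    flags["changed"] = any(flags.values())
    return flags


baseline = {"salary": "$100k-120k", "remote": False, "seniority": "mid", "requirements": ["Python"]}
updated = dict(baseline, salary="$120k-150k", seniority="senior")

print(compute_flags(None, baseline))      # first run: nothing to compare against
print(compute_flags(baseline, updated))   # second run: salary and seniority differ
```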

#### Test AI Analysis

```bash
# Set API key
export OPENAI_API_KEY="sk-..."

# Run with AI enabled
npm run full-test

# Output includes aiSummary, aiSeverity, aiCategory
```

#### Test URL Processing

```bash
# Test newline-separated URLs
echo "https://example.com/job1
https://example.com/job2" > test-urls.txt

# Test comma-separated URLs
echo "https://example.com/job1, https://example.com/job2" > test-urls.txt
```

### 📈 Performance Characteristics

- **Memory**: ~30-50MB for 100 job postings
- **Speed**: ~20-40 pages/minute (network-dependent)
- **Storage**: ~2KB per job snapshot
- **Scalability**: Handles 1,000+ job postings efficiently
- **Cost**: ~$0.10 per 100 job checks

### 🔮 Future Enhancements

This Actor is designed as a foundational building block for:

- **Historical Trend Analysis** - Salary and requirement trends over time
- **Competitor Intelligence** - Cross-company compensation comparison
- **Hiring Urgency Detection** - Automated priority flagging
- **Multi-Platform Monitoring** - LinkedIn, Indeed, company career pages
- **Alert System** - Email/webhook notifications for critical changes
- **Custom Rules Engine** - XPath/CSS-based monitoring rules
- **Visual Diff** - Screenshot comparison for layout changes
- **Market Intelligence** - Aggregated job market insights

### 📚 Resources

- [Apify Documentation](https://docs.apify.com/)
- [Apify SDK](https://docs.apify.com/sdk/js/)
- [CheerioCrawler](https://crawlee.dev/docs/guides/cheerio-crawler)
- [Actor Store](https://apify.com/store)

### 🎓 Technical Notes

#### Why CheerioCrawler?

- Lightweight (no browser overhead)
- Fast HTML parsing
- Perfect for static job posting pages
- Cost-effective at scale

#### Why SHA-256 Hashing?

- Deterministic content fingerprinting
- Collision-resistant for data integrity
- Standard cryptographic security
- Fast computation for large datasets
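To show what deterministic content fingerprinting means in practice, the sketch below hashes a canonical JSON form of a snapshot so that key order does not affect the result. The canonicalization step is an assumption about one reasonable approach, and `fingerprint` is a hypothetical name.

```python
import hashlib
import json


def fingerprint(snapshot: dict) -> str:
    """Return a SHA-256 hex digest of a canonical JSON form of the snapshot."""
    # Sorting keys and fixing separators makes equal content serialize identically.
    canonical = json.dumps(
        snapshot, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = fingerprint({"title": "Software Engineer", "salary": "$100k-120k"})
b = fingerprint({"salary": "$100k-120k", "title": "Software Engineer"})
c = fingerprint({"title": "Senior Software Engineer", "salary": "$120k-150k"})
print(a == b, a == c)
```

Comparing two digests is a cheap first check; a field-level diff is only needed when the digests differ.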

#### Why Structured Normalization?

- Converts messy HTML to clean, comparable data
- Enables intelligent change detection
- Supports multiple job board formats
- Future-proof for new job platforms
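As a small illustration of why normalization makes values comparable, the sketch below turns free-text salary ranges such as `$100k-120k` into a numeric pair. It is not the Actor's `normalizeJobData()`; the regular expression and `parse_salary_range` are hypothetical and only cover simple dollar ranges where each bound carries its own `k` suffix, if any.

```python
import re
from typing import Optional, Tuple

# Matches "$100k-120k", "$100,000 - $120,000", "$100k to $120k" (illustrative only).
_SALARY = re.compile(
    r"\$\s*(\d[\d,]*(?:\.\d+)?)\s*(k)?\s*(?:-|–|to)\s*\$?\s*(\d[\d,]*(?:\.\d+)?)\s*(k)?\b",
    re.IGNORECASE,
)


def _to_number(value: str, has_k: bool) -> int:
    """Convert '100' plus a k suffix, or '100,000', to an integer amount."""
    amount = float(value.replace(",", ""))
    return int(round(amount * 1000)) if has_k else int(round(amount))


def parse_salary_range(text: str) -> Optional[Tuple[int, int]]:
    """Return (low, high) in dollars for the first range found, else None."""
    match = _SALARY.search(text)
    if match is None:
        return None
    low_raw, low_k, high_raw, high_k = match.groups()
    return _to_number(low_raw, bool(low_k)), _to_number(high_raw, bool(high_k))


print(parse_salary_range("Salary: $100k-120k"))
print(parse_salary_range("$100,000 - $120,000 per year"))
```

Once both postings are reduced to the same numeric form, differently formatted but equal ranges no longer register as a change.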

#### Why Apify Key-Value Store?

- Persists between Actor runs
- Enables historical comparison
- Cloud-compatible storage
- Automatic cleanup and management

### 📜 License

This Actor follows Apify's standard terms of service.

### 🤝 Contributing

This Actor was built with extensibility in mind. Key extension points:

- **Custom Normalizers** - Modify `normalizeJobData()` for new job formats
- **Alternative Diff Engines** - Extend `computeDiff()` with custom logic
- **Additional LLM Providers** - Replace OpenAI in `analyzeChanges()`
- **Custom Severity Scoring** - Update change detection thresholds
- **New Change Categories** - Add salary, benefits, location tracking

### 🏆 Competition-Grade Features

- ✅ **Deterministic output** - Same input always produces same results
- ✅ **Structured and readable** - Clean JSON with meaningful field names
- ✅ **No unnecessary dependencies** - Minimal, focused tech stack
- ✅ **Reusable foundation** - Extensible for various job monitoring needs
- ✅ **Code tells a story** - Self-documenting with clear abstractions
- ✅ **Production-ready** - Error handling, logging, validation
- ✅ **Judge-friendly demo mode** - Works without API keys
- ✅ **Extensive documentation** - Complete setup and usage guides

Built with ❤️ for the Apify ecosystem

# Actor input Schema

## `jobUrls` (type: `string`):

List of job posting URLs to monitor (one URL per line, e.g. https://example.com/job1
https://example.com/job2)

## `enableSemanticAnalysis` (type: `boolean`):

Enable AI-based explanation of job changes

## `runLabel` (type: `string`):

Optional label for grouping runs

## Actor input object example

```json
{
  "jobUrls": "https://example.com/job1\nhttps://example.com/job2",
  "enableSemanticAnalysis": false
}
```

# Actor output Schema

## `url` (type: `string`):

Job posting URL

## `title` (type: `string`):

Job title (may be null)

## `company` (type: `string`):

Company name (may be null)

## `changed` (type: `boolean`):

True if any meaningful changes detected

## `salaryChanged` (type: `boolean`):

True if salary range changed

## `remoteStatusChanged` (type: `boolean`):

True if remote status changed

## `seniorityChanged` (type: `boolean`):

True if seniority level changed

## `requirementsChanged` (type: `boolean`):

True if requirements changed

## `aiSummary` (type: `string`):

AI-generated change summary (may be null)

## `aiSeverity` (type: `string`):

Change severity: low, medium, or high (may be null)

## `aiCategory` (type: `string`):

Primary change category (may be null)

## `checkedAt` (type: `string`):

ISO timestamp of check

## `overview` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

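// Prepare Actor input (jobUrls is required: one URL per line)
const input = {
    "jobUrls": "https://example.com/job1\nhttps://example.com/job2",
    "enableSemanticAnalysis": false
};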

// Run the Actor and wait for it to finish
const run = await client.actor("muhammad-bilal/job-posting-drift-intelligence-actor").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

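# Prepare the Actor input (jobUrls is required: one URL per line)
run_input = {
    "jobUrls": "https://example.com/job1\nhttps://example.com/job2",
    "enableSemanticAnalysis": False,
}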

# Run the Actor and wait for it to finish
run = client.actor("muhammad-bilal/job-posting-drift-intelligence-actor").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{"jobUrls": "https://example.com/job1"}' |
apify call muhammad-bilal/job-posting-drift-intelligence-actor --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=muhammad-bilal/job-posting-drift-intelligence-actor",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Job Posting Drift Intelligence Actor",
        "description": "Job Posting Drift Intelligence monitors job listings over time and detects meaningful changes like salary updates, remote/onsite shifts, seniority inflation, and requirement changes. Turn static job posts into actionable job lifecycle intelligence",
        "version": "1.0",
        "x-build-id": "pvsIpQjfglv2Zp5xL"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/muhammad-bilal~job-posting-drift-intelligence-actor/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-muhammad-bilal-job-posting-drift-intelligence-actor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/muhammad-bilal~job-posting-drift-intelligence-actor/runs": {
            "post": {
                "operationId": "runs-sync-muhammad-bilal-job-posting-drift-intelligence-actor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/muhammad-bilal~job-posting-drift-intelligence-actor/run-sync": {
            "post": {
                "operationId": "run-sync-muhammad-bilal-job-posting-drift-intelligence-actor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "jobUrls"
                ],
                "properties": {
                    "jobUrls": {
                        "title": "Job URLs",
                        "type": "string",
                        "description": "List of job posting URLs to monitor (one URL per line, e.g. https://example.com/job1\nhttps://example.com/job2)"
                    },
                    "enableSemanticAnalysis": {
                        "title": "Enable AI Analysis",
                        "type": "boolean",
                        "description": "Enable AI-based explanation of job changes",
                        "default": false
                    },
                    "runLabel": {
                        "title": "Run Label",
                        "type": "string",
                        "description": "Optional label for grouping runs"
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
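For projects that call the REST API directly, the `run-sync-get-dataset-items` operation from the specification above can be prepared with the Python standard library alone. The sketch below only builds the request, using the path, `token` query parameter, and JSON body defined in the spec; `build_sync_request` is a hypothetical helper, and sending the request is left to the caller.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

API_BASE = "https://api.apify.com/v2"
ACTOR_ID = "muhammad-bilal~job-posting-drift-intelligence-actor"


def build_sync_request(token: str, actor_input: dict) -> Request:
    """Prepare a POST to run-sync-get-dataset-items with a JSON input body."""
    query = urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_ID}/run-sync-get-dataset-items?{query}"
    body = json.dumps(actor_input).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_sync_request(
    "<YOUR_API_TOKEN>",
    {"jobUrls": "https://example.com/job1", "enableSemanticAnalysis": False},
)
print(request.get_method(), request.full_url)

# To execute it, pass the request to urllib.request.urlopen() and parse the
# response body as JSON. Synchronous runs block until the Actor finishes.
```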
