# Hacker News Who Is Hiring Jobs Parser (`enviable_shell/hn-who-is-hiring-jobs-parser`) Actor

Parse monthly Ask HN: Who is hiring? threads into structured startup and tech job leads.

- **URL**: https://apify.com/enviable_shell/hn-who-is-hiring-jobs-parser.md
- **Developed by:** [佳斌 王](https://apify.com/enviable_shell) (community)
- **Categories:** Jobs, Automation, Lead generation
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor itself is free to use; you pay only for the Apify platform usage, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server that can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Hacker News Who Is Hiring Jobs Parser

Extract structured job leads from Hacker News monthly `Ask HN: Who is hiring?` threads.

This Apify Actor turns top-level hiring comments into spreadsheet-ready records with company, locations, remote/visa/intern flags, salary hints, roles, tech keywords, contact links, and the original Hacker News permalink.

### Who should use this

- Recruiters and sourcers looking for startup hiring leads
- Job seekers tracking remote-friendly engineering, product, data, design, and AI roles
- Talent marketplaces building a monthly HN job feed
- Market researchers tracking which startups are hiring and what stacks they use
- Indie hackers or newsletter operators curating tech job opportunities

### Why this Actor

Generic Hacker News scrapers return raw stories and comments. Recruiters, job seekers, and market researchers usually need one row per hiring post with normalized fields they can filter in CSV, Google Sheets, Airtable, or an ATS workflow.

### Best first run

Parse the latest monthly HN hiring thread:

```json
{
  "maxJobs": 500,
  "requireHiringSignal": true,
  "includeRawHtml": false
}
````

If you need a specific month, paste the HN thread URL into `threadUrl`.
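An HN thread URL carries the item ID in its `id` query parameter, so the same thread can also be passed as `threadId`. A minimal sketch of that conversion, assuming the standard `https://news.ycombinator.com/item?id=...` URL shape; the helper name `thread_id_from_url` is hypothetical and not part of the Actor:

```python
from urllib.parse import urlparse, parse_qs


def thread_id_from_url(thread_url):
    """Return the integer item ID from an HN item URL, or None if it is missing."""
    parsed = urlparse(thread_url)
    ids = parse_qs(parsed.query).get("id", [])
    if ids and ids[0].isdigit():
        return int(ids[0])
    return None


# The ID below is the illustrative one used in this README's output example.
print(thread_id_from_url("https://news.ycombinator.com/item?id=45123456"))
```

Extracting the ID yourself is optional, since the Actor accepts either field.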

### Scheduled monitoring examples

Use an Apify schedule shortly after the monthly thread appears to create a recurring hiring-intelligence workflow.

#### Monthly startup hiring feed

```json
{
  "maxJobs": 1000,
  "requireHiringSignal": true,
  "includeRawHtml": false
}
```

Suggested schedule: monthly. Export the dataset to CSV, Google Sheets, Airtable, Notion, or your own job board.

#### Remote engineering role watch

```json
{
  "maxJobs": 1000,
  "requireHiringSignal": true,
  "includeRawHtml": false
}
```

Suggested workflow: filter output rows where `remote=true`, then filter `roles` and `techKeywords` for your target stack.
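That filter could look roughly like the following in Python, assuming the dataset items are already loaded as a list of dicts with the field names from the Output section. The helper `filter_remote_roles` and the sample rows are hypothetical, and the matching is a plain case-insensitive check rather than anything the Actor does itself:

```python
def filter_remote_roles(rows, role_terms, tech_terms):
    """Keep remote rows whose roles and techKeywords match the target stack."""
    role_terms = [term.lower() for term in role_terms]
    tech_terms = {term.lower() for term in tech_terms}
    matches = []
    for row in rows:
        if not row.get("remote"):
            continue
        roles = [role.lower() for role in row.get("roles", [])]
        keywords = {kw.lower() for kw in row.get("techKeywords", [])}
        # A role matches when any search term appears inside any role title.
        role_ok = any(term in role for term in role_terms for role in roles)
        # A stack matches when at least one keyword overlaps.
        tech_ok = bool(tech_terms & keywords)
        if role_ok and tech_ok:
            matches.append(row)
    return matches


# Hypothetical sample rows using the documented field names.
sample_rows = [
    {"company": "Example AI", "remote": True,
     "roles": ["Backend Engineer"], "techKeywords": ["python", "postgres"]},
    {"company": "Onsite Co", "remote": False,
     "roles": ["Backend Engineer"], "techKeywords": ["python"]},
    {"company": "Design Shop", "remote": True,
     "roles": ["Product Designer"], "techKeywords": ["figma"]},
]

remote_backend = filter_remote_roles(sample_rows, ["backend"], ["python"])
print([row["company"] for row in remote_backend])
```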

#### Compensation and visa-signal research

```json
{
  "maxJobs": 2000,
  "requireHiringSignal": true,
  "includeRawHtml": false
}
```

Suggested workflow: filter rows where `compensationSignal=true`, `salaryText` is not empty, or `visa=true`.
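As a sketch, the three conditions can be combined into one predicate. The function name and sample rows below are hypothetical; only the field names come from the Output section:

```python
def has_comp_or_visa_signal(row):
    """True if the row has a compensation flag, non-empty salary text, or a visa flag."""
    salary_text = (row.get("salaryText") or "").strip()
    return bool(row.get("compensationSignal") or salary_text or row.get("visa"))


# Hypothetical sample rows.
sample_rows = [
    {"company": "Example AI", "compensationSignal": True,
     "salaryText": "$120k-$180k", "visa": False},
    {"company": "Visa Friendly Inc", "compensationSignal": False,
     "salaryText": "", "visa": True},
    {"company": "Quiet Co", "compensationSignal": False,
     "salaryText": "", "visa": False},
]

flagged = [row["company"] for row in sample_rows if has_comp_or_visa_signal(row)]
print(flagged)
```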

### Input

```json
{
  "maxJobs": 200,
  "requireHiringSignal": true
}
```

Optional fields:

- `threadUrl`: parse a specific HN thread URL.
- `threadId`: parse a specific HN item ID.
- `includeRawHtml`: include the original HN HTML for custom parsing.

If neither `threadUrl` nor `threadId` is provided, the Actor auto-detects the latest monthly `Ask HN: Who is hiring?` thread using the public HN Algolia API.
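The Actor's exact detection query is not documented here. One way such a lookup could work is sketched below, assuming the monthly threads are posted by the `whoishiring` account and that search results arrive as Algolia hits with `title`, `objectID`, and `created_at_i` fields. Both function names are hypothetical, and the sample hits are made up:

```python
from urllib.parse import urlencode

ALGOLIA_SEARCH_URL = "https://hn.algolia.com/api/v1/search_by_date"


def build_search_url(author="whoishiring", hits_per_page=20):
    """Build a query for recent stories posted by the given HN account."""
    query = urlencode({"tags": f"story,author_{author}", "hitsPerPage": hits_per_page})
    return f"{ALGOLIA_SEARCH_URL}?{query}"


def pick_latest_hiring_thread(hits):
    """Return the item ID of the newest 'Who is hiring?' story, or None."""
    candidates = [
        hit for hit in hits
        if "who is hiring" in (hit.get("title") or "").lower()
    ]
    if not candidates:
        return None
    newest = max(candidates, key=lambda hit: hit.get("created_at_i", 0))
    return int(newest["objectID"])


# Made-up hits shaped like Algolia search results (assumption).
sample_hits = [
    {"objectID": "101", "title": "Ask HN: Who is hiring? (May)", "created_at_i": 1000},
    {"objectID": "202", "title": "Ask HN: Who wants to be hired? (June)", "created_at_i": 2001},
    {"objectID": "203", "title": "Ask HN: Who is hiring? (June)", "created_at_i": 2000},
]

print(build_search_url())
print(pick_latest_hiring_thread(sample_hits))
```

The title check matters because the same account also posts other monthly threads, such as "Who wants to be hired?".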

### Output

Each row is one top-level hiring comment:

```json
{
  "threadId": 45123456,
  "commentId": 45124567,
  "hnUrl": "https://news.ycombinator.com/item?id=45124567",
  "company": "Example AI",
  "headline": "Example AI | Backend Engineer, Product Designer | Remote (US/EU) | Full-time",
  "locations": ["Remote", "US", "EU"],
  "remote": true,
  "onsite": false,
  "visa": false,
  "internship": false,
  "salaryText": "$120k-$180k",
  "compensationSignal": true,
  "roles": ["Backend Engineer", "Product Designer"],
  "techKeywords": ["python", "typescript", "postgres"],
  "contactUrls": ["https://example.com/careers"],
  "author": "hn_user",
  "createdAt": "2026-06-01T15:20:00.000Z",
  "text": "Plain text version of the post...",
  "source": "hacker-news-who-is-hiring",
  "scrapedAt": "2026-06-05T09:00:00.000Z"
}
```

### Use cases

- Build a monthly tech job lead list.
- Track which startups are hiring across HN.
- Filter remote-friendly roles by stack, region, salary, or visa signal.
- Feed fresh hiring posts into Google Sheets, Airtable, Notion, or an internal job board.

### Why this Actor instead of a generic HN scraper

Generic HN scrapers return raw comments. This Actor is tuned for the monthly `Who is hiring?` workflow:

- Detects the latest monthly hiring thread automatically
- Parses top-level hiring comments into one row per job lead
- Extracts company, role, location, remote, onsite, visa, internship, and salary signals
- Pulls out tech keywords for stack-based filtering
- Extracts contact and careers URLs
- Saves direct HN permalinks for each hiring post
- Produces spreadsheet-ready data for recruiting, job search, and market research

If you only need raw HN comments, use a generic scraper. If you want structured hiring leads, use this Actor.

### Practical workflow

1. Run after the monthly HN `Who is hiring?` thread is posted.
2. Keep `requireHiringSignal=true` for clean results.
3. Export the dataset to CSV or Google Sheets.
4. Filter by `remote`, `locations`, `roles`, `techKeywords`, `visa`, and `compensationSignal`.
5. Use `hnUrl` to inspect the original post before contacting a company.
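Apify can export the dataset to CSV directly. If you post-process rows locally instead, the list-valued fields need flattening first. A minimal sketch under that assumption; the column selection, the `"; "` separator, and the helper name `rows_to_csv` are illustrative choices, not the Actor's export format:

```python
import csv
import io

LIST_FIELDS = ("locations", "roles", "techKeywords", "contactUrls")
COLUMNS = ("company", "headline", "remote", "visa", "salaryText",
           "locations", "roles", "techKeywords", "contactUrls", "hnUrl")


def rows_to_csv(rows):
    """Serialize output rows to CSV text, joining list fields with '; '."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops fields that are not in COLUMNS.
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        flat = dict(row)
        for field in LIST_FIELDS:
            flat[field] = "; ".join(row.get(field) or [])
        writer.writerow(flat)
    return buffer.getvalue()


# Sample row taken from the Output example in this README (trimmed).
sample_rows = [{
    "company": "Example AI",
    "headline": "Example AI | Backend Engineer, Product Designer | Remote (US/EU) | Full-time",
    "remote": True,
    "visa": False,
    "salaryText": "$120k-$180k",
    "locations": ["Remote", "US", "EU"],
    "roles": ["Backend Engineer", "Product Designer"],
    "techKeywords": ["python", "typescript", "postgres"],
    "contactUrls": ["https://example.com/careers"],
    "hnUrl": "https://news.ycombinator.com/item?id=45124567",
}]

csv_text = rows_to_csv(sample_rows)
print(csv_text)
```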

### What you can automate next

- Monthly hiring lead CSV for recruiters or job seekers
- Remote-only startup job board
- Airtable base of companies hiring on HN
- Market map of startup hiring demand by tech stack
- Salary/visa signal tracker across monthly HN threads

### Notes

- Uses public Hacker News Firebase and Algolia APIs.
- No login, cookies, proxy, or browser automation required.
- Parses top-level comments only, because those are the actual job posts in the monthly thread.
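To illustrate the top-level-only point: in the public Hacker News Firebase API, an item's `kids` array lists its direct child comments, so a thread's `kids` are the top-level posts. The sketch below only builds the item URLs and makes no network calls; the helper names and sample IDs are hypothetical, and this is not necessarily how the Actor fetches comments:

```python
FIREBASE_ITEM_URL = "https://hacker-news.firebaseio.com/v0/item/{item_id}.json"


def item_url(item_id):
    """Return the Firebase API URL for a single HN item."""
    return FIREBASE_ITEM_URL.format(item_id=item_id)


def top_level_comment_urls(thread_item, limit=None):
    """Return item URLs for the thread's direct children only (no nested replies)."""
    kids = thread_item.get("kids") or []
    if limit is not None:
        kids = kids[:limit]
    return [item_url(kid) for kid in kids]


# Made-up thread item shaped like a Firebase API response.
sample_thread = {"id": 45123456, "type": "story", "kids": [45124567, 45124890, 45125001]}

for url in top_level_comment_urls(sample_thread, limit=2):
    print(url)
```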

# Actor input Schema

## `threadUrl` (type: `string`):

Optional Hacker News item URL for a specific Ask HN: Who is hiring? thread. If empty, the Actor auto-detects the latest monthly thread.

## `threadId` (type: `integer`):

Optional HN item ID. Overrides auto-detection when provided.

## `maxJobs` (type: `integer`):

Maximum number of top-level job posts to return.

## `includeRawHtml` (type: `boolean`):

Include the original HN comment HTML in each output row for debugging or custom downstream parsing.

## `requireHiringSignal` (type: `boolean`):

Skip comments that do not look like job postings. Turn off if you want every top-level comment.

## Actor input object example

```json
{
  "maxJobs": 500,
  "includeRawHtml": false,
  "requireHiringSignal": true
}
```
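When building the input in code, it can help to apply the defaults and bounds up front. The sketch below uses the limits listed in the OpenAPI specification in this document (`maxJobs` between 1 and 5000 with a default of 500, `threadId` at least 1); the helper `build_actor_input` itself is hypothetical:

```python
def build_actor_input(thread_url=None, thread_id=None, max_jobs=500,
                      include_raw_html=False, require_hiring_signal=True):
    """Return an input dict for the Actor, validating the documented bounds."""
    if not 1 <= max_jobs <= 5000:
        raise ValueError("maxJobs must be between 1 and 5000")
    if thread_id is not None and thread_id < 1:
        raise ValueError("threadId must be a positive integer")
    actor_input = {
        "maxJobs": max_jobs,
        "includeRawHtml": include_raw_html,
        "requireHiringSignal": require_hiring_signal,
    }
    # Optional fields are left out entirely so auto-detection stays active.
    if thread_url:
        actor_input["threadUrl"] = thread_url
    if thread_id is not None:
        actor_input["threadId"] = thread_id
    return actor_input


print(build_actor_input())
print(build_actor_input(thread_id=45123456, max_jobs=200))
```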

# Actor output Schema

## `jobs` (type: `string`):

One dataset row per top-level hiring comment, including company, roles, locations, remote/visa/salary signals, tech keywords, contact links, and Hacker News permalinks.

## `summary` (type: `string`):

Markdown summary with thread link, job count, remote count, visa signal count, and compensation signal count.
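The same counts can be recomputed from the dataset rows if you want your own report. A rough sketch, assuming rows shaped like the Output example; the Markdown layout and the function name `summarize_rows` are illustrative and do not reproduce the Actor's actual summary format:

```python
def summarize_rows(thread_id, rows):
    """Build a small Markdown summary with job, remote, visa, and compensation counts."""
    remote_count = sum(1 for row in rows if row.get("remote"))
    visa_count = sum(1 for row in rows if row.get("visa"))
    comp_count = sum(1 for row in rows if row.get("compensationSignal"))
    lines = [
        f"Thread: https://news.ycombinator.com/item?id={thread_id}",
        f"- Jobs: {len(rows)}",
        f"- Remote: {remote_count}",
        f"- Visa signal: {visa_count}",
        f"- Compensation signal: {comp_count}",
    ]
    return "\n".join(lines)


# Hypothetical sample rows.
sample_rows = [
    {"remote": True, "visa": False, "compensationSignal": True},
    {"remote": True, "visa": True, "compensationSignal": False},
    {"remote": False, "visa": False, "compensationSignal": False},
]

summary_text = summarize_rows(45123456, sample_rows)
print(summary_text)
```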

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "threadUrl": ""
};

// Run the Actor and wait for it to finish
const run = await client.actor("enviable_shell/hn-who-is-hiring-jobs-parser").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "threadUrl": "" }

# Run the Actor and wait for it to finish
run = client.actor("enviable_shell/hn-who-is-hiring-jobs-parser").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "threadUrl": ""
}' |
apify call enviable_shell/hn-who-is-hiring-jobs-parser --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=enviable_shell/hn-who-is-hiring-jobs-parser",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Hacker News Who Is Hiring Jobs Parser",
        "description": "Parse monthly Ask HN: Who is hiring? threads into structured startup and tech job leads.",
        "version": "0.1",
        "x-build-id": "IIzeZi1g5Kuskz3dS"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/enviable_shell~hn-who-is-hiring-jobs-parser/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-enviable_shell-hn-who-is-hiring-jobs-parser",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/enviable_shell~hn-who-is-hiring-jobs-parser/runs": {
            "post": {
                "operationId": "runs-sync-enviable_shell-hn-who-is-hiring-jobs-parser",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/enviable_shell~hn-who-is-hiring-jobs-parser/run-sync": {
            "post": {
                "operationId": "run-sync-enviable_shell-hn-who-is-hiring-jobs-parser",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "threadUrl": {
                        "title": "Thread URL",
                        "type": "string",
                        "description": "Optional Hacker News item URL for a specific Ask HN: Who is hiring? thread. If empty, the Actor auto-detects the latest monthly thread."
                    },
                    "threadId": {
                        "title": "Thread ID",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Optional HN item ID. Overrides auto-detection when provided."
                    },
                    "maxJobs": {
                        "title": "Maximum jobs",
                        "minimum": 1,
                        "maximum": 5000,
                        "type": "integer",
                        "description": "Maximum number of top-level job posts to return.",
                        "default": 500
                    },
                    "includeRawHtml": {
                        "title": "Include raw HTML",
                        "type": "boolean",
                        "description": "Include the original HN comment HTML in each output row for debugging or custom downstream parsing.",
                        "default": false
                    },
                    "requireHiringSignal": {
                        "title": "Require hiring signal",
                        "type": "boolean",
                        "description": "Skip comments that do not look like job postings. Turn off if you want every top-level comment.",
                        "default": true
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
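For projects that call the REST API without a client library, the `run-sync-get-dataset-items` path from the specification above can be addressed with the Python standard library. The sketch below only constructs the request and does not send it; the helper name `build_sync_request` is hypothetical, and the token placeholder must be replaced with a real token before use:

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/enviable_shell~hn-who-is-hiring-jobs-parser/run-sync-get-dataset-items"


def build_sync_request(token, actor_input):
    """Build a POST request that runs the Actor and returns its dataset items."""
    # The specification declares the token as a required query parameter.
    url = f"{API_BASE}{ACTOR_PATH}?{urlencode({'token': token})}"
    body = json.dumps(actor_input).encode("utf-8")
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/json"})


request = build_sync_request("<YOUR_API_TOKEN>", {"maxJobs": 50})
print(request.get_method(), request.full_url.split("?")[0])
# To send it, pass the request to urllib.request.urlopen() and JSON-decode the body.
```

A synchronous call waits for the run to finish, so for long runs the `/runs` path in the specification may be a better fit.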
