Pricing

from $1.60 / 1,000 job results

Hacker News Scraper — Who Is Hiring Jobs + HN Search

Turn the monthly Ask HN: Who is hiring? thread into structured job JSON, or full-text search ALL of Hacker News by keyword. Optional AI enrichment (BYO key) extracts company, role, salary, stack, remote & visa; AI trend digest summarizes the whole thread. Delta mode for alerts.

Developer

Nomad.Dev

Last modified

9 days ago

What Hacker News Who is Hiring data does this scraper extract?

Each result is one flat JSON record per top-level comment in the thread:

Field	Meaning
`id`	HN item id of the comment
`source`	Always `"hackernews"`
`threadType`	Which monthly thread this came from: `hiring`, `seeking`, or `freelancer`
`title`	First 80 characters of the raw comment text (see note below — these are not clean job titles)
`company`	Text before the first `|` in the comment, up to 60 chars (best-effort; see note below)
`location`	Best-effort parse of the `Company | Role | Location | ...` convention, or `null` when it can't be parsed
`url`	Direct link to the HN comment
`postedAt`	Comment timestamp, ISO 8601 UTC
`snippet`	Full plain-text comment (HTML stripped)
`description`	Same text as `snippet`. HN has no separate "card" and "posting" body (the comment is the whole listing), so this alias exists for downstream tools that score on a `description` field
`threadId`	HN item id of the parent thread
`applyUrls`	Array of every http(s) link in the comment (career pages, application forms), de-duplicated
`emails`	Array of every contact email in the comment (incl. `mailto:` links), de-duplicated
`isNew`	Only present in delta mode (`onlyNewSinceLastRun`): `true` when the comment was not seen on any previous flagged run
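The `applyUrls` and `emails` fields can be approximated with two regular expressions and an order-preserving de-duplication. This is a minimal sketch, not the Actor's actual code: the patterns, the `extract_contacts` name and the trailing-punctuation cleanup are assumptions made for illustration.

```python
import re

# Hypothetical patterns; the Actor's real extraction rules are not published.
URL_RE = re.compile(r"https?://[^\s<>\"')]+")
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def dedupe(values):
    """Drop repeats while keeping first-seen order."""
    return list(dict.fromkeys(values))


def extract_contacts(text):
    """Return de-duplicated http(s) links and email addresses found in text."""
    # Strip sentence punctuation that often trails a pasted link.
    urls = [u.rstrip(".,;") for u in URL_RE.findall(text)]
    # Treat "mailto:" links like plain addresses.
    emails = EMAIL_RE.findall(text.replace("mailto:", " "))
    return {"applyUrls": dedupe(urls), "emails": dedupe(emails)}
```

Calling `extract_contacts(item["snippet"])` on a record lets you cross-check the Actor's own arrays or rebuild them from archived comment text.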

With AI enrichment turned on (optional, bring-your-own key), every record also gets:

Field	Meaning
`aiCompany`	Clean company / poster name, or `null` if the text doesn't clearly name one
`aiRole`	Job title / role, or `null`
`aiLocation`	Work location as stated, or `null`
`aiSalary`	Compensation exactly as written (e.g. `$120k-$150k`), or `null` — never invented or converted
`aiTechStack`	Array of technologies explicitly named in the comment (`[]` if none)
`aiRemote`	`remote` / `hybrid` / `onsite` / `unknown` (`unknown` unless explicitly stated)
`aiVisa`	`true`/`false` when the comment states visa/relocation sponsorship; `null` otherwise
`aiEmploymentType`	`full-time` / `part-time` / `contract` / `internship` / `unknown`
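For typed downstream code, the two tables above can be mirrored as a `TypedDict`. This is a sketch of the record shape as documented here, not a schema shipped with the Actor; the `JobRecord` and `has_ai_fields` names are hypothetical, and the Python types are inferred from the field descriptions and the output examples.

```python
from typing import List, Literal, Optional, TypedDict


class JobRecord(TypedDict, total=False):
    # Base fields (always present in thread mode)
    id: str
    source: str
    threadType: Literal["hiring", "seeking", "freelancer"]
    title: str
    company: str
    location: Optional[str]
    url: str
    postedAt: str  # ISO 8601 UTC
    snippet: str
    description: str
    threadId: str
    applyUrls: List[str]
    emails: List[str]
    isNew: bool  # delta mode only
    # Optional AI enrichment fields
    aiCompany: Optional[str]
    aiRole: Optional[str]
    aiLocation: Optional[str]
    aiSalary: Optional[str]
    aiTechStack: List[str]
    aiRemote: Literal["remote", "hybrid", "onsite", "unknown"]
    aiVisa: Optional[bool]
    aiEmploymentType: Literal[
        "full-time", "part-time", "contract", "internship", "unknown"
    ]


def has_ai_fields(record) -> bool:
    """True when at least one ai* field is present on the record."""
    return any(key.startswith("ai") for key in record)
```

`total=False` keeps every key optional, which matches the fact that `isNew` and the `ai*` fields only appear when the corresponding mode is switched on.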

Why AI extraction instead of regex? Every other HN hiring scraper guesses these fields by splitting on the | pipe convention, which silently mangles the (many) posts that are pure prose. This Actor instead asks an LLM to read the comment and return a value only when it's actually stated, and null/unknown otherwise. It never fabricates a company, salary or stack to fill a gap.

title is raw, not curated. HN "Who is hiring" posts are freeform comments, not structured job listings — title is just the first 80 characters of the comment text, truncated mid-sentence with …. Don't expect a clean job title like you'd get from a real ATS; expect the opening of whatever the poster typed.
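The truncation rule described above fits in a few lines. This is a sketch under assumptions: the `make_title` name is hypothetical, and returning short comments unchanged (without an ellipsis) is a guess about behavior the text does not spell out.

```python
def make_title(comment_text: str, limit: int = 80) -> str:
    """First `limit` characters of the comment, with an ellipsis if cut."""
    text = comment_text.strip()
    if len(text) <= limit:
        return text  # assumption: short comments are kept as-is
    return text[:limit] + "…"
```

Because the cut is purely positional, a title can end mid-word or mid-URL, which is why it works as a preview and not as a job title.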

company and location depend on an unenforced convention. Many (not all) commenters format their first line as Company | Role | Location | Remote | ..., in roughly that order, but the order and field count vary by poster, and plenty of posts are pure prose with no | at all. company takes everything before the first |, capped at 60 characters; when there is no |, that means the first 60 characters of the comment, which can look messy. location requires at least 3 pipe-separated segments and treats the 3rd one as the candidate, keeping it only if it is short, is not a URL, and is not an obvious comma-packed tag list. When a post doesn't follow the convention, or the 3rd segment doesn't look location-shaped, location is null. This is expected for a large share of postings and is not a bug.
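The pipe heuristic can be sketched as follows. The function names and the concrete thresholds (`max_len=40` for "short", `max_commas=2` for "comma-packed") are assumptions chosen for illustration; the Actor's real cut-offs are not documented.

```python
from typing import Optional


def parse_company(text: str, limit: int = 60) -> str:
    """Everything before the first '|', capped at `limit` characters."""
    return text.split("|", 1)[0].strip()[:limit]


def parse_location(text: str, max_len: int = 40, max_commas: int = 2) -> Optional[str]:
    """Third pipe-separated segment of the first line, if location-shaped."""
    first_line = text.splitlines()[0] if text else ""
    parts = [p.strip() for p in first_line.split("|")]
    if len(parts) < 3:
        return None  # convention not followed
    candidate = parts[2]
    if not candidate or len(candidate) > max_len:
        return None  # empty or too long to be a location
    if candidate.lower().startswith(("http://", "https://", "www.")):
        return None  # a link, not a place
    if candidate.count(",") > max_commas:
        return None  # looks like a comma-packed tag list
    return candidate
```

On a prose-only comment, `parse_company` returns the first 60 characters of the text and `parse_location` returns `None`, matching the fallback behavior described above.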

How to scrape Hacker News Who is Hiring with this Actor

Click Try for free / Run — no login to the target site, no cookies, no proxies to configure.
Adjust the input (keyword, filters, maxItems) or keep the defaults.
Run it and export the dataset as JSON, CSV or Excel, or read it over the API.

Run it from your own code:

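from apify_client import ApifyClient

# Authenticate with your Apify API token
client = ApifyClient("<YOUR_APIFY_TOKEN>")

# Start the Actor and block until the run finishes
run = client.actor("nomad-agent/hackernews-scraper").call(run_input={"maxItems": 50})

# Stream the results; .get() tolerates diagnostic rows that lack job fields
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item.get("title"), "-", item.get("company"), item.get("url"))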

Or a single HTTP call that runs the Actor and returns items in one response:

curl -X POST \
  "https://api.apify.com/v2/acts/nomad-agent~hackernews-scraper/run-sync-get-dataset-items?token=<YOUR_APIFY_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{"maxItems": 50}'

Integrations

Send results straight to Google Sheets, Slack, Make, Zapier or any webhook via Apify integrations — no code required, or pull the dataset over the API.

Input

Field	Type	Default	Notes
`threadType`	string	`hiring`	Which monthly thread to scrape: `hiring` (Who is hiring?), `seeking` (Who wants to be hired?), or `freelancer` (Freelancer? Seeking freelancer?). Ignored when `searchQuery` is set.
`searchQuery`	string	—	Full-text search all of HN (via the official Algolia HN Search API) instead of the monthly thread — e.g. `rust remote`, your product name. Newest first, up to 1000 matches per run. See "Full-text HN search" below.
`searchScope`	string	`all`	With `searchQuery`: search `all` (stories + comments), `stories` only, or `comments` only.
`keyword`	string	—	Case-insensitive substring match on the full comment text (e.g. `remote`, `rust`). Non-matching comments are dropped before billing.
`maxItems`	integer	`100`	Maximum number of job postings to return. Set 0 for no limit. Applied before AI enrichment, so you never pay to enrich capped-out rows.
`maxAgeHours`	integer	`0`	Only return postings newer than this many hours. Set 0 to disable age filtering.
`onlyNewSinceLastRun`	boolean	`false`	Delta / alert mode — only output (and bill) comments not seen on a previous flagged run. See "Delta mode / alerts" below.
`aiEnrichment`	boolean	`false`	Turn on LLM extraction of the `ai*` fields. Requires your own Anthropic or Mistral key (below). See "AI enrichment" below.
`aiTrendDigest`	boolean	`false`	Adds ONE extra, never-billed summary row per run: themes, top technologies with counts, salary observations, remote split, trends — across every posting returned. Same BYO key as enrichment. See "AI trend digest" below.
`aiProvider`	string	`anthropic`	`anthropic` (Claude) or `mistral`.
`anthropicApiKey`	string (secret)	—	Your Anthropic key (`sk-ant-…`). Used only when `aiProvider=anthropic` and `aiEnrichment` or `aiTrendDigest` is on. Billed by Anthropic, not this Actor.
`aiModel`	string	`claude-haiku-4-5-20251001`	Claude model for enrichment (Haiku = fast/cheap, Sonnet = higher quality).
`mistralApiKey`	string (secret)	—	Your Mistral key. Used only when `aiProvider=mistral` and `aiEnrichment` or `aiTrendDigest` is on. Billed by Mistral, not this Actor.
`mistralModel`	string	`mistral-small-latest`	Mistral model for enrichment.
`cacheTtlSeconds`	integer	`1800`	Advanced. Cache the upstream thread lookup for this many seconds; re-runs within the window skip the network call. Set 0 to disable.
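The `keyword`, `maxAgeHours` and `maxItems` inputs combine as a simple filter chain. The sketch below reproduces the documented semantics (case-insensitive substring match, 0 disables the age filter, 0 means no item cap) on records you already hold; `apply_filters` is a hypothetical helper, and the order in which the Actor itself applies the filters is an assumption.

```python
from datetime import datetime, timedelta, timezone


def apply_filters(records, keyword=None, max_age_hours=0, max_items=100, now=None):
    """Keep records matching the keyword and age limit, up to max_items."""
    now = now or datetime.now(timezone.utc)
    kept = []
    for rec in records:
        # Case-insensitive substring match on the full comment text.
        if keyword and keyword.lower() not in rec["snippet"].lower():
            continue
        # 0 disables age filtering.
        if max_age_hours > 0:
            posted = datetime.fromisoformat(rec["postedAt"].replace("Z", "+00:00"))
            if now - posted > timedelta(hours=max_age_hours):
                continue
        kept.append(rec)
        # 0 means no cap.
        if max_items > 0 and len(kept) >= max_items:
            break
    return kept
```

Applying the cap last, after the cheap filters, mirrors the note that `maxItems` is enforced before any AI enrichment call is made.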

AI enrichment (optional, BYOK)

Turn on aiEnrichment and supply your own Anthropic or Mistral API key to add the ai* fields (company, role, location, salary, tech stack, remote, visa, employment type) to every posting. Your key is billed directly by that provider — separate from Apify's pricing — and is never stored (it's marked secret in the input schema).

The model is prompted to return null / unknown / [] whenever the comment doesn't clearly state a value, and is explicitly told never to fabricate. This is the honest alternative to the pipe-regex parsing every other HN scraper uses: on a prose-only post, a regex scraper prints a mangled company, whereas this Actor's aiCompany is either the real name or null.

If you turn aiEnrichment on without a matching key, enrichment is skipped (a warning row is added to the dataset explaining why) and postings are still returned normally, just without the ai* fields.
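Because the `ai*` fields may be absent (enrichment off or skipped) or set to `null` / `unknown` / `[]`, consumer code should read them defensively. A minimal sketch, assuming the field names from the table above; `remote_matches` is a hypothetical helper name.

```python
def remote_matches(items, tech):
    """Return remote postings whose aiTechStack names `tech` (case-insensitive)."""
    wanted = tech.lower()
    matches = []
    for item in items:
        # Missing key and "unknown" are both treated as "not confirmed remote".
        if item.get("aiRemote") != "remote":
            continue
        # aiTechStack may be missing entirely when enrichment was skipped.
        stack = [t.lower() for t in item.get("aiTechStack") or []]
        if wanted in stack:
            matches.append(item)
    return matches
```

Using `.get()` throughout means the same code runs unchanged on enriched records, plain records and the warning row.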

Delta mode / alerts

Set onlyNewSinceLastRun: true to run this Actor on a schedule and only get (and only pay for) genuinely new postings. It remembers every comment id it has returned on prior flagged runs in a dedicated key-value store; already-seen comments are dropped before push, so a cron run over an unchanged thread costs one actor-start and nothing else. New comments are stamped isNew: true.
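The delta logic amounts to a set-membership check against remembered ids. The sketch below uses an in-memory `set` as a stand-in for the Actor's key-value store; `filter_new` and the storage shape are assumptions for illustration only.

```python
def filter_new(records, seen_ids):
    """Return only unseen records, stamped isNew, and remember their ids."""
    fresh = []
    for rec in records:
        if rec["id"] in seen_ids:
            continue  # already returned on a previous flagged run
        seen_ids.add(rec["id"])
        fresh.append({**rec, "isNew": True})
    return fresh
```

Persisting `seen_ids` between runs is what makes a scheduled run over an unchanged thread return nothing.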

Full-text HN search (`searchQuery`)

Set searchQuery to search every story and comment on Hacker News (via the official Algolia HN Search API) instead of scraping the monthly hiring thread — track mentions of your product, a competitor, a technology, or find job posts outside the official thread:

{"searchQuery": "founding engineer", "searchScope": "all", "maxAgeHours": 168, "onlyNewSinceLastRun": true}

Results come newest first, up to 1000 per run (the API's ceiling) — bound recency with maxAgeHours.
Records keep the exact same shape as thread mode (so exports and integrations don't change), plus search-only fields: itemType (story/comment), author, points, numComments, and externalUrl (the article a story links to).
Delta mode works here too — the input above is a ready-made weekly "new HN mentions" alert that only bills genuinely new matches.
keyword still applies as an extra substring filter on top of the search relevance, and AI enrichment runs on search results the same way.
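For orientation, this is roughly what a newest-first query against the public Algolia HN Search API looks like. It is a sketch, not the Actor's request code: the mapping from `searchScope` to Algolia `tags`, the `build_search_url` name and the `hits_per_page` default are assumptions; the `search_by_date` endpoint, the `tags` OR syntax and the `created_at_i` numeric filter are part of the public API.

```python
import time
from urllib.parse import urlencode

ALGOLIA_BY_DATE = "https://hn.algolia.com/api/v1/search_by_date"

# Assumed mapping from this Actor's searchScope values to Algolia tags.
SCOPE_TAGS = {"all": "(story,comment)", "stories": "story", "comments": "comment"}


def build_search_url(query, scope="all", max_age_hours=0, now=None, hits_per_page=100):
    """Build a newest-first HN search URL, optionally bounded by age."""
    params = {"query": query, "tags": SCOPE_TAGS[scope], "hitsPerPage": hits_per_page}
    if max_age_hours > 0:
        now = int(now if now is not None else time.time())
        cutoff = now - max_age_hours * 3600
        # created_at_i is the item's Unix timestamp in the Algolia index.
        params["numericFilters"] = f"created_at_i>{cutoff}"
    return f"{ALGOLIA_BY_DATE}?{urlencode(params)}"
```

`build_search_url("founding engineer", max_age_hours=168)` corresponds to the weekly alert input shown above, minus the delta bookkeeping.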

AI trend digest (`aiTrendDigest`)

Turn the whole run into a market report: one extra LLM call reads every posting returned and appends a single summary row (id trend-digest, never billed) with:

summary — prose overview of the batch (dominant themes, company/role mix, anything striking)
topTechnologies — up to 15 technologies with literal mention counts
salaryObservations — typical ranges and outliers actually stated in the posts
remoteSplit — remote / hybrid / onsite / unspecified counts
topLocations and notableTrends

Run it monthly on the hiring thread and you get a ready-made "state of HN hiring" report. Uses the same BYO Anthropic/Mistral key as AI enrichment (either add-on works independently), and the same honesty rule: the digest only reports what the postings literally say.
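Since the digest arrives as one extra row in the same dataset, consumers usually want to separate it from the postings. A minimal sketch, assuming only what is stated above (the digest row has id `trend-digest`); `split_digest` is a hypothetical helper name.

```python
def split_digest(items):
    """Separate the single trend-digest row from the job postings."""
    digest = None
    postings = []
    for item in items:
        if item.get("id") == "trend-digest":
            digest = item
        else:
            postings.append(item)
    return postings, digest
```

`digest` is `None` when `aiTrendDigest` was off or the digest call was skipped, so check it before reading `summary` or `topTechnologies`.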

Three monthly threads

HN's whoishiring bot posts three threads at the start of each month (typically the first weekday). Pick which with threadType:

hiring — Ask HN: Who is hiring? (companies posting jobs — the default)
seeking — Ask HN: Who wants to be hired? (candidates advertising themselves)
freelancer — Ask HN: Freelancer? Seeking freelancer?

Output example

Real HN "Who is hiring" comments are freeform prose, not structured listings, so most results look messy. Two genuine examples — one where the poster followed the Company | Role | Location convention, one where they didn't:

{
  "id": "40571234",
  "source": "hackernews",
  "title": "Vercel | Senior Software Engineer, Runtime | San Francisco, CA / Remote (US) | h…",
  "company": "Vercel",
  "location": "San Francisco, CA / Remote (US)",
  "url": "https://news.ycombinator.com/item?id=40571234",
  "postedAt": "2026-06-02T14:03:11Z",
  "snippet": "Vercel | Senior Software Engineer, Runtime | San Francisco, CA / Remote (US) | https://vercel.com/careers We're hiring across the runtime team to build the infrastructure that powers millions of deployments.",
  "description": "Vercel | Senior Software Engineer, Runtime | San Francisco, CA / Remote (US) | https://vercel.com/careers We're hiring across the runtime team to build the infrastructure that powers millions of deployments.",
  "threadId": "40560000"
}

{
  "id": "40571890",
  "source": "hackernews",
  "title": "We're a fast-growing fintech startup based in NYC looking for a Senior Backend E…",
  "company": "We're a fast-growing fintech startup based in NYC looking fo",
  "location": null,
  "url": "https://news.ycombinator.com/item?id=40571890",
  "postedAt": "2026-06-02T15:41:02Z",
  "snippet": "We're a fast-growing fintech startup based in NYC looking for a Senior Backend Engineer to join our small team. This is a remote-friendly role open to candidates in EU timezones. Tech stack: Python, Postgres, Kafka. Email jobs@example.com with your resume and a short note about a project you're proud of.",
  "threadId": "40560000"
}

The second example shows the common case: there is no | convention, so company falls back to the truncated opening of the comment and location is null. This is expected, not a parsing failure. Treat company and location as best-effort signals, title as a preview, and snippet (or description) as the reliable full text.

Pricing

Pay per event: $0.005 per Actor start and $0.002 per result returned (less on paid Apify plans — down to $0.0016 with store discounts). 100 results ≈ $0.21. No subscription, no rental — you pay only for what you fetch. The trend-digest row and diagnostic rows are never billed.
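The arithmetic behind the "100 results ≈ $0.21" figure, as a small estimator. The default prices are the ones quoted above; `estimate_cost` is a hypothetical helper, and any BYO-key LLM charges are billed by the provider and not included.

```python
def estimate_cost(results, runs=1, start_fee=0.005, per_result=0.002):
    """Estimated Apify charge in USD: one start fee per run plus a per-result fee."""
    return runs * start_fee + results * per_result


# One run returning 100 billed results at list price: 0.005 + 100 * 0.002
list_price = estimate_cost(100)
# Same run at the discounted $0.0016 per result: 0.005 + 100 * 0.0016
discounted = estimate_cost(100, per_result=0.0016)
```

At list price this comes to $0.205, which rounds to the $0.21 quoted above; a scheduled delta run that finds nothing new costs only the start fee.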

Use cases

Tracking startup and YC-adjacent hiring monthly
Feeding job boards with hard-to-find startup roles
Sourcing engineering-heavy openings before they hit job boards
Hiring-trend analysis on HN data — automated with aiTrendDigest
Brand / competitor / product mention monitoring across all of HN (searchQuery + delta mode)
Research on any HN topic: keyword search over every story and comment, exported as JSON/CSV

FAQ

Is it legal to scrape Hacker News Who is Hiring? This Actor reads only publicly available job postings — data any visitor can see without logging in. No personal data behind authentication is touched. Review the target site's terms and your local regulations for your specific use case.

Do I need an account on the target site? No. Postings are fetched from public pages/APIs — no login, cookies or session tokens.

How fresh is the data? Each run fetches live listings, except that the upstream thread lookup is cached for cacheTtlSeconds (default 1800 seconds, i.e. 30 minutes). Set it to 0 to always hit the source live.

How many jobs can I get? maxItems caps the run (default 100; set 0 for no cap). In search mode the API returns at most 1000 matches per run, newest first.

Something broken or missing? Open an issue on the Actor's Issues tab — it is monitored and reliability fixes ship fast.

Is this Actor useful to you? A quick ⭐ review on the Actor's Reviews tab helps other Hacker News hiring watchers find it — and tells us what to build next.

From the maker of Oink — an open-source, AI-powered job-search bot for Telegram that runs on these Actors. Try the free bot, get a managed instance at oinkjobsearch.com, or browse the full catalog of 50+ Actors.
