Hacker News Who Is Hiring Scraper — HN Jobs
Turn the monthly “Ask HN: Who is hiring?” thread into structured job data. Each top-level comment becomes a flat JSON record with a best-effort company and location, a short title preview and the full text. Track startup hiring straight from Hacker News without reading 800 comments.
Parse the famous monthly “Ask HN: Who is hiring?” thread into one structured job record per top-level comment.
What Hacker News Who is Hiring data does this scraper extract?
Each result is one flat JSON record per top-level comment in the thread:
| Field | Meaning |
|---|---|
| id | HN item id of the comment |
| source | Always "hackernews" |
| title | First 80 characters of the raw comment text (see note below; these are not clean job titles) |
| company | Text before the first \| in the comment, up to 60 chars (best-effort; see note below) |
| location | Best-effort parse of the Company \| Role \| Location \| ... convention, or null when it can't be parsed |
| url | Direct link to the HN comment |
| postedAt | Comment timestamp, ISO 8601 UTC |
| snippet | Full plain-text comment (HTML stripped) |
| description | Same text as snippet. HN has no separate "card" vs "posting" body (the comment is the whole listing), so this field is an alias for downstream tools that score on a description field |
| threadId | HN item id of the parent "Who is hiring" thread |
title is raw, not curated. HN "Who is hiring" posts are freeform comments, not structured job listings — title is just the first 80 characters of the comment text, truncated mid-sentence with …. Don't expect a clean job title like you'd get from a real ATS; expect the opening of whatever the poster typed.
company and location depend on an unenforced convention. Many (not all) commenters format their first line as Company | Role | Location | Remote | ..., in roughly that order, but the order and field count vary by poster, and plenty of posts are pure prose with no | at all. company takes everything before the first |, capped at 60 characters; when there's no |, that means the first 60 characters of the comment, which can look messy. location requires at least 3 pipe-separated segments and treats the 3rd one as the candidate, keeping it only if it's short, isn't a URL, and isn't an obvious comma-packed tag list. When a post doesn't follow the convention, or the 3rd segment doesn't look location-shaped, location is null. This is expected for a large share of postings, not a bug.
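The rules in the two notes above can be sketched in a few lines of Python. This is an illustration of the described behaviour, not the Actor's actual source: the function names are hypothetical, and the thresholds for "short" (60 characters) and "comma-packed" (more than 2 commas) are assumptions, since the notes don't state exact values.

```python
from typing import Optional

TITLE_LEN = 80           # from the field table: title is the first 80 characters
COMPANY_LEN = 60         # from the field table: company is capped at 60 characters
MAX_LOCATION_LEN = 60    # assumption: what counts as a "short" segment
MAX_LOCATION_COMMAS = 2  # assumption: more commas than this looks like a tag list


def parse_title(text: str) -> str:
    """First 80 characters, with an ellipsis appended when the text was cut."""
    return text if len(text) <= TITLE_LEN else text[:TITLE_LEN] + "…"


def parse_company(text: str) -> str:
    """Everything before the first '|' (or the whole text), capped at 60 chars."""
    return text.split("|", 1)[0].strip()[:COMPANY_LEN]


def parse_location(text: str) -> Optional[str]:
    """Third pipe-separated segment of the first line, if it looks like a place."""
    lines = text.splitlines()
    first_line = lines[0] if lines else ""
    segments = [s.strip() for s in first_line.split("|")]
    if len(segments) < 3:
        return None  # convention not followed
    candidate = segments[2]
    looks_like_url = "://" in candidate or candidate.lower().startswith("www.")
    if (
        not candidate
        or len(candidate) > MAX_LOCATION_LEN
        or looks_like_url
        or candidate.count(",") > MAX_LOCATION_COMMAS
    ):
        return None  # third segment is not location-shaped
    return candidate
```

A pure-prose comment has no | at all, so parse_location returns None and parse_company returns the first 60 characters of the opening sentence.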
How to scrape Hacker News Who is Hiring with this Actor
- Click Try for free / Run — no login to the target site, no cookies, no proxies to configure.
- Adjust the input (maxItems, maxAgeHours, cacheTtlSeconds) or keep the defaults.
- Run it and export the dataset as JSON, CSV or Excel, or read it over the API.
Run it from your own code:
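```python
from apify_client import ApifyClient

# Authenticate with your Apify API token.
client = ApifyClient("<YOUR_APIFY_TOKEN>")

# Start the Actor and wait for the run to finish.
run = client.actor("nomad-agent/hackernews-scraper").call(run_input={"maxItems": 50})

# Read the results from the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item["title"], "-", item["company"], item["url"])
```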
Or a single HTTP call that runs the Actor and returns items in one response:
```bash
curl -X POST \
  "https://api.apify.com/v2/acts/nomad-agent~hackernews-scraper/run-sync-get-dataset-items?token=<YOUR_APIFY_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{"maxItems": 50}'
```
Integrations
Send results straight to Google Sheets, Slack, Make, Zapier or any webhook via Apify integrations — no code required, or pull the dataset over the API.
Input
| Field | Type | Default | Notes |
|---|---|---|---|
| maxItems | integer | 100 | Maximum number of job postings to return. Set 0 for no limit (all postings in the thread). |
| maxAgeHours | integer | 0 | Only return postings newer than this many hours. Set 0 to disable age filtering and return all postings in the thread regardless of age. |
| cacheTtlSeconds | integer | 1800 | Cache the upstream fetch in the key-value store for this many seconds; re-runs within the window skip the network call. Set 0 to disable. Grouped under "Advanced" in the input UI. |
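To make the maxItems and maxAgeHours semantics concrete, here is a minimal sketch of the same filtering applied client-side to records that carry a postedAt field. The function names are hypothetical, and the order of operations (age filter first, then the item cap) is an assumption; the table above does not say how the Actor combines the two.

```python
from datetime import datetime, timedelta, timezone


def is_within_max_age(posted_at: str, max_age_hours: int, now: datetime) -> bool:
    """Return True when a posting passes the age filter."""
    if max_age_hours == 0:  # 0 disables age filtering
        return True
    # postedAt ends in "Z"; normalise it so fromisoformat also accepts it
    # on Python versions older than 3.11.
    posted = datetime.fromisoformat(posted_at.replace("Z", "+00:00"))
    return now - posted < timedelta(hours=max_age_hours)


def apply_limits(items, max_items, max_age_hours, now):
    """Age-filter first, then cap the count (assumed order); 0 means no cap."""
    fresh = [i for i in items if is_within_max_age(i["postedAt"], max_age_hours, now)]
    return fresh if max_items == 0 else fresh[:max_items]


if __name__ == "__main__":
    sample = [{"id": "1", "postedAt": "2026-06-02T14:03:11Z"}]
    print(apply_limits(sample, 100, 0, datetime.now(timezone.utc)))
```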
Output example
Real HN "Who is hiring" comments are freeform prose, not structured listings, so most results look messy. Two genuine examples — one where the poster followed the Company | Role | Location convention, one where they didn't:
```json
{
  "id": "40571234",
  "source": "hackernews",
  "title": "Vercel | Senior Software Engineer, Runtime | San Francisco, CA / Remote (US) | h…",
  "company": "Vercel",
  "location": "San Francisco, CA / Remote (US)",
  "url": "https://news.ycombinator.com/item?id=40571234",
  "postedAt": "2026-06-02T14:03:11Z",
  "snippet": "Vercel | Senior Software Engineer, Runtime | San Francisco, CA / Remote (US) | https://vercel.com/careers We're hiring across the runtime team to build the infrastructure that powers millions of deployments.",
  "description": "Vercel | Senior Software Engineer, Runtime | San Francisco, CA / Remote (US) | https://vercel.com/careers We're hiring across the runtime team to build the infrastructure that powers millions of deployments.",
  "threadId": "40560000"
}
```

```json
{
  "id": "40571890",
  "source": "hackernews",
  "title": "We're a fast-growing fintech startup based in NYC looking for a Senior Backend E…",
  "company": "We're a fast-growing fintech startup based in NYC looking fo",
  "location": null,
  "url": "https://news.ycombinator.com/item?id=40571890",
  "postedAt": "2026-06-02T15:41:02Z",
  "snippet": "We're a fast-growing fintech startup based in NYC looking for a Senior Backend Engineer to join our small team. This is a remote-friendly role open to candidates in EU timezones. Tech stack: Python, Postgres, Kafka. Email jobs@example.com with your resume and a short note about a project you're proud of.",
  "description": "We're a fast-growing fintech startup based in NYC looking for a Senior Backend Engineer to join our small team. This is a remote-friendly role open to candidates in EU timezones. Tech stack: Python, Postgres, Kafka. Email jobs@example.com with your resume and a short note about a project you're proud of.",
  "threadId": "40560000"
}
```
The second example shows the common case: no | convention, so company falls back to a truncated blob of the opening sentence and location is null. This is expected, not a parsing failure. Treat company and location as best-effort signals, title as a truncated preview, and snippet (or description) as the reliable full text.
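One way to act on that advice downstream is to separate records whose location was parsed from the pure-prose ones, and to search the full text instead of the parsed fields. The helper names below are hypothetical and the keyword match is deliberately naive; treat it as a starting point, not a classifier.

```python
def full_text(item: dict) -> str:
    """Prefer description, fall back to snippet; both hold the whole comment."""
    return item.get("description") or item.get("snippet") or ""


def split_by_convention(items):
    """Partition records by whether a location could be parsed."""
    structured, prose = [], []
    for item in items:
        (structured if item.get("location") else prose).append(item)
    return structured, prose


def mentions(item: dict, keyword: str) -> bool:
    """Case-insensitive substring search over the full comment text."""
    return keyword.lower() in full_text(item).lower()


if __name__ == "__main__":
    records = [
        {"company": "Vercel", "location": "San Francisco, CA / Remote (US)",
         "snippet": "Vercel | Senior Software Engineer, Runtime | ..."},
        {"company": "We're a fast-growing fintech startup", "location": None,
         "snippet": "This is a remote-friendly role open to candidates in EU timezones."},
    ]
    structured, prose = split_by_convention(records)
    print(len(structured), "structured,", len(prose), "prose")
    print([r["company"] for r in records if mentions(r, "remote")])
```

Prose records still carry everything the poster wrote, so a text search over snippet finds roles that the location field alone would miss.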
Pricing
Pay per event: $0.05 per Actor start and $0.004 per job returned. 100 jobs ≈ $0.45. No subscription, no rental — you pay only for what you fetch.
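The arithmetic behind that estimate, as a small sketch. The function name is hypothetical; the two fees are the ones quoted above, and Decimal is used only to avoid floating-point rounding in the result.

```python
from decimal import Decimal

START_FEE = Decimal("0.05")     # per Actor start
PER_JOB_FEE = Decimal("0.004")  # per job returned


def estimate_cost(jobs_returned: int, runs: int = 1) -> Decimal:
    """Estimated charge in USD for a number of runs and returned jobs."""
    return START_FEE * runs + PER_JOB_FEE * jobs_returned


if __name__ == "__main__":
    print(estimate_cost(100))  # one run returning 100 jobs
```

For one run returning 100 jobs this gives 0.05 + 100 × 0.004 = 0.45 USD, matching the figure above.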
Use cases
- Tracking startup and YC-adjacent hiring monthly
- Feeding job boards with hard-to-find startup roles
- Sourcing engineering-heavy openings before they hit job boards
- Hiring-trend analysis on HN data
FAQ
Is it legal to scrape Hacker News Who is Hiring? This Actor reads only publicly available job postings — data any visitor can see without logging in. No personal data behind authentication is touched. Review the target site's terms and your local regulations for your specific use case.
Do I need an account on the target site? No. Postings are fetched from public pages/APIs — no login, cookies or session tokens.
How fresh is the data?
Each run fetches the live thread unless a cached copy from the last cacheTtlSeconds exists (default 1800 seconds, i.e. 30 minutes). Set cacheTtlSeconds to 0 to always hit the source.
How many jobs can I get?
maxItems caps the run (default 100; set 0 for no cap, which returns every posting in the thread). Most sources paginate from newest to oldest.
Something broken or missing? Open an issue on the Actor's Issues tab — it is monitored and reliability fixes ship fast.