Hacker News Who Is Hiring Scraper — HN Jobs

Turn the monthly “Ask HN: Who is hiring?” thread into structured job data. Each top-level comment becomes a flat JSON record with a best-effort company and location, a timestamp, a direct link and the full comment text. Track startup hiring straight from Hacker News without reading 800 comments.

Developer: Nomad.Dev (maintained by Community)

Parse the famous monthly “Ask HN: Who is hiring?” thread into one structured job record per top-level comment.

What Hacker News Who is Hiring data does this scraper extract?

Each result is one flat JSON record per top-level comment in the thread:

Field        Meaning
id           HN item id of the comment
source       Always "hackernews"
title        First 80 characters of the raw comment text (see note below; these are not clean job titles)
company      Text before the first | in the comment, up to 60 chars (best-effort; see note below)
location     Best-effort parse of the Company | Role | Location | ... convention, or null when it can't be parsed
url          Direct link to the HN comment
postedAt     Comment timestamp, ISO 8601 UTC
snippet      Full plain-text comment (HTML stripped)
description  Same text as snippet. HN has no separate "card" vs "posting" body (the comment is the whole listing), so this field is an alias for downstream tools that score on a description field
threadId     HN item id of the parent "Who is hiring" thread

title is raw, not curated. HN "Who is hiring" posts are freeform comments, not structured job listings. title is just the first 80 characters of the comment text, cut off mid-sentence and ending in "…" when the comment is longer. Don't expect a clean job title like you'd get from a real ATS; expect the opening of whatever the poster typed.

company and location depend on an unenforced convention. Many (not all) commenters format their first line as Company | Role | Location | Remote | ..., in roughly that order, but the order and field count vary by poster, and plenty of posts are pure prose with no | at all. company takes everything before the first |, capped at 60 characters; when there's no |, that means the first 60 characters of the comment, which can look messy. location requires at least 3 pipe-separated segments and treats the 3rd one as the candidate, keeping it only if it's short, isn't a URL, and isn't an obvious comma-packed tag list. When a post doesn't follow the convention, or the 3rd segment doesn't look location-shaped, location is null. This is expected for a large share of postings, not a bug.
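The rules above can be sketched in a few lines of Python. This is an illustration of the described behaviour, not the Actor's actual source: the helper names are hypothetical, and the thresholds for "short" (60 characters) and "comma-packed" (more than 2 commas) are assumptions, since the text above does not give exact values.

```python
from typing import Optional

TITLE_LEN = 80           # from the field table
COMPANY_LEN = 60         # from the field table
MAX_LOCATION_LEN = 60    # assumption: the text only says "short"
MAX_LOCATION_COMMAS = 2  # assumption: more commas is treated as a tag list


def make_title(text: str) -> str:
    """First 80 characters of the comment, plus an ellipsis if it was cut."""
    return text if len(text) <= TITLE_LEN else text[:TITLE_LEN] + "…"


def parse_company(text: str) -> str:
    """Text before the first pipe (or the whole text), capped at 60 chars."""
    return text.split("|", 1)[0].strip()[:COMPANY_LEN]


def parse_location(text: str) -> Optional[str]:
    """Third pipe-separated segment of the first line, if location-shaped."""
    first_line = text.splitlines()[0] if text else ""
    segments = [s.strip() for s in first_line.split("|")]
    if len(segments) < 3:
        return None  # convention not followed
    candidate = segments[2]
    if not candidate or len(candidate) > MAX_LOCATION_LEN:
        return None  # empty or too long to be a location
    if "://" in candidate or candidate.lower().startswith("www."):
        return None  # looks like a URL
    if candidate.count(",") > MAX_LOCATION_COMMAS:
        return None  # looks like a comma-packed tag list
    return candidate
```

Run against the opening of a pipe-formatted comment, `parse_location` returns the third segment; run against pure prose, it returns `None` and `parse_company` falls back to the first 60 characters.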

How to scrape Hacker News Who is Hiring with this Actor

  1. Click Try for free / Run — no login to the target site, no cookies, no proxies to configure.
  2. Adjust the input (maxItems, maxAgeHours, cacheTtlSeconds) or keep the defaults.
  3. Run it and export the dataset as JSON, CSV or Excel, or read it over the API.

Run it from your own code:

from apify_client import ApifyClient

client = ApifyClient("<YOUR_APIFY_TOKEN>")
# Start the Actor and wait for the run to finish.
run = client.actor("nomad-agent/hackernews-scraper").call(run_input={"maxItems": 50})
# Stream the records from the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item["title"], "-", item["company"], item["url"])

Or a single HTTP call that runs the Actor and returns items in one response:

curl -X POST \
"https://api.apify.com/v2/acts/nomad-agent~hackernews-scraper/run-sync-get-dataset-items?token=<YOUR_APIFY_TOKEN>" \
-H "Content-Type: application/json" \
-d '{"maxItems": 50}'

Integrations

Send results straight to Google Sheets, Slack, Make, Zapier or any webhook via Apify integrations — no code required, or pull the dataset over the API.

Input

Field            Type     Default  Notes
maxItems         integer  100      Maximum number of job postings to return. Set 0 for no limit (all postings in the thread).
maxAgeHours      integer  0        Only return postings newer than this many hours. Set 0 to disable age filtering and return all postings in the thread regardless of age.
cacheTtlSeconds  integer  1800     Cache the upstream fetch in the key-value store for this many seconds; re-runs within the window skip the network call. Set 0 to disable. Grouped under "Advanced" in the input UI.
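To make the maxAgeHours semantics concrete, here is a minimal client-side sketch of the same kind of filter applied to the postedAt field. The function names are hypothetical, and treating a posting that is exactly maxAgeHours old as still fresh is an assumption; the Actor's own boundary handling is not documented here.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_posted_at(value: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as "2026-06-02T14:03:11Z"."""
    # Older Python versions reject a trailing "Z" in fromisoformat().
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def is_fresh(posted_at: str, max_age_hours: int,
             now: Optional[datetime] = None) -> bool:
    """Return True if a posting passes a maxAgeHours-style filter."""
    if max_age_hours <= 0:
        return True  # 0 disables age filtering
    now = now or datetime.now(timezone.utc)
    age = now - parse_posted_at(posted_at)
    return age <= timedelta(hours=max_age_hours)
```

Passing `now` explicitly keeps the check deterministic, which is handy when re-filtering a dataset that was exported earlier.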

Output example

Real HN "Who is hiring" comments are freeform prose, not structured listings, so most results look messy. Two genuine examples — one where the poster followed the Company | Role | Location convention, one where they didn't:

{
  "id": "40571234",
  "source": "hackernews",
  "title": "Vercel | Senior Software Engineer, Runtime | San Francisco, CA / Remote (US) | h…",
  "company": "Vercel",
  "location": "San Francisco, CA / Remote (US)",
  "url": "https://news.ycombinator.com/item?id=40571234",
  "postedAt": "2026-06-02T14:03:11Z",
  "snippet": "Vercel | Senior Software Engineer, Runtime | San Francisco, CA / Remote (US) | https://vercel.com/careers We're hiring across the runtime team to build the infrastructure that powers millions of deployments.",
  "description": "Vercel | Senior Software Engineer, Runtime | San Francisco, CA / Remote (US) | https://vercel.com/careers We're hiring across the runtime team to build the infrastructure that powers millions of deployments.",
  "threadId": "40560000"
}
{
  "id": "40571890",
  "source": "hackernews",
  "title": "We're a fast-growing fintech startup based in NYC looking for a Senior Backend E…",
  "company": "We're a fast-growing fintech startup based in NYC looking fo",
  "location": null,
  "url": "https://news.ycombinator.com/item?id=40571890",
  "postedAt": "2026-06-02T15:41:02Z",
  "snippet": "We're a fast-growing fintech startup based in NYC looking for a Senior Backend Engineer to join our small team. This is a remote-friendly role open to candidates in EU timezones. Tech stack: Python, Postgres, Kafka. Email jobs@example.com with your resume and a short note about a project you're proud of.",
  "description": "We're a fast-growing fintech startup based in NYC looking for a Senior Backend Engineer to join our small team. This is a remote-friendly role open to candidates in EU timezones. Tech stack: Python, Postgres, Kafka. Email jobs@example.com with your resume and a short note about a project you're proud of.",
  "threadId": "40560000"
}

The second example shows the common case: with no | convention, company falls back to a truncated blob of the opening sentence and location is null. This is expected, not a parsing failure. Treat company and location as best-effort signals, title as a truncated preview, and snippet (or description) as the reliable full text.
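One way to act on that advice downstream is to split records by whether location was parsed and to run keyword searches against snippet, not the parsed fields. The sketch below is illustrative only: the helper names are hypothetical, and the sample records are shortened versions of the two examples above.

```python
from typing import Any, Dict, Iterable, List, Tuple

Record = Dict[str, Any]


def split_by_location(items: Iterable[Record]) -> Tuple[List[Record], List[Record]]:
    """Separate records with a parsed location from those where it is null."""
    parsed: List[Record] = []
    unparsed: List[Record] = []
    for item in items:
        (parsed if item.get("location") else unparsed).append(item)
    return parsed, unparsed


def mentions(item: Record, keyword: str) -> bool:
    """Case-insensitive keyword search over the full comment text."""
    return keyword.lower() in (item.get("snippet") or "").lower()


# Shortened sample records, keeping only the fields used here.
records = [
    {"company": "Vercel", "location": "San Francisco, CA / Remote (US)",
     "snippet": "Vercel | Senior Software Engineer, Runtime | San Francisco, CA / Remote (US)"},
    {"company": "We're a fast-growing fintech startup", "location": None,
     "snippet": "We're a fast-growing fintech startup. This is a remote-friendly role."},
]

parsed, unparsed = split_by_location(records)
remote_hits = [r for r in records if mentions(r, "remote")]
```

Searching snippet catches the prose-only posting that mentions remote work even though its location is null.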

Pricing

Pay per event: $0.05 per Actor start and $0.004 per job returned. 100 jobs ≈ $0.45. No subscription, no rental — you pay only for what you fetch.
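The arithmetic behind that estimate can be sketched as a small helper. The function name is hypothetical, and it assumes each run is billed exactly one start event plus one event per returned job, as the prices above suggest.

```python
from decimal import Decimal

START_FEE = Decimal("0.05")     # per Actor start, from the pricing above
PER_JOB_FEE = Decimal("0.004")  # per job returned, from the pricing above


def estimate_cost(jobs_returned: int, runs: int = 1) -> Decimal:
    """Estimated USD cost: one start fee per run plus a fee per job."""
    return START_FEE * runs + PER_JOB_FEE * jobs_returned
```

For 100 jobs in a single run this gives 0.05 + 100 × 0.004 = 0.45 USD, matching the figure above. Decimal is used so the cents do not drift through floating-point rounding.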

Use cases

  • Tracking startup and YC-adjacent hiring monthly
  • Feeding job boards with hard-to-find startup roles
  • Sourcing engineering-heavy openings before they hit job boards
  • Hiring-trend analysis on HN data

FAQ

Is it legal to scrape Hacker News Who is Hiring? This Actor reads only publicly available job postings — data any visitor can see without logging in. No personal data behind authentication is touched. Review the target site's terms and your local regulations for your specific use case.

Do I need an account on the target site? No. Postings are fetched from public pages/APIs — no login, cookies or session tokens.

How fresh is the data? A run fetches live listings unless a cached copy younger than cacheTtlSeconds exists (default 1800 seconds, i.e. 30 minutes), in which case the cached fetch is reused. Set cacheTtlSeconds to 0 to always hit the source live.

How many jobs can I get? maxItems caps the run (default 100). Set it to 0 for no cap, which returns every posting in the thread.

Something broken or missing? Open an issue on the Actor's Issues tab — it is monitored and reliability fixes ship fast.