Hacker News Job Scraper: Who is Hiring Posts

Pricing

from $20.00 / 1,000 jobs

Scrape Hacker News Who is Hiring job posts into structured JSON. Extract company, role, salary, remote status, tech stack, emails, and application URLs. Drop-in for Google Sheets, Airtable, and Zapier. Skip manual copy-paste. $0.02 per job.

Developer

GetAScraper

Maintained by Community

HN Who is Hiring Scraper

Extract structured job postings from Hacker News monthly "Who is Hiring?" threads. Parse company, role, location, remote status, salary, technologies, emails, and application URLs from the largest organic tech job board on the internet.

Built on the official Hacker News Firebase API and the Algolia HN Search API, so job data comes from supported endpoints rather than HTML scraping.
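The Actor's internal requests are not documented here, but the two public APIs it builds on can be addressed roughly as follows. This is a minimal sketch: the helper names are hypothetical, and it assumes the monthly threads are posted by the `whoishiring` account and titled "Ask HN: Who is hiring? (Month Year)". It only builds URLs and makes no network calls.

```python
from urllib.parse import urlencode

# Public endpoints of the HN Firebase API and the Algolia HN Search API.
HN_ITEM_URL = "https://hacker-news.firebaseio.com/v0/item/{item_id}.json"
ALGOLIA_SEARCH_URL = "https://hn.algolia.com/api/v1/search_by_date"


def item_url(item_id: int) -> str:
    """URL of a single HN item (story or comment) in the Firebase API."""
    return HN_ITEM_URL.format(item_id=item_id)


def discovery_url(hits: int = 1) -> str:
    """Algolia query listing the newest stories by the whoishiring account.

    Assumption: hiring threads are posted by that account. The account also
    posts other monthly threads, so titles still need checking afterwards.
    """
    params = {
        "query": "Ask HN: Who is hiring?",
        "tags": "story,author_whoishiring",
        "hitsPerPage": hits,
    }
    return f"{ALGOLIA_SEARCH_URL}?{urlencode(params)}"


def is_hiring_thread(title: str) -> bool:
    """Keep only 'Who is hiring?' threads (assumed title convention)."""
    return title.lower().startswith("ask hn: who is hiring?")
```

Fetching `item_url(thread_id)` returns a JSON object whose `kids` field lists the IDs of the top-level comments, which is where the job posts live.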

Why use it?

  • Structured Data: Extracts company, role, location, remote status, salary, technologies, emails, and URLs from unstructured HN comments
  • Auto-Discovery: Automatically finds the latest "Who is Hiring?" posts without needing specific URLs
  • Historical Data: Scrape multiple months back for trend analysis
  • Tech-Focused: Identifies technologies mentioned in each job post for filtering and analysis
  • Contact Extraction: Automatically finds email addresses and application URLs
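To illustrate what extracting structure from an unstructured comment involves, here is a rough sketch. It is not the Actor's parser: the regular expressions, the function name, and the assumption that the first line follows the common "Company | Role | ..." convention are all illustrative.

```python
import re

# Deliberately simple patterns; real-world posts need more robust handling.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
URL_RE = re.compile(r"https?://[^\s<>\"']+")
REMOTE_RE = re.compile(r"\b(remote|onsite|on-site|hybrid)\b", re.IGNORECASE)


def parse_job_comment(text: str) -> dict:
    """Pull a few fields out of a plain-text job comment."""
    stripped = text.strip()
    first_line = stripped.splitlines()[0] if stripped else ""
    # Assumed convention: pipe-separated header, company first, role second.
    parts = [part.strip() for part in first_line.split("|")]
    remote = REMOTE_RE.search(text)
    return {
        "company": parts[0] or None,
        "role": parts[1] if len(parts) > 1 else None,
        "remoteStatus": remote.group(1).upper() if remote else None,
        "emails": EMAIL_RE.findall(text),
        # Trim punctuation that often trails a URL at the end of a sentence.
        "urls": [url.rstrip(".,;)") for url in URL_RE.findall(text)],
    }


# Made-up sample comment, not taken from a real thread.
SAMPLE = (
    "Acme Corp | Backend Engineer | REMOTE (US) | Full-Time\n"
    "Stack: Python, Postgres. Apply at https://example.com/jobs "
    "or jobs@example.com."
)
job = parse_job_comment(SAMPLE)
```

Comments that do not follow the pipe convention yield a `company` that is really the first sentence, which is one reason any parser of this kind is imperfect.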

How to use

  1. Open the Actor in Apify Console.
  2. Leave startUrls empty to auto-discover the latest hiring post, or provide specific HN post URLs.
  3. Set monthsBack to scrape multiple months (max 12).
  4. Set maxJobsPerMonth to limit results (0 = unlimited).
  5. Optionally enable includeReplies to capture nested discussion threads.
  6. Run the Actor and consume the output via the Apify API, or export it as CSV or JSON.

Input fields

  • startUrls (array, optional): Specific HN "Who is Hiring?" post URLs. If empty, auto-discovers the latest posts.
  • monthsBack (integer): How many months of hiring posts to scrape when auto-discovering. Default: 1, Max: 12.
  • maxJobsPerMonth (integer): Maximum job postings to extract per month. Default: 0 (unlimited).
  • includeReplies (boolean): Whether to include nested replies/discussion threads. Default: false.
  • proxyConfiguration (object): Proxy configuration for API requests. Optional, because the HN APIs are generally open.
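The input fields above can be assembled and sanity-checked before a run. The sketch below is hedged in several places: `build_run_input` and `run_actor` are hypothetical helpers, the `{"url": ...}` shape for startUrls follows a common Apify convention and should be checked against the Actor's input schema, and the lower bound of 1 for monthsBack is an assumption. `run_actor` uses the third-party `apify-client` package and needs your own API token and the Actor's ID.

```python
from typing import Any, Dict, List, Optional

MAX_MONTHS_BACK = 12  # documented maximum for monthsBack


def build_run_input(
    start_urls: Optional[List[str]] = None,
    months_back: int = 1,
    max_jobs_per_month: int = 0,
    include_replies: bool = False,
) -> Dict[str, Any]:
    """Build an input object using the documented field names and defaults."""
    # The lower bound of 1 is an assumption; only the maximum is documented.
    if not 1 <= months_back <= MAX_MONTHS_BACK:
        raise ValueError("monthsBack must be between 1 and 12")
    if max_jobs_per_month < 0:
        raise ValueError("maxJobsPerMonth must be 0 (unlimited) or positive")
    return {
        # Assumption: URLs are passed as {"url": ...} objects.
        "startUrls": [{"url": url} for url in (start_urls or [])],
        "monthsBack": months_back,
        "maxJobsPerMonth": max_jobs_per_month,
        "includeReplies": include_replies,
    }


def run_actor(token: str, actor_id: str, run_input: Dict[str, Any]) -> List[dict]:
    """Run the Actor and return its dataset items (makes network calls)."""
    from apify_client import ApifyClient  # third-party: pip install apify-client

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

For example, `build_run_input(months_back=3, max_jobs_per_month=200)` describes a three-month auto-discovery run capped at 200 jobs per month.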

Output schema

Each dataset item represents one job posting:

{
  "commentId": 22666455,
  "hnUser": "kfx",
  "postedAt": 1584984975,
  "postedAtIso": "2020-03-23T17:36:15.000Z",
  "rawText": "PBS | Various Engineers | Full-Time | ONSITE...",
  "cleanText": "PBS | Various Engineers | Full-Time | ONSITE...",
  "company": "PBS",
  "role": "Various Engineers",
  "location": "Alexandria, VA",
  "remoteStatus": "ONSITE (Flexible WFH)",
  "employmentType": "Full-Time",
  "salary": null,
  "technologies": ["express", "iOS"],
  "emails": ["digitaljobs@pbs.org"],
  "urls": ["https://tinyurl.com/v7c8nb2"],
  "isTopLevel": true,
  "parentId": 22665398,
  "replyCount": 0,
  "hnUrl": "https://news.ycombinator.com/item?id=22666455"
}
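Several fields in an item are derivable from each other, which allows a cheap consistency check when consuming the dataset. The sketch below assumes, based on the example item, that postedAt is a Unix timestamp in seconds and that hnUrl is built from commentId; `check_item` is a hypothetical helper, not part of the Actor.

```python
from datetime import datetime, timezone
from typing import List


def to_iso_z(epoch_seconds: int) -> str:
    """Format a Unix timestamp as a UTC ISO string with milliseconds and 'Z'."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.isoformat(timespec="milliseconds").replace("+00:00", "Z")


def check_item(item: dict) -> List[str]:
    """Return a list of inconsistencies found in one dataset item."""
    problems = []
    if to_iso_z(item["postedAt"]) != item["postedAtIso"]:
        problems.append("postedAt and postedAtIso disagree")
    expected_url = f"https://news.ycombinator.com/item?id={item['commentId']}"
    if item["hnUrl"] != expected_url:
        problems.append("hnUrl does not match commentId")
    # These three fields are arrays in the example item.
    for key in ("technologies", "emails", "urls"):
        if not isinstance(item.get(key), list):
            problems.append(f"{key} should be a list")
    return problems
```

Applied to the example item, `to_iso_z(1584984975)` gives "2020-03-23T17:36:15.000Z", matching its postedAtIso value.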

Data table

Field | Type | Description
commentId | number | Unique HN comment ID
company | string | Company name (extracted from post)
role | string | Job role/title
location | string | Job location
remoteStatus | string | REMOTE, ONSITE, HYBRID, etc.
employmentType | string | Full-time, Contract, Intern, etc.
salary | string | Salary range if found in text
technologies | array | Technologies mentioned in the post
emails | array | Email addresses found
urls | array | Application/company URLs found
hnUser | string | HN username who posted the job
postedAtIso | string | ISO timestamp of the post
replyCount | number | Number of replies to this job post
hnUrl | string | Direct link to the comment on HN
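For spreadsheet tools, the array fields need flattening into single cells. The following is one possible approach using only the standard library; the column order mirrors the table above, while the "; " separator and the `to_csv` helper are choices made for this sketch.

```python
import csv
import io
from typing import Iterable

COLUMNS = [
    "commentId", "company", "role", "location", "remoteStatus",
    "employmentType", "salary", "technologies", "emails", "urls",
    "hnUser", "postedAtIso", "replyCount", "hnUrl",
]


def to_csv(items: Iterable[dict]) -> str:
    """Serialize dataset items to CSV text, one row per job posting."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for item in items:
        row = {}
        for column in COLUMNS:
            value = item.get(column)
            if isinstance(value, list):
                value = "; ".join(value)  # flatten arrays into one cell
            row[column] = "" if value is None else value  # null -> empty cell
        writer.writerow(row)
    return buffer.getvalue()
```

Fields outside the table (such as rawText) are left out here to keep rows compact; add them to `COLUMNS` if needed.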

Pricing / cost estimation

Priced at $0.02 per job posting (Pay-per-Result).

Target jobs | Estimated cost
100 | $2.00
500 | $10.00
1,000 | $20.00

The HN APIs are open and free to access, so proxies, and the costs that come with them, are typically not needed.
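The estimates in the pricing table are the job count multiplied by the per-result price. A small sketch of that arithmetic, using `Decimal` to avoid floating-point rounding (the function name is illustrative, and platform fees outside the per-result price are not modeled):

```python
from decimal import Decimal

PRICE_PER_JOB = Decimal("0.02")  # USD per job posting, as listed above


def estimate_cost(job_count: int) -> Decimal:
    """Estimated cost in USD for a given number of extracted job postings."""
    if job_count < 0:
        raise ValueError("job_count must not be negative")
    return (PRICE_PER_JOB * job_count).quantize(Decimal("0.01"))
```

Because pricing is per result, setting maxJobsPerMonth together with monthsBack puts an upper bound on a run: at most `estimate_cost(maxJobsPerMonth * monthsBack)`.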

Tips / Advanced

  • Auto-Discovery: Leave startUrls empty and set monthsBack to 3-6 to get a rolling window of hiring posts
  • Focus on Remote: Filter output by remoteStatus field containing "REMOTE"
  • Tech Filtering: Use the technologies array to find jobs matching specific skills
  • Speed: The Actor waits 50 ms between API calls to be respectful to HN. Expect about 20 jobs/minute
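The remote and technology filters described above can be applied to downloaded items with a few lines of code. This sketch assumes case-insensitive matching is wanted, which the tips do not specify; the helper names are illustrative.

```python
from typing import Iterable, List


def is_remote(item: dict) -> bool:
    """True if the remoteStatus field contains 'REMOTE' (case-insensitive)."""
    return "REMOTE" in (item.get("remoteStatus") or "").upper()


def uses_any(item: dict, wanted: Iterable[str]) -> bool:
    """True if any wanted technology appears in the technologies array."""
    found = {tech.lower() for tech in item.get("technologies") or []}
    return any(tech.lower() in found for tech in wanted)


def filter_jobs(
    items: Iterable[dict],
    wanted_tech: Iterable[str] = (),
    remote_only: bool = False,
) -> List[dict]:
    """Keep items matching the remote and technology criteria."""
    wanted = list(wanted_tech)
    result = []
    for item in items:
        if remote_only and not is_remote(item):
            continue
        if wanted and not uses_any(item, wanted):
            continue
        result.append(item)
    return result
```

Note that a status such as "ONSITE (Flexible WFH)" does not contain "REMOTE", so work-from-home wording is not matched by this filter.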

FAQ

Is scraping Hacker News legal? HN provides official APIs (Firebase and Algolia) for accessing this data. This Actor uses those APIs, not HTML scraping.

Why did I get fewer results than expected? Some comments in hiring threads are discussion rather than job posts. The parser tries to filter these out, but the filtering is imperfect. Set includeReplies to true to also capture nested replies.

Can I scrape historical data? Yes. Set monthsBack up to 12 to scrape past hiring threads. Note that older posts may have fewer active listings.

Support

For bug reports or feature requests, open a ticket in the Issues tab.