Hacker News Job Scraper: Who is Hiring Posts
Pricing
from $20.00 / 1,000 jobs
Scrape Hacker News Who is Hiring job posts into structured JSON. Extract company, role, salary, remote status, tech stack, emails, and application URLs. Drop-in for Google Sheets, Airtable, and Zapier. Skip manual copy-paste. $0.02 per job.
Developer
GetAScraper
Maintained by Community
HN Who is Hiring Scraper
Extract structured job postings from Hacker News monthly "Who is Hiring?" threads. Parse company, role, location, remote status, salary, technologies, emails, and application URLs from the largest organic tech job board on the internet.
Built on the official Hacker News Firebase API and Algolia Search API for reliable, rate-limit-free access to job data.
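As a rough illustration of the two public APIs named above, the sketch below builds the request URLs a client could use. The endpoint shapes are the publicly documented ones for the HN Firebase API and the HN Algolia Search API, and the monthly threads are posted by the `whoishiring` account. The exact queries this Actor issues are not documented here, so the function names and query parameters are assumptions.

```python
from urllib.parse import urlencode

FIREBASE_BASE = "https://hacker-news.firebaseio.com/v0"
ALGOLIA_BASE = "https://hn.algolia.com/api/v1"


def item_url(item_id: int) -> str:
    """Firebase endpoint for a single story or comment."""
    return f"{FIREBASE_BASE}/item/{item_id}.json"


def hiring_threads_url(limit: int = 1) -> str:
    """Algolia query for the newest stories by the `whoishiring` account.

    The query text and tags are an assumption about how auto-discovery
    could work, not the Actor's confirmed implementation.
    """
    params = {
        "query": "Ask HN: Who is hiring?",
        "tags": "story,author_whoishiring",
        "hitsPerPage": limit,
    }
    return f"{ALGOLIA_BASE}/search_by_date?{urlencode(params)}"
```

Fetching `item_url(...)` for a thread returns a JSON object whose `kids` array lists the IDs of the top-level comments, each of which can be fetched the same way.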
Why use it?
- Structured Data: Extracts company, role, location, remote status, salary, technologies, emails, and URLs from unstructured HN comments
- Auto-Discovery: Automatically finds the latest "Who is Hiring?" posts without needing specific URLs
- Historical Data: Scrape multiple months back for trend analysis
- Tech-Focused: Identifies technologies mentioned in each job post for filtering and analysis
- Contact Extraction: Automatically finds email addresses and application URLs
How to use
- Open the Actor in Apify Console.
- Leave `startUrls` empty to auto-discover the latest hiring post, or provide specific HN post URLs.
- Set `monthsBack` to scrape multiple months (max 12).
- Set `maxJobsPerMonth` to limit results (0 = unlimited).
- Optionally enable `includeReplies` to capture nested discussion threads.
- Run the Actor and consume the output via Apify API, CSV, or JSON.
Input fields
- `startUrls` (array, optional): Specific HN "Who is Hiring?" post URLs. If empty, auto-discovers the latest posts.
- `monthsBack` (integer): How many months of hiring posts to scrape when auto-discovering. Default: 1, Max: 12.
- `maxJobsPerMonth` (integer): Maximum job postings to extract per month. Default: 0 (unlimited).
- `includeReplies` (boolean): Whether to include nested replies/discussion threads. Default: false.
- `proxyConfiguration` (object): Proxy configuration for API requests. Optional, since HN APIs are generally open.
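The input fields can be assembled and sanity-checked before a run. The following is a minimal sketch: `build_input` is a hypothetical helper, the lower bound of 1 for `monthsBack` is an assumption (only the maximum of 12 is stated), and `startUrls` is passed as plain strings, although the Actor may expect `{"url": ...}` objects instead.

```python
from typing import Iterable, Optional


def build_input(
    start_urls: Optional[Iterable[str]] = None,
    months_back: int = 1,
    max_jobs_per_month: int = 0,
    include_replies: bool = False,
) -> dict:
    """Build an input object using the field names documented above."""
    # Assumed lower bound of 1; the documented maximum is 12.
    if not 1 <= months_back <= 12:
        raise ValueError("monthsBack must be between 1 and 12")
    if max_jobs_per_month < 0:
        raise ValueError("maxJobsPerMonth must be 0 (unlimited) or positive")
    return {
        "startUrls": list(start_urls or []),  # empty list triggers auto-discovery
        "monthsBack": months_back,
        "maxJobsPerMonth": max_jobs_per_month,
        "includeReplies": include_replies,
    }


# Rolling three-month window, capped at 200 postings per month.
run_input = build_input(months_back=3, max_jobs_per_month=200)
```

The resulting dictionary can be pasted into the Console's JSON input editor, or passed as `run_input` when starting the Actor with the Apify Python client.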
Output schema
Each dataset item represents one job posting:
{
  "commentId": 22666455,
  "hnUser": "kfx",
  "postedAt": 1584984975,
  "postedAtIso": "2020-03-23T17:36:15.000Z",
  "rawText": "PBS | Various Engineers | Full-Time | ONSITE...",
  "cleanText": "PBS | Various Engineers | Full-Time | ONSITE...",
  "company": "PBS",
  "role": "Various Engineers",
  "location": "Alexandria, VA",
  "remoteStatus": "ONSITE (Flexible WFH)",
  "employmentType": "Full-Time",
  "salary": null,
  "technologies": ["express", "iOS"],
  "emails": ["digitaljobs@pbs.org"],
  "urls": ["https://tinyurl.com/v7c8nb2"],
  "isTopLevel": true,
  "parentId": 22665398,
  "replyCount": 0,
  "hnUrl": "https://news.ycombinator.com/item?id=22666455"
}
Data table
| Field | Type | Description |
|---|---|---|
| commentId | number | Unique HN comment ID |
| company | string | Company name (extracted from post) |
| role | string | Job role/title |
| location | string | Job location |
| remoteStatus | string | REMOTE, ONSITE, HYBRID, etc. |
| employmentType | string | Full-time, Contract, Intern, etc. |
| salary | string | Salary range if found in text |
| technologies | array | Technologies mentioned in the post |
| emails | array | Email addresses found |
| urls | array | Application/company URLs found |
| hnUser | string | HN username who posted the job |
| postedAtIso | string | ISO timestamp of the post |
| replyCount | number | Number of replies to this job post |
| hnUrl | string | Direct link to the comment on HN |
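For downstream code, it can help to load each dataset item into a typed record. The sketch below is one way to do that, under the assumption that any extracted field may be missing or `null` (the sample item shows `salary: null`; nullability of the other fields is not documented). `JobPosting` and `parse_item` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class JobPosting:
    comment_id: int
    hn_url: str
    company: Optional[str] = None
    role: Optional[str] = None
    location: Optional[str] = None
    remote_status: Optional[str] = None
    salary: Optional[str] = None
    technologies: List[str] = field(default_factory=list)
    emails: List[str] = field(default_factory=list)


def parse_item(item: dict) -> JobPosting:
    """Convert one dataset item (a dict) into a JobPosting."""
    return JobPosting(
        comment_id=int(item["commentId"]),
        hn_url=item["hnUrl"],
        company=item.get("company"),
        role=item.get("role"),
        location=item.get("location"),
        remote_status=item.get("remoteStatus"),
        salary=item.get("salary"),
        # Treat null arrays the same as empty ones.
        technologies=list(item.get("technologies") or []),
        emails=list(item.get("emails") or []),
    )


# Subset of the sample item from the output schema.
sample = {
    "commentId": 22666455,
    "company": "PBS",
    "role": "Various Engineers",
    "location": "Alexandria, VA",
    "remoteStatus": "ONSITE (Flexible WFH)",
    "salary": None,
    "technologies": ["express", "iOS"],
    "emails": ["digitaljobs@pbs.org"],
    "hnUrl": "https://news.ycombinator.com/item?id=22666455",
}
job = parse_item(sample)
```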
Pricing / cost estimation
Priced at $0.02 per job posting (Pay-per-Result).
| Target Jobs | Estimated Cost |
|---|---|
| 100 | $2.00 |
| 500 | $10.00 |
| 1,000 | $20.00 |
HN APIs are open and free to access. No proxy costs typically required.
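The table follows directly from the per-result price. A small sketch of the arithmetic, using `Decimal` to avoid floating-point rounding (`estimate_cost` is a hypothetical helper, and platform fees or proxy usage are not included):

```python
from decimal import Decimal

PRICE_PER_JOB = Decimal("0.02")  # USD per job posting, as stated above


def estimate_cost(job_count: int) -> Decimal:
    """Estimated cost in USD for a given number of job postings."""
    if job_count < 0:
        raise ValueError("job_count must not be negative")
    return (PRICE_PER_JOB * job_count).quantize(Decimal("0.01"))


for jobs in (100, 500, 1000):
    print(f"{jobs} jobs: ${estimate_cost(jobs)}")
```

For a capped run, `estimate_cost(months_back * max_jobs_per_month)` gives a rough upper bound, assuming the per-month cap applies to every billed result.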
Tips / Advanced
- Auto-Discovery: Leave `startUrls` empty and set `monthsBack` to 3-6 to get a rolling window of hiring posts.
- Focus on Remote: Filter output by the `remoteStatus` field containing "REMOTE".
- Tech Filtering: Use the `technologies` array to find jobs matching specific skills.
- Speed: Each API call has a 50ms delay to be respectful to HN. Expect ~20 jobs/minute.
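The remote and tech filters from the tips above can be applied to exported items with a few lines of code. This is a minimal sketch over plain dictionaries; the helper names are hypothetical, and matching is case-insensitive because the casing of `remoteStatus` and `technologies` values is not guaranteed.

```python
from typing import Iterable, List


def is_remote(item: dict) -> bool:
    """True when the remoteStatus field contains "REMOTE"."""
    return "REMOTE" in (item.get("remoteStatus") or "").upper()


def mentions_any(item: dict, wanted: Iterable[str]) -> bool:
    """True when the post lists at least one of the wanted technologies."""
    listed = {tech.lower() for tech in item.get("technologies") or []}
    return bool(listed & {tech.lower() for tech in wanted})


def filter_jobs(items: Iterable[dict], wanted: Iterable[str]) -> List[dict]:
    """Keep remote jobs that mention at least one wanted technology."""
    wanted = list(wanted)
    return [job for job in items if is_remote(job) and mentions_any(job, wanted)]


# Illustrative items only; real items come from the Actor's dataset.
items = [
    {"company": "A", "remoteStatus": "REMOTE (US)", "technologies": ["Python", "React"]},
    {"company": "B", "remoteStatus": "ONSITE (Flexible WFH)", "technologies": ["express", "iOS"]},
    {"company": "C", "remoteStatus": None, "technologies": ["python"]},
]
matches = filter_jobs(items, ["python"])
```

Note that a status such as "ONSITE (Flexible WFH)" does not contain "REMOTE", so it is excluded by this check.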
FAQ
Is scraping Hacker News legal?
HN provides official APIs (Firebase and Algolia) for accessing this data. This Actor uses those APIs, not HTML scraping.
Why did I get fewer results than expected?
Some comments in hiring threads are discussion, not job posts. The parser attempts to filter these, but imperfectly. Set includeReplies: true to capture more.
Can I scrape historical data?
Yes. Set monthsBack up to 12 to scrape past hiring threads. Note that older posts may have fewer active listings.
Support
For bug reports or feature requests, open a ticket in the Issues tab.