🔶 Hacker News Scraper — Stories & Tech Trends
Pricing
from $2.00 / 1,000 results
Scrape Hacker News stories — top, new, best, ask, show, jobs. Engagement tracking, trend analysis, and tech topic monitoring.
Developer
Stephan Corbeil
Hacker News Scraper by nexgendata
Extract trending stories, comments, and metadata from Hacker News at scale. Automate HN data collection without building an API client.
What This Actor Does
The Hacker News Scraper connects directly to Hacker News' Firebase API to extract stories, comments, and metadata in seconds. No HTML parsing, no rate limits to work around, and no API documentation to study. Whether you're tracking tech trends, monitoring startup mentions, or building AI training datasets, this actor delivers structured JSON output you can use immediately.
Perfect for:
- Startups building competitive intelligence systems
- Data scientists gathering training datasets
- Content strategists tracking industry discussions
- Researchers analyzing tech community behavior
- Automated news feeds and aggregators
Why Scrape Hacker News?
Hacker News data extraction powers decision-making across tech companies. HN discussions reveal product launches before major announcements, engineering challenges competitors are solving, investor and founder sentiment shifts, early signals for emerging technologies, and real-time feedback on industry trends.
This actor eliminates the friction between wanting that data and actually having it.
Key Features
Search Multiple Story Types
Need top Hacker News stories? Use searchType: top. Want trending HN tech news? Try searchType: best. The actor supports all six story feeds: top (front-page stories), new (recently submitted), best (ranked by score with visibility weighting), ask (Ask HN discussions), show (Show HN project submissions), and job (job postings).
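As a rough illustration of how the six searchType values could map onto the public HN Firebase API, here is a minimal sketch. The endpoint names are those of the public API; the mapping itself and the helper names `feed_url` and `item_url` are assumptions for illustration, not the actor's actual internals.

```python
# Base URL of the public Hacker News Firebase API.
HN_API_BASE = "https://hacker-news.firebaseio.com/v0"

# Assumed mapping from the actor's searchType values to HN API feed names.
FEED_ENDPOINTS = {
    "top": "topstories",
    "new": "newstories",
    "best": "beststories",
    "ask": "askstories",
    "show": "showstories",
    "job": "jobstories",
}


def feed_url(search_type: str) -> str:
    """Return the feed URL for a searchType, or raise ValueError if unknown."""
    try:
        endpoint = FEED_ENDPOINTS[search_type]
    except KeyError:
        valid = ", ".join(sorted(FEED_ENDPOINTS))
        raise ValueError(
            f"unknown searchType {search_type!r}; expected one of: {valid}"
        ) from None
    return f"{HN_API_BASE}/{endpoint}.json"


def item_url(item_id: int) -> str:
    """Return the URL of a single item (story, comment, or job posting)."""
    return f"{HN_API_BASE}/item/{item_id}.json"
```

Each feed URL returns a JSON array of item IDs, and each item URL returns one JSON object, so a feed is read in two steps: list the IDs, then fetch the items.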
Fetch Exact Result Counts
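Set maxResults from 1 to 500. Whether you need the top 10 Hacker News articles for a daily brief or 500 HN stories for machine learning training data, you get up to the number you specify. A feed can hold fewer items than requested: HN's API documents up to 500 stories for top and new, and up to 200 for ask, show, and job.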
Include Full Comment Threads
Set includeComments: true to attach every comment under each story. Extract sentiment, track discussions, build comment datasets. With includeComments: false, run faster and leaner when you only need stories.
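To make the cost of includeComments: true concrete, the sketch below walks a comment thread the way the HN API exposes it: each item lists its direct replies in a `kids` array, so collecting every comment means one lookup per comment. The `get_item` callable is a hypothetical stand-in for a request to `/v0/item/<id>.json`; how the actor itself traverses threads is not documented here.

```python
from typing import Callable, Iterator, Optional


def walk_comments(
    story: dict, get_item: Callable[[int], Optional[dict]]
) -> Iterator[dict]:
    """Yield every live comment under a story, depth-first, in thread order."""
    # Reverse so that popping from the end visits kids in their listed order.
    stack = list(reversed(story.get("kids", [])))
    while stack:
        item = get_item(stack.pop())
        if not item or item.get("deleted") or item.get("dead"):
            # Skip missing, deleted, or dead comments together with their replies.
            continue
        yield item
        stack.extend(reversed(item.get("kids", [])))
```

Because every comment is a separate lookup, a story with hundreds of comments needs hundreds of requests, which is why stories-only runs finish faster.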
Fast Execution
Leverages HN's Firebase backend for speed. Most requests complete in under 30 seconds.
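One plausible reason such a run can be fast is that item lookups are independent and can be issued concurrently. The following is a minimal sketch under that assumption, using only the standard library; `fetch_stories`, its `fetch` parameter, and the worker count are illustrative choices, not the actor's implementation.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from typing import Callable
from urllib.request import urlopen

HN_API_BASE = "https://hacker-news.firebaseio.com/v0"


def fetch_json(url: str):
    """Download and decode one JSON document."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def fetch_stories(
    feed: str,
    max_results: int,
    fetch: Callable[[str], object] = fetch_json,
    workers: int = 16,
) -> list[dict]:
    """Fetch the first max_results items of a feed such as 'topstories'."""
    ids = fetch(f"{HN_API_BASE}/{feed}.json")[:max_results]
    urls = [f"{HN_API_BASE}/item/{item_id}.json" for item_id in ids]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps feed order even though requests finish out of order.
        items = list(pool.map(fetch, urls))
    # Deleted or missing items come back as null (None); drop them.
    return [item for item in items if item]
```

Passing a custom `fetch` makes the function easy to exercise offline, for example with a dictionary of canned responses.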
Real-World Use Cases
1. Competitive Intelligence Dashboard
Surface mentions of competitors, their products, and industry discussions automatically. Schedule a daily run on searchType: new with maxResults: 100 (and includeComments: true to capture the comment threads), then feed the results into a dashboard that flags stories mentioning competitor names. Sales teams get alerts when competitors are discussed, along with what people like about them and what criticism appears in comments.
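The flagging step could look something like the sketch below, which scans story titles and any attached comments for a watch list of names. It assumes the output shape shown in the sample output (a list of stories with `title` and optional `comments`); the function name `find_mentions` and the example company names are hypothetical.

```python
import re


def find_mentions(stories: list[dict], names: list[str]) -> list[dict]:
    """Return one record per story whose title or comments mention a watched name."""
    # Word boundaries avoid matching "Acme" inside "Acmecorp"; case is ignored.
    patterns = {
        name: re.compile(rf"\b{re.escape(name)}\b", re.IGNORECASE) for name in names
    }
    hits = []
    for story in stories:
        texts = [story.get("title") or ""]
        texts += [c.get("text") or "" for c in story.get("comments", [])]
        matched = sorted(
            name
            for name, pattern in patterns.items()
            if any(pattern.search(text) for text in texts)
        )
        if matched:
            hits.append(
                {"id": story["id"], "title": story.get("title") or "", "mentions": matched}
            )
    return hits
```

Names that end in punctuation (for example "C++") would need a different boundary rule than `\b`, so treat the pattern as a starting point.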
2. AI Training Dataset for Tech Sentiment Analysis
Build datasets for fine-tuning LLMs on real tech conversations. A 500-result scrape with includeComments: true can yield roughly 10,000-50,000 comments, depending on how active the threads are. Story scores and timestamps come attached, giving you engagement signals to use as features or weak labels; sentiment labels themselves still need to be annotated or inferred.
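A dataset-preparation step might flatten the nested output into one row per comment, as in this sketch. It assumes the output shape from the sample output and that comment text arrives as HTML (as it does in the HN API); the row field names and helpers `clean_text`, `comment_rows`, and `to_jsonl` are illustrative.

```python
import html
import json
import re
from typing import Iterable, Iterator

TAG_RE = re.compile(r"<[^>]+>")


def clean_text(raw: str) -> str:
    """Strip HTML tags, decode entities, and collapse whitespace."""
    no_tags = TAG_RE.sub(" ", raw)
    return " ".join(html.unescape(no_tags).split())


def comment_rows(stories: Iterable[dict]) -> Iterator[dict]:
    """Yield one flat row per non-empty comment, with its story context."""
    for story in stories:
        for comment in story.get("comments", []):
            text = clean_text(comment.get("text") or "")
            if not text:
                continue
            yield {
                "story_id": story["id"],
                "story_title": story.get("title", ""),
                "story_score": story.get("score"),
                "comment_id": comment["id"],
                "author": comment.get("by"),
                "time": comment.get("time"),
                "text": text,
            }


def to_jsonl(rows: Iterable[dict]) -> str:
    """Serialize rows as JSON Lines, one object per line."""
    return "\n".join(json.dumps(row, ensure_ascii=False) for row in rows)
```

JSON Lines is used here only because many training pipelines accept it; any tabular format would work equally well.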
3. Automated Newsletter Content
Run the actor daily on searchType: top with includeComments: true, extract titles and top-voted comments, and feed them into your newsletter template. Readers see what the HN community is discussing, with context from real discussions.
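For the newsletter step, a small formatter such as the one below could turn a run's stories into a ranked plain-text digest. It is a sketch that assumes the `title`, `score`, `descendants`, and `url` fields from the sample output; `build_digest` is a hypothetical helper, and it covers stories only, leaving comment selection to your template.

```python
def build_digest(stories: list[dict], limit: int = 5) -> str:
    """Format the highest-scoring stories as a numbered plain-text digest."""
    ranked = sorted(stories, key=lambda s: s.get("score", 0), reverse=True)[:limit]
    lines = []
    for rank, story in enumerate(ranked, start=1):
        # Text posts (for example Ask HN) have no url; link to the HN thread instead.
        link = story.get("url") or f"https://news.ycombinator.com/item?id={story['id']}"
        points = story.get("score", 0)
        comments = story.get("descendants", 0)
        lines.append(f"{rank}. {story['title']} ({points} points, {comments} comments)")
        lines.append(f"   {link}")
    return "\n".join(lines)
```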
4. Job Board Aggregation
Set searchType: job and maxResults: 100 to scrape HN's job listings. Automatically notify candidates when companies in target cities are hiring. Company names and roles can be parsed from the job titles in the structured output.
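Job postings on HN are free-text titles, so extracting a company and role needs a heuristic. The sketch below assumes the common "Company (YC W21) is hiring ..." title convention, which not every posting follows; `parse_job_title` and `jobs_in_cities` are hypothetical helpers, and titles that do not match are simply left unparsed.

```python
import re
from typing import Optional

# Heuristic for titles shaped like "Acme (YC W21) is hiring a backend engineer".
HIRING_RE = re.compile(
    r"^(?P<company>.+?)(?:\s*\(YC [^)]*\))?\s+is hiring\b\s*(?P<role>.*)$",
    re.IGNORECASE,
)


def parse_job_title(title: str) -> Optional[dict]:
    """Split a job title into company and role, or return None if it does not fit."""
    match = HIRING_RE.match(title.strip())
    if match is None:
        return None
    return {
        "company": match.group("company").strip(),
        "role": match.group("role").strip(),
    }


def jobs_in_cities(jobs: list[dict], cities: list[str]) -> list[dict]:
    """Return jobs whose title or text mentions any of the target cities."""
    matches = []
    for job in jobs:
        title = job.get("title") or ""
        haystack = f"{title} {job.get('text') or ''}".lower()
        found = [city for city in cities if city.lower() in haystack]
        if found:
            matches.append(
                {
                    "id": job["id"],
                    "title": title,
                    "cities": found,
                    "parsed": parse_job_title(title),
                }
            )
    return matches
```

Plain substring matching on city names is deliberately simple; short or ambiguous names may need stricter matching.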
Input Parameters
| Parameter | Type | Allowed values | Description |
|---|---|---|---|
| searchType | string | top, new, best, ask, show, job | Which HN feed to scrape. Default: top |
| maxResults | number | 1-500 | How many stories to extract. Default: 30 |
| includeComments | boolean | true/false | Attach all comments under each story. Default: false |
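When calling the actor from code, it can help to validate the input against the table above before starting a run. This is a minimal sketch: the parameter names, ranges, and defaults come from the table, while the function name `build_run_input` and its Python-style argument names are assumptions.

```python
VALID_SEARCH_TYPES = ("top", "new", "best", "ask", "show", "job")


def build_run_input(
    search_type: str = "top",
    max_results: int = 30,
    include_comments: bool = False,
) -> dict:
    """Validate the three parameters and return the actor input as a dict."""
    if search_type not in VALID_SEARCH_TYPES:
        raise ValueError(f"searchType must be one of {VALID_SEARCH_TYPES}")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_results, bool) or not isinstance(max_results, int):
        raise ValueError("maxResults must be an integer")
    if not 1 <= max_results <= 500:
        raise ValueError("maxResults must be between 1 and 500")
    if not isinstance(include_comments, bool):
        raise ValueError("includeComments must be true or false")
    return {
        "searchType": search_type,
        "maxResults": max_results,
        "includeComments": include_comments,
    }
```

The returned dictionary is in the shape Apify expects for actor input, so it could be passed as `run_input` to the `apify-client` package's `client.actor(...).call(...)`; the actor ID to use is not stated in this section.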
Sample Output
{
  "stories": [
    {
      "id": 42840302,
      "title": "Building a machine learning model for production",
      "url": "https://example.com/ml-guide",
      "score": 487,
      "descendants": 142,
      "time": 1711723200,
      "type": "story",
      "by": "techauthor",
      "comments": [
        {
          "id": 42840910,
          "text": "Great breakdown of deployment challenges...",
          "score": 52,
          "by": "commentor1",
          "time": 1711726800,
          "parent": 42840302
        }
      ]
    }
  ],
  "requestParams": {
    "searchType": "top",
    "maxResults": 1,
    "includeComments": true
  }
}
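To consume this output in code, one option is to load it into typed records and convert the Unix `time` field to a UTC datetime. The sketch below assumes the field names shown in the sample; the `Story` class and `parse_output` function are hypothetical names.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Story:
    id: int
    title: str
    url: Optional[str]
    score: int
    comment_count: int
    author: str
    posted_at: datetime
    comments: tuple


def parse_output(raw: str) -> list[Story]:
    """Parse the actor's JSON output into Story records."""
    payload = json.loads(raw)
    stories = []
    for item in payload.get("stories", []):
        stories.append(
            Story(
                id=item["id"],
                title=item.get("title", ""),
                url=item.get("url"),  # absent on text-only posts
                score=item.get("score", 0),
                comment_count=item.get("descendants", 0),
                author=item.get("by", ""),
                # "time" is seconds since the Unix epoch.
                posted_at=datetime.fromtimestamp(item["time"], tz=timezone.utc),
                comments=tuple(item.get("comments", [])),
            )
        )
    return stories
```

For the sample above, `posted_at` works out to 2024-03-29 14:40 UTC.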
Pricing: $5 per 1,000 Results
Cost breakdown: Scrape 30 stories = $0.15. Scrape 100 stories = $0.50. Scrape 500 stories = $2.50.
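The breakdown above is a straight per-result rate, which the following sketch reproduces. The $5 per 1,000 figure is taken from this section; `estimate_cost` is a hypothetical helper, and the rate is a parameter in case your plan's price differs.

```python
from decimal import Decimal


def estimate_cost(results: int, price_per_1000: str = "5.00") -> Decimal:
    """Return the cost in dollars for a number of results, rounded to cents."""
    if results < 0:
        raise ValueError("results must be non-negative")
    cost = Decimal(results) * Decimal(price_per_1000) / Decimal(1000)
    return cost.quantize(Decimal("0.01"))


print(estimate_cost(30))   # 0.15
print(estimate_cost(100))  # 0.50
print(estimate_cost(500))  # 2.50
```

Decimal is used instead of float so that cent amounts are represented exactly.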
Why This Price Is Worth It
Building it yourself costs more: 40+ hours to write, test, and deploy a reliable HN scraper (at least $2,400 in dev time at $60/hour), plus 5-10 hours/month in maintenance when things change. Infrastructure costs for servers, monitoring, retry logic, and error handling add up.
Real math: One engineer maintaining an in-house scraper costs $60-150/hour. Even at 2 hours/month in maintenance, well below the estimate above, you're paying $120-300/month just to keep it working. This actor costs $5/month at 1,000 monthly results.
FAQ
Will this scraper get blocked or rate-limited? Unlikely. The actor uses Hacker News' own Firebase API, which is public and official. HN documents this API for automated access and currently states no rate limit, so there is no scraping-style blocking risk.
How fresh is the data? Each run pulls directly from HN's live API, so the output reflects the feeds at the moment the actor runs. Newly posted stories show up in the new feed on the next run.
Can I schedule this to run daily automatically? Yes. Apify handles scheduling natively. Set up a daily run on your preferred search type and let it populate your database automatically.
Is my data private? Completely. All data stays within your Apify account. nexgendata has no access to results, metadata, or usage patterns.