🔶 Hacker News Scraper — Stories & Tech Trends

Pricing: from $2.00 / 1,000 results

Scrape Hacker News stories — top, new, best, ask, show, jobs. Engagement tracking, trend analysis, and tech topic monitoring.

Developer: Stephan Corbeil (maintained by Community)

Actor stats: 0 bookmarked, 14 total users, 7 monthly active users; last modified a day ago

Hacker News Scraper by nexgendata

Extract trending stories, comments, and metadata from Hacker News at scale. Automate HN data collection without building an API client.

What This Actor Does

The Hacker News Scraper connects directly to Hacker News' Firebase API to extract stories, comments, and metadata in seconds. There is no HTML to parse, no rate limit to work around, and no API documentation to study. Whether you're tracking tech trends, monitoring startup mentions, or collecting AI training data, this actor delivers structured JSON output you can use immediately.

Perfect for:

  • Startups building competitive intelligence systems
  • Data scientists gathering training datasets
  • Content strategists tracking industry discussions
  • Researchers analyzing tech community behavior
  • Automated news feeds and aggregators

Why Scrape Hacker News?

Hacker News data extraction powers decision-making across tech companies. HN discussions reveal product launches before major announcements, engineering challenges competitors are solving, investor and founder sentiment shifts, early signals for emerging technologies, and real-time feedback on industry trends.

This actor eliminates the friction between wanting that data and actually having it.

Key Features

Search Multiple Story Types

Need top Hacker News stories? Use searchType: top. Want trending HN tech news? Try searchType: best. The actor supports all six story feeds: top (front-page stories), new (recently submitted), best (ranked by score with visibility weighting), ask (Ask HN discussions), show (Show HN project submissions), and job (job and hiring posts).
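As a rough illustration of how the six feeds could map onto the public Hacker News Firebase API, the sketch below builds the feed and item URLs. The endpoint names come from the public HN API; the one-to-one mapping from searchType values is an assumption, since the actor's internals are not shown here, and the helper names feed_url and item_url are hypothetical.

```python
# Assumption: each searchType value maps one-to-one onto a public
# Hacker News Firebase feed endpoint. The actor's real code is not shown.
HN_API_BASE = "https://hacker-news.firebaseio.com/v0"

FEED_ENDPOINTS = {
    "top": "topstories",
    "new": "newstories",
    "best": "beststories",
    "ask": "askstories",
    "show": "showstories",
    "job": "jobstories",
}


def feed_url(search_type: str) -> str:
    """Return the URL that lists story IDs for a searchType value."""
    try:
        endpoint = FEED_ENDPOINTS[search_type]
    except KeyError:
        raise ValueError(f"unknown searchType: {search_type!r}") from None
    return f"{HN_API_BASE}/{endpoint}.json"


def item_url(item_id: int) -> str:
    """Return the URL for a single story or comment by ID."""
    return f"{HN_API_BASE}/item/{item_id}.json"
```

Each feed URL returns a JSON array of item IDs, and each ID is then resolved through its item URL.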

Fetch Exact Result Counts

Set maxResults from 1 to 500. Whether you need the top 10 Hacker News articles for a daily brief or 500 HN stories for machine learning training data, get exactly what you specify.
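A minimal sketch of how a maxResults limit might be applied to a feed's ID list. The function name select_story_ids is hypothetical, and this is not necessarily how the actor does it. One caveat worth knowing: the public HN API documents up to 500 IDs for the top and new feeds but only up to 200 for ask, show, and job, so a feed can hold fewer stories than requested.

```python
def select_story_ids(feed_ids: list[int], max_results: int = 30) -> list[int]:
    """Keep the first max_results IDs, preserving the feed's ranking order.

    Assumption: the 1-500 range and the default of 30 mirror the actor's
    documented maxResults parameter.
    """
    if not 1 <= max_results <= 500:
        raise ValueError("maxResults must be between 1 and 500")
    # Slicing never fails when the feed is shorter than max_results;
    # it simply returns everything that is available.
    return feed_ids[:max_results]
```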

Include Full Comment Threads

Set includeComments: true to attach every comment under each story. Extract sentiment, track discussions, build comment datasets. With includeComments: false, run faster and leaner when you only need stories.
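To show why comment collection is the slower path, here is a hedged sketch of walking a story's comment tree through the public HN API, where each item lists its direct replies in a kids field. The function names are hypothetical and the actor may collect comments differently. The fetch function is injectable so the traversal can be exercised without network access.

```python
import json
from typing import Callable, Optional
from urllib.request import urlopen

Item = Optional[dict]


def fetch_item(item_id: int) -> Item:
    """Fetch one item from the public HN Firebase API (network call)."""
    url = f"https://hacker-news.firebaseio.com/v0/item/{item_id}.json"
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def collect_comments(
    story: dict, fetch: Callable[[int], Item] = fetch_item
) -> list[dict]:
    """Walk the `kids` tree depth-first and return every live comment."""
    comments = []
    stack = list(reversed(story.get("kids", [])))
    while stack:
        item = fetch(stack.pop())
        if not item or item.get("deleted") or item.get("dead"):
            continue  # skip missing, deleted, or dead items and their replies
        comments.append(item)
        stack.extend(reversed(item.get("kids", [])))
    return comments
```

Every comment costs one request, which is why leaving includeComments off keeps runs short.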

Fast Execution

Leverages HN's Firebase backend for speed. Most requests complete in under 30 seconds.

Real-World Use Cases

1. Competitive Intelligence Dashboard

Automatically surface mentions of competitors, their products, and related industry discussions. Schedule a daily run on searchType: new with maxResults: 100 and feed the results into a dashboard that flags stories mentioning competitor names. Sales teams get alerts when competitors are discussed and can see what people like about them and what criticism appears in the comments.
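The flagging step could look something like the sketch below, which scans story titles and comment text for whole-word, case-insensitive matches. It assumes the output shape shown under Sample Output; the function name find_mentions and the company names are hypothetical.

```python
import re


def find_mentions(stories: list[dict], names: list[str]) -> list[dict]:
    """Return stories whose title or comments mention any of the names."""
    if not names:
        return []
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(name) for name in names) + r")\b",
        re.IGNORECASE,
    )
    hits = []
    for story in stories:
        texts = [story.get("title") or ""]
        texts += [c.get("text") or "" for c in story.get("comments", [])]
        matched = sorted(
            {m.group(1).lower() for text in texts for m in pattern.finditer(text)}
        )
        if matched:
            hits.append(
                {"id": story["id"], "title": story.get("title"), "matched": matched}
            )
    return hits
```

Whole-word matching avoids flagging "Acmeology" for "Acme", though names ending in punctuation (such as "C++") would need a different boundary rule.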

2. AI Training Dataset for Tech Sentiment Analysis

Build datasets for fine-tuning LLMs on real tech conversations. A 500-result scrape with includeComments: true can yield 10,000-50,000 comments across stories. Story scores and timestamps give each comment engagement and recency signals that can serve as weak labels for sentiment analysis.
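One way to turn a run's output into training rows is to flatten each comment into a JSON Lines record that carries its story's context. This sketch assumes the output shape shown under Sample Output; the row field names and function names are hypothetical choices.

```python
import json
from typing import Iterator


def comment_rows(output: dict) -> Iterator[dict]:
    """Yield one flat record per comment, tagged with its parent story."""
    for story in output.get("stories", []):
        for comment in story.get("comments", []):
            yield {
                "story_id": story["id"],
                "story_title": story.get("title"),
                "story_score": story.get("score"),
                "comment_id": comment["id"],
                "author": comment.get("by"),
                "time": comment.get("time"),
                "text": comment.get("text", ""),
            }


def to_jsonl(output: dict) -> str:
    """Serialize all comment rows as JSON Lines (one JSON object per line)."""
    return "\n".join(
        json.dumps(row, ensure_ascii=False) for row in comment_rows(output)
    )
```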

3. Automated Newsletter Content

Run the actor daily on searchType: top, extract titles and top-voted comments, and feed them into your newsletter template. Readers see what the HN community is discussing, with context from real discussions.
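A minimal sketch of the digest step, assuming the story and comment fields shown under Sample Output (including a score on each comment). The function name build_digest and the text layout are hypothetical.

```python
def build_digest(stories: list[dict], limit: int = 5) -> str:
    """Format the highest-scoring stories, each with its top-voted comment."""
    ranked = sorted(stories, key=lambda s: s.get("score", 0), reverse=True)
    lines = []
    for rank, story in enumerate(ranked[:limit], start=1):
        lines.append(f"{rank}. {story['title']} ({story.get('score', 0)} points)")
        comments = story.get("comments", [])
        if comments:
            top = max(comments, key=lambda c: c.get("score", 0))
            author = top.get("by", "unknown")
            lines.append(f"   Top comment by {author}: {top.get('text', '')}")
    return "\n".join(lines)
```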

4. Job Board Aggregation

Set searchType: job and maxResults: 100 to scrape HN's job listings. Automatically notify candidates when companies in target cities are hiring. Extract company names and roles from structured output.
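Extracting company names and roles usually means parsing the job title. The sketch below is a heuristic that assumes titles follow the common "Company (YC S21) is hiring a Role" phrasing; titles written differently return None, and both the pattern and the function name parse_job_title are illustrative assumptions rather than the actor's behavior.

```python
import re
from typing import Optional

# Heuristic pattern: "<company> [(YC <batch>)] is hiring <role>".
HIRING_PATTERN = re.compile(
    r"^(?P<company>.+?)\s+(?:\((?P<batch>YC [A-Z]\d{2})\)\s+)?"
    r"is hiring\s+(?P<role>.+)$",
    re.IGNORECASE,
)


def parse_job_title(title: str) -> Optional[dict]:
    """Split a job title into company, optional YC batch, and role."""
    match = HIRING_PATTERN.match(title.strip())
    if match is None:
        return None  # title does not follow the assumed phrasing
    return {
        "company": match["company"],
        "batch": match["batch"],
        "role": match["role"],
    }
```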

Input Parameters

Parameter | Type | Range | Description
searchType | string | top, new, best, ask, show, job | Which HN feed to scrape. Default: top
maxResults | number | 1-500 | How many stories to extract. Default: 30
includeComments | boolean | true/false | Attach all comments under each story. Default: false
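Before starting a run, the three parameters can be checked locally. The sketch below encodes the documented ranges and defaults in a small dataclass; the class name ActorInput is hypothetical, and the actor may validate its input differently.

```python
from dataclasses import asdict, dataclass

SEARCH_TYPES = ("top", "new", "best", "ask", "show", "job")


@dataclass(frozen=True)
class ActorInput:
    """Input fields and defaults as listed in the parameter table."""

    searchType: str = "top"
    maxResults: int = 30
    includeComments: bool = False

    def __post_init__(self) -> None:
        if self.searchType not in SEARCH_TYPES:
            raise ValueError(f"searchType must be one of {SEARCH_TYPES}")
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(self.maxResults, bool) or not isinstance(self.maxResults, int):
            raise ValueError("maxResults must be an integer")
        if not 1 <= self.maxResults <= 500:
            raise ValueError("maxResults must be between 1 and 500")
        if not isinstance(self.includeComments, bool):
            raise ValueError("includeComments must be true or false")


# A plain dict like this can be serialized to JSON as the run input.
run_input = asdict(ActorInput(searchType="new", maxResults=100))
```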

Sample Output

{
  "stories": [
    {
      "id": 42840302,
      "title": "Building a machine learning model for production",
      "url": "https://example.com/ml-guide",
      "score": 487,
      "descendants": 142,
      "time": 1711723200,
      "type": "story",
      "by": "techauthor",
      "comments": [
        {
          "id": 42840910,
          "text": "Great breakdown of deployment challenges...",
          "score": 52,
          "by": "commentor1",
          "time": 1711726800,
          "parent": 42840302
        }
      ]
    }
  ],
  "requestParams": {
    "searchType": "top",
    "maxResults": 1,
    "includeComments": true
  }
}
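In the sample above, time is a Unix timestamp in seconds and descendants is the story's total comment count, following the public HN API's field conventions. A short sketch of turning the output into human-readable rows, with summarize as a hypothetical helper name:

```python
from datetime import datetime, timezone


def summarize(output: dict) -> list[dict]:
    """Convert raw stories into rows with ISO timestamps and HN links."""
    rows = []
    for story in output["stories"]:
        posted = datetime.fromtimestamp(story["time"], tz=timezone.utc)
        rows.append(
            {
                "id": story["id"],
                "title": story["title"],
                "score": story.get("score", 0),
                "comment_count": story.get("descendants", 0),
                "posted_at": posted.isoformat(),
                # Discussion page on Hacker News for this story.
                "discussion_url": f"https://news.ycombinator.com/item?id={story['id']}",
            }
        )
    return rows
```

For the sample story, posted_at comes out as 2024-03-29T14:40:00+00:00.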

Pricing: $5 per 1,000 Results

Cost breakdown: Scrape 30 stories = $0.15. Scrape 100 stories = $0.50. Scrape 500 stories = $2.50.
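The breakdown above is simple per-result arithmetic, sketched here with Decimal to avoid floating-point rounding. The rate is a parameter because the listing header quotes "from $2.00 / 1,000 results" while this section uses $5; the function name run_cost is hypothetical.

```python
from decimal import Decimal

# Rate used in this section's breakdown; pass another value to compare plans.
PRICE_PER_1000 = Decimal("5.00")


def run_cost(results: int, price_per_1000: Decimal = PRICE_PER_1000) -> Decimal:
    """Return the cost in dollars for a number of results, rounded to cents."""
    return (price_per_1000 * results / 1000).quantize(Decimal("0.01"))


for count in (30, 100, 500):
    print(f"{count} stories = ${run_cost(count)}")
```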

Why This Price Is Worth It

Building it yourself costs more: 40+ hours to write, test, and deploy a reliable HN scraper (~$2,000 in dev time), plus 5-10 hours/month in maintenance when things change. Infrastructure costs for servers, monitoring, retry logic, and error handling add up.

Real math: One engineer maintaining an in-house scraper costs $60-150/hour. Even at a conservative 2 hours/month of maintenance, you're paying $120-300/month just to keep it working. This actor costs $5/month at 1,000 monthly results.

FAQ

Will this scraper get blocked or rate-limited? No. The actor uses Hacker News' own Firebase API, which is public and official. No rate limits, no blocking risk. HN publicly documents and allows automated access via this API.

How fresh is the data? Real-time as of each run. The actor pulls directly from HN's live API, so stories posted seconds before a run can already appear in its output.

Can I schedule this to run daily automatically? Yes. Apify handles scheduling natively. Set up a daily run on your preferred search type and let it populate your database automatically.

Is my data private? Completely. All data stays within your Apify account. nexgendata has no access to results, metadata, or usage patterns.