Pricing

from $50.00 / 1,000 story records

Hacker News Scraper — Tech News Feed API

Monitor Hacker News stories and engagement trends. Clean JSON for PR, media-monitoring teams and AI agents.

Rating

0.0

(0)

Developer

NexGenData

Last modified

2 days ago

🔍 Hacker News Scraper & Tech Trend Tracker

Extract trending stories, comments, and metadata from Hacker News at scale. A drop-in alternative to the HN Algolia API and the official Firebase API — with bulk pagination, comment-thread expansion, and structured JSON output, no rate-limit headaches.

Why This HN Scraper Beats HN Algolia, Firebase API & Manual Polling

Feature	NexGenData Hacker News Scraper	HN Algolia API	HN Official Firebase API	Manual cron + scraping
Cost	$50 / 1,000 results ($0.05 per result), pay-per-event	Free but rate-limited	Free but no batch	Engineering time + infra
Bulk pagination	Up to 500 stories per run	Plan-limited	One ID at a time	Build it yourself
Comment threads	Full nested comments per story	Separate calls	Walk descendants tree manually	Build it yourself
Story feeds	top, new, best, ask, show, job	Limited categories	Yes (one per call)	Build it yourself
Output format	JSON / CSV / Excel	JSON	JSON	Whatever you write
Schedule + webhook	Native cron + webhook on completion	None	None	Build it yourself
Time-to-first-row	< 60 seconds	Signup needed	Yes (slow per-call)	Days
Auth	Apify token	None (anon)	None	Your IP / proxy
Maintenance	We handle it	Algolia handles it	You handle Firebase quirks	You handle everything

Most teams pick this scraper because it is faster than walking the Firebase descendants tree by hand and more flexible than Algolia's fixed search index — and it ships JSON straight to a dataset, no Firebase SDK required.

What This Actor Does

The Hacker News Scraper connects directly to Hacker News' Firebase API to extract stories, comments, and metadata in seconds. No parsing, no rate limits, no complex API documentation. Whether you're tracking tech trends, monitoring startup mentions, or feeding AI training data, this actor delivers structured JSON output you can use immediately.

Perfect for:

Startups building competitive intelligence systems
Data scientists gathering training datasets
Content strategists tracking industry discussions
Researchers analyzing tech community behavior
Automated news feeds and aggregators

Why Scrape Hacker News?

Hacker News data extraction powers decision-making across tech companies. HN discussions reveal product launches before major announcements, engineering challenges competitors are solving, investor and founder sentiment shifts, early signals for emerging technologies, and real-time feedback on industry trends.

Key Features

Search Multiple Story Types

Need top Hacker News stories? Use searchType: top. Want trending HN tech news? Try searchType: best. The actor supports all six story feeds: top (frontpage stories), new (recently submitted), best (ranked by score with visibility weighting), ask (Ask HN discussions), show (Show HN project submissions), and job (job postings and hiring).
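As a rough illustration of how the six feeds line up with the public HN Firebase API, the sketch below maps each searchType value to the feed name that API exposes. The mapping is an assumption about how the actor resolves feeds internally; the names `FEED_ENDPOINTS` and `feed_url` are hypothetical helpers, not part of the actor.

```python
# Assumed mapping from the actor's searchType values to the feed names
# exposed by the public HN Firebase API. The actor's internals are not
# documented here, so treat this as illustrative only.
FIREBASE_BASE = "https://hacker-news.firebaseio.com/v0"

FEED_ENDPOINTS = {
    "top": "topstories",
    "new": "newstories",
    "best": "beststories",
    "ask": "askstories",
    "show": "showstories",
    "job": "jobstories",
}


def feed_url(search_type: str) -> str:
    """Return the Firebase URL that lists story IDs for the given feed."""
    try:
        endpoint = FEED_ENDPOINTS[search_type]
    except KeyError:
        raise ValueError(f"unknown searchType: {search_type!r}") from None
    return f"{FIREBASE_BASE}/{endpoint}.json"


print(feed_url("show"))
```

Each of these URLs returns a JSON array of story IDs; the individual stories live under `/v0/item/<id>.json`.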

Fetch Exact Result Counts

Set maxResults from 1 to 500. Whether you need the top 10 Hacker News articles for a daily brief or 500 HN stories for machine learning training data, get exactly what you specify.

Include Full Comment Threads

Set includeComments: true to attach every comment under each story. Extract sentiment, track discussions, build comment datasets. With includeComments: false, run faster and leaner when you only need stories.
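For readers curious what comment expansion involves, here is a minimal sketch of walking a story's `kids` field (the child-ID list used by the HN Firebase API) into a flat sequence. It is not the actor's implementation: `flatten_comments` and the injected `fetch_item` callable are hypothetical, and the demo uses an in-memory dictionary instead of network calls.

```python
from typing import Callable, Iterator, Optional

Item = dict
FetchItem = Callable[[int], Optional[Item]]


def flatten_comments(story: Item, fetch_item: FetchItem) -> Iterator[Item]:
    """Yield every live comment under a story, depth-first, as a flat sequence."""
    # Reverse so that popping from the end visits children in original order.
    stack = list(reversed(story.get("kids", [])))
    while stack:
        item = fetch_item(stack.pop())
        if item is None:
            continue
        # Skip deleted or dead comments themselves, but still visit replies.
        if not item.get("deleted") and not item.get("dead"):
            yield item
        stack.extend(reversed(item.get("kids", [])))


# Demo with a fake in-memory item store (IDs are made up).
FAKE_ITEMS = {
    1: {"id": 1, "type": "story", "kids": [2, 3]},
    2: {"id": 2, "type": "comment", "kids": [4]},
    3: {"id": 3, "type": "comment"},
    4: {"id": 4, "type": "comment"},
}
comment_ids = [c["id"] for c in flatten_comments(FAKE_ITEMS[1], FAKE_ITEMS.get)]
print(comment_ids)
```

In a real client, `fetch_item` would request `/v0/item/<id>.json` for each ID, which is the per-item walking that includeComments: true saves you from writing.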

Fast Execution

Leverages HN's Firebase backend for speed. Most requests complete in under 30 seconds.

Real-World Use Cases

1. Competitive Intelligence Dashboard

Automatically surface mentions of competitors, their products, and industry discussions daily. Feed results into a dashboard that flags stories mentioning competitor names. Set it to run daily on searchType: new with maxResults: 100. Sales teams get alerts when competitors are discussed, what people like about them, and what criticism appears in comments.
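The flagging step can be sketched as a simple title filter over the dataset items. This is an assumed post-processing step on your side, not a feature of the actor; `flag_mentions` and the competitor names are hypothetical.

```python
import re


def flag_mentions(items, names):
    """Return (item, matched_names) pairs for stories whose title names a competitor."""
    # Whole-word, case-insensitive match. Names that start or end with
    # punctuation (for example "C++") would need a different boundary rule.
    patterns = {
        name: re.compile(rf"\b{re.escape(name)}\b", re.IGNORECASE) for name in names
    }
    flagged = []
    for item in items:
        title = item.get("title") or ""
        hits = [name for name, pattern in patterns.items() if pattern.search(title)]
        if hits:
            flagged.append((item, hits))
    return flagged


stories = [
    {"title": "Acme launches a new database", "score": 120},
    {"title": "Show HN: a toy compiler", "score": 45},
    {"title": "Why we left Globex for acme", "score": 300},
]
for story, hits in flag_mentions(stories, ["Acme", "Globex"]):
    print(story["title"], "->", hits)
```

The same function can be pointed at comment text when includeComments is enabled, at the cost of more false positives.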

2. AI Training Dataset for Tech Sentiment Analysis

Build production-grade datasets for fine-tuning LLMs on real tech conversations. A 500-result scrape with includeComments: true gives you 10,000-50,000 comments across stories. Story scores and timestamps are not sentiment labels by themselves, but they give you engagement signals that can serve as weak labels for a sentiment model.

3. Automated Newsletter Content

Run the actor daily on searchType: top, extract titles and top-voted comments, and feed them into your newsletter template. Readers see what the HN community is discussing, with context from real discussions.
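One way to turn a run's items into newsletter text is sketched below. It assumes the item fields shown in Sample Output (`title`, `url`, `score`, and a `comments` list whose entries carry `by`, `text`, and `score`); `render_digest` is a hypothetical helper, not something the actor provides.

```python
def render_digest(items, limit=10):
    """Render the highest-scoring stories as a plain-text digest."""
    top = sorted(items, key=lambda s: s.get("score", 0), reverse=True)[:limit]
    lines = []
    for rank, story in enumerate(top, start=1):
        title = story.get("title", "(untitled)")
        lines.append(f"{rank}. {title} ({story.get('score', 0)} points)")
        if story.get("url"):
            lines.append(f"   {story['url']}")
        comments = story.get("comments") or []
        if comments:
            # Assumes each comment carries a numeric score, as in Sample Output.
            best = max(comments, key=lambda c: c.get("score", 0))
            lines.append(f"   Top comment by {best.get('by', 'unknown')}: {best.get('text', '')}")
    return "\n".join(lines)


sample = [{
    "title": "Building a machine learning model for production",
    "url": "https://example.com/ml-guide",
    "score": 487,
    "comments": [{"text": "Great breakdown...", "score": 52, "by": "commentor1"}],
}]
print(render_digest(sample))
```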

4. Job Board Aggregation

Set searchType: job and maxResults: 100 to scrape HN's job listings. Automatically notify candidates when companies in target cities are hiring. Extract company names and roles from structured output.
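Pulling a company and role out of a job item usually means parsing its title. The sketch below assumes titles follow the common "Company (YC S21) is hiring ..." phrasing; that convention is an assumption, many posts will not match it, and `parse_job_title` is a hypothetical helper.

```python
import re

# Assumed title shape: "<company> [(YC <batch>)] is hiring <role>".
JOB_TITLE = re.compile(
    r"^(?P<company>.+?)\s*(?:\((?P<batch>YC [A-Z]\d{2})\))?\s+is hiring\s*(?P<role>.*)$",
    re.IGNORECASE,
)


def parse_job_title(title):
    """Split a job title into company, YC batch, and role; None if it does not match."""
    match = JOB_TITLE.match(title.strip())
    if match is None:
        return None
    return {
        "company": match.group("company"),
        "batch": match.group("batch"),
        "role": match.group("role") or None,
    }


print(parse_job_title("Acme (YC S21) Is Hiring a Senior Backend Engineer"))
print(parse_job_title("Globex is hiring engineers in Berlin"))
```

Titles that return None are best kept and reviewed manually rather than dropped, since the pattern only covers one phrasing.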

Input Parameters

Parameter	Type	Range	Description
`searchType`	string	top, new, best, ask, show, job	Which HN feed to scrape. Default: `top`
`maxResults`	number	1-500	How many stories to extract. Default: 30
`includeComments`	boolean	true/false	Attach all comments under each story. Default: `false`
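A small client-side check can catch bad inputs before a run is started. The sketch below encodes the ranges and defaults from the table above; `build_run_input` is a hypothetical helper, and the actor may apply its own validation regardless.

```python
VALID_SEARCH_TYPES = ("top", "new", "best", "ask", "show", "job")


def build_run_input(search_type="top", max_results=30, include_comments=False):
    """Validate parameters against the documented ranges and return the input dict."""
    if search_type not in VALID_SEARCH_TYPES:
        raise ValueError(f"searchType must be one of {VALID_SEARCH_TYPES}")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_results, bool) or not isinstance(max_results, int):
        raise TypeError("maxResults must be an integer")
    if not 1 <= max_results <= 500:
        raise ValueError("maxResults must be between 1 and 500")
    if not isinstance(include_comments, bool):
        raise TypeError("includeComments must be true or false")
    return {
        "searchType": search_type,
        "maxResults": max_results,
        "includeComments": include_comments,
    }


print(build_run_input("ask", max_results=50))
```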

Quick Start

from apify_client import ApifyClient

# Authenticate with your Apify API token
client = ApifyClient("YOUR_APIFY_TOKEN")

# Start the actor and wait for the run to finish
run = client.actor("nexgendata/hacker-news-scraper").call(run_input={
    "searchType": "top",
    "maxResults": 100,
    "includeComments": False,
})

# Iterate over the stories written to the run's default dataset
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item.get("title"), item.get("score"))

Sample Output

{
  "id": 42840302,
  "title": "Building a machine learning model for production",
  "url": "https://example.com/ml-guide",
  "score": 487,
  "descendants": 142,
  "time": 1711723200,
  "type": "story",
  "by": "techauthor",
  "comments": [
    {"id": 42840910, "text": "Great breakdown...", "score": 52, "by": "commentor1"}
  ]
}
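The `time` field in the sample is a Unix timestamp in seconds, and `id` is the Hacker News item ID. A minimal sketch of turning both into something readable follows; `enrich` and the added keys `posted_at` and `hn_url` are hypothetical and not part of the actor's output.

```python
from datetime import datetime, timezone


def enrich(story):
    """Return a copy of the story with an ISO timestamp and a discussion link."""
    out = dict(story)
    # Unix seconds to a timezone-aware UTC datetime.
    out["posted_at"] = datetime.fromtimestamp(story["time"], tz=timezone.utc).isoformat()
    # Discussion page on Hacker News for this item ID.
    out["hn_url"] = f"https://news.ycombinator.com/item?id={story['id']}"
    return out


story = {"id": 42840302, "time": 1711723200, "title": "Building a machine learning model for production"}
print(enrich(story)["posted_at"])  # 2024-03-29T14:40:00+00:00
print(enrich(story)["hn_url"])
```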

Pricing: $50 per 1,000 Results

Cost breakdown at $0.05 per result: Scrape 30 stories = $1.50. Scrape 100 stories = $5.00. Scrape 500 stories = $25.00.

Building it yourself costs more: 40+ hours to write, test, and deploy a reliable HN scraper (~$2,000 in dev time), plus 5-10 hours/month in maintenance when things change.

FAQ

Will this scraper get blocked or rate-limited? No. The actor uses Hacker News' own Firebase API, which is public and official. No rate limits, no blocking risk. HN publicly documents and allows automated access via this API.

How fresh is the data? Real-time. The actor pulls directly from HN's live database. Stories appear in your output within seconds of being posted.

Can I schedule this to run daily automatically? Yes. Apify handles scheduling natively. Set up a daily run on your preferred search type and let it populate your database automatically.

Is my data private? Completely. All data stays within your Apify account. NexGenData has no access to results, metadata, or usage patterns.

How is this different from the HN Algolia API? HN Algolia is a search index built on top of HN — great for full-text search across years of HN history, but the rate limit is real and the JSON shape doesn't include the comment tree. This actor walks the Firebase tree for you and ships flat comment arrays.

Agentic payments (AI agent buyers welcome)

This actor supports autonomous payment via Skyfire — AI agents (Claude Desktop, OpenCode, Cursor, Vercel AI SDK, OpenAI Agents SDK) can discover, fund, and invoke it without a human-in-the-loop credit card flow.

Agents using Apify's MCP server can find this actor by searching for Hacker News stories, YC News trend monitoring, or tech community signals and pay via a Skyfire PAY token (minimum $5 prefund). The existing pay-per-event pricing applies unchanged — the agent funds a token, runs the actor at the published per-result rate, and unused balance returns to the wallet on expiry.

Compatible agent frameworks:

Apify's official MCP server (mcp.apify.com)
Claude Desktop with Apify MCP integration
OpenCode + Apify MCP
OpenAI Agents SDK + Skyfire toolkit (via Composio)
Vercel AI SDK + Skyfire toolkit (via Composio)

No code changes are needed on the actor side; the integration runs entirely on Apify's infrastructure. AI agents discover the actor via the allowsAgenticUsers=true filter on Apify's store API.

Related NexGenData Actors

Use case	Actor
Show HN launch tracker	HN Show HN Tracker
Reddit subreddit trend & post tracker	Reddit Subreddit Trends
News & media monitoring for AI agents	News MCP Server
Indie Hackers product launches	Indie Hackers Products Tracker
Product Hunt launches tracker	Product Hunt Launches Scraper
Wikipedia structured-knowledge scraper	Wikipedia Scraper
Google Scholar paper search	Google Scholar Scraper
arXiv preprint search	arXiv Scraper

📰 The NexGenData Newswire & News Suite

Don't monitor one wire — cover them all. Pair this with the rest of the suite for complete PR, press-release, and news coverage from a single vendor with one consistent output schema.

Press-release wires

PR Newswire — US corporate announcements & earnings releases
PR Newswire Asia — APAC corporate announcements
Business Wire — company press releases & disclosures
GlobeNewswire — listed-company news & regulatory filings
EIN Presswire — broad-distribution press releases
PR Web — SMB & small-business press releases

News & headlines

AP News — Associated Press breaking news & articles
BBC News — global BBC headlines & articles
Google News — aggregated headlines & trending topics
Hacker News — tech & startup stories and discussion (← you are here)
Crunchbase News — funding rounds, M&A & startup headlines

Regional / regulatory

Investegate RNS — UK LSE/AIM regulatory (RNS) announcements

About NexGenData

NexGenData publishes 260+ buyer-intent actors covering SEC filings, YC alumni, lead generation, competitive intelligence, stock fundamentals across 30+ exchanges, and more. All pay-per-result. Browse the full catalog at https://apify.com/nexgendata?fpr=2ayu9b

How NexGenData Pricing Works

Every NexGenData actor uses pay-per-event pricing — you only pay for results that actually land in your dataset. No monthly minimum, no seat fees, no surprise overage bills.

Actor Start: a single-event charge each time you spin the actor up (scaled to memory size)
Result: charged per item written to the default dataset
No charge for retries, internal proxy rotation, or failed sub-requests — those are absorbed by the platform

If you only need the data once a quarter, you only pay once a quarter. If you scale to millions of records, the unit cost stays the same.

Apify Platform Bonus

New to Apify? Sign up with the NexGenData referral link — you get free platform credits on signup (enough for several thousand free results) and you help fund the maintenance of this actor fleet.

Integration Surface

Every actor in the NexGenData catalog can be triggered from:

Apify console — point-and-click run
Apify API — REST + webhooks
Apify Python / JS SDKs — programmatic batch
Zapier, Make.com, n8n — official integrations
MCP — many actors are exposed as MCP tools for Claude / ChatGPT / Cursor agents
Schedules — built-in cron for daily / weekly / monthly runs
Webhooks — POST results to any HTTPS endpoint on dataset write
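For the REST route, a request can be prepared with the standard library alone. The sketch below targets Apify's synchronous "run and return dataset items" endpoint and only builds the request; the helper name `build_sync_run_request` is hypothetical, and the actor ID is taken from Quick Start with the `~` separator that Apify uses in URL paths.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_sync_run_request(token, run_input, actor_id="nexgendata~hacker-news-scraper"):
    """Build (but do not send) a POST that runs the actor and returns its items."""
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_sync_run_request(
    "YOUR_APIFY_TOKEN",
    {"searchType": "top", "maxResults": 10, "includeComments": False},
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would start the run and block until the items are returned, so long runs are better started asynchronously and collected through a webhook.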

Support

NexGenData maintains 260+ Apify actors and ships updates regularly. Bug reports via the Apify console issues tab get a response within 24 hours. Roadmap requests are welcome — high-demand features ship in the next version.

Home: thenextgennexus.com
Full catalog: apify.com/nexgendata

💰 Pricing Example

This actor uses Pay-Per-Event pricing — you only pay for results.

Typical run (small): 100 results × $0.05 = $5.00
Medium run: 500 results × $0.05 = $25.00
Large run (power user): 2,000 results × $0.05 = $100.00
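The arithmetic above generalizes to any run size. The sketch below uses the $0.05 per-result rate from these examples and leaves out the Actor Start charge, whose amount is not stated on this page; `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Per-result rate taken from the pricing examples above.
PRICE_PER_RESULT = Decimal("0.05")


def estimate_cost(results, price_per_result=PRICE_PER_RESULT):
    """Estimate the per-result charge in USD, excluding the Actor Start event."""
    if results < 0:
        raise ValueError("results must be non-negative")
    # Decimal avoids binary floating-point rounding in currency math.
    return (Decimal(results) * price_per_result).quantize(Decimal("0.01"))


for n in (100, 500, 2000):
    print(n, estimate_cost(n))
```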

Free Apify accounts get $5/mo in platform credit. A tech community intel workflow at this scale usually exceeds the free credit, so upgrade to a paid Apify plan for unrestricted use.

🔗 Related Actors

Pair with these for a complete workflow:

🎬 HN Show HN Tracker — Indie Product Launch Stream — specialized monitor for Show HN launches with upvote velocity scoring
🚀 Product Hunt Scraper — Launches & Startups — monitor Product Hunt launches and daily rankings
🔴 Reddit MCP — AI Post & Comment Search — MCP server for Reddit search + subreddit monitoring
🔍 YouTube Comments Scraper — Bulk Comment Extractor — scrape YouTube comments for sentiment + audience signals

⭐ Found this useful?

If this Actor saved you time, a quick review on the Apify Store genuinely helps other teams discover it. Have a feature request or hit a problem? Open it from the Issues tab — we read every one.
