📰 Hacker News Scraper

Pricing

Pay per event

Extract trending tech discussions, nested comment hierarchies, and post scores from Hacker News directly into structured JSON for custom RAG pipelines.

Rating: 0.0 (0)

Developer: 太郎 山田

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 3
  • Monthly active users: 2
  • Last modified: 5 days ago

Fetches Hacker News top, new, best, ask, show, and job stories through the official Firebase API. Filters stories by score and analyzes the most frequently linked domains. The underlying API has been stable for more than 10 years.

Store Quickstart

Start with the Quickstart template (top stories, 20 items). For tech trend monitoring, use Top Trends with minScore=100 and domain analysis.

Key Features

  • 🔥 Official Firebase API — hacker-news.firebaseio.com — 10+ year stable
  • 📂 6 story modes — top, new, best, ask, show, job
  • Score filtering — Minimum score threshold for quality filtering
  • 💬 Comment threads — Optional nested comment extraction
  • 🏷️ Top domains analysis — Which domains dominate the front page
  • 🔑 No API key needed — Public Firebase API
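
As a rough sketch of how the public endpoints behind these features fit together (not this actor's actual implementation): each mode corresponds to a list endpoint that returns story IDs, and every story is then fetched individually by ID. The `fetch_json` parameter is a hypothetical injection point so the logic can be exercised offline, and applying `max_items` after the score filter is an assumption.

```python
import json
from urllib.request import urlopen

BASE = "https://hacker-news.firebaseio.com/v0"

# Official HN API list endpoints, keyed by this actor's mode names.
LIST_ENDPOINTS = {
    "top": "topstories",
    "new": "newstories",
    "best": "beststories",
    "ask": "askstories",
    "show": "showstories",
    "job": "jobstories",
}


def http_get_json(url):
    """Default fetcher: plain HTTPS GET, no API key required."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def fetch_stories(mode="top", max_items=30, min_score=0, fetch_json=http_get_json):
    """Return up to max_items stories for a mode, skipping those below min_score."""
    ids = fetch_json(f"{BASE}/{LIST_ENDPOINTS[mode]}.json")
    stories = []
    for story_id in ids:
        if len(stories) >= max_items:
            break
        item = fetch_json(f"{BASE}/item/{story_id}.json")
        # Missing items come back as null (None); treat an absent score as 0.
        if item and item.get("score", 0) >= min_score:
            stories.append(item)
    return stories
```

Calling `fetch_stories("top", max_items=5, min_score=100)` issues one request for the ID list plus one request per story inspected.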

Use Cases

Who | Why
Tech journalists | Daily Hacker News trend reports
Startup founders | Watch which tools/frameworks gain HN traction
VCs/Investors | Signal for emerging tech and founder announcements
Developer tool companies | Monitor HN sentiment on products and competitors
AI/ML researchers | Discover papers and repos trending in the tech community

Input

Field | Type | Default | Description
mode | string | top | top, new, best, ask, show, job
maxItems | integer | 30 | Max stories (1-500)
minScore | integer | 0 | Minimum score filter
includeComments | boolean | false | Include comment threads

Input Example

{
    "mode": "top",
    "maxItems": 30,
    "minScore": 100,
    "includeComments": false
}
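
Before submitting a run, the input can be checked client-side against the documented ranges. The following is a minimal sketch; `build_input` is a hypothetical helper, and rejecting a negative `minScore` is an assumption, since the table states no lower bound.

```python
VALID_MODES = {"top", "new", "best", "ask", "show", "job"}


def build_input(mode="top", max_items=30, min_score=0, include_comments=False):
    """Return a run-input dict, raising ValueError on out-of-range values."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}, got {mode!r}")
    if not 1 <= max_items <= 500:
        raise ValueError("maxItems must be between 1 and 500")
    if min_score < 0:  # assumption: negative thresholds are not meaningful
        raise ValueError("minScore must not be negative")
    return {
        "mode": mode,
        "maxItems": max_items,
        "minScore": min_score,
        "includeComments": bool(include_comments),
    }
```

The returned dict can be passed as the run input in any of the API examples.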

Output

Field | Type | Description
id | integer | HN story ID
title | string | Story title
url | string | External URL (if any)
author | string | HN username
score | integer | Upvote score
numComments | integer | Comment count
createdAt | string | ISO timestamp
hnUrl | string | Hacker News thread URL
comments | object[] | Top comments (if includeComments enabled)

Output Example

{
    "id": 12345678,
    "title": "Claude 4.5 released with new features",
    "url": "https://anthropic.com/news/claude-4-5",
    "author": "user123",
    "score": 523,
    "numComments": 142,
    "createdAt": "2024-04-05T19:34:38Z",
    "hnUrl": "https://news.ycombinator.com/item?id=12345678"
}
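
The official HN API names several of these fields differently: `by` for the author, `time` in Unix seconds, and `descendants` for the comment count. A minimal sketch of how a raw item could be mapped to the output fields in the table follows. `to_output_record` is a hypothetical helper rather than this actor's code, and the exact timestamp format the actor emits is an assumption.

```python
from datetime import datetime, timezone


def to_output_record(raw):
    """Map a raw HN API story item to the documented output fields."""
    created = datetime.fromtimestamp(raw["time"], tz=timezone.utc)
    return {
        "id": raw["id"],
        "title": raw.get("title"),
        "url": raw.get("url"),  # absent for text-only posts such as Ask HN
        "author": raw.get("by"),
        "score": raw.get("score", 0),
        "numComments": raw.get("descendants", 0),
        "createdAt": created.strftime("%Y-%m-%dT%H:%M:%SZ"),  # assumed format
        "hnUrl": f"https://news.ycombinator.com/item?id={raw['id']}",
    }
```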

API Usage

Run this actor programmatically using the Apify API. Replace YOUR_API_TOKEN with your token from Apify Console → Settings → Integrations.

cURL

curl -X POST "https://api.apify.com/v2/acts/taroyamada~hacker-news-intelligence/run-sync-get-dataset-items?token=YOUR_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{ "mode": "top", "maxItems": 30, "minScore": 100, "includeComments": false }'

Python

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")
# Start the actor and wait for the run to finish.
run = client.actor("taroyamada/hacker-news-intelligence").call(run_input={
    "mode": "top",
    "maxItems": 30,
    "minScore": 100,
    "includeComments": False,  # Python boolean, not JSON false
})
# Read the results from the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

JavaScript / Node.js

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });
const run = await client.actor('taroyamada/hacker-news-intelligence').call({
    "mode": "top",
    "maxItems": 30,
    "minScore": 100,
    "includeComments": false
});
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);

Tips & Limitations

  • Use mode: "top" for the front page, "new" for breaking submissions.
  • Set minScore: 50 to filter out noise and focus on signal.
  • Schedule daily to track trending dev/startup topics.
  • Combine with Article Content Extractor to fetch full content of linked stories.
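
Score filtering and a top-domains tally can also be reproduced client-side on items already in a dataset. The sketch below assumes each item carries the `url` and `score` fields from the Output table; `top_domains` is a hypothetical helper and is not how the actor computes its own domain analysis.

```python
from collections import Counter
from urllib.parse import urlparse


def top_domains(items, min_score=50, limit=5):
    """Count link domains among items scoring at least min_score."""
    counts = Counter()
    for item in items:
        url = item.get("url")
        if not url or item.get("score", 0) < min_score:
            continue  # skip text-only posts and low-score stories
        host = urlparse(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]  # treat www.example.com and example.com alike
        counts[host] += 1
    return counts.most_common(limit)
```

Running this on each daily snapshot gives a simple view of which sites are gaining or losing front-page share.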

FAQ

What does score mean?

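The story's point total as reported by the HN API. Stories on HN can be upvoted and flagged but not downvoted. As a rough guide, 100+ is typical of front-page stories and 500+ marks an unusually popular one.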

How often does the HN front page update?

Rapidly — rankings shift every few minutes. Scrape hourly for trend tracking.

Can I get old/archived stories?

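Only within the limits of the feeds. The 'new' mode lists the most recent submissions newest-first and 'best' lists recent high-scoring stories. Because maxItems is capped at 500, older stories that have dropped out of these feeds are not returned.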

What's the comment limit?

All comments under a story are available via the API, but each comment is a separate item fetched by ID, so comment-heavy posts slow down extraction.
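
To illustrate why comment-heavy posts are slow: in the official HN API, each item lists its direct replies as IDs in a `kids` array, so building a nested thread takes one request per comment. The sketch below is an assumption-laden illustration, not the actor's code; `fetch_item` is a hypothetical callable that returns the item dict for an ID, and `max_depth` is a hypothetical guard rather than an actor option.

```python
def build_comment_tree(item_id, fetch_item, max_depth=3):
    """Recursively resolve an item's `kids` IDs into nested comment dicts."""
    item = fetch_item(item_id)
    # Skip missing items and comments flagged as deleted or dead.
    if not item or item.get("deleted") or item.get("dead"):
        return None
    node = {
        "id": item["id"],
        "author": item.get("by"),
        "text": item.get("text"),
        "replies": [],
    }
    if max_depth > 0:
        for kid_id in item.get("kids", []):
            child = build_comment_tree(kid_id, fetch_item, max_depth - 1)
            if child is not None:
                node["replies"].append(child)
    return node
```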

What's the difference vs the official HN API?

This actor handles pagination, deduplication, and comment threading, and writes the results to an Apify dataset, so no SDK is needed.

Can I search HN by keyword?

Use the Algolia HN search API for keyword search. This actor focuses on top/new/best feeds.
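
For keyword search, a request URL for the Algolia HN Search API can be assembled as in the sketch below. `build_search_url` is a hypothetical helper; the `query`, `tags`, `hitsPerPage`, and `numericFilters` parameters belong to that separate API, not to this actor.

```python
from urllib.parse import urlencode

ALGOLIA_SEARCH = "https://hn.algolia.com/api/v1/search"


def build_search_url(query, min_points=0, hits_per_page=20):
    """Build a story-search URL, optionally restricted by minimum points."""
    params = {"query": query, "tags": "story", "hitsPerPage": hits_per_page}
    if min_points > 0:
        params["numericFilters"] = f"points>={min_points}"
    return f"{ALGOLIA_SEARCH}?{urlencode(params)}"
```

The resulting URL can be fetched with any HTTP client and needs no API key.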


Cost

Pay Per Event:

  • actor-start: $0.01 (flat fee per run)
  • dataset-item: $0.003 per output item

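Example: 500 items in one run = $0.01 + (500 × $0.003) = $1.51. Because maxItems is capped at 500 per run, 1,000 items take at least two runs: (2 × $0.01) + (1,000 × $0.003) = $3.02.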

No subscription required — you only pay for what you use.
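
The pricing formula can be written as a small estimator. This is a sketch using the two event prices listed above; `estimate_cost` is a hypothetical helper, and the number of runs is passed explicitly because each run incurs its own start fee.

```python
ACTOR_START_FEE = 0.01    # flat fee per run, in USD
DATASET_ITEM_FEE = 0.003  # fee per output item, in USD


def estimate_cost(items, runs=1):
    """Estimate the total charge in USD for a number of items and runs."""
    return round(runs * ACTOR_START_FEE + items * DATASET_ITEM_FEE, 2)
```

For instance, `estimate_cost(500)` covers one full-size run, and `estimate_cost(30, runs=24)` approximates a day of hourly runs at the default `maxItems`.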