📰 Hacker News Scraper
Pricing
Pay per event
Extract trending tech discussions, nested comment hierarchies, and post scores from Hacker News directly into structured JSON for custom RAG pipelines.
Developer
太郎 山田
5 days ago
Last modified
Fetch Hacker News top, new, best, ask, show, and job stories via the official Firebase API. Filter stories by score and analyze which domains appear most often. The API has been stable for 10+ years.
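For orientation, here is a minimal sketch of how the six modes could map onto the public Firebase endpoints. It is not the actor's actual implementation. The helper names (`feed_url`, `item_url`, `fetch_stories`) are hypothetical, and the choice to apply the score filter after truncating to `max_items` is an assumption.

```python
import json
from urllib.request import urlopen

BASE_URL = "https://hacker-news.firebaseio.com/v0"

# Mode name -> feed endpoint of the public HN Firebase API.
FEEDS = {
    "top": "topstories",
    "new": "newstories",
    "best": "beststories",
    "ask": "askstories",
    "show": "showstories",
    "job": "jobstories",
}


def feed_url(mode):
    """Return the URL that lists story IDs for a given mode."""
    return f"{BASE_URL}/{FEEDS[mode]}.json"


def item_url(item_id):
    """Return the URL of a single item (story or comment)."""
    return f"{BASE_URL}/item/{item_id}.json"


def fetch_stories(mode="top", max_items=30, min_score=0):
    """Fetch up to max_items stories, then drop those below min_score.

    Assumption: filtering happens after truncation, so fewer than
    max_items stories may be returned.
    """
    with urlopen(feed_url(mode), timeout=10) as resp:
        ids = json.load(resp)
    stories = []
    for item_id in ids[:max_items]:
        with urlopen(item_url(item_id), timeout=10) as resp:
            item = json.load(resp)
        if item and item.get("score", 0) >= min_score:
            stories.append(item)
    return stories


# Example (performs network requests):
# stories = fetch_stories("top", max_items=5, min_score=100)
```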
Store Quickstart
Start with the Quickstart template (top stories, 20 items). For tech trend monitoring, use Top Trends with minScore=100 and domain analysis.
Key Features
- 🔥 Official Firebase API — hacker-news.firebaseio.com — 10+ year stable
- 📂 6 story modes — top, new, best, ask, show, job
- ⭐ Score filtering — Minimum score threshold for quality filtering
- 💬 Comment threads — Optional nested comment extraction
- 🏷️ Top domains analysis — Which domains dominate the front page
- 🔑 No API key needed — Public Firebase API
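The top domains analysis can be approximated on your own dataset in a few lines. The sketch below is only an illustration: the function name `top_domains` and the `www.` normalization are assumptions, not the actor's documented behavior.

```python
from collections import Counter
from urllib.parse import urlparse


def top_domains(stories, limit=5):
    """Count how often each external domain appears in a list of stories."""
    counts = Counter()
    for story in stories:
        url = story.get("url")
        if not url:
            # Ask HN posts and some job posts have no external URL.
            continue
        host = urlparse(url).netloc.lower()
        # Treat "www.example.com" and "example.com" as the same domain.
        if host.startswith("www."):
            host = host[4:]
        if host:
            counts[host] += 1
    return counts.most_common(limit)


sample = [
    {"title": "A", "url": "https://github.com/foo/bar"},
    {"title": "B", "url": "https://www.github.com/baz/qux"},
    {"title": "C", "url": "https://anthropic.com/news/claude-4-5"},
    {"title": "Ask HN: D"},  # no external URL
]
print(top_domains(sample))  # [('github.com', 2), ('anthropic.com', 1)]
```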
Use Cases
| Who | Why |
|---|---|
| Tech journalists | Daily Hacker News trend reports |
| Startup founders | Watch which tools/frameworks gain HN traction |
| VCs/Investors | Signal for emerging tech and founder announcements |
| Developer tool companies | Monitor HN sentiment on products and competitors |
| AI/ML researchers | Discover papers and repos trending in tech community |
Input
| Field | Type | Default | Description |
|---|---|---|---|
| mode | string | top | top, new, best, ask, show, job |
| maxItems | integer | 30 | Max stories (1-500) |
| minScore | integer | 0 | Minimum score filter |
| includeComments | boolean | false | Include comment threads |
Input Example
{"mode": "top","maxItems": 30,"minScore": 100,"includeComments": false}
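If you build inputs programmatically, checking them before starting a run can save a paid actor start. The following is a hedged sketch based only on the input table above. `normalize_input` is a hypothetical helper, and the actor itself may validate differently.

```python
VALID_MODES = {"top", "new", "best", "ask", "show", "job"}

# Defaults taken from the input table.
DEFAULTS = {"mode": "top", "maxItems": 30, "minScore": 0, "includeComments": False}


def normalize_input(raw):
    """Fill in defaults and reject values outside the documented ranges."""
    unknown = set(raw) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"Unknown input fields: {sorted(unknown)}")
    merged = {**DEFAULTS, **raw}
    if merged["mode"] not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")
    max_items = merged["maxItems"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(max_items, bool) or not isinstance(max_items, int):
        raise ValueError("maxItems must be an integer")
    if not 1 <= max_items <= 500:
        raise ValueError("maxItems must be between 1 and 500")
    min_score = merged["minScore"]
    if isinstance(min_score, bool) or not isinstance(min_score, int):
        raise ValueError("minScore must be an integer")
    if not isinstance(merged["includeComments"], bool):
        raise ValueError("includeComments must be a boolean")
    return merged


print(normalize_input({"minScore": 100}))
```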
Output
| Field | Type | Description |
|---|---|---|
| id | integer | HN story ID |
| title | string | Story title |
| url | string | External URL (if any) |
| author | string | HN username |
| score | integer | Upvote score |
| numComments | integer | Comment count |
| createdAt | string | ISO timestamp |
| hnUrl | string | Hacker News thread URL |
| comments | object[] | Top comments (if includeComments enabled) |
Output Example
{"id": 12345678,"title": "Claude 4.5 released with new features","url": "https://anthropic.com/news/claude-4-5","score": 523,"by": "user123","time": 1712345678,"descendants": 142,"type": "story"}
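The example above uses the raw Firebase field names (`by`, `time`, `descendants`), while the Output table lists `author`, `createdAt`, `numComments`, and `hnUrl`. If you receive raw-style items and want the table's shape, a mapping along these lines may help. This is a sketch under assumptions: `to_output_record` is a hypothetical helper, and the UTC ISO format for `createdAt` is a guess at what "ISO timestamp" means here.

```python
from datetime import datetime, timezone


def to_output_record(item):
    """Map a raw HN Firebase item to the field names in the Output table."""
    return {
        "id": item["id"],
        "title": item.get("title"),
        "url": item.get("url"),  # absent for Ask HN and similar posts
        "author": item.get("by"),
        "score": item.get("score", 0),
        "numComments": item.get("descendants", 0),
        # Firebase reports creation time as Unix seconds.
        "createdAt": datetime.fromtimestamp(item["time"], tz=timezone.utc).isoformat(),
        "hnUrl": f"https://news.ycombinator.com/item?id={item['id']}",
    }


raw = {
    "id": 12345678,
    "title": "Claude 4.5 released with new features",
    "url": "https://anthropic.com/news/claude-4-5",
    "score": 523,
    "by": "user123",
    "time": 1712345678,
    "descendants": 142,
    "type": "story",
}
print(to_output_record(raw))
```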
API Usage
Run this actor programmatically using the Apify API. Replace YOUR_API_TOKEN with your token from Apify Console → Settings → Integrations.
cURL
curl -X POST "https://api.apify.com/v2/acts/taroyamada~hacker-news-intelligence/run-sync-get-dataset-items?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"mode": "top", "maxItems": 30, "minScore": 100, "includeComments": false}'
Python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

# Start the actor and wait for the run to finish.
run = client.actor("taroyamada/hacker-news-intelligence").call(run_input={
    "mode": "top",
    "maxItems": 30,
    "minScore": 100,
    "includeComments": False,  # Python boolean, not JSON `false`
})

# Read the results from the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)
JavaScript / Node.js
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('taroyamada/hacker-news-intelligence').call({
    mode: 'top',
    maxItems: 30,
    minScore: 100,
    includeComments: false,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);
Tips & Limitations
- Use mode: "top" for the front page, "new" for breaking submissions.
- Set minScore: 50 to filter out noise and focus on signal.
- Schedule daily to track trending dev/startup topics.
- Combine with Article Content Extractor to fetch full content of linked stories.
FAQ
What does score mean?
The story's point total as reported by the HN API. As a rough guide, 100+ is front-page quality and 500+ is viral.
How often does the HN front page update?
Rapidly — rankings shift every few minutes. Scrape hourly for trend tracking.
Can I get old/archived stories?
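Only within the feed limits. The 'new' mode lists the most recent submissions in reverse chronological order, and 'best' returns high-scoring stories over time. Each run returns at most maxItems stories (up to 500), so older archives are out of reach.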
What's the comment limit?
All comments under a story are available via the API. Comment-heavy posts slow down extraction.
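To illustrate why comment-heavy posts are slower: in the HN Firebase API each item lists its direct children in a `kids` array, so a full thread needs one lookup per comment. The recursive sketch below is an assumption about how such threading can be done, not the actor's code. `build_comment_tree`, `fake_items`, and the `max_depth` cap are hypothetical, and the fetch function is injected so the example runs offline.

```python
def build_comment_tree(item_id, fetch_item, max_depth=3):
    """Recursively assemble a nested comment tree from HN-style items.

    fetch_item(item_id) must return the item dict or None. In real use it
    would request https://hacker-news.firebaseio.com/v0/item/<id>.json.
    """
    item = fetch_item(item_id)
    if item is None or item.get("deleted") or item.get("dead"):
        return None
    node = {
        "id": item["id"],
        "author": item.get("by"),
        "text": item.get("text"),
        "replies": [],
    }
    if max_depth > 0:
        for kid_id in item.get("kids", []):
            child = build_comment_tree(kid_id, fetch_item, max_depth - 1)
            if child is not None:
                node["replies"].append(child)
    return node


# Offline stand-in for the API: one story, two replies (one deleted), one nested reply.
fake_items = {
    1: {"id": 1, "by": "alice", "text": "root", "kids": [2, 3]},
    2: {"id": 2, "by": "bob", "text": "reply", "kids": [4]},
    3: {"id": 3, "deleted": True},
    4: {"id": 4, "by": "carol", "text": "nested reply"},
}

tree = build_comment_tree(1, fake_items.get)
print(tree)
```

Capping the depth bounds the number of lookups per story, at the cost of dropping deeply nested replies.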
What's the difference vs the official HN API?
This actor handles pagination, deduplication, and comment threading, and writes results to an Apify dataset, with no SDK needed.
Can I search HN by keyword?
Use the Algolia HN search API for keyword search. This actor focuses on top/new/best feeds.
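For keyword search, a request URL for the Algolia HN search endpoint can be built as sketched below. `build_search_url` is a hypothetical helper, and the parameter choices (stories only, optional points filter, 20 hits per page) are illustrative assumptions rather than anything this actor does.

```python
from urllib.parse import urlencode

ALGOLIA_SEARCH_URL = "https://hn.algolia.com/api/v1/search"


def build_search_url(query, min_points=0, hits_per_page=20):
    """Build a keyword-search URL restricted to stories."""
    params = {"query": query, "tags": "story", "hitsPerPage": hits_per_page}
    if min_points > 0:
        # numericFilters restricts results by numeric fields such as points.
        params["numericFilters"] = f"points>={min_points}"
    return f"{ALGOLIA_SEARCH_URL}?{urlencode(params)}"


print(build_search_url("rust async", min_points=100))
```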
Related Actors
News & Content cluster — explore related Apify tools:
- 📰 Google News Scraper — Scrape Google News articles for any search query via official RSS feed.
- 📰 Article Extractor — Extract clean article content with title, author, publish date, images from news and blog pages.
- 📄 Website Content Extractor — Extract clean main content from any webpage as text, markdown, or HTML.
- 📡 RSS Feed Aggregator — Aggregate multiple RSS and Atom feeds with keyword filtering and deduplication.
- 📡 Reddit All-in-One Scraper — Scrape Reddit subreddits, posts, comments, user profiles, and search results via public JSON endpoints.
- 🚨 Reddit Keyword Monitor Alerts — Focused Reddit keyword and subreddit monitor built for recurring alerts, snapshot diffing, and webhook handoff.
Cost
Pay Per Event:
- actor-start: $0.01 (flat fee per run)
- dataset-item: $0.003 per output item
Example: 1,000 items = $0.01 + (1,000 × $0.003) = $3.01
No subscription required — you only pay for what you use.
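The pricing above can be turned into a quick estimator for budgeting scheduled runs. This is a minimal sketch using the two listed event prices. `estimate_cost` is a hypothetical helper, and `Decimal` is used to avoid floating-point rounding in currency amounts.

```python
from decimal import Decimal

ACTOR_START_FEE = Decimal("0.01")    # flat fee per run
DATASET_ITEM_FEE = Decimal("0.003")  # per output item


def estimate_cost(num_items, runs=1):
    """Estimate the total cost in USD for num_items output items across runs."""
    return ACTOR_START_FEE * runs + DATASET_ITEM_FEE * num_items


# 1,000 items in a single run, matching the worked example above.
print(f"${estimate_cost(1000):.2f}")  # $3.01

# 30 items per run, once a day for 30 days (900 items, 30 runs).
print(f"${estimate_cost(30 * 30, runs=30):.2f}")  # $3.00
```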