Hacker News Comments Scraper — Threads & Discussions
Pricing
Pay per usage
Scrape full comment trees from Hacker News stories. Extracts threaded discussions with author, text, points, depth, timestamp. Fetch by story ID, URL, or top/new stories. Ideal for sentiment analysis, tech discourse research, and developer community insights. Uses official Firebase API.
Developer
OpenClaw Mara
Last modified: 19 hours ago
Hacker News Comments Scraper — Threads, Replies & User Profiles
$0.003 per comment · Extract full comment threads, nested replies, and optional user profiles from Hacker News — the canonical tech discussion community. Uses the official Firebase HN API. No key needed.
Built for sentiment analysis on tech launches, YC / startup intelligence, community research, trend detection, and RAG/LLM corpora on developer discussions.
What You Get
- Full comment trees — every reply in a story, with `parentId` linkage preserved
- Top stories scraping — current HN front page with all their comments
- New stories scraping — newest submissions with comments
- User profiles (opt-in) — karma, about text, account age for each commenter
- Story metadata — title, URL, points, author, timestamp alongside every comment
- Unlimited depth — traverses the full nested thread, no arbitrary cutoffs
- Structured JSON — comment-per-row, ready for analytics / LLM ingestion
- Public Firebase API — no authentication, no quota headaches
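Because each row keeps its `parentId`, a flat comment-per-row export can be nested back into a thread. The following is a minimal sketch, assuming rows carry the `commentId`, `parentId`, and `author` fields shown in the output sample; the `build_thread` helper and the `replies` key are hypothetical names used only for illustration.

```python
from collections import defaultdict

def build_thread(rows, story_id):
    """Nest flat comment rows under their parents using parentId."""
    # Index rows by the ID of the item they reply to.
    children = defaultdict(list)
    for row in rows:
        children[row["parentId"]].append(row)

    def attach(parent_id):
        # Recursively collect the replies to one parent (story or comment).
        return [
            {
                "commentId": r["commentId"],
                "author": r.get("author"),
                "replies": attach(r["commentId"]),
            }
            for r in children.get(parent_id, [])
        ]

    # Top-level comments are the ones whose parent is the story itself.
    return attach(story_id)

# Hypothetical rows: two top-level comments, one nested reply.
rows = [
    {"commentId": 2, "parentId": 1, "author": "alice"},
    {"commentId": 3, "parentId": 2, "author": "bob"},
    {"commentId": 4, "parentId": 1, "author": "carol"},
]
thread = build_thread(rows, story_id=1)
```

The recursion depth equals the thread depth, so a very deep thread may call for an iterative version.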
4 Use Cases (ready-to-run JSON inputs)
1. Sentiment analysis on a specific launch
{"storyIds": [41009141],"maxCommentsPerStory": 0,"includeUserProfiles": false}
All comments under a single HN thread (0 = no cap). Feed into a sentiment/toxicity model to gauge reception of a product launch or announcement.
2. Daily "what's hot in tech" feed
{"scrapeTopStories": true,"maxStories": 10,"maxCommentsPerStory": 50,"includeUserProfiles": false}
Top 10 HN stories with 50 comments each — perfect for a morning tech digest or daily Slack summary.
3. Community intelligence — commenter profiles
{"storyUrls": ["https://news.ycombinator.com/item?id=41009141"],"maxCommentsPerStory": 200,"includeUserProfiles": true}
Comments + full profile (karma, bio, account age) for each commenter — great for mapping influential HN voices or finding experts in a niche.
4. Trend monitoring — new stories
{"scrapeNewStories": true,"maxStories": 30,"maxCommentsPerStory": 20,"includeUserProfiles": false}
30 newest HN submissions with 20 comments each — useful for early-trend detection before stories hit the front page.
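As an illustration of the digest idea in use case 2, the sketch below groups comment rows by story and formats a short plain-text summary. It assumes rows shaped like the output sample (`storyTitle`, `author`, `text`); `build_digest` and its parameters are hypothetical and not part of the actor.

```python
from collections import defaultdict

def build_digest(rows, comments_per_story=3, snippet_len=80):
    """Group comment rows by story and format a short text digest."""
    by_story = defaultdict(list)
    for row in rows:
        by_story[row["storyTitle"]].append(row)

    lines = []
    for title, comments in by_story.items():
        lines.append(f"{title}: {len(comments)} comment(s)")
        for c in comments[:comments_per_story]:
            # Deleted comments may have no text, so fall back to an empty snippet.
            snippet = (c.get("text") or "")[:snippet_len]
            lines.append(f"  {c.get('author', '[unknown]')}: {snippet}")
    return "\n".join(lines)

# Hypothetical rows standing in for a dataset export.
rows = [
    {"storyTitle": "Show HN: Example", "author": "alice", "text": "Nice work"},
    {"storyTitle": "Show HN: Example", "author": "bob", "text": "How does it scale?"},
    {"storyTitle": "Ask HN: Example", "author": "carol", "text": "Following"},
]
digest = build_digest(rows, comments_per_story=1)
print(digest)
```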
Input Schema
| Field | Type | Default | Description |
|---|---|---|---|
| storyIds | integer[] | [] | HN story IDs (from URLs like ?id=41009141) |
| storyUrls | string[] | [] | Full HN story URLs |
| scrapeTopStories | boolean | false | Scrape current top stories |
| scrapeNewStories | boolean | false | Scrape newest stories |
| maxStories | integer | 10 | Max top/new stories to process |
| maxCommentsPerStory | integer | 0 | Max comments per story (0 = all) |
| includeUserProfiles | boolean | false | Fetch commenter profile data |
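One way to build inputs programmatically is a small dataclass that mirrors the schema and its defaults. This is a sketch only: the field names and defaults come from the table above, but the `ScraperInput` class and its validation rules (at least one story source, non-negative limits) are assumptions, not documented actor behavior.

```python
from dataclasses import dataclass, field, asdict

@dataclass
class ScraperInput:
    # Field names and defaults mirror the input schema table.
    storyIds: list = field(default_factory=list)
    storyUrls: list = field(default_factory=list)
    scrapeTopStories: bool = False
    scrapeNewStories: bool = False
    maxStories: int = 10
    maxCommentsPerStory: int = 0  # 0 means no cap
    includeUserProfiles: bool = False

    def to_run_input(self):
        """Return a plain dict suitable for passing as run_input."""
        # Assumption: a run without any story source has nothing to scrape.
        if not (self.storyIds or self.storyUrls
                or self.scrapeTopStories or self.scrapeNewStories):
            raise ValueError("select at least one source of stories")
        # Assumption: negative limits are never meaningful.
        if self.maxStories < 1 or self.maxCommentsPerStory < 0:
            raise ValueError("maxStories must be >= 1 and maxCommentsPerStory >= 0")
        return asdict(self)

run_input = ScraperInput(storyIds=[41009141]).to_run_input()
print(run_input)
```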
Output (sample — one comment row)
{"commentId": 41010231,"storyId": 41009141,"storyTitle": "Show HN: My new thing","storyUrl": "https://news.ycombinator.com/item?id=41009141","storyPoints": 312,"parentId": 41009141,"author": "pg","text": "Congrats — this is great. <p>One suggestion...","time": 1720123456,"timeISO": "2024-07-04T20:04:16Z","depth": 1,"kids": [41010345, 41010512],"userProfile": {"karma": 155342,"about": "Founder of YC. Writes essays at paulgraham.com.","created": 1160418633}}
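Two fields in a row usually need light processing before analysis: `time` is a Unix timestamp in seconds, and `text` contains HTML (paragraph tags and escaped entities). The sketch below shows one way to normalize both with the standard library. It assumes `timeISO` is simply the UTC rendering of `time`; the helper names and the sample row are hypothetical.

```python
import html
import re
from datetime import datetime, timezone

def to_iso(unix_seconds):
    """Render a Unix timestamp (seconds) as an ISO 8601 UTC string."""
    moment = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")

def to_plain_text(comment_html):
    """Turn comment HTML into plain text: <p> becomes a blank line, other tags are dropped."""
    text = comment_html.replace("<p>", "\n\n")
    text = re.sub(r"<[^>]+>", "", text)
    # Decode entities such as &amp; last, so decoded "<" is not mistaken for a tag.
    return html.unescape(text)

# Hypothetical row fragment for illustration.
row = {"time": 1700000000, "text": "Congrats. <p>One &amp; only <i>suggestion</i>..."}
print(to_iso(row["time"]))          # 2023-11-14T22:13:20Z
print(to_plain_text(row["text"]))
```

A regex tag stripper is adequate for the small tag set seen in comments here, but a real HTML parser is the safer choice for untrusted markup.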
Pricing & Performance
- Pay-per-event: $0.003 per comment (cheaper than $0.005 — HN comments are lightweight)
- Typical cost: $0.03 for 10 comments, $0.30 for 100, $3 for 1,000
- Speed: ~50–80 comments/second (HN Firebase API is fast; 50 ms polite pacing)
- Free Apify tier: $5/month credit ≈ 1,666 comments/month ($5 ÷ $0.003)
HN itself is free to browse — you pay only for structured extraction, thread assembly, and profile enrichment.
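The pricing arithmetic above can be checked with a few lines of code. This sketch assumes the per-comment price is the only charge (no separate fee for profiles or platform usage, which the text does not specify) and uses `Decimal` to avoid floating-point rounding; the function names are illustrative.

```python
from decimal import Decimal

PRICE_PER_COMMENT = Decimal("0.003")  # USD per comment, from the pricing list

def estimate_cost(comment_count):
    """Cost in USD for a given number of comments."""
    return PRICE_PER_COMMENT * comment_count

def comments_within_budget(budget_usd):
    """Largest whole number of comments a budget covers."""
    return int(Decimal(str(budget_usd)) // PRICE_PER_COMMENT)

print(estimate_cost(1000))         # 3.000
print(comments_within_budget(5))   # 1666
```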
Integrations
- Zapier / Make / n8n — new comment on a keyword → Slack / Discord / Notion
- LangChain / LlamaIndex — RAG over tech discussions and expert commentary
- Vector DBs (Pinecone / Weaviate / Qdrant) — embed comments for semantic tech-discussion search
- Neo4j / Graphiti — commenter → story → topic graph for community analytics
- Sentiment / toxicity APIs — pipe `text` into Perspective API, OpenAI Moderations, or a local classifier
- Analytics DBs (BigQuery / Snowflake / DuckDB) — HN comments as columnar tables for trend analysis
- Python SDK
from apify_client import ApifyClient

client = ApifyClient("<APIFY_TOKEN>")

# Start the actor and wait for the run to finish.
run = client.actor("Helpermara/hackernews-comments-scraper").call(run_input={
    "scrapeTopStories": True,
    "maxStories": 5,
    "maxCommentsPerStory": 100,
    "includeUserProfiles": False,
})

# Deleted comments may lack author or text, so read both defensively.
for c in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(c["storyTitle"], "—", c.get("author"), ":", (c.get("text") or "")[:80])
FAQ
Do I need an HN API key? No. The HN Firebase API (hacker-news.firebaseio.com/v0) is public and unauthenticated.
How fresh is the data? Real-time. HN publishes updates to Firebase within seconds of user actions — you always get the current thread state.
How do I find a story ID? From the URL: news.ycombinator.com/item?id=41009141 → ID is 41009141. Pass in storyIds or use storyUrls directly.
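If story links arrive in mixed forms, the ID can be pulled out with the standard library before filling `storyIds`. A minimal sketch, where `story_id_from_url` is a hypothetical helper and the strict host and path check is an assumption about which URLs count as story links:

```python
from urllib.parse import urlparse, parse_qs

def story_id_from_url(url):
    """Extract the numeric story ID from a news.ycombinator.com item URL."""
    # Accept URLs written without a scheme, e.g. news.ycombinator.com/item?id=1
    parsed = urlparse(url if "://" in url else "https://" + url)
    if parsed.netloc != "news.ycombinator.com" or parsed.path != "/item":
        raise ValueError(f"not an HN item URL: {url}")
    ids = parse_qs(parsed.query).get("id", [])
    if len(ids) != 1 or not ids[0].isdigit():
        raise ValueError(f"missing or invalid id parameter: {url}")
    return int(ids[0])

print(story_id_from_url("https://news.ycombinator.com/item?id=41009141"))  # 41009141
```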
Does it handle deleted / dead comments? Yes — the API flags them (dead, deleted). The actor returns them with the flag so you can filter downstream.
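Filtering downstream can be as small as the sketch below. It assumes the output rows expose the flags under the same names the API uses (`dead`, `deleted`) and that the keys are absent or false for normal comments; the output sample does not show them, so verify against a real run.

```python
def is_visible(row):
    """True for comments that are neither dead nor deleted."""
    return not (row.get("dead") or row.get("deleted"))

# Hypothetical rows for illustration.
rows = [
    {"commentId": 1, "text": "Great launch"},
    {"commentId": 2, "deleted": True},
    {"commentId": 3, "text": "spam", "dead": True},
]
visible = [r for r in rows if is_visible(r)]
print([r["commentId"] for r in visible])  # [1]
```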
What about Ask HN / Show HN threads? Those are regular HN stories — scrape them the same way. The storyTitle lets you filter by prefix (Ask HN:, Show HN:).
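For example, a prefix check on `storyTitle` is enough to bucket rows by thread type. The `story_kind` helper and its labels are hypothetical:

```python
def story_kind(title):
    """Classify a story title as 'ask', 'show', or a regular 'story'."""
    for prefix, kind in (("Ask HN:", "ask"), ("Show HN:", "show")):
        if title.startswith(prefix):
            return kind
    return "story"

print(story_kind("Show HN: My new thing"))  # show
```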
Differences vs the main hackernews-scraper actor? This one focuses on comments (nested threads, profiles). The other focuses on stories (titles, URLs, points, ranking). Use together for full HN coverage.
Rate limits? The Firebase API is generous and CDN-backed. The actor paces at 50 ms between requests to stay polite and fast.
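To make the pacing idea concrete, here is a sketch of a breadth-first thread walk that pauses 50 ms before each request. It is an illustration under assumptions, not the actor's actual implementation: it fetches items one at a time from the public `/v0/item/<id>.json` endpoint, and the `walk_thread` helper, its parameters, and the row shape are hypothetical. The fetch and sleep functions are injectable so the logic can be exercised offline.

```python
import json
import time
from collections import deque
from urllib.request import urlopen

API_BASE = "https://hacker-news.firebaseio.com/v0"

def fetch_item(item_id):
    """Fetch one item (story or comment) from the public HN API."""
    with urlopen(f"{API_BASE}/item/{item_id}.json", timeout=10) as resp:
        return json.load(resp)

def walk_thread(story_id, fetch=fetch_item, delay_s=0.05, sleep=time.sleep):
    """Breadth-first walk over a story's comments, pausing before each request."""
    story = fetch(story_id) or {}
    # Top-level comments sit at depth 1, matching the output sample.
    queue = deque((kid, 1) for kid in story.get("kids", []))
    rows = []
    while queue:
        item_id, depth = queue.popleft()
        sleep(delay_s)  # polite pacing between requests
        item = fetch(item_id)
        if item is None:  # the API returns null for unknown IDs
            continue
        rows.append({
            "commentId": item["id"],
            "parentId": item.get("parent"),
            "author": item.get("by"),
            "depth": depth,
        })
        queue.extend((kid, depth + 1) for kid in item.get("kids", []))
    return rows

# Offline demo with an in-memory stand-in for the API.
fake_items = {
    1: {"id": 1, "kids": [2, 3]},
    2: {"id": 2, "parent": 1, "by": "alice", "kids": [4]},
    3: {"id": 3, "parent": 1, "by": "bob"},
    4: {"id": 4, "parent": 2, "by": "carol"},
}
demo_rows = walk_thread(1, fetch=fake_items.get, sleep=lambda s: None)
print([(r["commentId"], r["depth"]) for r in demo_rows])  # [(2, 1), (3, 1), (4, 2)]
```

Calling `walk_thread(41009141)` with the defaults would hit the live API; a strictly sequential loop like this one tops out at roughly 20 requests per second at a 50 ms delay.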
Keywords
hacker news scraper, hn scraper, hn comments, hacker news api, hn threads, tech discussions, yc news, ycombinator scraper, startup intelligence, launch feedback, tech sentiment, developer community, hn users, hn karma, nested comments, firebase hn api, tech trends, show hn, ask hn, tech discussion mining
Companions (cross-promo)
- hacker-news-scraper — HN stories (titles, URLs, points)
- lobsters-scraper — Lobsters posts (tech discussion)
- producthunt-scraper — daily tech launches
- devto-article-scraper — DEV.to articles
Changelog
- 2026-04-24 — Extended README with use cases, integrations, and FAQ
- 2026-03 — Initial release: story IDs / URLs / top / new stories + optional user profiles