Hacker News Scraper — Stories, Comments & Jobs
Pricing
$4.99/month + usage
Hacker News scraper 2026: extract posts, comments, karma, and user profiles without an API key. Pay-per-result pricing. Returns structured JSON. Suited to tech trend monitoring, brand mentions, and developer community research.
Developer
Web Data Labs
Hacker News Scraper — Extract Stories, Comments & User Profiles from HN
Scrape Hacker News stories, comments, and user profiles. Browse top, new, best, Ask HN, Show HN, and job stories. Search by keyword with date filtering. Get full comment trees. Fetch user profiles with karma and activity data.
Why Use This Hacker News Scraper?
Hacker News is the internet's most influential tech community — where startups launch, trends emerge, and technical discussions shape the industry. While HN has a public API, building a reliable scraper around it requires handling concurrency, pagination, rate limits, and data formatting.
This actor handles all of that for you:
- All HN categories — Top, New, Best, Ask HN, Show HN, Jobs
- Full-text search with date range filtering (via Algolia)
- Comment trees — fetch full discussion threads with nested replies
- User profiles — karma, account age, submission count
- Filters — minimum score, minimum comments, keyword matching
- Concurrent fetching — fast parallel requests for large datasets
- Clean JSON output — structured, consistent, ready to use
Use Cases
1. Tech Trend Monitoring
Track emerging technologies, frameworks, and tools by monitoring what gets upvoted on HN. Spot trends weeks before they hit mainstream tech media.
2. Startup & Product Launch Tracking
Monitor "Show HN" and "Launch HN" posts to discover new startups and products the moment they're announced to the tech community.
3. Competitive Intelligence
Search for mentions of your competitors, their products, or your market category. Analyze how the tech community perceives them.
4. Content Research & Curation
Find the most discussed and upvoted content in your niche. Build newsletters, blog post ideas, or social media content based on what resonates with the HN audience.
5. Recruiting & Talent Intelligence
Monitor "Who is Hiring?" threads and job posts. Analyze hiring trends, popular technologies in job listings, and salary discussions.
6. Academic Research
Build datasets of tech community discussions for research on information diffusion, opinion formation, or technology adoption patterns.
7. Investment & Market Analysis
Track discussions about specific companies, funding rounds, or market segments. HN comments often contain insider perspectives and technical due diligence.
8. Developer Relations & DevTool Marketing
Understand how developers discuss tools and frameworks. Find pain points, feature requests, and sentiment about your developer product.
9. Technical Due Diligence
Research specific technologies or companies by searching HN discussions. Get candid technical opinions from experienced practitioners.
Input Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| category | string | No | top | Story category: top, new, best, ask, show, jobs, search |
| searchQuery | string | No | — | Search query (only when category is search) |
| sortBy | string | No | relevance | Search sort: relevance or date |
| since | string | No | — | Start date filter for search (YYYY-MM-DD) |
| until | string | No | — | End date filter for search (YYYY-MM-DD) |
| storyIds | array | No | — | Fetch specific stories by HN ID (overrides category) |
| maxItems | integer | No | 100 | Maximum stories to return (1–5000) |
| minScore | integer | No | 0 | Minimum upvote score filter |
| minComments | integer | No | 0 | Minimum comment count filter |
| keyword | string | No | — | Filter stories by title keyword (case-insensitive) |
| includeComments | boolean | No | false | Fetch full comment trees for each story |
| maxCommentsPerStory | integer | No | 50 | Max comments per story (1–1000) |
| scrapeType | string | No | stories | Output type: stories, users, or both |
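The ranges and allowed values in the table can be checked before a run is started. The following is a minimal sketch, not part of the actor: `build_run_input` is a hypothetical helper name, and the checks only mirror what the table states.

```python
from typing import Any

# Allowed values and ranges copied from the parameter table above.
CATEGORIES = {"top", "new", "best", "ask", "show", "jobs", "search"}
SCRAPE_TYPES = {"stories", "users", "both"}


def build_run_input(**params: Any) -> dict[str, Any]:
    """Hypothetical helper: validate a run input client-side and return it as a dict."""
    if params.get("category", "top") not in CATEGORIES:
        raise ValueError(f"unknown category: {params['category']!r}")
    if not 1 <= params.get("maxItems", 100) <= 5000:
        raise ValueError("maxItems must be between 1 and 5000")
    if not 1 <= params.get("maxCommentsPerStory", 50) <= 1000:
        raise ValueError("maxCommentsPerStory must be between 1 and 1000")
    if params.get("scrapeType", "stories") not in SCRAPE_TYPES:
        raise ValueError("scrapeType must be stories, users, or both")
    return dict(params)


run_input = build_run_input(category="top", maxItems=30, minScore=100)
```

Catching an out-of-range value locally avoids starting a run that the platform would reject or clamp.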
Example Input
Get Top Stories
{"category": "top","maxItems": 30,"minScore": 100}
Search with Date Range
{"category": "search","searchQuery": "AI agents","sortBy": "date","since": "2026-03-01","until": "2026-03-31","maxItems": 200}
Get Story with Comments
{"storyIds": ["39876543"],"includeComments": true,"maxCommentsPerStory": 100}
Show HN Posts This Week
{"category": "show","maxItems": 50,"minScore": 10}
Fetch User Profiles
{"category": "top","maxItems": 100,"scrapeType": "users"}
Sample Output
Story
{"id": 39876543, "title": "Show HN: I built an open-source alternative to Notion", "url": "https://example.com/my-project", "domain": "example.com", "text": null, "author": "builder_dev", "score": 847, "commentCount": 234, "createdAt": "2026-03-15T10:30:00.000Z", "createdAtUnix": 1773570600, "storyType": "show", "hnUrl": "https://news.ycombinator.com/item?id=39876543", "dead": false, "kids": [39876600, 39876601, 39876602]}
Comment
{"id": 39876600,"text": "This looks great. How does it handle real-time collaboration?","author": "curious_dev","parentId": 39876543,"storyId": 39876543,"createdAt": "2026-03-15T10:45:00.000Z","depth": 0,"kids": [39876610, 39876611]}
User Profile
{"username": "builder_dev","karma": 12847,"about": "Building tools for developers. Previously at BigCorp.","createdAt": "2019-06-15T00:00:00.000Z","createdAtUnix": 1560556800,"submittedCount": 342,"profileUrl": "https://news.ycombinator.com/user?id=builder_dev"}
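Stories and profiles carry the creation time twice, as `createdAt` (ISO 8601) and `createdAtUnix` (seconds). A small sketch of converting between the two and checking that they agree is shown here; the helper names are hypothetical, and the millisecond format is assumed from the samples above.

```python
from datetime import datetime, timezone


def unix_to_iso(created_at_unix: int) -> str:
    """Format a Unix timestamp (seconds, UTC) in the ISO style used by the samples."""
    moment = datetime.fromtimestamp(created_at_unix, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%S.000Z")


def timestamps_agree(item: dict) -> bool:
    """Return True when createdAt and createdAtUnix describe the same instant."""
    return unix_to_iso(item["createdAtUnix"]) == item["createdAt"]


# Values taken from the user profile sample above.
profile = {"createdAt": "2019-06-15T00:00:00.000Z", "createdAtUnix": 1560556800}
print(timestamps_agree(profile))
```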
Integration Examples
Python
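from apify_client import ApifyClient

# Authenticate with your Apify API token
client = ApifyClient("YOUR_API_TOKEN")

# Search for recent, well-upvoted stories about machine learning
run_input = {
    "category": "search",
    "searchQuery": "machine learning",
    "sortBy": "date",
    "since": "2026-03-01",
    "maxItems": 100,
    "minScore": 50,
}

# Start the actor and wait for the run to finish
run = client.actor("cryptosignals/hackernews-scraper").call(run_input=run_input)

# Print each story from the run's default dataset
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"[{item['score']}] {item['title']} — {item.get('domain', 'self')}")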
Node.js
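import { ApifyClient } from 'apify-client';

// Authenticate with your Apify API token
const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Top stories with at least 100 points, including up to 20 comments each
const input = {
    category: "top",
    maxItems: 50,
    minScore: 100,
    includeComments: true,
    maxCommentsPerStory: 20
};

// Start the actor and wait for the run to finish
const run = await client.actor("cryptosignals/hackernews-scraper").call(input);

// Read the results; only stories have a title, so comments are skipped
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach(item => {
    if (item.title) {
        console.log(`[${item.score}] ${item.title} (${item.commentCount} comments)`);
    }
});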
Automated Daily Digest
Build a daily HN digest delivered to your inbox or Slack:
- Go to the actor page and click Schedule
- Set it to run daily at your preferred time
- Configure: category: "top", maxItems: 20, minScore: 200
- Add a webhook to send results to Slack, email, or your CMS
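On the receiving end of the webhook, the dataset items still need to be turned into a readable message. A minimal sketch of that step follows; `format_digest` is a hypothetical helper, and it relies only on the `title`, `score`, and `hnUrl` fields shown in the sample output.

```python
def format_digest(items: list[dict], limit: int = 20) -> str:
    """Render story items as a plain-text digest, highest score first."""
    # Comments have no title, so this keeps stories only.
    stories = [item for item in items if item.get("title")]
    stories.sort(key=lambda item: item.get("score", 0), reverse=True)
    lines = [
        f"{rank}. [{story.get('score', 0)}] {story['title']} ({story.get('hnUrl', '')})"
        for rank, story in enumerate(stories[:limit], start=1)
    ]
    return "\n".join(lines)
```

The returned string can be posted as the message body to Slack or pasted into an email template.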
Pricing & Cost Estimates
This actor uses Apify's pay-per-event (PPE) pricing — pay only for what you scrape.
| Use Case | Items | Estimated Cost |
|---|---|---|
| Daily top stories | 30 stories | ~$0.30 |
| Weekly trend search | 200 stories | ~$2.00 |
| Deep research with comments | 50 stories + comments | ~$3.00 |
| User profile batch | 100 users | ~$1.00 |
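Free tier: Apify provides $5 in free monthly credits. At the estimates above (~$0.30 per 30-story run), that covers about 16 such runs per month, so running daily at that size would exceed the free credits; lower maxItems or run less often to stay within them.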
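The estimates in the table can be turned into a rough budget check. This sketch assumes a flat rate of about one cent per story or profile, which is only inferred from the table rows; it ignores the monthly fee and any extra charge for comments, and the function names are hypothetical.

```python
# Assumption: roughly 1 cent per story or profile, inferred from the table
# (30 stories for ~$0.30, 200 stories for ~$2.00, 100 users for ~$1.00).
ASSUMED_CENTS_PER_ITEM = 1
FREE_MONTHLY_CREDITS_CENTS = 500  # the $5 of free monthly credits


def estimated_run_cost_cents(items: int) -> int:
    """Estimated cost of one run, in cents, at the assumed flat rate."""
    return items * ASSUMED_CENTS_PER_ITEM


def runs_covered_by_free_credits(items_per_run: int) -> int:
    """How many runs of a given size fit into the free monthly credits."""
    return FREE_MONTHLY_CREDITS_CENTS // estimated_run_cost_cents(items_per_run)


print(runs_covered_by_free_credits(30))  # 500 // 30 -> 16
```

Working in whole cents keeps the arithmetic exact; check the actor's pricing tab for the actual per-event rates before relying on these numbers.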
Tips to minimize costs:
- Use minScore and minComments filters to skip low-value stories
- Set includeComments to false if you only need story metadata
- Limit maxCommentsPerStory to reduce data volume
Frequently Asked Questions
Does this scraper require a Hacker News account?
No. All data is fetched from HN's public Firebase API and Algolia search API. No authentication needed.
How is search different from category browsing?
Category browsing (top, new, best, etc.) fetches HN's official ranked lists. Search mode uses Algolia's full-text index, which supports keyword queries and date range filtering.
Can I get historical data?
Yes. Use search mode with since and until date parameters to fetch stories from any time period. Algolia indexes HN stories going back years.
How do I fetch comments for a specific story?
Provide the story ID in the storyIds array and set includeComments to true. Comments are returned as a flat list with depth and parentId fields to reconstruct the tree.
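Rebuilding the thread from the flat list can be done by grouping comments under their `parentId`. The following is a minimal sketch; `build_comment_tree` and the added `replies` key are hypothetical names, while `id` and `parentId` come from the sample comment output.

```python
from collections import defaultdict


def build_comment_tree(story_id: int, comments: list[dict]) -> list[dict]:
    """Nest a flat comment list under a story using the parentId field."""
    children: dict[int, list[dict]] = defaultdict(list)
    for comment in comments:
        children[comment["parentId"]].append(comment)

    def attach(parent_id: int) -> list[dict]:
        # Copy each comment and add its replies under a new "replies" key.
        return [
            {**comment, "replies": attach(comment["id"])}
            for comment in children.get(parent_id, [])
        ]

    return attach(story_id)
```

Top-level comments are those whose `parentId` equals the story ID; each nested level corresponds to one step of `depth`.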
What's the difference between stories, users, and both scrape types?
- stories returns story/post data only (default)
- users fetches profile data for the authors of matched stories
- both returns stories AND their authors' profiles in the same dataset
Can I filter stories by domain or source?
Not directly as an input parameter, but you can use the keyword filter on titles, or post-process results by filtering on the domain field in the output.
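Post-processing by domain takes only a few lines once the items are loaded. This is a sketch with a hypothetical helper name, `filter_by_domain`; it uses the `domain` field from the sample story output and skips items where that field is missing or null.

```python
def filter_by_domain(items: list[dict], domains: set[str]) -> list[dict]:
    """Keep only items whose domain field matches one of the given domains."""
    wanted = {domain.lower() for domain in domains}
    # Self posts may have no domain, so fall back to an empty string.
    return [item for item in items if (item.get("domain") or "").lower() in wanted]
```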
Is there a maximum number of stories I can fetch?
Up to 5,000 stories per run. For category browsing, HN's API caps each list: up to 500 story IDs for top, new, and best, and up to 200 for Ask HN, Show HN, and jobs. For search, Algolia supports deep pagination.
Can I export data as CSV?
Yes. All results are stored in Apify datasets, which support one-click export to JSON, CSV, Excel, XML, and RSS formats.
How fast is this scraper?
It uses concurrent fetching (20 parallel requests) for story details. A typical run of 100 stories completes in under 30 seconds. Adding comments increases runtime proportionally.
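The general pattern of fetching item details with a bounded number of parallel requests can be sketched with a thread pool. This illustrates the technique rather than the actor's own implementation; `fetch_all` and `fetch_item` are hypothetical names, and the item URL is the public HN Firebase endpoint.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable
from urllib.request import urlopen


def fetch_item(item_id: int) -> dict:
    """Fetch one item from the public HN Firebase API."""
    url = f"https://hacker-news.firebaseio.com/v0/item/{item_id}.json"
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def fetch_all(ids: Iterable[int], fetch_one: Callable[[int], dict],
              max_workers: int = 20) -> list[dict]:
    """Fetch many items with at most max_workers requests in flight."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as the input IDs.
        return list(pool.map(fetch_one, ids))
```

Calling `fetch_all(story_ids, fetch_item)` keeps the result order aligned with the ID list; passing the fetch function as an argument makes it easy to swap in a stub for testing.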
Can I use this for real-time monitoring?
Yes. Schedule the actor to run every hour (or more frequently) with specific search terms or categories. Combine with webhooks to get instant notifications.
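With hourly runs, most results repeat from one run to the next, so notifications usually need a filter for items not seen before. A minimal sketch of that step follows; `split_new` is a hypothetical helper, and how the set of seen IDs is persisted between runs is left to the caller.

```python
def split_new(items: list[dict], seen_ids: set[int]) -> tuple[list[dict], set[int]]:
    """Return the items not seen before, plus the updated set of seen IDs."""
    fresh = [item for item in items if item["id"] not in seen_ids]
    return fresh, seen_ids | {item["id"] for item in fresh}


# First run: everything is new. Second run: only the unseen story is reported.
fresh, seen = split_new([{"id": 1}, {"id": 2}], set())
fresh_again, seen = split_new([{"id": 2}, {"id": 3}], seen)
```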
Support & Updates
This actor is actively maintained and updated. Need help?
- Report issues via the Issues tab on this actor's page
- Request features by leaving a comment
- Star this actor to help others find it
Built and maintained by cryptosignals on Apify.
⭐ Found this useful?
If this actor saved you time, please leave a review on the Apify Store — it takes 30 seconds and helps other developers find it.
Questions or issues? Drop a comment below and I'll respond within 24 hours.