Hacker News Scraper

Pricing: Pay per event

Scrape Hacker News stories from any section: front page, newest, Ask HN, Show HN, jobs, and best. Extracts titles, URLs, points, authors, comment counts, and post ages. Great for tech trend detection, content curation, and newsletter creation. Export to JSON, CSV, or Excel.
Developer: Stas Persiianenko
Scrape stories from Hacker News. Get titles, points, authors, comment counts, and links from the front page, newest stories, Ask HN, Show HN, jobs, and best stories.
What does Hacker News Scraper do?
Hacker News Scraper extracts story data from any section of Hacker News. It collects the title, URL, domain, points (upvotes), author, age, comment count, and a direct link to the comments page.
It supports all major HN sections and paginated results.
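The supported sections correspond to Hacker News's public listing pages. As a rough sketch of how the section values map onto URLs (the mapping below reflects HN's public paths and its `p` pagination parameter, not necessarily the scraper's internal implementation):

```python
# Map each supported `section` value to its Hacker News listing path.
# HN paginates with the `p` query parameter (p=1 is the first page).
HN_BASE = "https://news.ycombinator.com"

SECTION_PATHS = {
    "front": "news",
    "newest": "newest",
    "ask": "ask",
    "show": "show",
    "jobs": "jobs",
    "best": "best",
}

def listing_url(section: str, page: int = 1) -> str:
    """Build the URL for a given section and 1-based page number."""
    path = SECTION_PATHS[section]  # raises KeyError for unknown sections
    return f"{HN_BASE}/{path}?p={page}"
```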
Why scrape Hacker News?
Hacker News is the most influential tech community on the web, with millions of developers, founders, and investors reading it daily. Stories that trend on HN can drive massive traffic and shape industry conversations.
Key reasons to scrape it:
- Trend detection — Spot emerging technologies and hot topics in tech
- Content curation — Build newsletters, dashboards, or feeds from HN
- Sentiment analysis — Track community interest via points and comments
- Competitive intelligence — Monitor when competitors or products are discussed
- Research — Study tech community behavior and content patterns
- AI training data — Build structured datasets of tech discussions for LLM training, RAG pipelines, or AI agent knowledge bases
Use cases
- Newsletter creators curating the top tech stories weekly
- Startup founders monitoring mentions of their products
- Data scientists building tech trend datasets
- Developers creating custom HN dashboards and alerts
- Investors tracking what technologies the community is excited about
- Researchers studying tech community dynamics
- AI/ML engineers building training datasets or RAG pipelines with curated tech discussions
How to scrape Hacker News
- Go to Hacker News Scraper on Apify Store
- Choose a section (front page, newest, Ask HN, Show HN, jobs, or best)
- Set the number of pages and max stories
- Click Start and wait for results
- Download data as JSON, CSV, or Excel
Input parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| section | string | "front" | HN section: front, newest, ask, show, jobs, best |
| maxPages | integer | 3 | Max pages to scrape (30 stories per page) |
| maxStories | integer | 100 | Max total stories to extract |
Input example
```json
{
  "section": "front",
  "maxPages": 2,
  "maxStories": 60
}
```
Output
Each story in the dataset contains:
| Field | Type | Description |
|---|---|---|
| rank | number | Position on the page |
| id | number | Hacker News story ID |
| title | string | Story title |
| url | string | Link to the article |
| domain | string | Source domain (e.g., "github.com") |
| points | number | Upvote count |
| author | string | Username who submitted the story |
| age | string | Time since posting (e.g., "4 hours ago") |
| commentCount | number | Number of comments |
| commentsUrl | string | Direct link to HN comments page |
| scrapedAt | string | ISO timestamp of extraction |
Output example
```json
{
  "rank": 1,
  "id": 47225130,
  "title": "The workers behind Meta's smart glasses can see everything",
  "url": "https://www.svd.se/a/K8nrV4/metas-ai-smart-glasses-and-data-privacy-concerns",
  "domain": "svd.se",
  "points": 619,
  "author": "sandbach",
  "age": "4 hours ago",
  "commentCount": 343,
  "commentsUrl": "https://news.ycombinator.com/item?id=47225130",
  "scrapedAt": "2026-03-03T02:54:49.013Z"
}
```
How much does it cost to scrape Hacker News?
Hacker News Scraper uses pay-per-event pricing:
| Event | Price |
|---|---|
| Run started | $0.001 |
| Story extracted | $0.001 per story |
Cost examples
| Scenario | Stories | Cost |
|---|---|---|
| Front page (1 page) | 30 | $0.031 |
| Front page (3 pages) | 90 | $0.091 |
| Multiple sections | 150 | $0.151 |
Platform costs are negligible — typically under $0.001 per run.
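The pricing table reduces to a simple formula: one run-started event plus one event per extracted story. A quick estimator, with prices hardcoded from the table above (check the actor page for current rates):

```python
RUN_STARTED_USD = 0.001  # charged once per run
PER_STORY_USD = 0.001    # charged per extracted story

def estimate_cost(stories: int) -> float:
    """Estimated pay-per-event cost in USD for a single run."""
    return round(RUN_STARTED_USD + stories * PER_STORY_USD, 3)
```

For example, a three-page front-page run (90 stories) comes out to $0.091, matching the cost table.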
Using Hacker News Scraper with the Apify API
Node.js
```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('automation-lab/hackernews-scraper').call({
    section: 'front',
    maxPages: 1,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(`Found ${items.length} stories`);
items.forEach((story) => {
    console.log(`#${story.rank} ${story.title} (${story.points} pts, ${story.commentCount} comments)`);
});
```
Python
```python
from apify_client import ApifyClient

client = ApifyClient('YOUR_API_TOKEN')

run = client.actor('automation-lab/hackernews-scraper').call(run_input={
    'section': 'front',
    'maxPages': 1,
})

items = client.dataset(run['defaultDatasetId']).list_items().items
print(f'Found {len(items)} stories')
for story in items:
    print(f"#{story['rank']} {story['title']} ({story['points']} pts, {story['commentCount']} comments)")
```
Use with Claude AI (MCP)
This actor is available as a tool in Claude AI through the Model Context Protocol (MCP). Add it to Claude Desktop, Cursor, Windsurf, or any MCP-compatible client.
Setup for Claude Code
```shell
claude mcp add --transport http apify "https://mcp.apify.com"
```
Setup for Claude Desktop, Cursor, or VS Code
Add this to your MCP config file:
```json
{
  "mcpServers": {
    "apify": {
      "url": "https://mcp.apify.com"
    }
  }
}
```
Example prompts
- "What are the top stories on Hacker News right now?"
- "Get this week's best Hacker News stories about AI and machine learning"
- "Show me the latest Ask HN posts and summarize the most discussed ones"
Learn more in the Apify MCP documentation.
Integrations
Hacker News Scraper works with all Apify integrations:
- Scheduled runs — Scrape HN every hour or daily for automated monitoring
- Webhooks — Get notified when a scrape completes
- API — Trigger runs and fetch results programmatically
- Google Sheets — Export stories to a spreadsheet for analysis
- Slack — Post top stories to your team's Slack channel
Connect to Zapier, Make, or Google Sheets for automated workflows.
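As an illustration of the Slack integration, here is a hedged sketch that formats scraped stories into a payload for a Slack incoming webhook. The webhook URL is a placeholder you would create in your Slack workspace, and the choice of fields is an assumption based on the output table above:

```python
import json
import urllib.request

def build_slack_payload(stories, limit=5):
    """Format the top N stories as a Slack message using mrkdwn links."""
    lines = [
        f"<{s['url']}|{s['title']}> ({s['points']} pts, {s['commentCount']} comments)"
        for s in stories[:limit]
    ]
    return {"text": "*Top Hacker News stories*\n" + "\n".join(lines)}

def post_to_slack(webhook_url, payload):
    """POST the payload to a Slack incoming webhook (placeholder URL)."""
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

In a scheduled setup, you would fetch the dataset items as in the API examples above, then call `post_to_slack("https://hooks.slack.com/services/...", build_slack_payload(items))`.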
Tips
- Use "best" section to get all-time popular stories
- Schedule hourly runs to track how stories rise and fall on the front page
- Compare points and comments to gauge engagement levels
- Filter by domain in post-processing to find stories from specific sources
- The "newest" section is great for catching stories before they trend
- Job stories (section: jobs) have no points or comments — they're job postings
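The domain-filtering and engagement tips above take only a few lines of post-processing on the dataset items. A sketch, using the field names from the output table (the comments-per-point ratio as an engagement proxy is one heuristic, not the actor's own metric):

```python
def filter_by_domain(stories, domains):
    """Keep only stories whose source domain is in `domains`."""
    wanted = set(domains)
    return [s for s in stories if s.get("domain") in wanted]

def by_engagement(stories):
    """Sort by comments-per-point ratio, a rough proxy for discussion intensity."""
    return sorted(
        stories,
        key=lambda s: s["commentCount"] / max(s["points"], 1),
        reverse=True,
    )
```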
Legality
Scraping publicly available data is generally considered legal in the US, as affirmed by the Ninth Circuit Court of Appeals in hiQ Labs v. LinkedIn. This actor only accesses publicly available information and does not require authentication. Always review and comply with the target website's Terms of Service before scraping. When handling personal data, ensure compliance with GDPR, CCPA, and other applicable privacy regulations.
FAQ
How many stories does each page have?
Each HN page shows 30 stories. With maxPages: 10, you can get up to 300 stories per section.
Can I search for specific topics?
HN doesn't have built-in search on its main pages. For topic search, consider using the Algolia HN Search API separately.
How often does HN update?
The front page changes continuously. Running every 15-60 minutes captures most significant changes.
What about job stories?
Job postings (section: jobs) don't have points or comments. The author field will be the company posting the job.
Is it legal to scrape Hacker News?
Hacker News is publicly accessible and does not require authentication. This scraper collects only publicly visible story metadata (titles, points, authors, links). It does not bypass any login walls or access private data. Always check the Hacker News guidelines and respect reasonable rate limits.
Some stories have 0 points and no author — is this a bug?
No. Job postings on HN (section: jobs) don't have points, authors, or comments. This is normal HN behavior.
Why does the scraper return duplicate stories across runs?
HN stories stay on the front page for hours. If you run the scraper frequently, overlap is expected. Deduplicate by the id field in post-processing.
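Deduplication across runs is a one-pass job keyed on the id field. A minimal sketch (in a scheduled setup, the `seen` set could equally be a database table or key-value store that persists between runs):

```python
def dedupe_by_id(stories):
    """Drop repeat stories, keeping the first occurrence of each HN id."""
    seen = set()
    unique = []
    for story in stories:
        if story["id"] not in seen:
            seen.add(story["id"])
            unique.append(story)
    return unique
```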
Other developer tools
- GitHub Scraper — Repositories, profiles, trending, and search results from GitHub
- GitHub Trending Scraper — Trending repositories from GitHub with star velocity
- Homebrew Scraper — Homebrew formulas and casks with install counts
- Stack Overflow Scraper — Questions, answers, and tags from Stack Overflow
- npm Scraper — Package metadata from the npm registry
- PyPI Scraper — Python package data from PyPI
- Crates Scraper — Rust crate metadata from crates.io
Other news scrapers
- Dev.to Scraper — Articles and posts from Dev.to
- TechCrunch Scraper — Articles from TechCrunch
- Lobsters Scraper — Stories from Lobsters
- Substack Scraper — Newsletter posts and comments from Substack