Hacker News Data Scraper
Pricing
Pay per usage
Unlock the pulse of the tech world by scraping Hacker News effortlessly. Extract top stories, comments, and jobs from Y Combinator's platform. Perfect for market research, sentiment analysis, and staying ahead of startup trends with fast, structured data.
Developer

Shahid Irfan
Extract comprehensive data from Hacker News using the official API. Collect stories and job postings from the top, new, best, Ask HN, Show HN, and job lists, each with its score, comment count, and comment IDs. Suited to monitoring trends, analyzing community engagement, and building datasets for research.
Features
- Complete Story Data — Extract titles, scores, comments, and metadata
- Multiple Categories — Collect from top, new, best, ask, show, and job stories
- Fast API Extraction — Direct access to official Hacker News data
- Structured JSON Output — Consistent format for all data types
- Rate Limit Respect — Built-in delays for responsible data collection
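The built-in delays mentioned above could look roughly like the following. This is a minimal sketch, not the actor's actual code: the function name `fetch_with_delay`, the `fetch` callable, and the 0.1 second default are all assumptions made for illustration.

```python
import time
from typing import Any, Callable, Iterable, Iterator


def fetch_with_delay(
    ids: Iterable[int],
    fetch: Callable[[int], Any],
    delay_seconds: float = 0.1,  # assumed value, not taken from the actor
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Any]:
    """Fetch items one by one, pausing between consecutive requests."""
    for index, item_id in enumerate(ids):
        if index > 0:
            # Pause before every request except the first one.
            sleep(delay_seconds)
        yield fetch(item_id)
```

Passing `sleep` as a parameter keeps the pacing logic testable without real waiting.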
Use Cases
Community Research
Analyze trending topics and user engagement patterns on Hacker News. Understand what content resonates with the tech community and track discussion trends over time.
Job Market Intelligence
Monitor startup job postings and career opportunities. Track hiring trends across different tech companies and identify emerging roles in the industry.
Content Analysis
Build comprehensive datasets for machine learning and natural language processing. Study user behavior, content patterns, and community dynamics.
News Monitoring
Stay updated on the latest tech news and discussions. Automatically collect and analyze stories that matter to your research or business.
Input Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| storyType | String | No | topstories | Type of stories to collect: topstories, newstories, beststories, askstories, showstories, jobstories |
| results_wanted | Integer | No | 20 | Maximum number of stories to collect (1-500) |
| proxyConfiguration | Object | No | {"useApifyProxy": false} | Proxy settings (optional for HN API) |
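As a rough illustration of how the defaults and ranges in this table fit together, the sketch below normalizes an input object and maps the story type to a list endpoint. The helper names `normalize_input` and `story_list_url` are hypothetical, and the actor's real validation may differ. The base URL is the public Hacker News Firebase API.

```python
HN_API_BASE = "https://hacker-news.firebaseio.com/v0"
STORY_TYPES = (
    "topstories", "newstories", "beststories",
    "askstories", "showstories", "jobstories",
)


def normalize_input(raw: dict) -> dict:
    """Apply the documented defaults and check the documented limits."""
    story_type = raw.get("storyType", "topstories")
    if story_type not in STORY_TYPES:
        raise ValueError(f"unknown storyType: {story_type!r}")

    wanted = int(raw.get("results_wanted", 20))
    if not 1 <= wanted <= 500:
        raise ValueError("results_wanted must be between 1 and 500")

    proxy = raw.get("proxyConfiguration", {"useApifyProxy": False})
    return {
        "storyType": story_type,
        "results_wanted": wanted,
        "proxyConfiguration": proxy,
    }


def story_list_url(story_type: str) -> str:
    """Build the URL of the ID list for one story type."""
    return f"{HN_API_BASE}/{story_type}.json"
```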
Output Data
Each item in the dataset contains:
| Field | Type | Description |
|---|---|---|
| id | Integer | Unique story ID |
| type | String | Item type (story, comment, job, etc.) |
| title | String | Story title |
| by | String | Author username |
| score | Integer | Story score/upvotes |
| descendants | Integer | Number of comments |
| time | Integer | Unix timestamp |
| timestamp | String | ISO 8601 timestamp |
| url | String | Original story URL (if external) |
| text | String | Story text content (HTML format) |
| text_clean | String | Story text content (clean text format) |
| hn_url | String | Hacker News discussion URL |
| kids | Array | Comment IDs |
| deleted | Boolean | Whether the item is deleted |
| dead | Boolean | Whether the item is dead |
| parent | Integer | Parent item ID (for comments) |
| poll | Integer | Associated poll ID (for poll options) |
| parts | Array | Related poll option IDs (for polls) |
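Three of these fields (timestamp, hn_url, text_clean) can be derived from the raw API values time, id, and text. The sketch below shows one plausible way to compute them. The function name `derive_fields` and the tag-stripping approach are assumptions, and the actor may clean text differently.

```python
import html
import re
from datetime import datetime, timezone


def derive_fields(item: dict) -> dict:
    """Return a copy of a raw item with timestamp, hn_url and text_clean added."""
    out = dict(item)

    if item.get("time") is not None:
        # Unix seconds to an ISO 8601 string in UTC.
        moment = datetime.fromtimestamp(item["time"], tz=timezone.utc)
        out["timestamp"] = moment.strftime("%Y-%m-%dT%H:%M:%S.000Z")

    if item.get("id") is not None:
        out["hn_url"] = f"https://news.ycombinator.com/item?id={item['id']}"

    if item.get("text"):
        # Drop tags first, then decode entities, then collapse whitespace.
        without_tags = re.sub(r"<[^>]+>", " ", item["text"])
        out["text_clean"] = " ".join(html.unescape(without_tags).split())

    return out
```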
Usage Examples
Collect Top Stories
Extract the most popular stories from Hacker News:
{"storyType": "topstories","results_wanted": 50}
Get New Stories
Collect the latest submissions to Hacker News:
{"storyType": "newstories","results_wanted": 30}
Collect Job Postings
Gather startup job listings from the community:
{"storyType": "jobstories","results_wanted": 100}
Sample Output
{"id": 45006801,"type": "story","title": "Show HN: I built a tool to help developers write better commit messages","by": "developer123","score": 245,"descendants": 67,"time": 1735689600,"timestamp": "2025-01-01T00:00:00.000Z","url": "https://github.com/developer123/commit-helper","text": "<p>A simple tool that analyzes your commit messages and suggests improvements based on conventional commit standards.</p>","text_clean": "A simple tool that analyzes your commit messages and suggests improvements based on conventional commit standards.","hn_url": "https://news.ycombinator.com/item?id=45006801","kids": [45006802, 45006803, 45006804],"deleted": false,"dead": false,"parent": null,"poll": null,"parts": null}
Tips for Best Results
Choose Story Types Wisely
- Use topstories for trending content and popular discussions
- Select newstories for the latest submissions and fresh content
- Pick jobstories for career opportunities and hiring trends
Optimize Collection Size
- Start with small numbers (20-50) for testing and exploration
- Increase to 100-200 for comprehensive data collection
- Balance between data volume and processing time
Handle Large Datasets
- Export results to JSON or CSV for analysis
- Use filtering and sorting in your analysis tools
- Consider pagination for very large collections
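The filtering, sorting, and CSV export steps above can be sketched with the standard library alone. The helper names and the column selection are illustrative assumptions. Items are assumed to be dictionaries shaped like the output fields documented earlier.

```python
import csv
import io

COLUMNS = ["id", "type", "title", "by", "score", "descendants",
           "timestamp", "url", "hn_url"]


def top_by_score(items, min_score=0):
    """Keep items at or above min_score, highest score first."""
    kept = [i for i in items if (i.get("score") or 0) >= min_score]
    return sorted(kept, key=lambda i: i.get("score") or 0, reverse=True)


def items_to_csv(items, columns=COLUMNS):
    """Serialize selected columns to CSV text; missing values become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for item in items:
        row = {c: "" if item.get(c) is None else item.get(c) for c in columns}
        writer.writerow(row)
    return buffer.getvalue()
```

Array fields such as kids are left out of the default columns because they do not map cleanly onto a single spreadsheet cell.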
Integrations
Connect your Hacker News data with:
- Google Sheets — Export for collaborative analysis
- Airtable — Build searchable story databases
- Slack — Get notifications for trending stories
- Make — Create automated content workflows
- Zapier — Trigger actions based on story data
Export Formats
Download data in multiple formats:
- JSON — For developers and API integrations
- CSV — For spreadsheet analysis and reporting
- Excel — For business intelligence dashboards
Frequently Asked Questions
What's the difference between story types?
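topstories follows the Hacker News front-page ranking (and can include job posts), newstories lists the most recent submissions, and beststories lists the highest-voted recent stories. askstories, showstories, and jobstories return only Ask HN, Show HN, and job posts respectively.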
Can I collect comments along with stories?
The current version collects story metadata only. To get comments, request each ID in a story's kids array from the Hacker News API item endpoint with additional calls.
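A minimal sketch of that follow-up step is shown below. It walks the kids array and requests each comment from the public item endpoint of the Hacker News API. The names `fetch_item` and `iter_comments`, the depth limit, and the choice to skip deleted or dead comments are assumptions for illustration, not part of this actor.

```python
import json
from typing import Callable, Iterator, Optional
from urllib.request import urlopen

ITEM_URL = "https://hacker-news.firebaseio.com/v0/item/{id}.json"


def fetch_item(item_id: int) -> Optional[dict]:
    """Download one item; the API returns null for unknown IDs."""
    with urlopen(ITEM_URL.format(id=item_id), timeout=10) as response:
        return json.load(response)


def iter_comments(
    story: dict,
    fetch: Callable[[int], Optional[dict]] = fetch_item,
    max_depth: int = 2,
) -> Iterator[dict]:
    """Yield comments under a story, depth-first, down to max_depth levels."""
    stack = [(kid, 1) for kid in reversed(story.get("kids") or [])]
    while stack:
        item_id, depth = stack.pop()
        item = fetch(item_id)
        if not item or item.get("deleted") or item.get("dead"):
            continue  # skip missing, deleted, and dead comments
        yield item
        if depth < max_depth:
            replies = reversed(item.get("kids") or [])
            stack.extend((kid, depth + 1) for kid in replies)
```

Each comment costs one request, so a story with many descendants can take a while. A pause between calls is advisable.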
Is this using the official API?
Yes, this scraper uses the official Hacker News API provided by Y Combinator, ensuring reliable and compliant data collection.
How many stories can I collect?
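You can collect up to 500 stories per run. The lists come from the Hacker News API, which documents up to 500 IDs for topstories and newstories and up to 200 for askstories, showstories, and jobstories, so the smaller lists may return fewer items than requested.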
What if some fields are empty?
Some fields may be empty depending on the story type. For example, job postings may not have external URLs, and some stories may not have text content.
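When consuming the dataset, it helps to read optional fields defensively. The sketch below is one way to do that, assuming items shaped like the output fields documented earlier. The helper names and the fallback values are hypothetical.

```python
def story_link(item: dict) -> str:
    """Prefer the external URL, then hn_url, then a link built from the ID."""
    return (
        item.get("url")
        or item.get("hn_url")
        or f"https://news.ycombinator.com/item?id={item['id']}"
    )


def summarize(item: dict) -> dict:
    """Build a compact record that tolerates missing or null fields."""
    return {
        "title": item.get("title") or "(untitled)",
        "link": story_link(item),
        "score": item.get("score") or 0,
        "comments": item.get("descendants") or 0,
        "has_text": bool(item.get("text")),
    }
```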
Support
For issues or feature requests, contact support through the Apify Console.
Legal Notice
This scraper uses the official Hacker News API and complies with their terms of service. The API is provided by Y Combinator for public use. Users are responsible for ensuring compliance with applicable laws and using data responsibly.