Pricing

Pay per usage

Hacker News Data Scraper

Collect Hacker News data through the official API. Extract top, new, and best stories, Ask HN and Show HN posts, and job listings from Y Combinator's platform. Useful for market research, sentiment analysis, and tracking startup trends with fast, structured data.

Developer

Shahid Irfan

Last modified: 2 months ago

Features

Complete Story Data: Extract titles, scores, comment counts, and metadata
Multiple Categories: Collect from top, new, best, ask, show, and job stories
Fast API Extraction: Direct access to official Hacker News data
Structured JSON Output: Consistent format for all data types
Rate Limit Respect: Built-in delays for responsible data collection

Use Cases

Community Research

Analyze trending topics and user engagement patterns on Hacker News. Understand what content resonates with the tech community and track discussion trends over time.

Job Market Intelligence

Monitor startup job postings and career opportunities. Track hiring trends across different tech companies and identify emerging roles in the industry.

Content Analysis

Build comprehensive datasets for machine learning and natural language processing. Study user behavior, content patterns, and community dynamics.

News Monitoring

Stay updated on the latest tech news and discussions. Automatically collect and analyze stories that matter to your research or business.

Input Parameters

Parameter	Type	Required	Default	Description
`storyType`	String	No	`topstories`	Type of stories to collect: `topstories`, `newstories`, `beststories`, `askstories`, `showstories`, `jobstories`
`results_wanted`	Integer	No	`20`	Maximum number of stories to collect (1-500)
`proxyConfiguration`	Object	No	`{"useApifyProxy": false}`	Proxy settings (optional for HN API)
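
The constraints in the table can be checked before a run is started. The sketch below is a minimal illustration and not part of the actor: `build_input` is a hypothetical helper, and it relies only on the parameter names, defaults, and ranges listed above.

```python
import json

# Story types accepted by the `storyType` parameter, per the table above.
STORY_TYPES = (
    "topstories", "newstories", "beststories",
    "askstories", "showstories", "jobstories",
)


def build_input(story_type: str = "topstories", results_wanted: int = 20) -> dict:
    """Return an actor input dict, raising ValueError on out-of-range values.

    Hypothetical helper for illustration; the actor may validate differently.
    """
    if story_type not in STORY_TYPES:
        raise ValueError(f"unknown storyType: {story_type!r}")
    if not 1 <= results_wanted <= 500:
        raise ValueError("results_wanted must be between 1 and 500")
    return {
        "storyType": story_type,
        "results_wanted": results_wanted,
        # Documented default: the Apify proxy is off unless enabled.
        "proxyConfiguration": {"useApifyProxy": False},
    }


if __name__ == "__main__":
    print(json.dumps(build_input("jobstories", 100), indent=2))
```

The printed JSON can be pasted into the actor's input editor as-is.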

Output Data

Each item in the dataset contains:

Field	Type	Description
`id`	Integer	Unique story ID
`type`	String	Item type (`story`, `comment`, `job`, etc.)
`title`	String	Story title
`by`	String	Author username
`score`	Integer	Story score/upvotes
`descendants`	Integer	Number of comments
`time`	Integer	Unix timestamp
`timestamp`	String	ISO 8601 timestamp
`url`	String	Original story URL (if external)
`text`	String	Story text content (HTML format)
`text_clean`	String	Story text content (clean text format)
`hn_url`	String	Hacker News discussion URL
`kids`	Array	IDs of the item's direct replies (top-level comments for a story)
`deleted`	Boolean	Whether the item is deleted
`dead`	Boolean	Whether the item is dead
`parent`	Integer	Parent item ID (for comments)
`poll`	Integer	Associated poll ID (for poll options)
`parts`	Array	Related poll option IDs (for polls)

Usage Examples

Collect Top Stories

Extract the most popular stories from Hacker News:

{
  "storyType": "topstories",
  "results_wanted": 50
}

Get New Stories

Collect the latest submissions to Hacker News:

{
  "storyType": "newstories",
  "results_wanted": 30
}

Collect Job Postings

Gather startup job listings from the community:

{
  "storyType": "jobstories",
  "results_wanted": 100
}

Sample Output

{
  "id": 45006801,
  "type": "story",
  "title": "Show HN: I built a tool to help developers write better commit messages",
  "by": "developer123",
  "score": 245,
  "descendants": 67,
  "time": 1735689600,
  "timestamp": "2025-01-01T00:00:00.000Z",
  "url": "https://github.com/developer123/commit-helper",
  "text": "<p>A simple tool that analyzes your commit messages and suggests improvements based on conventional commit standards.</p>",
  "text_clean": "A simple tool that analyzes your commit messages and suggests improvements based on conventional commit standards.",
  "hn_url": "https://news.ycombinator.com/item?id=45006801",
  "kids": [45006802, 45006803, 45006804],
  "deleted": false,
  "dead": false,
  "parent": null,
  "poll": null,
  "parts": null
}
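
The `timestamp`, `hn_url`, and `text_clean` fields can all be derived from `time`, `id`, and `text`. The sketch below shows one plausible way to reproduce them from a raw item. The function names are hypothetical, and the actor's own cleaning rules for `text_clean` are not documented, so treat the tag handling as an assumption.

```python
import html
import re
from datetime import datetime, timezone


def iso_timestamp(unix_seconds: int) -> str:
    """Format a Unix timestamp as ISO 8601 in UTC with millisecond precision."""
    dt = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
    return dt.isoformat(timespec="milliseconds").replace("+00:00", "Z")


def hn_url(item_id: int) -> str:
    """Build the Hacker News discussion URL for an item ID."""
    return f"https://news.ycombinator.com/item?id={item_id}"


def clean_text(raw_html: str) -> str:
    """Strip tags and decode entities; <p> boundaries become blank lines.

    Assumption: this approximates `text_clean`, not the actor's exact logic.
    """
    with_breaks = re.sub(r"</?p\s*>", "\n\n", raw_html, flags=re.IGNORECASE)
    without_tags = re.sub(r"<[^>]+>", "", with_breaks)
    text = html.unescape(without_tags)
    # Collapse runs of blank lines left by adjacent paragraph tags.
    return re.sub(r"\n{3,}", "\n\n", text).strip()


if __name__ == "__main__":
    print(iso_timestamp(1735689600))  # 2025-01-01T00:00:00.000Z
    print(hn_url(45006801))
    print(clean_text("<p>Ask &amp; answer</p>"))
```

Applied to the sample above, `iso_timestamp` and `hn_url` give the same `timestamp` and `hn_url` values shown in the record.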

Tips for Best Results

Choose Story Types Wisely

Use topstories for trending content and popular discussions
Select newstories for the latest submissions and fresh content
Pick jobstories for career opportunities and hiring trends

Optimize Collection Size

Start with small numbers (20-50) for testing and exploration
Increase to 100-200 for comprehensive data collection
Balance between data volume and processing time

Handle Large Datasets

Export results to JSON or CSV for analysis
Use filtering and sorting in your analysis tools
Read very large datasets in pages (for example with offset and limit) rather than loading everything at once
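
As an illustration of the filtering, sorting, and CSV export steps, the sketch below works on a list of items that has already been loaded from a dataset. The `to_csv` helper and the column selection are assumptions made for this example; only the field names come from the output table.

```python
import csv
import io
from typing import List

# Columns kept in the export; array fields such as `kids` are left out.
CSV_FIELDS = ["id", "title", "by", "score", "descendants", "timestamp", "url", "hn_url"]


def to_csv(items: List[dict], min_score: int = 0) -> str:
    """Return CSV text for items scoring at least `min_score`, highest first."""
    kept = [it for it in items if (it.get("score") or 0) >= min_score]
    kept.sort(key=lambda it: it.get("score") or 0, reverse=True)
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=CSV_FIELDS, lineterminator="\n")
    writer.writeheader()
    for it in kept:
        # Missing or null fields become empty cells instead of the string "None".
        writer.writerow({k: "" if it.get(k) is None else it[k] for k in CSV_FIELDS})
    return buf.getvalue()


if __name__ == "__main__":
    sample = [
        {"id": 1, "title": "Low score", "score": 3},
        {"id": 2, "title": "High score", "score": 245, "url": None},
    ]
    print(to_csv(sample, min_score=5))
```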

Integrations

Connect your Hacker News data with:

Google Sheets — Export for collaborative analysis
Airtable — Build searchable story databases
Slack — Get notifications for trending stories
Make — Create automated content workflows
Zapier — Trigger actions based on story data

Export Formats

Download data in multiple formats:

JSON — For developers and API integrations
CSV — For spreadsheet analysis and reporting
Excel — For business intelligence dashboards

Frequently Asked Questions

What's the difference between story types?

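topstories follows the front-page ranking, which weighs a story's score against its age, and can include job posts. newstories is ordered by submission time, newest first. beststories lists the highest-voted recent stories. askstories, showstories, and jobstories are restricted to Ask HN, Show HN, and job posts respectively.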

Can I collect comments along with stories?

The current version collects story metadata. Comments can be fetched separately using the kids array with additional API calls to the Hacker News API.
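
A minimal sketch of that follow-up step is shown below. It assumes the public item endpoint `https://hacker-news.firebaseio.com/v0/item/<id>.json` and walks the `kids` IDs depth-first. `fetch_item` and `walk_comments` are hypothetical names, and the `fetch` parameter exists so the traversal can be tried without network access.

```python
import json
from typing import Callable, Iterator, Optional
from urllib.request import urlopen

ITEM_URL = "https://hacker-news.firebaseio.com/v0/item/{id}.json"


def fetch_item(item_id: int) -> Optional[dict]:
    """Fetch one item from the official API; None if the ID cannot be resolved."""
    with urlopen(ITEM_URL.format(id=item_id), timeout=10) as resp:
        return json.load(resp)


def walk_comments(
    story: dict,
    fetch: Callable[[int], Optional[dict]] = fetch_item,
    max_comments: int = 100,
) -> Iterator[dict]:
    """Yield comments under `story` depth-first, following each item's `kids`."""
    stack = list(reversed(story.get("kids") or []))
    yielded = 0
    while stack and yielded < max_comments:
        item = fetch(stack.pop())
        if not item:
            continue
        # Replies go on the stack first so they are visited before the next sibling.
        stack.extend(reversed(item.get("kids") or []))
        if item.get("deleted") or item.get("dead"):
            continue  # no usable text, but its replies are still visited
        yield item
        yielded += 1


if __name__ == "__main__":
    # Offline demo with a stand-in fetcher instead of real API calls.
    fake = {2: {"id": 2, "kids": [4]}, 4: {"id": 4}, 5: {"id": 5}}
    for comment in walk_comments({"id": 1, "kids": [2, 5]}, fetch=fake.get):
        print(comment["id"])
```

Each comment costs one request, so `max_comments` caps the number of items returned. For real runs, add a short delay between requests.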

Is this using the official API?

Yes, this scraper uses the official Hacker News API provided by Y Combinator, ensuring reliable and compliant data collection.
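
For reference, the official API is served from `https://hacker-news.firebaseio.com/v0/`: each story type has a list endpoint that returns an array of item IDs, and each item is read from `item/<id>.json`. The sketch below shows how a run's input could map onto those calls. It is an assumption about the general approach, not the actor's actual implementation, and `collect_stories` is a hypothetical name.

```python
import json
from typing import Callable, List
from urllib.request import urlopen

BASE_URL = "https://hacker-news.firebaseio.com/v0"


def get_json(url: str):
    """GET a URL and decode the JSON body."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def collect_stories(
    story_type: str = "topstories",
    results_wanted: int = 20,
    get: Callable[[str], object] = get_json,
) -> List[dict]:
    """Read the ID list for `story_type`, then fetch items until enough are found."""
    ids = get(f"{BASE_URL}/{story_type}.json") or []
    stories: List[dict] = []
    for item_id in ids:
        if len(stories) >= results_wanted:
            break
        item = get(f"{BASE_URL}/item/{item_id}.json")
        if item:  # the API returns null for IDs it cannot resolve
            stories.append(item)
    return stories


if __name__ == "__main__":
    for story in collect_stories("topstories", 3):
        print(story.get("id"), story.get("title"))
```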

How many stories can I collect?

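You can request up to 500 stories per run. The official API itself lists at most 500 IDs for topstories and newstories and at most 200 for askstories, showstories, and jobstories, so a run can return fewer items than requested.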

What if some fields are empty?

Some fields may be empty depending on the story type. For example, job postings may not have external URLs, and some stories may not have text content.

Support

For issues or feature requests, contact support through the Apify Console.

Legal Notice

This scraper uses the official Hacker News API and complies with their terms of service. The API is provided by Y Combinator for public use. Users are responsible for ensuring compliance with applicable laws and using data responsibly.
