Hacker News Data Scraper

Pricing

from $2.99 / 1,000 results

Hacker News scraper that pulls stories, jobs, Ask HN and Show HN posts from news.ycombinator.com, so developers and SEO teams can track tech trends and job listings without manual browsing.


Rating

0.0

(0)

Developer

Kawsar

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 2
Monthly active users: 1
Last modified: a day ago


Hacker News Data Scraper: extract stories, jobs, and posts from news.ycombinator.com

Pulls structured data from news.ycombinator.com. Covers all six feeds (top, new, best, ask, show, and jobs), returns post titles, URLs, points, authors, comment counts, and post types, and pages through automatically until you hit your item limit. Works with any HN feed URL — paste a URL like https://news.ycombinator.com/show?p=5 into Start URLs and it will paginate forward from that page.

What data does this actor return?

| Field | Type | Description | Example |
|---|---|---|---|
| itemId | integer | Hacker News item ID | 48031684 |
| rank | integer | Position in the feed | 1 |
| storyTitle | string | Post title | "Agents can now create Cloudflare accounts" |
| url | string | Linked URL (internal HN link for Ask/Show) | https://blog.cloudflare.com/... |
| domain | string | Domain extracted from the linked URL | cloudflare.com |
| points | integer or null | Upvote score (null for job posts) | 200 |
| author | string or null | Submitter username (null for job posts) | rolph |
| commentCount | integer or null | Number of comments (null for job posts) | 108 |
| commentsUrl | string or null | HN discussion thread URL | https://news.ycombinator.com/item?id=... |
| age | string | Post age as displayed on HN | 3 hours ago |
| postType | string | One of: story, job, ask, show, launch | story |
| scrapedAt | string | ISO 8601 UTC timestamp | 2026-05-06T10:00:00+00:00 |
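The `domain` field appears to keep only the registered domain of the linked URL (the table shows `cloudflare.com` for a `blog.cloudflare.com` link). A minimal sketch of that derivation, assuming a simple last-two-labels heuristic (the actor's actual logic is not published, and the function name is made up):

```python
from urllib.parse import urlparse

def extract_domain(url: str) -> str:
    """Reduce a link to its registered domain, e.g. blog.cloudflare.com -> cloudflare.com.

    Note: this naive heuristic mishandles multi-part TLDs like example.co.uk.
    """
    netloc = urlparse(url).netloc
    parts = netloc.split(".")
    return ".".join(parts[-2:]) if len(parts) >= 2 else netloc
```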

How to use

Option 1: Scrape a feed

  1. Open the input tab
  2. Pick a feed type: top, new, best, ask, show, or jobs
  3. Set your item limit (up to 1000)
  4. Click Run

The actor pages through HN automatically (30 items per page) until it hits your limit.

Option 2: Start from a specific page

Add any HN feed URL to the Start URLs field. The actor detects the page number from the URL and paginates forward from there.

Examples:

  • https://news.ycombinator.com/show?p=3 — starts at Show HN page 3 and pages forward
  • https://news.ycombinator.com/newest — scrapes the New feed from page 1
  • https://news.ycombinator.com/ask?p=10 — starts at Ask HN page 10

Multiple URLs are supported. The actor processes each in order and stops when it hits your item limit.
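The page-number detection described above can be sketched as follows (a guess at equivalent logic, not the actor's actual code; the function name `detect_start_page` is hypothetical):

```python
from urllib.parse import urlparse, parse_qs

def detect_start_page(url: str) -> int:
    """Read HN's ?p= query parameter from a feed URL; default to page 1."""
    qs = parse_qs(urlparse(url).query)
    try:
        return max(1, int(qs.get("p", ["1"])[0]))
    except ValueError:
        return 1
```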

Input

| Parameter | Type | Default | Description |
|---|---|---|---|
| feedType | string (select) | top | Feed to scrape when no Start URLs are set |
| startUrls | array of strings | [] | HN URLs to start paginating from. Overrides Feed type. |
| maxItems | integer | 100 | Max items to collect per run (up to 1000) |
| requestTimeoutSecs | integer | 30 | Per-request timeout in seconds |
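Put together, a run input matching the parameters above might look like this (the values are illustrative, not defaults):

```python
# Example actor input: start from Show HN page 3 and collect up to 200 items.
run_input = {
    "feedType": "show",          # ignored when startUrls is non-empty
    "startUrls": ["https://news.ycombinator.com/show?p=3"],
    "maxItems": 200,             # capped at 1000 per run
    "requestTimeoutSecs": 30,
}
```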

Feed type options

| Value | URL | Description |
|---|---|---|
| top | news.ycombinator.com/ | Front page: highest-voted recent stories |
| new | news.ycombinator.com/newest | Newest submissions, unfiltered |
| best | news.ycombinator.com/best | Highest-voted of all time |
| ask | news.ycombinator.com/ask | Ask HN posts only |
| show | news.ycombinator.com/show | Show HN and Launch HN posts only |
| jobs | news.ycombinator.com/jobs | YC startup job listings |

Example output

```json
[
  {
    "itemId": 48031684,
    "rank": 1,
    "storyTitle": "Agents can now create Cloudflare accounts, buy domains, and deploy products",
    "url": "https://blog.cloudflare.com/agents-stripe-projects/",
    "domain": "cloudflare.com",
    "points": 200,
    "author": "rolph",
    "commentCount": 108,
    "commentsUrl": "https://news.ycombinator.com/item?id=48031684",
    "age": "3 hours ago",
    "postType": "story",
    "scrapedAt": "2026-05-06T10:00:00.000000+00:00"
  },
  {
    "itemId": 48025244,
    "rank": 1,
    "storyTitle": "Proliferate (YC S25) Is Hiring",
    "url": "https://www.ycombinator.com/companies/proliferate/jobs/...",
    "domain": "ycombinator.com",
    "points": null,
    "author": null,
    "commentCount": null,
    "commentsUrl": "https://news.ycombinator.com/item?id=48025244",
    "age": "13 hours ago",
    "postType": "job",
    "scrapedAt": "2026-05-06T10:00:00.000000+00:00"
  }
]
```
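Since `scrapedAt` is an ISO 8601 string, it parses directly with Python's standard library (a consumer-side example, not part of the actor):

```python
from datetime import datetime, timezone

# Parse the scrapedAt value from a dataset item into an aware datetime.
ts = datetime.fromisoformat("2026-05-06T10:00:00.000000+00:00")
ts_utc = ts.astimezone(timezone.utc)  # normalize to UTC for comparisons
```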

How pagination works

Each HN feed page returns 30 items. The actor increments the ?p= query parameter and fetches the next page until either your maxItems limit is reached or there are no more items. If you set maxItems to 300, the actor fetches 10 pages automatically.

When you use Start URLs with a page number (e.g. ?p=5), the actor starts at that page and paginates forward — it does not go back to page 1.
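The two paragraphs above amount to the following page-planning logic (a sketch under the stated 30-items-per-page assumption; the function name is made up):

```python
import math

PAGE_SIZE = 30  # each HN feed page returns 30 items

def pages_to_fetch(max_items: int, start_page: int = 1) -> list[int]:
    """Page numbers needed to collect max_items, paginating forward from start_page."""
    count = math.ceil(max_items / PAGE_SIZE)
    return list(range(start_page, start_page + count))
```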

Use cases

  • SEO research: track which tech topics trend on HN and use that to shape your content calendar
  • Job market monitoring: collect startup listings from the jobs feed and compare them week over week
  • Show HN and Launch HN watching: see what new products the community pays attention to
  • Content curation: pull top stories automatically for newsletters or internal feeds
  • Dataset building: collect community engagement data (points, comment counts) across thousands of posts over time
  • Competitive intelligence: monitor mentions of competitor products or technologies in trending discussions

Scheduling

To collect HN data on a recurring schedule, use Apify's built-in scheduler:

  1. Go to your actor page and click Schedules
  2. Set a cron expression (e.g. 0 9 * * * for 9am daily)
  3. Configure the input (feed type, item limit)
  4. Each run's results land in a separate dataset

This works well for building historical trend datasets over days or weeks.

Limitations

  • Max 1000 items per run (HN has no API rate limit, but this keeps run costs predictable)
  • Comment content is not extracted — post-level data only
  • Job posts return null for points, author, and commentCount (HN does not display these for jobs)
  • HN's "best" feed is relatively small — it may return fewer than 200 unique items before repeating
  • The age field is a human-readable string from HN ("3 hours ago"), not a parsed timestamp

FAQ

What feeds are supported? Top, new, best, ask, show, and jobs.

How many items can I collect per run? Up to 1000. Each page has 30 items and the actor pages through automatically.

Can I start scraping from a specific page? Yes. Add a URL like https://news.ycombinator.com/show?p=5 to Start URLs. The actor reads the page number from the URL and paginates forward from there.

Can I scrape multiple feeds in one run? Yes. Add multiple feed URLs to Start URLs (e.g. both /show and /ask) and the actor will scrape each in sequence until maxItems is reached.

Does it scrape comments? No. Post-level only: title, URL, points, author, comment count. Comment text is not extracted.

Do job posts include points and author? No. HN job posts do not show vote counts or usernames. Those fields come back null.

How does post type detection work? Title prefix: "Ask HN:" becomes ask, "Show HN:" becomes show, "Launch HN:" becomes launch. Posts from the jobs feed are always tagged job. Everything else is story.
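The prefix rules from the answer above, written out as a sketch (the function name is hypothetical):

```python
def detect_post_type(title: str, from_jobs_feed: bool = False) -> str:
    """Classify an HN post: jobs feed wins, then title prefix, else story."""
    if from_jobs_feed:
        return "job"
    if title.startswith("Ask HN:"):
        return "ask"
    if title.startswith("Show HN:"):
        return "show"
    if title.startswith("Launch HN:"):
        return "launch"
    return "story"
```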

Can I export results to CSV or Excel? Yes. In the Apify dataset view, click Export and choose CSV, Excel, JSON, or JSONL.