Hacker News Scraper

Scrapes Hacker News stories, comments, jobs, polls, and user profiles via the official Firebase and Algolia APIs. Supports full-text search, Who's Hiring thread extraction, author karma snapshots, and deep comment trees.

Pricing

from $0.40 / 1,000 items fetched

Rating

5.0

(1)

Developer

Omar Eldeeb

What does Hacker News Scraper do?

This actor extracts structured data from Hacker News — stories, comments, jobs, user profiles, and the monthly "Ask HN: Who is hiring?" threads — via HN's two official public APIs (the Firebase API and the Algolia search API). It produces a unified JSON output with titles, URLs, points, authors, timestamps, and optional full comment trees with author karma snapshots.

Unlike most HN scrapers on the Store, which parse HTML and require proxies, this one calls the official APIs directly. You can schedule it, call it via REST, webhooks, or MCP from the Apify platform, and pipe its output to downstream actors, without dealing with captchas or IP blocks.

Why use Hacker News Scraper?

🏢 Recruiters & talent teams — pull the latest "Who is hiring?" monthly thread as structured job rows, filter by role keywords, feed into your ATS.
📈 Market intelligence & PR teams — monitor mentions of your company, competitors, or product categories with full-text search.
🧠 AI/LLM data pipelines — build clean, deduplicated training or retrieval-augmented-generation datasets from HN discussions, with author karma signals.
📰 Newsletters & aggregators — daily digest of top / new / best stories above a score threshold, optionally filtered to specific domains.
🎯 Research & academia — reproducible HN corpora for discourse analysis, link-graph research, or longitudinal studies of tech discussion.

How to use Hacker News Scraper

Click Try for free on this actor's page to open the Apify Console input form.
Pick a mode — start with topstories to see how it works.
Set Max items (default 30). For a first run, try 10.
Click Save & Start.
When the run finishes, open the Output tab to see the scraped items, or use Export to download as JSON, CSV, or Excel.
For more advanced runs, enable Include comments to attach comment trees, or use mode=search to run a keyword query.

Input

Every field is configurable via the input form. See the Input tab for live validation and tooltips. The only required field is mode.

Minimal examples (copy into the Console's "JSON" tab):

Top 30 front-page stories:

{ "mode": "topstories", "maxItems": 30 }

Stories mentioning "Claude", newest first, ≥ 50 points:

{
    "mode": "search",
    "searchQuery": "Claude",
    "sortSearchBy": "date",
    "minScore": 50,
    "maxItems": 100
}

Show HN from github.com or arxiv.org with heavy discussion:

{
    "mode": "showstories",
    "minScore": 50,
    "minComments": 20,
    "domainFilter": ["github.com", "arxiv.org"],
    "maxItems": 50
}

Latest "Who is hiring?" thread as structured job rows:

{ "mode": "hiring_threads", "maxItems": 500 }

A story + its full 3-level comment tree + author karma:

{
    "mode": "topstories",
    "maxItems": 5,
    "includeComments": true,
    "maxCommentDepth": 3,
    "includeUserProfiles": true
}
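
The inputs above can also be submitted outside the Console. The sketch below builds (but does not send) an HTTP request for Apify's synchronous "run and return dataset items" endpoint, using only the Python standard library. The actor ID `username~hacker-news-scraper` and the token are placeholders, not values taken from this page; substitute the ones shown on the actor's API tab and in your account settings.

```python
import json
import urllib.request

APIFY_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Build a POST request that runs an actor and returns its dataset items.

    actor_id is a placeholder in 'username~actor-name' form; the real value
    is not stated on this page.
    """
    url = f"{APIFY_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_run_request(
    "username~hacker-news-scraper",  # hypothetical actor ID
    "YOUR_APIFY_TOKEN",              # placeholder token
    {"mode": "topstories", "maxItems": 10},
)

# To execute the run, send the request and parse the JSON array of rows:
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)
```

The synchronous endpoint suits short runs like the smoke test; for long runs with deep comment trees, starting the run asynchronously and reading the dataset afterwards is likely the safer choice.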

Output

Every row — whether it's a story, comment, job, or poll — uses the same unified shape. You can download the dataset in various formats such as JSON, HTML, CSV, or Excel.

{
    "id": 47822805,
    "type": "story",
    "title": "SPEAKE(a)R: Turn Speakers to Microphones for Fun and Profit",
    "url": "https://www.usenix.org/system/files/conference/woot17/woot17-paper-guri.pdf",
    "domain": "usenix.org",
    "hnUrl": "https://news.ycombinator.com/item?id=47822805",
    "by": "Eridanus2",
    "byUserKarma": 1847,
    "byUserCreated": 1625097600,
    "score": 67,
    "time": 1776588348,
    "createdAt": "2026-04-19T08:45:48.000Z",
    "descendants": 26,
    "comments": [
        {
            "id": 47823010,
            "by": "someuser",
            "text": "<p>Interesting approach...</p>",
            "depth": 1,
            "replies": [{ "id": 47823201, "by": "replyer", "text": "...", "depth": 2 }]
        }
    ],
    "scrapedAt": "2026-04-19T11:21:58.601Z",
    "source": "firebase"
}

Data fields

Field	Type	Description
`id`	number	Unique HN item ID
`type`	enum	`story` / `comment` / `job` / `poll` / `pollopt`
`title`	string	Story / job title (null for comments)
`url`	string	Outbound URL for stories (null for text-only)
`domain`	string	Extracted hostname (e.g., `github.com`)
`hnUrl`	string	Deep link to the item on news.ycombinator.com
`by`	string	Author username
`byUserKarma`	number	Author's karma (when `includeUserProfiles=true`)
`byUserCreated`	number	Unix timestamp of author account creation
`score`	number	Points (votes)
`time`	number	Unix timestamp of submission
`createdAt`	string	ISO 8601 timestamp
`text`	string	HTML body (comments, Ask HN, job descriptions)
`descendants`	number	Total comments on a story
`parent`	number	Parent ID for comments
`comments`	array	Nested comment tree (when `includeComments=true`)
`flatComments`	array	Flat list with depth (when `flattenComments=true`)
`deleted` / `dead`	boolean	Item status flags
`scrapedAt`	string	ISO timestamp when this row was fetched
`source`	enum	`firebase` or `algolia` — which API returned it
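
If you downloaded rows with the nested `comments` tree and want one row per comment afterwards, the tree can be walked client-side. This is a minimal sketch that assumes each comment carries an optional `replies` list, as in the sample output above; it is not the actor's own `flattenComments` implementation.

```python
from typing import Iterator


def flatten_comments(comments: list[dict]) -> Iterator[dict]:
    """Yield every comment in a nested tree, parents before their replies."""
    for comment in comments:
        # Copy everything except the nested replies so each row is flat.
        yield {key: value for key, value in comment.items() if key != "replies"}
        yield from flatten_comments(comment.get("replies") or [])


# Trimmed-down version of the sample output row.
story = {
    "id": 47822805,
    "comments": [
        {
            "id": 47823010,
            "by": "someuser",
            "depth": 1,
            "replies": [{"id": 47823201, "by": "replyer", "depth": 2}],
        }
    ],
}

rows = list(flatten_comments(story["comments"]))
print([(row["id"], row["depth"]) for row in rows])
# → [(47823010, 1), (47823201, 2)]
```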

How much does it cost to scrape Hacker News?

This actor uses pay-per-event pricing — you only pay for the rows you get, no monthly subscription.

Event	Price
`story-fetched`	$0.00040 / item ($0.40 per 1,000 stories or jobs)
`comment-fetched`	$0.00015 / comment ($0.15 per 1,000 comments)
`user-profile-fetched`	$0.00030 / profile ($0.30 per 1,000 profiles — only when `includeUserProfiles=true`)

The first 50 chargeable events in every run are free. That means any run with ≤ 50 total output rows + comments + profiles costs nothing beyond the platform's trivial startup fee. You only pay for events 51+ within the same run.

Typical run costs (with the first 50 events of each run already deducted):

100 top stories (no comments): 100 stories → 50 free + 50 paid = $0.020
30 top stories + 10 comments each (≈ 330 events): $0.046
Monthly "Who is hiring?" full extract (≈ 500 jobs): $0.180
1,000 search hits on a keyword: $0.380
A 30-story smoke test with no comments: $0 (entirely free)
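
The arithmetic behind these figures can be checked with a short estimator. The prices come from the table above; the page does not say which events consume the free 50 when a run mixes event types, so this sketch returns a lower and an upper bound rather than a single number. That bounding approach is an assumption of the example, not documented billing behaviour.

```python
# Prices in micro-dollars per event, taken from the pricing table above.
PRICES = {"story": 400, "comment": 150, "profile": 300}
FREE_EVENTS = 50


def estimate_cost(counts: dict[str, int]) -> tuple[float, float]:
    """Return (lowest, highest) cost in dollars for one run.

    Assumption: the low bound spends the free events on the most expensive
    event types first; the high bound spends them on the cheapest first.
    """

    def total(order: list[str]) -> float:
        free_left = FREE_EVENTS
        micro_dollars = 0
        for kind in order:
            n = counts.get(kind, 0)
            free = min(n, free_left)
            free_left -= free
            micro_dollars += (n - free) * PRICES[kind]
        return micro_dollars / 1_000_000

    cheapest_first = sorted(PRICES, key=PRICES.get)
    return total(cheapest_first[::-1]), total(cheapest_first)


print(estimate_cost({"story": 100}))                 # → (0.02, 0.02)
print(estimate_cost({"story": 30, "comment": 300}))  # → (0.042, 0.0495)
print(estimate_cost({"story": 30}))                  # → (0.0, 0.0)
```

For single-type runs the two bounds coincide; for mixed runs the real charge should fall somewhere between them.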

Tips & advanced options

Comment depth — maxCommentDepth=3 is a good default. 5+ is only useful for megathreads; 1 gives you only top-level replies.
Domain filter — combine with topstories or showstories for a recruiter-style feed of GitHub / Arxiv / company-blog stories.
Date ranges — pass dateFromUnix / dateToUnix (Unix seconds) to scope search mode. Works best with sortSearchBy=date.
User karma snapshot — includeUserProfiles=true adds byUserKarma + byUserCreated to every row. Useful for weighting by poster reputation.
Flatten comments for spreadsheets — set flattenComments=true to get a flat array with a depth field, which exports cleanly to CSV/Excel.
Schedule it — use Apify's scheduler to run topstories every hour for a live HN feed, or hiring_threads monthly.
Chain it — pipe the output into another actor (e.g., a text classifier, Slack poster) via Apify's integrations.
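
For the date-range tip, `dateFromUnix` and `dateToUnix` expect Unix seconds, which are easy to get wrong by hand. The helper below is a sketch that converts ISO 8601 strings and assembles a search-mode input using the field names documented on this page; the function names are illustrative, and treating timestamps without a timezone as UTC is an assumption of the example.

```python
from datetime import datetime, timezone


def to_unix(iso_timestamp: str) -> int:
    """Convert an ISO 8601 timestamp to Unix seconds, treating naive values as UTC."""
    parsed = datetime.fromisoformat(iso_timestamp)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


def search_input(query: str, date_from: str, date_to: str, max_items: int = 100) -> dict:
    """Build a search-mode input scoped to a date range, newest first."""
    return {
        "mode": "search",
        "searchQuery": query,
        "sortSearchBy": "date",
        "dateFromUnix": to_unix(date_from),
        "dateToUnix": to_unix(date_to),
        "maxItems": max_items,
    }


run_input = search_input("Claude", "2026-04-01", "2026-04-19T08:45:48")
print(run_input["dateFromUnix"], run_input["dateToUnix"])
# → 1775001600 1776588348
```

The second value matches the `time` field of the sample output row, whose `createdAt` is 2026-04-19T08:45:48Z.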

FAQ, disclaimers & support

Does this need a proxy? No. Both HN APIs are public and need no API key or login, so the actor calls them directly.

How fresh is the data? Firebase list endpoints update roughly every 5 minutes (HN's own cadence). The user and hiring_threads modes fetch live data on each run.

What counts as a "comment" for billing? Only comments that actually appear in your output dataset — i.e. inside the depth limit you set. Comments skipped because they're deleted or beyond maxCommentDepth are free.

Can I search for comments by a specific author? Yes — set mode=search with searchTags: ["comment", "author_dang"].
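
For context, the tag syntax in that answer matches the public Algolia HN Search API, where comma-separated tags are combined with AND and `author_<username>` restricts results to one user. The sketch below only builds such a URL; how the actor maps `searchTags` onto Algolia internally is an assumption, and the function name is illustrative.

```python
from typing import Optional
from urllib.parse import urlencode

ALGOLIA_BASE = "https://hn.algolia.com/api/v1"


def algolia_search_url(
    query: str = "",
    tags: Optional[list[str]] = None,
    newest_first: bool = True,
    hits_per_page: int = 20,
) -> str:
    """Build an Algolia HN Search URL; search_by_date sorts newest first."""
    endpoint = "search_by_date" if newest_first else "search"
    params = {"query": query, "hitsPerPage": hits_per_page}
    if tags:
        params["tags"] = ",".join(tags)  # comma-separated tags are ANDed
    return f"{ALGOLIA_BASE}/{endpoint}?{urlencode(params)}"


print(algolia_search_url(tags=["comment", "author_dang"]))
# → https://hn.algolia.com/api/v1/search_by_date?query=&hitsPerPage=20&tags=comment%2Cauthor_dang
```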

Why are some by fields null? Deleted items have no author. They're filtered out by default.

Legality. Hacker News data is publicly accessible and both APIs are officially sanctioned by Y Combinator. This actor respects those APIs' rate limits and does not bypass any access control. You are responsible for compliant use of the data under Y Combinator's Terms of Service and any applicable privacy laws (GDPR, CCPA, etc.) when processing personal data such as usernames.

Found a bug or want a feature? Open an issue on the actor's Issues tab. Custom extensions (e.g., Slack / Discord forwarding, semantic dedupe, dashboards) are available on request.
