Hacker News Search Scraper
Pricing
from $3.00 / 1,000 results
Scrape Hacker News stories, comments and polls at scale via the Algolia API: title, author, URL, text, points, comment count and tags. Search by keyword, filter by points, and collect thousands of items per run. Schedule it for a continuous HN feed.
Rating: 0.0 (0)
Developer: Logiover
Maintained by Community
Actor stats
Bookmarked: 0
Total users: 2
Monthly active users: 1
Last modified: 16 hours ago
Hacker News Search Scraper: Scrape HN Stories & Comments at Scale
Scrape Hacker News stories, comments and polls at scale through the official HN Algolia search API. This Hacker News scraper extracts title, author, URL, text, points, comment count, tags and created date, and is searchable by keyword and filterable by points. No login, no API key, no blocking. Thanks to date-windowed pagination it goes beyond Algolia's 1,000-result cap, returning thousands of items per run as JSON, CSV or Excel.
What this Actor does / Key features
- Official HN Algolia API: fast, reliable, and not subject to anti-bot blocking.
- Beyond the 1,000-result cap: date-windowed pagination pulls tens of thousands of items per run.
- Every item type: stories, comments, polls, Show HN, Ask HN and front-page items.
- Keyword search: track any topic, product, company or technology across all of Hacker News.
- Points filter: return only items with at least a minimum number of points.
- Rich data per item: title, author, URL, text, points, comment count, tags, created date and HN link.
- Unlimited mode: set maxItems to 0 to pull everything matching your query.
- Schedule-ready: built for recurring runs to maintain a continuously fresh HN feed.
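The date-windowed pagination mentioned above can be sketched as follows. This is a minimal illustration and not the Actor's actual implementation: it assumes the public `search_by_date` endpoint of the HN Algolia API, which returns newest items first, and it moves an inclusive `created_at_i` upper bound back to the oldest timestamp of each batch. The names `iter_windowed` and `make_algolia_fetcher` are hypothetical.

```python
import json
from typing import Any, Callable, Dict, Iterator, List, Optional
from urllib.parse import urlencode
from urllib.request import urlopen

Hit = Dict[str, Any]
# A fetcher takes an optional inclusive upper timestamp bound and returns
# one batch of hits, newest first.
FetchWindow = Callable[[Optional[int]], List[Hit]]


def iter_windowed(fetch_window: FetchWindow, max_items: int = 0) -> Iterator[Hit]:
    """Yield hits newest-first, moving the date bound back after each batch."""
    before_ts: Optional[int] = None  # None means "start from the newest items"
    seen = set()
    yielded = 0
    while True:
        hits = fetch_window(before_ts)
        new_in_batch = 0
        for hit in hits:
            if hit["objectID"] in seen:
                continue  # boundary items reappear because the bound is inclusive
            seen.add(hit["objectID"])
            new_in_batch += 1
            yield hit
            yielded += 1
            if max_items and yielded >= max_items:
                return
        if new_in_batch == 0:
            # Either the results are exhausted, or one timestamp fills a whole
            # batch; this sketch stops rather than looping forever.
            return
        before_ts = min(int(h["created_at_i"]) for h in hits)


def make_algolia_fetcher(base_params: Dict[str, str], page_size: int = 1000) -> FetchWindow:
    """Build a fetcher for search_by_date (assumed endpoint, see lead-in)."""
    def fetch(before_ts: Optional[int]) -> List[Hit]:
        params = dict(base_params, hitsPerPage=str(page_size))
        filters = [params["numericFilters"]] if params.get("numericFilters") else []
        if before_ts is not None:
            filters.append(f"created_at_i<={before_ts}")
        if filters:
            params["numericFilters"] = ",".join(filters)
        url = "https://hn.algolia.com/api/v1/search_by_date?" + urlencode(params)
        with urlopen(url, timeout=30) as resp:
            return json.load(resp)["hits"]
    return fetch
```

For example, `iter_windowed(make_algolia_fetcher({"query": "rust", "tags": "story"}), max_items=5000)` would walk backwards in time until 5,000 stories are collected or the results run out. The inclusive bound plus de-duplication avoids dropping items that share a timestamp with the window edge.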
Input
| Field | Type | Description |
|---|---|---|
| query | string | Keyword to search across Hacker News (e.g. "AI", "startup", "rust"). Leave empty to scrape all items. |
| itemType | string | What to scrape: story, comment, poll, show_hn, ask_hn or front_page. |
| minPoints | integer | Only items with at least this many points. 0 = no filter. |
| maxItems | integer | Maximum items to save. Uses date-windowed pagination to go beyond Algolia's 1,000-result cap. 0 = all. |
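To make the input fields more concrete, here is a rough sketch of how they could translate into HN Algolia search parameters. The mapping is an assumption, not the Actor's documented behavior: it treats `itemType` as an Algolia tag and `minPoints` as a `points>=N` numeric filter. The function name `to_search_params` and the `story` default are hypothetical.

```python
from typing import Any, Dict

# Item types listed in the input table; assumed to match Algolia tags.
ITEM_TYPES = {"story", "comment", "poll", "show_hn", "ask_hn", "front_page"}


def to_search_params(actor_input: Dict[str, Any]) -> Dict[str, str]:
    """Validate the Actor input and map it to assumed Algolia query parameters."""
    query = str(actor_input.get("query") or "")
    item_type = str(actor_input.get("itemType") or "story")  # default is an assumption
    min_points = int(actor_input.get("minPoints") or 0)

    if item_type not in ITEM_TYPES:
        raise ValueError(f"unsupported itemType: {item_type!r}")
    if min_points < 0:
        raise ValueError("minPoints must be 0 or greater")

    params = {"query": query, "tags": item_type}
    if min_points > 0:
        # 0 means "no filter", so the filter is only added for positive values.
        params["numericFilters"] = f"points>={min_points}"
    return params
```

`maxItems` is deliberately left out of the parameters, since it limits how many items are saved rather than how a single request is filtered.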
Example input
{"query": "AI", "itemType": "story", "minPoints": 10, "maxItems": 0}
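One way to submit this input programmatically is through the Apify API's synchronous run endpoint. The sketch below only builds the HTTP request; the Actor ID shown is a placeholder (the real ID is not given on this page), and `build_run_request` is a hypothetical helper name.

```python
import json
from typing import Any, Dict
from urllib.request import Request

# Placeholder only: replace with the Actor's real ID in "username~actor-name" form.
ACTOR_ID = "your-username~your-actor-name"


def build_run_request(actor_id: str, token: str, run_input: Dict[str, Any]) -> Request:
    """Build a POST request that runs an Actor and returns its dataset items."""
    url = f"https://api.apify.com/v2/acts/{actor_id}/run-sync-get-dataset-items"
    return Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


example_input = {"query": "AI", "itemType": "story", "minPoints": 10, "maxItems": 0}
request = build_run_request(ACTOR_ID, "YOUR_APIFY_TOKEN", example_input)
# To actually send it:
#   from urllib.request import urlopen
#   with urlopen(request) as resp:
#       items = json.load(resp)
```

With `maxItems` set to 0 a run may take a long time, so an asynchronous run followed by a separate dataset download is likely a better fit than the synchronous endpoint for large queries.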
Output
Each item is saved as a structured record in the dataset. Export to JSON, CSV, Excel or XML, or pull via the Apify API.
| Field | Description |
|---|---|
| objectId | Unique Hacker News item ID |
| type | Item type (story, comment, poll, etc.) |
| title | Item title |
| author | Username of the author |
| url | External URL linked by the item (for stories) |
| text | Item text body (for comments, Ask HN, polls) |
| points | Number of points / upvotes |
| numComments | Number of comments |
| tags | Array of HN/Algolia tags |
| createdAt | Date the item was created |
| hnUrl | Direct link to the item on Hacker News |
| scrapedAt | Scrape timestamp (ISO 8601) |
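When consuming exported records in Python, a small typed wrapper can help catch missing fields early. The sketch below mirrors the field table above; the class name `HNItem`, the choice of optional fields, and the sample record are illustrative assumptions, not taken from real Actor output.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class HNItem:
    """One dataset record, with field names converted to snake_case."""
    object_id: str
    type: str
    title: Optional[str]
    author: Optional[str]
    url: Optional[str]
    text: Optional[str]
    points: Optional[int]
    num_comments: Optional[int]
    tags: List[str]
    created_at: Optional[str]
    hn_url: Optional[str]
    scraped_at: Optional[str]

    @classmethod
    def from_record(cls, rec: Dict[str, Any]) -> "HNItem":
        # objectId and type are treated as required; everything else may be
        # absent depending on the item type (an assumption of this sketch).
        return cls(
            object_id=str(rec["objectId"]),
            type=str(rec["type"]),
            title=rec.get("title"),
            author=rec.get("author"),
            url=rec.get("url"),
            text=rec.get("text"),
            points=rec.get("points"),
            num_comments=rec.get("numComments"),
            tags=list(rec.get("tags") or []),
            created_at=rec.get("createdAt"),
            hn_url=rec.get("hnUrl"),
            scraped_at=rec.get("scrapedAt"),
        )
```

Usage: `items = [HNItem.from_record(r) for r in json.load(open("dataset.json"))]`, assuming a JSON export saved as `dataset.json`.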
Use cases
- Brand & topic monitoring: track every Hacker News mention of your product, company or technology.
- Trend research: analyze what the tech community is discussing and how interest shifts over time.
- Datasets & newsletters: build a continuously updated Hacker News dataset for analysis or curation.
- Sentiment & PR: catch discussions about your brand or industry early.
- Developer & market research: study which tools, languages and startups gain traction on HN.
- AI / LLM training data: collect structured tech-discussion text for model training and analysis.
Frequently Asked Questions
Is it legal to scrape Hacker News?
The Actor reads from the official public HN Algolia search API and only collects publicly available content. You are responsible for using the data in compliance with Hacker News' terms and applicable laws.
Do I need an API key or a login?
No. No Hacker News account, login or API key is required. You only need an Apify account to run the Actor.
How much data can I get?
The Algolia API normally caps results at 1,000 per query, but this Actor uses date-windowed pagination to get past that limit, letting you pull tens of thousands of items per run. Set maxItems to 0 to capture everything matching your query.
Can I scrape comments, not just stories?
Yes. The itemType field supports stories, comments, polls, Show HN, Ask HN and front-page items.
Can I filter by popularity?
Yes. Use the minPoints filter to return only items with at least that many points.
How fresh is the data and can I schedule it?
Data is pulled live at run time. Schedule the Actor on Apify with a keyword query for a continuously fresh feed of new Hacker News activity.
What output formats are supported?
Results are stored in a structured Apify dataset and can be exported as JSON, CSV, Excel or XML, or accessed via the Apify API.
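As a rough illustration of pulling a finished run's results in one of these formats, the sketch below builds a download URL for the Apify dataset items endpoint. The helper name `dataset_items_url` is hypothetical, and the dataset ID must come from your own run; private datasets also need an API token, which is omitted here.

```python
from urllib.parse import urlencode

# Formats named in this README, mapped to the API's format values.
EXPORT_FORMATS = {"json": "json", "csv": "csv", "excel": "xlsx", "xml": "xml"}


def dataset_items_url(dataset_id: str, fmt: str = "json") -> str:
    """Return the items URL for a dataset in the requested export format."""
    key = fmt.lower()
    if key not in EXPORT_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    query = urlencode({"format": EXPORT_FORMATS[key]})
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"
```

For instance, `dataset_items_url("YOUR_DATASET_ID", "excel")` yields a link that downloads the dataset as an `.xlsx` file.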
Scheduling & integration
Schedule this Actor on Apify to run on any cadence for a continuously fresh Hacker News feed. Export results to JSON, CSV or Excel, sync to Google Sheets, or push to dashboards, Slack/Discord alerts and webhooks through the Apify API.
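Scheduled runs with the same query will usually return items that earlier runs already saw. A simple way to turn them into a feed of new items only is to de-duplicate on `objectId` before alerting or syncing. The following is a minimal sketch; `new_items_only` is a hypothetical helper, and how the set of seen IDs is persisted between runs (a file, a database, a key-value store) is left open.

```python
from typing import Any, Dict, Iterable, List, Set


def new_items_only(items: Iterable[Dict[str, Any]], seen_ids: Set[str]) -> List[Dict[str, Any]]:
    """Return items whose objectId has not been seen yet, and record them as seen."""
    fresh = []
    for item in items:
        object_id = str(item["objectId"])
        if object_id in seen_ids:
            continue  # already delivered by an earlier run
        seen_ids.add(object_id)
        fresh.append(item)
    return fresh


# Example: two consecutive scheduled runs with one overlapping item.
seen: Set[str] = set()
first_run = [{"objectId": "1", "title": "A"}, {"objectId": "2", "title": "B"}]
second_run = [{"objectId": "2", "title": "B"}, {"objectId": "3", "title": "C"}]
first_batch = new_items_only(first_run, seen)
second_batch = new_items_only(second_run, seen)
```

Only the items in `second_batch` would then need to be pushed to Slack, Discord or a webhook after the second run.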