Hacker News Search Scraper

Pricing

from $3.00 / 1,000 results

Scrape Hacker News stories, comments and polls at scale via the Algolia API, extracting title, author, URL, text, points, comment count and tags. Search by keyword, filter by points, and collect thousands of items per run. Schedule it for a continuous HN feed.

Rating

0.0

(0)

Developer

Logiover

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 2
Monthly active users: 1
Last modified: 16 hours ago

🟠 Hacker News Search Scraper: Scrape HN Stories & Comments at Scale

Scrape Hacker News stories, comments and polls at scale through the official HN Algolia search API. This Hacker News scraper extracts title, author, URL, text, points, comment count, tags and created date, and is searchable by keyword and filterable by points. No login, no API key, no blocking. Thanks to date-windowed pagination it goes beyond Algolia's 1,000-result cap, returning thousands of items per run as JSON, CSV or Excel.

✨ What this Actor does / Key features

  • ⚙️ Official HN Algolia API: fast, reliable, and not subject to anti-bot blocking.
  • 📈 Beyond the 1,000-result cap: date-windowed pagination pulls tens of thousands of items per run.
  • 🗂️ Every item type: stories, comments, polls, Show HN, Ask HN and front-page items.
  • 🔎 Keyword search: track any topic, product, company or technology across all of Hacker News.
  • ⭐ Points filter: return only items with at least a minimum number of points.
  • 📦 Rich data per item: title, author, URL, text, points, comment count, tags, created date and HN link.
  • ♾️ Unlimited mode: set maxItems to 0 to pull everything matching your query.
  • ⏱️ Schedule-ready: built for recurring runs to maintain a continuously fresh HN feed.
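
The date-windowed pagination mentioned above can be sketched roughly as follows. This is a minimal illustration, not the Actor's actual code: `iter_windowed` and the `fetch_page` callable are hypothetical names, and the sketch assumes each hit carries the `objectID` and `created_at_i` (Unix timestamp) fields that the HN Algolia API returns. The idea is to fetch the newest batch, then ask again for items at or before the oldest timestamp seen, skipping anything already collected.

```python
from typing import Callable, Dict, Iterator, List, Optional

Hit = Dict[str, object]
# Hypothetical callable: returns hits created strictly before `before_ts`
# (or the newest hits when `before_ts` is None), newest first.
FetchPage = Callable[[Optional[int]], List[Hit]]


def iter_windowed(fetch_page: FetchPage, max_items: int = 0) -> Iterator[Hit]:
    """Yield hits newest-first, sliding the date window back after each batch.

    max_items=0 means "no limit", mirroring the Actor's maxItems convention.
    """
    before_ts: Optional[int] = None
    seen = set()
    count = 0
    while True:
        hits = fetch_page(before_ts)
        fresh = [h for h in hits if h["objectID"] not in seen]
        if not fresh:
            # Nothing new in this window: the result set is exhausted.
            # (Assumption: fewer items share one timestamp than fit in a batch.)
            return
        for hit in fresh:
            seen.add(hit["objectID"])
            yield hit
            count += 1
            if max_items and count >= max_items:
                return
        # Re-include the oldest second so items sharing that timestamp
        # are not skipped; the `seen` set removes the overlap.
        before_ts = min(int(h["created_at_i"]) for h in hits) + 1
```

In practice `fetch_page` would wrap a request to the Algolia `search_by_date` endpoint with a `created_at_i<...` numeric filter; keeping it as a parameter makes the windowing logic easy to test without network access.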

🔍 Input

| Field | Type | Description |
|---|---|---|
| query | string | Keyword to search across Hacker News (e.g. "AI", "startup", "rust"). Leave empty to scrape all items. |
| itemType | string | What to scrape: story, comment, poll, show_hn, ask_hn or front_page. |
| minPoints | integer | Only items with at least this many points. 0 = no filter. |
| maxItems | integer | Maximum items to save. Uses date-windowed pagination to go beyond Algolia's 1,000-result cap. 0 = all. |

🚀 Example input

{
  "query": "AI",
  "itemType": "story",
  "minPoints": 10,
  "maxItems": 0
}
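
To make the input fields more concrete, here is a hedged sketch of how an input like the one above might translate into a request against the public HN Algolia API. The Actor's real mapping is not documented here, so `build_search_url` is a hypothetical helper; the endpoint, the `tags` values and the `numericFilters` syntax are those of the HN Algolia search API as generally documented, and the `hitsPerPage` value of 1000 is an assumption about the allowed maximum.

```python
from typing import Optional
from urllib.parse import urlencode

SEARCH_BY_DATE = "https://hn.algolia.com/api/v1/search_by_date"
ITEM_TYPES = {"story", "comment", "poll", "show_hn", "ask_hn", "front_page"}


def build_search_url(actor_input: dict, before_ts: Optional[int] = None) -> str:
    """Build a search_by_date URL from Actor-style input (illustrative only)."""
    item_type = actor_input.get("itemType", "story")
    if item_type not in ITEM_TYPES:
        raise ValueError(f"unsupported itemType: {item_type!r}")

    params = {
        "query": actor_input.get("query", ""),
        "tags": item_type,
        "hitsPerPage": 1000,  # assumed maximum page size
    }

    numeric_filters = []
    min_points = int(actor_input.get("minPoints", 0))
    if min_points > 0:  # 0 means "no filter"
        numeric_filters.append(f"points>={min_points}")
    if before_ts is not None:  # upper bound of the current date window
        numeric_filters.append(f"created_at_i<{before_ts}")
    if numeric_filters:
        params["numericFilters"] = ",".join(numeric_filters)

    return f"{SEARCH_BY_DATE}?{urlencode(params)}"
```

Note that `maxItems` does not appear in the URL: it limits how many results are saved across all windows, so it belongs in the loop that consumes the pages rather than in a single request.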

📦 Output

Each item is saved as a structured record in the dataset. Export to JSON, CSV, Excel or XML, or pull via the Apify API.

| Field | Description |
|---|---|
| objectId | Unique Hacker News item ID |
| type | Item type (story, comment, poll, etc.) |
| title | Item title |
| author | Username of the author |
| url | External URL linked by the item (for stories) |
| text | Item text body (for comments, Ask HN, polls) |
| points | Number of points / upvotes |
| numComments | Number of comments |
| tags | Array of HN/Algolia tags |
| createdAt | Date the item was created |
| hnUrl | Direct link to the item on Hacker News |
| scrapedAt | Scrape timestamp (ISO 8601) |
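
As an illustration of the record shape, the sketch below maps a raw Algolia hit onto the output fields listed above. It is an assumption-laden example rather than the Actor's implementation: `to_record` is a hypothetical name, and the input keys (`objectID`, `story_text`, `comment_text`, `num_comments`, `_tags`, `created_at`) are the field names the HN Algolia API is generally known to return.

```python
from datetime import datetime, timezone
from typing import Any, Dict

# Base item types that appear among Algolia's _tags (assumed list).
TYPE_TAGS = ("story", "comment", "poll", "pollopt")


def to_record(hit: Dict[str, Any]) -> Dict[str, Any]:
    """Convert one Algolia hit into the output record shape (illustrative)."""
    tags = hit.get("_tags") or []
    object_id = str(hit["objectID"])
    return {
        "objectId": object_id,
        # First base type found in the tags, or None if none is present.
        "type": next((t for t in TYPE_TAGS if t in tags), None),
        "title": hit.get("title"),
        "author": hit.get("author"),
        "url": hit.get("url"),
        # Stories and comments keep their body text in different fields.
        "text": hit.get("story_text") or hit.get("comment_text"),
        "points": hit.get("points"),
        "numComments": hit.get("num_comments"),
        "tags": tags,
        "createdAt": hit.get("created_at"),
        "hnUrl": f"https://news.ycombinator.com/item?id={object_id}",
        "scrapedAt": datetime.now(timezone.utc).isoformat(),
    }
```

Fields that do not apply to an item type (for example `url` on a comment) simply come back as `None` in this sketch.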

💡 Use cases

  • Brand & topic monitoring: track every Hacker News mention of your product, company or technology.
  • Trend research: analyze what the tech community is discussing and how interest shifts over time.
  • Datasets & newsletters: build a continuously updated Hacker News dataset for analysis or curation.
  • Sentiment & PR: catch discussions about your brand or industry early.
  • Developer & market research: study which tools, languages and startups gain traction on HN.
  • AI / LLM training data: collect structured tech-discussion text for model training and analysis.

❓ Frequently Asked Questions

Is it legal to scrape Hacker News? The Actor reads from the official public HN Algolia search API and only collects publicly available content. You are responsible for using the data in compliance with Hacker News' terms and applicable laws.

Do I need an API key or a login? No. There is no Hacker News account, login or API key required. You only need an Apify account to run the Actor.

How much data can I get? The Algolia API normally caps results at 1,000 per query, but this Actor uses date-windowed pagination to break past that limit, letting you pull tens of thousands of items per run. Set maxItems to 0 to capture everything matching your query.

Can I scrape comments, not just stories? Yes. The itemType field supports stories, comments, polls, Show HN, Ask HN and front-page items.

Can I filter by popularity? Yes. Use the minPoints filter to return only items with at least a given number of points.

How fresh is the data and can I schedule it? Data is pulled live at run time. Schedule the Actor on Apify with a keyword query for a continuously fresh feed of new Hacker News activity.

What output formats are supported? Results are stored in a structured Apify dataset and can be exported as JSON, CSV, Excel or XML, or accessed via the Apify API.

⏰ Scheduling & integration

Schedule this Actor on Apify to run on any cadence for a continuously fresh Hacker News feed. Export results to JSON, CSV or Excel, sync to Google Sheets, or push to dashboards, Slack/Discord alerts and webhooks through the Apify API.
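
For scheduled runs that feed a downstream system, a small amount of glue code is usually enough. The sketch below is one possible approach under stated assumptions: `dataset_items_url` builds a URL for the Apify API's dataset items endpoint (authentication is omitted and must be supplied by the caller), and `new_items` is a hypothetical helper that keeps only records whose `objectId` has not been seen in earlier runs.

```python
from typing import Dict, Iterable, List, Set
from urllib.parse import urlencode


def dataset_items_url(dataset_id: str, fmt: str = "json") -> str:
    """URL for downloading a dataset's items in the given export format."""
    query = urlencode({"format": fmt})
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"


def new_items(items: Iterable[Dict], seen_ids: Set[str]) -> List[Dict]:
    """Return items not seen before, updating `seen_ids` in place."""
    fresh = []
    for item in items:
        object_id = item["objectId"]
        if object_id not in seen_ids:
            seen_ids.add(object_id)
            fresh.append(item)
    return fresh
```

Persisting `seen_ids` between runs (for example in a file or key-value store) lets an alerting job forward only Hacker News items that appeared since the previous run.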