Hacker News Comment Scraper avatar

Hacker News Comment Scraper

Pricing

Pay per usage

Go to Apify Store
Hacker News Comment Scraper

Hacker News Comment Scraper

Scrape nested Hacker News comments by story URL or keyword search. Extract author, timestamp, text, and reply trees. Built for developers building HN-powered integrations and analytics tools.

Pricing

Pay per usage

Rating

0.0

(0)

Developer

pit 2017

pit 2017

Maintained by Community

Actor stats

0

Bookmarked

2

Total users

1

Monthly active users

2 days ago

Last modified

Share

Scrape nested comments from Hacker News stories by URL or keyword search. Extract author, timestamp, text, and reply trees.

Apify: https://console.apify.com/actors/ebS02wB1m9aZkUWL5
GitHub: https://github.com/xiaclaw2018/devnest (path: projects/apify-scraper/actors/hackernews-comment-scraper/)
技术栈: Python 3.11 + requests + BeautifulSoup4 + lxml (无需 Playwright!)

Features

  • Two modes: URL scrape or keyword search
  • Search via Algolia HN API — find stories by keyword
  • Extract: author, timestamp, comment text, nested reply trees
  • Lightweight: ~200MB image (vs Playwright's ~1GB+)
  • Fast: <5s for 50 comments
  • Reliable: requests + BeautifulSoup, no JavaScript rendering needed

Input

{
"mode": "url",
"storyUrl": "https://news.ycombinator.com/item?id=9999999",
"maxComments": 50,
"maxReplies": 5
}

Or search by keyword:

{
"mode": "search",
"searchQuery": "AI startup funding 2026",
"maxComments": 50,
"maxReplies": 5
}

Output

{
"items": [
{
"author": "tptacek",
"text": "The fundamental problem here is that...",
"time": "2026-04-24T12:30:00",
"depth": 0,
"replies": [
{"author": "jacquesm", "text": "I'd push back slightly on..."}
]
}
]
}

Use Cases

  • Build HN analytics dashboards
  • Monitor discussions about your product/tech
  • Train datasets on developer discussions
  • Power HN-based recommendation engines
  • Content aggregation for newsletters

Deploy Your Own

# Install Apify CLI
npm install -g apify-cli
# Initialize
apify init --template python hackernews-comment-scraper
cd hackernews-comment-scraper
# Replace src/main.py with the code
# Update Dockerfile:
# FROM python:3.11-slim
# RUN pip install --no-cache-dir requests beautifulsoup4 lxml
# CMD ["python3", "src/main.py"]
# Push to Apify
apify actors push

Tutorial

Full tutorial article: ../../articles/hackernews-comment-scraper.md

License

MIT