Hacker News Comment Scraper
Pricing
Pay per usage
Hacker News Comment Scraper
Scrape nested Hacker News comments by story URL or keyword search. Extract author, timestamp, text, and reply trees. Built for developers building HN-powered integrations and analytics tools.
Pricing
Pay per usage
Rating
0.0
(0)
Developer
pit 2017
Actor stats
0
Bookmarked
2
Total users
1
Monthly active users
2 days ago
Last modified
Categories
Share
Scrape nested comments from Hacker News stories by URL or keyword search. Extract author, timestamp, text, and reply trees.
Apify: https://console.apify.com/actors/ebS02wB1m9aZkUWL5
GitHub: https://github.com/xiaclaw2018/devnest (path: projects/apify-scraper/actors/hackernews-comment-scraper/)
技术栈: Python 3.11 + requests + BeautifulSoup4 + lxml (无需 Playwright!)
Features
- Two modes: URL scrape or keyword search
- Search via Algolia HN API — find stories by keyword
- Extract: author, timestamp, comment text, nested reply trees
- Lightweight: ~200MB image (vs Playwright's ~1GB+)
- Fast: <5s for 50 comments
- Reliable: requests + BeautifulSoup, no JavaScript rendering needed
Input
{"mode": "url","storyUrl": "https://news.ycombinator.com/item?id=9999999","maxComments": 50,"maxReplies": 5}
Or search by keyword:
{"mode": "search","searchQuery": "AI startup funding 2026","maxComments": 50,"maxReplies": 5}
Output
{"items": [{"author": "tptacek","text": "The fundamental problem here is that...","time": "2026-04-24T12:30:00","depth": 0,"replies": [{"author": "jacquesm", "text": "I'd push back slightly on..."}]}]}
Use Cases
- Build HN analytics dashboards
- Monitor discussions about your product/tech
- Train datasets on developer discussions
- Power HN-based recommendation engines
- Content aggregation for newsletters
Deploy Your Own
# Install Apify CLInpm install -g apify-cli# Initializeapify init --template python hackernews-comment-scrapercd hackernews-comment-scraper# Replace src/main.py with the code# Update Dockerfile:# FROM python:3.11-slim# RUN pip install --no-cache-dir requests beautifulsoup4 lxml# CMD ["python3", "src/main.py"]# Push to Apifyapify actors push
Tutorial
Full tutorial article: ../../articles/hackernews-comment-scraper.md
License
MIT