Hacker News Scraper
Pricing
from $2.00 / 1,000 results
Scrape Hacker News stories via the official Firebase API. Get top, new, best, ask, show, and job stories with scores, comment counts, and author data.
Hacker News Story Scraper
Extract tech stories, jobs, and discussions from Hacker News using the official Firebase API. Get top/best/new stories, Ask HN, Show HN, and job postings with full metadata.
Features
- Official API-based - Zero blocking, 100% reliability (uses hacker-news.firebaseio.com)
- 6 story types - Top, Best, New, Ask HN, Show HN, Jobs
- Score filtering - Filter by minimum upvotes
- Keyword search - Search in titles (case-insensitive)
- Full metadata - Title, URL, score, author, comments, timestamp, HN link
Use Cases
- Tech trend monitoring - Track trending technologies and discussions
- Startup/product research - Discover new products and startup launches
- Competitive intelligence - Monitor competitor mentions and discussions
- Content curation - Find quality content for newsletters/social media
- Recruitment - Browse job postings from tech companies
- Market research - Analyze what the tech community is interested in
Input Parameters
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| storyType | string | Yes | "top" | Type of stories: top, best, new, ask, show, job |
| maxResults | number | No | 50 | Maximum stories to extract (1-500) |
| minScore | number | No | - | Only include stories with this many upvotes or more |
| keyword | string | No | - | Only include stories with this keyword in title |
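The table above can be read as a small validation contract. The following is a minimal sketch of that contract in Python; the `ScraperInput` class and `parse_input` helper are hypothetical names used for illustration and are not part of the actor itself.

```python
from dataclasses import dataclass
from typing import Optional

# Allowed values for storyType, taken from the input table.
STORY_TYPES = ("top", "best", "new", "ask", "show", "job")


@dataclass
class ScraperInput:
    """Hypothetical container mirroring the documented input fields."""
    storyType: str = "top"
    maxResults: int = 50
    minScore: Optional[int] = None
    keyword: Optional[str] = None

    def __post_init__(self) -> None:
        # Reject values outside the documented ranges.
        if self.storyType not in STORY_TYPES:
            raise ValueError(f"storyType must be one of {STORY_TYPES}")
        if not 1 <= self.maxResults <= 500:
            raise ValueError("maxResults must be between 1 and 500")


def parse_input(raw: dict) -> ScraperInput:
    """Build a validated input object, applying the documented defaults."""
    return ScraperInput(**raw)
```

For example, `parse_input({"storyType": "show", "maxResults": 100, "minScore": 10})` yields an object whose `keyword` is `None`, while an unknown story type raises `ValueError`.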
Story Types
- Top Stories - Currently trending stories
- Best Stories - Best stories based on HN algorithm
- New Stories - Most recently submitted stories
- Ask HN - Questions and discussions
- Show HN - Project/product showcases
- Jobs - Job postings
Output Format
Each story includes:
{"id": 39631123,"title": "Show HN: I built a tool to analyze Hacker News trends","url": "https://example.com/hn-analyzer","score": 342,"author": "techfounder","commentCount": 87,"postedAt": "2024-02-12T10:30:00.000Z","type": "story","hnUrl": "https://news.ycombinator.com/item?id=39631123","scrapedAt": "2024-02-12T15:45:00.000Z"}
Field Descriptions
- id - Unique HN story ID
- title - Story title
- url - External URL (null for Ask HN/text posts)
- score - Number of upvotes
- author - HN username of submitter
- commentCount - Number of comments/discussions
- postedAt - Submission timestamp (ISO 8601)
- type - Story type (story, job, poll, etc.)
- hnUrl - Direct link to HN discussion page
- scrapedAt - Timestamp when data was extracted
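To make the field descriptions concrete, here is a sketch of how a raw item from the HN API could be mapped to this output shape. The raw field names (`by`, `descendants`, `time`) come from the public HN API; the `to_record` and `iso_millis` helpers and the exact mapping are assumptions for illustration, not the actor's confirmed implementation.

```python
from datetime import datetime, timezone
from typing import Optional


def iso_millis(dt: datetime) -> str:
    """Format an aware datetime as UTC ISO 8601 with millisecond precision."""
    text = dt.astimezone(timezone.utc).isoformat(timespec="milliseconds")
    return text.replace("+00:00", "Z")


def to_record(item: dict, scraped_at: Optional[datetime] = None) -> dict:
    """Map a raw HN API item to the documented output fields (assumed mapping)."""
    scraped_at = scraped_at or datetime.now(timezone.utc)
    posted_at = datetime.fromtimestamp(item["time"], tz=timezone.utc)
    return {
        "id": item["id"],
        "title": item.get("title"),
        "url": item.get("url"),  # missing for Ask HN/text posts, so None
        "score": item.get("score", 0),
        "author": item.get("by"),
        "commentCount": item.get("descendants", 0),
        "postedAt": iso_millis(posted_at),
        "type": item.get("type"),
        "hnUrl": f"https://news.ycombinator.com/item?id={item['id']}",
        "scrapedAt": iso_millis(scraped_at),
    }
```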
Example Usage
Top 30 AI-related stories with minimum 50 upvotes
{"storyType": "top","maxResults": 30,"minScore": 50,"keyword": "AI"}
Recent Show HN projects
{"storyType": "show","maxResults": 100,"minScore": 10}
Job postings from YC companies
{"storyType": "job","maxResults": 50,"keyword": "YC"}
Best Ask HN questions
{"storyType": "ask","maxResults": 25,"minScore": 100}
Pricing
Approximately $2.50 per 1,000 stories (based on compute units)
Cost Estimation
| Stories | Approx. Cost | Duration |
|---|---|---|
| 50 | $0.12 | ~30 seconds |
| 100 | $0.25 | ~1 minute |
| 500 | $1.25 | ~5 minutes |
Estimates cover the time spent on API calls plus the rate-limiting delay (0.5s between requests)
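The figures in the table follow from simple arithmetic. The sketch below reproduces it, assuming cost scales linearly at $2.50 per 1,000 stories and that the 0.5s delay is the dominant part of the run time; the `estimate` function is a hypothetical helper, and actual billing depends on compute units.

```python
PRICE_PER_1000_USD = 2.50  # from the Pricing section
DELAY_SECONDS = 0.5        # pause between story detail requests


def estimate(stories: int) -> tuple[float, float]:
    """Return (approximate cost in USD, seconds spent in rate-limit delays)."""
    cost = stories * PRICE_PER_1000_USD / 1000
    wait = stories * DELAY_SECONDS
    return cost, wait
```

For 50 stories this gives $0.125 and 25 seconds of delay, which lines up with the table's $0.12 and ~30 seconds once request latency and startup time are added on top.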
Tips & Best Practices
Filtering Strategy
If you need 50 stories with specific filters (minScore/keyword):
- Set maxResults higher (100-150) to account for filtered items
- The actor fetches up to 2x maxResults to ensure enough matches
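The over-fetching behavior can be sketched as follows. This is an illustration under stated assumptions: the `collect_matches` function and its `fetch_item` callback are hypothetical, and stopping early once enough matches are found is a guess at the actor's behavior, not something the documentation confirms.

```python
from typing import Callable, Iterable, Optional


def collect_matches(
    story_ids: Iterable[int],
    fetch_item: Callable[[int], Optional[dict]],
    max_results: int,
    min_score: Optional[int] = None,
    keyword: Optional[str] = None,
) -> list[dict]:
    """Fetch up to 2x max_results items when filtering and keep the matches."""
    filtering = min_score is not None or bool(keyword)
    budget = max_results * 2 if filtering else max_results
    needle = keyword.lower() if keyword else None
    matches: list[dict] = []
    for fetched, story_id in enumerate(story_ids):
        # Stop when the fetch budget is spent or enough matches were found.
        if fetched >= budget or len(matches) >= max_results:
            break
        item = fetch_item(story_id)
        if item is None:
            continue
        if min_score is not None and item.get("score", 0) < min_score:
            continue
        if needle and needle not in (item.get("title") or "").lower():
            continue
        matches.append(item)
    return matches
```

With a strict filter, the 2x budget can run out before `max_results` matches are found, which is why raising maxResults helps when you need a fixed number of filtered stories.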
Story Type Selection
- Top - Most balanced view of current trending content
- Best - Highest quality stories (better signal-to-noise)
- New - Real-time monitoring, catch stories early
- Ask - Community discussions, Q&A, career advice
- Show - New product launches, side projects
- Job - Tech job opportunities, mostly from startups
Rate Limiting
- The actor paces its HN API calls with a 0.5s delay between requests
- 50 stories = ~30 seconds
- 500 stories = ~5 minutes
- No risk of being blocked (official API)
Data Freshness
- Stories are fetched in real-time from HN API
- Top/Best/New lists update frequently (every few minutes)
- Job postings update less frequently
Keyword Matching
- Case-insensitive search
- Matches anywhere in title
- Examples: "AI", "LLM", "YC", "startup", "open source"
- For multiple keywords, run the actor once per keyword and merge the results
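The matching rule and the merge step for multiple keywords can be sketched like this. Both helpers (`title_matches`, `merge_runs`) are hypothetical names for illustration; deduplicating on the `id` field is an assumption based on the output format, since the same story can match more than one keyword.

```python
def title_matches(title: str, keyword: str) -> bool:
    """Case-insensitive substring match anywhere in the title."""
    return keyword.lower() in title.lower()


def merge_runs(*runs: list[dict]) -> list[dict]:
    """Combine results from several runs, keeping the first copy of each story id."""
    seen: set[int] = set()
    merged: list[dict] = []
    for run in runs:
        for story in run:
            if story["id"] not in seen:
                seen.add(story["id"])
                merged.append(story)
    return merged
```

Because this is a plain substring match, short keywords can over-match: "AI" also matches a title containing "maintaining". Longer or more specific keywords reduce such false positives.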
Technical Details
API Endpoints Used
- Story IDs: https://hacker-news.firebaseio.com/v0/{type}stories.json
- Story details: https://hacker-news.firebaseio.com/v0/item/{id}.json
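As a rough illustration of how these two endpoints fit together, the sketch below builds the URLs and fetches JSON using only the Python standard library. The helper names are hypothetical, and the actor may use a different HTTP client.

```python
import json
from urllib.request import urlopen

BASE_URL = "https://hacker-news.firebaseio.com/v0"


def story_ids_url(story_type: str) -> str:
    """URL of the ID list for a story type, e.g. 'top' -> .../topstories.json."""
    return f"{BASE_URL}/{story_type}stories.json"


def item_url(item_id: int) -> str:
    """URL of a single item's details."""
    return f"{BASE_URL}/item/{item_id}.json"


def fetch_json(url: str, timeout: float = 10.0):
    """Perform a GET request and decode the JSON body."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

A typical flow would call `fetch_json(story_ids_url("top"))` to get a list of IDs, then `fetch_json(item_url(story_id))` for each ID of interest.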
Rate Limiting
- 0.5 second delay between story detail requests
- Public API, no authentication required
- No IP blocking or rate limits
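The fixed delay between detail requests can be sketched as a small generator. The `paced` helper is a hypothetical name, and the `sleep` parameter is injectable only so the pacing can be observed without actually waiting; how the actor implements its delay is not documented here.

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def paced(
    items: Iterable[T],
    fn: Callable[[T], R],
    delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[R]:
    """Apply fn to each item, pausing for `delay` seconds between calls."""
    for index, item in enumerate(items):
        if index:  # no pause before the first request
            sleep(delay)
        yield fn(item)
```

At 0.5s per request this caps throughput at roughly 120 stories per minute before network latency is taken into account.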
Error Handling
- Continues on individual story fetch failures
- Logs warnings for failed requests
- Returns all successfully fetched stories
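The three behaviors listed above amount to a tolerant fetch loop. Here is a minimal sketch; the `fetch_all` function, the `fetch_item` callback, and the logger name are all assumptions for illustration.

```python
import logging
from typing import Callable, Iterable, Optional

logger = logging.getLogger("hn_scraper")  # hypothetical logger name


def fetch_all(
    story_ids: Iterable[int],
    fetch_item: Callable[[int], Optional[dict]],
) -> list[dict]:
    """Fetch every story, skipping failures instead of aborting the run."""
    stories: list[dict] = []
    for story_id in story_ids:
        try:
            item = fetch_item(story_id)
        except Exception as exc:
            # Log a warning and move on to the next story.
            logger.warning("Failed to fetch story %s: %s", story_id, exc)
            continue
        if item is not None:  # skip empty responses
            stories.append(item)
    return stories
```

A single failed request therefore costs one story rather than the whole run, at the price of a result set that may be slightly smaller than requested.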
Data Quality
- All data comes directly from HN official API
- No web scraping, no parsing errors
- 100% reliability and accuracy
Common Use Cases
1. Startup Trend Analysis
Track what startups are launching and getting traction:
{"storyType": "show","maxResults": 200,"minScore": 20}
2. AI/ML News Monitoring
Stay updated on AI developments:
{"storyType": "best","maxResults": 100,"keyword": "AI"}
3. Job Board Scraping
Build a job aggregator:
{"storyType": "job","maxResults": 500}
4. Content Curation
Find high-quality content for newsletters:
{"storyType": "best","maxResults": 50,"minScore": 100}
Limitations
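- Maximum 500 stories per run (API limitation: the HN API returns at most 500 IDs for the top and new lists, and at most 200 for the Ask HN, Show HN, and Jobs lists)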
- Keyword search is simple substring match (not full-text search)
- Rate limited to ~120 stories/minute (to respect HN API)
- No access to comment content (only comment counts)
Support
For issues or feature requests, please contact the actor maintainer.
License
This actor is provided as-is for use on the Apify platform.
