Product Hunt Scraper — Comments, Products & User Profiles
Pricing
from $5.00 / 1,000 products
Scrape Product Hunt comments, product details, and user profiles at scale. Extract reviews, upvotes, maker badges, product descriptions, media, launch rankings, and commenter social links (Twitter, LinkedIn, GitHub). Supports daily leaderboard URLs and individual product pages.
Developer
VulnV
Scrape Product Hunt comments, product data, and user profiles at scale. Extract reviews, upvotes, maker badges, product descriptions, screenshots, launch rankings, categories, and commenter social media links — all in structured JSON.
Works with individual product URLs, post URLs, and daily leaderboard URLs (automatically expanded into ranked products).
Key features
- Comments & reviews: full text (plain + HTML), vote counts, maker/hunter badges, timestamps, sticky/pinned flags
- Product profiles: name, tagline, description, logo, screenshots, videos, website, social links, categories, awards, launch rank and score, hunter info
- User profiles: name, headline, bio, avatar, Twitter, LinkedIn, GitHub, personal website, follower counts, verified status
- Leaderboard support: pass a `/leaderboard/daily/YYYY/M/D` URL to scrape the top N products from any day
- Three structured datasets with row-level cross-references between comments, products, and users
- Toggleable outputs: enable/disable comments, products, or user profile scraping independently
- Concurrent scraping: processes multiple products, comment pages, and user profiles in parallel
Use cases
- Market research — analyze Product Hunt launch performance, comment sentiment, and community reception
- Competitor analysis — compare products by upvotes, reviews, ratings, categories, and launch rankings
- Lead generation — extract commenter profiles with Twitter, LinkedIn, GitHub, and website links
- Trend tracking — monitor daily leaderboards to identify trending products and categories
- Content research — study what products and launches generate the most engagement
What data you get
| Dataset | Contents |
|---|---|
| Comments (default) | Comment text, HTML body, vote count, badges, timestamps, author reference, product reference |
| Products | Name, slug, tagline, description, logo, media, website, social URLs, categories, awards, launch details, review stats |
| Users | Name, username, headline, avatar, profile URL, Twitter, LinkedIn, GitHub, website, follower/following counts, verified status |
Input parameters
| Field | Type | Default | Description |
|---|---|---|---|
| `start_urls` | array | required | Product Hunt URLs to scrape; accepts `/products/…`, `/posts/…`, or `/leaderboard/daily/YYYY/M/D` |
| `max_products` | integer | 1 | Maximum products to scrape (top N for leaderboards, cap for direct URLs) |
| `max_comments` | integer | 100 | Maximum comments to extract per product |
| `scrape_comments` | boolean | true | Save comments to the default dataset |
| `scrape_products` | boolean | true | Save product info to the products dataset |
| `scrape_users` | boolean | true | Fetch and save commenter profiles to the users dataset |
Example: scrape top 5 products from the daily leaderboard
```json
{
  "start_urls": [{ "url": "https://www.producthunt.com/leaderboard/daily/2026/3/23" }],
  "max_products": 5,
  "max_comments": 50,
  "scrape_comments": true,
  "scrape_products": true,
  "scrape_users": true
}
```
Example: scrape a single product page
```json
{
  "start_urls": [{ "url": "https://www.producthunt.com/products/tobira-ai" }],
  "max_products": 1,
  "max_comments": 200
}
```
Example: scrape only product metadata (no comments or users)
```json
{
  "start_urls": [{ "url": "https://www.producthunt.com/leaderboard/daily/2026/3/23" }],
  "max_products": 10,
  "scrape_comments": false,
  "scrape_products": true,
  "scrape_users": false
}
```
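The input objects above can also be assembled and sanity-checked in code before a run is started. The following is a minimal Python sketch: `build_run_input` is a hypothetical helper, not part of the actor, and the regular expression only mirrors the three URL shapes listed in the input table (the actor's own validation may be looser or stricter).

```python
import json
import re
from typing import Any

# Assumption: accepted URLs look like the three shapes in the input table.
_ACCEPTED_URL = re.compile(
    r"^https://(www\.)?producthunt\.com/"
    r"(products/[^/?#]+|posts/[^/?#]+|leaderboard/daily/\d{4}/\d{1,2}/\d{1,2})/?$"
)


def build_run_input(
    urls: list[str],
    max_products: int = 1,
    max_comments: int = 100,
    scrape_comments: bool = True,
    scrape_products: bool = True,
    scrape_users: bool = True,
) -> dict[str, Any]:
    """Hypothetical helper: build an input object using the documented defaults."""
    if not urls:
        raise ValueError("start_urls is required")
    for url in urls:
        if not _ACCEPTED_URL.match(url):
            raise ValueError(f"unsupported Product Hunt URL: {url}")
    if max_products < 1 or max_comments < 0:
        raise ValueError("max_products must be >= 1 and max_comments >= 0")
    return {
        # The examples pass start_urls as a list of {"url": ...} objects.
        "start_urls": [{"url": url} for url in urls],
        "max_products": max_products,
        "max_comments": max_comments,
        "scrape_comments": scrape_comments,
        "scrape_products": scrape_products,
        "scrape_users": scrape_users,
    }


if __name__ == "__main__":
    run_input = build_run_input(
        ["https://www.producthunt.com/leaderboard/daily/2026/3/23"],
        max_products=5,
        max_comments=50,
    )
    print(json.dumps(run_input, indent=2))
```

The resulting dictionary can be pasted into the Apify Console as JSON or passed as the run input by whichever client is used to start the actor.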
Output format
Results are split into three datasets. The Apify Console Output tab links directly to each dataset with pre-configured table views.
Comments dataset (default)
```json
{
  "type": "comment",
  "id": "5217461",
  "body": "Hey PH! ...",
  "body_html": "<p>Hey PH!</p>...",
  "vote_count": 21,
  "created_at": "2026-03-17T06:24:57-07:00",
  "is_sticky": true,
  "is_pinned": false,
  "badges": ["maker"],
  "award": null,
  "product_name": "Tobira.ai",
  "product_tagline": "A network where AI agents find deals for their humans",
  "product_url": "https://www.producthunt.com/posts/tobira-ai",
  "user": { "id": "4231147", "name": "Vlad Shipilov", "username": "vlad_shipilov" },
  "user_id": "4231147",
  "username": "vlad_shipilov",
  "product_id": "1183905",
  "product_slug": "tobira-ai"
}
```
Products dataset
```json
{
  "id": "1183905",
  "name": "Tobira.ai",
  "slug": "tobira-ai",
  "tagline": "A network where AI agents find deals for their humans",
  "description": "...",
  "url": "https://www.producthunt.com/products/tobira-ai",
  "logo_url": "https://ph-files.imgix.net/...",
  "website_url": "https://tobira.ai",
  "twitter_url": "https://twitter.com/shipilov_vlad",
  "followers_count": 582,
  "reviews_count": 0,
  "reviews_rating": 0,
  "categories": [{ "name": "AI Agents", "slug": "ai-agents", "path": "/categories/ai-agents" }],
  "awards": [{ "position": 1, "period": "daily", "date": "2026-03-23" }],
  "media": [{ "type": "image", "url": "https://ph-files.imgix.net/..." }],
  "launch": {
    "id": "1100794",
    "slug": "tobira-ai",
    "name": "Tobira.ai",
    "daily_rank": "1",
    "launch_day_score": 447,
    "featured": true,
    "featured_at": "2026-03-23T00:01:00-07:00",
    "primary_link": "https://tobira.ai",
    "hunter": { "name": "fmerian", "username": "fmerian" }
  }
}
```
Users dataset
```json
{
  "id": "4231147",
  "name": "Vlad Shipilov",
  "username": "vlad_shipilov",
  "headline": "Founder Tobira.ai and Revoly.ai",
  "profile_url": "https://www.producthunt.com/@vlad_shipilov",
  "avatar_url": "https://ph-avatars.imgix.net/...",
  "twitter_username": "VladShipilov",
  "website_url": null,
  "linkedin_url": null,
  "github_username": null,
  "followers_count": 134,
  "is_verified": true,
  "social_links": [{ "type": "twitter", "url": "https://twitter.com/VladShipilov" }]
}
```
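For lead-generation use, the social fields of a user row can be flattened into a single mapping. This is a minimal sketch, assuming the field names shown in the users example above; `contact_links` is a hypothetical helper, and the `social_links` type names other than `twitter` (for example `github` or `linkedin`) are assumptions, since only a Twitter entry appears in the sample.

```python
from typing import Any


def contact_links(user: dict[str, Any]) -> dict[str, str]:
    """Hypothetical helper: collect the non-empty social links of one user row."""
    links: dict[str, str] = {}
    # social_links entries already carry a full URL per network.
    for entry in user.get("social_links") or []:
        if entry.get("type") and entry.get("url"):
            links[entry["type"]] = entry["url"]
    # Fall back to the flat fields when a network is absent from social_links.
    if user.get("twitter_username"):
        links.setdefault("twitter", f"https://twitter.com/{user['twitter_username']}")
    if user.get("github_username"):
        links.setdefault("github", f"https://github.com/{user['github_username']}")
    if user.get("linkedin_url"):
        links.setdefault("linkedin", user["linkedin_url"])
    if user.get("website_url"):
        links.setdefault("website", user["website_url"])
    return links


# Trimmed copy of the users example above.
sample_user = {
    "username": "vlad_shipilov",
    "twitter_username": "VladShipilov",
    "website_url": None,
    "linkedin_url": None,
    "github_username": None,
    "social_links": [{"type": "twitter", "url": "https://twitter.com/VladShipilov"}],
}

print(contact_links(sample_user))
# {'twitter': 'https://twitter.com/VladShipilov'}
```

Fields that are `null` in the dataset are skipped, so the mapping only contains links that can actually be used.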
Cross-dataset references
Every saved record includes a `storage` object with row-level pointers for joining data across datasets:
| Dataset | `storage` fields |
|---|---|
| Comments | `storage.item` (this row), `storage.product` (parent product), `storage.user` (commenter profile) |
| Products | `storage.item` (this row) |
| Users | `storage.item` (this row) |
Each pointer contains `dataset_id`, `dataset_name`, `item_offset`, `api_url`, and `console_url` for direct API access.
A `DATASET_LINKS` manifest is also written to the default key-value store with dataset IDs and console URLs for all three datasets.
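Once the three datasets have been downloaded, comments can be joined to their product and commenter locally. The sketch below is one possible approach, not the actor's own code: it joins on the flat `product_id` and `user_id` fields shown in the comment example instead of following the `storage` pointers, and `join_comments` is a hypothetical helper.

```python
from typing import Any

Row = dict[str, Any]


def join_comments(comments: list[Row], products: list[Row], users: list[Row]) -> list[Row]:
    """Attach the parent product and the commenter profile to each comment row."""
    products_by_id = {p["id"]: p for p in products}
    users_by_id = {u["id"]: u for u in users}
    joined = []
    for comment in comments:
        joined.append(
            {
                **comment,
                # None when the matching dataset was disabled or the row is missing.
                "product": products_by_id.get(comment.get("product_id")),
                "commenter": users_by_id.get(comment.get("user_id")),
            }
        )
    return joined


# Trimmed rows based on the example records in this README.
comments = [{"id": "5217461", "vote_count": 21, "product_id": "1183905", "user_id": "4231147"}]
products = [{"id": "1183905", "name": "Tobira.ai"}]
users = [{"id": "4231147", "username": "vlad_shipilov", "followers_count": 134}]

for row in join_comments(comments, products, users):
    print(row["id"], row["product"]["name"], row["commenter"]["username"])
```

Building the two lookup dictionaries once keeps the join linear in the number of rows, which matters when a leaderboard run produces thousands of comments.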
How it works
- Leaderboard expansion: leaderboard URLs are fetched via Zyte API's browser rendering and parsed for ranked product links
- Apollo SSR data extraction: product pages are rendered in a headless browser; the scraper parses `window[Symbol.for("ApolloSSRDataTransport")]` for structured data from Product Hunt's Next.js/Apollo SSR layer
- GraphQL enrichment: a parallel persisted GraphQL query (`ProductsPageLayout`) fetches additional fields like description, media, categories, awards, and external links
- Comment pagination: pages beyond the first are fetched concurrently in batches of 5
- User profile scraping: commenter profiles are fetched in batches of 10 and deduplicated across all products in the run
- Product concurrency: up to 5 products are scraped in parallel
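The batching and deduplication described in the list above can be sketched with `asyncio`. This is an illustration of the general pattern under stated assumptions, not the actor's source: `run_in_batches`, `unique_usernames`, and `fake_fetch_profile` are hypothetical names, and the fetch function is a stub that performs no network request.

```python
import asyncio
from collections.abc import Awaitable, Callable
from typing import Any, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def run_in_batches(
    items: list[T], worker: Callable[[T], Awaitable[R]], batch_size: int
) -> list[R]:
    """Run worker over items, at most batch_size at a time, preserving order."""
    results: list[R] = []
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        # gather() runs the whole batch concurrently and returns results in order.
        results.extend(await asyncio.gather(*(worker(item) for item in batch)))
    return results


def unique_usernames(comments: list[dict[str, Any]]) -> list[str]:
    """Deduplicate commenters across all products, keeping first-seen order."""
    seen: set[str] = set()
    ordered: list[str] = []
    for comment in comments:
        username = comment.get("username")
        if username and username not in seen:
            seen.add(username)
            ordered.append(username)
    return ordered


async def fake_fetch_profile(username: str) -> dict[str, str]:
    """Stub standing in for a real profile request."""
    await asyncio.sleep(0)
    return {"username": username}


def demo() -> list[dict[str, str]]:
    # Hypothetical comment rows from two products; "alice" commented on both.
    comments = [
        {"username": "alice", "product_id": "1"},
        {"username": "bob", "product_id": "1"},
        {"username": "alice", "product_id": "2"},
        {"username": "carol", "product_id": "2"},
    ]
    return asyncio.run(run_in_batches(unique_usernames(comments), fake_fetch_profile, batch_size=10))


if __name__ == "__main__":
    print(demo())
```

Fixed-size batches are the simplest way to cap concurrency; a semaphore-based worker pool would keep all slots busy but is harder to reason about when a single request stalls.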
Pricing
This actor uses Apify's pay-per-event model. You only pay for the data you save:
| Event | Triggered when |
|---|---|
| `fetch_product` | A product record is saved |
| `fetch_comment` | A comment record is saved |
| `fetch_user` | A user profile is saved |
Disable any dataset type (e.g. `scrape_users: false`) to skip both the scraping and the charge.
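To budget a run, the number of billable events can be estimated from the expected row counts and the enabled outputs. The sketch below is a rough illustration: `billable_events` and `estimate_cost` are hypothetical helpers, and the per-event prices are placeholders chosen for the example, not the actor's actual rates, which should be taken from the pricing shown on the actor page.

```python
def billable_events(
    products_saved: int,
    comments_saved: int,
    users_saved: int,
    *,
    scrape_products: bool = True,
    scrape_comments: bool = True,
    scrape_users: bool = True,
) -> dict[str, int]:
    """Count events per type; a disabled output produces no events."""
    return {
        "fetch_product": products_saved if scrape_products else 0,
        "fetch_comment": comments_saved if scrape_comments else 0,
        "fetch_user": users_saved if scrape_users else 0,
    }


def estimate_cost(events: dict[str, int], price_per_event: dict[str, float]) -> float:
    """Multiply each event count by its unit price and sum the result."""
    return sum(count * price_per_event[name] for name, count in events.items())


# Placeholder prices in USD per event. These are NOT the actor's real prices.
PLACEHOLDER_PRICES = {"fetch_product": 0.005, "fetch_comment": 0.001, "fetch_user": 0.002}

# Example: 5 products and 250 comments, with user profile scraping disabled.
events = billable_events(5, 250, 40, scrape_users=False)
print(events)
print(round(estimate_cost(events, PLACEHOLDER_PRICES), 4))
```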
Technical details
- Runtime: Python 3.13, Apify SDK 3.3+
- Dependencies: httpx, beautifulsoup4, lxml
- External service: Zyte API for browser-rendered HTML and HTTP proxy requests
- Memory: 512 MB