Product Hunt Founders & Makers Scraper
Pricing
from $3.00 / 1,000 products extracted
Scrape Product Hunt hunter and maker contact data (name, X handle, headline, profile URL) by slug. HTTP-only, with no headless browser; Chrome TLS impersonation is used to get past Cloudflare. Useful for outreach prospecting after PH launches.
Developer
Arnas
Maintained by Community
Product Hunt Maker Extractor
Scrape Product Hunt hunter and maker contact data — name, X (Twitter) handle, headline, profile URL — for a list of product slugs.
Returns one structured record per slug, with the hunter (the person who submitted the post) and the team (makers) extracted from PH's public Apollo SSR payload. Pure HTTP with no browser: requests go through Apify Proxy using got-scraping's Chrome TLS impersonation, so no separate Cloudflare-bypass step is needed.
Input
```json
{
  "slugs": ["lovable", "n8n-io"],
  "includeMakerProfiles": true,
  "includeHunter": true,
  "maxConcurrency": 5,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
```
| Field | Type | Default | Notes |
|---|---|---|---|
| `slugs` | `string[]` | required | Bare slug (`lovable`) or full URL (`https://www.producthunt.com/products/lovable`). Legacy `/posts/<slug>` URLs are also accepted. Validated against `/^[a-zA-Z0-9_-]{1,80}$/` before any HTTP fetch. Dedup is first-occurrence-wins. |
| `expectedMakerCounts` | `{ [slug]: integer }` | `{}` | Optional cross-check map. If the scraper returns fewer makers than expected for a slug, the record is marked `partial` and a `maker_count_mismatch` warning is logged so silent under-extraction surfaces to ops. |
| `includeMakerProfiles` | `boolean` | `true` | When `false`, only `name`, `phUsername`, `phProfileUrl`, and the product-page-derived `headline` are captured. Skips the per-maker profile-page fetch (saves request volume + cost). |
| `includeHunter` | `boolean` | `true` | When `false`, the hunter block is omitted (set to `null`). |
| `maxConcurrency` | `integer` | `5` | Concurrent HTTP requests. Capped at 30. PH is rate-limit-tolerant on residential proxies; the cap is a defensive ceiling. |
| `requestTimeoutSecs` | `integer` | `45` | Per-request wall-clock timeout. |
| `proxyConfiguration` | `object` | `RESIDENTIAL` | `RESIDENTIAL` is recommended; PH's WAF rate-limits aggressively on datacenter IPs. |
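The `slugs` rules in the table (bare slug or URL, regex validation, first-occurrence-wins dedup) could be sketched roughly as follows. The function names `normalizeSlug` and `dedupeSlugs` are hypothetical and not taken from the actor's source; only the validation regex and the accepted URL shapes come from the table. Whether the actor treats slugs case-sensitively is not stated, so this sketch compares them exactly.

```typescript
// Validation pattern quoted from the input table.
const SLUG_PATTERN = /^[a-zA-Z0-9_-]{1,80}$/;

// Hypothetical helper: reduce a bare slug or a Product Hunt URL to a slug.
// Returns null when the result does not pass validation.
function normalizeSlug(input: string): string | null {
  const trimmed = input.trim();
  let candidate = trimmed;
  // Accept /products/<slug> and legacy /posts/<slug> URLs.
  const match = trimmed.match(
    /^https?:\/\/(?:www\.)?producthunt\.com\/(?:products|posts)\/([^/?#]+)/
  );
  if (match && match[1]) candidate = match[1];
  return SLUG_PATTERN.test(candidate) ? candidate : null;
}

// Hypothetical helper: validate every input and keep the first occurrence
// of each slug, preserving input order.
function dedupeSlugs(inputs: string[]): { slugs: string[]; rejected: string[] } {
  const seen = new Set<string>();
  const slugs: string[] = [];
  const rejected: string[] = [];
  for (let i = 0; i < inputs.length; i++) {
    const raw = inputs[i];
    const slug = normalizeSlug(raw);
    if (slug === null) {
      rejected.push(raw);
      continue;
    }
    if (seen.has(slug)) continue; // a later duplicate is dropped
    seen.add(slug);
    slugs.push(slug);
  }
  return { slugs, rejected };
}
```

Validating before any fetch means a malformed entry costs nothing in proxy traffic, which is presumably why the table calls it out.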
Output
One dataset record per input slug:
```json
{
  "slug": "lovable",
  "scrapedAt": "2026-04-29T13:35:09.907Z",
  "productUrl": "https://www.producthunt.com/products/lovable",
  "status": "ok",
  "hunter": {
    "name": "Chris Messina",
    "phUsername": "chrismessina",
    "phProfileUrl": "https://www.producthunt.com/@chrismessina",
    "xHandle": "chrismessina",
    "websiteUrl": null,
    "linkedinUrl": null,
    "headline": "🏆 #1 Hunter!"
  },
  "makers": [
    {
      "name": "Anton Osika",
      "phUsername": "antonosika",
      "phProfileUrl": "https://www.producthunt.com/@antonosika",
      "xHandle": null,
      "websiteUrl": null,
      "linkedinUrl": null,
      "headline": "Physicist, hacker, Founder, CTO"
    }
  ],
  "errors": []
}
```
status ∈ 'ok' | 'partial' | 'product_not_found' | 'cloudflare_block' | 'rate_limited'. Per-record errors[] enumerates per-maker profile fetch failures (profile_404, cloudflare_block, navigation_timeout, parse_error).
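To illustrate how the `partial` status relates to the `expectedMakerCounts` input, here is a minimal sketch of the cross-check. The function name `checkMakerCount` and the warning message format are assumptions for illustration; the source only states that the record is marked `partial` and a `maker_count_mismatch` warning is logged. The sketch also assumes the check applies only to records that would otherwise be `ok`.

```typescript
// Status values as listed above.
type RecordStatus =
  | 'ok'
  | 'partial'
  | 'product_not_found'
  | 'cloudflare_block'
  | 'rate_limited';

interface CountCheck {
  status: RecordStatus;
  warning: string | null;
}

// Hypothetical helper: downgrade an 'ok' record to 'partial' when fewer
// makers were extracted than the caller expected for that slug.
function checkMakerCount(
  slug: string,
  status: RecordStatus,
  makerCount: number,
  expectedMakerCounts: Record<string, number>
): CountCheck {
  const expected = expectedMakerCounts[slug];
  // Assumption: failure statuses are never overridden by the count check.
  if (status !== 'ok' || expected === undefined || makerCount >= expected) {
    return { status, warning: null };
  }
  return {
    status: 'partial',
    // Message layout is illustrative, not the actor's actual log format.
    warning: `maker_count_mismatch slug=${slug} expected=${expected} got=${makerCount}`,
  };
}
```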
A run-level summary is logged at end-of-run:
[run-summary-json] {"total":10,"succeeded":9,"productNotFound":0,"cloudflareBlocked":0,"rateLimited":0,"partial":1,"consecutiveBlocksAtEnd":0,"runOutcome":"normal"}
consecutiveBlocksAtEnd >= 3 is the operational signal for "session-wide regression" (PH banned the residential pool, or rolled out a new WAF rule).
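On the consuming side, the summary line can be picked out of the run log and checked against that threshold. The sketch below assumes the log line has exactly the `[run-summary-json] {...}` shape shown above; the names `parseRunSummary` and `isSessionWideRegression` are hypothetical, and the interface simply mirrors the fields in the sample line.

```typescript
// Fields as they appear in the sample summary line.
interface RunSummary {
  total: number;
  succeeded: number;
  productNotFound: number;
  cloudflareBlocked: number;
  rateLimited: number;
  partial: number;
  consecutiveBlocksAtEnd: number;
  runOutcome: string;
}

const SUMMARY_PREFIX = '[run-summary-json] ';

// Hypothetical helper: return the parsed summary, or null when the line
// is not a summary line or its JSON is malformed.
function parseRunSummary(logLine: string): RunSummary | null {
  const idx = logLine.indexOf(SUMMARY_PREFIX);
  if (idx === -1) return null;
  try {
    return JSON.parse(logLine.slice(idx + SUMMARY_PREFIX.length)) as RunSummary;
  } catch {
    return null;
  }
}

// Threshold taken from the text: three or more consecutive blocks at the
// end of the run signals a session-wide regression.
function isSessionWideRegression(summary: RunSummary): boolean {
  return summary.consecutiveBlocksAtEnd >= 3;
}
```

A monitoring job could call `parseRunSummary` on each log line and alert when `isSessionWideRegression` returns `true`.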
Field availability notes
- `xHandle` is populated when the maker has linked their X account on Product Hunt (~25–60% coverage); `null` when the user hasn't.
- `websiteUrl`, `linkedinUrl` are always `null` in v0.2. Product Hunt removed those fields from public profiles in 2025. They remain in the schema for backward compatibility. Caller-side enrichment (Clearbit, Hunter.io, Apollo) is required if needed.
- `headline` comes from the product page's embedded payload; reused as the profile fallback when `includeMakerProfiles=false`.
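Because `xHandle` is frequently `null`, caller code usually needs to filter before using it. A minimal caller-side sketch follows; the interfaces are inferred from the sample output record rather than taken from the actor's type definitions, and `collectXHandles` is a hypothetical name.

```typescript
// Shapes inferred from the sample record; only the fields used here.
interface Person {
  name: string;
  phUsername: string;
  xHandle: string | null;
}

interface ProductRecord {
  slug: string;
  hunter: Person | null; // null when includeHunter is false
  makers: Person[];
}

// Hypothetical helper: gather the distinct, non-null X handles across the
// hunter and makers of every record, in first-seen order.
function collectXHandles(records: ProductRecord[]): string[] {
  const handles = new Set<string>();
  records.forEach((record) => {
    const people = record.hunter ? [record.hunter].concat(record.makers) : record.makers;
    people.forEach((person) => {
      if (person.xHandle) handles.add(person.xHandle);
    });
  });
  return Array.from(handles);
}
```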
Architecture
CheerioCrawler (Crawlee) → got-scraping for HTTP fetching with Chrome TLS impersonation. The product page (/products/<slug>) and profile page (/@<username>) both ship their data inline via Apollo's SSR data transport (window[Symbol.for("ApolloSSRDataTransport")].push({...})). The parser walks the embedded JSON with a brace-counting helper — no DOM, no headless browser, no JS execution.
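The brace-counting idea can be sketched as below. This is an illustration of the general technique, not the actor's parser: the function names are hypothetical, and it assumes each pushed payload is strict JSON, which may not hold if the transport serializes values such as `undefined`.

```typescript
// Hypothetical helper: starting at an opening brace, return the substring
// up to its matching closing brace. Braces inside string literals are
// ignored, and backslash escapes inside strings are honoured.
function extractBalancedObject(source: string, start: number): string | null {
  if (source[start] !== '{') return null;
  let depth = 0;
  let inString = false;
  let escaped = false;
  for (let i = start; i < source.length; i++) {
    const ch = source[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === '\\') escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === '{') depth++;
    else if (ch === '}') {
      depth--;
      if (depth === 0) return source.slice(start, i + 1);
    }
  }
  return null; // unbalanced input
}

// Hypothetical helper: find every ApolloSSRDataTransport push call in the
// HTML and parse its object argument. Payloads that fail to parse are skipped.
function extractApolloPayloads(html: string): unknown[] {
  const payloads: unknown[] = [];
  let from = 0;
  while (true) {
    const at = html.indexOf('ApolloSSRDataTransport', from);
    if (at === -1) break;
    const push = html.indexOf('.push(', at);
    if (push === -1) break;
    const start = push + '.push('.length;
    from = start; // always advances, so the loop terminates
    const raw = extractBalancedObject(html, start);
    if (raw === null) continue;
    try {
      payloads.push(JSON.parse(raw));
    } catch {
      // Not strict JSON: skipped in this sketch.
    }
  }
  return payloads;
}
```

Tracking string state matters because headlines and descriptions can contain literal `{` or `}` characters, which would otherwise throw off the depth counter.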
Sessions are pooled and rotated on 401/403/429 responses (Crawlee handles this automatically). A per-slug accumulator owns the finalize-when-settled logic: the crawler writes partial results, and the accumulator pushes the record once the product page and all child profile pages have a result.
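A finalize-when-settled accumulator of this kind might look like the following sketch. The class name, method names, and result shape are all hypothetical; the only behaviour taken from the text is that the record is emitted once the product result and every child profile result are present, where a failure also counts as a result.

```typescript
// A part either succeeded with a value or failed with an error code.
type PartResult<T> = { ok: true; value: T } | { ok: false; error: string };

type SettledCallback<P, M> = (
  product: PartResult<P>,
  profiles: Map<string, PartResult<M>>
) => void;

// Hypothetical per-slug accumulator: collects the product result and the
// per-maker profile results, then fires the callback exactly once.
class SlugAccumulator<P, M> {
  private product: PartResult<P> | undefined;
  private expected: string[] | undefined;
  private profiles = new Map<string, PartResult<M>>();
  private finalized = false;
  private onSettled: SettledCallback<P, M>;

  constructor(onSettled: SettledCallback<P, M>) {
    this.onSettled = onSettled;
  }

  // Called when the product page is done; it also declares which profile
  // pages must report before the record is complete.
  setProduct(result: PartResult<P>, profileUsernames: string[]): void {
    this.product = result;
    this.expected = profileUsernames;
    this.maybeFinalize();
  }

  // Called when one profile page is done, successfully or not.
  setProfile(username: string, result: PartResult<M>): void {
    this.profiles.set(username, result);
    this.maybeFinalize();
  }

  private maybeFinalize(): void {
    if (this.finalized || !this.product || !this.expected) return;
    const allIn = this.expected.every((u) => this.profiles.has(u));
    if (!allIn) return;
    this.finalized = true; // guards against double emission
    this.onSettled(this.product, this.profiles);
  }
}
```

Keeping the completeness check in one place means request handlers can finish in any order without coordinating with each other.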
GDPR / legal — caller responsibility
This actor outputs personal data. Responsibility for processing that data lawfully rests with the caller, not with the actor.
- LIA (legitimate interest assessment): the caller is responsible for documenting a lawful basis under GDPR Art. 6 for processing each maker's identity data.
- Transparency notices (Art. 14): the caller is responsible for delivering the indirect-collection notice to data subjects within the required timeframe.
- Erasure (Art. 17): the caller maintains the deletion path. This actor produces dataset output only — no persistence, no cache.
- PH ToS: scraping public PH pages exists in a grey area; the caller is responsible for monitoring ToS changes.
Local development
```shell
cd actors/producthunt-maker-extractor
npm install
npm test                               # vitest run
npx tsc --noEmit                       # type-check
apify run -i '{"slugs":["lovable"]}'   # local single-slug run
```
Deployment
```shell
apify push   # build + deploy to Apify
```
The Docker base is apify/actor-node:24 (Alpine, no browser). defaultRunOptions.memoryMbytes: 1024, timeoutSecs: 3600.
Versioning
See CHANGELOG.md.