Product Hunt Founders & Makers Scraper

Scrape Product Hunt hunter and maker contact data (name, X handle, headline, profile URL) by slug. HTTP-only: no browser and no dedicated Cloudflare-bypass service; Chrome TLS impersonation is enough to pass Cloudflare's checks cleanly. Useful for outreach prospecting after PH launches.

Pricing

from $3.00 / 1,000 products extracted

Developer: Arnas (Maintained by Community)

Product Hunt Maker Extractor

Scrape Product Hunt hunter and maker contact data — name, X (Twitter) handle, headline, profile URL — for a list of product slugs.

Returns one structured record per slug, with the hunter (the person who submitted the post) and the team (makers) extracted from PH's public Apollo SSR payload. Pure HTTP: no browser and no dedicated Cloudflare bypass, just Apify Proxy plus got-scraping's Chrome TLS impersonation.

Input

```json
{
  "slugs": ["lovable", "n8n-io"],
  "includeMakerProfiles": true,
  "includeHunter": true,
  "maxConcurrency": 5,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
```
| Field | Type | Default | Notes |
|---|---|---|---|
| `slugs` | `string[]` | required | Bare slug (`lovable`) or full URL (`https://www.producthunt.com/products/lovable`). Legacy `/posts/<slug>` URLs are also accepted. Validated against `/^[a-zA-Z0-9_-]{1,80}$/` before any HTTP fetch. Dedup is first-occurrence-wins. |
| `expectedMakerCounts` | `{ [slug]: integer }` | `{}` | Optional cross-check map. If the scraper returns fewer makers than expected for a slug, the record is marked `partial` and a `maker_count_mismatch` warning is logged so silent under-extraction surfaces to ops. |
| `includeMakerProfiles` | `boolean` | `true` | When `false`, only `name`, `phUsername`, `phProfileUrl`, and the product-page-derived `headline` are captured. Skips the per-maker profile-page fetch (saves request volume + cost). |
| `includeHunter` | `boolean` | `true` | When `false`, the `hunter` block is omitted (set to `null`). |
| `maxConcurrency` | `integer` | `5` | Concurrent HTTP requests. Capped at 30. PH is rate-limit-tolerant on residential proxies; the cap is a defensive ceiling. |
| `requestTimeoutSecs` | `integer` | `45` | Per-request wall-clock timeout. |
| `proxyConfiguration` | `object` | RESIDENTIAL | RESIDENTIAL is recommended; PH's WAF rate-limits aggressively on datacenter IPs. |
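The slug handling described above (bare slugs, both URL forms, regex validation before any fetch, first-occurrence-wins dedup) can be sketched roughly as follows. `normalizeSlug` and `dedupeSlugs` are hypothetical names for illustration, not the actor's actual internals:

```typescript
// Sketch of the documented slug normalization (function names are assumed).
const SLUG_RE = /^[a-zA-Z0-9_-]{1,80}$/;

function normalizeSlug(input: string): string | null {
  // Accept bare slugs, /products/<slug> URLs, and legacy /posts/<slug> URLs.
  let slug = input.trim();
  const m = slug.match(/producthunt\.com\/(?:products|posts)\/([^/?#]+)/);
  if (m) slug = m[1];
  // Validate before any HTTP fetch; reject anything outside the allowed charset.
  return SLUG_RE.test(slug) ? slug : null;
}

function dedupeSlugs(inputs: string[]): string[] {
  // First-occurrence-wins dedup after normalization.
  const seen = new Set<string>();
  const out: string[] = [];
  for (const raw of inputs) {
    const slug = normalizeSlug(raw);
    if (slug !== null && !seen.has(slug)) {
      seen.add(slug);
      out.push(slug);
    }
  }
  return out;
}
```

With this shape, a bare slug and its full-URL form collapse to one entry, and invalid inputs are dropped before they cost a request.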

Output

One dataset record per input slug:

```json
{
  "slug": "lovable",
  "scrapedAt": "2026-04-29T13:35:09.907Z",
  "productUrl": "https://www.producthunt.com/products/lovable",
  "status": "ok",
  "hunter": {
    "name": "Chris Messina",
    "phUsername": "chrismessina",
    "phProfileUrl": "https://www.producthunt.com/@chrismessina",
    "xHandle": "chrismessina",
    "websiteUrl": null,
    "linkedinUrl": null,
    "headline": "🏆 #1 Hunter!"
  },
  "makers": [
    {
      "name": "Anton Osika",
      "phUsername": "antonosika",
      "phProfileUrl": "https://www.producthunt.com/@antonosika",
      "xHandle": null,
      "websiteUrl": null,
      "linkedinUrl": null,
      "headline": "Physicist, hacker, Founder, CTO"
    }
  ],
  "errors": []
}
```

`status` is one of `'ok' | 'partial' | 'product_not_found' | 'cloudflare_block' | 'rate_limited'`. The per-record `errors[]` array enumerates per-maker profile fetch failures (`profile_404`, `cloudflare_block`, `navigation_timeout`, `parse_error`).
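A typical downstream consumer filters on `status` and the field-availability caveats before building an outreach list. A minimal sketch, assuming the record shape from the Output section (the `outreachTargets` helper is hypothetical):

```typescript
// Keep fully or partially extracted records and collect makers who have
// linked an X account (xHandle is null otherwise).
interface Maker {
  name: string;
  xHandle: string | null;
}

interface MakerRecord {
  slug: string;
  status: string; // 'ok' | 'partial' | 'product_not_found' | ...
  makers: Maker[];
}

function outreachTargets(records: MakerRecord[]): Maker[] {
  return records
    .filter(r => r.status === "ok" || r.status === "partial")
    .flatMap(r => r.makers)
    .filter(m => m.xHandle !== null);
}
```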

A run-level summary is logged at end-of-run:

```
[run-summary-json] {"total":10,"succeeded":9,"productNotFound":0,"cloudflareBlocked":0,"rateLimited":0,"partial":1,"consecutiveBlocksAtEnd":0,"runOutcome":"normal"}
```

consecutiveBlocksAtEnd >= 3 is the operational signal for "session-wide regression" (PH banned the residential pool, or rolled out a new WAF rule).
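On the ops side, the `[run-summary-json]` line can be pulled out of run logs and checked against that threshold. A sketch (parser and helper names are assumed, not part of the actor):

```typescript
// Parse the run-summary log line and flag session-wide regressions.
interface RunSummary {
  total: number;
  succeeded: number;
  partial: number;
  productNotFound: number;
  cloudflareBlocked: number;
  rateLimited: number;
  consecutiveBlocksAtEnd: number;
  runOutcome: string;
}

function parseRunSummary(logText: string): RunSummary | null {
  const m = logText.match(/\[run-summary-json\] (\{.*\})/);
  return m ? (JSON.parse(m[1]) as RunSummary) : null;
}

function isSessionRegression(s: RunSummary): boolean {
  // 3+ consecutive blocks at end of run: treat as pool-wide WAF trouble,
  // not a per-slug fluke.
  return s.consecutiveBlocksAtEnd >= 3;
}
```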

Field availability notes

  • xHandle is populated when the maker has linked their X account on Product Hunt (~25–60% coverage). null when the user hasn't.
  • websiteUrl, linkedinUrl are always null in v0.2 — Product Hunt removed those fields from public profiles in 2025. They remain in the schema for backward compatibility. Caller-side enrichment (Clearbit, Hunter.io, Apollo) is required if needed.
  • headline comes from the product page's embedded payload; reused as the profile fallback when includeMakerProfiles=false.

Architecture

CheerioCrawler (Crawlee) → got-scraping for HTTP fetching with Chrome TLS impersonation. The product page (/products/<slug>) and profile page (/@<username>) both ship their data inline via Apollo's SSR data transport (window[Symbol.for("ApolloSSRDataTransport")].push({...})). The parser walks the embedded JSON with a brace-counting helper — no DOM, no headless browser, no JS execution.
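The brace-counting idea is simple: starting at the opening `{` of the embedded payload, walk characters tracking brace depth while respecting string literals and escapes. An illustrative version (a sketch; the actor's actual helper may differ):

```typescript
// Extract a balanced JSON object from raw HTML without a DOM or JS execution.
// `start` must point at the object's opening brace.
function extractJsonObject(html: string, start: number): string | null {
  let depth = 0;
  let inString = false;
  let escaped = false;
  for (let i = start; i < html.length; i++) {
    const ch = html[i];
    if (inString) {
      // Braces inside string literals must not affect the depth count.
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{") depth++;
    else if (ch === "}") {
      depth--;
      if (depth === 0) return html.slice(start, i + 1);
    }
  }
  return null; // unbalanced: payload truncated or start was wrong
}
```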

Session pool with rotation on 401/403/429 (Crawlee handles automatically). Per-slug accumulator owns finalize-when-settled — the crawler writes parts; the accumulator pushes when product + all child profiles have a result.
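The finalize-when-settled pattern can be sketched as below. All names here (`SlugAccumulator`, `expect`, `settle`) are assumed for illustration; the actor's real accumulator may be structured differently:

```typescript
// Each part (product page, maker profile) reports in independently; the
// record for a slug is emitted exactly once, when the expected number of
// parts (product + all child profiles) have settled.
type Part = { kind: "product" | "profile"; data: unknown };

class SlugAccumulator {
  private parts = new Map<string, Part[]>();
  private expected = new Map<string, number>();

  constructor(private emit: (slug: string, parts: Part[]) => void) {}

  // Called once the product page reveals how many parts to wait for.
  expect(slug: string, total: number): void {
    this.expected.set(slug, total);
    this.maybeFinalize(slug);
  }

  // Called by the crawler whenever any part (success or failure) resolves.
  settle(slug: string, part: Part): void {
    const list = this.parts.get(slug) ?? [];
    list.push(part);
    this.parts.set(slug, list);
    this.maybeFinalize(slug);
  }

  private maybeFinalize(slug: string): void {
    const total = this.expected.get(slug);
    const list = this.parts.get(slug) ?? [];
    if (total !== undefined && list.length >= total) {
      this.emit(slug, list);
      this.parts.delete(slug); // guard against double emission
      this.expected.delete(slug);
    }
  }
}
```

The design choice is that the crawler never pushes to the dataset directly; it only settles parts, so a slow maker-profile fetch cannot produce a half-written record.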

This actor processes personal data; compliance responsibility sits with the caller, not the actor:

  • LIA (legitimate interest assessment): the caller is responsible for documenting a lawful basis under GDPR Art. 6 for processing each maker's identity data.
  • Transparency notices (Art. 14): the caller is responsible for delivering the indirect-collection notice to data subjects within the required timeframe.
  • Erasure (Art. 17): the caller maintains the deletion path. This actor produces dataset output only — no persistence, no cache.
  • PH ToS: scraping public PH pages exists in a grey area; the caller is responsible for monitoring ToS changes.

Local development

```bash
cd actors/producthunt-maker-extractor
npm install
npm test                               # vitest run
npx tsc --noEmit                       # type-check
apify run -i '{"slugs":["lovable"]}'   # local single-slug run
```

Deployment

```bash
apify push   # build + deploy to Apify
```

The Docker base is apify/actor-node:24 (Alpine, no browser). defaultRunOptions.memoryMbytes: 1024, timeoutSecs: 3600.
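For reference, a minimal `.actor/actor.json` consistent with those run options might look like the following sketch (the `name` and `version` values are taken from this README's directory name and v0.2 references; other fields are illustrative, not the project's actual file):

```json
{
  "actorSpecification": 1,
  "name": "producthunt-maker-extractor",
  "version": "0.2",
  "defaultRunOptions": {
    "memoryMbytes": 1024,
    "timeoutSecs": 3600
  }
}
```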

Versioning

See CHANGELOG.md.