# Product Hunt Founders & Makers Scraper (`prodiger/producthunt-maker-extractor`) Actor

Scrape Product Hunt hunter + maker contact data (name, X handle, headline, profile URL) by slug. HTTP-only, with no browser and no separate Cloudflare-bypass service: requests get through Cloudflare via Chrome TLS impersonation. Useful for outreach prospecting after PH launches.

- **URL**: https://apify.com/prodiger/producthunt-maker-extractor.md
- **Developed by:** [Arnas](https://apify.com/prodiger) (community)
- **Categories:** Lead generation, Social media, AI
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $3.00 / 1,000 products extracted

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
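
As a rough budgeting aid, the listed rate can be turned into a per-run estimate. The sketch below assumes one billable event per extracted product at the listed $3.00 per 1,000 rate; the function name `estimate_cost_usd` is hypothetical, and plan discounts are not modeled.

```python
from decimal import Decimal


def estimate_cost_usd(product_count: int, price_per_1000: str = "3.00") -> Decimal:
    """Estimate the event cost of a run, assuming one event per extracted product."""
    if product_count < 0:
        raise ValueError("product_count must be non-negative")
    cost = Decimal(price_per_1000) * product_count / 1000
    # Round to four decimal places for display purposes.
    return cost.quantize(Decimal("0.0001"))
```

Under those assumptions, `estimate_cost_usd(250)` comes to 0.75 USD.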

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Product Hunt Maker Extractor

Scrape Product Hunt **hunter and maker contact data** — name, X (Twitter) handle, headline, profile URL — for a list of product slugs.

Returns one structured record per slug, with the hunter (the person who submitted the post) and the team (makers) extracted from PH's public Apollo SSR payload. Pure HTTP: no browser and no separate Cloudflare-bypass service, only Apify Proxy plus got-scraping's Chrome TLS impersonation.

### Input

```json
{
    "slugs": ["lovable", "n8n-io"],
    "includeMakerProfiles": true,
    "includeHunter": true,
    "maxConcurrency": 5,
    "proxyConfiguration": {
        "useApifyProxy": true,
        "apifyProxyGroups": ["RESIDENTIAL"]
    }
}
````

| Field | Type | Default | Notes |
|---|---|---|---|
| `slugs` | `string[]` | required | Bare slug (`lovable`) or full URL (`https://www.producthunt.com/products/lovable`). Legacy `/posts/<slug>` URLs are also accepted. Validated against `/^[a-zA-Z0-9_-]{1,80}$/` before any HTTP fetch. Dedup is first-occurrence-wins. |
| `expectedMakerCounts` | `{ [slug]: integer }` | `{}` | Optional cross-check map. If the scraper returns fewer makers than expected for a slug, the record is marked `partial` and a `maker_count_mismatch` warning is logged so silent under-extraction surfaces to ops. |
| `includeMakerProfiles` | boolean | `true` | When `false`, only `name`, `phUsername`, `phProfileUrl`, and the product-page-derived `headline` are captured. Skips the per-maker profile-page fetch (saves request volume + cost). |
| `includeHunter` | boolean | `true` | When `false`, the `hunter` block is omitted (set to `null`). |
| `maxConcurrency` | integer | `5` | Concurrent HTTP requests. Capped at 30. PH is rate-limit-tolerant on residential proxies; the cap is a defensive ceiling. |
| `requestTimeoutSecs` | integer | `45` | Per-request wall-clock timeout. |
| `proxyConfiguration` | object | RESIDENTIAL | RESIDENTIAL is recommended; PH's WAF rate-limits aggressively on datacenter IPs. |
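
The `slugs` rules in the table (bare slug or URL, regex validation, first-occurrence-wins dedup) can be mirrored on the caller side before a run is started. The following is a minimal sketch, not the Actor's own code; `normalize_slug` and `normalize_slugs` are hypothetical helpers, and treating a URL and its bare slug as duplicates of each other is an assumption.

```python
import re
from typing import Iterable, List, Optional
from urllib.parse import urlparse

# Same pattern as documented for the `slugs` field.
SLUG_RE = re.compile(r"^[a-zA-Z0-9_-]{1,80}$")


def normalize_slug(value: str) -> Optional[str]:
    """Return the bare slug for a slug or PH URL, or None if it is invalid."""
    value = value.strip()
    if value.startswith(("http://", "https://")):
        parts = [p for p in urlparse(value).path.split("/") if p]
        # Accept /products/<slug> and legacy /posts/<slug> paths only.
        if len(parts) < 2 or parts[0] not in ("products", "posts"):
            return None
        value = parts[1]
    return value if SLUG_RE.fullmatch(value) else None


def normalize_slugs(values: Iterable[str]) -> List[str]:
    """Validate and deduplicate slugs, keeping the first occurrence of each."""
    seen = {}
    for raw in values:
        slug = normalize_slug(raw)
        if slug is not None and slug not in seen:
            seen[slug] = None  # dicts preserve insertion order
    return list(seen)
```

Filtering invalid entries locally keeps obviously bad input out of a paid run.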

### Output

One dataset record per input slug:

```json
{
    "slug": "lovable",
    "scrapedAt": "2026-04-29T13:35:09.907Z",
    "productUrl": "https://www.producthunt.com/products/lovable",
    "status": "ok",
    "hunter": {
        "name": "Chris Messina",
        "phUsername": "chrismessina",
        "phProfileUrl": "https://www.producthunt.com/@chrismessina",
        "xHandle": "chrismessina",
        "websiteUrl": null,
        "linkedinUrl": null,
        "headline": "🏆 #1 Hunter!"
    },
    "makers": [
        {
            "name": "Anton Osika",
            "phUsername": "antonosika",
            "phProfileUrl": "https://www.producthunt.com/@antonosika",
            "xHandle": null,
            "websiteUrl": null,
            "linkedinUrl": null,
            "headline": "Physicist, hacker, Founder, CTO"
        }
    ],
    "errors": []
}
```

`status` ∈ `'ok' | 'partial' | 'product_not_found' | 'cloudflare_block' | 'rate_limited'`. Per-record `errors[]` enumerates per-maker profile fetch failures (`profile_404`, `cloudflare_block`, `navigation_timeout`, `parse_error`).
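
A consumer of the dataset will usually branch on `status`. The sketch below groups records into usable, retryable, and dead buckets; the bucket names and the choice to treat `cloudflare_block` and `rate_limited` as retryable are assumptions for illustration, not behavior documented by the Actor.

```python
from typing import Dict, Iterable, List

# Assumption: block and rate-limit outcomes are worth retrying in a later run.
RETRYABLE = {"cloudflare_block", "rate_limited"}


def triage(records: Iterable[dict]) -> Dict[str, List[str]]:
    """Group slugs by what the caller should do next with them."""
    buckets: Dict[str, List[str]] = {"usable": [], "retry": [], "dead": []}
    for record in records:
        status = record.get("status")
        if status in ("ok", "partial"):
            buckets["usable"].append(record["slug"])
        elif status in RETRYABLE:
            buckets["retry"].append(record["slug"])
        else:
            # product_not_found or any status this sketch does not know about.
            buckets["dead"].append(record["slug"])
    return buckets
```

Slugs in the `retry` bucket can be passed back as the `slugs` input of a follow-up run.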

A run-level summary is logged at end-of-run:

```
[run-summary-json] {"total":10,"succeeded":9,"productNotFound":0,"cloudflareBlocked":0,"rateLimited":0,"partial":1,"consecutiveBlocksAtEnd":0,"runOutcome":"normal"}
```

`consecutiveBlocksAtEnd >= 3` is the operational signal for "session-wide regression" (PH banned the residential pool, or rolled out a new WAF rule).
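
For monitoring, the summary line can be pulled out of the run log and checked against that threshold. This is a sketch under the assumption that the log is available as plain text and that each line may carry a timestamp or level prefix before `[run-summary-json]`; the helper names are hypothetical.

```python
import json
from typing import Optional

SUMMARY_TAG = "[run-summary-json] "


def parse_run_summary(log_text: str) -> Optional[dict]:
    """Return the last run summary found in the log, or None if absent."""
    for line in reversed(log_text.splitlines()):
        idx = line.find(SUMMARY_TAG)
        if idx != -1:
            return json.loads(line[idx + len(SUMMARY_TAG):])
    return None


def is_session_regression(summary: dict, threshold: int = 3) -> bool:
    """Apply the documented signal: consecutiveBlocksAtEnd >= 3."""
    return summary.get("consecutiveBlocksAtEnd", 0) >= threshold
```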

#### Field availability notes

- **`xHandle`** is populated when the maker has linked their X account on Product Hunt (~25–60% coverage). `null` when the user hasn't.
- **`websiteUrl`, `linkedinUrl`** are always `null` in v0.2 — Product Hunt removed those fields from public profiles in 2025. They remain in the schema for backward compatibility. Caller-side enrichment (Clearbit, Hunter.io, Apollo) is required if needed.
- **`headline`** comes from the product page's embedded payload; reused as the profile fallback when `includeMakerProfiles=false`.
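
Because `xHandle` is often `null`, outreach code should filter on it rather than assume it is present. The sketch below flattens one output record into a list of people with an X profile URL; the `x_contacts` helper and its output shape are hypothetical.

```python
from typing import List


def x_contacts(record: dict) -> List[dict]:
    """Collect hunter and makers that have an X handle, without duplicates."""
    people = []
    hunter = record.get("hunter")  # null when includeHunter is false
    if hunter:
        people.append(("hunter", hunter))
    people.extend(("maker", maker) for maker in record.get("makers", []))

    seen = set()
    contacts = []
    for role, person in people:
        handle = person.get("xHandle")
        if not handle or handle.lower() in seen:
            continue
        seen.add(handle.lower())
        contacts.append({
            "role": role,
            "name": person.get("name"),
            "xUrl": f"https://x.com/{handle}",
        })
    return contacts
```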

### Architecture

`CheerioCrawler` (Crawlee) → `got-scraping` for HTTP fetching with Chrome TLS impersonation. The product page (`/products/<slug>`) and profile page (`/@<username>`) both ship their data inline via Apollo's SSR data transport (`window[Symbol.for("ApolloSSRDataTransport")].push({...})`). The parser walks the embedded JSON with a brace-counting helper — no DOM, no headless browser, no JS execution.
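
To illustrate the brace-counting idea, here is a minimal Python sketch (the Actor itself runs on Node.js, so this is not its code). It assumes each pushed payload is valid JSON that starts immediately after `.push(`; the function names are hypothetical. String contents and escapes are skipped so that braces inside string values do not affect the depth count.

```python
import json
from typing import List


def extract_balanced_object(text: str, start: int) -> str:
    """Return the balanced {...} substring that begins at text[start]."""
    if text[start] != "{":
        raise ValueError("expected '{' at start position")
    depth = 0
    in_string = False
    escaped = False
    for i in range(start, len(text)):
        ch = text[i]
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return text[start:i + 1]
    raise ValueError("unbalanced braces")


def extract_pushed_payloads(html: str, marker: str = ".push(") -> List[dict]:
    """Parse every JSON object passed to a .push(...) call in the page source."""
    payloads = []
    pos = 0
    while True:
        idx = html.find(marker, pos)
        if idx == -1:
            return payloads
        brace = idx + len(marker)
        if brace < len(html) and html[brace] == "{":
            raw = extract_balanced_object(html, brace)
            payloads.append(json.loads(raw))
            pos = brace + len(raw)
        else:
            pos = brace
```

If the real payload contains non-JSON tokens, a preprocessing step would be needed before `json.loads`.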

Session pool with rotation on 401/403/429 (Crawlee handles automatically). Per-slug accumulator owns finalize-when-settled — the crawler writes parts; the accumulator pushes when product + all child profiles have a result.
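
The finalize-when-settled pattern can be sketched as a small class: results arrive in any order, and the record is pushed exactly once when the product and every expected profile have reported. This is an illustrative Python sketch with hypothetical names and record shape, not the Actor's implementation.

```python
from typing import Callable, Dict, List, Optional, Set


class SlugAccumulator:
    """Collects parts for one slug and pushes a record once all have settled."""

    def __init__(self, slug: str, push: Callable[[dict], None]) -> None:
        self.slug = slug
        self._push = push
        self._product: Optional[dict] = None
        self._pending: Set[str] = set()
        self._profiles: Dict[str, dict] = {}
        self._finalized = False

    def set_product(self, product: dict, maker_usernames: List[str]) -> None:
        self._product = product
        # Profiles that already reported (success or failure) are not pending.
        self._pending = set(maker_usernames) - set(self._profiles)
        self._maybe_finalize()

    def set_profile(self, username: str, result: dict) -> None:
        self._profiles[username] = result
        self._pending.discard(username)
        self._maybe_finalize()

    def _maybe_finalize(self) -> None:
        if self._finalized or self._product is None or self._pending:
            return
        self._finalized = True
        self._push({"slug": self.slug, **self._product, "profiles": dict(self._profiles)})
```

Recording failures through `set_profile` as well keeps a single failed profile fetch from blocking the record forever.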

### GDPR / legal — caller responsibility

This Actor surfaces a privacy dimension; it does **not** assume responsibility for it.

- **LIA (legitimate interest assessment):** the caller is responsible for documenting a lawful basis under GDPR Art. 6 for processing each maker's identity data.
- **Transparency notices (Art. 14):** the caller is responsible for delivering the indirect-collection notice to data subjects within the required timeframe.
- **Erasure (Art. 17):** the caller maintains the deletion path. This Actor produces dataset output only: no persistence, no cache.
- **PH ToS:** scraping public PH pages exists in a grey area; the caller is responsible for monitoring ToS changes.

### Local development

```bash
cd actors/producthunt-maker-extractor
npm install
npm test                                    # vitest run
npx tsc --noEmit                            # type-check
apify run -i '{"slugs":["lovable"]}'        # local single-slug run
```

### Deployment

```bash
apify push                                  # build + deploy to Apify
```

The Docker base is `apify/actor-node:24` (Alpine, no browser). `defaultRunOptions.memoryMbytes: 1024`, `timeoutSecs: 3600`.

### Versioning

See `CHANGELOG.md`.

# Actor input Schema

## `slugs` (type: `array`):

Product Hunt product slugs (e.g. `lovable`, `n8n-io`). Both a bare slug and a full URL `https://www.producthunt.com/products/<slug>` (or legacy `/posts/<slug>`) are accepted. Each is validated against `/^[a-zA-Z0-9_-]{1,80}$/` before any HTTP fetch.

## `expectedMakerCounts` (type: `object`):

Optional cross-check map: `{ slug: integer }`. If the scraper extracts fewer makers than expected, the per-slug record is marked `status='partial'` with a `maker_count_mismatch` note. Defends against silent under-extraction when PH changes its embedded JSON shape.
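
The cross-check amounts to a simple comparison per slug. The sketch below shows the idea on the caller side; the `cross_check` helper is hypothetical, and returning the warning string alongside the record is an assumption, since the exact place where the Actor stores the note is not specified here.

```python
from typing import Dict, Optional, Tuple


def cross_check(record: dict, expected_counts: Dict[str, int]) -> Tuple[dict, Optional[str]]:
    """Mark an 'ok' record as 'partial' when fewer makers than expected were found."""
    expected = expected_counts.get(record["slug"])
    found = len(record.get("makers", []))
    if expected is None or found >= expected or record.get("status") != "ok":
        return record, None
    updated = dict(record)  # leave the input record untouched
    updated["status"] = "partial"
    warning = f"maker_count_mismatch: slug={record['slug']} expected={expected} found={found}"
    return updated, warning
```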

## `includeMakerProfiles` (type: `boolean`):

When true (default), follow each maker's `/@<username>` profile page to extract the X handle. When false, only name + username + profile URL + product-page-derived headline are captured (`xHandle = null`). Saves ~1 request per maker.

## `includeHunter` (type: `boolean`):

When true (default), extract the hunter (the person who submitted the post) in addition to the makers list.

## `maxConcurrency` (type: `integer`):

5 is the recommended default; 30 is the cap. PH is rate-limit-tolerant on residential proxies but the cap is a defensive ceiling.

## `requestTimeoutSecs` (type: `integer`):

Wall-clock timeout for one HTTP fetch.

## `proxyConfiguration` (type: `object`):

RESIDENTIAL is recommended; PH's WAF rate-limits aggressively on datacenter IPs.

## Actor input object example

```json
{
  "slugs": [
    "lovable"
  ],
  "includeMakerProfiles": true,
  "includeHunter": true,
  "maxConcurrency": 5,
  "requestTimeoutSecs": 45,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  }
}
```
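
When the input is assembled in code, the bounds from the input schema (`maxConcurrency` 1 to 30, `requestTimeoutSecs` 10 to 120) can be checked before the run is started. A minimal sketch, with `build_input` as a hypothetical helper name:

```python
from typing import List


def build_input(
    slugs: List[str],
    max_concurrency: int = 5,
    request_timeout_secs: int = 45,
    include_maker_profiles: bool = True,
    include_hunter: bool = True,
) -> dict:
    """Build an Actor input object, rejecting values outside the schema bounds."""
    if not slugs:
        raise ValueError("slugs is required and must not be empty")
    if not 1 <= max_concurrency <= 30:
        raise ValueError("maxConcurrency must be between 1 and 30")
    if not 10 <= request_timeout_secs <= 120:
        raise ValueError("requestTimeoutSecs must be between 10 and 120")
    return {
        "slugs": list(slugs),
        "includeMakerProfiles": include_maker_profiles,
        "includeHunter": include_hunter,
        "maxConcurrency": max_concurrency,
        "requestTimeoutSecs": request_timeout_secs,
        "proxyConfiguration": {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
        },
    }
```

The returned dictionary can be passed as `run_input` to the Python client.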

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "slugs": [
        "lovable"
    ],
    "proxyConfiguration": {
        "useApifyProxy": true,
        "apifyProxyGroups": [
            "RESIDENTIAL"
        ]
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("prodiger/producthunt-maker-extractor").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "slugs": ["lovable"],
    "proxyConfiguration": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"],
    },
}

# Run the Actor and wait for it to finish
run = client.actor("prodiger/producthunt-maker-extractor").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "slugs": [
    "lovable"
  ],
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  }
}' |
apify call prodiger/producthunt-maker-extractor --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=prodiger/producthunt-maker-extractor",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Product Hunt Founders & Makers Scraper",
        "description": "Scrape Product Hunt hunter + maker contact data (name, X handle, headline, profile URL) by slug. HTTP-only, with no browser and no separate Cloudflare-bypass service: requests get through Cloudflare via Chrome TLS impersonation. Useful for outreach prospecting after PH launches.",
        "version": "0.2",
        "x-build-id": "xyIXckLMWg4fYedMt"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/prodiger~producthunt-maker-extractor/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-prodiger-producthunt-maker-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/prodiger~producthunt-maker-extractor/runs": {
            "post": {
                "operationId": "runs-sync-prodiger-producthunt-maker-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/prodiger~producthunt-maker-extractor/run-sync": {
            "post": {
                "operationId": "run-sync-prodiger-producthunt-maker-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "slugs"
                ],
                "properties": {
                    "slugs": {
                        "title": "Product Hunt slugs",
                        "type": "array",
                        "description": "Product Hunt product slugs (e.g. 'lovable', 'n8n-io'). Bare slug or full URL `https://www.producthunt.com/products/<slug>` (or legacy `/posts/<slug>`) both accepted. Each is validated against /^[a-zA-Z0-9_-]{1,80}$/ before any HTTP fetch.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "expectedMakerCounts": {
                        "title": "Expected maker counts (optional)",
                        "type": "object",
                        "description": "Optional cross-check map: { slug: integer }. If the scraper extracts fewer makers than expected, the per-slug record is marked status='partial' with a 'maker_count_mismatch' note. Defends against silent under-extraction when PH changes its embedded JSON shape."
                    },
                    "includeMakerProfiles": {
                        "title": "Include maker profile data",
                        "type": "boolean",
                        "description": "When true (default), follow each maker's /@<username> profile page to extract X handle. When false, only name + username + profile URL + product-page-derived headline are captured (xHandle = null). Saves ~1 request per maker.",
                        "default": true
                    },
                    "includeHunter": {
                        "title": "Include hunter block",
                        "type": "boolean",
                        "description": "When true (default), extract the hunter (the person who submitted the post) in addition to the makers list.",
                        "default": true
                    },
                    "maxConcurrency": {
                        "title": "Max concurrent HTTP requests",
                        "minimum": 1,
                        "maximum": 30,
                        "type": "integer",
                        "description": "5 is the recommended default; 30 is the cap. PH is rate-limit-tolerant on residential proxies but the cap is a defensive ceiling.",
                        "default": 5
                    },
                    "requestTimeoutSecs": {
                        "title": "Per-request timeout (seconds)",
                        "minimum": 10,
                        "maximum": 120,
                        "type": "integer",
                        "description": "Wall-clock timeout for one HTTP fetch.",
                        "default": 45
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "RESIDENTIAL is recommended; PH's WAF rate-limits aggressively on datacenter IPs.",
                        "default": {
                            "useApifyProxy": true,
                            "apifyProxyGroups": [
                                "RESIDENTIAL"
                            ]
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
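
For projects without the Apify client, the `run-sync-get-dataset-items` path from the specification above can be called with the Python standard library. This is a sketch: `build_request` and `run_sync` are hypothetical helper names, the token is passed as the `token` query parameter as the spec describes, and the 300-second timeout is an arbitrary choice.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
RUN_SYNC_PATH = "/acts/prodiger~producthunt-maker-extractor/run-sync-get-dataset-items"


def build_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Prepare the POST request without sending it."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        f"{API_BASE}{RUN_SYNC_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token: str, actor_input: dict, timeout: float = 300.0):
    """Send the request and return the parsed dataset items."""
    request = build_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call such as `run_sync("<YOUR_API_TOKEN>", {"slugs": ["lovable"]})` blocks until the run finishes, so long slug lists may be better served by the asynchronous `/runs` path.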
