# Product Hunt Scraper (`glassventures/product-hunt-scraper`) Actor

Scrape Product Hunt launches and products. Extract names, taglines, votes, comments, makers, topics, ratings. Export to JSON, CSV, Excel.

- **URL**: https://apify.com/glassventures/product-hunt-scraper.md
- **Developed by:** [Glass Ventures](https://apify.com/glassventures) (community)
- **Categories:** Lead generation, Developer tools, Automation
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor is free to use, and you only pay for the Apify platform usage, which gets cheaper the higher subscription plan you have.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Product Hunt Scraper

Scrape products and launches from Product Hunt. Extract names, taglines, votes, comments, makers, topics, ratings, and more.

### What does Product Hunt Scraper do?

Product Hunt Scraper is an Apify Actor that extracts product launch data from Product Hunt, the leading platform where startups and makers showcase their products. It collects names, taglines, descriptions, vote counts, comment counts, maker information, topics, and ratings.

The Actor supports scraping from the homepage (daily launches), search results, topic pages, and individual product pages. It uses CheerioCrawler for fast HTTP-based scraping, reading the embedded `__NEXT_DATA__` payload and Apollo cache from Product Hunt pages instead of rendering them in a browser.

Whether you need to track trending products, analyze startup launches, or monitor competitors, this Actor provides structured data ready for analysis.
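
To illustrate the embedded-data approach described above, here is a minimal Python sketch that pulls the `__NEXT_DATA__` JSON out of a page's HTML. It is not the Actor's own code (the Actor uses CheerioCrawler in JavaScript). The `extract_next_data` helper and the sample HTML fragment are hypothetical, and the internal structure of Product Hunt's payload is not documented here.

```python
import json
import re
from typing import Optional

# Next.js pages embed their initial state in a script tag with this id.
NEXT_DATA_RE = re.compile(
    r'<script[^>]*\bid="__NEXT_DATA__"[^>]*>(.*?)</script>',
    re.DOTALL,
)


def extract_next_data(html: str) -> Optional[dict]:
    """Return the parsed __NEXT_DATA__ payload, or None if absent or invalid."""
    match = NEXT_DATA_RE.search(html)
    if match is None:
        return None
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError:
        return None


# Hypothetical page fragment; real payloads are much larger.
sample_html = (
    '<html><body><script id="__NEXT_DATA__" type="application/json">'
    '{"props": {"pageProps": {"slug": "chatgpt"}}}'
    "</script></body></html>"
)
payload = extract_next_data(sample_html)
```

Returning `None` instead of raising lets a caller fall through to another strategy, such as the Apollo cache or plain HTML parsing, when the script tag is missing.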

### Use Cases

- **Market researchers** — Track trending products and startup launches across categories
- **Venture capitalists** — Monitor new product launches and identify promising startups early
- **Product managers** — Analyze competitor products, features, and community reception
- **Data analysts** — Build datasets of product launches for trend analysis and market insights
- **Developers** — Discover new developer tools and APIs as they launch

### Features

- Scrape Product Hunt homepage, search results, topic pages, and individual product pages
- Extract comprehensive product data: name, tagline, votes, comments, makers, topics, rating
- Automatic URL classification — detects page type from URL pattern
- Multiple data extraction strategies: `__NEXT_DATA__`, Apollo cache, HTML fallback
- Proxy support with automatic session rotation
- Deduplication to avoid duplicate products in output
- Handles pagination and large datasets automatically
- Exports to JSON, CSV, Excel, or connect via API
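
The automatic URL classification listed above could look roughly like the following Python sketch. The path prefixes (`/search`, `/topics/`, `/posts/`, `/products/`) and the returned labels are assumptions for illustration. Only the `/posts/` form appears in this README, and the Actor's real rules may differ.

```python
from urllib.parse import urlparse

# Hosts treated as Product Hunt; an assumption for this sketch.
PRODUCT_HUNT_HOSTS = ("producthunt.com", "www.producthunt.com")


def classify_url(url: str) -> str:
    """Guess the page type of a Product Hunt URL from its path."""
    parsed = urlparse(url)
    if parsed.netloc.lower() not in PRODUCT_HUNT_HOSTS:
        return "unsupported"
    segments = [part for part in parsed.path.split("/") if part]
    if not segments:
        return "homepage"
    if segments[0] == "search":
        return "search"
    if segments[0] == "topics" and len(segments) >= 2:
        return "topic"
    if segments[0] in ("posts", "products") and len(segments) >= 2:
        return "product"
    return "unknown"


page_type = classify_url("https://www.producthunt.com/posts/chatgpt")
```

Classifying before fetching lets a crawler route each request to the matching handler without inspecting the response first.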

### How much will it cost?

| Results | Estimated Cost |
|---------|---------------|
| 100     | ~$0.10        |
| 1,000   | ~$0.50        |
| 10,000  | ~$3.00        |

| Cost Component | Per 1,000 Results |
|----------------|-------------------|
| Platform compute | ~$0.25 |
| Proxy (datacenter) | ~$0.25 |
| **Total** | **~$0.50** |
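
For result counts between the listed rows, a rough figure can be interpolated from the estimates above. The Python sketch below does this with straight-line interpolation. That method is an assumption of this example, not a pricing rule published for the Actor, and actual costs depend on your plan, proxy type, and run behavior.

```python
# (results, estimated USD) pairs copied from the estimate table.
COST_POINTS = [(100, 0.10), (1_000, 0.50), (10_000, 3.00)]


def estimate_cost(results: int) -> float:
    """Piecewise-linear estimate of run cost in USD for a result count."""
    if results <= 0:
        return 0.0
    first_n, first_cost = COST_POINTS[0]
    if results <= first_n:
        # Below the first row, scale the first estimate proportionally.
        return first_cost * results / first_n
    for (n0, c0), (n1, c1) in zip(COST_POINTS, COST_POINTS[1:]):
        if results <= n1:
            return c0 + (c1 - c0) * (results - n0) / (n1 - n0)
    # Beyond the last row, extend the slope of the final segment.
    (n0, c0), (n1, c1) = COST_POINTS[-2:]
    return c1 + (c1 - c0) * (results - n1) / (n1 - n0)


midpoint_cost = estimate_cost(5_500)
```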

### How to use

1. Go to the Product Hunt Scraper page on Apify Store
2. Click "Start" or "Try for free"
3. Enter Product Hunt URLs, search terms, or topics
4. Set the maximum number of items
5. Click "Start" and wait for the results

### Input parameters

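| Parameter | Type | Description | Default |
|-----------|------|-------------|---------|
| startUrls | array | Product Hunt URLs to scrape (homepage, search, topic, or product pages) | - |
| searchTerms | array | Search queries to find products | - |
| topics | array | Topic slugs (e.g., "artificial-intelligence") | - |
| maxItems | integer | Maximum number of products to return; 0 or empty means unlimited | 100 |
| maxConcurrency | integer | Maximum number of pages processed in parallel (1-100) | 10 |
| debugMode | boolean | Enable verbose logging and save HTML snapshots on errors | false |
| extendOutputFunction | string | JavaScript function to customize each output item; receives `{ data, $, request }` | - |
| proxyConfig | object | Proxy settings | Apify Proxy |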
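
When building the input in code, it can help to validate it before starting a run. The Python sketch below assembles an input object from these parameters and checks the numeric bounds given in the Actor's input schema (`maxItems` at least 0, `maxConcurrency` between 1 and 100). The `build_input` helper is hypothetical. Its rule that at least one source must be given is an assumption of this example, not a documented constraint.

```python
from typing import Any, Dict, List, Optional


def build_input(
    start_urls: Optional[List[str]] = None,
    search_terms: Optional[List[str]] = None,
    topics: Optional[List[str]] = None,
    max_items: int = 100,
    max_concurrency: int = 10,
) -> Dict[str, Any]:
    """Assemble and sanity-check an input object for the Actor."""
    # Assumption: a run with no source at all is treated as a mistake.
    if not (start_urls or search_terms or topics):
        raise ValueError("Provide start_urls, search_terms, or topics")
    if max_items < 0:
        raise ValueError("max_items must be 0 (unlimited) or positive")
    if not 1 <= max_concurrency <= 100:
        raise ValueError("max_concurrency must be between 1 and 100")

    run_input: Dict[str, Any] = {
        "maxItems": max_items,
        "maxConcurrency": max_concurrency,
    }
    if start_urls:
        # startUrls is an array of objects with a required "url" key.
        run_input["startUrls"] = [{"url": url} for url in start_urls]
    if search_terms:
        run_input["searchTerms"] = list(search_terms)
    if topics:
        run_input["topics"] = list(topics)
    return run_input


run_input = build_input(topics=["developer-tools"], max_items=20)
```

The resulting dictionary can be passed as `run_input` to the Python client's `call` method.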

### Output

The Actor produces a dataset in which each item has the following shape:

```json
{
    "name": "ChatGPT",
    "tagline": "An AI-powered chatbot by OpenAI",
    "description": "ChatGPT is a conversational AI model...",
    "url": "https://www.producthunt.com/posts/chatgpt",
    "websiteUrl": "https://chat.openai.com",
    "votesCount": 3200,
    "commentsCount": 450,
    "reviewsCount": 120,
    "rating": 4.8,
    "topics": ["Artificial Intelligence", "Productivity"],
    "makers": [
        {
            "name": "Sam Altman",
            "username": "sama",
            "profileUrl": "https://www.producthunt.com/@sama"
        }
    ],
    "thumbnailUrl": "https://ph-files.imgix.net/...",
    "launchDate": "2022-12-01T00:00:00.000Z",
    "isFeatured": true,
    "scrapedAt": "2024-01-15T10:30:00.000Z"
}
````

| Field | Type | Description |
|-------|------|-------------|
| name | string | Product name |
| tagline | string | Product tagline |
| description | string | Product description |
| url | string | Product Hunt page URL |
| websiteUrl | string | Product website URL |
| votesCount | integer | Number of upvotes |
| commentsCount | integer | Number of comments |
| reviewsCount | integer | Number of reviews |
| rating | number | Average rating |
| topics | array | Product topics/categories |
| makers | array | Product makers with name, username, profile URL |
| thumbnailUrl | string | Product thumbnail image URL |
| launchDate | string | Date the product was launched |
| isFeatured | boolean | Whether the product was featured |
| scrapedAt | string | ISO 8601 scrape timestamp |
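
As an illustration of consuming these fields, the following Python sketch converts a raw dataset item into a small typed record and parses `launchDate` into a `datetime`. The `Product` class and `to_product` helper are hypothetical names for this example. The sketch also assumes that any field may be missing from an item and falls back to defaults.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Product:
    name: str
    url: str
    votes: int
    rating: Optional[float]
    launch_date: Optional[datetime]
    maker_usernames: List[str]


def parse_iso(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp such as 2022-12-01T00:00:00.000Z."""
    if not value:
        return None
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def to_product(item: dict) -> Product:
    """Convert one dataset item into a Product, tolerating missing fields."""
    return Product(
        name=item.get("name", ""),
        url=item.get("url", ""),
        votes=int(item.get("votesCount") or 0),
        rating=item.get("rating"),
        launch_date=parse_iso(item.get("launchDate")),
        maker_usernames=[m.get("username", "") for m in item.get("makers") or []],
    )


product = to_product({
    "name": "ChatGPT",
    "url": "https://www.producthunt.com/posts/chatgpt",
    "votesCount": 3200,
    "rating": 4.8,
    "makers": [{"name": "Sam Altman", "username": "sama"}],
    "launchDate": "2022-12-01T00:00:00.000Z",
})
```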

### Integrations

Connect Product Hunt Scraper with other tools:

- **Apify API** — REST API for programmatic access
- **Webhooks** — get notified when a run finishes
- **Zapier / Make** — connect to 5,000+ apps
- **Google Sheets** — export directly to spreadsheets

#### API Example (Node.js)

```javascript
import { ApifyClient } from 'apify-client';
const client = new ApifyClient({ token: 'YOUR_TOKEN' });
const run = await client.actor('glassventures/product-hunt-scraper').call({
    startUrls: [{ url: 'https://www.producthunt.com/' }],
    maxItems: 100,
});
const { items } = await client.dataset(run.defaultDatasetId).listItems();
```

#### API Example (Python)

```python
from apify_client import ApifyClient
client = ApifyClient('YOUR_TOKEN')
run = client.actor('glassventures/product-hunt-scraper').call(run_input={
    'startUrls': [{'url': 'https://www.producthunt.com/'}],
    'maxItems': 100,
})
items = client.dataset(run['defaultDatasetId']).list_items().items
```

#### API Example (cURL)

```bash
curl "https://api.apify.com/v2/acts/glassventures~product-hunt-scraper/runs" \
  -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{"startUrls": [{"url": "https://www.producthunt.com/"}], "maxItems": 100}'
```

### Tips and tricks

- Start with a small `maxItems` (10-20) to test before running large scrapes
- Use topics to scrape products in specific categories like "artificial-intelligence" or "developer-tools"
- Combine search terms and topics for comprehensive data collection
- The Actor automatically deduplicates products across different input sources
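
Deduplication within a run is handled by the Actor, but results merged from several separate runs may still overlap. The Python sketch below removes such duplicates on the client side. Using the `url` field as the identity key is an assumption of this example, because the key the Actor uses internally is not documented here.

```python
from typing import Iterable, List


def dedupe_by_url(items: Iterable[dict]) -> List[dict]:
    """Keep the first occurrence of each product, keyed by normalized URL."""
    seen = set()
    unique = []
    for item in items:
        key = (item.get("url") or "").rstrip("/").lower()
        if key and key in seen:
            continue
        if key:
            seen.add(key)
        # Items without a URL cannot be compared, so they are kept as-is.
        unique.append(item)
    return unique


merged = dedupe_by_url([
    {"name": "ChatGPT", "url": "https://www.producthunt.com/posts/chatgpt"},
    {"name": "ChatGPT", "url": "https://www.producthunt.com/posts/chatgpt/"},
    {"name": "Other", "url": "https://www.producthunt.com/posts/other"},
])
```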

### FAQ

**Q: Does this Actor require login credentials?**
A: No. Product Hunt product data is publicly accessible without authentication.

**Q: How fast is the scraping?**
A: Approximately 50-100 products per minute using CheerioCrawler with default concurrency.

**Q: What should I do if I get blocked?**
A: Switch to residential proxies in the Proxy Configuration settings and lower the concurrency.

**Q: Can I scrape specific topics?**
A: Yes. Use the Topics input with slugs like "artificial-intelligence", "saas", "developer-tools".
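
The advice on blocking can be applied to an existing input as in the Python sketch below. The `apifyProxyGroups` key with the `RESIDENTIAL` group follows the usual Apify proxy input convention, so confirm that it matches this Actor's Proxy Configuration field. Halving the concurrency is an arbitrary choice for this example, and `with_block_mitigation` is a hypothetical helper.

```python
def with_block_mitigation(run_input: dict) -> dict:
    """Return a copy of the input using residential proxies and lower concurrency."""
    adjusted = dict(run_input)  # shallow copy; the original stays unchanged
    adjusted["proxyConfig"] = {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"],
    }
    # 10 is the documented default for maxConcurrency.
    current = int(adjusted.get("maxConcurrency", 10))
    adjusted["maxConcurrency"] = max(1, current // 2)
    return adjusted


original = {"topics": ["saas"], "maxConcurrency": 10}
retry_input = with_block_mitigation(original)
```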

### Is it legal to scrape Product Hunt?

Web scraping of publicly available data is generally legal based on precedents like the hiQ Labs v. LinkedIn case. This Actor only accesses publicly available data. Always review and respect the target site's Terms of Service and robots.txt. For more information, see [Apify's blog on web scraping legality](https://blog.apify.com/is-web-scraping-legal/).

### Limitations

- Product Hunt may change its page structure, which could require Actor updates
- Rate limiting may apply — use proxies for large-scale scraping
- Some detailed product information may only be available on individual product pages
- Historical data beyond what's currently visible on the site is not accessible

### Changelog

- **v0.1** (2026-04-23) — Initial release

# Actor input Schema

## `startUrls` (type: `array`):

Product Hunt URLs to scrape. Can be homepage, search pages, topic pages, or individual product pages.

## `searchTerms` (type: `array`):

Search Product Hunt for products matching these terms.

## `topics` (type: `array`):

Product Hunt topic slugs to scrape (e.g., "artificial-intelligence", "developer-tools", "saas").

## `maxItems` (type: `integer`):

Maximum number of products to scrape. Use 0 or leave empty for unlimited.

## `maxConcurrency` (type: `integer`):

Maximum number of pages processed in parallel.

## `debugMode` (type: `boolean`):

Enables verbose logging and saves HTML snapshots on errors.

## `extendOutputFunction` (type: `string`):

A JavaScript function to customize each output item. Receives { data, $, request }.

## `proxyConfig` (type: `object`):

Select proxies to be used. Recommended for avoiding rate limits.

## Actor input object example

```json
{
  "startUrls": [
    {
      "url": "https://www.producthunt.com/"
    }
  ],
  "maxItems": 100,
  "maxConcurrency": 10,
  "debugMode": false,
  "extendOutputFunction": "async ({ data, $ }) => {\n    return data;\n}",
  "proxyConfig": {
    "useApifyProxy": true
  }
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
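const input = {
    "startUrls": [
        {
            "url": "https://www.producthunt.com/"
        }
    ],
    // The input schema defines extendOutputFunction as a string, so pass the
    // function source as text; a real function would be dropped during JSON serialization.
    "extendOutputFunction": "async ({ data, $ }) => {\n    return data;\n}",
    "proxyConfig": {
        "useApifyProxy": true
    }
};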

// Run the Actor and wait for it to finish
const run = await client.actor("glassventures/product-hunt-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": [{ "url": "https://www.producthunt.com/" }],
    "extendOutputFunction": """async ({ data, $ }) => {
    return data;
}""",
    "proxyConfig": { "useApifyProxy": True },
}

# Run the Actor and wait for it to finish
run = client.actor("glassventures/product-hunt-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
# A quoted heredoc passes the JSON through unchanged, so "\n" stays a JSON newline escape.
cat <<'EOF' | apify call glassventures/product-hunt-scraper --silent --output-dataset
{
  "startUrls": [
    {
      "url": "https://www.producthunt.com/"
    }
  ],
  "extendOutputFunction": "async ({ data, $ }) => {\n    return data;\n}",
  "proxyConfig": {
    "useApifyProxy": true
  }
}
EOF

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=glassventures/product-hunt-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Product Hunt Scraper",
        "description": "Scrape Product Hunt launches and products. Extract names, taglines, votes, comments, makers, topics, ratings. Export to JSON, CSV, Excel.",
        "version": "0.1",
        "x-build-id": "GzFV37f4QoG8eg5r2"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/glassventures~product-hunt-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-glassventures-product-hunt-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/glassventures~product-hunt-scraper/runs": {
            "post": {
                "operationId": "runs-sync-glassventures-product-hunt-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/glassventures~product-hunt-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-glassventures-product-hunt-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "startUrls": {
                        "title": "Start URLs",
                        "type": "array",
                        "description": "Product Hunt URLs to scrape. Can be homepage, search pages, topic pages, or individual product pages.",
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "searchTerms": {
                        "title": "Search Terms",
                        "type": "array",
                        "description": "Search Product Hunt for products matching these terms.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "topics": {
                        "title": "Topics",
                        "type": "array",
                        "description": "Product Hunt topic slugs to scrape (e.g., \"artificial-intelligence\", \"developer-tools\", \"saas\").",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxItems": {
                        "title": "Max Items",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Maximum number of products to scrape. Use 0 or leave empty for unlimited.",
                        "default": 100
                    },
                    "maxConcurrency": {
                        "title": "Max Concurrency",
                        "minimum": 1,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Maximum number of pages processed in parallel.",
                        "default": 10
                    },
                    "debugMode": {
                        "title": "Debug Mode",
                        "type": "boolean",
                        "description": "Enables verbose logging and saves HTML snapshots on errors.",
                        "default": false
                    },
                    "extendOutputFunction": {
                        "title": "Extend Output Function",
                        "type": "string",
                        "description": "A JavaScript function to customize each output item. Receives { data, $, request }."
                    },
                    "proxyConfig": {
                        "title": "Proxy Configuration",
                        "type": "object",
                        "description": "Select proxies to be used. Recommended for avoiding rate limits."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
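
For projects that do not use an Apify client library, the `run-sync-get-dataset-items` endpoint from the specification above can be called directly. The Python sketch below only builds the request with the standard library and leaves sending it to the caller. The `build_sync_request` helper is a hypothetical name, and the token value is a placeholder.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "glassventures~product-hunt-scraper"


def build_sync_request(token: str, run_input: dict) -> Request:
    """Build a POST request that runs the Actor and returns its dataset items."""
    # The specification declares the token as a required query parameter.
    query = urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    return Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_sync_request("abc", {"topics": ["saas"], "maxItems": 10})
# To send it: urllib.request.urlopen(request), then json.load() the response body.
```

Because this endpoint waits for the run to finish, it suits short runs with a small `maxItems`. For longer runs, the asynchronous `/runs` endpoint returns run information immediately.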
