# Show HN Lead Scraper (B2B Startup Discovery) (`fetchcraft/show-hn-lead-scraper`) Actor

Pulls Show HN, Launch HN, and Ask HN posts from Hacker News. Extracts company URLs, titles, upvotes, dates. Built to feed into the AI Sales Personalizer for personalized cold outreach. Free preview (10 results). $0.01 per lead after that.

- **URL**: https://apify.com/fetchcraft/show-hn-lead-scraper.md
- **Developed by:** [Emily Ward](https://apify.com/fetchcraft) (community)
- **Categories:** Lead generation, Business, Developer tools
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

$50.00 / 1,000 Show HN leads returned

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Show HN Lead Scraper

**Pulls Show HN, Launch HN, and Ask HN posts from Hacker News. Returns a clean CSV of company URLs ready to feed into the AI Sales Personalizer for personalized outreach.**

Built as the companion to the AI Sales Personalizer Actor. Together they form a complete B2B lead pipeline: discover startups on HN, then personalize cold openers per company.

### Why this is useful

Show HN and Launch HN posts on Hacker News are among the highest-density B2B startup discovery feeds available. Every weekday, founders post fresh products with descriptions, links, and an upvote signal that reflects real audience interest.

The output is CSV-ready for:

- **B2B SDR teams** targeting tech startups (the people on HN actively want to be reached)
- **VC scouts** tracking new launches in their thesis areas
- **Recruiters** finding companies that are hiring
- **Partnership teams** at SaaS companies looking for integration targets
- **Content marketers** sourcing customer interviews and case studies

### What you get back per result

| Field | Description |
| --- | --- |
| `title` | Full HN post title (e.g. "Show HN: AcmeAI, real-time agent orchestration") |
| `url` | The product / company URL the poster linked to |
| `domain` | Hostname extracted from the URL |
| `points` | Upvote count at time of scrape |
| `num_comments` | Comment count (engagement signal) |
| `created_at` | When the post went live |
| `type` | Show HN, Launch HN, or Ask HN |
| `hn_url` | Link to the HN thread (read the comments for free intel) |
| `author` | The HN username of the poster |

### Pricing

- **Preview mode**: $0 (returns at most 10 results at no charge; use it to verify the Actor)
- **Standard**: $0.01 per lead returned (charged via the `lead_returned` event)

Typical campaigns:
- 100 fresh launches: $1
- 500 launches across 30 days: $5
- 2,000 launches across 90 days: $20
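The campaign figures above are a straight multiplication of leads by the per-lead rate. A small sketch of that arithmetic follows; the function name `scrape_cost` is hypothetical, and the rate is left as a parameter so it can be set to whatever the Store listing currently charges per event.

```python
from decimal import Decimal

# Per-lead rate quoted in this README; treat it as an assumption for billing purposes.
PRICE_PER_LEAD = Decimal("0.01")


def scrape_cost(
    leads_returned: int,
    preview_mode: bool = False,
    price_per_lead: Decimal = PRICE_PER_LEAD,
) -> Decimal:
    """Estimate the charge in USD for a run that returns `leads_returned` leads."""
    if leads_returned < 0:
        raise ValueError("leads_returned must not be negative")
    if preview_mode:
        # Preview runs are described as free (and capped at 10 results).
        return Decimal("0")
    return price_per_lead * leads_returned


for leads in (100, 500, 2000):
    print(leads, "leads:", scrape_cost(leads), "USD")
```

`Decimal` is used instead of `float` so that sums of cent amounts stay exact.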

Compare that to manual scraping (3-4 hours of work) or a Crunchbase Pro subscription ($199/month).

### How to use it

1. Click **Try for free** with `preview_mode` ticked.
2. Set `post_types` to `["Show HN", "Launch HN"]` for B2B product launches. Add `"Ask HN"` if you want help-seeking posts (often founders surfacing real problems).
3. Set `days_back` to 14 for fresh leads, or 90 for retrospective batch runs.
4. Set `min_upvotes` to filter quality. 5 = recent + some traction. 50 = popular. 100+ = front-page.
5. (Optional) `include_keywords`: only include posts mentioning specific terms ("ai", "saas", "developer tool").
6. (Optional) `exclude_keywords`: skip "crypto", "web3", "nft" if those are not your ICP.
7. (Optional) `max_results`: hard cap on the number of leads returned.
8. Hit Start. Results stream into the dataset.
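When the same settings are passed programmatically, it can help to validate them before starting a run. The sketch below builds the input object using the defaults and numeric bounds from the input schema further down this page. The helper `build_input` is hypothetical, and the list of accepted post types is an assumption taken from this README (the schema only requires strings).

```python
from typing import Iterable

# Post types named in this README; the input schema itself only requires strings.
KNOWN_POST_TYPES = ("Show HN", "Launch HN", "Ask HN")


def _check_range(name: str, value: int, low: int, high: int) -> None:
    """Raise ValueError if `value` falls outside the inclusive range [low, high]."""
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")


def build_input(
    post_types: Iterable[str] = ("Show HN", "Launch HN"),
    days_back: int = 14,
    min_upvotes: int = 5,
    max_results: int = 200,
    include_keywords: str = "",
    exclude_keywords: str = "",
    preview_mode: bool = False,
) -> dict:
    """Return a validated Actor input object as a plain dict."""
    post_types = list(post_types)
    unknown = [p for p in post_types if p not in KNOWN_POST_TYPES]
    if unknown:
        raise ValueError(f"unknown post types: {unknown}")
    _check_range("days_back", days_back, 1, 365)
    _check_range("min_upvotes", min_upvotes, 0, 10000)
    _check_range("max_results", max_results, 1, 5000)
    return {
        "post_types": post_types,
        "days_back": days_back,
        "min_upvotes": min_upvotes,
        "max_results": max_results,
        "include_keywords": include_keywords,
        "exclude_keywords": exclude_keywords,
        "preview_mode": preview_mode,
    }
```

For a first free run, `build_input(preview_mode=True)` mirrors step 1 above.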

### The pipeline: combine with AI Sales Personalizer

This Actor returns URLs. The AI Sales Personalizer turns each URL into a personalized cold-outreach opener.

The full pipeline:

1. Run **Show HN Lead Scraper** with `post_types: ["Show HN", "Launch HN"]`, `days_back: 14`, `min_upvotes: 25`. Get back 50 to 200 fresh launches.
2. Export the `url` column as CSV.
3. Feed that CSV into **AI Sales Personalizer** with your product pitch.
4. Get back 50 to 200 personalized openers, one per HN launch.
5. Push into Smartlead or your sequencer for sending.
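Step 2 can also be done in code once the dataset items have been fetched. The sketch below writes the `url` column to CSV text; the function name `urls_to_csv` is hypothetical, and both the `url` header row and the de-duplication are assumptions about what a downstream tool would want, not documented requirements of the AI Sales Personalizer.

```python
import csv
import io
from typing import Iterable, Mapping


def urls_to_csv(items: Iterable[Mapping]) -> str:
    """Return CSV text with a single `url` column, skipping blanks and duplicates."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["url"])
    seen = set()
    for item in items:
        # Some posts (for example Ask HN threads) may have no outbound URL.
        url = (item.get("url") or "").strip()
        if url and url not in seen:
            seen.add(url)
            writer.writerow([url])
    return buffer.getvalue()
```

The `items` argument is expected to be the list of result objects from the run's dataset, each shaped like the field table above.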

Total cost: roughly $2 (Show HN scrape) + $30 (200 personalizations at $0.15 each) = $32 for 200 fresh, personalized B2B leads.

Compare that to Clay at $149/month or an SDR at $4,500/month: $32 buys exactly the leads you need.
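The two-stage total can be reproduced for other batch sizes. This is only a sketch of the arithmetic: `pipeline_cost` is a hypothetical helper, and both rates are the ones quoted in this README rather than values read from either Actor's live pricing.

```python
from decimal import Decimal


def pipeline_cost(
    leads: int,
    scrape_rate: Decimal = Decimal("0.01"),
    personalize_rate: Decimal = Decimal("0.15"),
) -> dict:
    """Return the per-stage and total cost in USD for `leads` leads."""
    scrape = scrape_rate * leads
    personalize = personalize_rate * leads
    return {"scrape": scrape, "personalize": personalize, "total": scrape + personalize}


costs = pipeline_cost(200)
print(costs["scrape"], "+", costs["personalize"], "=", costs["total"])
```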

### What this Actor does NOT do

- It does not extract email addresses from the linked sites. Use Apollo, Clearbit, or Hunter for that.
- It does not scrape comments. The thread URL is provided so you can read them yourself.
- It does not produce personalized openers. Use the AI Sales Personalizer for that.
- It does not send emails. Use Smartlead, Lemlist, or Instantly.

### Tips

1. **Run weekly with `days_back: 7`** to maintain a fresh pipeline of new B2B targets.
2. **Use `min_upvotes: 25`** as the default filter. Below 25, the noise-to-signal ratio is high. Above 100, you compete with everyone who reads HN.
3. **For VC/scout use cases, drop `min_upvotes` to 1** to catch early launches before they go viral.
4. **Pair with `include_keywords`** for niche targeting. Example: `"developer tool, devops, observability"` to focus on infrastructure plays.
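To preview how `include_keywords` and `exclude_keywords` might behave on a list of titles, the rule from the input schema (include if the title contains one of the include terms, skip if it contains any exclude term) can be sketched as below. The helper names are hypothetical, and case-insensitive substring matching is an assumption; the Actor's exact matching rules are not documented here.

```python
from typing import List


def parse_keywords(raw: str) -> List[str]:
    """Split a comma-separated keyword string into lowercased, non-empty terms."""
    return [term.strip().lower() for term in raw.split(",") if term.strip()]


def title_passes(title: str, include_keywords: str = "", exclude_keywords: str = "") -> bool:
    """Return True if the title survives both the exclude and the include filter."""
    lowered = title.lower()
    include = parse_keywords(include_keywords)
    exclude = parse_keywords(exclude_keywords)
    # Any excluded term rejects the title outright.
    if any(term in lowered for term in exclude):
        return False
    # An empty include list means "accept everything".
    if include and not any(term in lowered for term in include):
        return False
    return True
```

With substring matching, a short term such as `ai` also matches words like "maintain", so longer or more specific keywords tend to filter more precisely.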

### FAQ

**Q: Is this allowed by HN?**
The official HN Algolia API is public, documented, and rate-limited. We respect the rate limits.

**Q: Are these high-quality leads?**
Show HN and Launch HN are some of the highest-signal launch feeds available, by design. Founders self-select to be discovered. That said, like all lead sources, you should still personalize and segment.

**Q: Refund policy?**
Apify's standard refund policy applies. Preview mode is free so you can verify quality before paying. Failed scrapes (HN API down) are not charged.

### Tags

`leads` `lead-generation` `hacker-news` `show-hn` `launch-hn` `b2b` `startup-discovery` `sales` `sdr` `outbound`

---

Built by Emily Ward, Sydney. Pairs natively with the AI Sales Personalizer Actor.

# Actor input Schema

## `post_types` (type: `array`):

Which kinds of Hacker News posts to scrape. Show HN and Launch HN are the highest signal for B2B product discovery.
## `days_back` (type: `integer`):

How many days of HN posts to scan. Recent posts (last 7 to 30 days) are usually the best leads.
## `min_upvotes` (type: `integer`):

Filter out posts with fewer upvotes than this. 5 = recent + traction. 50 = popular launches. 100 = front-page material.
## `max_results` (type: `integer`):

Stop after this many leads. Hard cap.
## `include_keywords` (type: `string`):

Only include posts whose title contains one of these keywords. Leave blank for all. Example: "ai, saas, dev tool"
## `exclude_keywords` (type: `string`):

Skip posts whose title contains any of these. Example: "crypto, web3, nft"
## `preview_mode` (type: `boolean`):

If true, returns at most 10 results and does not charge. Use this to verify the actor works before running a real job.

## Actor input object example

```json
{
  "post_types": [
    "Show HN",
    "Launch HN"
  ],
  "days_back": 14,
  "min_upvotes": 5,
  "max_results": 200,
  "include_keywords": "",
  "exclude_keywords": "",
  "preview_mode": false
}
````

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "post_types": [
        "Show HN",
        "Launch HN"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("fetchcraft/show-hn-lead-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "post_types": [
        "Show HN",
        "Launch HN",
    ],
}

# Run the Actor and wait for it to finish
run = client.actor("fetchcraft/show-hn-lead-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "post_types": [
    "Show HN",
    "Launch HN"
  ]
}' |
apify call fetchcraft/show-hn-lead-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=fetchcraft/show-hn-lead-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Show HN Lead Scraper (B2B Startup Discovery)",
        "description": "Pulls Show HN, Launch HN, and Ask HN posts from Hacker News. Extracts company URLs, titles, upvotes, dates. Built to feed into the AI Sales Personalizer for personalized cold outreach. Free preview (10 results). $0.01 per lead after that.",
        "version": "0.1",
        "x-build-id": "JX29YGLRd0LtXaETl"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/fetchcraft~show-hn-lead-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-fetchcraft-show-hn-lead-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/fetchcraft~show-hn-lead-scraper/runs": {
            "post": {
                "operationId": "runs-sync-fetchcraft-show-hn-lead-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/fetchcraft~show-hn-lead-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-fetchcraft-show-hn-lead-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "post_types": {
                        "title": "Post types to include",
                        "type": "array",
                        "description": "Which kinds of Hacker News posts to scrape. Show HN and Launch HN are the highest signal for B2B product discovery.",
                        "default": [
                            "Show HN",
                            "Launch HN"
                        ],
                        "items": {
                            "type": "string"
                        }
                    },
                    "days_back": {
                        "title": "Days back",
                        "minimum": 1,
                        "maximum": 365,
                        "type": "integer",
                        "description": "How many days of HN posts to scan. Recent posts (last 7 to 30 days) are usually the best leads.",
                        "default": 14
                    },
                    "min_upvotes": {
                        "title": "Minimum upvotes",
                        "minimum": 0,
                        "maximum": 10000,
                        "type": "integer",
                        "description": "Filter out posts with fewer upvotes than this. 5 = recent + traction. 50 = popular launches. 100 = front-page material.",
                        "default": 5
                    },
                    "max_results": {
                        "title": "Max results",
                        "minimum": 1,
                        "maximum": 5000,
                        "type": "integer",
                        "description": "Stop after this many leads. Hard cap.",
                        "default": 200
                    },
                    "include_keywords": {
                        "title": "Title must contain (comma-separated, optional)",
                        "type": "string",
                        "description": "Only include posts whose title contains one of these keywords. Leave blank for all. Example: \"ai, saas, dev tool\"",
                        "default": ""
                    },
                    "exclude_keywords": {
                        "title": "Title must NOT contain (comma-separated, optional)",
                        "type": "string",
                        "description": "Skip posts whose title contains any of these. Example: \"crypto, web3, nft\"",
                        "default": ""
                    },
                    "preview_mode": {
                        "title": "Preview mode (free, 10 results max)",
                        "type": "boolean",
                        "description": "If true, returns at most 10 results and does not charge. Use this to verify the actor works before running a real job.",
                        "default": false
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
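The `run-sync-get-dataset-items` path in the specification above can also be called without a client library. The following is a minimal standard-library sketch: the helper names `build_run_request` and `run_and_fetch` are hypothetical, the 300-second timeout is an arbitrary choice, and it assumes the response body is a JSON array of dataset items, which the spec itself does not describe.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/fetchcraft~show-hn-lead-scraper/run-sync-get-dataset-items"


def build_run_request(token: str, actor_input: dict) -> Request:
    """Build the POST request; the token goes in the query string, as in the spec."""
    url = f"{API_BASE}{ACTOR_PATH}?{urlencode({'token': token})}"
    body = json.dumps(actor_input).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_and_fetch(token: str, actor_input: dict, timeout: float = 300.0):
    """Run the Actor synchronously and return the parsed JSON response body."""
    request = build_run_request(token, actor_input)
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `run_and_fetch("<YOUR_API_TOKEN>", {"post_types": ["Show HN"], "preview_mode": True})` would start a preview run. Because the token travels in the URL, avoid logging the full request URL.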
