# Hacker News Search Scraper Stories, Comments, Show HN, Ask HN (`seemuapps/hn-search-scraper`) Actor

Search Hacker News stories, comments, Show HN, Ask HN, polls, and jobs by keyword, author, date range, points, and comment count. Full text and engagement metrics. No login.

- **URL**: https://apify.com/seemuapps/hn-search-scraper.md
- **Developed by:** [Andrew](https://apify.com/seemuapps) (community)
- **Categories:** News, Lead generation, Open source
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

From $1.00 / 1,000 HN items returned

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
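
As a rough illustration of the pay-per-event arithmetic, the sketch below estimates what a run might cost from the number of items it returns. It assumes that the listed starting price of $1.00 per 1,000 items applies and that every returned item counts as one billable event; your actual rate may be higher, and `estimate_cost` is a hypothetical helper, not part of any Apify library.

```python
from decimal import Decimal

# Assumption: the listed starting price; your actual rate may differ.
PRICE_PER_1000_ITEMS = Decimal("1.00")


def estimate_cost(items_returned: int,
                  price_per_1000: Decimal = PRICE_PER_1000_ITEMS) -> Decimal:
    """Estimate the charge in USD for a run that returns `items_returned` items."""
    if items_returned < 0:
        raise ValueError("items_returned must be non-negative")
    # Scale the per-1,000 price linearly and round to 4 decimal places.
    return (price_per_1000 * items_returned / 1000).quantize(Decimal("0.0001"))


if __name__ == "__main__":
    for count in (100, 2500):
        print(f"{count} items -> ${estimate_cost(count)}")
```
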

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Hacker News Search Scraper — Stories, Comments, Show HN, Ask HN

Search Hacker News stories, comments, Show HN, Ask HN, polls, and jobs by keyword, author, date range, points, and comment count. Full text and engagement metrics. No login.

### What you get

- Item ID, type (story / comment / show_hn / ask_hn / poll / job), title, URL
- Author, points, comment count, and full body text
- For comments: parent story ID, story title, story URL — perfect for thread reconstruction
- Created timestamp (ISO + Unix), HN permalink, and full tag list
- Filter by date range, minimum points, minimum comments, and author
- Sort by relevance (most upvoted matches) or date (newest first)
- Direct export to JSON, CSV, Excel, or Google Sheets

### Use cases

- Mention monitoring — track every HN post that mentions your product, brand, or competitor
- Market signal — spot rising tools, libraries, or trends before they hit the front page
- Author research — pull every Show HN, Ask HN, or comment from a specific user
- Topic datasets — build a corpus of HN discussions on a niche topic for analysis or LLM fine-tuning
- Lead generation — find founders posting about problems your product solves

### How to use

1. Enter a **Search query** (e.g. `openai`, `rust web framework`) — leave empty to fetch the latest items in the chosen tag
2. Pick an **Item type** — Stories, Comments, Show HN, Ask HN, Polls, Jobs, Front Page Stories, or Any
3. Optionally filter by **Author**, **Created after**, **Created before**, **Minimum points**, or **Minimum comments**
4. Choose **Sort** — Relevance (popular matches first) or Date (newest first)
5. Set **Max items** (default 100; 0 for unlimited)
6. Run the Actor; each item appears as one row in the **Dataset** tab
7. To fetch more results, open the **Key-value store** tab → copy the `NEXT_PAGE_ID` value → paste it into **Page ID** on your next run

### Output format

One HN item per dataset row — perfect for direct CSV, Excel, or Google Sheets export:

```json
{
  "objectID": "12345678",
  "type": "story",
  "title": "Show HN: My new project",
  "url": "https://example.com/project",
  "author": "exampleuser",
  "points": 234,
  "numComments": 87,
  "storyText": null,
  "commentText": null,
  "storyId": null,
  "storyTitle": null,
  "storyUrl": null,
  "parentId": null,
  "createdAt": "2026-05-12T10:00:00.000Z",
  "createdAtUnix": 1778580000,
  "hnUrl": "https://news.ycombinator.com/item?id=12345678",
  "tags": ["story", "author_exampleuser", "show_hn"]
}
````

For a **comment** row: `commentText` is populated, plus `storyId`, `storyTitle`, `storyUrl`, and `parentId` link back to the parent thread.
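
To illustrate how these link-back fields support thread reconstruction, here is a minimal sketch that groups comment rows by `storyId` and nests replies under their parents via `parentId`. It assumes IDs can be compared after casting to strings, and it treats any comment whose parent is not among the fetched rows as top-level. `build_threads` and the sample rows are hypothetical.

```python
from collections import defaultdict


def build_threads(rows):
    """Return {story_id: [top-level nodes]}; each node holds its nested replies."""
    comments = [r for r in rows if r.get("type") == "comment"]
    # Wrap each comment in a node keyed by its objectID (cast to str for safety).
    nodes = {str(r["objectID"]): {"row": r, "replies": []} for r in comments}
    threads = defaultdict(list)
    for r in comments:
        node = nodes[str(r["objectID"])]
        parent = nodes.get(str(r.get("parentId")))
        if parent is not None:
            # The parent is another fetched comment: attach as a reply.
            parent["replies"].append(node)
        else:
            # The parent is the story itself, or a comment missing from the rows.
            threads[str(r["storyId"])].append(node)
    return dict(threads)


if __name__ == "__main__":
    sample = [
        {"objectID": "2", "type": "comment", "storyId": 1, "parentId": 1,
         "commentText": "First"},
        {"objectID": "3", "type": "comment", "storyId": 1, "parentId": 2,
         "commentText": "Reply"},
    ]
    for story_id, top_level in build_threads(sample).items():
        print(story_id, [n["row"]["commentText"] for n in top_level])
```
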

### Pagination

The Algolia HN Search API caps relevance-sorted results at about 1,000 items; date sort can return more. The Actor saves a resume cursor (the next page index) to the default **Key-value store** under `NEXT_PAGE_ID`.

1. Open the **Key-value store** tab on the run page
2. Copy the value of `NEXT_PAGE_ID`
3. Start a new run and paste it into **Page ID**

When `NEXT_PAGE_ID` is `null`, all matching results have been fetched.
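
The manual steps above could also be automated. The sketch below chains runs by reading `NEXT_PAGE_ID` from each run's default key-value store and passing it as `pageId` to the next run. It assumes a client object that behaves like the synchronous `ApifyClient` from the Python `apify-client` package (runs and records returned as dictionaries) and that the stored value can be passed back as a string. `next_run_input` and `fetch_all` are hypothetical helpers.

```python
NEXT_PAGE_KEY = "NEXT_PAGE_ID"
ACTOR_ID = "seemuapps/hn-search-scraper"


def next_run_input(base_input, record):
    """Return the input for the follow-up run, or None when results are exhausted.

    `record` is what the key-value store returned for NEXT_PAGE_ID:
    None if the record is missing, otherwise a dict with a "value" entry.
    """
    value = record.get("value") if record else None
    if value is None:
        return None
    # Assumption: the cursor may be stored as a number; pageId is a string field.
    return {**base_input, "pageId": str(value)}


def fetch_all(client, base_input, max_runs=5):
    """Chain runs until NEXT_PAGE_ID is null or `max_runs` runs have finished."""
    items, run_input = [], dict(base_input)
    for _ in range(max_runs):
        run = client.actor(ACTOR_ID).call(run_input=run_input)
        items.extend(client.dataset(run["defaultDatasetId"]).iterate_items())
        store = client.key_value_store(run["defaultKeyValueStoreId"])
        run_input = next_run_input(base_input, store.get_record(NEXT_PAGE_KEY))
        if run_input is None:
            break
    return items
```

A call might look like `fetch_all(ApifyClient("<YOUR_API_TOKEN>"), {"query": "openai", "sortBy": "date"})`. The `max_runs` cap is a deliberate safety limit, since every returned item is billed.
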

### Input options

| Field | Type | Description |
|-------|------|-------------|
| Search query | string | Keyword(s) — leave empty for "all items in this tag" |
| Item type | enum | Stories, Comments, Show HN, Ask HN, Polls, Jobs, Front Page, Any |
| Author | string | HN username (case-sensitive) |
| Sort by | enum | Relevance or Date |
| Created after | string | YYYY-MM-DD UTC |
| Created before | string | YYYY-MM-DD UTC |
| Minimum points | integer | Story points ≥ N |
| Minimum comments | integer | Story comments ≥ N |
| Max items | integer | Cap per run — default 100, 0 for unlimited |
| Page ID | string | `NEXT_PAGE_ID` from the previous run, to resume pagination |

# Actor input Schema

## `query` (type: `string`):

Keyword(s) to search across titles, URLs, story text, and comment text. Leave empty to fetch the most recent items in the chosen tag.

## `tag` (type: `string`):

Restrict results to a single item type. 'Front Page Stories' = stories that have appeared on the HN front page.

## `author` (type: `string`):

Optional. HN username to restrict results to (case-sensitive). Leave empty to search all authors.

## `sortBy` (type: `string`):

Relevance favors highly-upvoted matches; Date returns the most recent items first.

## `dateAfter` (type: `string`):

Optional. Only return items created on or after this UTC date.

## `dateBefore` (type: `string`):

Optional. Only return items created on or before this UTC date.

## `minPoints` (type: `integer`):

Only return stories with at least this many points. Ignored for comments.

## `minComments` (type: `integer`):

Only return stories with at least this many comments. Ignored for comments.

## `maxItems` (type: `integer`):

Maximum results to return. 0 = no cap. Algolia caps relevance results at ~1000; date sort can return more.

## `pageId` (type: `string`):

Optional. Paste the NEXT\_PAGE\_ID from a previous run's Key-value store to resume from where you left off.
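
The Actor's internals are not documented here. Purely as an illustration, the sketch below shows how the filter fields above could map onto the `tags` and `numericFilters` parameters of the public Algolia HN Search API. The mapping itself is an assumption rather than the Actor's implementation (including how the inclusive `dateBefore` is handled), and `to_algolia_params` is a hypothetical helper.

```python
from datetime import date, datetime, timedelta, timezone


def _utc_midnight(day: date) -> int:
    """Unix timestamp of 00:00 UTC on the given date."""
    return int(datetime(day.year, day.month, day.day, tzinfo=timezone.utc).timestamp())


def to_algolia_params(inp: dict) -> dict:
    """Translate Actor-style input into Algolia HN Search query parameters."""
    tags = []
    tag = inp.get("tag", "story")
    if tag != "any":
        tags.append(tag)
    if inp.get("author"):
        tags.append(f"author_{inp['author']}")

    filters = []
    if inp.get("dateAfter"):
        start = date.fromisoformat(inp["dateAfter"])
        filters.append(f"created_at_i>={_utc_midnight(start)}")
    if inp.get("dateBefore"):
        # Inclusive end date: keep everything before the following midnight.
        end = date.fromisoformat(inp["dateBefore"]) + timedelta(days=1)
        filters.append(f"created_at_i<{_utc_midnight(end)}")
    if inp.get("minPoints"):
        filters.append(f"points>={inp['minPoints']}")
    if inp.get("minComments"):
        filters.append(f"num_comments>={inp['minComments']}")

    params = {"query": inp.get("query", "")}
    if tags:
        params["tags"] = ",".join(tags)  # comma means AND
    if filters:
        params["numericFilters"] = ",".join(filters)
    return params


if __name__ == "__main__":
    print(to_algolia_params({"query": "rust", "author": "pg", "minPoints": 10}))
```
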

## Actor input object example

```json
{
  "query": "openai",
  "tag": "story",
  "sortBy": "relevance",
  "minPoints": 0,
  "minComments": 0,
  "maxItems": 100
}
```
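
Before starting a paid run, it can help to validate the input locally. The following sketch builds an input object like the one above and checks it against the enums and minimums listed in the OpenAPI specification further down. `build_input` is a hypothetical helper, and the check that `dateAfter` does not come after `dateBefore` is an added assumption, not a documented rule.

```python
from datetime import date

TAGS = {"any", "story", "comment", "show_hn", "ask_hn", "poll", "job", "front_page"}
SORTS = {"relevance", "date"}


def build_input(query="", tag="story", sort_by="relevance", author=None,
                date_after=None, date_before=None,
                min_points=0, min_comments=0, max_items=100, page_id=None):
    """Return a validated Actor input dict; raise ValueError on bad values."""
    if tag not in TAGS:
        raise ValueError(f"unknown tag: {tag!r}")
    if sort_by not in SORTS:
        raise ValueError(f"unknown sortBy: {sort_by!r}")
    for name, value in (("minPoints", min_points), ("minComments", min_comments),
                        ("maxItems", max_items)):
        if value < 0:
            raise ValueError(f"{name} must be >= 0")
    # date.fromisoformat raises ValueError for strings that are not valid ISO dates.
    start = date.fromisoformat(date_after) if date_after else None
    end = date.fromisoformat(date_before) if date_before else None
    if start and end and start > end:
        raise ValueError("dateAfter must not be later than dateBefore")

    result = {"query": query, "tag": tag, "sortBy": sort_by,
              "minPoints": min_points, "minComments": min_comments,
              "maxItems": max_items}
    # Optional fields are included only when set.
    optional = {"author": author, "dateAfter": date_after,
                "dateBefore": date_before, "pageId": page_id}
    result.update({k: v for k, v in optional.items() if v})
    return result


if __name__ == "__main__":
    print(build_input(query="openai", date_after="2025-01-01"))
```
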

# Actor output Schema

## `results` (type: `string`):

One row per HN item: objectID, type, title, url, author, points, numComments, storyText, commentText, parent story info, createdAt, hnUrl, tags.

## `nextPageId` (type: `string`):

NEXT\_PAGE\_ID record in the default key-value store. Paste into Page ID on the next run to resume; null when results are exhausted.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "query": "openai",
    "author": "",
    "dateAfter": "",
    "dateBefore": ""
};

// Run the Actor and wait for it to finish
const run = await client.actor("seemuapps/hn-search-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "query": "openai",
    "author": "",
    "dateAfter": "",
    "dateBefore": "",
}

# Run the Actor and wait for it to finish
run = client.actor("seemuapps/hn-search-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "query": "openai",
  "author": "",
  "dateAfter": "",
  "dateBefore": ""
}' |
apify call seemuapps/hn-search-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=seemuapps/hn-search-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Hacker News Search Scraper Stories, Comments, Show HN, Ask HN",
        "description": "Search Hacker News stories, comments, Show HN, Ask HN, polls, and jobs by keyword, author, date range, points, and comment count. Full text and engagement metrics. No login.",
        "version": "1.0",
        "x-build-id": "sHJQUDKDo5XH5WXXh"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/seemuapps~hn-search-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-seemuapps-hn-search-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/seemuapps~hn-search-scraper/runs": {
            "post": {
                "operationId": "runs-sync-seemuapps-hn-search-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/seemuapps~hn-search-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-seemuapps-hn-search-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "query": {
                        "title": "Search query",
                        "type": "string",
                        "description": "Keyword(s) to search across titles, URLs, story text, and comment text. Leave empty to fetch the most recent items in the chosen tag."
                    },
                    "tag": {
                        "title": "Item type",
                        "enum": [
                            "any",
                            "story",
                            "comment",
                            "show_hn",
                            "ask_hn",
                            "poll",
                            "job",
                            "front_page"
                        ],
                        "type": "string",
                        "description": "Restrict results to a single item type. 'Front Page Stories' = stories that have appeared on the HN front page.",
                        "default": "story"
                    },
                    "author": {
                        "title": "Author",
                        "type": "string",
                        "description": "Optional. HN username to restrict results to (case-sensitive). Leave empty to search all authors."
                    },
                    "sortBy": {
                        "title": "Sort by",
                        "enum": [
                            "relevance",
                            "date"
                        ],
                        "type": "string",
                        "description": "Relevance favors highly-upvoted matches; Date returns the most recent items first.",
                        "default": "relevance"
                    },
                    "dateAfter": {
                        "title": "Created after (YYYY-MM-DD)",
                        "type": "string",
                        "description": "Optional. Only return items created on or after this UTC date."
                    },
                    "dateBefore": {
                        "title": "Created before (YYYY-MM-DD)",
                        "type": "string",
                        "description": "Optional. Only return items created on or before this UTC date."
                    },
                    "minPoints": {
                        "title": "Minimum points",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Only return stories with at least this many points. Ignored for comments.",
                        "default": 0
                    },
                    "minComments": {
                        "title": "Minimum comments",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Only return stories with at least this many comments. Ignored for comments.",
                        "default": 0
                    },
                    "maxItems": {
                        "title": "Max items",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Maximum results to return. 0 = no cap. Algolia caps relevance results at ~1000; date sort can return more.",
                        "default": 100
                    },
                    "pageId": {
                        "title": "Page ID (pagination)",
                        "type": "string",
                        "description": "Optional. Paste the NEXT_PAGE_ID from a previous run's Key-value store to resume from where you left off."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
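
For projects that do not use a client library, the `run-sync-get-dataset-items` endpoint from the specification above can be called directly. Below is a minimal standard-library sketch. It passes the token as the `token` query parameter, as the specification describes, and assumes the response body is a JSON array of dataset items. `build_request` and `run_sync` are hypothetical helpers.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/seemuapps~hn-search-scraper/run-sync-get-dataset-items"


def build_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build the POST request that runs the Actor and returns its dataset items."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        f"{BASE_URL}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token: str, actor_input: dict, timeout: float = 300.0):
    """Send the request and decode the JSON response (assumed: a list of items)."""
    with urllib.request.urlopen(build_request(token, actor_input), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Building the request makes no network call; run_sync() would start a paid run.
    req = build_request("<YOUR_API_TOKEN>", {"query": "openai", "maxItems": 10})
    print(req.get_method(), req.full_url)
```

Keeping request construction separate from sending makes the URL and body easy to inspect or log before any billable run starts.
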
