# Reddit Scraper — Posts & Comments (`signalengine/reddit-scraper`) Actor

Scrape posts and comments from any subreddit — no Reddit API key, no login, no proxy. A fast, free Reddit API alternative for public data, exported to JSON, CSV or Excel.

- **URL**: https://apify.com/signalengine/reddit-scraper.md
- **Developed by:** [James Taylor](https://apify.com/signalengine) (community)
- **Categories:** Developer tools, Lead generation, Social media
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $1.00 / 1,000 posts

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Reddit Scraper — Posts & Comments

This **Reddit scraper** pulls posts and their top-level comments from any subreddit — fast,
**login-free, no API key, no proxy**. Pick your subreddits and sort order, and export clean,
structured Reddit data to JSON, CSV, or Excel in one run.

It's a lightweight **Reddit RSS scraper** built for researchers, marketers, data teams, and
builders who need Reddit posts and comments without accounts, OAuth, or rate-limit headaches.

### Lightweight Reddit scraper vs. the Comment Tree Scraper

This Actor reads Reddit's public RSS feeds. That's what makes it fast and cheap, but RSS only
exposes text content (title, author, body, link, timestamp), **not** engagement counts or nested
reply threads. So `score` and `numComments` come back `null`, and comments are a **flat list of
top-level replies**.

Use this lightweight **subreddit scraper** for a quick, cheap pass over recent posts across many
subreddits — titles, bodies, authors, permalinks, and top-level comments as text, with no proxy
and no Reddit account.

Use the premium **[Reddit Comment Tree Scraper](https://apify.com/signalengine/reddit-deep-comments)**
(actor `reddit-deep-comments`) instead when you need **upvote scores**, **full nested comment
trees** (replies-to-replies, at depth), and richer engagement metrics that only its heavier,
residential-proxy-backed fetch can reach.

Same data shape, different depth — pick the one that matches the job.

### What it does

- Scrapes posts from one or more **subreddits**, sorted by `hot`, `new`, `rising`, or `top`.
- Optionally attaches each post's **top-level comments** (one extra request per post).
- Returns clean records — title, author, body, permalink, timestamp, and a nested `comments`
  array — ready to export to JSON/CSV/Excel or pull via the Apify API.
- Caps your spend with a hard `maxPosts` limit and stays polite with low default concurrency.

### Who it's for

- **Researchers & data teams** collecting subreddit text for analysis or datasets.
- **Marketers & community managers** tracking what people post and discuss in their niches.
- **Builders & founders** who want raw Reddit posts feeding their own pipeline — no login,
  proxy, or API key required.

### Input

| Field | Type | Default | Description |
|---|---|---|---|
| `subreddits` | array | `["SaaS"]` | Subreddits to scrape (with or without the `r/` prefix), e.g. `"SaaS"`, `"technology"`. Required. |
| `sort` | string | `hot` | Which listing to scrape: `hot`, `new`, `rising`, or `top`. |
| `maxPosts` | integer | `100` | Total posts to scrape across all subreddits (caps your spend). |
| `includeComments` | boolean | `true` | Fetch each post's top-level comments (one extra request per post). |
| `maxCommentsPerPost` | integer | `20` | Cap comments captured per post (0–100). |
| `maxConcurrency` | integer | `4` | Parallel requests (1–20, kept low to stay polite). |

#### Example input

```json
{
  "subreddits": ["SaaS", "Entrepreneur"],
  "sort": "hot",
  "maxPosts": 50,
  "includeComments": true,
  "maxCommentsPerPost": 25,
  "maxConcurrency": 4
}
````
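
The ranges in the input table can be checked client-side before a run is started. The following is a minimal sketch rather than part of the Actor: `build_input` is a hypothetical helper, the limits are copied from the table above, and stripping the `r/` prefix is only a convenience, since the Actor is documented to accept either form.

```python
ALLOWED_SORTS = {"hot", "new", "rising", "top"}


def build_input(subreddits, sort="hot", max_posts=100, include_comments=True,
                max_comments_per_post=20, max_concurrency=4):
    """Build an Actor input dict, enforcing the documented ranges.

    Hypothetical helper for illustration only.
    """
    if not subreddits:
        raise ValueError("subreddits is required and must not be empty")
    if sort not in ALLOWED_SORTS:
        raise ValueError(f"sort must be one of {sorted(ALLOWED_SORTS)}")
    if max_posts < 1:
        raise ValueError("maxPosts must be at least 1")
    if not 0 <= max_comments_per_post <= 100:
        raise ValueError("maxCommentsPerPost must be between 0 and 100")
    if not 1 <= max_concurrency <= 20:
        raise ValueError("maxConcurrency must be between 1 and 20")

    # Optional normalization: drop a leading "r/" so names are uniform.
    names = [s[2:] if s.lower().startswith("r/") else s for s in subreddits]

    return {
        "subreddits": names,
        "sort": sort,
        "maxPosts": max_posts,
        "includeComments": include_comments,
        "maxCommentsPerPost": max_comments_per_post,
        "maxConcurrency": max_concurrency,
    }


actor_input = build_input(["r/SaaS", "Entrepreneur"], max_posts=50)
```

The resulting dictionary can be passed as the run input to any of the client examples in the API section.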

### How to run

1. Click **Try for free** (or open the actor in your Apify Console).
2. Enter the **subreddits** you want to scrape and pick a **sort** (`new` surfaces the freshest
   posts; `top` and `hot` surface the most-seen).
3. Set **maxPosts** to cap your spend.
4. (Optional) Toggle **includeComments** and set **maxCommentsPerPost** to control how many
   top-level comments come back with each post.
5. Click **Start**. When the run finishes, open the **Dataset** tab and export to JSON/CSV/Excel,
   or pull it via the API (below).

Run it on a **schedule** (Apify Schedules) for a fresh pull every morning, or call it from
**Make / Zapier / n8n** via the Apify integrations.

### Output

Each dataset item is a post with its top-level comments:

```json
{
  "type": "post",
  "id": "1tuxy4e",
  "subreddit": "SaaS",
  "author": "Significant-Honey204",
  "title": "Can anyone help me get reviews on G2?",
  "body": "We just launched and...",
  "postUrl": "https://www.reddit.com/r/SaaS/comments/1tuxy4e/can_anyone_help_me_get_reviews_on_g2",
  "createdAt": "2026-06-02T09:12:00.000Z",
  "score": null,
  "numComments": null,
  "commentCount": 5,
  "comments": [
    {
      "author": "Impossible-Ebb-2446",
      "body": "Try asking in your onboarding emails.",
      "commentUrl": "https://www.reddit.com/r/SaaS/comments/1tuxy4e/_/opcuxfu",
      "createdAt": "2026-06-02T09:40:00.000Z"
    }
  ]
}
```

Field notes:

- **`title` / `body` / `author` / `postUrl` / `createdAt`** come straight from the post's RSS
  entry — exactly what Reddit publishes, never fabricated.
- **`commentCount`** is the number of top-level comments captured (capped by
  `maxCommentsPerPost`), and **`comments`** is a **flat array of top-level replies** — each with
  `author`, `body`, `commentUrl`, and `createdAt`.
- **`score` and `numComments` are always `null`.** RSS doesn't expose upvote counts or comment
  totals, so this actor returns `null` rather than guess. If you need **engagement scores or full
  nested comment trees**, use the
  **[Reddit Comment Tree Scraper](https://apify.com/signalengine/reddit-deep-comments)**
  (`reddit-deep-comments`) instead.
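
Because comments are nested inside each post, a spreadsheet-style analysis usually needs one row per comment. The sketch below shows one way to flatten a dataset item, assuming the item shape from the example above. The row keys (`postId`, `commentAuthor`, and so on) and both function names are made up for this illustration.

```python
def comment_rows(post):
    """Yield one flat row per top-level comment of a post item."""
    # "comments" may be empty or missing when includeComments is off.
    for comment in post.get("comments") or []:
        yield {
            "postId": post["id"],
            "subreddit": post["subreddit"],
            "postTitle": post["title"],
            "commentAuthor": comment["author"],
            "commentBody": comment["body"],
            "commentUrl": comment["commentUrl"],
            "commentCreatedAt": comment["createdAt"],
        }


def has_engagement(post):
    """True only if both engagement fields are populated (never, for this Actor)."""
    return post.get("score") is not None and post.get("numComments") is not None


# Abbreviated copy of the example item above.
sample = {
    "id": "1tuxy4e",
    "subreddit": "SaaS",
    "title": "Can anyone help me get reviews on G2?",
    "score": None,
    "numComments": None,
    "comments": [
        {
            "author": "Impossible-Ebb-2446",
            "body": "Try asking in your onboarding emails.",
            "commentUrl": "https://www.reddit.com/r/SaaS/comments/1tuxy4e/_/opcuxfu",
            "createdAt": "2026-06-02T09:40:00.000Z",
        }
    ],
}

rows = list(comment_rows(sample))
```

Checking `has_engagement` before sorting or filtering by `score` avoids comparing against `None` in downstream code.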

#### Export & API

```bash
# Last run's dataset items as JSON
curl "https://api.apify.com/v2/datasets/<DATASET_ID>/items?format=json&token=<APIFY_TOKEN>"
```

Or use the **run-sync-get-dataset-items** endpoint to run-and-wait in a single call — handy for
embedding the actor in your own backend.
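
As a rough illustration of the run-and-wait call, the sketch below uses only the Python standard library. The endpoint path and the `token` query parameter are taken from the OpenAPI specification in this document; the function names and the 300-second timeout are assumptions chosen for the example.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_ID = "signalengine~reddit-scraper"  # "/" is written as "~" in API paths


def run_sync_url(token):
    """Build the run-sync-get-dataset-items URL for this Actor."""
    query = urllib.parse.urlencode({"token": token})
    return f"{API_BASE}/acts/{ACTOR_ID}/run-sync-get-dataset-items?{query}"


def run_and_fetch(token, actor_input, timeout=300):
    """POST the input, wait for the run, and return the dataset items.

    The timeout is an arbitrary choice for this sketch; long runs may need
    more time, or the asynchronous /runs endpoint instead.
    """
    request = urllib.request.Request(
        run_sync_url(token),
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call such as `run_and_fetch("<APIFY_TOKEN>", {"subreddits": ["SaaS"], "maxPosts": 10})` would start a billable run, so keep `maxPosts` small while testing.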

### Pricing

Apify Pay-Per-Event — you're charged per **post** returned (comments are included at no extra
charge). Set `maxPosts` to cap your spend.
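
For budgeting, the upper bound on a run's cost follows directly from `maxPosts`. The sketch below assumes the listed "from $1.00 / 1,000 posts" rate applies to your account; the actual per-post price may differ, so treat the constant as a placeholder.

```python
from decimal import Decimal

# Listed starting price; an assumption, not a guaranteed rate.
PRICE_PER_1000_POSTS = Decimal("1.00")


def max_cost_usd(max_posts, price_per_1000=PRICE_PER_1000_POSTS):
    """Upper bound on per-post charges for a run capped at max_posts."""
    return Decimal(max_posts) * price_per_1000 / Decimal(1000)


budget = max_cost_usd(50)  # cap for the example input with maxPosts = 50
```

`Decimal` is used instead of floats so that small per-post amounts add up without rounding drift.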

### Limitations

- **No engagement counts.** RSS doesn't expose upvote `score` or `numComments`, so those are
  `null` — we never fabricate them.
- **Top-level comments, flat.** RSS returns top-level comments without nested reply trees or
  per-comment scores. For scores and full threaded trees, use the
  **[Reddit Comment Tree Scraper](https://apify.com/signalengine/reddit-deep-comments)**.
- **RSS depth.** Each subreddit feed returns its most recent ~25 posts; scan more subreddits or
  schedule runs for broader coverage rather than a full historical export.
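
The RSS depth limit interacts with `maxPosts`: a high cap cannot be reached from a single subreddit. The arithmetic can be sketched as follows, assuming each subreddit contributes one feed of about 25 distinct posts. The figure of 25 is the approximate value stated above, and both function names are hypothetical.

```python
import math

POSTS_PER_FEED = 25  # approximate; the feed size is described as "~25"


def reachable_posts(num_subreddits, max_posts, per_feed=POSTS_PER_FEED):
    """Estimate how many posts a single run can return."""
    return min(max_posts, num_subreddits * per_feed)


def subreddits_needed(max_posts, per_feed=POSTS_PER_FEED):
    """Estimate how many subreddits are needed to reach max_posts in one run."""
    return math.ceil(max_posts / per_feed)


estimate = reachable_posts(num_subreddits=2, max_posts=100)
```

Under these assumptions, two subreddits with the default `maxPosts` of 100 would yield about 50 posts, and roughly four subreddits would be needed to reach the cap.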

### Compliance

This actor reads **public Reddit RSS only**, identifies itself with a descriptive User-Agent,
runs at modest concurrency, and never logs in, posts, votes, or messages. You are responsible for
using the exported Reddit data in line with Reddit's terms and any laws that apply to you.

### FAQ

**Do I need a Reddit account, API key, or proxy to use this Reddit scraper?** No. It reads
public Reddit RSS feeds with plain HTTP requests — no login, no OAuth, no API key, and no proxy.

**Why are `score` and `numComments` null?** Reddit only exposes upvote and comment counts on
endpoints it blocks to scrapers. This actor reads RSS, which doesn't include them, so it returns
`null` rather than guess.

**How do I get upvote scores or full comment trees?** Use the premium
**[Reddit Comment Tree Scraper](https://apify.com/signalengine/reddit-deep-comments)** (actor
`reddit-deep-comments`). It uses a heavier, residential-proxy-backed fetch to return upvote
scores and complete nested comment trees — the depth this lightweight RSS scraper can't reach.

**How is it priced and how do I control cost?** Apify Pay-Per-Event — you're charged per **post**
returned, with comments included free. Set `maxPosts` to cap your spend before each run.

**How many comments per post does it return?** As many top-level comments as you set in
`maxCommentsPerPost` (default 20, up to 100). Set it to 0, or turn off `includeComments`, to scrape
posts only.

**Which sort order should I pick?** Use `new` for the freshest posts, `hot` or `top` for the
most-engaged threads, and `rising` to catch posts gaining traction. Each subreddit feed returns
roughly 25 posts.

**Can I export Reddit data to CSV or Excel, and how fresh is it?** Yes — every run's dataset
exports to JSON, CSV, or Excel from the Apify Console, or via the API. The actor reads the live
RSS feed on each run, so results reflect the subreddit at run time. Pair `sort: "new"` with an
Apify Schedule to catch posts as they appear.
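
Scheduled runs against the same subreddit will usually return overlapping posts, so a downstream pipeline typically deduplicates them. Here is a minimal sketch, assuming the `id` field is stable for a given post across runs; `merge_new_posts` is a hypothetical helper, and in practice the set of seen IDs would be persisted between runs.

```python
def merge_new_posts(seen_ids, items):
    """Return the items not seen before and record their IDs in seen_ids."""
    fresh = []
    for item in items:
        if item["id"] not in seen_ids:
            seen_ids.add(item["id"])
            fresh.append(item)
    return fresh


seen = set()
first_run = [{"id": "a1"}, {"id": "b2"}]
second_run = [{"id": "b2"}, {"id": "c3"}]

new_from_first = merge_new_posts(seen, first_run)
new_from_second = merge_new_posts(seen, second_run)
```

Only the post with ID `c3` is new in the second batch; `b2` is skipped because it was already recorded.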

***

#### Looking for *leads*, not raw data?

If you want buyer-intent posts turned into a lead list, see our **Reddit Lead Finder**. And if
you'd like the whole outbound loop automated — intent discovery, enrichment, AI-personalised
outreach, and reply handling — that's what we build at
**[SignalEngine](https://engine.signalsprint.io)**.

# Actor input Schema

## `subreddits` (type: `array`):

Subreddits to scrape (with or without the r/ prefix), e.g. "SaaS", "technology".

## `sort` (type: `string`):

Which listing to scrape.

## `maxPosts` (type: `integer`):

Total posts to scrape across all subreddits (caps your spend).

## `includeComments` (type: `boolean`):

Fetch each post's top-level comments (one extra request per post).

## `maxCommentsPerPost` (type: `integer`):

Cap comments captured per post.

## `maxConcurrency` (type: `integer`):

Parallel requests. Kept low by default to stay polite.

## Actor input object example

```json
{
  "subreddits": [
    "SaaS",
    "Entrepreneur"
  ],
  "sort": "hot",
  "maxPosts": 100,
  "includeComments": true,
  "maxCommentsPerPost": 20,
  "maxConcurrency": 4
}
```

# Actor output Schema

## `posts` (type: `string`):

Scraped posts with nested comments.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "subreddits": [
        "SaaS",
        "Entrepreneur"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("signalengine/reddit-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "subreddits": [
        "SaaS",
        "Entrepreneur",
    ],
}

# Run the Actor and wait for it to finish
run = client.actor("signalengine/reddit-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "subreddits": [
    "SaaS",
    "Entrepreneur"
  ]
}' |
apify call signalengine/reddit-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=signalengine/reddit-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Reddit Scraper — Posts & Comments",
        "description": "Scrape posts and comments from any subreddit — no Reddit API key, no login, no proxy. A fast, free Reddit API alternative for public data, exported to JSON, CSV or Excel.",
        "version": "0.1",
        "x-build-id": "LSbzDnK7ad8ITS5JT"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/signalengine~reddit-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-signalengine-reddit-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/signalengine~reddit-scraper/runs": {
            "post": {
                "operationId": "runs-sync-signalengine-reddit-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/signalengine~reddit-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-signalengine-reddit-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "subreddits"
                ],
                "properties": {
                    "subreddits": {
                        "title": "Subreddits",
                        "type": "array",
                        "description": "Subreddits to scrape (with or without the r/ prefix), e.g. \"SaaS\", \"technology\".",
                        "default": [
                            "SaaS"
                        ],
                        "items": {
                            "type": "string"
                        }
                    },
                    "sort": {
                        "title": "Sort",
                        "enum": [
                            "hot",
                            "new",
                            "rising",
                            "top"
                        ],
                        "type": "string",
                        "description": "Which listing to scrape.",
                        "default": "hot"
                    },
                    "maxPosts": {
                        "title": "Max posts",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Total posts to scrape across all subreddits (caps your spend).",
                        "default": 100
                    },
                    "includeComments": {
                        "title": "Include comments",
                        "type": "boolean",
                        "description": "Fetch each post's top-level comments (one extra request per post).",
                        "default": true
                    },
                    "maxCommentsPerPost": {
                        "title": "Max comments per post",
                        "minimum": 0,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Cap comments captured per post.",
                        "default": 20
                    },
                    "maxConcurrency": {
                        "title": "Max concurrency",
                        "minimum": 1,
                        "maximum": 20,
                        "type": "integer",
                        "description": "Parallel requests. Kept low by default to stay polite.",
                        "default": 4
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
