# Reddit Scraper — Posts, Comments, Any Subreddit, JSON (`knotless_cadence/reddit-scraper-pro`) Actor

Scrape Reddit posts and comments from any subreddit. Get titles, authors, scores, comment counts, and URLs. Export to JSON or CSV. No Reddit API key needed. Use for market research, sentiment analysis, and trend monitoring. Email spinov001@gmail.com. Tips: t.me/scraping_ai

- **URL**: https://apify.com/knotless_cadence/reddit-scraper-pro.md
- **Developed by:** [Alex](https://apify.com/knotless_cadence) (community)
- **Categories:** Social media, Marketing
- **Stats:** 3 total users, 1 monthly user, 88.9% of runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor itself is free to use; you only pay for Apify platform usage, which becomes cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, covering all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anywhere from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Reddit Scraper Pro — API-Based, Never Breaks on Redesigns

The most reliable Reddit scraper on Apify. Uses **Reddit's native JSON API** instead of HTML parsing — so it never breaks when Reddit updates their UI.

### Why This Scraper?

Most Reddit scrapers use HTML/CSS selectors that break every time Reddit changes their design. This scraper uses **Reddit's official JSON endpoint** (`/r/subreddit.json`) — the same data format Reddit's own apps use. This means:

- ✅ **Never breaks on redesigns** — JSON API is separate from the UI
- ✅ **Complete data** — 20+ fields per post, full comment trees
- ✅ **Structured output** — clean JSON, no HTML parsing artifacts
- ✅ **No login required** — public data, no credentials needed
- ✅ **Built-in rate limiting** — respects Reddit's API limits, won't get you banned
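
The JSON endpoint pattern described above is simple enough to sketch. A minimal helper, assuming the documented listing shape `/r/<subreddit>/<sort>.json` (the function name is illustrative, not part of the Actor's code):

```python
def listing_url(subreddit: str, sort: str = "hot",
                time_filter: str = "week", limit: int = 100) -> str:
    """Build the public Reddit listing URL this scraping approach reads from."""
    sub = subreddit.removeprefix("r/")  # accept both "programming" and "r/programming"
    url = f"https://www.reddit.com/r/{sub}/{sort}.json?limit={limit}"
    if sort == "top":
        url += f"&t={time_filter}"  # time filter only applies when sorting by "top"
    return url
```

Because the endpoint serves JSON directly, there are no CSS selectors to maintain when Reddit's front end changes.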

### Features

- **20+ data fields per post** — title, author, score, upvote ratio, comment count, flair, awards, URL, self text, link URL, domain, NSFW flag, stickied status, and more
- **Full comment threads** — nested comments with author, score, depth level, and creation date
- **Multiple subreddits** — scrape `r/programming`, `r/datascience`, `r/Entrepreneur` in one run
- **Cross-Reddit search** — find posts by keyword across all of Reddit
- **Flexible sorting** — hot, new, top, rising with time filters (hour/day/week/month/year)
- **Automatic pagination** — follows Reddit's cursor-based pagination for 500+ posts
- **Proxy support** — uses Apify Proxy (residential) for reliable access
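
The cursor-based pagination mentioned above can be sketched with an injected page fetcher. Here `fetch_page` is a hypothetical callback that requests one listing page and returns `(posts, after_cursor)`; it stands in for the Actor's internal request logic:

```python
def paginate(fetch_page, max_posts: int):
    """Collect posts by following Reddit's `after` cursor until exhausted."""
    posts, after = [], None
    while len(posts) < max_posts:
        page_posts, after = fetch_page(after)
        posts.extend(page_posts)
        if after is None or not page_posts:  # null cursor marks the last page
            break
    return posts[:max_posts]
```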

### Output Data (20+ fields)

```json
{
  "id": "1b2c3d4",
  "title": "What tools do you use for market research?",
  "author": "startup_founder",
  "subreddit": "Entrepreneur",
  "score": 847,
  "upvoteRatio": 0.94,
  "numComments": 234,
  "createdUtc": "2026-03-17T15:30:00.000Z",
  "url": "https://reddit.com/r/Entrepreneur/comments/...",
  "selfText": "I've been looking for affordable tools...",
  "linkUrl": "https://example.com/article",
  "flair": "Discussion",
  "awards": 3,
  "isNSFW": false,
  "isStickied": false,
  "domain": "self.Entrepreneur",
  "thumbnail": "https://...",
  "comments": [
    {
      "id": "abc123",
      "author": "data_analyst",
      "body": "I use a combination of...",
      "score": 156,
      "createdUtc": "2026-03-17T16:00:00.000Z",
      "depth": 0
    }
  ]
}
````
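
Items with this shape are straightforward to post-process. A minimal sketch, assuming a list of dataset items like the example above (the helper name is illustrative):

```python
def top_posts(items, min_score: int = 100):
    """Return posts at or above min_score, highest score first."""
    return sorted(
        (p for p in items if p.get("score", 0) >= min_score),
        key=lambda p: p["score"],
        reverse=True,
    )
```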

### Use Cases

- **Market research** — discover what people say about your product, brand, or industry
- **Sentiment analysis** — collect posts and comments for NLP models
- **AI training data** — build datasets from Reddit discussions for LLM fine-tuning
- **Trend monitoring** — track emerging topics and viral content in real-time
- **Competitive intelligence** — monitor competitor mentions and complaints
- **Content research** — find top questions and topics your audience cares about
- **Lead generation** — identify users asking for your type of product/service
- **Academic research** — gather social media data for papers and studies

### Input Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `subreddits` | Array | `[]` | Subreddit names (e.g., `["technology", "startups"]`) |
| `searchQueries` | Array | `[]` | Search terms across all of Reddit |
| `maxPostsPerSource` | Number | `50` | Max posts per subreddit/query (1-500) |
| `includeComments` | Boolean | `true` | Extract comment threads |
| `maxCommentsPerPost` | Number | `20` | Max comments per post |
| `sortBy` | String | `"hot"` | Sort: hot, new, top, rising |
| `timeFilter` | String | `"week"` | Time filter: hour, day, week, month, year |
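
For example, a market-research run combining the parameters above might look like this (the subreddit names and query are placeholders):

```json
{
  "subreddits": ["Entrepreneur", "startups"],
  "searchQueries": ["market research tools"],
  "maxPostsPerSource": 100,
  "includeComments": true,
  "maxCommentsPerPost": 20,
  "sortBy": "top",
  "timeFilter": "month"
}
```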

### Technical Details

- **Method:** Reddit JSON API (`/r/subreddit.json`, `/search.json`)
- **Proxy:** Apify residential proxy for reliable access
- **Rate limiting:** Built-in delays between requests (2-3 seconds)
- **Pagination:** Cursor-based (Reddit's `after` parameter)
- **Error handling:** Graceful handling of 403/429 errors with retry logic
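
The retry behaviour described above can be sketched with an injected `fetch` callable returning `(status, body)`. This is an illustrative pattern, not the Actor's internal code:

```python
import time

def fetch_with_retry(fetch, max_retries: int = 3,
                     retryable=(403, 429), sleep=time.sleep):
    """Retry retryable HTTP errors with exponential backoff, capped at 60 s."""
    for attempt in range(max_retries + 1):
        status, body = fetch()
        if status == 200:
            return body
        if status in retryable and attempt < max_retries:
            sleep(min(60.0, 2.0 * (2 ** attempt)))  # 2 s, 4 s, 8 s, ...
            continue
        raise RuntimeError(f"request failed with HTTP {status}")
```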

### Cost Estimation

- ~$0.50 per 100 posts without comments
- ~$1.00 per 100 posts with full comment threads
- Free tier available with Apify free plan

### FAQ

**Q: Why JSON API instead of HTML scraping?**
A: HTML scrapers break every time Reddit updates their design. The JSON API returns structured data in a format that hasn't changed in years. It's the same API Reddit's mobile app uses.

**Q: Can I scrape private subreddits?**
A: No — only publicly accessible subreddits. This scraper uses public endpoints.

**Q: Does it need my Reddit credentials?**
A: No. All data is fetched from public JSON endpoints.

**Q: How many posts can I get per run?**
A: Up to 500 posts per subreddit with pagination. Multiple subreddits can be scraped in one run.

***

*Part of 60+ data tools by knotless_cadence on Apify. Related tools:*

- [Social Profile Scraper](https://apify.com/knotless_cadence/social-profile-scraper) — Extract public profile data from social media platforms
- [MCP Social Monitor](https://apify.com/knotless_cadence/mcp-social-monitor) — AI-powered social media monitoring and sentiment tracking
- [Threads Scraper](https://apify.com/knotless_cadence/threads-scraper) — Scrape posts, profiles, and replies from Meta Threads

### More Tools

- [60+ free scrapers](https://github.com/spinov001-art/awesome-web-scraping-2026)
- [15 MCP Servers for AI Agents](https://github.com/spinov001-art/mcp-servers-collection)
- [Market Research Reports](https://link.payoneer.com/Token?t=E82590E5D2534557BF2FDBD721411A64&src=pl)

# Actor input Schema

## `subreddits` (type: `array`):

List of subreddits to scrape (e.g., 'technology', 'r/startups', 'programming')

## `searchQueries` (type: `array`):

Search Reddit for specific topics (e.g., 'best CRM for startups', 'AI tools 2026')

## `maxPostsPerSource` (type: `integer`):

Maximum posts to extract per subreddit or search query

## `includeComments` (type: `boolean`):

Extract comments for each post

## `maxCommentsPerPost` (type: `integer`):

Maximum number of comments to extract per post

## `sortBy` (type: `string`):

Sort order for posts within a subreddit

## `timeFilter` (type: `string`):

Time range filter for posts (applies when sorting by 'top')

## Actor input object example

```json
{
  "subreddits": [],
  "searchQueries": [],
  "maxPostsPerSource": 50,
  "includeComments": true,
  "maxCommentsPerPost": 20,
  "sortBy": "hot",
  "timeFilter": "week"
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input (values taken from the Actor's documented input example)
const input = {
    subreddits: ["technology", "startups"],
    maxPostsPerSource: 50,
    includeComments: true,
    maxCommentsPerPost: 20,
    sortBy: "hot",
    timeFilter: "week",
};

// Run the Actor and wait for it to finish
const run = await client.actor("knotless_cadence/reddit-scraper-pro").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input (values taken from the Actor's documented input example)
run_input = {
    "subreddits": ["technology", "startups"],
    "maxPostsPerSource": 50,
    "includeComments": True,
    "maxCommentsPerPost": 20,
    "sortBy": "hot",
    "timeFilter": "week",
}

# Run the Actor and wait for it to finish
run = client.actor("knotless_cadence/reddit-scraper-pro").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
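
The description also mentions CSV export. The Apify Console can export datasets directly, but you can flatten items in the client as well. A sketch, assuming items shaped like the output example earlier (nested `comments` are dropped; the helper name is illustrative):

```python
import csv
import io

def items_to_csv(items, fields=("id", "title", "author", "score", "numComments")):
    """Serialize dataset items to CSV, keeping only the listed flat fields."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(fields), extrasaction="ignore")
    writer.writeheader()
    writer.writerows(items)
    return buf.getvalue()
```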

## CLI example

```bash
echo '{"subreddits": ["technology"], "maxPostsPerSource": 50}' |
apify call knotless_cadence/reddit-scraper-pro --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=knotless_cadence/reddit-scraper-pro",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Reddit Scraper — Posts, Comments, Any Subreddit, JSON",
        "description": "Scrape Reddit posts and comments from any subreddit. Get title author score comments URLs. Export JSON CSV. No Reddit API key needed. Market research sentiment analysis trend monitoring. Email spinov001@gmail.com. Tips t.me/scraping_ai",
        "version": "1.0",
        "x-build-id": "UyWbuaFx39kXneUem"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/knotless_cadence~reddit-scraper-pro/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-knotless_cadence-reddit-scraper-pro",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/knotless_cadence~reddit-scraper-pro/runs": {
            "post": {
                "operationId": "runs-sync-knotless_cadence-reddit-scraper-pro",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/knotless_cadence~reddit-scraper-pro/run-sync": {
            "post": {
                "operationId": "run-sync-knotless_cadence-reddit-scraper-pro",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "subreddits": {
                        "title": "Subreddits",
                        "type": "array",
                        "description": "List of subreddits to scrape (e.g., 'technology', 'r/startups', 'programming')",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "searchQueries": {
                        "title": "Search Queries",
                        "type": "array",
                        "description": "Search Reddit for specific topics (e.g., 'best CRM for startups', 'AI tools 2026')",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxPostsPerSource": {
                        "title": "Max Posts Per Source",
                        "minimum": 1,
                        "maximum": 500,
                        "type": "integer",
                        "description": "Maximum posts to extract per subreddit or search query",
                        "default": 50
                    },
                    "includeComments": {
                        "title": "Include Comments",
                        "type": "boolean",
                        "description": "Extract comments for each post",
                        "default": true
                    },
                    "maxCommentsPerPost": {
                        "title": "Max Comments Per Post",
                        "minimum": 1,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Maximum number of comments to extract per post",
                        "default": 20
                    },
                    "sortBy": {
                        "title": "Sort By",
                        "enum": [
                            "hot",
                            "new",
                            "top",
                            "rising"
                        ],
                        "type": "string",
                        "description": "Sort order for posts within a subreddit",
                        "default": "hot"
                    },
                    "timeFilter": {
                        "title": "Time Filter",
                        "enum": [
                            "hour",
                            "day",
                            "week",
                            "month",
                            "year",
                            "all"
                        ],
                        "type": "string",
                        "description": "Time range filter for posts (applies when sorting by 'top')",
                        "default": "week"
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
