Reddit Posts & Subreddit Comment Scraper

Under maintenance

Scrape Reddit posts and nested comment trees from specific subreddits. Proxy-aware fallback for the legacy public surface. Sort by hot, top, new, rising with optional comment depth control.

Pricing: Pay per event
Rating: 0.0 (0)
Developer: naoki anzai
Last modified: 15 days ago

💬 Reddit Scraper (Legacy Fallback)

The Subreddit & Comment Scraper extracts targeted discussions from specific subreddits. It is maintained for proxy-sensitive environments and serves as a fallback for workflows that need explicit IP management and older routing methods when Reddit blocks scraping traffic. It handles high-volume subreddits, collecting both parent posts and the nested comment trees that carry user opinions and sentiment.

Research teams, community managers, and OSINT analysts use this scraper to audit specific subreddits, track viral topics, and analyze user feedback. By specifying target subreddits and a sort order (for example, top of all time or newest), you control exactly what data enters your pipeline. Because the tool works outside the standard APIs, it is not bound by their limitations.

Each dataset item carries structured fields: timestamps, upvote scores, author handles, full comment bodies, and post URLs, ready for downstream analysis of social trends.

Note that this is a legacy actor focused on proxy-aware subreddit flows. If your primary goal is recurring keyword alerts across the entire site, or broad user-profile scraping, the newer Reddit All-in-One Scraper is recommended. For subreddit extraction where you control the residential proxies and want a straightforward, list-based collection method, this scraper remains an effective and fully maintained choice.

📄 Live sample output: see docs/sample-output.json for a representative dataset captured from a real run of this actor. Use it to validate the schema before subscribing.

Store Quickstart

Start with store-input.example.json or the Legacy Quickstart (Proxy-aware) template. If you are running on Apify infrastructure, configure a Residential proxy first.
Then work through the legacy ladder in store-input.templates.json:
1. Legacy Quickstart (Proxy-aware)
2. Legacy Recurring Refresh (Proxy-aware)
3. Legacy Webhook Handoff (Proxy-aware)
Buyer-facing proof assets live in sample-output.example.json and live-proof.example.json.
New users who want recurring runs or the full pack should still move to reddit-all-in-one-scraper or reddit-keyword-monitor-alerts once the legacy need is proven.

Legacy Scope

Subreddit-based post scraping
Optional comment extraction
Basic sort/time controls
No recurring snapshot diff monitoring

Input

Field	Type	Default	Description
subreddits	string[]	(required)	Subreddit names (max 20)
sort	string	hot	Sort order: hot, new, top, or rising
maxItems	integer	25	Max posts per subreddit (1-500)
includeComments	boolean	false	Include nested comments

Input Example

{
  "subreddits": ["programming", "technology"],
  "sort": "hot",
  "maxItems": 50,
  "includeComments": true
}
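
The constraints in the input table can be checked locally before a run is started. The following is a minimal sketch, not part of the actor: `validate_input` is a hypothetical helper, it covers only the four fields documented in the table for subreddit-based runs, and it assumes that "required" means a non-empty list of names.

```python
from typing import List

ALLOWED_SORTS = {"hot", "new", "top", "rising"}


def validate_input(run_input: dict) -> List[str]:
    """Return a list of problems; an empty list means the input looks valid.

    Hypothetical helper: the limits below are copied from the input table
    (max 20 subreddits, maxItems between 1 and 500).
    """
    problems = []

    subreddits = run_input.get("subreddits")
    if not isinstance(subreddits, list) or not subreddits:
        # Assumption: "required" means at least one subreddit name.
        problems.append("subreddits is required and must be a non-empty list")
    elif len(subreddits) > 20:
        problems.append("subreddits accepts at most 20 names")
    elif not all(isinstance(name, str) and name for name in subreddits):
        problems.append("every subreddit name must be a non-empty string")

    if run_input.get("sort", "hot") not in ALLOWED_SORTS:
        problems.append("sort must be one of hot, new, top, rising")

    max_items = run_input.get("maxItems", 25)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_items, bool) or not isinstance(max_items, int):
        problems.append("maxItems must be an integer")
    elif not 1 <= max_items <= 500:
        problems.append("maxItems must be between 1 and 500")

    if not isinstance(run_input.get("includeComments", False), bool):
        problems.append("includeComments must be a boolean")

    return problems
```

Calling `validate_input` on the example above returns an empty list; any message in the result names the field to fix before the run is submitted.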

Input Examples

Example: Top of all time in a subreddit

{
  "subreddits": [
    "DataIsBeautiful"
  ],
  "sort": "top",
  "time": "all",
  "maxItems": 25,
  "includeComments": true,
  "commentDepth": 2
}

Example: Newest posts (multi-subreddit)

{
  "subreddits": [
    "MachineLearning",
    "datascience"
  ],
  "sort": "new",
  "maxItems": 50,
  "includeComments": false
}

Example: Specific post + comment tree

{
  "posts": [
    "https://old.reddit.com/r/programming/comments/abc123/"
  ],
  "includeComments": true,
  "commentDepth": 5
}

Output

Field	Type	Description
`id`	string	Reddit post ID
`title`	string	Post title
`author`	string	Username of poster
`subreddit`	string	Subreddit name
`url`	string	Permalink to post
`score`	integer	Upvote score
`numComments`	integer	Comment count
`createdAt`	string	ISO timestamp
`selftext`	string	Post body (for text posts)
`comments`	object[]	Top comments (if includeComments enabled)

Output Example

{
  "title": "New JavaScript framework released",
  "author": "dev_user",
  "score": 1250,
  "url": "https://example.com/framework",
  "selftext": "Detailed writeup inside...",
  "subreddit": "programming",
  "createdUtc": 1712345678,
  "numComments": 342,
  "comments": [{"author": "...", "body": "..."}]
}
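
Downstream code usually wants the post time as a timezone-aware datetime. The sketch below is a hypothetical helper, not part of the actor: it accepts either an ISO string under `createdAt` (as listed in the output table) or epoch seconds under `createdUtc` (as shown in the example), and it assumes that an ISO string without an offset is in UTC.

```python
from datetime import datetime, timezone
from typing import Optional


def created_time(item: dict) -> Optional[datetime]:
    """Return the post creation time as an aware UTC datetime, or None.

    Hypothetical helper: it handles both timestamp shapes that appear
    in this document and makes no claim about which one a run emits.
    """
    iso_value = item.get("createdAt")
    if iso_value:
        # datetime.fromisoformat rejects a trailing "Z" before Python 3.11.
        parsed = datetime.fromisoformat(str(iso_value).replace("Z", "+00:00"))
        if parsed.tzinfo is None:
            # Assumption: timestamps without an offset are UTC.
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed.astimezone(timezone.utc)

    epoch_value = item.get("createdUtc")
    if epoch_value is not None:
        return datetime.fromtimestamp(epoch_value, tz=timezone.utc)

    return None
```

For the example item above, `created_time(item)` gives 2024-04-05 19:34:38 UTC.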

API Usage

Run this actor programmatically using the Apify API. Replace YOUR_API_TOKEN with your token from Apify Console → Settings → Integrations.

cURL

curl -X POST "https://api.apify.com/v2/acts/taroyamada~reddit-data-scraper/run-sync-get-dataset-items?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "subreddits": ["programming", "technology"], "sort": "hot", "maxItems": 50, "includeComments": true }'

Python

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")
run = client.actor("taroyamada/reddit-data-scraper").call(run_input={
  "subreddits": ["programming", "technology"],
  "sort": "hot",
  "maxItems": 50,
  "includeComments": True
})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

JavaScript / Node.js

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });
const run = await client.actor('taroyamada/reddit-data-scraper').call({
  "subreddits": ["programming", "technology"],
  "sort": "hot",
  "maxItems": 50,
  "includeComments": true
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);

Tips & Limitations

⚠️ Proxy Required on Apify Datacenter

Reddit blocks many shared datacenter IPs. Without proxy setup on Apify infra, runs can fail with runStatus: all_blocked and 0 posts.

To fix: enable Apify Residential proxy (APIFY_USE_APIFY_PROXY=true, APIFY_PROXY_GROUPS=RESIDENTIAL) or provide your own residential PROXY_URL.

Legacy Positioning

This actor is not the recommended first choice for new pack users.
Prefer reddit-all-in-one-scraper for research/backfill and reddit-keyword-monitor-alerts for recurring alerting.

FAQ

Is this the main Reddit Intelligence Pack actor?

No. This is the legacy fallback actor. New recurring monitor workflows should use reddit-keyword-monitor-alerts.

Does Reddit block this?

Yes, frequently on datacenter IPs. Residential proxy is typically required on Apify cloud.

What is runStatus in output?

Value	Meaning
`ok`	All subreddits fetched successfully
`partial`	Some subreddits succeeded; others were blocked/errored
`all_blocked`	Every subreddit was blocked — no posts collected (exit code 1)
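
The three statuses above map naturally onto a retry decision. This is a minimal sketch under stated assumptions: `subreddits_to_retry` is a hypothetical helper, the caller has already read `runStatus` from the run's output (this document does not say where it appears), and each item carries the `subreddit` field from the output table.

```python
from typing import Iterable, List


def subreddits_to_retry(run_status: str, requested: List[str],
                        items: Iterable[dict]) -> List[str]:
    """Return the requested subreddits that should be fetched again."""
    if run_status == "ok":
        return []
    if run_status == "all_blocked":
        # Nothing was collected; check the proxy setup, then retry everything.
        return list(requested)
    if run_status == "partial":
        seen = {str(item.get("subreddit", "")).lower() for item in items}
        # Caveat: a subreddit that simply has no posts also ends up here.
        return [name for name in requested if name.lower() not in seen]
    raise ValueError(f"unknown runStatus: {run_status!r}")
```

A `partial` run over `["programming", "technology"]` that only returned items from `programming` would yield `["technology"]`.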

Reddit Intelligence Pack (recommended path):

🚨 Reddit Keyword Monitor Alerts — Hero recurring monitor for net-new alerts.
📡 Reddit All-in-One Scraper — Research/backfill companion.
📰 Article Extractor — Linked URL cleanup add-on.
🐘 Mastodon Hashtag & Account Scraper — Federated social listening (Twitter/X-free), same query/result shape on the Fediverse.

Cost

Pay Per Event:

actor-start: $0.01 (flat fee per run)
dataset-item: $0.003 per output item

Example: 1,000 items = $0.01 + (1,000 × $0.003) = $3.01
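
The same arithmetic can be scripted for budgeting. This is a small sketch using the two prices listed above; `estimate_cost` is a hypothetical helper, and `Decimal` is used so that cent values are not distorted by binary floating point.

```python
from decimal import Decimal

ACTOR_START_FEE = Decimal("0.01")   # flat fee per run, from the price list above
DATASET_ITEM_FEE = Decimal("0.003")  # fee per output item, from the price list above


def estimate_cost(items: int, runs: int = 1) -> Decimal:
    """Estimate the pay-per-event cost in USD for a number of items and runs."""
    if items < 0 or runs < 0:
        raise ValueError("items and runs must be non-negative")
    return runs * ACTOR_START_FEE + items * DATASET_ITEM_FEE
```

`estimate_cost(1000)` reproduces the $3.01 example, and passing `runs` adds one start fee per run for recurring schedules.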

No subscription required — you only pay for what you use.

💾 Save it for later: click the bookmark icon at the top of the Apify Store page if you'd like to come back to it. Bookmarks help other engineers find this actor via Apify's discovery surfaces.

⭐ Was Reddit Posts & Subreddit Comment Scraper useful for your Reddit research?

If this actor saved you time, please leave a 5★ rating on Apify Store — it takes 10 seconds, helps other engineers and analysts discover it, and keeps updates free.

Have a feature request, bug, or sample workflow you'd like to share? Open an issue — we read every one and use them to prioritise the next release.
