💬 Subreddit & Comment Scraper

Pricing

Pay per event

Scrape specific subreddits for top posts, historical discussions, and nested comments. Built with proxy-aware routing to bypass aggressive platform blocking.

Rating

0.0

(0)

Developer

太郎 山田

Maintained by Community

Actor stats

  • Bookmarked: 1
  • Total users: 6
  • Monthly active users: 2
  • Last modified: 5 days ago

💬 Reddit Scraper (Legacy Fallback)

This actor is maintained as a legacy, proxy-sensitive fallback in the Reddit Intelligence Pack.

For new workflows:

  • Use reddit-keyword-monitor-alerts for recurring monitoring + net-new alerts
  • Use reddit-all-in-one-scraper for research/backfill collection

Use this actor only when you specifically need the older subreddit-focused flow.

Store Quickstart

Start with the legacy quickstart template (single subreddit). If running on Apify infrastructure, configure Residential proxy first.

Legacy Scope

  • Subreddit-based post scraping
  • Optional comment extraction
  • Basic sort/time controls
  • No recurring snapshot diff monitoring

Input

| Field | Type | Default | Description |
|---|---|---|---|
| subreddits | string[] | (required) | Subreddit names (max 20) |
| sort | string | hot | One of hot, new, top, rising |
| maxItems | integer | 25 | Max posts per subreddit (1-500) |
| includeComments | boolean | false | Include nested comments |

Input Example

{
  "subreddits": ["programming", "technology"],
  "sort": "hot",
  "maxItems": 50,
  "includeComments": true
}
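
The limits in the Input table can be checked locally before a run is started. The following is a minimal sketch, not part of the actor itself: `validate_input` and `ALLOWED_SORTS` are hypothetical names, and treating an empty `subreddits` list as invalid is an assumption based on the field being marked required.

```python
# Hypothetical pre-flight check based on the Input table above.
ALLOWED_SORTS = {"hot", "new", "top", "rising"}


def validate_input(run_input: dict) -> list:
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []

    subreddits = run_input.get("subreddits")
    if not isinstance(subreddits, list) or not subreddits:
        # Assumption: "required" also means the list must not be empty.
        problems.append("subreddits is required and must be a non-empty list")
    elif len(subreddits) > 20:
        problems.append("subreddits accepts at most 20 names")
    elif not all(isinstance(name, str) and name for name in subreddits):
        problems.append("every subreddit name must be a non-empty string")

    sort = run_input.get("sort", "hot")
    if sort not in ALLOWED_SORTS:
        problems.append("sort must be one of hot, new, top, rising")

    max_items = run_input.get("maxItems", 25)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_items, bool) or not isinstance(max_items, int):
        problems.append("maxItems must be an integer")
    elif not 1 <= max_items <= 500:
        problems.append("maxItems must be between 1 and 500")

    if not isinstance(run_input.get("includeComments", False), bool):
        problems.append("includeComments must be a boolean")

    return problems
```

Calling `validate_input` on the example input above returns an empty list; any returned strings describe what to fix before spending a run.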

Output

| Field | Type | Description |
|---|---|---|
| id | string | Reddit post ID |
| title | string | Post title |
| author | string | Username of poster |
| subreddit | string | Subreddit name |
| url | string | Permalink to post |
| score | integer | Upvote score |
| numComments | integer | Comment count |
| createdAt | string | ISO timestamp |
| selftext | string | Post body (for text posts) |
| comments | object[] | Top comments (if includeComments enabled) |

Output Example

{
  "title": "New JavaScript framework released",
  "author": "dev_user",
  "score": 1250,
  "url": "https://example.com/framework",
  "selftext": "Detailed writeup inside...",
  "subreddit": "programming",
  "createdUtc": 1712345678,
  "numComments": 342,
  "comments": [{"author": "...", "body": "..."}]
}
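
The Output table lists `createdAt` as an ISO timestamp, while the example item carries `createdUtc` as a number. A consumer that wants to tolerate either shape could normalize the value as sketched below. `created_iso` is a hypothetical helper, and reading `createdUtc` as Unix epoch seconds is an assumption, not documented behavior.

```python
from datetime import datetime, timezone
from typing import Optional


def created_iso(item: dict) -> Optional[str]:
    """Return the post creation time as an ISO 8601 string, if available."""
    created_at = item.get("createdAt")
    if isinstance(created_at, str):
        # Already an ISO timestamp, as described in the Output table.
        return created_at

    created_utc = item.get("createdUtc")
    if isinstance(created_utc, (int, float)) and not isinstance(created_utc, bool):
        # Assumption: createdUtc is Unix epoch seconds in UTC.
        return datetime.fromtimestamp(created_utc, tz=timezone.utc).isoformat()

    return None
```

For the example item above, `created_iso` returns `2024-04-05T19:34:38+00:00`.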

API Usage

Run this actor programmatically using the Apify API. Replace YOUR_API_TOKEN with your token from Apify Console → Settings → Integrations.

cURL

curl -X POST "https://api.apify.com/v2/acts/taroyamada~reddit-data-scraper/run-sync-get-dataset-items?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "subreddits": ["programming", "technology"], "sort": "hot", "maxItems": 50, "includeComments": true }'

Python

from apify_client import ApifyClient

# Authenticate with your Apify API token.
client = ApifyClient("YOUR_API_TOKEN")

# Start the actor and wait for the run to finish.
run = client.actor("taroyamada/reddit-data-scraper").call(run_input={
    "subreddits": ["programming", "technology"],
    "sort": "hot",
    "maxItems": 50,
    "includeComments": True,
})

# Print every item from the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

JavaScript / Node.js

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Start the actor and wait for the run to finish.
const run = await client.actor('taroyamada/reddit-data-scraper').call({
    subreddits: ['programming', 'technology'],
    sort: 'hot',
    maxItems: 50,
    includeComments: true,
});

// Fetch the items from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);

Tips & Limitations

⚠️ Proxy Required on Apify Datacenter

Reddit blocks many shared datacenter IPs. Without a proxy configured on Apify infrastructure, runs can fail with runStatus: all_blocked and return 0 posts.

To fix: enable Apify Residential proxy (APIFY_USE_APIFY_PROXY=true, APIFY_PROXY_GROUPS=RESIDENTIAL) or provide your own residential PROXY_URL.

Legacy Positioning

  • This actor is not the recommended first choice for new pack users.
  • Prefer reddit-all-in-one-scraper for research/backfill and reddit-keyword-monitor-alerts for recurring alerting.

FAQ

Is this the main Reddit Intelligence Pack actor?

No. This is the legacy fallback actor. New recurring monitor workflows should use reddit-keyword-monitor-alerts.

Does Reddit block this?

Yes, frequently on datacenter IPs. Residential proxy is typically required on Apify cloud.

What is runStatus in output?

| Value | Meaning |
|---|---|
| ok | All subreddits fetched successfully |
| partial | Some subreddits succeeded; others were blocked/errored |
| all_blocked | Every subreddit was blocked; no posts collected (exit code 1) |
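
The three values above map naturally onto follow-up actions in a calling script. The sketch below is illustrative only: `next_action` and `missing_subreddits` are hypothetical helpers, and where exactly `runStatus` appears in the run output is not specified here, so the status is passed in as a plain string.

```python
def next_action(run_status: str) -> str:
    """Map a runStatus value to a suggested follow-up step."""
    actions = {
        "ok": "use the results",
        "partial": "use the results and retry the missing subreddits",
        "all_blocked": "configure a residential proxy and rerun",
    }
    if run_status not in actions:
        raise ValueError(f"unknown runStatus: {run_status!r}")
    return actions[run_status]


def missing_subreddits(requested: list, items: list) -> list:
    """List requested subreddits that have no posts in the output items.

    Caveat: a subreddit with no matching posts also shows up here,
    so an entry is a retry candidate, not proof of a block.
    """
    seen = {str(item.get("subreddit", "")).lower() for item in items}
    return [name for name in requested if name.lower() not in seen]
```

For a partial run, `missing_subreddits(["programming", "technology"], items)` gives the names worth retrying, based on the `subreddit` field of each output item.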

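Reddit Intelligence Pack (recommended path): reddit-keyword-monitor-alerts for recurring monitoring, reddit-all-in-one-scraper for research/backfill.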

Cost

Pay Per Event:

  • actor-start: $0.01 (flat fee per run)
  • dataset-item: $0.003 per output item

Example: 1,000 items = $0.01 + (1,000 × $0.003) = $3.01
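
The same arithmetic can be wrapped in a small helper for budgeting. This is a sketch using the two prices listed above; `estimate_cost` is a hypothetical name, and it assumes every output item is billed as exactly one dataset-item event.

```python
from decimal import Decimal

# Prices from the Pay Per Event list above, in USD.
ACTOR_START_FEE = Decimal("0.01")
DATASET_ITEM_FEE = Decimal("0.003")


def estimate_cost(item_count: int) -> Decimal:
    """Estimate the cost of one run that outputs item_count items."""
    if item_count < 0:
        raise ValueError("item_count must not be negative")
    # Decimal avoids binary floating-point rounding in currency math.
    return ACTOR_START_FEE + DATASET_ITEM_FEE * item_count
```

`estimate_cost(1000)` equals 3.01, matching the worked example, and `estimate_cost(0)` equals the flat 0.01 start fee.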

No subscription required — you only pay for what you use.

⭐ Was this helpful?

If this actor saved you time, please leave a ★ rating on Apify Store. It takes 10 seconds, helps other developers discover it, and keeps updates free.

Bug report or feature request? Open an issue on the Issues tab of this actor.