Reddit Scraper - Posts, Comments & Users

Pricing

from $3.00 / 1,000 results scraped

Extract posts, comments, communities & user profiles from any subreddit at scale. Fetches all comments including hidden/collapsed ones. Breaks Reddit's 1000-post limit with date windowing. No login needed, no browser. $0.003 per result. Supports search, sorting, NSFW filtering & date filtering.

Rating: 0.0 (0)
Developer: Better Devs Scrape
Maintained by Community
Actor stats: 3 bookmarked, 704 total users, 322 monthly active users
Issues response: 0.1 hours
Last modified: 2 months ago

📡 Reddit Scraper — Extract Posts, Comments, Users & Subreddit Data

The most feature-rich Reddit scraper on Apify. Extract posts, comments, communities, and user profiles from any subreddit, search query, or URL — at just $0.003 per result. No Reddit account needed, no browser, pure HTTP.

A powerful Reddit API alternative for market research, sentiment analysis, brand monitoring, lead generation, and academic research. Paste URLs or type subreddit names, click Run, get clean structured JSON, CSV, or Excel.


🏆 Why Choose This Reddit Scraper?

| Feature | This Scraper | Other Reddit Scrapers |
| --- | --- | --- |
| 💰 Price per result | $0.003 | $0.004–$0.005+ |
| Speed | 50 items in under 60 seconds | Minutes with browser |
| 💬 All comments | ✅ Fetches collapsed/hidden comments | ❌ Only top ~500 |
| 📅 Break 1000-post limit | ✅ Date windowing bypasses cap | ❌ Stuck at ~1000 |
| 🎯 Post filters | ✅ Score, flair, domain, author | ❌ None |
| 👤 User profiles | ✅ Karma, account age, mod status | ❌ Posts only |
| 🏘️ Community data | ✅ Members, rules, description | ❌ Not available |
| 🛡️ Resilience | ✅ Auto-retry + fallback mirrors | ❌ Fails on 429/403 |
| 🧠 Memory | 256 MB (no browser) | 1 GB+ with browser |
| 📱 Simple input | ✅ Just type subreddit names | ❌ Full URLs only |
| 🔑 Login required | ❌ No | Often yes |

🌐 What Is Reddit Scraping?

Reddit scraping is the process of automatically extracting data from Reddit — the world's largest forum with 1.7+ billion monthly visits. Instead of manually copying posts and comments, a Reddit scraper collects structured data (titles, text, scores, timestamps, user info) from subreddits, search results, and user profiles at scale.

Common reasons to scrape Reddit:

  • Market research — Track what customers say about your product or competitors
  • Sentiment analysis — Feed Reddit discussions into NLP pipelines for opinion mining
  • Content discovery — Find trending topics, viral posts, and emerging discussions
  • Lead generation — Identify users discussing problems your product solves
  • Brand monitoring — Get alerts when your brand is mentioned across subreddits
  • Academic research — Build datasets of online discussions for social science studies
  • SEO & content ideas — Discover what questions real people ask in any niche
  • AI training data — Collect diverse conversational data for language model fine-tuning

This scraper works as a Reddit API alternative — no OAuth tokens, no rate limit headaches, no developer application needed. Just paste URLs or type subreddit names and get data.


📊 What Data Can You Extract?

| Category | What You Get |
| --- | --- |
| 📝 Posts | Title, body, score, comment count, media URLs, flair, awards, gallery images, video URLs |
| 💬 Comments | Full comment trees including nested replies, scores, author flair, depth level |
| 🏘️ Communities | Subreddit metadata, member counts, active users, description, rules, type |
| 👤 Users | Karma breakdown, account age, premium status, moderator status, verification |
| 🔍 Search | Search across all of Reddit or within specific subreddits |

⚙️ How to Scrape Reddit

1️⃣ Paste Reddit URLs (subreddits, posts, user profiles)
2️⃣ Set your limits (max posts, comments, items)
3️⃣ Click "Start" and get clean, structured data

The scraper automatically:

  • 💬 Fetches all comments — Including collapsed/hidden comments beyond Reddit's initial ~500
  • 🛡️ Resilient — Automatic retries with fallback mirrors when Reddit is unavailable
  • 🔄 Anti-blocking — Built-in session management to avoid rate limits
  • 📅 Breaks pagination limits — Date windowing gets posts beyond Reddit's ~1000 cap

📥 Input Example

{
  "subreddits": ["technology", "programming"],
  "searches": ["artificial intelligence"],
  "minScore": 50,
  "flairFilter": "Discussion",
  "maxItems": 100,
  "maxPostCount": 25,
  "maxComments": 50,
  "includeNSFW": false,
  "proxy": { "useApifyProxy": true, "apifyProxyGroups": ["RESIDENTIAL"] }
}

📋 Input Parameters

What to Scrape

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| startUrls | array | | Reddit URLs to scrape (posts, subreddits, users, leaderboard) |
| subreddits | string[] | | Subreddit names to scrape (without r/ prefix) |
| searches | string[] | | Search queries to run on Reddit |
| searchCommunityName | string | | Restrict search to a specific subreddit |
| searchPosts | boolean | true | Include posts in search results |
| searchComments | boolean | false | Include comments in search results |
| searchCommunities | boolean | false | Include subreddits in search results |
| searchUsers | boolean | false | Include users in search results |

Limits

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| maxItems | integer | 50 | Maximum total items saved to dataset |
| maxPostCount | integer | 25 | Max posts per subreddit or search |
| maxComments | integer | 10 | Max comments per post (0 to skip) |
| maxCommunitiesCount | integer | 2 | Max community listing pages |
| maxUserCount | integer | 2 | Max user listing pages |
| postsPerSource | integer | 0 | Max posts per individual source (0 = no limit) |

Sorting & Filtering

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| sort | string | "hot" | Sort order: relevance, hot, top, new, rising, comments |
| time | string | "all" | Time filter: all, hour, day, week, month, year |
| commentSort | string | "confidence" | Comment sort: confidence (best), top, new, controversial, old, qa |
| includeNSFW | boolean | false | Include NSFW content |
| postDateLimit | string | | Only scrape posts after this date (YYYY-MM-DD) |

Post Filters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| minScore | integer | | Only posts with at least N upvotes |
| maxScore | integer | | Only posts with at most N upvotes |
| minComments | integer | | Only posts with at least N comments |
| maxCommentsFilter | integer | | Only posts with at most N comments |
| flairFilter | string | | Only posts matching this flair |
| domainFilter | string | | Only posts from this domain |
| authorFilter | string | | Only posts by this author |
| postsPerSource | integer | 0 | Max posts per individual source |

Skip Options

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| skipComments | boolean | false | Skip comment extraction (faster) |
| skipUserPosts | boolean | false | Skip user's submitted posts |
| skipCommunity | boolean | false | Skip subreddit metadata |

Advanced

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| enableDateWindowing | boolean | false | Break past Reddit's ~1000 post limit using date-range windows |
| proxy | object | Residential | Proxy config (residential strongly recommended) |
| debugMode | boolean | false | Enable verbose logging |

📤 Sample Output

Post

{
  "id": "t3_1rda27h",
  "parsedId": "1rda27h",
  "url": "https://www.reddit.com/r/AskReddit/comments/1rda27h/what_screams_i_am_deeply_insecure/",
  "username": "curious_mind",
  "title": "What screams \"I am deeply insecure\" but people do it thinking it makes them look cool?",
  "communityName": "r/AskReddit",
  "body": null,
  "numberOfComments": 8432,
  "upVotes": 24567,
  "upVoteRatio": 0.94,
  "isVideo": false,
  "over18": false,
  "createdAt": "2026-02-24T08:15:00.000Z",
  "scrapedAt": "2026-02-24T15:58:42.000Z",
  "flair": null,
  "link": null,
  "imageUrls": [],
  "videoUrl": null,
  "isGallery": false,
  "stickied": false,
  "locked": false,
  "archived": false,
  "spoiler": false,
  "awardsCount": 12,
  "dataType": "post"
}

Comment

{
  "id": "t1_lm8x9y2",
  "parsedId": "lm8x9y2",
  "postId": "1rda27h",
  "url": "https://www.reddit.com/r/AskReddit/comments/1rda27h/-/lm8x9y2/",
  "parentId": "t3_1rda27h",
  "username": "witty_replier",
  "communityName": "r/AskReddit",
  "body": "People who brag about how little sleep they get.",
  "createdAt": "2026-02-24T09:30:00.000Z",
  "scrapedAt": "2026-02-24T15:58:43.000Z",
  "upVotes": 3421,
  "numberOfReplies": 87,
  "depth": 0,
  "dataType": "comment"
}
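Comments arrive as flat items, so rebuilding the reply tree is left to the consumer. The sketch below nests comments under their parents using the id and parentId fields from the sample above. It assumes that a reply's parentId holds the parent comment's id in the same "t1_..." form, which follows Reddit's naming convention but is not confirmed by this page; build_comment_tree is a hypothetical helper, not part of the Actor.

```python
from collections import defaultdict


def build_comment_tree(comments):
    """Nest flat comment items under their parents using parentId.

    Assumption: replies carry parentId == the parent comment's "id"
    (e.g. "t1_abc"), and top-level comments point at the post ("t3_...").
    """
    children = defaultdict(list)
    for comment in comments:
        children[comment["parentId"]].append(comment)

    def attach(parent_fullname):
        # Return the replies to one parent, each with its own replies nested.
        return [
            {**c, "replies": attach(c["id"])}
            for c in children.get(parent_fullname, [])
        ]

    # Roots are the posts themselves; each maps to its top-level comments.
    roots = [pid for pid in children if pid.startswith("t3_")]
    return {pid: attach(pid) for pid in roots}
```

The result is a dict keyed by post id, where every comment gains a "replies" list; comments whose parent was not scraped are simply left out of the tree.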

Community

{
  "id": "2qh1i",
  "name": "t5_2qh1i",
  "title": "Ask Reddit...",
  "displayName": "AskReddit",
  "numberOfMembers": 48000000,
  "activeUserCount": 12500,
  "subredditType": "public",
  "description": "r/AskReddit is the place to ask and answer thought-provoking questions.",
  "over18": false,
  "createdAt": "2008-01-25T00:00:00.000Z",
  "scrapedAt": "2026-02-24T15:58:32.000Z",
  "url": "https://www.reddit.com/r/AskReddit/",
  "dataType": "community"
}

User

{
  "id": "1w72lch",
  "userId": "t2_1w72lch",
  "url": "https://www.reddit.com/user/spez/",
  "username": "spez",
  "totalKarma": 654321,
  "postKarma": 123456,
  "commentKarma": 530865,
  "isGold": true,
  "isMod": true,
  "hasVerifiedEmail": true,
  "createdAt": "2005-06-06T00:00:00.000Z",
  "scrapedAt": "2026-02-24T15:49:43.000Z",
  "dataType": "user"
}

💰 How Much Does It Cost to Scrape Reddit?

This Actor uses Pay-Per-Event pricing. You only pay for results actually saved.

| Event | Cost |
| --- | --- |
| 📄 Per result (post, comment, community, or user) | $0.003 |

💵 Cost Examples

| Scenario | Items | Estimated Cost |
| --- | --- | --- |
| 1 subreddit, 25 posts, 10 comments each | ~276 | ~$0.83 |
| 10 subreddits, 50 posts each, no comments | ~510 | ~$1.53 |
| Search query, 100 posts, 50 comments each | ~5,100 | ~$15.30 |
| Single post with all comments (2000+) | ~2,001 | ~$6.00 |
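The estimates above follow from counting every saved item at $0.003. A rough calculator for planning a run might look like the sketch below. It assumes each scraped subreddit also contributes one community item, which is what the 276 and 510 figures imply; estimate_cost is an illustrative helper, not part of the Actor.

```python
PRICE_PER_RESULT = 0.003  # USD per saved item, from the pricing table


def estimate_cost(posts, comments_per_post=0, communities=0, users=0):
    """Return (items, cost_usd) for a planned run.

    Every post, comment, community, and user counts as one billed result.
    """
    items = posts + posts * comments_per_post + communities + users
    return items, round(items * PRICE_PER_RESULT, 2)


# 1 subreddit, 25 posts, 10 comments each: 25 + 250 + 1 community item
items, cost = estimate_cost(posts=25, comments_per_post=10, communities=1)
print(f"{items} items, about ${cost:.2f}")
```

Actual totals depend on how many posts and comments exist and on the maxItems cap, so treat the result as an upper-bound estimate.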

🎉 No login, no browser, no hidden costs. Pure HTTP scraping means low compute costs for you.


🔌 Integrate With Your Stack

Python

from apify_client import ApifyClient

# Authenticate with your Apify API token
client = ApifyClient("YOUR_API_TOKEN")

# Start the Actor and wait for the run to finish
run = client.actor("betterdevsscrape/reddit-scraper").call(run_input={
    "startUrls": [{"url": "https://www.reddit.com/r/technology/"}],
    "maxItems": 100,
    "maxPostCount": 25,
    "maxComments": 20,
    "proxy": {"useApifyProxy": True, "apifyProxyGroups": ["RESIDENTIAL"]},
})

# Print a one-line summary per item; body may be null, so guard before slicing
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    summary = item.get("title") or (item.get("body") or "")[:80]
    print(f"[{item['dataType']}] {summary}")

JavaScript / Node.js

import { ApifyClient } from 'apify-client';

// Authenticate with your Apify API token
const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Start the Actor and wait for the run to finish
const run = await client.actor('betterdevsscrape/reddit-scraper').call({
  startUrls: [{ url: 'https://www.reddit.com/r/technology/' }],
  maxItems: 100,
  maxPostCount: 25,
  maxComments: 20,
  proxy: { useApifyProxy: true, apifyProxyGroups: ['RESIDENTIAL'] },
});

// Print a one-line summary per item; fall back to '' when there is no title or body
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach(item => {
  console.log(`[${item.dataType}] ${item.title || item.body?.slice(0, 80) || ''}`);
});

cURL

curl "https://api.apify.com/v2/acts/betterdevsscrape~reddit-scraper/runs" \
  -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -d '{
    "startUrls": [{"url": "https://www.reddit.com/r/technology/"}],
    "maxItems": 50,
    "proxy": {"useApifyProxy": true, "apifyProxyGroups": ["RESIDENTIAL"]}
  }'

📦 Export Options

  • JSON — Full structured data
  • CSV — Spreadsheet-ready format
  • Excel — Direct download
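Exports can also be fetched programmatically. The sketch below builds a download URL for the Apify dataset items endpoint, which accepts a format query parameter (for example json, csv, or xlsx). The dataset_export_url helper and the example dataset ID are illustrative assumptions; the real ID comes from the run's defaultDatasetId.

```python
from urllib.parse import urlencode


def dataset_export_url(dataset_id, fmt="csv", token=None):
    """Build a download URL for a dataset in the requested format."""
    params = {"format": fmt}
    if token:
        # Only needed for datasets that are not publicly readable.
        params["token"] = token
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{urlencode(params)}"


# "abc123" is a placeholder dataset ID, not a real one.
print(dataset_export_url("abc123", fmt="xlsx"))
```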

🔄 Integrations

Connect via Zapier, Make (Integromat), or any platform that supports REST APIs.


🎯 Use Cases

📈 Market Research & Competitor Analysis

Monitor product mentions, competitor discussions, and industry trends across subreddits. Scrape r/technology, r/startups, or niche communities to understand what customers want. Use minScore filters to focus on high-engagement discussions only.

💼 Sentiment Analysis & Opinion Mining

Collect thousands of posts and comments about brands, products, or topics and feed them into NLP pipelines. The structured JSON output (with scores, timestamps, and nested comment trees) is ready for pandas, Spark, or any data processing tool.

🔍 Content Discovery & Trend Detection

Find trending content, popular posts, and viral discussions in any niche. Use sort: "rising" to catch content before it goes viral, or sort: "top" with time: "week" to see the best of the week.

📊 Academic & Social Science Research

Gather large datasets of Reddit discussions for social media studies, discourse analysis, or behavioral research. Date windowing lets you build historical archives going back years.

🎯 Lead Generation & Sales Intelligence

Find users actively discussing problems your product solves. Filter by flair, domain, or author to pinpoint high-intent conversations. Scrape user profiles to understand potential leads.

📢 Brand Monitoring & Reputation Management

Track mentions of your brand, products, or competitors across Reddit in real-time. Set up scheduled runs via Apify to get daily or weekly reports delivered to your inbox, Slack, or CRM.

📝 SEO Research & Content Ideas

Discover what questions real people ask in your industry. Reddit threads are a goldmine for blog topics, FAQ pages, and long-tail keyword research.

🤖 AI Training Data & LLM Fine-Tuning

Collect diverse conversational data from Reddit's thousands of communities. Posts and comment trees provide natural dialogue structure ideal for training chatbots, classifiers, and language models.


💡 Tips for Effective Scraping

🧪 Start Small

Begin with maxItems: 10 to test your configuration before scaling up.

🏠 Always Use Residential Proxies

Reddit blocks datacenter IPs aggressively. The scraper defaults to Apify residential proxies.

🔍 Use Search for Discovery

searches finds content across all of Reddit. startUrls is for specific pages you already know.

⚡ Skip Comments for Speed

Each comment counts as a result. Set maxComments: 0 or skipComments: true if you only need post data.

📅 Date Windowing for Archives

Set enableDateWindowing to true and use a high maxPostCount to scrape older posts beyond Reddit's ~1000-post pagination cap.

🔀 Sort Matters

Use "hot" for current popular content, "top" + time: "week" for the best of the week, "new" for chronological.


🧪 Example Configurations

Basic: Scrape one subreddit

{
  "startUrls": [{ "url": "https://www.reddit.com/r/technology/" }],
  "maxItems": 50
}

Search across all of Reddit

{
  "searches": ["artificial intelligence"],
  "maxItems": 100,
  "maxPostCount": 50,
  "sort": "relevance"
}

Scrape a specific post with all comments

{
  "startUrls": [{ "url": "https://www.reddit.com/r/AskReddit/comments/abc123/some_post/" }],
  "maxComments": 5000,
  "maxItems": 10000
}

Scrape user profiles

{
  "startUrls": [{ "url": "https://www.reddit.com/user/spez/" }],
  "maxItems": 50
}

Filter by score and flair

{
  "subreddits": ["programming", "javascript"],
  "minScore": 100,
  "flairFilter": "Discussion",
  "maxItems": 50,
  "skipComments": true
}

Filter by domain

{
  "subreddits": ["technology"],
  "domainFilter": "youtube.com",
  "minScore": 50,
  "maxItems": 100
}

Track a specific author

{
  "startUrls": [{ "url": "https://www.reddit.com/r/announcements/" }],
  "authorFilter": "spez",
  "maxItems": 50
}

Large-scale subreddit archive with date windowing

{
  "startUrls": [{ "url": "https://www.reddit.com/r/technology/" }],
  "maxPostCount": 5000,
  "maxItems": 50000,
  "enableDateWindowing": true,
  "skipComments": true
}

❓ Frequently Asked Questions

Does this require a Reddit account or API key?

No. This scraper works without any Reddit account, OAuth token, or API key. It accesses only publicly visible content — no authentication needed.

Is this a Reddit API alternative?

Yes. Unlike the official Reddit API (which requires developer applications, OAuth, and has strict rate limits), this scraper extracts the same data with zero setup. Just provide URLs or subreddit names and run.

Why are residential proxies required?

Reddit actively blocks datacenter IP ranges. Residential proxies route requests through consumer IP addresses, which Reddit is far less likely to block. The scraper defaults to Apify's residential proxy pool for maximum reliability.

Can I scrape private or quarantined subreddits?

No. Only public subreddits and posts are accessible. Quarantined subreddits require a logged-in session, which this scraper does not use.

What happens when Reddit rate-limits or blocks requests?

The scraper automatically retries failed requests with exponential backoff and falls back to mirror sites when Reddit is temporarily unavailable. Sessions are rotated to avoid IP bans.
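As a rough illustration of retrying with exponential backoff, the sketch below doubles the wait after each failed attempt. It is a generic pattern under assumed parameters (base delay, factor, cap, retry count), not the Actor's actual implementation; backoff_delays and call_with_retries are hypothetical names.

```python
import random
import time


def backoff_delays(retries, base=1.0, factor=2.0, cap=60.0):
    """Delays in seconds before each retry: base, base*factor, ... up to cap."""
    return [min(cap, base * factor**attempt) for attempt in range(retries)]


def call_with_retries(fetch, retries=4, sleep=time.sleep, jitter=0.0):
    """Call fetch() until it succeeds or the retries are used up."""
    for delay in backoff_delays(retries):
        try:
            return fetch()
        except Exception:
            # Wait longer after each failure; jitter spreads out retries.
            sleep(delay + random.uniform(0, jitter))
    return fetch()  # final attempt; any exception propagates to the caller
```

Passing sleep as a parameter keeps the helper easy to test, and a small jitter value helps avoid many clients retrying at the same instant.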

How do I scrape more than 1000 posts from a subreddit?

Enable enableDateWindowing in the input. This splits the request into weekly date-range search queries that bypass Reddit's ~1000 post pagination limit. You can scrape entire subreddit archives this way.
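To make the windowing idea concrete, the sketch below splits a date range into consecutive weekly windows, each of which could back one date-restricted query. This only illustrates the splitting step under the assumption of half-open windows; how the Actor builds and issues its queries is not documented here, and weekly_windows is a hypothetical helper.

```python
from datetime import date, timedelta


def weekly_windows(start, end):
    """Yield (window_start, window_end) pairs covering [start, end)."""
    step = timedelta(days=7)
    cursor = start
    while cursor < end:
        # The last window is clipped so it never runs past `end`.
        window_end = min(cursor + step, end)
        yield cursor, window_end
        cursor = window_end


for window_start, window_end in weekly_windows(date(2026, 1, 1), date(2026, 1, 20)):
    print(window_start.isoformat(), "to", window_end.isoformat())
```

Because each window stays well under the ~1000-post pagination cap for most subreddits, the union of all windows can cover far more posts than a single listing.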

How do I get all comments on a post, including hidden ones?

Just set maxComments to a high number. The scraper automatically detects collapsed/hidden comment threads (Reddit's "load more comments") and fetches them via the morechildren API. No extra configuration needed.

Can I export Reddit data as CSV or Excel?

Yes. Apify datasets support JSON, CSV, Excel, XML, and RSS exports. You can also connect via API, Zapier, Make (Integromat), or Google Sheets.

Can I schedule Reddit scraping on a recurring basis?

Yes. Use Apify's built-in scheduler to run the scraper hourly, daily, or weekly. Combine with webhooks or integrations to get results delivered to Slack, email, Google Sheets, or your own API.

How fast is it?

50 items in under 60 seconds. Pure HTTP scraping with no browser means low memory usage (~256 MB) and fast execution. Most runs complete in under 2 minutes.

What output format does the data come in?

Each item has a dataType field ("post", "comment", "community", or "user") so you can easily filter. All items include timestamps, IDs, and URLs for traceability.
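Because every item carries dataType, a mixed dataset can be split into per-type lists in a few lines. The sketch below is a minimal example; group_by_type is an illustrative helper, and the "unknown" bucket is an assumption for items that lack the field.

```python
from collections import defaultdict


def group_by_type(items):
    """Split mixed dataset items into lists keyed by their dataType."""
    groups = defaultdict(list)
    for item in items:
        # Items without a dataType land in an "unknown" bucket.
        groups[item.get("dataType", "unknown")].append(item)
    return dict(groups)


sample = [
    {"dataType": "post", "title": "Hello"},
    {"dataType": "comment", "body": "Hi"},
    {"dataType": "post", "title": "Again"},
]
grouped = group_by_type(sample)
print({kind: len(rows) for kind, rows in grouped.items()})
```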


Web scraping publicly available data is generally legal. This Actor extracts only publicly visible content from Reddit.

Users are responsible for:

  • ✅ Complying with applicable data protection laws (GDPR, CCPA, etc.)
  • ✅ Respecting Reddit's terms of service
  • ✅ Using extracted data ethically and legally

If you're unsure whether your use case is legitimate, consult your lawyers. You can also read Apify's blog post on the legality of web scraping.


💬 Support & Feedback

  • 💡 Feature requests? We'd love to hear what you need
  • Questions? Check the FAQ above or reach out
  • Happy with the results? Leave us a review on the Apify Store!

🔗 More Scrapers by BetterDevsScrape

| Scraper | Description |
| --- | --- |
| 📍 Google Maps Scraper | Extract businesses, reviews, images, contacts & emails from Google Maps |
| 📇 Contact Details Extractor | Extract emails, phone numbers & 25+ social media profiles from any website |

Built with ❤️ by BetterDevsScrape | View on Apify Store