Reddit Scraper — Posts, Comments & Subreddits [DEPRECATED]

Deprecated

Pricing

$3.00 / 1,000 results scraped

Scrape Reddit posts, comments, and subreddit data without API limits. Get titles, authors, scores, comments, timestamps, and metadata. Ideal for social listening and trend analysis. PPE pricing — pay only for results. [DEPRECATED — use reddit-scraper-fast]

Rating: 0.0 (0)

Developer: Web Data Labs (maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 6
  • Monthly active users: 0
  • Last modified: 14 days ago

Reddit Scraper — Posts, Comments, Subreddits & User Data Without API Keys

Scrape Reddit posts, comments, subreddit feeds, and user profiles at scale — no API key, no Reddit Developer Console application, no OAuth flow. Returns structured JSON with titles, authors, scores, full bodies, comment threads, and timestamps.

Pay-per-result pricing means you only pay for items you actually receive. Output ships as JSON, CSV, XML, or Excel.


Why Use This Instead of the Reddit API?

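In mid-2023, Reddit's API pricing changed dramatically. The free tier is now rate-limited and restricted to non-commercial use, and paid access is billed per request (Reddit announced $0.24 per 1,000 API calls), which adds up quickly at scale. Popular third-party apps, including Apollo and RIF (Reddit is Fun), shut down at the end of June 2023 rather than pay it. For independent developers, researchers, and growth teams, that pricing is often unworkable.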

This scraper takes a different approach. It works with Reddit's publicly visible web pages — the same pages anyone can view in a browser without logging in. No keys, no quotas, no application process.

| Capability | Official Reddit API | This Scraper |
| --- | --- | --- |
| Cost | Enterprise tier ($) | Pay-per-result from $0.005 |
| Setup time | OAuth app + approval | Start scraping in 30 seconds |
| API key required | Yes | No |
| Rate limits | Strict, can revoke access | Built-in retry handling |
| Subreddit feeds | Yes | Yes |
| Search | Yes | Yes |
| Comments | Yes | Yes |
| User profile history | Yes | Yes |

What Data You Get

Post records (subreddit, search, and user modes)

  • Post ID — Reddit's unique identifier
  • Title — full post title
  • Body / selftext — for text posts, the full body content
  • URL — for link posts, the linked URL; for self posts, the Reddit post URL
  • Permalink — direct Reddit URL to the post and its comment thread
  • Author — username of the poster
  • Subreddit — the community where the post lives
  • Score — net upvotes minus downvotes
  • Upvote ratio — share of upvotes vs total votes
  • Number of comments — comment count at scrape time
  • Created at — ISO timestamp of when the post was created
  • Post hint — image, video, link, self, gallery, etc.
  • Is video — boolean for video posts
  • NSFW flag — boolean
  • Stickied — boolean for pinned posts
  • Awards — list of awards on the post
  • Media URL — for image/video posts, the direct media URL
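
The Score and Upvote ratio fields together allow a rough estimate of raw vote counts. The sketch below is only an illustration derived from the two definitions above (score = upvotes minus downvotes, ratio = upvotes divided by total votes); `estimate_votes` is a hypothetical helper, not part of the actor. Because the ratio is rounded to two decimals, the result is an approximation.

```python
from typing import Optional, Tuple


def estimate_votes(score: int, upvote_ratio: float) -> Optional[Tuple[int, int]]:
    """Estimate (upvotes, downvotes) from a post's score and upvote ratio.

    Derivation: score = up - down and ratio = up / (up + down),
    so up + down = score / (2 * ratio - 1).
    """
    denominator = 2 * upvote_ratio - 1
    if abs(denominator) < 1e-9:
        return None  # a ratio of 0.5 gives no way to recover the totals
    total = score / denominator
    if total < 0:
        return None  # score and ratio disagree in sign, so the inputs are inconsistent
    upvotes = round(upvote_ratio * total)
    downvotes = round((1 - upvote_ratio) * total)
    return upvotes, downvotes


# Hypothetical inputs: a score of 50 with 75% upvotes.
print(estimate_votes(50, 0.75))
```

Treat the output as an order-of-magnitude figure, useful for ranking or filtering rather than exact reporting.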

Comment records (comments mode)

  • Comment ID — Reddit's unique identifier
  • Body — full comment text (markdown)
  • Author — username of the commenter
  • Score — comment net score
  • Created at — ISO timestamp
  • Parent ID — ID of the parent comment, or of the post for top-level comments (for thread reconstruction)
  • Depth — comment depth in the thread
  • Permalink — direct URL to the comment
  • Awards — list of awards on the comment

User profile records (user mode)

  • Username — the user's handle (without u/)
  • Karma — total post and comment karma
  • Account age — when the account was created
  • Is verified — Reddit's verified-account badge
  • Recent posts — list of the user's most recent posts
  • Recent comments — list of the user's most recent comments

Use Cases

1. Social listening and brand monitoring

Track mentions of your product, brand, or executives across all of Reddit. Schedule a daily search for your brand name and feed the results into your own alerting or sentiment pipeline to catch shifts early. Reddit users tend to be candid, and conversations there often surface product issues days before they appear on Twitter or in support tickets.

2. Market research and product validation

Pull posts from niche subreddits (r/buildapc, r/marketing, r/cscareerquestions) to understand what your target users complain about, what tools they recommend, and what problems they would pay to solve. Reddit threads are rich qualitative-research material.

3. Trend detection and viral monitoring

Scrape r/all, r/popular, or topic-specific subreddits sorted by rising or top/hour to spot emerging trends in real time. Marketers, journalists, and content creators use this to stay ahead of the news cycle.

4. Competitive intelligence

Monitor a competitor's name across Reddit. What do users actually say when nobody from the brand is in the room? Pull comments mentioning your competitor and run sentiment analysis. The pattern of complaints often points directly at the gaps you can win on.

5. Sentiment analysis and AI training data

Reddit provides one of the largest corpora of natural conversational text on the public web. Pull thousands of posts and comments on a topic, feed them into NLP pipelines, fine-tune sentiment models, or build domain-specific datasets for LLM training.

6. Lead generation

Find Reddit threads where users are explicitly asking for product recommendations in your category. "Can anyone recommend a CRM for a 5-person agency?" is a high-intent lead. Search for purchase-intent phrases across relevant subreddits and route the threads to your sales team.
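
As a rough illustration of the filtering step described above, the following sketch scans scraped post records for purchase-intent phrases. The phrase list and the `find_intent_posts` helper are hypothetical; it assumes records carry the `title`, `selftext`, and `permalink` fields shown in the post record example.

```python
# Hypothetical phrase list; tune it for your own product category.
INTENT_PHRASES = (
    "can anyone recommend",
    "looking for a",
    "what do you use for",
    "alternatives to",
)


def find_intent_posts(posts, phrases=INTENT_PHRASES):
    """Return a summary of each post whose title or body contains an intent phrase."""
    matches = []
    for post in posts:
        # Missing or null fields are treated as empty strings.
        text = f"{post.get('title') or ''} {post.get('selftext') or ''}".lower()
        hits = [phrase for phrase in phrases if phrase in text]
        if hits:
            matches.append({
                "permalink": post.get("permalink"),
                "title": post.get("title"),
                "matched": hits,
            })
    return matches
```

Plain substring matching is deliberately simple; it will miss paraphrases and can produce false positives, so a human review step before routing to sales is advisable.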

7. Academic and policy research

Researchers studying online discourse, misinformation, political polarization, mental health communities, or cultural trends need large structured Reddit datasets. This scraper provides exportable, citation-ready data without the cost barriers of the official API.

8. Content curation and idea mining

Scrape the top-scoring posts from your niche this week to fuel newsletter content, social posts, or video scripts. The most upvoted Reddit content has already been validated by an audience — repurposing it is faster than starting from a blank page.


Modes

The actor supports four modes — choose one with the mode parameter:

| Mode | What it does | Relevant inputs |
| --- | --- | --- |
| subreddit | Fetch posts from a specific subreddit | subreddit, sort, maxResults |
| search | Search across Reddit (or within a subreddit) | query, subreddit (optional), sort, maxResults |
| comments | Fetch comments from specific post URLs | postUrls, maxResults |
| user | Fetch a user's posts and comments | username, maxResults |

Input Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| mode | string | Yes | subreddit | One of search, subreddit, comments, user |
| subreddit | string | Conditional | technology | Subreddit name (without r/) for subreddit and search modes |
| query | string | Conditional | | Search query for search mode |
| username | string | Conditional | | Reddit username for user mode |
| postUrls | array | Conditional | [] | List of Reddit post URLs for comments mode |
| sort | string | No | hot | Sort order: hot, new, top, or rising |
| timeFilter | string | No | all | For top sort: hour, day, week, month, year, all |
| maxResults | integer | No | 5 | Maximum number of items to return (1–1000) |
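
The conditional requirements in the table can be checked locally before starting a run. The sketch below is a hypothetical pre-flight validator based only on the parameter table; `validate_input` is not part of the actor or the Apify client, and the actor's own validation may differ. It treats `subreddit` as required in subreddit mode so that a run does not silently fall back to the default.

```python
VALID_MODES = {"subreddit", "search", "comments", "user"}
VALID_SORTS = {"hot", "new", "top", "rising"}
VALID_TIME_FILTERS = {"hour", "day", "week", "month", "year", "all"}

# The field each mode depends on, per the tables above (assumption: one per mode).
REQUIRED_BY_MODE = {
    "subreddit": "subreddit",
    "search": "query",
    "comments": "postUrls",
    "user": "username",
}


def validate_input(run_input: dict) -> list:
    """Return a list of problems; an empty list means the input looks consistent."""
    mode = run_input.get("mode", "subreddit")  # documented default
    if mode not in VALID_MODES:
        return [f"unknown mode: {mode!r}"]

    problems = []
    required = REQUIRED_BY_MODE[mode]
    if not run_input.get(required):
        problems.append(f"mode {mode!r} needs a non-empty {required!r}")

    if run_input.get("sort", "hot") not in VALID_SORTS:
        problems.append("sort must be one of hot, new, top, rising")

    if run_input.get("timeFilter", "all") not in VALID_TIME_FILTERS:
        problems.append("timeFilter must be one of hour, day, week, month, year, all")

    max_results = run_input.get("maxResults", 5)
    if not isinstance(max_results, int) or not 1 <= max_results <= 1000:
        problems.append("maxResults must be an integer from 1 to 1000")

    return problems
```

Calling `validate_input(run_input)` before `client.actor(...).call(...)` catches typos in mode names or out-of-range limits without spending a run.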

Example — fetch hot posts from a subreddit

{
  "mode": "subreddit",
  "subreddit": "technology",
  "sort": "hot",
  "maxResults": 50
}

Example — search across Reddit

{
  "mode": "search",
  "query": "best ergonomic keyboard 2026",
  "sort": "top",
  "timeFilter": "month",
  "maxResults": 100
}

Example — search within a specific subreddit

{
  "mode": "search",
  "subreddit": "buildapc",
  "query": "rtx 5080 review",
  "sort": "top",
  "timeFilter": "week",
  "maxResults": 25
}

Example — get comments from specific posts

{
  "mode": "comments",
  "postUrls": [
    "https://www.reddit.com/r/technology/comments/abc123/example_post/",
    "https://www.reddit.com/r/programming/comments/def456/another_post/"
  ],
  "maxResults": 200
}

Example — pull a user's recent activity

{
  "mode": "user",
  "username": "spez",
  "maxResults": 50
}

Output Examples

Post record

{
  "id": "1abc23",
  "title": "Show HN: We built an open-source Reddit scraper that doesn't need an API key",
  "selftext": "Hey r/programming — we just shipped a scraper for Reddit's publicly visible pages...",
  "url": "https://www.reddit.com/r/programming/comments/1abc23/show_hn_we_built/",
  "permalink": "/r/programming/comments/1abc23/show_hn_we_built/",
  "author": "exampleuser",
  "subreddit": "programming",
  "score": 4127,
  "upvote_ratio": 0.94,
  "num_comments": 218,
  "created_at": "2026-05-04T19:32:11Z",
  "post_hint": "self",
  "is_video": false,
  "nsfw": false,
  "stickied": false,
  "awards": ["Helpful", "Wholesome"]
}

Comment record

{
  "id": "j2k3l4",
  "body": "We've been using something similar internally — really nice to see an open implementation.",
  "author": "anotheruser",
  "score": 142,
  "created_at": "2026-05-04T19:48:22Z",
  "parent_id": "1abc23",
  "depth": 0,
  "permalink": "/r/programming/comments/1abc23/show_hn_we_built/j2k3l4/",
  "awards": []
}
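
Comment records arrive as a flat list, so rebuilding the thread is left to the consumer. The sketch below shows one way to nest them using `parent_id`. It assumes, based on the example above, that a top-level comment's `parent_id` is the post ID and that a reply's `parent_id` equals its parent comment's `id`; verify this against real output before relying on it. `build_thread` is a hypothetical helper.

```python
from collections import defaultdict


def build_thread(comments: list, post_id: str) -> list:
    """Nest flat comment records under their parents, oldest first at each level."""
    children = defaultdict(list)
    for comment in comments:
        children[comment["parent_id"]].append(comment)

    def attach(parent_id: str) -> list:
        # ISO 8601 timestamps in the same format sort correctly as strings.
        nodes = sorted(children.get(parent_id, []), key=lambda c: c["created_at"])
        # Copy each record and add a "replies" list holding its nested children.
        return [{**node, "replies": attach(node["id"])} for node in nodes]

    return attach(post_id)
```

Comments whose parent is missing from the scraped set (for example, when `maxResults` cut the thread short) are simply left out of the tree by this approach.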

Datasets can be exported as JSON, CSV, XML, or Excel from the Apify console, or fetched programmatically via the dataset API.
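
For programmatic export, Apify's dataset items endpoint accepts a `format` query parameter. The sketch below only builds the download URL; `dataset_export_url` is a hypothetical helper and `YOUR_DATASET_ID` is a placeholder for the `defaultDatasetId` returned by a run. Authentication (for example an `Authorization: Bearer` header carrying your Apify token) is assumed to be handled by whatever HTTP client you use.

```python
from typing import Optional
from urllib.parse import urlencode

# Formats named in this README; "xlsx" is the API's name for Excel output.
EXPORT_FORMATS = {"json", "csv", "xml", "xlsx"}


def dataset_export_url(dataset_id: str, fmt: str = "json", limit: Optional[int] = None) -> str:
    """Build the URL for downloading a dataset's items in the given format."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    params = {"format": fmt}
    if limit is not None:
        params["limit"] = limit
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{urlencode(params)}"


# Placeholder ID; substitute run["defaultDatasetId"] from an actual run.
print(dataset_export_url("YOUR_DATASET_ID", "csv", limit=100))
```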


Calling the Actor Programmatically

Python — apify-client

from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")

# Subreddit feed: top posts of the week from r/technology
run = client.actor("cryptosignals/reddit-scraper").call(run_input={
    "mode": "subreddit",
    "subreddit": "technology",
    "sort": "top",
    "timeFilter": "week",
    "maxResults": 100,
})

# Iterate over the items stored in the run's default dataset
for post in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"[{post['score']}] {post['title']} — r/{post['subreddit']}")

Python — search and count mentions by subreddit

from apify_client import ApifyClient
import pandas as pd

client = ApifyClient("YOUR_APIFY_TOKEN")

# Search all of Reddit for the newest posts mentioning the brand
run = client.actor("cryptosignals/reddit-scraper").call(run_input={
    "mode": "search",
    "query": "your-brand-name",
    "sort": "new",
    "maxResults": 200,
})

# Load the results into a DataFrame and count posts per subreddit
posts = list(client.dataset(run["defaultDatasetId"]).iterate_items())
df = pd.DataFrame(posts)
print(df.groupby("subreddit").size().sort_values(ascending=False))

Node.js — apify-client

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_APIFY_TOKEN' });

const run = await client.actor('cryptosignals/reddit-scraper').call({
    mode: 'subreddit',
    subreddit: 'cscareerquestions',
    sort: 'rising',
    maxResults: 50,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach(p => console.log(`${p.score} | ${p.title}`));

cURL — direct API call

curl -X POST \
  "https://api.apify.com/v2/acts/cryptosignals~reddit-scraper/run-sync-get-dataset-items?token=YOUR_APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "search",
    "query": "best home espresso machine",
    "sort": "top",
    "timeFilter": "year",
    "maxResults": 100
  }'

Pricing

This actor uses Pay-per-event pricing — you only pay for the items you actually receive. No charges for failed runs, empty searches, or compute time.

  • Cost: per-result pricing as listed on the actor's pricing page
  • Free tier: Apify's free plan includes $5/month of platform credits, enough for roughly 1,000 posts at $0.005 per result
  • Cost examples (at $0.005 per result):

| Use case | Items | Approximate cost |
| --- | --- | --- |
| Quick brand check | 25 posts | $0.13 |
| Daily brand monitoring | 100 posts/day | $0.50/day |
| Weekly competitor sweep | 500 posts/week | $2.50/week |
| Sentiment-analysis dataset | 5,000 posts | $25 |
| Research corpus | 50,000 posts | $250 |
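
The figures in the cost examples follow from simple multiplication, which the sketch below reproduces. The default rate of $0.005 per result is an assumption taken from the comparison table earlier in this README; the listing header quotes $3.00 per 1,000 results, so check the actor's pricing page for the current rate. `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_UP

# Assumed rate in USD per result; confirm it on the actor's pricing page.
PRICE_PER_RESULT = Decimal("0.005")


def estimate_cost(items: int, price_per_result: Decimal = PRICE_PER_RESULT) -> Decimal:
    """Return the approximate charge in USD, rounded to the nearest cent."""
    total = items * price_per_result
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


for label, items in [("Quick brand check", 25), ("Research corpus", 50_000)]:
    print(f"{label}: ${estimate_cost(items)}")
```

Passing `Decimal("0.003")` as the second argument gives the estimate at the $3.00 per 1,000 rate instead.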

See Apify pricing →


FAQ

Is it legal to scrape Reddit?

Scraping publicly visible data (pages anyone can view without logging in) is generally treated as lawful in many jurisdictions, though the rules vary by country and use case. In hiQ Labs v. LinkedIn (2022), the U.S. Ninth Circuit held that scraping publicly accessible data likely does not violate the Computer Fraud and Abuse Act. This actor only collects data from Reddit's publicly visible pages: no login, no private subreddits, no DMs, no quarantined content. Always consult your own legal counsel for your specific use case.

Do I need a Reddit account or API key?

No. The actor only accesses Reddit's publicly visible web pages — no login, no OAuth, no Reddit Developer Console application. That's why this actor exists.

What about Reddit's 2023 API changes?

Reddit's official API became prohibitively expensive in 2023. This scraper provides the same public data without the API access tier — a practical alternative for independent developers, researchers, and small teams.

How fast is it?

A run of 50 posts typically completes in 1–3 minutes. Comment-mode runs can take longer when threads are deep (5–10 minutes for 500+ comments). Multiple runs can execute in parallel using Apify's concurrent-run feature.

What if Reddit rate-limits me?

The actor has built-in retry logic with backoff. Apify residential proxies rotate requests across a large pool of IP addresses in many locations, which keeps per-IP request rates low. You generally don't need to think about rate limits.
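
The actor's own retry code is not shown in this README, so the following is only a generic sketch of retrying with exponential backoff and jitter, the technique named above. It may be useful if you wrap your own calls to the actor or to other endpoints. `retry_with_backoff` and all of its parameters are hypothetical.

```python
import random
import time


def retry_with_backoff(fetch, max_attempts=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call fetch() until it succeeds or attempts run out, doubling the wait each time."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter keeps parallel clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

The `sleep` parameter exists so that tests can substitute a recorder for `time.sleep`; in production code you would usually also narrow `except Exception` to the specific errors worth retrying.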

Can I get private or quarantined subreddit data?

No. The actor only accesses publicly visible Reddit pages — no login, no private content, no quarantined or banned subreddits. That's by design.

Can I schedule recurring scrapes?

Yes. Apify's built-in scheduler runs any cron expression — hourly, daily, weekly. Pipe results to a webhook, Google Sheets, Airtable, Slack, or your own database. Combine with Apify's Zapier and Make integrations for downstream automation.

What output formats are supported?

JSON, CSV, Excel, and XML are all supported natively in the Apify console. You can also fetch results programmatically via the dataset API.

Will this work if Reddit changes its layout?

We maintain the actor and update it as Reddit evolves. If you notice missing data or unexpected output, open an issue on the actor page and we'll investigate promptly.


Why This Actor vs Alternatives?

  • No API key, no application process. Some competing actors require a Reddit Developer Console app and OAuth credentials. This one doesn't.
  • Pay-per-result. You only pay for posts/comments you receive. Failed runs cost zero.
  • Four modes in one actor. Most Reddit scrapers do one thing — subreddit feeds, or search, or comments. This actor does all four. One product, one billing line, one input schema.
  • Residential proxies built in. Anti-bot handling is automatic — you don't configure proxies.
  • Apify-native integrations. Pipe results into Zapier, Make, Google Sheets, Airtable, Slack, or any of Apify's 50+ integrations.
  • Maintained. The actor is updated when Reddit changes its layout.


About Web Data Labs

This actor is built and maintained by Web Data Labs — a team focused on production-grade web data extraction across jobs, e-commerce, social media, software reviews, and company data. We publish 100+ public actors on the Apify platform, all pay-per-result.

Need a custom build, enterprise SLA, or private actor for your team? Reach out via web-data-labs.com or the Apify contact form.