Pricing

from $1.20 / 1,000 results

Reddit Scraper

Scrape Reddit posts, comments, search results, and user profiles. No API keys or browser needed. Supports 4 modes: subreddit posts (hot/new/top/rising), Reddit search, user profiles, and full comment trees. Fast, lightweight HTTP-based scraping with built-in rate limiting and retry logic.

Rating

0.0

(0)

Developer

mick_

What does it do?

Reddit Scraper pulls structured data from old.reddit.com — no OAuth, no Reddit API credentials. You get clean, consistent JSON output ready for analysis, NLP pipelines, or downstream AI tools.

v1.2.0: Reddit shut down its public .json API (it has returned 403 since May 2026). This actor now parses Reddit's server-rendered HTML instead, so it keeps working where .json-based scrapers broke. The output format is unchanged. This release also adds a fail-fast health check and faster request pacing.

v1.1.0: Added batch search (searchQueriesList) — run multiple queries in a single job with automatic deduplication by post ID.

👥 Who Uses This

🏢 Brand and Market Researchers

You need to know what real people say about your product, competitors, or industry — not curated press releases, but unfiltered community discussion. Reddit is where honest opinions live. This actor lets you monitor multiple brand terms or competitor names in one run, deduplicated and ready for sentiment analysis.

Typical input:

{
    "mode": "search",
    "searchQueriesList": ["YourBrand review", "CompetitorA vs CompetitorB", "best CRM 2025"],
    "searchSort": "top",
    "timeFilter": "year",
    "maxResults": 500,
    "includeComments": true
}

Run this on a schedule (daily or weekly via Apify schedules) to track brand sentiment shifts over time without touching the Reddit website.

💻 NLP and ML Engineers

You need topic-specific text at scale — Reddit comments and posts for training classifiers, fine-tuning embeddings, building sentiment models, or labeling datasets. The structured output (author, score, depth, timestamp) gives you signal for quality filtering without post-processing.

Collect training data from multiple subreddits:

{
    "mode": "subreddit_posts",
    "subreddits": ["MachineLearning", "LocalLLaMA", "datascience", "learnmachinelearning"],
    "sort": "top",
    "timeFilter": "year",
    "maxResults": 2000,
    "includeComments": true
}

Filter by score (high-upvote posts = community-validated content) and depth (top-level comments = more coherent standalone text). In user_profile mode, set userContentType to comments to pull comment-only output for dialogue dataset construction.
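The score and depth filtering described above can be sketched in a few lines of Python. This is a minimal illustration run on already-downloaded dataset items, not part of the actor itself; the function name and the thresholds (min_score=10, max_depth=0) are assumptions chosen for the example.

```python
def filter_items(items, min_score=10, max_depth=0):
    """Keep items with score >= min_score; for comments, also require depth <= max_depth.

    Field names (type, score, depth) follow the output tables in this listing.
    The thresholds are illustrative defaults, not values used by the actor.
    """
    kept = []
    for item in items:
        if item.get("score", 0) < min_score:
            continue  # low-score content is dropped
        if item.get("type") == "comment" and item.get("depth", 0) > max_depth:
            continue  # nested replies are dropped when only top-level text is wanted
        kept.append(item)
    return kept


sample = [
    {"type": "post", "id": "a1", "score": 1842},
    {"type": "post", "id": "a2", "score": 3},
    {"type": "comment", "id": "c1", "score": 55, "depth": 0},
    {"type": "comment", "id": "c2", "score": 80, "depth": 2},
]
kept_ids = [item["id"] for item in filter_items(sample)]
print(kept_ids)
```

Raising max_depth keeps more of each thread at the cost of shorter, more context-dependent replies.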

🛠️ Product Teams and Startups

You want to understand what problems your target market is describing in their own words — not survey responses, but organic complaints, feature requests, and workaround threads. Reddit search across the right subreddits is a fast way to do Jobs-to-Be-Done research before writing a single line of code.

Discovery research across communities:

{
    "mode": "search",
    "searchQueriesList": ["wish there was a tool for", "looking for software that", "does anyone know how to automate"],
    "searchSubreddit": "entrepreneur",
    "searchSort": "relevance",
    "maxResults": 200
}

Use batch search to sweep multiple pain-point queries across a single subreddit or across all of Reddit. Export to CSV for tagging and clustering in a spreadsheet.

🔍 Researchers and Journalists

You're tracking narratives, investigating communities, or mapping how opinions shift around a topic over time. Reddit's threaded comment structure and upvote system give you signal on consensus and dissent that flat social feeds don't provide.

Pull full comment trees from key posts:

{
    "mode": "post_comments",
    "postUrls": [
        "https://www.reddit.com/r/politics/comments/abc123/some_breaking_story/",
        "https://www.reddit.com/r/technology/comments/def456/another_post/"
    ],
    "maxCommentsPerPost": 1000
}

Use user_profile mode to audit a specific account's post and comment history across subreddits — useful for investigating astroturfing, coordinated behavior, or tracking how a public figure's community engagement evolves.

🤖 AI/LLM Engineers and Agent Builders

You're building AI pipelines that need real-time access to community knowledge — RAG systems grounded in current Reddit discussions, agents that can search subreddits on demand, or workflows that pull fresh posts into an LLM context window.

MCP tool config for Claude Desktop / Cursor:

{
    "mcpServers": {
        "reddit-scraper": {
            "url": "https://mcp.apify.com?tools=labrat011/reddit-scraper",
            "headers": {
                "Authorization": "Bearer <APIFY_TOKEN>"
            }
        }
    }
}

Once configured, your AI agent can call reddit-scraper as a tool to search any subreddit, pull comment threads, or monitor user activity — no infrastructure to manage. Combine with other actors in the healthcare or finance cluster for multi-source research pipelines.

Features

4 scraping modes: subreddit posts, Reddit search, user profiles, post comments
Batch search: run multiple search queries in a single job — results merged and deduplicated by post ID
Multi-target: subreddits, usernames, and post URLs all accept lists — scrape many at once
Sort and filter: hot, new, top (with configurable time range), rising
Full comment trees: recursive extraction with depth tracking
Search scope: across all of Reddit or restricted to a single subreddit
User profiles: posts only, comments only, or both
Pagination: automatic page-following up to Reddit's ~1,000-item limit
Browser-grade requests: Chrome TLS impersonation (via curl_cffi) + rotating residential IPs to avoid blocks
28 output fields per post — including upvote ratio, author flair, content type hints, edit timestamps, and crosspost detection
Retry logic: exponential backoff on 429, IP rotation on 403
Fail-fast health check: a run that scrapes 0 results fails loudly instead of silently billing compute
State persistence: survives Apify actor migrations mid-run

Scraping modes

Mode 1: Subreddit Posts

Scrape posts from one or more subreddits.

{
    "mode": "subreddit_posts",
    "subreddits": ["python", "machinelearning", "webdev"],
    "sort": "top",
    "timeFilter": "month",
    "maxResults": 200
}

Sort options: hot, new, top, rising. timeFilter applies only when sort is top: hour, day, week, month, year, all.

Mode 2: Search Reddit

Search across all of Reddit or within a specific subreddit. Use searchQueriesList to run multiple queries in one job.

Single query:

{
    "mode": "search",
    "searchQuery": "best python web framework 2025",
    "searchSort": "relevance",
    "maxResults": 100
}

Batch search (v1.1.0):

{
    "mode": "search",
    "searchQueriesList": ["ChatGPT vs Claude", "best LLM 2025", "AI coding assistant"],
    "searchSort": "top",
    "timeFilter": "year",
    "maxResults": 300
}

Results across all queries are merged and deduplicated by post ID. searchQueriesList overrides searchQuery when provided.

Restricted to a subreddit:

{
    "mode": "search",
    "searchQuery": "fastapi vs django",
    "searchSubreddit": "python",
    "searchSort": "top",
    "maxResults": 50
}

Search sort options: relevance, hot, top, new, comments.

Mode 3: User Profile

Scrape posts and/or comments from Reddit user profiles.

{
    "mode": "user_profile",
    "usernames": ["user1", "user2"],
    "userContentType": "overview",
    "maxResults": 200
}

Content type options: overview (posts + comments), submitted (posts only), comments (comments only).

Mode 4: Post Comments

Extract the full comment tree from specific Reddit posts.

{
    "mode": "post_comments",
    "postUrls": [
        "https://www.reddit.com/r/Python/comments/1r19hu1/after_25_years_using_orms_i_switched_to_raw/",
        "https://www.reddit.com/r/machinelearning/comments/abc123/some_post/"
    ],
    "maxCommentsPerPost": 500
}
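Before starting a post_comments run, it can help to check that each entry in postUrls really looks like a post permalink. The sketch below pulls the subreddit and post ID out of a URL with a regular expression. How the actor itself parses URLs is not documented here, so the pattern and the function name are assumptions for client-side checking only.

```python
import re

# Assumed permalink shape: .../r/<subreddit>/comments/<post_id>/...
POST_URL_RE = re.compile(
    r"reddit\.com/r/(?P<subreddit>[^/]+)/comments/(?P<post_id>[a-z0-9]+)",
    re.IGNORECASE,
)


def parse_post_url(url):
    """Return (subreddit, post_id) for a Reddit post URL, or raise ValueError."""
    match = POST_URL_RE.search(url)
    if match is None:
        raise ValueError(f"not a Reddit post URL: {url}")
    return match.group("subreddit"), match.group("post_id")


parsed = parse_post_url(
    "https://www.reddit.com/r/Python/comments/1r19hu1/after_25_years_using_orms_i_switched_to_raw/"
)
print(parsed)
```

The extracted post ID is the same value that appears in the id field of the example output.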

Input parameters

Parameter	Type	Default	Description
`mode`	string	`subreddit_posts`	Scraping mode: `subreddit_posts`, `search`, `user_profile`, `post_comments`
`subreddits`	string[]	—	Subreddit names (without r/ prefix). Mode: subreddit_posts
`sort`	string	`hot`	Sort order: `hot`, `new`, `top`, `rising`
`timeFilter`	string	`week`	Time range for Top sort: `hour`, `day`, `week`, `month`, `year`, `all`
`searchQuery`	string	—	Single search term. Mode: search
`searchQueriesList`	string[]	`[]`	Multiple search queries — merged and deduplicated. Overrides `searchQuery`. Mode: search
`searchSubreddit`	string	—	Restrict search to one subreddit. Leave empty for all of Reddit
`searchSort`	string	`relevance`	Search sort: `relevance`, `hot`, `top`, `new`, `comments`
`usernames`	string[]	—	Reddit usernames (without u/ prefix). Mode: user_profile
`userContentType`	string	`overview`	`overview` (posts+comments), `submitted`, `comments`
`postUrls`	string[]	—	Full Reddit post URLs. Mode: post_comments
`maxCommentsPerPost`	integer	`100`	Max comments per post. `0` = no limit
`maxResults`	integer	`100`	Max total results (1–10,000). Free tier: 25 per run
`includeComments`	boolean	`false`	Also fetch comments for each post in subreddit/search mode. Slower, higher proxy cost
`proxyConfiguration`	object	Residential	Proxy settings. Residential proxies required
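The parameter table implies a few consistency rules: each mode needs its own target list, search needs at least one query, and maxResults must stay within 1 to 10,000. A client-side pre-check along those lines might look like the sketch below. Whether the actor rejects such inputs on its own is not stated in this listing, so treat validate_input and its messages as hypothetical helpers.

```python
VALID_MODES = {"subreddit_posts", "search", "user_profile", "post_comments"}

# Target list each non-search mode reads, per the parameter table.
REQUIRED_LIST_BY_MODE = {
    "subreddit_posts": "subreddits",
    "user_profile": "usernames",
    "post_comments": "postUrls",
}


def validate_input(cfg):
    """Return a list of problems found in an input dict; empty means it looks fine."""
    problems = []
    mode = cfg.get("mode", "subreddit_posts")  # documented default
    if mode not in VALID_MODES:
        return [f"unknown mode: {mode!r}"]
    if mode == "search":
        if not cfg.get("searchQuery") and not cfg.get("searchQueriesList"):
            problems.append("search mode needs searchQuery or searchQueriesList")
    else:
        key = REQUIRED_LIST_BY_MODE[mode]
        if not cfg.get(key):
            problems.append(f"{mode} mode needs a non-empty {key} list")
    max_results = cfg.get("maxResults", 100)  # documented default
    if not 1 <= max_results <= 10_000:
        problems.append("maxResults must be between 1 and 10,000")
    return problems


print(validate_input({"mode": "search"}))
```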

Output

Results are saved to the default dataset. Download as JSON, CSV, Excel, or XML from the Output tab.

Post fields

Field	Type	Description
`type`	string	Always `"post"`
`id`	string	Reddit post ID
`subreddit`	string	Subreddit name
`title`	string	Post title
`author`	string	Author username
`selftext`	string	Post body text (empty for link posts)
`url`	string	Reddit permalink
`externalUrl`	string	Linked URL (for link posts)
`score`	integer	Net upvotes
`numComments`	integer	Total comment count
`created`	string	ISO 8601 UTC timestamp
`isNSFW`	boolean	NSFW flag
`isSpoiler`	boolean	Spoiler flag
`isPinned`	boolean	Stickied/pinned flag
`flair`	string	Post flair text
`awards`	integer	Award (gilding) count
`domain`	string	Link domain (e.g. `self.python`)
`isVideo`	boolean	Video post flag
`thumbnail`	string	Thumbnail URL (empty for self/text posts)
`isPromoted`	boolean	Whether the post is a promoted ad
`upvoteRatio`	number	Upvote ratio (0–1), community consensus signal
`edited`	timestamp/false	Unix timestamp of last edit, or `false` if never edited
`postHint`	string	Post type hint: `link`, `self`, `image`, `video`, `rich:video`
`isOriginalContent`	boolean	Original content (OC) flag
`authorFlair`	string	Author's subreddit flair text
`crosspostParent`	string	Parent post ID if crosspost (`t3_xxx`)
`mediaOnly`	boolean	Media-only post with no text body
`isGallery`	boolean	Reddit gallery post

Comment fields

Field	Type	Description
`type`	string	Always `"comment"`
`id`	string	Comment ID
`postId`	string	Parent post ID
`subreddit`	string	Subreddit name
`author`	string	Author username
`body`	string	Comment text
`score`	integer	Net upvotes
`created`	string	ISO 8601 UTC timestamp
`depth`	integer	Nesting depth (0 = top-level)
`isSubmitter`	boolean	Whether author is the post's OP
`awards`	integer	Award (gilding) count
`url`	string	Reddit permalink
`edited`	timestamp/false	Unix timestamp of last edit, or `false` if never edited

Example output

{
    "type": "post",
    "id": "1r19hu1",
    "subreddit": "Python",
    "title": "After 25 years using ORMs, I switched to raw SQL",
    "author": "example_user",
    "selftext": "Here's what I learned after making the switch...",
    "url": "https://www.reddit.com/r/Python/comments/1r19hu1/...",
    "externalUrl": "",
    "score": 1842,
    "numComments": 312,
    "created": "2025-03-01T09:14:22+00:00",
    "isNSFW": false,
    "isSpoiler": false,
    "isPinned": false,
    "flair": "Discussion",
    "awards": 5,
    "domain": "self.Python",
    "isVideo": false,
    "thumbnail": "",
    "isPromoted": false,
    "upvoteRatio": 0.89,
    "postHint": "self",
    "isOriginalContent": true,
    "authorFlair": "Expert",
    "mediaOnly": false,
    "isGallery": false
}
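The two time fields use different encodings: created is an ISO 8601 string, while edited is either a Unix timestamp or false. A small normalization step, sketched below with only the Python standard library, turns both into timezone-aware datetime values. The helper names are illustrative, and the edited value in the sample is invented for the example.

```python
from datetime import datetime, timezone


def parse_created(item):
    """Parse the ISO 8601 `created` string into an aware datetime."""
    return datetime.fromisoformat(item["created"])


def parse_edited(item):
    """Return the edit time as an aware UTC datetime, or None if never edited."""
    edited = item.get("edited", False)
    if edited is False or edited is None:
        return None
    return datetime.fromtimestamp(edited, tz=timezone.utc)


post = {"created": "2025-03-01T09:14:22+00:00", "edited": 1740820462}
created_at = parse_created(post)
edited_at = parse_edited(post)
print(created_at.isoformat(), edited_at.isoformat())
```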

Cost

This actor uses pay-per-event (PPE) pricing — you pay only for results you get.

Charged per dataset item pushed (default Apify PPE event)
Proxy traffic is billed separately (residential proxies run ~$12.50/GB on Apify)
Typical cost: $0.50–$1.00 per 1,000 results depending on proxy usage and whether comments are included
Free tier: 25 results per run — no subscription required
Paid tier: up to 10,000 results per run

Worked pricing example: Searching 3 subreddits for "python framework", sorting by top of the month, returning 100 results:

~4-8 requests × 25 items each
~$0.15–0.20 in event charges (100 items × $1.50/1k + $0.01 run start)
~$0.01–0.03 in residential proxy traffic
Total: ~$0.16–0.23 per run
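The arithmetic in the worked example can be reproduced with a short calculation. The sketch below takes the per-1,000 rate and run-start fee as parameters, since this listing quotes both "from $1.20" and $1.50/1k; the defaults mirror the worked example, and the function names are illustrative.

```python
import math


def pages_needed(results, items_per_page=25):
    """Number of listing requests needed at ~25 posts per page."""
    return math.ceil(results / items_per_page)


def estimate_run_cost(results, price_per_1k=1.50, run_start_fee=0.01, proxy_cost=0.0):
    """Event charges plus run-start fee plus an externally estimated proxy cost (USD)."""
    return results / 1000 * price_per_1k + run_start_fee + proxy_cost


requests_min = pages_needed(100)
low = estimate_run_cost(100, proxy_cost=0.01)
high = estimate_run_cost(100, proxy_cost=0.03)
print(f"{requests_min} requests minimum, roughly ${low:.2f} to ${high:.2f} per run")
```

Passing price_per_1k=1.20 gives the estimate for the lower advertised rate.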

Each listing page returns ~25 posts, and requests are paced at roughly 1 per second over rotating residential IPs. A 100-post subreddit run takes well under a minute. Enabling includeComments adds one request per post.

MCP Integration

This actor works as an MCP tool via Apify's hosted MCP server. No custom server needed — AI agents can call it directly.

Endpoint: https://mcp.apify.com?tools=labrat011/reddit-scraper
Auth: Authorization: Bearer <APIFY_TOKEN>
Transport: Streamable HTTP
Works with: Claude Desktop, Cursor, VS Code, Windsurf, Warp, Gemini CLI

Claude Desktop / Cursor config:

{
    "mcpServers": {
        "reddit-scraper": {
            "url": "https://mcp.apify.com?tools=labrat011/reddit-scraper",
            "headers": {
                "Authorization": "Bearer <APIFY_TOKEN>"
            }
        }
    }
}

AI agents can search Reddit for discussions, scrape subreddit posts, pull comment threads, and monitor user activity — all as a callable tool without managing any infrastructure.

Technical details

Parses old.reddit.com server-rendered HTML — no API credentials, no OAuth. (Reddit's .json API now returns 403; this actor does not depend on it.)
Requests use Chrome TLS impersonation via curl_cffi to pass Reddit's bot fingerprinting
Paced at ~1 request/second with jitter over rotating residential IPs
Exponential backoff on 429 (5s base, doubles per retry); IP rotation on 403
Pagination by following the listing's next-page link (up to ~1,000 items per listing)
Results pushed in batches of 25 for memory efficiency
Fail-fast health check: a run that yields 0 results fails with a clear message
Actor state persisted across Apify platform migrations
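To make the retry and pacing numbers above concrete, the following sketch computes the backoff schedule (5s base, doubling per retry) and a jittered delay around 1 second. It illustrates the stated policy and is not the actor's source code; the jitter width of 0.25s and the function names are assumptions.

```python
import random


def backoff_delays(retries, base=5.0):
    """Delays in seconds before each retry after a 429: base, 2*base, 4*base, ..."""
    return [base * 2 ** attempt for attempt in range(retries)]


def paced_delay(target=1.0, jitter=0.25, rng=random):
    """Delay between normal requests: about `target` seconds, +/- `jitter` (assumed width)."""
    return target + rng.uniform(-jitter, jitter)


schedule = backoff_delays(4)
print(schedule)
print(round(paced_delay(), 2))
```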

Limitations

Reddit caps listing pagination at roughly 1,000 items per subreddit/user endpoint
"Load more comments" nodes in deep comment trees are not expanded. Only the initially loaded tree (up to 500 comments/post) is extracted, so setting maxCommentsPerPost above that, or to 0 for no limit, will not return more

FAQ

Is it legal to scrape Reddit?

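Scraping publicly available data is generally treated as lawful in the United States: in hiQ Labs v. LinkedIn, the Ninth Circuit held that accessing public web pages likely does not violate the Computer Fraud and Abuse Act. That ruling does not settle terms-of-service, copyright, or privacy-law questions, which vary by jurisdiction. This actor only accesses public Reddit content visible to any anonymous browser visitor. It does not bypass login walls or CAPTCHAs, and it does not access private content.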

Why are residential proxies required?

Reddit blocks nearly all datacenter IP ranges. Residential proxies route requests through real ISP connections that Reddit does not filter. Without them, most requests will return 403s.

How does batch search work?

Set searchQueriesList to an array of query strings. The actor runs each query sequentially and merges results into a single dataset, removing duplicate posts (matched by Reddit post ID). This is useful for brand monitoring (track multiple product names in one run), competitive research, or collecting data across related topics.
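The merge step can be pictured as a single pass that remembers which post IDs it has already seen. The sketch below is a minimal illustration of deduplication by post ID, assuming the first occurrence of a post is the one kept; the actor's actual tie-breaking is not documented, and merge_deduplicated is a hypothetical name.

```python
def merge_deduplicated(result_sets):
    """Merge per-query result lists in order, keeping the first post seen for each ID."""
    seen_ids = set()
    merged = []
    for results in result_sets:  # one list per query, processed sequentially
        for post in results:
            if post["id"] in seen_ids:
                continue  # already returned by an earlier query
            seen_ids.add(post["id"])
            merged.append(post)
    return merged


query_a = [{"id": "p1", "title": "ChatGPT vs Claude"}, {"id": "p2", "title": "Best LLM"}]
query_b = [{"id": "p2", "title": "Best LLM"}, {"id": "p3", "title": "AI coding assistant"}]
merged_ids = [post["id"] for post in merge_deduplicated([query_a, query_b])]
print(merged_ids)
```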

Can I use this with the Apify API?

Yes. Call the actor via the Apify REST API and poll for results, or use the Apify Python or JavaScript client libraries. Results are available in JSON, CSV, Excel, and XML formats.
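As a rough sketch of the Python client route: the snippet below builds a batch-search input and runs the actor with the apify-client package (pip install apify-client). The actor ID is taken from the MCP endpoint shown in this listing. build_search_input and run_actor are illustrative helper names, and the call pattern follows the client's documented ActorClient.call and DatasetClient.iterate_items methods as I understand them.

```python
def build_search_input(queries, subreddit=None, sort="relevance", max_results=100):
    """Assemble a search-mode input dict using the parameter names from this listing."""
    run_input = {
        "mode": "search",
        "searchQueriesList": list(queries),
        "searchSort": sort,
        "maxResults": max_results,
    }
    if subreddit:
        run_input["searchSubreddit"] = subreddit
    return run_input


def run_actor(token, run_input, actor_id="labrat011/reddit-scraper"):
    """Start the actor, wait for it to finish, and return the dataset items as a list."""
    from apify_client import ApifyClient  # imported here so the builder works without it

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("actor run did not finish")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


search_input = build_search_input(["fastapi vs django"], subreddit="python", sort="top", max_results=50)
print(search_input)
# items = run_actor("<APIFY_TOKEN>", search_input)  # requires a real token and network access
```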

What happens if a subreddit, user, or post URL doesn't exist?

The scraper logs a warning and skips the invalid target. All remaining valid targets in the same run continue as normal.

Why This Scraper vs Alternatives

Feature	Reddit Scraper (labrat011)	trudax/reddit-scraper	trudax/reddit-scraper-lite	harshmaur/reddit-scraper
Price per 1k	$1.20	~$4 + $45/mo sub	$3.40	$1.50–2.00
Rating	4.7 ⭐	2.4 ⚠️	4.6	5.0
Runs	1.9K	14K	30K	5.5K
Works after May '26 .json die-off	✅ HTML parsing	❌ Not updated	❌ Not updated	❌ Not updated
Need Reddit API key / OAuth	No	No	No	No
Residential proxies	✅ Required	Required	Required	Required
Batch search (multi-query)	✅ Yes	—	—	—
Free tier	✅ 25 results/run	❌	❌	✅ Limited

Key advantages: After Reddit shut down its public .json API in May 2026, this actor was updated to parse old.reddit.com's server-rendered HTML, while scrapers that still depend on the old .json endpoints now get 403s. At $1.20/1k, it is also the lowest-priced actor in the comparison above.

Actor	What it does	Pairs well with Reddit Scraper when...
Academic Paper Scraper	Google Scholar, Semantic Scholar, arXiv	You find a paper discussed on Reddit and want the full metadata and abstract
PubMed Scraper	35M+ biomedical abstracts from NCBI	r/science or health subreddit posts reference medical studies you want to retrieve
Clinical Trials Scraper	ClinicalTrials.gov study data	Reddit health communities discuss ongoing trials you want to track
LinkedIn Jobs Scraper	Job postings and company data	You monitor r/cscareerquestions or industry subreddits and want matching job listings
NPI Provider Contact Finder	Healthcare provider directory	Health subreddit discussions lead to provider lookup needs

Feedback

Found a bug or have a feature request? Open an issue on the Issues tab in Apify Console.
