Pricing

from $0.50 / 1,000 results

Try for free

Go to Apify Store

Reddit [Only $0.50💰] Posts | Users | Scraper

Try for free

[Only $0.50💰] Scrape Reddit posts (and optionally full comment threads with replies) from public JSON — site-wide & subreddit search, feed browse, post/comment date filters, NSFW toggle, strict phrase/token match, proxy rotation. Legacy: deduplicated authors from subreddit listing URLs

Pricing

from $0.50 / 1,000 results

Rating

5.0

(4)

Developer

Muhamed Didovic

Actor stats

Bookmarked

Total users

Monthly active users

a month ago

Last modified

Reddit Scraper

Scrape Reddit posts (and optionally full comment threads with replies) from public JSON — site-wide & subreddit search, feed browse, post/comment date filters, NSFW toggle, strict phrase/token match, proxy rotation. Legacy: deduplicated authors from subreddit listing URLs (key-value store).

How it works

How Reddit Scraper works

Why Use This Scraper?

Multiple starting points — site-wide keywords, one subreddit (search or feed browse), or legacy subreddit URLs for author collection.
Post-shaped rows — search modes write one dataset item per post, with Reddit’s own nested preview / media when present.
Optional comments — separate dataset rows per comment, with configurable caps and morechildren depth.
Practical filters — NSFW toggle, post/comment date windows, strict phrase and token filters, “maximize coverage” budgets, residential-friendly proxy support.
Export-friendly — default Apify Dataset; download as JSON, CSV, Excel (CSV flattens nested keys with dots).

Overview

This actor is for research, monitoring, content analysis pipelines, and internal tools that need structured Reddit data without maintaining your own scraping stack.

Search all of Reddit and Search one subreddit resolve into post rows in the dataset. Enable comments to add comment rows in the same dataset (dataType: "comment", ids like t1_…).
User scraper (legacy) targets deduplicated usernames stored under the key-value key API_DETAILS. The dataset is secondary in that mode.

The actor does not compute sentiment or topic categories; related input flags are ignored.

Supported Inputs

You provide	When	Notes
`mode` = `searchGlobal`	Site-wide search	Set `searchQueries` (string list). `searchSort`, `searchTimeframe`, `includeOver18` apply.
`mode` = `searchSubreddit`	One subreddit	`subredditSearchUrl` (`https://…/r/name`, `r/name`, or bare name). Empty `subredditSearchQueries` → feed listing; non-empty → in-subreddit search.
`mode` = `subredditUsers`	Legacy authors	`startUrls` — subreddit home URLs (e.g. `https://www.reddit.com/r/webscraping`).
`maxItems`	All modes	Search: max posts (split across keyword lines). Legacy: max unique authors.
`proxy`	All modes	Use residential or quality proxies if Reddit serves block pages or throttles.

Search options in the Console schema also include: searchIncludeComments, searchMaxCommentsPerPost, searchListingLimitPerPage, searchPostDateFrom / searchPostDateTo, searchCommentDateFrom / searchCommentDateTo, searchForceNewSortWhenDateFiltered, searchStrictPhrase, searchStrictTokenFilter, searchMaximizeCoverage, maxConcurrency / minConcurrency / maxRequestRetries (legacy crawler), etc.

Advanced JSON-only: searchHttpTransport (internalParallel | internal), searchParallelQueryConcurrency. Legacy aliases still read in code: queries, maxPosts, includeNsfw, scrapeComments, maxComments, dateFrom / dateTo, forceSortNewForTimeFilteredRuns, strictSearch, strictTokenFilter, maximize_coverage.

Not supported: logged-in sessions, guaranteed access to private communities, compliance with your jurisdiction’s rules for Reddit data (your responsibility).

Use Cases

Audience	Typical use
Researchers	Post and comment samples for NLP or social science.
Marketing & brand	Mention tracking, campaign feedback, subreddit tone.
Agencies	Client-ready Reddit exports on a schedule.
Product & data teams	Dashboards fed from Dataset webhooks.
Developers	Baseline Reddit HTTP + parsing without owning infra.

How It Works

Choose mode and fill only the inputs that apply (Console sections match this).
The actor requests search.json, subreddit listing or search URLs on old.reddit, or (legacy) listing JSON on www.reddit.com.
Each accepted post is mapped to a flat row (legacy Reddit field names plus optional merged aliases such as dataType, scrapedAt, thread_url).
If searchIncludeComments is on, the actor loads thread JSON, walks replies, and may call /api/morechildren until per-post limits and round budgets are reached.
You export the Dataset (and for legacy mode, read API_DETAILS from the Key-value store).

Input Configuration

Search all of Reddit (minimal):

{
    "mode": "searchGlobal",
    "searchHttpTransport": "internalParallel",
    "searchQueries": ["Cheesecake"],
    "searchSort": "relevance",
    "searchTimeframe": "all",
    "includeOver18": false,
    "maxItems": 100,
    "searchIncludeComments": false,
    "proxy": { "useApifyProxy": true }
}

Search one subreddit:

{
    "mode": "searchSubreddit",
    "searchHttpTransport": "internalParallel",
    "subredditSearchUrl": "https://www.reddit.com/r/technology",
    "subredditSearchQueries": ["api"],
    "subredditSearchSort": "relevance",
    "subredditSearchTimeframe": "all",
    "maxItems": 50,
    "proxy": { "useApifyProxy": true }
}

User scraper (legacy) — author list:

{
    "mode": "subredditUsers",
    "startUrls": [{ "url": "https://www.reddit.com/r/webscraping" }],
    "maxItems": 500,
    "maxConcurrency": 5,
    "proxy": { "useApifyProxy": true }
}

Output Overview

Dataset — one JSON object per line item. Search runs are mostly kind: "post" rows. Comments, when enabled, are additional items with dataType: "comment".
Legacy user mode — primary output is API_DETAILS (unique usernames) in the default key-value store, not the same post schema.
Downloads — Apify offers JSON, CSV, Excel, etc. Nested objects (e.g. preview.images) flatten in CSV.
Honest variability — Reddit omits or nulls fields by post type (text vs link vs gallery, removed authors, ads). Keys are stable when the mapper adds them; values are not.

Output Samples

Shortened post object (real shape; preview…resolutions trimmed for readability). The first record in repo data.json shows the full resolutions ladder.

{
    "kind": "post",
    "query": "Cheesecake",
    "id": "futnih",
    "title": "Different kinds of cheesecake",
    "body": "",
    "author": "cttrv",
    "score": 43222,
    "upvote_ratio": 0.95,
    "num_comments": 1204,
    "subreddit": "coolguides",
    "created_utc": "2020-04-04T13:23:45.000Z",
    "url": "https://i.redd.it/tczgv8mgysq41.jpg",
    "permalink": "/r/coolguides/comments/futnih/different_kinds_of_cheesecake/",
    "canonical_url": "https://www.reddit.com/r/coolguides/comments/futnih/different_kinds_of_cheesecake/",
    "old_reddit_url": "https://old.reddit.com/r/coolguides/comments/futnih/different_kinds_of_cheesecake/",
    "flair": null,
    "post_hint": "image",
    "over_18": false,
    "is_self": false,
    "spoiler": false,
    "locked": false,
    "is_video": false,
    "domain": "i.redd.it",
    "thumbnail": "https://b.thumbs.redditmedia.com/BjwCwDT6OG40X5VFEYWsYSUJ_lrLZvUZizm4q_WG8hk.jpg",
    "subreddit_id": "t5_310rm",
    "subreddit_name_prefixed": "r/coolguides",
    "subreddit_subscribers": 6028638,
    "preview": {
        "images": [
            {
                "source": {
                    "url": "https://preview.redd.it/tczgv8mgysq41.jpg?auto=webp&s=7b17fbc8e9050ee4b242f7ece7de63ae7b0ee43b",
                    "width": 1200,
                    "height": 1200
                },
                "resolutions": [
                    {
                        "url": "https://preview.redd.it/tczgv8mgysq41.jpg?width=108&crop=smart&auto=webp&s=f1107cc1f996a0013ad8b321c6453442d5f576ea",
                        "width": 108,
                        "height": 108
                    },
                    {
                        "url": "https://preview.redd.it/tczgv8mgysq41.jpg?width=216&crop=smart&auto=webp&s=4f45cf02c049215a7462669b4e25268aba62ff49",
                        "width": 216,
                        "height": 216
                    }
                ],
                "variants": {},
                "id": "FJE3K_tyrkJobmnqvZME7_YGilzr4GXSmCvMUqExuhU"
            }
        ],
        "enabled": true
    },
    "media_metadata": null,
    "media": null
}

Comment row (when searchIncludeComments is true) — illustrative:

{
    "id": "t1_xyz789",
    "parsedId": "xyz789",
    "dataType": "comment",
    "query": "Cheesecake",
    "url": "https://www.reddit.com/r/Baking/comments/abc123/title/def456/",
    "postId": "t3_abc123",
    "parentId": "t3_abc123",
    "username": "commenter",
    "body": "Comment text",
    "createdAt": "2025-01-15T14:00:00.000Z",
    "scrapedAt": "2026-04-16T10:00:01.000Z",
    "upVotes": 3,
    "numberOfreplies": 0
}

Full-field sample

For every key on the first object in data.json (including all preview.images[0].resolutions[] entries), inspect that file in the repository or re-export from a fresh run. Newer runs may add merged compatibility fields (dataType, scrapedAt, reddit_fullname, thread_url, imageUrls, …) not present in older exports.

Key Output Fields

Group	Examples	Meaning
Identity	`kind`, `id`, `query`	Post vs comment, short id, search line.
Content	`title`, `body`, `author`, `created_utc`	Headline, selftext, author, ISO time.
Engagement	`score`, `upvote_ratio`, `num_comments`	Reddit score, ratio, comment count.
Community	`subreddit`, `subreddit_name_prefixed`, `subreddit_id`, `subreddit_subscribers`	Bare name, `r/…`, `t5_…`, subscriber snapshot.
URLs	`url`, `permalink`, `canonical_url`, `old_reddit_url`	Outbound link, path, www thread, old.reddit thread.
Flags	`over_18`, `is_self`, `spoiler`, `locked`, `is_video`, `post_hint`	NSFW, text post, spoiler, lock, video, render hint.
Media	`thumbnail`, `preview`, `media_metadata`, `media`	Thumb; nested previews; gallery dict; embed.
Job context	`search_scope`, `subreddit_search`, `subreddit_fetch_mode`, sorts	When the mapper adds subreddit/global metadata.
Compatibility	`dataType`, `scrapedAt`, `reddit_fullname`, `parsedId`, `username`, `upVotes`, `thread_url`, `imageUrls`	Extra aliases on the same object for downstream tools.

Comments: postId, parentId, communityName, category, html, authorFlair, userId, etc.

FAQ

Which URLs work?
searchGlobal uses keywords, not URLs. searchSubreddit needs a subreddit URL or r/name. subredditUsers needs subreddit home URLs in startUrls.

Posts or users?
Search modes → posts (+ optional comments). Legacy mode → usernames in API_DETAILS.

Are comments always in the dataset?
No. Turn on searchIncludeComments (or legacy scrapeComments). maxItems caps posts only; use searchMaxCommentsPerPost / maxComments for comment volume.

Why are some fields null?
Data is whatever Reddit returns for that post type and state.

Private or logged-in content?
Not supported — public JSON only.

Sentiment or categories?
Not implemented.

Support

Issues — use the Issues tab for this actor in the Apify Console.
Website — https://muhamed-didovic.github.io/
Email — muhamed.didovic@gmail.com
Store — https://apify.com/memo23

Additional Services

Customization or full dataset delivery: muhamed.didovic@gmail.com
Other scraping needs or actor changes: muhamed.didovic@gmail.com
API-style usage: muhamed.didovic@gmail.com

Explore More Scrapers

Browse the author’s Apify store: https://apify.com/memo23

Reddit Comment Scraper — Post Comments & Subreddit Monitoring

automly/reddit-comment-scraper

Extract comments from specific Reddit posts or from the top posts of any subreddit. Supports all Reddit comment sort modes. Residential proxy required for reliable access.

Automly

Reddit Scraper

optimus-fulcria/reddit-scraper

Scrape Reddit posts, comments, and subreddit data. Full nested comment threads, search queries, user profiles.

Fulcria Labs

Reddit Subreddit Users Scraper

codingfrontend/reddit-subreddit-users-scraper

Extract all unique users (post authors + comment authors) from a subreddit, with optional full profile details for each user.

Coding Frontned

Reddit Subreddit Scraper

khadinakbar/reddit-subreddit-scraper

Scrape Reddit subreddit posts, metadata, scores, media, comment counts, and optional top-comment previews. MCP optimized for SEO, social listening, and AI agents. $0.002/post.

Khadin Akbar

Reddit Post Search (Pythia)

apricot_blackberry/pythia-reddit

Search Reddit posts via the public search.json endpoint. Returns up to 50 posts per query with subreddit, score, comment count, and permalink. Optional subreddit scoping. No auth required.

Creator Fusion

Reddit Scraper

alwaysprimedev/reddit-scraper

Scrape Reddit posts, threads, and comments from any subreddit, search, or user — clean structured JSON, fast.

Always Prime

Reddit Posts Search Scraper

dami_studio/reddit-search-scraper

Searches Reddit by keyword across the site or one subreddit; returns matching posts as JSON with title, author, subreddit, URL and date. For product research.

Dami's Studio

5.0

Reddit Posts & Subreddit Comment Scraper

taroyamada/reddit-data-scraper

Scrape Reddit posts and nested comment trees from specific subreddits. Proxy-aware fallback for the legacy public surface. Sort by hot, top, new, rising with optional comment depth control.

naoki anzai

Reddit Post & Comment Scraper

miccho27/reddit-post-scraper

Scrape Reddit posts and comments from any subreddit or thread URL. Extract titles, scores, authors, comment trees, and metadata. No Reddit API key or OAuth required.

Tatsuya Mizuno

Reddit Subreddit Scraper — Posts, Scores & Comment Counts

maged120/reddit-subreddit

Scrape posts from any Reddit subreddit. Get titles, scores, comment counts, authors, timestamps, and links. Supports hot, new, top, and rising sort orders.