Reddit [Only $0.50💰] Posts | Users | Scraper

Pricing

from $0.50 / 1,000 results


Rating: 5.0 (1)

Developer: Muhamed Didovic (Maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 14
  • Monthly active users: 13
  • Last modified: 8 hours ago

Reddit Scraper

Scrape Reddit posts (and optionally full comment threads with replies) from public JSON — site-wide & subreddit search, feed browse, post/comment date filters, NSFW toggle, strict phrase/token match, proxy rotation. Legacy: deduplicated authors from subreddit listing URLs (key-value store).

Why Use This Scraper?

  • Multiple starting points — site-wide keywords, one subreddit (search or feed browse), or legacy subreddit URLs for author collection.
  • Post-shaped rows — search modes write one dataset item per post, with Reddit’s own nested preview / media when present.
  • Optional comments — separate dataset rows per comment, with configurable caps and morechildren depth.
  • Practical filters — NSFW toggle, post/comment date windows, strict phrase and token filters, “maximize coverage” budgets, residential-friendly proxy support.
  • Export-friendly — default Apify Dataset; download as JSON, CSV, Excel (CSV flattens nested keys with dots).

Overview

This actor is for research, monitoring, content analysis pipelines, and internal tools that need structured Reddit data without maintaining your own scraping stack.

  • The Search all of Reddit and Search one subreddit modes produce post rows in the dataset. Enable comments to add comment rows in the same dataset (dataType: "comment", ids like t1_…).
  • User scraper (legacy) targets deduplicated usernames stored under the key-value key API_DETAILS. The dataset is secondary in that mode.

The actor does not compute sentiment or topic categories; related input flags are ignored.
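
Once a run finishes, the exported items can be split by row type. A minimal sketch, assuming a plain JSON export of the dataset (`split_rows` is an illustrative helper, not part of the actor):

```python
def split_rows(items):
    """Split exported dataset items into post rows and comment rows.

    Comment rows carry dataType == "comment" and t1_-prefixed ids;
    everything else is treated as a post row.
    """
    posts, comments = [], []
    for item in items:
        is_comment = (item.get("dataType") == "comment"
                      or str(item.get("id", "")).startswith("t1_"))
        (comments if is_comment else posts).append(item)
    return posts, comments

# Two rows shaped like the samples later in this README:
rows = [
    {"kind": "post", "id": "futnih", "title": "Different kinds of cheesecake"},
    {"id": "t1_xyz789", "dataType": "comment", "body": "Comment text"},
]
posts, comments = split_rows(rows)
print(len(posts), len(comments))  # 1 1
```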

Supported Inputs

| You provide | When | Notes |
| --- | --- | --- |
| mode = searchGlobal | Site-wide search | Set searchQueries (string list). searchSort, searchTimeframe, includeOver18 apply. |
| mode = searchSubreddit | One subreddit | subredditSearchUrl (https://…/r/name, r/name, or bare name). Empty subredditSearchQueries → feed listing; non-empty → in-subreddit search. |
| mode = subredditUsers | Legacy authors | startUrls — subreddit home URLs (e.g. https://www.reddit.com/r/webscraping). |
| maxItems | All modes | Search: max posts (split across keyword lines). Legacy: max unique authors. |
| proxy | All modes | Use residential or quality proxies if Reddit serves block pages or throttles. |

Search options in the Console schema also include: searchIncludeComments, searchMaxCommentsPerPost, searchListingLimitPerPage, searchPostDateFrom / searchPostDateTo, searchCommentDateFrom / searchCommentDateTo, searchForceNewSortWhenDateFiltered, searchStrictPhrase, searchStrictTokenFilter, searchMaximizeCoverage, maxConcurrency / minConcurrency / maxRequestRetries (legacy crawler), etc.

Advanced JSON-only: searchHttpTransport (internalParallel | internal), searchParallelQueryConcurrency. Legacy aliases still read in code: queries, maxPosts, includeNsfw, scrapeComments, maxComments, dateFrom / dateTo, forceSortNewForTimeFilteredRuns, strictSearch, strictTokenFilter, maximize_coverage.
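
If you maintain old input payloads, the legacy aliases can be renamed to the current input names before a run. The pairing below is an assumption read off the two lists above (the actor may resolve them differently), and `normalize_input` is a hypothetical helper, not part of the actor:

```python
# Assumed alias → current-name mapping (unverified against the actor's code).
LEGACY_ALIASES = {
    "queries": "searchQueries",
    "maxPosts": "maxItems",
    "includeNsfw": "includeOver18",
    "scrapeComments": "searchIncludeComments",
    "maxComments": "searchMaxCommentsPerPost",
    "dateFrom": "searchPostDateFrom",
    "dateTo": "searchPostDateTo",
    "forceSortNewForTimeFilteredRuns": "searchForceNewSortWhenDateFiltered",
    "strictSearch": "searchStrictPhrase",
    "strictTokenFilter": "searchStrictTokenFilter",
    "maximize_coverage": "searchMaximizeCoverage",
}

def normalize_input(run_input):
    """Return a copy with legacy keys renamed; current keys win on conflict."""
    out = {k: v for k, v in run_input.items() if k not in LEGACY_ALIASES}
    for old, new in LEGACY_ALIASES.items():
        if old in run_input and new not in out:
            out[new] = run_input[old]
    return out

print(normalize_input({"queries": ["api"], "maxPosts": 10}))
```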

Not supported: logged-in sessions, guaranteed access to private communities, compliance with your jurisdiction’s rules for Reddit data (your responsibility).

Use Cases

| Audience | Typical use |
| --- | --- |
| Researchers | Post and comment samples for NLP or social science. |
| Marketing & brand | Mention tracking, campaign feedback, subreddit tone. |
| Agencies | Client-ready Reddit exports on a schedule. |
| Product & data teams | Dashboards fed from Dataset webhooks. |
| Developers | Baseline Reddit HTTP + parsing without owning infra. |

How It Works

  1. Choose mode and fill only the inputs that apply (Console sections match this).
  2. The actor requests search.json, subreddit listing or search URLs on old.reddit, or (legacy) listing JSON on www.reddit.com.
  3. Each accepted post is mapped to a flat row (legacy Reddit field names plus optional merged aliases such as dataType, scrapedAt, thread_url).
  4. If searchIncludeComments is on, the actor loads thread JSON, walks replies, and may call /api/morechildren until per-post limits and round budgets are reached.
  5. You export the Dataset (and for legacy mode, read API_DETAILS from the Key-value store).
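
Step 2 can be illustrated for the site-wide case. This sketch builds the kind of public endpoint involved, using Reddit's documented search.json query parameters; the actor's exact request shape is not guaranteed to match:

```python
from urllib.parse import urlencode

def build_search_url(query, sort="relevance", timeframe="all",
                     limit=100, include_over_18=False):
    """Compose a public site-wide search.json URL from keyword + options."""
    params = {
        "q": query,
        "sort": sort,            # relevance | hot | top | new | comments
        "t": timeframe,          # hour | day | week | month | year | all
        "limit": limit,
        "include_over_18": "on" if include_over_18 else "off",
        "raw_json": 1,           # ask Reddit not to HTML-escape bodies
    }
    return "https://www.reddit.com/search.json?" + urlencode(params)

print(build_search_url("Cheesecake"))
```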

Input Configuration

Search all of Reddit (minimal):

```json
{
  "mode": "searchGlobal",
  "searchHttpTransport": "internalParallel",
  "searchQueries": ["Cheesecake"],
  "searchSort": "relevance",
  "searchTimeframe": "all",
  "includeOver18": false,
  "maxItems": 100,
  "searchIncludeComments": false,
  "proxy": { "useApifyProxy": true }
}
```

Search one subreddit:

```json
{
  "mode": "searchSubreddit",
  "searchHttpTransport": "internalParallel",
  "subredditSearchUrl": "https://www.reddit.com/r/technology",
  "subredditSearchQueries": ["api"],
  "subredditSearchSort": "relevance",
  "subredditSearchTimeframe": "all",
  "maxItems": 50,
  "proxy": { "useApifyProxy": true }
}
```

User scraper (legacy) — author list:

```json
{
  "mode": "subredditUsers",
  "startUrls": [{ "url": "https://www.reddit.com/r/webscraping" }],
  "maxItems": 500,
  "maxConcurrency": 5,
  "proxy": { "useApifyProxy": true }
}
```

Output Overview

  • Dataset — one JSON object per line item. Search runs are mostly kind: "post" rows. Comments, when enabled, are additional items with dataType: "comment".
  • Legacy user mode — primary output is API_DETAILS (unique usernames) in the default key-value store, not the same post schema.
  • Downloads — Apify offers JSON, CSV, Excel, etc. Nested objects (e.g. preview.images) flatten in CSV.
  • Honest variability — Reddit omits or nulls fields by post type (text vs link vs gallery, removed authors, ads). Keys are stable when the mapper adds them; values are not.
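
The dot-flattening behavior mentioned above can be mimicked locally, e.g. to preview the CSV column names a nested row will produce. A minimal sketch (`flatten` is an illustrative helper, not the exporter itself):

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts into dot-separated keys, mirroring how the
    CSV download turns e.g. preview.images into columns."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, name + "."))
        else:
            flat[name] = value
    return flat

row = {"id": "futnih", "preview": {"enabled": True}}
print(flatten(row))  # {'id': 'futnih', 'preview.enabled': True}
```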

Output Samples

Shortened post object (real shape; preview…resolutions trimmed for readability). The first record in repo data.json shows the full resolutions ladder.

```json
{
  "kind": "post",
  "query": "Cheesecake",
  "id": "futnih",
  "title": "Different kinds of cheesecake",
  "body": "",
  "author": "cttrv",
  "score": 43222,
  "upvote_ratio": 0.95,
  "num_comments": 1204,
  "subreddit": "coolguides",
  "created_utc": "2020-04-04T13:23:45.000Z",
  "url": "https://i.redd.it/tczgv8mgysq41.jpg",
  "permalink": "/r/coolguides/comments/futnih/different_kinds_of_cheesecake/",
  "canonical_url": "https://www.reddit.com/r/coolguides/comments/futnih/different_kinds_of_cheesecake/",
  "old_reddit_url": "https://old.reddit.com/r/coolguides/comments/futnih/different_kinds_of_cheesecake/",
  "flair": null,
  "post_hint": "image",
  "over_18": false,
  "is_self": false,
  "spoiler": false,
  "locked": false,
  "is_video": false,
  "domain": "i.redd.it",
  "thumbnail": "https://b.thumbs.redditmedia.com/BjwCwDT6OG40X5VFEYWsYSUJ_lrLZvUZizm4q_WG8hk.jpg",
  "subreddit_id": "t5_310rm",
  "subreddit_name_prefixed": "r/coolguides",
  "subreddit_subscribers": 6028638,
  "preview": {
    "images": [
      {
        "source": {
          "url": "https://preview.redd.it/tczgv8mgysq41.jpg?auto=webp&s=7b17fbc8e9050ee4b242f7ece7de63ae7b0ee43b",
          "width": 1200,
          "height": 1200
        },
        "resolutions": [
          {
            "url": "https://preview.redd.it/tczgv8mgysq41.jpg?width=108&crop=smart&auto=webp&s=f1107cc1f996a0013ad8b321c6453442d5f576ea",
            "width": 108,
            "height": 108
          },
          {
            "url": "https://preview.redd.it/tczgv8mgysq41.jpg?width=216&crop=smart&auto=webp&s=4f45cf02c049215a7462669b4e25268aba62ff49",
            "width": 216,
            "height": 216
          }
        ],
        "variants": {},
        "id": "FJE3K_tyrkJobmnqvZME7_YGilzr4GXSmCvMUqExuhU"
      }
    ],
    "enabled": true
  },
  "media_metadata": null,
  "media": null
}
```

Comment row (when searchIncludeComments is true) — illustrative:

```json
{
  "id": "t1_xyz789",
  "parsedId": "xyz789",
  "dataType": "comment",
  "query": "Cheesecake",
  "url": "https://www.reddit.com/r/Baking/comments/abc123/title/def456/",
  "postId": "t3_abc123",
  "parentId": "t3_abc123",
  "username": "commenter",
  "body": "Comment text",
  "createdAt": "2025-01-15T14:00:00.000Z",
  "scrapedAt": "2026-04-16T10:00:01.000Z",
  "upVotes": 3,
  "numberOfreplies": 0
}
```
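
The post/comment date filters behave like a window check on these ISO timestamps. An illustrative re-implementation, assuming inclusive UTC bounds (the actor's exact comparison may differ):

```python
from datetime import datetime, timezone

def within_window(created_iso, date_from=None, date_to=None):
    """Keep a row only if its created_utc / createdAt timestamp falls
    inside the [date_from, date_to] window (dates treated as UTC)."""
    created = datetime.fromisoformat(created_iso.replace("Z", "+00:00"))
    if date_from:
        start = datetime.fromisoformat(date_from).replace(tzinfo=timezone.utc)
        if created < start:
            return False
    if date_to:
        end = datetime.fromisoformat(date_to).replace(tzinfo=timezone.utc)
        if created > end:
            return False
    return True

print(within_window("2025-01-15T14:00:00.000Z", "2025-01-01", "2025-12-31"))  # True
```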

Full-field sample

For every key on the first object in data.json (including all preview.images[0].resolutions[] entries), inspect that file in the repository or re-export from a fresh run. Newer runs may add merged compatibility fields (dataType, scrapedAt, reddit_fullname, thread_url, imageUrls, …) not present in older exports.

Key Output Fields

| Group | Examples | Meaning |
| --- | --- | --- |
| Identity | kind, id, query | Post vs comment, short id, search line. |
| Content | title, body, author, created_utc | Headline, selftext, author, ISO time. |
| Engagement | score, upvote_ratio, num_comments | Reddit score, ratio, comment count. |
| Community | subreddit, subreddit_name_prefixed, subreddit_id, subreddit_subscribers | Bare name, r/…, t5_…, subscriber snapshot. |
| URLs | url, permalink, canonical_url, old_reddit_url | Outbound link, path, www thread, old.reddit thread. |
| Flags | over_18, is_self, spoiler, locked, is_video, post_hint | NSFW, text post, spoiler, lock, video, render hint. |
| Media | thumbnail, preview, media_metadata, media | Thumb; nested previews; gallery dict; embed. |
| Job context | search_scope, subreddit_search, subreddit_fetch_mode, sorts | When the mapper adds subreddit/global metadata. |
| Compatibility | dataType, scrapedAt, reddit_fullname, parsedId, username, upVotes, thread_url, imageUrls | Extra aliases on the same object for downstream tools. |

Comments: postId, parentId, communityName, category, html, authorFlair, userId, etc.

FAQ

Which URLs work?
searchGlobal uses keywords, not URLs. searchSubreddit needs a subreddit URL or r/name. subredditUsers needs subreddit home URLs in startUrls.

Posts or users?
Search modes → posts (+ optional comments). Legacy mode → usernames in API_DETAILS.

Are comments always in the dataset?
No. Turn on searchIncludeComments (or legacy scrapeComments). maxItems caps posts only; use searchMaxCommentsPerPost / maxComments for comment volume.

Why are some fields null?
Data is whatever Reddit returns for that post type and state.

Private or logged-in content?
Not supported — public JSON only.

Sentiment or categories?
Not implemented.

Support

Additional Services

Explore More Scrapers

Browse the author’s Apify store: https://apify.com/memo23