Reddit Scraper | All-In-One | $1.5 / 1K

Pricing

$1.49 / 1,000 results

Reddit all-in-one scraper. Scrape posts and full comment threads from any search, subreddit, user, or direct post URL. This enterprise-grade scraper is the fastest on the market and delivers clean, detailed JSON.

Rating

3.3 (9)

Developer

Fatih Tahta

Maintained by Community

Actor stats

  • Bookmarked: 66
  • Total users: 1.6K
  • Monthly active users: 347
  • Issues response: 1.3 days
  • Last modified: 16 days ago


Reddit Scraper

Slug: fatihtahta/reddit-scraper

Overview

Reddit Scraper collects publicly available Reddit posts and (optionally) comments, then saves normalized JSON records to your Apify dataset. It supports three input patterns: direct URL scraping (urls), global keyword search (queries), and subreddit mode (subredditName with optional subredditKeywords). The output schema is stable across runs, with explicit kind: "post" and kind: "comment" record types for easier downstream ingestion.

Why Use This Actor

  • Monitor one or more subreddits on a recurring schedule.
  • Track keywords across Reddit for market, brand, or competitor intelligence.
  • Build analytics datasets with normalized post/comment metadata.
  • Run sentiment and discussion analysis on comment threads.
  • Feed Reddit events into BI tools, CRMs, alerting systems, or warehouses.

Input Parameters

  • queries (string[], default: []): Global search queries. Used when urls are not provided.
  • urls (string[], default: []): Reddit URLs to scrape directly (posts, listings, subreddit search pages, user pages, and redd.it/{id} links). Takes priority over all other targeting fields.
  • subredditName (string, default: null): Subreddit to target (with or without r/ prefix). With no subredditKeywords, the actor runs subreddit listing mode.
  • subredditKeywords (string[], default: []): Keywords searched within subredditName. When present with subredditName, the actor runs subreddit search mode.
  • sort ("relevance" | "hot" | "top" | "new" | "comments", default: "relevance"): Sort for global queries and some normalized URL searches.
  • timeframe ("hour" | "day" | "week" | "month" | "year" | "all", default: "all"): Time filter for compatible search sorts (relevance, top, comments).
  • subredditSort (same options as sort, default: sort): Sort for subreddit mode. Falls back to sort when omitted.
  • subredditTimeframe (same options as timeframe, default: timeframe): Time filter for subreddit mode on compatible sorts. Falls back to timeframe when omitted.
  • scrapeComments (boolean, default: false): Enable comment extraction for each discovered post.
  • maxPosts (number, default: 50000): Max posts saved per target (query, subreddit keyword, subreddit listing target, or URL target). Values below 1 are coerced to 1.
  • maxComments (number, default: 50000): Max comments saved per post when scrapeComments=true. Values below 0 are coerced to 0; effective cap is 50,000 per post.
  • includeNsfw (boolean, default: false): Include NSFW content in compatible Reddit endpoints.
  • strictSearch (boolean, default: false): Builds stricter Reddit search queries by quoting tokens and joining them with AND.
  • strictTokenFilter (boolean, default: false): Post-save filter: requires all query tokens to appear in title/body/URL to reduce false positives.

Mode note:

  • Provide at least one targeting source: urls, queries, or subredditName (optionally with subredditKeywords).
  • If urls is non-empty, URL mode is used and other targeting fields are ignored.
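
The precedence rules above can be sketched as a small helper. This is an illustrative sketch, not the actor's actual code; the field names match the input schema, but the mode labels are made up for the example:

```python
def resolve_mode(run_input: dict) -> str:
    """Pick the targeting mode per the input rules: urls wins over
    everything, then queries, then the subreddit settings."""
    if run_input.get("urls"):
        return "url"
    if run_input.get("queries"):
        return "search"
    if run_input.get("subredditName"):
        # With subredditKeywords -> subreddit search, without -> listing.
        if run_input.get("subredditKeywords"):
            return "subreddit-search"
        return "subreddit-listing"
    raise ValueError("Provide at least one of: urls, queries, subredditName")

# urls takes priority even when queries is also set:
print(resolve_mode({"urls": ["https://www.reddit.com/r/MachineLearning/"],
                    "queries": ["llm observability"]}))  # url
```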

Example Input

1) Subreddit monitoring (new posts in a subreddit)

{
  "subredditName": "technology",
  "subredditSort": "new",
  "subredditTimeframe": "day",
  "scrapeComments": true,
  "maxPosts": 200,
  "maxComments": 250,
  "includeNsfw": false
}

2) Global keyword search

{
  "queries": ["llm observability", "vector database"],
  "sort": "top",
  "timeframe": "week",
  "strictSearch": true,
  "strictTokenFilter": true,
  "scrapeComments": false,
  "maxPosts": 300,
  "includeNsfw": false
}

3) URL list backfill

{
  "urls": [
    "https://www.reddit.com/r/MachineLearning/",
    "https://www.reddit.com/r/dataengineering/comments/1abcxyz/example_post/"
  ],
  "scrapeComments": true,
  "maxPosts": 100,
  "maxComments": 500
}
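
For illustration, the two strict* options used in example 2 can be approximated like this. The actor's exact query construction is not documented, so the quoted, AND-joined format below is an assumption based on the parameter descriptions:

```python
def build_strict_query(query: str) -> str:
    # strictSearch: quote each token and join with AND (assumed format).
    return " AND ".join(f'"{tok}"' for tok in query.split())

def passes_token_filter(query: str, record: dict) -> bool:
    # strictTokenFilter: every query token must appear in title/body/URL.
    haystack = " ".join(
        str(record.get(field) or "") for field in ("title", "body", "url")
    ).lower()
    return all(tok.lower() in haystack for tok in query.split())

print(build_strict_query("llm observability"))  # "llm" AND "observability"
```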

Output

Output destination

All results are stored in the run's Apify dataset as JSON records.

Record types

  • kind: "post" for Reddit submissions.
  • kind: "comment" for comments, emitted only when scrapeComments=true and comment limit permits.

Deduplication key

Use kind + ":" + id as the deduplication key; it is stable across reruns for the same Reddit object.
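
A minimal dedupe pass over downloaded records using that key might look like this (sketch; assumes records are plain dicts with kind and id fields as in the examples below):

```python
def dedupe(records):
    """Keep the first record seen for each kind:id key."""
    seen = set()
    unique = []
    for rec in records:
        key = f'{rec["kind"]}:{rec["id"]}'
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

rows = [
    {"kind": "post", "id": "1abc123"},
    {"kind": "comment", "id": "kxyz789"},
    {"kind": "post", "id": "1abc123"},  # duplicate from a rerun
]
print(len(dedupe(rows)))  # 2
```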

Examples

Post example (kind: "post")

{
  "kind": "post",
  "query": "r/technology",
  "id": "1abc123",
  "title": "Major framework release notes",
  "body": "Key updates and migration guidance...",
  "author": "example_user",
  "score": 842,
  "upvote_ratio": 0.94,
  "num_comments": 167,
  "subreddit": "technology",
  "created_utc": "2026-01-15T10:20:33.000Z",
  "url": "https://www.reddit.com/r/technology/comments/1abc123/major_framework_release_notes/",
  "permalink": "/r/technology/comments/1abc123/major_framework_release_notes/",
  "canonical_url": "https://www.reddit.com/r/technology/comments/1abc123/major_framework_release_notes/",
  "old_reddit_url": "https://old.reddit.com/r/technology/comments/1abc123/major_framework_release_notes/",
  "json_url": "https://www.reddit.com/r/technology/comments/1abc123/major_framework_release_notes/.json",
  "flair": "News",
  "over_18": false,
  "is_self": true,
  "spoiler": false,
  "locked": false,
  "is_video": false,
  "domain": "self.technology",
  "thumbnail": "self",
  "url_overridden_by_dest": null,
  "media": null,
  "media_metadata": null,
  "gallery_data": null,
  "gallery_images": [],
  "media_assets": [],
  "age_hours": 2.75,
  "engagement_total": 1009,
  "comment_to_score_ratio": 0.1983,
  "is_high_engagement": true,
  "content_flags": [],
  "stickied": false,
  "distinguished": null,
  "total_awards_received": 2,
  "all_awardings": [],
  "gilded": 0,
  "num_crossposts": 0,
  "is_original_content": false,
  "author_fullname": "t2_example",
  "author_flair_text": null,
  "author_premium": false,
  "selftext_html": "<div class=\"md\"><p>Key updates...</p></div>",
  "preview": null,
  "secure_media": null,
  "secure_media_embed": null,
  "crosspost_parent_list": null
}

Comment example (kind: "comment")

{
  "kind": "comment",
  "query": "r/technology",
  "id": "kxyz789",
  "postId": "1abc123",
  "postUrl": "https://old.reddit.com/r/technology/comments/1abc123/major_framework_release_notes/.json?raw_json=1&limit=500",
  "parentId": "t3_1abc123",
  "body": "The migration section saved us hours.",
  "author": "data_ops_team",
  "score": 51,
  "created_utc": "2026-01-15T11:02:44.000Z",
  "url": "https://www.reddit.com/r/technology/comments/1abc123/major_framework_release_notes/kxyz789/",
  "permalink": "/r/technology/comments/1abc123/major_framework_release_notes/kxyz789/",
  "canonical_url": "https://www.reddit.com/r/technology/comments/1abc123/major_framework_release_notes/kxyz789/",
  "old_reddit_url": "https://old.reddit.com/r/technology/comments/1abc123/major_framework_release_notes/kxyz789/",
  "json_url": "https://www.reddit.com/r/technology/comments/1abc123/major_framework_release_notes/kxyz789/.json",
  "root_comment_id": "kxyz789",
  "parent_kind": "post",
  "comment_permalink": "/r/technology/comments/1abc123/major_framework_release_notes/kxyz789/",
  "author_deleted": false,
  "body_deleted": false,
  "stickied": false,
  "distinguished": null,
  "is_submitter": false,
  "score_hidden": false,
  "controversiality": 0,
  "depth": 0
}

Field Reference

Post fields (kind: "post")

  • Identity/context: kind, query, id, subreddit, created_utc.
  • Core content: title, body, author, url, permalink, canonical_url, old_reddit_url, json_url.
  • Engagement: score, upvote_ratio, num_comments, engagement_total, comment_to_score_ratio, is_high_engagement.
  • Classification/state: flair, over_18, spoiler, locked, is_self, is_video, content_flags.
  • Media/detail: domain, thumbnail, url_overridden_by_dest, media, media_metadata, gallery_data, gallery_images, media_assets, preview, secure_media, secure_media_embed.
  • Additional metadata: stickied, distinguished, total_awards_received, all_awardings, gilded, num_crossposts, is_original_content, crosspost_parent_list.
  • Author metadata: author_fullname, author_flair_text, author_premium.
  • Derived/runtime fields: age_hours.

Comment fields (kind: "comment")

  • Identity/context: kind, query, id, postId, postUrl, parentId, created_utc.
  • Core content: body, author, score, url, permalink, canonical_url, old_reddit_url, json_url, comment_permalink.
  • Threading/deletion: root_comment_id, parent_kind, author_deleted, body_deleted, depth.
  • Moderation/visibility metadata: stickied, distinguished, is_submitter, score_hidden, controversiality.

Data guarantees & handling

  • Extraction is best-effort and depends on Reddit endpoint availability and response consistency.
  • Optional fields can be null when Reddit does not return them.
  • Deleted/removed content can appear with deletion indicators (author_deleted, body_deleted) and/or placeholder text.
  • timeframe is effective only on compatible sorts/endpoints (relevance, top, comments for search-style routes).
  • num_comments on posts may differ from comments actually saved (limit settings, pagination boundaries, unavailable branches, deleted content).
  • Large runs can end with partial coverage when upstream failures/retries exceed limits; use scheduling + dedupe for resilient pipelines.
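
When post-processing, it is worth guarding against the null and deletion cases listed above. A hedged sketch (the "[deleted]"/"[removed]" placeholders are Reddit's usual markers; treat the exact strings as an assumption):

```python
def usable_comment_text(rec):
    """Return comment body text only when it is actually usable downstream."""
    if rec.get("body_deleted") or rec.get("author_deleted"):
        return None
    body = rec.get("body")  # optional fields can be null
    if not body or body in ("[deleted]", "[removed]"):
        return None
    return body

print(usable_comment_text({"body": "Great write-up"}))      # Great write-up
print(usable_comment_text({"body": "[deleted]", "body_deleted": True}))  # None
```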

Pricing

This actor costs $1.50 per 1,000 saved items (post or comment records).

Example:

  • 10,000 posts + 25,000 comments = 35,000 saved items
  • (35,000 / 1,000) * $1.50 = $52.50
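
The arithmetic generalizes to a one-line estimator (sketch; uses the $1.50 per 1,000 saved-items rate quoted above):

```python
def estimated_cost_usd(posts, comments, rate_per_1k=1.50):
    saved_items = posts + comments  # both record kinds count as saved items
    return round(saved_items / 1000 * rate_per_1k, 2)

print(estimated_cost_usd(10_000, 25_000))  # 52.5
```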

Scheduling & Automation

Recommended production patterns:

  • Recurring subreddit snapshots: Run every 15–60 minutes with subredditName + subredditSort: "new".
  • Recurring keyword monitoring: Run hourly/daily with queries, chosen sort, and compatible timeframe.
  • Webhook fan-out: Trigger dataset webhooks to sync into warehouses, CRMs, alerting, or Slack.
  • Delta strategy: Deduplicate by kind:id, then compare current vs previous snapshots to detect new posts/comments and metric changes.
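
The delta strategy in the last bullet can be sketched as a comparison of two snapshots keyed by kind:id (sketch; score is used here as an example of a changing metric):

```python
def detect_deltas(previous, current):
    """Return (new_items, changed) between two snapshots of records."""
    def key(rec):
        return f'{rec["kind"]}:{rec["id"]}'
    prev = {key(r): r for r in previous}
    new_items, changed = [], []
    for rec in current:
        old = prev.get(key(rec))
        if old is None:
            new_items.append(rec)          # seen for the first time
        elif rec.get("score") != old.get("score"):
            changed.append(rec)            # metric moved since last run
    return new_items, changed

prev = [{"kind": "post", "id": "a", "score": 10}]
curr = [{"kind": "post", "id": "a", "score": 12},
        {"kind": "comment", "id": "b", "score": 1}]
new_items, changed = detect_deltas(prev, curr)
print(len(new_items), len(changed))  # 1 1
```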

How to Run on Apify

  1. Open the actor in Apify Console.
  2. Click Start and choose Input.
  3. Pick one targeting mode (urls, queries, or subreddit settings).
  4. Set limits (maxPosts, optional maxComments) and toggle scrapeComments as needed.
  5. Configure ranking filters (sort, timeframe, subreddit variants) and NSFW behavior.
  6. Run the actor and monitor logs.
  7. Open the output dataset and export JSON/CSV (or process via API/webhooks).

Ethics & Compliance

This actor is intended for publicly available Reddit data only. Do not use it to bypass access controls, scrape private/non-public areas, or violate Reddit terms. Use collected data responsibly and avoid workflows that enable spam, harassment, or other abusive behavior.

Support

If you need help, open an issue from the Apify Console Issues tab and include:

  • your input JSON (redact sensitive values),
  • run ID,
  • expected behavior vs. actual behavior.

Happy scraping!