Reddit Scraper

Scrape posts and comments from any Reddit subreddit. Supports multiple subreddits, search, sorting, time filters, and optional comment extraction — no API key required.

Pricing: Pay per usage
Rating: 0.0 (0)
Developer: Monkey Coder (Maintained by Community)
Actor stats: 1 bookmarked · 10 total users · 3 monthly active users · last modified 9 days ago

📡 Reddit Scraper

Scrape posts and comments from any Reddit subreddit — no API key, no login required.

Why use this Actor

Use this Actor when you need Reddit data fast without managing browsers or API keys. It’s a good fit for content research, community monitoring, trend spotting, and lead discovery.

✨ Features

  • Multiple subreddits — Scrape several subreddits in a single run
  • Flexible sorting — Hot, New, Top, Rising, or Controversial
  • Time filters — Past hour, day, week, month, year, or all time
  • Search within subreddits — Filter posts by keyword
  • Comment extraction — Optionally fetch and flatten comment trees
  • Rich post data — Title, author, score, upvote ratio, flair, preview images, awards, and more
  • No API key needed — Uses Reddit's public .json endpoints

🔧 How It Works

  1. The Actor sends HTTP requests to Reddit's public JSON endpoints (e.g., reddit.com/r/python/hot.json)
  2. Posts are extracted from the paginated response (up to 100 per page)
  3. If comments are enabled, each post's comment thread is fetched and the nested tree is flattened
  4. Built-in rate limiting (2–4 second delays) prevents Reddit from blocking requests
  5. Results are pushed to the Apify dataset as flat JSON objects
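The listing-fetch steps above (1–3) can be sketched with only the Python standard library. The endpoint shape (`/r/<subreddit>/<sort>.json`, the `after` cursor, the `data.children` wrapper) matches Reddit's public JSON listings; the helper names and the User-Agent string are illustrative, not the Actor's actual code:

```python
import json
import urllib.request

USER_AGENT = "reddit-scraper-example/0.1"  # Reddit rejects requests without a User-Agent


def listing_url(subreddit, sort="hot", limit=100, after=None):
    """Build a public listing URL such as reddit.com/r/python/hot.json."""
    url = f"https://www.reddit.com/r/{subreddit}/{sort}.json?limit={limit}"
    if after:
        url += f"&after={after}"  # cursor pointing at the next page
    return url


def extract_posts(listing):
    """Pull flat post dicts and the pagination cursor out of a listing response."""
    data = listing["data"]
    posts = [child["data"] for child in data["children"]]
    return posts, data.get("after")  # `after` is None on the last page


def fetch_page(url):
    """Fetch one page of a listing as parsed JSON."""
    req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Looping `fetch_page` + `extract_posts` until `after` comes back empty reproduces the pagination in step 2.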

📥 Input Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| subreddits | String | python | Comma-separated subreddit names (without r/ prefix) |
| sort | Select | hot | Sort mode: hot, new, top, rising, controversial |
| time_filter | Select | week | Time range for top/controversial: hour, day, week, month, year, all |
| max_posts | Integer | 25 | Max posts per subreddit (1–1000) |
| search_query | String | (empty) | Optional keyword search within each subreddit |
| include_comments | Boolean | false | Fetch comments for each post (slower due to rate limiting) |
| max_comments_per_post | Integer | 10 | Max comments per post when comments are enabled |
| comment_sort | Select | top | Comment sort: best, top, new, controversial, old, qa |
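Before a run, input like the above is typically merged over the defaults and bounds-checked. This is a hypothetical helper, not part of the Actor, showing how the documented defaults and the 1–1000 bound on max_posts might be applied:

```python
DEFAULTS = {
    "subreddits": "python",
    "sort": "hot",
    "time_filter": "week",
    "max_posts": 25,
    "search_query": "",
    "include_comments": False,
    "max_comments_per_post": 10,
    "comment_sort": "top",
}


def normalize_input(raw):
    """Merge user input over the defaults and enforce documented bounds."""
    cfg = {**DEFAULTS, **raw}
    # max_posts is documented as 1-1000, so clamp out-of-range values
    cfg["max_posts"] = max(1, min(1000, int(cfg["max_posts"])))
    # subreddits is comma-separated; tolerate a stray r/ prefix and whitespace
    cfg["subreddit_list"] = [
        s.strip().removeprefix("r/") for s in cfg["subreddits"].split(",") if s.strip()
    ]
    return cfg
```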

📤 Sample Output

Post (without comments)

{
  "post_id": "1abc2de",
  "title": "Python 3.13 released with major performance improvements",
  "author": "guido_van_rossum",
  "subreddit": "python",
  "score": 4523,
  "upvote_ratio": 0.97,
  "num_comments": 312,
  "url": "https://docs.python.org/3.13/whatsnew/3.13.html",
  "permalink": "https://www.reddit.com/r/python/comments/1abc2de/python_313_released/",
  "is_self": false,
  "selftext": "",
  "domain": "docs.python.org",
  "flair": "News",
  "is_video": false,
  "over_18": false,
  "spoiler": false,
  "stickied": false,
  "created_utc": "2025-10-01T14:30:00+00:00",
  "thumbnail": "https://b.thumbs.redditmedia.com/...",
  "preview_image": "https://preview.redd.it/...",
  "total_awards": 5,
  "gilded": 2,
  "num_crossposts": 3,
  "fetched_at": "2025-12-15T10:00:00+00:00"
}

Comment (when include_comments is enabled)

Comments are included as an array in each post's comments field:

{
  "comment_id": "k7xyz99",
  "post_id": "1abc2de",
  "author": "pythonista42",
  "body": "The new JIT compiler is incredible. 2x speedup on my workloads.",
  "score": 891,
  "depth": 0,
  "is_submitter": false,
  "parent_id": "t3_1abc2de",
  "permalink": "https://www.reddit.com/r/python/comments/1abc2de/.../k7xyz99/",
  "stickied": false,
  "distinguished": "",
  "controversiality": 0,
  "created_utc": "2025-10-01T15:00:00+00:00",
  "fetched_at": "2025-12-15T10:00:30+00:00"
}
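The flattening in step 3 of How It Works walks Reddit's nested comment tree depth-first. A minimal sketch, assuming the raw listing shape Reddit returns (`kind: "t1"` for comments, `replies` either a nested listing or an empty string); the function name and the selected fields are illustrative:

```python
def flatten_comments(children, max_comments, depth=0, out=None):
    """Depth-first walk over Reddit's nested comment tree, producing flat dicts."""
    if out is None:
        out = []
    for child in children:
        if len(out) >= max_comments:
            break
        if child.get("kind") != "t1":  # skip "more" stubs and non-comment nodes
            continue
        data = child["data"]
        out.append({
            "comment_id": data.get("id"),
            "author": data.get("author"),
            "body": data.get("body", ""),
            "score": data.get("score", 0),
            "depth": depth,  # 0 for top-level, incremented per reply level
            "parent_id": data.get("parent_id"),
        })
        replies = data.get("replies")
        if isinstance(replies, dict):  # empty replies come back as ""
            flatten_comments(replies["data"]["children"], max_comments, depth + 1, out)
    return out
```

The `depth` field in the sample output above records how deep each comment sat in the original tree.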

🚀 Quick Start

  1. Add one or more subreddit names in subreddits (example: python,javascript).
  2. Turn on include_comments only when you need comment threads.
  3. Run the Actor and check that each item includes post_id, title, subreddit, and fetched_at.

🔍 Verification Steps

  • Run a small test input first: subreddits = python, max_posts = 5.
  • Confirm the dataset contains flat post objects and the fetched_at timestamp.
  • If comments are enabled, verify comments_fetched is present and greater than 0 on at least one post.
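The checks above can be automated once the dataset is downloaded. A minimal sketch over a list of dataset items; the required field names come from the sample output, but the helper itself is hypothetical:

```python
REQUIRED_FIELDS = ("post_id", "title", "subreddit", "fetched_at")


def verify_items(items, expect_comments=False):
    """Check every dataset item has the required flat fields; return problems found."""
    problems = []
    for i, item in enumerate(items):
        missing = [f for f in REQUIRED_FIELDS if f not in item]
        if missing:
            problems.append(f"item {i}: missing {missing}")
    if expect_comments and not any(item.get("comments_fetched", 0) > 0 for item in items):
        problems.append("no item has comments_fetched > 0")
    return problems  # empty list means the run passed verification
```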

🧪 Example Inputs

Basic post scrape

{
  "subreddits": "python,javascript",
  "sort": "hot",
  "max_posts": 5
}

Search + comments

{
  "subreddits": "learnprogramming",
  "search_query": "api",
  "include_comments": true,
  "max_posts": 10,
  "max_comments_per_post": 5,
  "comment_sort": "top"
}

⚠️ Notes

  • Authentication: none required. The Actor reads Reddit's public .json endpoints, so no OAuth client, API key, or login is needed.
  • Rate limiting: the Actor waits 2–4 seconds between requests to avoid being blocked; runs with include_comments enabled take longer because each post needs an extra request.
  • Post text truncation: selftext is truncated to 2,000 characters and comment body to 3,000 characters to keep dataset items compact.
  • Pagination limit: Reddit caps listing access at roughly 1,000 posts per subreddit, which is why max_posts is limited to 1000.
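The truncation and delay behavior in these notes can be sketched as follows; the character limits are the documented ones, the jitter bounds match the 2–4 second delay described in How It Works, and the function names are illustrative:

```python
import random
import time

SELFTEXT_LIMIT = 2000      # documented cap for post selftext
COMMENT_BODY_LIMIT = 3000  # documented cap for comment body


def truncate(text, limit):
    """Cut long text at the documented limit, marking the cut with an ellipsis."""
    if len(text) <= limit:
        return text
    return text[:limit - 1] + "…"


def polite_sleep(low=2.0, high=4.0):
    """Random delay between requests so Reddit's public endpoints don't block us."""
    time.sleep(random.uniform(low, high))
```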

💡 Troubleshooting

  • If you see no results, verify the subreddit exists and is spelled correctly (without the r/ prefix).
  • If requests start failing mid-run, Reddit may be temporarily rate limiting; rerun later or lower max_posts.
  • For comment scraping, use smaller max_posts values to keep runs fast and reliable.
  • For private or restricted communities, the Actor may return partial or no data.