Reddit Scraper — Posts, Comments & Subreddit Data

Pricing

Pay per usage

Extract Reddit posts, comments, and subreddit data via public API. Scrape titles, scores, authors, comment threads, dates, and flairs. Sort by hot, new, top, or rising. Perfect for market research, sentiment analysis, and content monitoring. No login required.

Developer

Ricardo Akiyoshi

Maintained by Community

Reddit Thread & Comment Scraper

Scrape Reddit posts, comments, and subreddit data at scale using Reddit's public JSON API. No API keys or authentication required.

What does it do?

This actor extracts structured data from any public subreddit, including:

  • Post data: title, score, author, comment count, flair, URL, self-text, creation date
  • Comment threads: author, body, score, depth, creation date
  • Sorting: hot, new, top (with time filters), rising
  • Search: filter posts by keyword within any subreddit
  • Pagination: automatically follows Reddit's pagination to collect large datasets
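As a rough illustration of how sorting, search, and pagination can map onto Reddit's public JSON endpoints, the sketch below builds listing URLs. It is not the actor's actual code: `build_listing_url` and its parameter names are hypothetical, and the query parameters (`limit`, `t`, `after`, `q`, `restrict_sr`) are assumed to follow Reddit's usual listing conventions.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://www.reddit.com"  # public JSON endpoints, no API key


def build_listing_url(
    subreddit: str,
    sort: str = "hot",
    time_filter: Optional[str] = None,
    search_query: Optional[str] = None,
    after: Optional[str] = None,
    limit: int = 100,
) -> str:
    """Build one page URL for a subreddit listing or an in-subreddit search."""
    params = {"limit": limit}
    if search_query:
        # restrict_sr=1 keeps the search inside the given subreddit.
        path = f"/r/{subreddit}/search.json"
        params.update({"q": search_query, "restrict_sr": 1, "sort": sort})
    else:
        path = f"/r/{subreddit}/{sort}.json"
    # The time range only matters for "top" listings and for searches.
    if time_filter and (search_query or sort == "top"):
        params["t"] = time_filter
    # "after" is the cursor returned by the previous page, e.g. "t3_abc123".
    if after:
        params["after"] = after
    return f"{BASE_URL}{path}?{urlencode(params)}"


print(build_listing_url("machinelearning", "top", "month"))
print(build_listing_url("webdev", "top", "year", search_query="React vs Vue"))
```

Under these assumptions, pagination is a loop: request a page, read the `after` cursor from the response, and pass it to the next call until the cursor is empty or the post limit is reached.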

Use Cases

Market Research

Analyze what people are discussing in your industry. Track trending topics, pain points, and product feedback across relevant subreddits.

Sentiment Analysis

Collect posts and comments about your brand, product, or competitors. Feed the structured data into NLP pipelines for sentiment scoring.

Brand Monitoring

Monitor mentions of your brand or product keywords. Track sentiment shifts over time by scraping regularly.

Competitive Intelligence

See what users say about competing products. Identify feature gaps, complaints, and switching triggers.

Content Research

Find trending topics and popular content formats in your niche. Analyze what gets high engagement.

Academic Research

Collect large-scale social media datasets for research purposes with structured, clean output.

Input

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| subreddit | string | programming | Subreddit name (without r/ prefix) |
| searchQuery | string | - | Optional keyword search within the subreddit |
| sort | enum | hot | Sort order: hot, new, top, rising |
| timeFilter | enum | week | Time range for top/search: hour, day, week, month, year, all |
| maxPosts | integer | 50 | Maximum posts to scrape (1-1000) |
| includeComments | boolean | false | Whether to fetch comments for each post |
| maxCommentsPerPost | integer | 10 | Max comments per post (1-500) |
| proxy | object | - | Optional proxy configuration |
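To catch invalid values before starting a run, the input can be checked locally against the documented defaults and ranges. The following is a minimal sketch: the `ScraperInput` class is a hypothetical helper, not part of the actor, and it leaves out the `proxy` field because its structure is not described here.

```python
from dataclasses import dataclass
from typing import Optional

SORTS = ("hot", "new", "top", "rising")
TIME_FILTERS = ("hour", "day", "week", "month", "year", "all")


@dataclass
class ScraperInput:
    """Mirrors the documented input fields and their defaults."""

    subreddit: str = "programming"
    searchQuery: Optional[str] = None
    sort: str = "hot"
    timeFilter: str = "week"
    maxPosts: int = 50
    includeComments: bool = False
    maxCommentsPerPost: int = 10

    def __post_init__(self) -> None:
        # Assumption: tolerate an accidental "r/" prefix instead of rejecting it.
        if self.subreddit.lower().startswith("r/"):
            self.subreddit = self.subreddit[2:]
        if not self.subreddit:
            raise ValueError("subreddit must not be empty")
        if self.sort not in SORTS:
            raise ValueError(f"sort must be one of {SORTS}")
        if self.timeFilter not in TIME_FILTERS:
            raise ValueError(f"timeFilter must be one of {TIME_FILTERS}")
        if not 1 <= self.maxPosts <= 1000:
            raise ValueError("maxPosts must be between 1 and 1000")
        if not 1 <= self.maxCommentsPerPost <= 500:
            raise ValueError("maxCommentsPerPost must be between 1 and 500")


run_input = ScraperInput(subreddit="r/webdev", sort="top", timeFilter="year")
print(run_input)
```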

Example Input

{
  "subreddit": "machinelearning",
  "sort": "top",
  "timeFilter": "month",
  "maxPosts": 100,
  "includeComments": true,
  "maxCommentsPerPost": 20
}

Search Example

{
  "subreddit": "webdev",
  "searchQuery": "React vs Vue",
  "sort": "top",
  "timeFilter": "year",
  "maxPosts": 50,
  "includeComments": true,
  "maxCommentsPerPost": 15
}

Output

Each post is saved as a structured JSON object:

{
  "title": "GPT-5 just dropped and it's incredible",
  "score": 4523,
  "upvoteRatio": 0.94,
  "numComments": 892,
  "author": "ai_researcher_42",
  "subreddit": "machinelearning",
  "url": "https://openai.com/blog/gpt-5",
  "selfText": "",
  "permalink": "/r/machinelearning/comments/abc123/gpt5_just_dropped/",
  "fullUrl": "https://www.reddit.com/r/machinelearning/comments/abc123/gpt5_just_dropped/",
  "createdAt": "2026-02-15T14:32:00.000Z",
  "createdUtc": 1771165920,
  "flair": "Discussion",
  "isNSFW": false,
  "isSelf": false,
  "domain": "openai.com",
  "postId": "abc123",
  "comments": [
    {
      "author": "deep_learning_fan",
      "body": "The reasoning capabilities are genuinely impressive...",
      "score": 1205,
      "createdAt": "2026-02-15T14:45:00.000Z",
      "createdUtc": 1771166700,
      "depth": 0
    },
    {
      "author": "skeptical_dev",
      "body": "Has anyone actually benchmarked this vs Claude?",
      "score": 834,
      "createdAt": "2026-02-15T15:02:00.000Z",
      "createdUtc": 1771167720,
      "depth": 1
    }
  ],
  "scrapedAt": "2026-02-28T10:00:00.000Z"
}
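For downstream analysis (for example, feeding comments into a sentiment pipeline), it is often convenient to flatten each post into one row per comment. A minimal sketch follows, assuming the field names shown in the output above; `flatten_comments` is a hypothetical helper, and it assumes `comments` may be missing when `includeComments` is false.

```python
from typing import Any, Dict, List


def flatten_comments(post: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return one flat row per comment, tagged with its parent post."""
    rows = []
    # Assumption: "comments" can be absent or null when comments were not fetched.
    for comment in post.get("comments") or []:
        rows.append(
            {
                "postId": post["postId"],
                "postTitle": post["title"],
                "author": comment["author"],
                "body": comment["body"],
                "score": comment["score"],
                "depth": comment["depth"],
                "createdAt": comment["createdAt"],
            }
        )
    return rows


# A trimmed-down post using the same field names as the output example.
sample_post = {
    "postId": "abc123",
    "title": "GPT-5 just dropped and it's incredible",
    "comments": [
        {"author": "deep_learning_fan", "body": "The reasoning capabilities are genuinely impressive...",
         "score": 1205, "depth": 0, "createdAt": "2026-02-15T14:45:00.000Z"},
        {"author": "skeptical_dev", "body": "Has anyone actually benchmarked this vs Claude?",
         "score": 834, "depth": 1, "createdAt": "2026-02-15T15:02:00.000Z"},
    ],
}

for row in flatten_comments(sample_post):
    print(row["postId"], row["depth"], row["author"], row["score"])
```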

Rate Limiting

This actor respects Reddit's rate limits:

  • ~60 requests per minute without authentication
  • Automatic delays between requests (1-2 seconds)
  • Exponential backoff on 429 (Too Many Requests) responses
  • Proxy support for higher throughput on large scrapes
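The retry behaviour described above can be sketched as follows. This is an illustration of exponential backoff in general, not the actor's implementation: `fetch_with_backoff`, the `fetch` callable, the retry count, and the base delay are all assumptions chosen for the example.

```python
import random
import time
from typing import Callable, Tuple


def fetch_with_backoff(
    fetch: Callable[[str], Tuple[int, str]],
    url: str,
    max_retries: int = 5,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch(url) -> (status, body), retrying on HTTP 429.

    The wait doubles after each 429 (base_delay, 2x, 4x, ...) with up to
    one second of random jitter so parallel runs do not retry in lockstep.
    """
    for attempt in range(max_retries + 1):
        status, body = fetch(url)
        if status == 200:
            return body
        if status != 429:
            raise RuntimeError(f"unexpected status {status} for {url}")
        if attempt == max_retries:
            break  # retries exhausted, do not sleep again
        sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
    raise RuntimeError(f"still rate limited after {max_retries} retries: {url}")


# Demo with a fake fetcher: two 429 responses, then success.
responses = iter([(429, ""), (429, ""), (200, '{"ok": true}')])
waits = []
body = fetch_with_backoff(lambda _url: next(responses), "https://example.invalid/page", sleep=waits.append)
print(body, [round(w) for w in waits])
```

Passing `fetch` and `sleep` in as parameters keeps the sketch independent of any particular HTTP library and makes the retry logic easy to test without real waiting.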

Pay Per Event Pricing

This actor uses Apify's Pay Per Event model. You are charged per post scraped:

| Event | Price |
| --- | --- |
| Post scraped (without comments) | $0.001 |
| Post scraped (with comments) | $0.003 |
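As a worked example of the per-post prices, the sketch below estimates the event charges for a run. It assumes that only the two events listed here are billed and ignores any other platform costs; `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Per-post event prices in USD, as listed in the pricing table.
PRICE_POST = Decimal("0.001")
PRICE_POST_WITH_COMMENTS = Decimal("0.003")


def estimate_cost(posts: int, include_comments: bool) -> Decimal:
    """Estimate event charges in USD for a run that scrapes `posts` posts."""
    if posts < 0:
        raise ValueError("posts must not be negative")
    unit_price = PRICE_POST_WITH_COMMENTS if include_comments else PRICE_POST
    return unit_price * posts


# 100 posts with comments: 100 x $0.003
print(estimate_cost(100, include_comments=True))
# 1000 posts without comments: 1000 x $0.001
print(estimate_cost(1000, include_comments=False))
```

Under these assumptions, a run of 100 posts with comments comes to $0.30, and the 1000-post maximum without comments comes to $1.00.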

Limitations

  • Only public subreddits can be scraped
  • Reddit's JSON API returns a maximum of ~1000 posts per listing
  • Comments are limited to what Reddit returns in the default comment sort (best)
  • Very large comment threads may be truncated by Reddit's API
  • Rate limiting may slow down large scrapes (use a proxy for best results)

Changelog

1.0.0 (2026-02-28)

  • Initial release
  • Subreddit post scraping with sort and time filters
  • Comment extraction with depth tracking
  • Search within subreddits
  • Pagination support
  • Rate limiting with exponential backoff
  • Pay Per Event billing