🔥 Reddit Scraper for Apify

A comprehensive Reddit scraper that extracts posts, comments, user profiles, and trending content from Reddit. Perfect for social media analysis, content research, and data mining.

🚀 Quick Start: See ./QUICK-START.md for immediate setup with examples

✅ OAuth Ready: Authentication tested and working! See ./AUTHENTICATION.md

✨ Features

  • ๐Ÿ“ Subreddit Post Scraper - Extract posts from any subreddit with sorting options
  • ๐Ÿ’ฌ Comment Thread Extractor - Get all comments from specific posts (nested structure preserved)
  • ๐Ÿ‘ค User Profile Scraper - Extract karma, post history, and user information
  • ๐Ÿ”ฅ Trending Posts Tracker - Scrape viral content from r/popular or r/all
  • ๐Ÿ” OAuth Authentication - Optional Reddit API authentication for higher rate limits
  • โšก Smart Rate Limiting - Built-in delays to respect Reddit's API limits
  • ๐Ÿ“Š Flexible Export - Save data to Apify dataset (CSV, JSON, Excel)

🚀 Quick Start

Authentication Options

No API Key Required! The scraper works out of the box using Reddit's public JSON API.

Optional: Use Reddit OAuth for higher rate limits and better reliability.

  • 📖 See ./AUTHENTICATION.md for details
  • Already have credentials? Just set useAuthentication: true in your input!
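For example, an input that enables OAuth might look like the following (the client ID, secret, and user agent values are placeholders; the field names come from the input parameters listed below):

```json
{
  "scrapeMode": "subreddit",
  "subreddit": "programming",
  "useAuthentication": true,
  "redditClientId": "YOUR_CLIENT_ID",
  "redditClientSecret": "YOUR_CLIENT_SECRET",
  "redditUserAgent": "my-app/1.0"
}
```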

Mode 1: Subreddit Post Scraper

Scrape posts from any subreddit:

{
  "scrapeMode": "subreddit",
  "subreddit": "programming",
  "sortBy": "hot",
  "maxPosts": 100,
  "includeComments": false
}

Options:

  • sortBy: hot, new, top, rising
  • timeFilter: hour, day, week, month, year, all (for "top" sort)
  • includeComments: Set to true to fetch comments for each post

Example Output:

{
  "type": "post",
  "id": "abc123",
  "title": "Learn Python in 2024",
  "author": "johndoe",
  "subreddit": "programming",
  "score": 1234,
  "num_comments": 45,
  "url": "https://www.reddit.com/r/programming/...",
  "created_utc": "2024-01-15T10:30:00.000Z",
  "selftext": "Here's my guide...",
  "upvote_ratio": 0.95
}

Mode 2: Comment Thread Extractor

Extract all comments from a specific post:

{
  "scrapeMode": "comments",
  "postUrl": "https://www.reddit.com/r/programming/comments/abc123/post_title/",
  "maxComments": 200
}

Example Output:

{
  "type": "comment",
  "id": "xyz789",
  "author": "commenter1",
  "body": "Great post! Here's my thoughts...",
  "score": 56,
  "depth": 0,
  "created_utc": "2024-01-15T11:00:00.000Z",
  "permalink": "https://www.reddit.com/...",
  "is_submitter": false
}
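Because each comment record carries `depth` and `parent_id`, a flat dataset export can be rebuilt into a nested thread downstream. A minimal sketch (the `buildTree` helper is illustrative, not part of the actor; the `t1_`/`t3_` prefixes are Reddit's standard ID prefixes for comments and posts):

```javascript
// Rebuild a nested comment tree from the flat comment records.
// parent_id is prefixed "t1_" (parent is a comment) or "t3_" (parent is the post).
function buildTree(comments) {
  const byId = new Map(comments.map(c => [c.id, { ...c, replies: [] }]));
  const roots = [];
  for (const node of byId.values()) {
    const parentId = (node.parent_id || '').replace(/^t\d_/, '');
    const parent = byId.get(parentId);
    if (parent) parent.replies.push(node); // nested reply
    else roots.push(node);                 // top-level comment (parent is the post)
  }
  return roots;
}
```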

Mode 3: User Profile Scraper

Get user karma, posts, and comment history:

{
  "scrapeMode": "user",
  "username": "spez",
  "maxPosts": 50
}

Example Output:

{
  "type": "user_profile",
  "username": "spez",
  "link_karma": 12345,
  "comment_karma": 67890,
  "total_karma": 80235,
  "created_utc": "2005-06-06T00:00:00.000Z",
  "is_gold": true,
  "is_employee": true
}

Mode 4: Trending Posts Tracker

Scrape viral content from r/popular:

{
  "scrapeMode": "trending",
  "sortBy": "hot",
  "maxPosts": 100
}

📋 Input Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| scrapeMode | String | ✅ | Mode: subreddit, comments, user, or trending |
| subreddit | String | For subreddit mode | Subreddit name (without r/) |
| username | String | For user mode | Reddit username (without u/) |
| postUrl | String | For comments mode | Full URL of Reddit post |
| sortBy | String | ❌ | Sort posts by: hot, new, top, rising |
| timeFilter | String | ❌ | Time filter for top posts |
| maxPosts | Integer | ❌ | Maximum posts to scrape (default: 100) |
| maxComments | Integer | ❌ | Maximum comments per post (default: 50) |
| includeComments | Boolean | ❌ | Include comments in subreddit scrape |
| useAuthentication | Boolean | ❌ | Use Reddit OAuth (higher rate limits) |
| redditClientId | String | For auth | Reddit app client ID |
| redditClientSecret | String | For auth | Reddit app client secret |
| redditUserAgent | String | ❌ | Custom user agent |

🎯 Use Cases

1. Market Research

{
  "scrapeMode": "subreddit",
  "subreddit": "startups",
  "sortBy": "top",
  "timeFilter": "week",
  "maxPosts": 200
}

2. Sentiment Analysis

{
  "scrapeMode": "comments",
  "postUrl": "https://www.reddit.com/r/technology/comments/...",
  "maxComments": 500
}

3. Influencer Analysis

{
  "scrapeMode": "user",
  "username": "popular_redditor",
  "maxPosts": 100
}

4. Content Discovery

{
  "scrapeMode": "trending",
  "sortBy": "hot",
  "maxPosts": 50
}

๐Ÿ› ๏ธ Local Development

Prerequisites

  • Node.js 18+
  • npm or yarn

Installation

# Install dependencies
npm install
# Quick test (no Apify needed)
npm test
# Run locally with Apify CLI
apify run
# Or with Node.js
npm start

Testing with Input

Create INPUT.json in the apify_storage/key_value_stores/default/ folder:

{
  "scrapeMode": "subreddit",
  "subreddit": "webdev",
  "sortBy": "hot",
  "maxPosts": 10
}

Then run:

npm start

🚀 Deployment

Quick Deploy to Apify

# Install Apify CLI
npm install -g apify-cli
# Login to Apify
apify login
# Deploy your actor
apify push

📖 Full deployment guide: See ./DEPLOYMENT.md for detailed instructions on:

  • Deploying via CLI, GitHub, or Web Interface
  • Running scheduled scrapes
  • Setting up webhooks and integrations
  • Monitoring and troubleshooting
  • Cost optimization tips

📊 Data Structure

Post Object

  • type: "post"
  • id: Reddit post ID
  • title: Post title
  • author: Username of author
  • subreddit: Subreddit name
  • score: Upvotes minus downvotes
  • num_comments: Number of comments
  • url: Reddit permalink
  • link_url: External link (if any)
  • selftext: Post body text
  • created_utc: ISO timestamp
  • upvote_ratio: Fraction of votes that are upvotes (0 to 1)
  • is_video: Boolean
  • over_18: NSFW flag
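The over_18 and score fields make it straightforward to filter exported posts downstream. A small sketch (the `filterPosts` helper and the threshold are illustrative, not part of the actor):

```javascript
// Keep only safe-for-work posts above a score threshold.
function filterPosts(posts, { minScore = 100, allowNsfw = false } = {}) {
  return posts.filter(p => (allowNsfw || !p.over_18) && p.score >= minScore);
}
```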

Comment Object

  • type: "comment"
  • id: Comment ID
  • author: Username
  • body: Comment text
  • score: Comment score
  • depth: Nesting level (0 = top-level)
  • created_utc: ISO timestamp
  • permalink: Direct link to comment
  • parent_id: Parent comment/post ID

User Profile Object

  • type: "user_profile"
  • username: Reddit username
  • link_karma: Post karma
  • comment_karma: Comment karma
  • total_karma: Combined karma
  • created_utc: Account creation date
  • is_gold: Reddit Gold status
  • is_mod: Moderator status

โš™๏ธ Configuration

Rate Limiting

The scraper includes built-in rate limiting (2 seconds between requests) to respect Reddit's guidelines. If rate limited (429 error), it automatically waits 10 seconds and retries.
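The retry behavior described above can be sketched roughly as follows (function and parameter names are illustrative, not the actor's actual internals):

```javascript
// Illustrative sketch of the 429 handling described above: retry after a delay.
async function requestWithRetry(doRequest, { retryDelayMs = 10000, maxRetries = 3 } = {}) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const res = await doRequest();
    if (res.status !== 429) return res; // not rate limited: return the response
    await new Promise(resolve => setTimeout(resolve, retryDelayMs)); // wait before retrying
  }
  throw new Error('Still rate limited after retries');
}
```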

Proxy Support

Use Apify's proxy configuration in the input:

{
  "scrapeMode": "subreddit",
  "subreddit": "example",
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}

🚨 Important Notes

  1. Reddit Terms of Service: Use responsibly and respect Reddit's ToS
  2. Rate Limits: Reddit enforces rate limits - the scraper handles this automatically
  3. Authentication: Works without credentials via Reddit's public JSON API; OAuth is optional for higher rate limits
  4. Data Volume: Large scrapes may take time due to rate limiting
  5. NSFW Content: Filter using the over_18 field if needed

📈 Performance

  • Speed: ~2-3 seconds per request (due to rate limiting)
  • Scale: Can scrape 1000+ posts in a single run
  • Memory: Efficient - streams data to dataset

๐Ÿค Contributing

Feel free to submit issues and enhancement requests!

📄 License

ISC License - feel free to use for personal or commercial projects.

💡 Tips & Tricks

Get Multiple Subreddits

Run the actor multiple times with different inputs, or modify the code to accept an array of subreddits.

Export to Google Sheets

Use Apify's integration to automatically export scraped data to Google Sheets.

Schedule Regular Scrapes

Set up a schedule in Apify to monitor subreddits daily/weekly for trending content.

Combine with Other Tools

  • Export to CSV for Excel analysis
  • Use with Python/Pandas for data science
  • Feed into AI/ML models for NLP tasks
  • Create dashboards with Tableau/Power BI

Happy Scraping! 🎉

For questions or support, please open an issue in the repository.