Reddit Posts & Comments Scrape

Unlock the power of Reddit’s vast community discussions with this high-performance data extraction tool. Whether you are tracking trends, performing sentiment analysis, or generating niche leads, the Reddit Posts & Comments Scraper provides clean, structured data in seconds.

Pricing: $7.99/month + usage
Rating: 0.0 (0 reviews)
Developer: Scrape Pilot (Maintained by Community)

Actor stats: 0 bookmarked · 2 total users · 1 monthly active user · last modified 2 days ago

🚀 Reddit Posts & Comments Scraper



📖 About

The Reddit Posts & Comments Scraper is a professional-grade tool designed to efficiently extract public posts, comments, and metadata from Reddit subreddits. Whether you're conducting market research, performing sentiment analysis, or building data-driven applications, it provides reliable, structured data extraction.

This tool is built with scalability and compliance in mind, respecting Reddit's API guidelines while delivering high-performance data extraction for developers, researchers, and businesses.


✨ Features

| Feature | Description |
|---|---|
| 🎯 Targeted Scraping | Extract posts from specific subreddits with custom filters |
| 💬 Comment Extraction | Optional comment scraping for deeper insights |
| 🔒 Proxy Support | Residential & datacenter proxy configuration included |
| 📊 Rich Metadata | Get scores, upvote ratios, authors, flairs, and more |
| 🔄 Multiple Sort Options | Sort by hot, new, top, rising, and controversial |
| ⏱️ Time Filtering | Filter posts by hour, day, week, month, year, or all time |
| 📁 Multiple Formats | Export data in JSON, CSV, or XML formats |
| 🚀 High Performance | Optimized for large-scale data extraction |
| 🛡️ Rate Limiting | Built-in rate limiting to avoid IP bans |
| 📝 Detailed Logging | Comprehensive logging for debugging and monitoring |

📦 Installation

Prerequisites

  • Python 3.8 or higher
  • pip (Python package manager)
  • Reddit API credentials (optional but recommended)

Step-by-Step Installation

# 1. Clone the repository
git clone https://github.com/yourusername/reddit-scraper.git
cd reddit-scraper
# 2. Create a virtual environment (recommended)
python -m venv venv
# 3. Activate the virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate
# 4. Install dependencies
pip install -r requirements.txt
# 5. Verify installation
python reddit_scraper.py --version

Requirements File (requirements.txt)

requests>=2.28.0
praw>=7.7.0
pandas>=1.5.0
beautifulsoup4>=4.11.0
lxml>=4.9.0
python-dotenv>=1.0.0
apify-client>=1.0.0

⚡ Quick Start

Basic Usage

from reddit_scraper import RedditScraper

# Initialize the scraper
scraper = RedditScraper()

# Define your configuration
config = {
    "include_comments": False,
    "subreddit": "technology",
    "sort": "hot",
    "time_filter": "all",
    "max_results": 25
}

# Run the scraper
results = scraper.scrape(config)

# Export to JSON
scraper.export_to_json(results, "output.json")

# Export to CSV
scraper.export_to_csv(results, "output.csv")

Command Line Usage

# Basic scrape
python reddit_scraper.py --subreddit technology --max-results 25
# With comments
python reddit_scraper.py --subreddit technology --include-comments --max-results 50
# With custom sort and time filter
python reddit_scraper.py --subreddit technology --sort top --time-filter week --max-results 100
# With proxy configuration
python reddit_scraper.py --subreddit technology --use-proxy --proxy-group RESIDENTIAL
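For orientation, here is a sketch of how the flags shown above could map onto the scraper's config dict using `argparse`. This is an illustrative assumption, not the actor's actual CLI code, and the flag set is limited to the documented options:

```python
import argparse

def build_config(argv=None):
    """Sketch: translate the documented CLI flags into a config dict."""
    parser = argparse.ArgumentParser(description="Reddit Posts & Comments Scraper")
    parser.add_argument("--subreddit", required=True, help="Target subreddit name")
    parser.add_argument("--sort", default="hot",
                        choices=["hot", "new", "top", "rising", "controversial"])
    parser.add_argument("--time-filter", default="all",
                        choices=["hour", "day", "week", "month", "year", "all"])
    parser.add_argument("--max-results", type=int, default=25)
    parser.add_argument("--include-comments", action="store_true")
    args = parser.parse_args(argv)
    return {
        "subreddit": args.subreddit,
        "sort": args.sort,
        "time_filter": args.time_filter,
        "max_results": args.max_results,
        "include_comments": args.include_comments,
    }

# Mirrors the "custom sort and time filter" invocation above
config = build_config(["--subreddit", "technology", "--sort", "top",
                       "--time-filter", "week", "--max-results", "100"])
```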

⚙️ Configuration

Input Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| subreddit | string | ✅ Yes | - | Target subreddit name (e.g., "technology") |
| include_comments | boolean | ❌ No | false | Whether to scrape comments for each post |
| sort | string | ❌ No | "hot" | Sort order: hot, new, top, rising, controversial |
| time_filter | string | ❌ No | "all" | Time range: hour, day, week, month, year, all |
| max_results | integer | ❌ No | 25 | Maximum number of posts to scrape (1-1000) |
| proxyConfiguration.useApifyProxy | boolean | ❌ No | false | Enable Apify proxy service |
| proxyConfiguration.apifyProxyGroups | array | ❌ No | [] | Proxy groups: RESIDENTIAL, DATACENTER |
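The constraints in this table can be checked up front before a run. The snippet below is a stand-alone sketch of the checks the table implies, not the actor's own `validate_config` implementation:

```python
VALID_SORTS = {"hot", "new", "top", "rising", "controversial"}
VALID_TIME_FILTERS = {"hour", "day", "week", "month", "year", "all"}

def validate_config(config):
    """Sketch: enforce the documented parameter constraints."""
    if not isinstance(config.get("subreddit"), str) or not config["subreddit"]:
        return False  # subreddit is the only required field
    if config.get("sort", "hot") not in VALID_SORTS:
        return False
    if config.get("time_filter", "all") not in VALID_TIME_FILTERS:
        return False
    max_results = config.get("max_results", 25)
    if not isinstance(max_results, int) or not 1 <= max_results <= 1000:
        return False  # documented range is 1-1000
    return True

assert validate_config({"subreddit": "technology"})                      # defaults pass
assert not validate_config({"subreddit": "technology", "sort": "best"})  # invalid sort
```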

Example Configuration

{
  "include_comments": false,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  },
  "subreddit": "technology",
  "sort": "hot",
  "time_filter": "all",
  "max_results": 25
}

📥 Input/Output Format

Input Example

{
  "include_comments": false,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  },
  "subreddit": "technology",
  "sort": "hot",
  "time_filter": "all",
  "max_results": 25
}

Output Example

[
  {
    "post_id": "1rt52qa",
    "title": "Meta planning sweeping layoffs as AI costs mount",
    "text": null,
    "score": 4506,
    "upvote_ratio": 0.97,
    "url": "https://www.reuters.com/business/world-at-work/meta-planning-sweeping-layoffs-ai-costs-mount-2026-03-14/",
    "permalink": "https://www.reddit.com/r/technology/comments/1rt52qa/meta_planning_sweeping_layoffs_as_ai_costs_mount/",
    "author": "joe4942",
    "subreddit": "technology",
    "flair": "Business",
    "num_comments": 569,
    "awards": 0,
    "is_video": false,
    "domain": "reuters.com",
    "thumbnail": "https://external-preview.redd.it/...",
    "created_at": "1773448769"
  }
]

Output Fields Description

| Field | Type | Description |
|---|---|---|
| post_id | string | Unique Reddit post identifier |
| title | string | Post title |
| text | string/null | Self-post text content (null for link posts) |
| score | integer | Total upvotes minus downvotes |
| upvote_ratio | float | Percentage of upvotes (0.0 - 1.0) |
| url | string | Original link URL (for link posts) |
| permalink | string | Reddit post permalink |
| author | string | Post author username |
| subreddit | string | Subreddit name |
| flair | string/null | Post flair text |
| num_comments | integer | Number of comments on the post |
| awards | integer | Total awards received |
| is_video | boolean | Whether the post is a video |
| domain | string/null | Domain of the linked content |
| thumbnail | string/null | Thumbnail image URL |
| created_at | string | Unix timestamp of post creation |
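Because `created_at` is a Unix timestamp delivered as a string, convert it before doing any date arithmetic. A minimal example using the timestamp from the output above:

```python
from datetime import datetime, timezone

# created_at arrives as a string; parse it into an aware UTC datetime.
post = {"created_at": "1773448769"}
created = datetime.fromtimestamp(int(post["created_at"]), tz=timezone.utc)
print(created.isoformat())  # → 2026-03-14T00:39:29+00:00
```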

🔌 API Reference

Class: RedditScraper

Constructor

scraper = RedditScraper(api_credentials=None, rate_limit=True)

| Parameter | Type | Default | Description |
|---|---|---|---|
| api_credentials | dict | None | Reddit API credentials (client_id, client_secret) |
| rate_limit | boolean | True | Enable automatic rate limiting |

Methods

| Method | Parameters | Returns | Description |
|---|---|---|---|
| scrape(config) | config: dict | list | Main scraping method |
| export_to_json(data, filename) | data: list, filename: str | bool | Export data to JSON file |
| export_to_csv(data, filename) | data: list, filename: str | bool | Export data to CSV file |
| export_to_xml(data, filename) | data: list, filename: str | bool | Export data to XML file |
| validate_config(config) | config: dict | bool | Validate configuration parameters |
| get_subreddit_info(name) | name: str | dict | Get subreddit metadata |

💡 Examples

Example 1: Scrape Top Posts from r/technology

config = {
    "subreddit": "technology",
    "sort": "top",
    "time_filter": "week",
    "max_results": 50
}
results = scraper.scrape(config)
print(f"Scraped {len(results)} posts")

Example 2: Scrape with Comments

config = {
    "subreddit": "programming",
    "include_comments": True,
    "sort": "hot",
    "max_results": 10
}
results = scraper.scrape(config)
for post in results:
    print(f"Post: {post['title']}")
    print(f"Comments: {len(post.get('comments', []))}")

Example 3: Multiple Subreddits

subreddits = ["technology", "programming", "artificial"]
for subreddit in subreddits:
    config = {
        "subreddit": subreddit,
        "max_results": 25
    }
    results = scraper.scrape(config)
    scraper.export_to_json(results, f"{subreddit}_posts.json")
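When combining results from several runs or subreddits, the same post can show up more than once; deduplicating on `post_id` avoids double-counting. A stdlib-only sketch (the sample data here is made up for illustration):

```python
def merge_results(batches):
    """Merge several result lists, keeping the first copy of each post_id."""
    seen, merged = set(), []
    for batch in batches:
        for post in batch:
            if post["post_id"] not in seen:
                seen.add(post["post_id"])
                merged.append(post)
    return merged

# Hypothetical batches, e.g. loaded from the per-subreddit JSON files above
batches = [
    [{"post_id": "a1", "title": "first"}, {"post_id": "b2", "title": "second"}],
    [{"post_id": "b2", "title": "second"}, {"post_id": "c3", "title": "third"}],
]
combined = merge_results(batches)
print(len(combined))  # → 3
```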

Example 4: With Proxy Configuration

config = {
    "subreddit": "technology",
    "proxyConfiguration": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"]
    },
    "max_results": 100
}
results = scraper.scrape(config)

🔐 Proxy Support

This scraper supports advanced proxy configurations to avoid rate limiting and IP bans.

Supported Proxy Types

| Proxy Type | Description | Best For |
|---|---|---|
| RESIDENTIAL | Real user IP addresses | High-volume scraping |
| DATACENTER | Datacenter IP addresses | Fast, cost-effective scraping |

Proxy Configuration

{
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"],
    "apifyProxyCountry": "US"
  }
}

Environment Variables

# .env file
APIFY_API_TOKEN=your_apify_token_here
PROXY_ENABLED=true
PROXY_GROUP=RESIDENTIAL
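`python-dotenv` (already listed in requirements.txt) is the usual way to load this file via `dotenv.load_dotenv()`. Purely for illustration of the expected `KEY=value` format, here is a minimal stdlib-only loader; note that unlike real dotenv it overwrites keys that are already set:

```python
import os
import tempfile

def load_env(path=".env"):
    """Sketch of what dotenv.load_dotenv() does for simple KEY=value files."""
    with open(path) as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks, comments, and malformed lines
            key, _, value = line.partition("=")
            os.environ[key.strip()] = value.strip()

# Demo against a throwaway file mirroring the .env shown above
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as fh:
    fh.write("# .env file\nPROXY_ENABLED=true\nPROXY_GROUP=RESIDENTIAL\n")
load_env(fh.name)
print(os.environ["PROXY_GROUP"])  # → RESIDENTIAL
```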

⏱️ Rate Limiting

To ensure responsible usage and avoid bans, the scraper includes built-in rate limiting:

| Action | Rate Limit | Recommendation |
|---|---|---|
| API Requests | 60/minute | Use proxy for higher limits |
| Post Scraping | 100/minute | Enable delays between requests |
| Comment Scraping | 50/minute | Use residential proxies |

Rate Limit Configuration

scraper = RedditScraper(
    rate_limit=True,
    rate_limit_delay=1.0,  # seconds between requests
    max_retries=3
)
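The `rate_limit_delay` / `max_retries` semantics boil down to sleeping between requests and retrying transient failures with backoff. The following is a sketch of that pattern with a simulated flaky call, not the actor's internal implementation:

```python
import time

def call_with_rate_limit(fn, rate_limit_delay=1.0, max_retries=3):
    """Sketch: pause between requests, retry transient errors with backoff."""
    for attempt in range(max_retries + 1):
        try:
            result = fn()
            time.sleep(rate_limit_delay)  # pause before the next request
            return result
        except ConnectionError:
            if attempt == max_retries:
                raise  # retries exhausted, surface the error
            time.sleep(rate_limit_delay * (2 ** attempt))  # exponential backoff

# Simulated request that fails twice before succeeding
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated 429")
    return "ok"

result = call_with_rate_limit(flaky, rate_limit_delay=0.01)
print(result)  # → ok
```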

🛠️ Troubleshooting

Common Issues

| Issue | Solution |
|---|---|
| 429 Too Many Requests | Enable proxy, increase delay between requests |
| 403 Forbidden | Check subreddit privacy settings, use API credentials |
| Empty Results | Verify subreddit name, check sort/time_filter values |
| Connection Timeout | Enable proxy, check network connection |
| Invalid JSON Output | Validate input configuration format |

Debug Mode

# Enable verbose logging
python reddit_scraper.py --subreddit technology --debug
# Check API status
python reddit_scraper.py --status-check

Log Files

Logs are saved in ./logs/scraper.log by default. Configure log level:

import logging
logging.basicConfig(level=logging.DEBUG)
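To write to the documented `./logs/scraper.log` location rather than the console, attach a file handler. A minimal sketch (the logger name and format here are assumptions, not the actor's actual setup):

```python
import logging
import os

# Sketch: route log records to the documented ./logs/scraper.log path
os.makedirs("logs", exist_ok=True)
handler = logging.FileHandler("logs/scraper.log")
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

logger = logging.getLogger("reddit_scraper")
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)
logger.debug("scrape started")
```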

🤝 Contributing

We welcome contributions! Here's how you can help:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development Setup

# Clone your fork
git clone https://github.com/yourusername/reddit-scraper.git
# Install dev dependencies
pip install -r requirements-dev.txt
# Run tests
pytest tests/
# Run linting
flake8 .
black .

Code Style

  • Follow PEP 8 guidelines
  • Add docstrings for all functions
  • Write unit tests for new features
  • Update documentation for changes

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

MIT License
Copyright (c) 2026 Reddit Scraper
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

❓ FAQ

Q: Is it legal to scrape Reddit?

A: Yes, this tool only scrapes publicly available data and respects Reddit's API terms of service. Always use responsibly and comply with Reddit's guidelines.

Q: Do I need Reddit API credentials?

A: Not required, but recommended for higher rate limits and better reliability. You can get free API credentials at Reddit Apps.

Q: Can I scrape private subreddits?

A: No, this scraper only works with public subreddits. Private subreddits require authentication and are not supported.

Q: What's the maximum number of posts I can scrape?

A: Technically unlimited, but we recommend staying under 1000 posts per run to avoid rate limiting. Use pagination for larger datasets.

Q: Does this work with Reddit's new API changes?

A: Yes, this tool is updated regularly to comply with Reddit's API changes. Check the releases page for the latest version.

Q: Can I scrape comments recursively?

A: Yes, enable include_comments: true in your configuration. Note that this increases API calls significantly.

Q: How do I report bugs or request features?

A: Please open an issue on our GitHub Issues page with detailed information.


📞 Support


🙏 Acknowledgments

  • Reddit API - For providing the data access
  • PRAW - Python Reddit API Wrapper
  • Apify - Proxy services
  • All contributors and supporters

