Reddit Scraper

Cheap, fast, and reliable. Bring your own proxies.

Pricing: from $1.00 / 1,000 results
Rating: 5.0 (1 review)
Developer: DaddyAPI (Maintained by Community)
Actor stats: 1 bookmark · 2 total users · 1 monthly active user · last modified 7 days ago


Reddit Scraper (Cheerio)

Fast, Robust, and Cost-Effective. Scrape Reddit posts from any subreddit with advanced sorting, smart pagination, and proxy flexibility.

Apify Actor · Node.js

🚀 Why this scraper?

Many Reddit scrapers are slow, heavy, or get blocked easily. This actor is designed for performance and stability.

  1. Lightweight: Uses Cheerio (raw HTTP) instead of heavy browsers, making it 10x cheaper to run.
  2. Smart Pagination: Uses Reddit's reverse-engineered internal pagination to fetch "more posts" efficiently.
  3. Proxy Freedom: Works with Datacenter proxies (cheap) and Residential proxies (reliable).
  4. Rich Data: Extracts detailed post metrics (upvotes, comments), media links, and text content.

Perfect for:

  • 📈 Trend Analysis: Monitor trending topics.
  • 📒 Sentiment Analysis: Analyze user discussions and opinions.
  • 🤖 AI Training: Gather diverse text datasets for LLMs.
  • 📒 Brand Monitoring: Track mentions of your brand across communities.

📖 How to Use

Option 1: Apify Console (No Coding)

  1. Go to the Input tab.
  2. Enter the Subreddit Name (e.g., technology, funny, dataisbeautiful).
  3. (Optional) Select Sort By (Hot, New, Top, Rising).
  4. Proxy Selection:
    • Use Datacenter (Default) for speed/cost.
    • Switch to Residential if you see "403 Forbidden" errors.
  5. Click Start.
  6. Download your data in JSON, CSV, or Excel.

Option 2: API (Developers)

You can trigger this actor programmatically via REST API, Python, or Node.js.

Input Payload (JSON)

{
  "subreddit": "technology",
  "sort": "hot",            // Options: "hot", "new", "top", "rising"
  "maxRequestsPerCrawl": 1, // 1 request ≈ 1 page. The initial page is small (~3 posts);
                            // each further batch adds ~25, so 2 ≈ 28 posts, 10 ≈ 200+.
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"] // Optional
  }
}
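
The payload above can also be POSTed straight to Apify's REST API without any client library. A minimal sketch using Python's requests and the run-sync-get-dataset-items endpoint (part of Apify's public v2 API; it waits for the run to finish and returns the dataset items directly). The token is a placeholder:

# Sketch: trigger the actor over the REST API and get the items back in one call.
import requests

APIFY_TOKEN = "YOUR_APIFY_TOKEN"              # placeholder, use your own token
ACTOR_ID = "daddyapi~reddit-cheerio-scraper"  # "~" replaces "/" in URL paths

url = f"https://api.apify.com/v2/acts/{ACTOR_ID}/run-sync-get-dataset-items"
payload = {
    "subreddit": "technology",
    "sort": "hot",
    "maxRequestsPerCrawl": 1,
    "proxyConfiguration": {"useApifyProxy": True},
}

resp = requests.post(url, params={"token": APIFY_TOKEN}, json=payload, timeout=300)
resp.raise_for_status()
posts = resp.json()  # list of post objects
print(f"Fetched {len(posts)} posts")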

🐍 Python Example (Simple & Clean)

This script runs the scraper and saves the results to a local file.

import json

from apify_client import ApifyClient

# 1. Configuration
APIFY_TOKEN = 'YOUR_APIFY_TOKEN'
ACTOR_ID = 'daddyapi/reddit-cheerio-scraper'
client = ApifyClient(APIFY_TOKEN)

# 2. Define input
run_input = {
    "subreddit": "artificial",
    "sort": "top",
    "maxRequestsPerCrawl": 1,  # 1 request ≈ the small initial page; raise it for more posts
    "proxyConfiguration": {
        "useApifyProxy": True,
        # Uncomment below to use Residential proxies if Datacenter gets blocked
        # "apifyProxyGroups": ["RESIDENTIAL"],
    },
}

print(f"🚀 Starting scraper for r/{run_input['subreddit']}...")

# 3. Run the actor and wait for it to finish
run = client.actor(ACTOR_ID).call(run_input=run_input)
if not run:
    print("❌ Failed to start run.")
    exit(1)
print(f"✅ Run finished! Status: {run['status']}")

# 4. Fetch & save results
dataset_client = client.dataset(run["defaultDatasetId"])
items = dataset_client.list_items().items

filename = "reddit_data.json"
with open(filename, "w", encoding="utf-8") as f:
    json.dump(items, f, indent=2, ensure_ascii=False)
print(f"💾 Saved {len(items)} posts to {filename}")

🔒 Proxy Configuration (Bring Your Own Proxies)

This actor is fully compatible with Apify Proxy (Datacenter & Residential) and Custom Proxies.

1. Datacenter Proxies (Cost-Effective)

Great for high-volume users who want to control costs. Note that Reddit sometimes blocks these.

{
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}

2. Residential Proxies (Best Reliability)

Recommended. Residential proxies are harder to block and provide the highest success rate. Use this if you are getting empty results or 403 errors.

{
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}

3. Bring Your Own Proxies (Custom URLs)

If you have proxies from an external provider (Webshare, BrightData, Smartproxy, etc.), you can pass the connection strings directly.

{
  "proxyConfiguration": {
    "useApifyProxy": false,
    "proxyUrls": [
      "http://username:password@my-proxy.example.com:8000",
      "http://username:password@my-proxy-2.example.com:8000"
    ]
  }
}
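
These proxy objects plug straight into the run input used by the Python example above. A brief sketch reusing client and ACTOR_ID from that script (the proxy URL is a placeholder):

# Sketch: pass your own proxy URLs through the run input (placeholder URL shown).
run_input = {
    "subreddit": "technology",
    "sort": "new",
    "maxRequestsPerCrawl": 2,
    "proxyConfiguration": {
        "useApifyProxy": False,
        "proxyUrls": [
            "http://username:password@my-proxy.example.com:8000",
        ],
    },
}
run = client.actor(ACTOR_ID).call(run_input=run_input)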

📊 Data Output

The scraper returns structured data for every post:

{
  "post_kind": "t3",
  "author": "TechEnthusiast",
  "author_id": "t2_8a7b3c",
  "time_posted": "2023-10-27T10:00:00.000Z",
  "title": "The Future of AI in 2026",
  "body_text": "Here is a deep dive into what we can expect...",
  "permalink": "/r/technology/comments/18x9z/the_future_of_ai/",
  "comment_count": "452",
  "score": "1500",
  "content_href": "https://i.redd.it/example_image.jpg",
  "external_links": ["https://openai.com/blog", "https://wired.com/ai-news"]
}
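
Note that the numeric fields (score, comment_count) arrive as strings in this sample, so cast them before doing any math. A small post-processing sketch, continuing from the items list in the Python example above:

# Sketch: normalize numeric fields (returned as strings in the sample above)
# and rank the scraped posts by score.
def to_int(value, default=0):
    try:
        return int(value)
    except (TypeError, ValueError):
        return default

for item in items:
    item["score"] = to_int(item.get("score"))
    item["comment_count"] = to_int(item.get("comment_count"))

top = sorted(items, key=lambda p: p["score"], reverse=True)[:10]
for post in top:
    print(f'{post["score"]:>6}  {post["title"]}')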

πŸ›‘οΈ Troubleshooting

  • Why am I only getting 3 posts?
    • Reddit's initial page load is small. The scraper simulates scrolling by fetching subsequent batches; raise maxRequestsPerCrawl to pull more of them.
  • "Request blocked (403)" error?
    • Reddit aggressively blocks Datacenter IPs.
    • Fix: Switch to Residential proxies in the input configuration (see the fallback sketch after this list).
  • Scraper stops early?
    • Ensure you have enough memory allocated (256 MB is usually enough, but 512 MB is safer for very long crawls).
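
The Datacenter-to-Residential fix from the 403 entry above can be automated. A hedged sketch reusing the ApifyClient setup from the Python example: try cheap Datacenter proxies first, then retry with Residential when a run fails or returns nothing:

# Sketch: try Datacenter proxies first, fall back to Residential if the run
# fails or returns no items (the usual symptom of a 403 block).
def run_with_fallback(client, actor_id, base_input):
    for groups in (None, ["RESIDENTIAL"]):
        run_input = dict(base_input)
        proxy = {"useApifyProxy": True}
        if groups:
            proxy["apifyProxyGroups"] = groups
        run_input["proxyConfiguration"] = proxy

        run = client.actor(actor_id).call(run_input=run_input)
        if run and run["status"] == "SUCCEEDED":
            items = client.dataset(run["defaultDatasetId"]).list_items().items
            if items:
                return items
    return []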

This scraper is intended for educational and analytical purposes. Please respect Reddit's Terms of Service and robots.txt. Do not use this tool to spam communities or overload Reddit's servers, and apply responsible rate limits.