Reddit Scraper

Pricing

$2.50 / 1,000 results

Scrape entire subreddits with this crawler. It returns each post in a subreddit along with its title, text, score, timestamps, and other metadata.

Rating

5.0

(7)

Developer

Crawler Bros

Maintained by Community

Reddit Subreddit Scraper

An Apify Actor for scraping posts from Reddit subreddits using browser automation with Playwright.

Features

🎯 Scrape multiple subreddits in a single run
📊 Extract comprehensive post data (title, author, score, comments, etc.)
🔄 Support for different sorting methods (hot, new, top, rising, controversial)
⏰ Time filters for "top" and "controversial" posts
📦 No authentication required for public subreddits
💾 Data saved in structured JSON format
🌐 Browser automation bypasses API restrictions
🔄 Automatic pagination support

Input Parameters

The actor accepts the following input parameters:

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `subreddits` | array | Yes | `["python"]` | List of subreddit names to scrape (without 'r/' prefix) |
| `maxPosts` | integer | No | `25` | Maximum number of posts to scrape from each subreddit (1-1000) |
| `sort` | string | No | `"hot"` | How to sort posts: `hot`, `new`, `top`, `rising`, or `controversial` |
| `timeFilter` | string | No | `"day"` | Time filter for 'top'/'controversial': `hour`, `day`, `week`, `month`, `year`, `all` |

Example Input

{
  "subreddits": ["islamabad", "pakistan", "programming"],
  "maxPosts": 50,
  "sort": "hot",
  "timeFilter": "day"
}
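
The parameter rules above can be checked before a run is started. The following is a minimal sketch, not the actor's own validation code: the function name `normalize_input` and the choice to strip an accidental `r/` prefix are assumptions made for illustration; the defaults and allowed values come from the table.

```python
ALLOWED_SORTS = {"hot", "new", "top", "rising", "controversial"}
ALLOWED_TIME_FILTERS = {"hour", "day", "week", "month", "year", "all"}


def normalize_input(raw: dict) -> dict:
    """Apply the documented defaults and reject out-of-range values.

    Hypothetical helper: the actor's real validation may differ.
    """
    subreddits = raw.get("subreddits")
    if not isinstance(subreddits, list) or not subreddits:
        raise ValueError("'subreddits' is required and must be a non-empty list")

    # Assumption: tolerate an accidental "r/" prefix instead of failing on it.
    cleaned = []
    for name in subreddits:
        name = str(name).strip().strip("/").removeprefix("r/")
        if name:
            cleaned.append(name)
    if not cleaned:
        raise ValueError("'subreddits' contains no usable names")

    max_posts = int(raw.get("maxPosts", 25))
    if not 1 <= max_posts <= 1000:
        raise ValueError("'maxPosts' must be between 1 and 1000")

    sort = raw.get("sort", "hot")
    if sort not in ALLOWED_SORTS:
        raise ValueError(f"'sort' must be one of {sorted(ALLOWED_SORTS)}")

    time_filter = raw.get("timeFilter", "day")
    if time_filter not in ALLOWED_TIME_FILTERS:
        raise ValueError(f"'timeFilter' must be one of {sorted(ALLOWED_TIME_FILTERS)}")

    return {
        "subreddits": cleaned,
        "maxPosts": max_posts,
        "sort": sort,
        "timeFilter": time_filter,
    }
```

Calling `normalize_input({"subreddits": ["r/python"]})` fills in `maxPosts`, `sort`, and `timeFilter` with the defaults from the table.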

Output Fields

The actor extracts the following data for each post:

Subreddit Information

subreddit - Subreddit name (e.g., "islamabad")
subreddit_prefixed - Subreddit name with r/ prefix (e.g., "r/islamabad")

Post Content

post_id - Unique post ID (e.g., "1kql1t5")
post_name - Full post name in Reddit format (e.g., "t3_1kql1t5")
title - Post title
author - Username of the post author
selftext - Text content preview (first 1000 chars, for self posts only)

Engagement Metrics

score - Post score/karma (upvotes minus downvotes)
num_comments - Number of comments on the post

Links

url - URL the post points to (for text posts, the post's own page)
permalink - Link to the post's comments page on old.reddit.com

Metadata

domain - Domain of the linked content (e.g., "self.islamabad" for text posts)
is_self_post - Boolean indicating if it's a text post (true) or link post (false)
link_flair - Post flair/tag text (if any)
thumbnail_url - URL of the post thumbnail image (if any)

Timestamps

created_utc - Unix timestamp when the post was created
created_at - ISO 8601 formatted datetime in UTC, without a timezone suffix (e.g., "2025-05-19T19:40:28")
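
The two timestamp fields carry the same instant. As a small sketch of how one maps to the other (the function name `created_at_from_utc` is hypothetical, and this is not necessarily how the actor computes the field), the Unix timestamp can be read as UTC and formatted without an offset:

```python
from datetime import datetime, timezone


def created_at_from_utc(created_utc: int) -> str:
    """Format a Unix timestamp as an ISO 8601 string in UTC, without an offset."""
    moment = datetime.fromtimestamp(created_utc, tz=timezone.utc)
    # Drop tzinfo so the output has no "+00:00" suffix, matching the example value.
    return moment.replace(tzinfo=None).isoformat()


print(created_at_from_utc(1747683628))  # 2025-05-19T19:40:28
```

The printed value matches the `created_at` shown in the example output for `created_utc` 1747683628.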

Flags

is_stickied - Boolean indicating if the post is stickied/pinned
is_locked - Boolean indicating if the post is locked (no new comments)
is_nsfw - Boolean indicating if the post is marked as NSFW (over 18)

Example Output

{
  "subreddit": "islamabad",
  "subreddit_prefixed": "r/islamabad",
  "post_id": "1kql1t5",
  "post_name": "t3_1kql1t5",
  "title": "Everyone's always asking what to do in Islamabad - I made a list",
  "author": "hafmaestro",
  "selftext": "Note: I have not mentioned normal restaurants and cafes...",
  "score": 595,
  "num_comments": 101,
  "url": "https://old.reddit.com/r/islamabad/comments/1kql1t5/...",
  "permalink": "https://old.reddit.com/r/islamabad/comments/1kql1t5/...",
  "domain": "self.islamabad",
  "is_self_post": true,
  "link_flair": "Islamabad",
  "thumbnail_url": null,
  "created_utc": 1747683628,
  "created_at": "2025-05-19T19:40:28",
  "is_stickied": false,
  "is_locked": false,
  "is_nsfw": false
}

Usage

Local Development

Install dependencies:

pip install -r requirements.txt
playwright install chromium

Set up input in storage/key_value_stores/default/INPUT.json:

{
  "subreddits": ["python"],
  "maxPosts": 25,
  "sort": "hot"
}

Run the actor:
```
python -m src
```
Check results in storage/datasets/default/
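
To inspect a local run, the result files can be loaded with a few lines of Python. This is a sketch under two assumptions: that each post is stored as its own `.json` file in `storage/datasets/default/`, and that any file whose name starts with `__` is storage metadata rather than a post. The helper names `load_items` and `top_posts` are hypothetical.

```python
import json
from pathlib import Path


def load_items(dataset_dir: str = "storage/datasets/default") -> list[dict]:
    """Read every post file in the dataset directory, in filename order."""
    items = []
    for path in sorted(Path(dataset_dir).glob("*.json")):
        if path.name.startswith("__"):
            continue  # assumed to be a metadata file, not a post
        items.append(json.loads(path.read_text(encoding="utf-8")))
    return items


def top_posts(items: list[dict], limit: int = 10) -> list[dict]:
    """Return the highest-scoring posts, skipping stickied ones."""
    regular = [post for post in items if not post.get("is_stickied")]
    return sorted(regular, key=lambda post: post.get("score", 0), reverse=True)[:limit]
```

For example, `top_posts(load_items(), 5)` would list the five highest-scoring non-stickied posts from the last local run.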

On Apify Platform

Push to Apify:
- Log in to the Apify CLI: `apify login`
- Initialize the project (if not already done): `apify init`
- Push the actor: `apify push`
Or upload manually:
- Create a new actor on the Apify platform
- Upload all files, including Dockerfile, requirements.txt, and the .actor/ directory
Configure and run:
- Set the input parameters in the Apify console
- Click "Start" to run the actor
- Download the results from the dataset tab
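
Besides the console, a deployed actor can be started from code with the `apify-client` package (`pip install apify-client`). The sketch below is illustrative: `build_run_input` and `run_actor` are hypothetical helper names, and the actor ID must be replaced with the ID shown on the actor's own page, which this README does not state.

```python
def build_run_input(subreddits, max_posts=25, sort="hot", time_filter="day"):
    """Assemble the input object using the parameter names from this README."""
    return {
        "subreddits": list(subreddits),
        "maxPosts": max_posts,
        "sort": sort,
        "timeFilter": time_filter,
    }


def run_actor(token: str, actor_id: str, run_input: dict) -> list[dict]:
    """Start a run, wait for it to finish, and return the dataset items."""
    # Imported here so the input helper above works without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Example (needs a real API token and the actor's real ID):
# posts = run_actor("<YOUR_APIFY_TOKEN>", "<username>/<actor-name>",
#                   build_run_input(["python"], max_posts=50))
```

Keeping the input builder separate from the network call makes it easy to check the input shape without spending platform credits.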

Technical Details

Browser Automation

Uses Playwright with Chromium browser
Scrapes old.reddit.com for better compatibility and simpler HTML structure
Implements anti-detection measures:
- Custom User-Agent headers
- Disabled automation flags
- Browser fingerprint masking
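
Because the scraper targets old.reddit.com, each combination of subreddit, sort, and time filter corresponds to a listing URL. The sketch below follows old.reddit.com's public URL scheme; how the actor itself builds its URLs is not documented here, and the function name `listing_url` is hypothetical.

```python
from urllib.parse import urlencode

TIME_FILTERED_SORTS = {"top", "controversial"}


def listing_url(subreddit: str, sort: str = "hot", time_filter: str = "day") -> str:
    """Build the old.reddit.com listing URL for one subreddit."""
    url = f"https://old.reddit.com/r/{subreddit}/{sort}/"
    # The time filter only applies to the "top" and "controversial" listings.
    if sort in TIME_FILTERED_SORTS:
        url += "?" + urlencode({"t": time_filter})
    return url


print(listing_url("python"))                 # https://old.reddit.com/r/python/hot/
print(listing_url("python", "top", "week"))  # https://old.reddit.com/r/python/top/?t=week
```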

Features

Automatic pagination: Clicks "next" button to load more posts
Smart selectors: Multiple fallback CSS selectors for reliability
Error handling: Screenshots saved on errors for debugging
Rate limiting: Built-in delays between requests
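
The pagination and rate-limiting behaviour described above can be sketched as a loop that follows "next" links until enough posts are collected, pausing between pages. This is an illustration, not the actor's code: `collect_posts`, the `fetch_page` callback, and the 2-second default delay are all assumptions. `fetch_page` stands in for whatever loads a page and returns its posts plus the next page's URL.

```python
import time
from typing import Callable, Optional

# A page is its list of posts plus the URL of the next page (None on the last page).
Page = tuple[list[dict], Optional[str]]


def collect_posts(
    fetch_page: Callable[[str], Page],
    start_url: str,
    max_posts: int,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list[dict]:
    """Follow 'next' links until max_posts are collected or pages run out."""
    posts: list[dict] = []
    url: Optional[str] = start_url
    while url and len(posts) < max_posts:
        page_posts, next_url = fetch_page(url)
        if not page_posts:
            break  # an empty page means there is nothing further to load
        posts.extend(page_posts)
        url = next_url
        if url and len(posts) < max_posts:
            sleep(delay_seconds)  # pause only when another page will be requested
    return posts[:max_posts]
```

Passing `sleep` as a parameter keeps the loop testable without real waiting; in normal use the default `time.sleep` applies.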

Performance

Headless browser mode for efficiency
Optimized page load strategy (domcontentloaded)
Configurable wait times and timeouts
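
As a rough illustration of these settings, the sketch below launches headless Chromium with Playwright, sets a custom User-Agent, disables one automation flag, and waits only for `domcontentloaded`. The specific User-Agent string, the 30-second timeout, the screenshot path, and the function name `fetch_html` are assumptions chosen for the example; the actor's actual values are not stated in this README.

```python
# Assumed values for illustration only.
USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)
LAUNCH_OPTIONS = {
    "headless": True,
    "args": ["--disable-blink-features=AutomationControlled"],
}
GOTO_OPTIONS = {"wait_until": "domcontentloaded", "timeout": 30_000}  # milliseconds


async def fetch_html(url: str, screenshot_path: str = "error.png") -> str:
    """Load a page and return its HTML, saving a screenshot if navigation fails."""
    # Imported here so the option dictionaries can be inspected without Playwright.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch(**LAUNCH_OPTIONS)
        try:
            context = await browser.new_context(user_agent=USER_AGENT)
            page = await context.new_page()
            try:
                await page.goto(url, **GOTO_OPTIONS)
                return await page.content()
            except Exception:
                # Keep a screenshot for debugging, then let the error propagate.
                await page.screenshot(path=screenshot_path)
                raise
        finally:
            await browser.close()
```

It can be run with `asyncio.run(fetch_html("https://old.reddit.com/r/python/hot/"))`. Raising the `timeout` value in `GOTO_OPTIONS` is the kind of change the Troubleshooting section refers to.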

Limitations

Only works with public subreddits
Cannot scrape private or restricted communities
Browser automation is slower than direct API calls, but it needs no API credentials
Selftext preview limited to first 1000 characters

Dependencies

apify>=2.1.0 - Apify SDK for Python
playwright~=1.40.0 - Browser automation framework
beautifulsoup4~=4.12.0 - HTML parsing library

Troubleshooting

Timeout Issues

If you encounter timeout errors:

Check the debug screenshots in the key-value store
Increase timeout values in the code
Verify the subreddit exists and is public

No Posts Found

Verify the subreddit name is correct (without 'r/' prefix)
Check if the subreddit has posts for the selected sort method
Review logs for detailed error messages

License

This actor is provided as-is for scraping public Reddit data in accordance with Reddit's terms of service.

Notes

This scraper uses browser automation to access Reddit's public web interface
Always respect Reddit's robots.txt and terms of service
Use responsibly and avoid overwhelming Reddit's servers
Consider implementing additional rate limiting for large-scale scraping
The actor works best with the Apify platform's infrastructure
