Reddit Scraper
Scrape Reddit posts from any subreddit — search by keyword, browse new/hot/top, get full post text and comments. No login, no API key, no browser. Fast HTTP-only.
Developer: kane liu
Last modified: 19 hours ago
Extract Reddit post data at scale — browse subreddit feeds, run keyword searches, and pull top comments from public Reddit JSON endpoints. No login, no OAuth, no browser, and no Reddit API key required.
Features
- Subreddit feed scraping - scrape `new`, `hot`, or `top` posts from any subreddit with automatic pagination
- Keyword search - search within one or more subreddits using Reddit's public search endpoint
- Top comments - optionally fetch top-level comments for each post to capture discussion context
- Multi-subreddit runs - scrape several subreddits in one actor run
- Fast and lightweight - HTTP-only extraction via `curl_cffi`, no browser overhead
- Auto-deduplication - results are deduplicated by post ID across subreddit and query combinations
- Proxy-ready - supports Apify proxy configuration for larger runs and rate-limit mitigation
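The auto-deduplication step can be sketched as a pure function. This is a hypothetical illustration of the behavior described above, assuming each result dict carries the `postId` field listed in the output schema:

```python
def deduplicate(items):
    """Keep the first occurrence of each post ID across all
    subreddit/query combinations (illustrative sketch, not the
    actor's actual source)."""
    seen = set()
    unique = []
    for item in items:
        if item["postId"] not in seen:
            seen.add(item["postId"])
            unique.append(item)
    return unique
```

Because deduplication is keyed on post ID, a post that matches two search queries, or appears in two scraped feeds, is emitted only once.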
Input
| Parameter | Type | Default | Description |
|---|---|---|---|
| `subreddits` | array of strings | — | Subreddit names to scrape, without the `r/` prefix. Required. |
| `searchQueries` | array of strings | — | Keywords to search inside each subreddit. If empty, the actor scrapes the subreddit feed directly. |
| `sort` | string | `"new"` | Sort order: `new`, `hot`, `top`, or `relevance`. `relevance` is only meaningful when `searchQueries` is used. |
| `timeFilter` | string | `"week"` | Time range filter for `top` and search results: `hour`, `day`, `week`, `month`, `year`, `all`. |
| `maxResults` | integer | `100` | Maximum posts to return per subreddit, or per subreddit + query combination. Range: 1-5000. |
| `includeComments` | boolean | `false` | Fetch top-level comments for each post. Slower but adds discussion context. |
| `maxComments` | integer | `10` | Maximum number of top-level comments to include per post when comments are enabled. |
| `proxy` | object | — | Apify proxy configuration. Recommended for large runs to avoid rate limiting. |
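As a rough illustration of how these parameters fit together, the sketch below assembles and sanity-checks a run input dict. The validation rules mirror the table above; the helper function itself is hypothetical and not part of the actor:

```python
def build_input(subreddits, search_queries=None, sort="new",
                time_filter="week", max_results=100,
                include_comments=False, max_comments=10):
    """Assemble an actor input dict, enforcing the documented constraints."""
    if not subreddits:
        raise ValueError("subreddits is required")
    if sort not in ("new", "hot", "top", "relevance"):
        raise ValueError(f"unsupported sort: {sort}")
    if not 1 <= max_results <= 5000:
        raise ValueError("maxResults must be in the range 1-5000")
    return {
        # subreddit names are expected without the r/ prefix
        "subreddits": [s.removeprefix("r/") for s in subreddits],
        "searchQueries": search_queries or [],
        "sort": sort,
        "timeFilter": time_filter,
        "maxResults": max_results,
        "includeComments": include_comments,
        "maxComments": max_comments,
    }
```

The resulting dict can be passed as the run input when starting the actor, for example through the Apify console or API.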
Output Fields
| Field | Type | Description |
|---|---|---|
| `postId` | string | Reddit post ID |
| `subreddit` | string | Source subreddit name |
| `title` | string | Post title |
| `url` | string | Full Reddit post URL |
| `author` | string | Reddit username of the post author |
| `body` | string | Self-post text body; empty for link posts or removed content |
| `score` | integer | Reddit score |
| `upvoteRatio` | number | Upvote ratio reported by Reddit |
| `numComments` | integer | Total number of comments on the post |
| `flair` | string | Post flair text, if present |
| `createdAt` | string | Post creation time in ISO 8601 format |
| `isNsfw` | boolean | Whether the post is marked NSFW |
| `isSelf` | boolean | Whether the post is a self post |
| `thumbnail` | string | Thumbnail URL or Reddit thumbnail marker |
| `externalUrl` | string | External destination URL for link posts |
| `scrapedAt` | string | Time when this actor scraped the record |
| `comments` | array | Top-level comments as objects with `author`, `body`, `score`, `createdAt` |
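Downstream, each dataset item is a plain dict with the fields above. As an example of post-processing, here is a hypothetical filter (not part of the actor) that keeps recent, well-scored posts by parsing the ISO 8601 `createdAt` field:

```python
from datetime import datetime, timedelta, timezone

def recent_high_score(items, min_score=10, max_age_days=7, now=None):
    """Keep posts whose score meets a threshold and whose createdAt
    timestamp falls within the last max_age_days days."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    kept = []
    for item in items:
        # createdAt is documented as ISO 8601, e.g. "2024-06-09T00:00:00+00:00"
        created = datetime.fromisoformat(item["createdAt"])
        if item["score"] >= min_score and created >= cutoff:
            kept.append(item)
    return kept
```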
Usage Examples
Scrape New Posts from a Subreddit
```json
{
  "subreddits": ["forhire"],
  "sort": "new",
  "maxResults": 50
}
```
Search Inside Multiple Subreddits
```json
{
  "subreddits": ["forhire", "freelance", "webscraping"],
  "searchQueries": ["hiring developer", "need help building"],
  "sort": "new",
  "timeFilter": "month",
  "maxResults": 100
}
```
Scrape Top Posts with Comments
```json
{
  "subreddits": ["startups"],
  "sort": "top",
  "timeFilter": "week",
  "maxResults": 25,
  "includeComments": true,
  "maxComments": 5
}
```
Large Run with Proxy
```json
{
  "subreddits": ["forhire", "freelance", "entrepreneur", "SaaS"],
  "searchQueries": ["looking for developer", "need automation"],
  "sort": "relevance",
  "timeFilter": "week",
  "maxResults": 200,
  "proxy": {"useApifyProxy": true}
}
```
Pricing
This actor is designed to be lightweight and inexpensive: approximately $2 per 1,000 results, under a simple start-fee-plus-per-result pricing model.
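Under that model, the cost of a run can be roughly estimated as a fixed start fee plus a flat per-result charge. The defaults below are illustrative placeholders, chosen only to be consistent with the ~$2 per 1,000 results figure, not the actor's actual fee schedule:

```python
def estimate_cost(num_results, start_fee=0.01, per_result=0.002):
    """Rough run-cost estimate in USD: start fee plus per-result charge.
    Defaults are illustrative, roughly matching $2 per 1,000 results."""
    return start_fee + per_result * num_results
```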
Notes
- Public data only - the actor reads Reddit's public JSON endpoints. No login, OAuth, or private user data is used.
- Cookie workaround - Reddit's JSON endpoints require a cookie header to be present, but the cookie value itself does not need to be real. This actor uses the minimal cookie `_=1`.
- Rate limits - Reddit applies per-IP rate limiting. Large runs should use Apify proxy rotation for stability.
- Comments cost extra requests - enabling `includeComments` adds one extra request per post that has comments, so large runs will be slower.
- Result scope - `maxResults` applies independently to each subreddit, or to each subreddit + search query combination when search is enabled.
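The request pattern behind these notes can be sketched as follows. The feed URL shape (`/r/<subreddit>/<sort>.json` with `limit` and `after` parameters) is Reddit's well-known public listing endpoint; the headers, including the minimal placeholder cookie, are illustrative rather than a copy of the actor's internals:

```python
from urllib.parse import urlencode

def feed_url(subreddit, sort="new", limit=100, after=None):
    """Build the public JSON feed URL for a subreddit listing."""
    params = {"limit": limit}
    if after:
        # pagination cursor taken from the previous page's data.after
        params["after"] = after
    return f"https://www.reddit.com/r/{subreddit}/{sort}.json?{urlencode(params)}"

HEADERS = {
    "User-Agent": "my-scraper/0.1",  # any descriptive user agent
    "Cookie": "_=1",                 # minimal placeholder cookie (see note above)
}
```

These URLs and headers could then be fetched with `curl_cffi` (or any HTTP client), routed through a rotating proxy for larger runs.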
Legal
This actor extracts publicly available Reddit data from endpoints accessible to any visitor on the public web. No authentication or account access is used.