Reddit Scraper

Pricing

from $1.95 / 1,000 scraped results

Scrape Reddit posts, comments, profiles, and subreddits without an API key. Full comment threading, media extraction (video, gallery, embed), flair, awards. NSFW filtering. Hot/new/top/rising sort and search.

Developer

junipr

Maintained by Community

Scrape Reddit posts, comments, user profiles, and subreddit metadata without a Reddit API key. Features include full comment-tree threading, media extraction, NSFW filtering, and configurable content filters. The actor works out of the box with zero configuration: provide a subreddit name or Reddit URL and start collecting data.

Why Use This Over the Reddit API

Reddit deprecated its free API tier in 2023. The official API now costs $0.24 per 1,000 API calls with a $100/month minimum commitment, and it also requires OAuth application setup and rate-limit quota management.

This actor bypasses the API entirely using web scraping. No API key, no OAuth, no monthly subscription. Additional features the Reddit API does not provide: media URL resolution for external hosts (imgur, gfycat), NSFW content filtering with safe defaults, and structured flair/award/crosspost data that competitors ignore.

| Feature | This Actor | epctex Reddit Scraper | Reddit API (Official) |
| --- | --- | --- | --- |
| API key required | No | No | Yes (OAuth) |
| Monthly subscription | No | No | $100/mo minimum |
| Comment threading | Full tree structure | Flat list | Full tree |
| Media extraction | Video, gallery, embed, external | Images only | URLs only |
| NSFW filtering | Configurable (NSFW excluded by default) | No | Basic |
| Rate limit handling | Auto backoff + rotation | Manual retry | Built-in quotas |
| Cost per 1K posts | $1.90 | ~$4.00 | $0.24 (API calls only) |

How to Use

Zero-config example — scrape hot posts from r/technology:

{
  "scrapeType": "posts",
  "subreddits": ["technology"],
  "maxResults": 50
}

Search Reddit for brand mentions:

{
  "scrapeType": "search",
  "searchQuery": "your brand name",
  "sort": "new",
  "maxResults": 200
}

Get all comments on a specific post:

{
  "scrapeType": "comments",
  "urls": ["https://www.reddit.com/r/AskReddit/comments/abc123/your_post_title/"]
}

Scrape top posts from this week with comment trees:

{
  "scrapeType": "posts",
  "subreddits": ["technology", "programming"],
  "sort": "top",
  "timeFilter": "week",
  "maxResults": 100,
  "includeComments": true,
  "maxCommentsPerPost": 50
}

Scrape subreddit metadata:

{
  "scrapeType": "subreddit",
  "subreddits": ["technology", "programming", "webdev"]
}
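
The JSON inputs above can also be submitted from a script. The following is a minimal sketch, assuming the official `apify-client` Python package is installed and that you have an Apify API token. `ACTOR_ID` is a placeholder (this page does not state the actor's ID), and `top_posts_input` is a hypothetical helper that simply rebuilds the "top posts" input shown above.

```python
from typing import Any, Iterator

# Placeholder, not a real ID: copy the actual ID from the actor's Store page.
ACTOR_ID = "<username>/<actor-name>"


def top_posts_input(subreddits: list[str], time_filter: str = "week",
                    max_results: int = 100) -> dict[str, Any]:
    """Build the 'top posts' input from the example above (hypothetical helper)."""
    return {
        "scrapeType": "posts",
        "subreddits": list(subreddits),
        "sort": "top",
        "timeFilter": time_filter,
        "maxResults": max_results,
    }


def run_actor(token: str, run_input: dict[str, Any],
              actor_id: str = ACTOR_ID) -> Iterator[dict[str, Any]]:
    """Start a run, wait for it to finish, and yield the dataset items."""
    # Imported lazily so the input helper works without the package installed.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("the actor run did not return a result")
    yield from client.dataset(run["defaultDatasetId"]).iterate_items()
```

Calling `run_actor(token, top_posts_input(["technology", "programming"]))` would then yield one dictionary per scraped item, in the shape described under Output Format.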

Input Configuration

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| scrapeType | string | "posts" | What to scrape: posts, comments, subreddit, userProfile, search |
| urls | string[] | [] | Direct Reddit URLs (posts, subreddits, users) |
| searchQuery | string | | Search query (required for search type) |
| subreddits | string[] | [] | Subreddit names without r/ |
| sort | string | "hot" | Sort: hot, new, top, rising, controversial |
| timeFilter | string | "all" | Time filter for top/controversial: hour, day, week, month, year, all |
| maxResults | integer | 100 | Maximum results (1–100,000) |
| includeComments | boolean | true | Include comments when scraping posts |
| maxCommentsPerPost | integer | 100 | Max comments per post (0–10,000) |
| commentDepth | integer | 10 | Max reply nesting depth (1–20) |
| includeNsfw | boolean | false | Include NSFW content |
| minScore | integer | | Minimum post/comment score |
| excludeAuthors | string[] | ["AutoModerator", "[deleted]"] | Authors to exclude |
| flattenComments | boolean | false | Flatten comment trees to a flat list |
| proxyConfiguration | object | Apify residential | Proxy settings |
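
Checking an input locally before starting a run can catch out-of-range values early. The sketch below copies the defaults, allowed values, and ranges from the parameter table. `build_input` is a hypothetical helper rather than part of the actor, and the actor's own validation may differ.

```python
import copy
from typing import Any

# Defaults, allowed values, and ranges copied from the parameter table.
DEFAULTS: dict[str, Any] = {
    "scrapeType": "posts",
    "urls": [],
    "subreddits": [],
    "sort": "hot",
    "timeFilter": "all",
    "maxResults": 100,
    "includeComments": True,
    "maxCommentsPerPost": 100,
    "commentDepth": 10,
    "includeNsfw": False,
    "excludeAuthors": ["AutoModerator", "[deleted]"],
    "flattenComments": False,
}
CHOICES = {
    "scrapeType": {"posts", "comments", "subreddit", "userProfile", "search"},
    "sort": {"hot", "new", "top", "rising", "controversial"},
    "timeFilter": {"hour", "day", "week", "month", "year", "all"},
}
RANGES = {
    "maxResults": (1, 100_000),
    "maxCommentsPerPost": (0, 10_000),
    "commentDepth": (1, 20),
}


def build_input(**overrides: Any) -> dict[str, Any]:
    """Merge overrides into the documented defaults and validate them locally."""
    # Deep copy so callers cannot mutate the shared list defaults.
    config = {**copy.deepcopy(DEFAULTS), **overrides}
    for name, allowed in CHOICES.items():
        if config[name] not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}")
    for name, (low, high) in RANGES.items():
        if not low <= config[name] <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
    if config["scrapeType"] == "search" and not config.get("searchQuery"):
        raise ValueError("searchQuery is required when scrapeType is 'search'")
    return config
```

For example, `build_input(subreddits=["technology"], maxResults=50)` returns the zero-config input with every documented default filled in, while `build_input(commentDepth=21)` raises a `ValueError`.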

Output Format

Each post includes full metadata, media, and optionally a threaded comment tree:

{
  "id": "t3_abc123",
  "title": "Example Post Title",
  "body": "Post body text in markdown...",
  "author": "reddit_user",
  "subreddit": "technology",
  "score": 1542,
  "upvoteRatio": 0.94,
  "numComments": 231,
  "createdUtc": "2024-01-15T10:30:00.000Z",
  "isNsfw": false,
  "media": {
    "type": "image",
    "images": [{ "url": "https://...", "width": 1920, "height": 1080, "caption": null }],
    "video": null,
    "embed": null
  },
  "comments": [
    {
      "id": "t1_xyz789",
      "author": "commenter",
      "body": "Great post!",
      "score": 45,
      "depth": 0,
      "isOp": false,
      "replies": [
        {
          "id": "t1_abc456",
          "body": "Thanks!",
          "depth": 1,
          "replies": []
        }
      ]
    }
  ]
}

Comments keep their full tree structure in nested replies arrays. Set flattenComments: true to get a flat list with depth and parentId fields, which is easier to use in CSV files and spreadsheets.
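
If you already have tree-shaped output and want the flat form afterwards, the tree can be walked client-side. This is a minimal sketch, not the actor's own flattening logic. It assumes each comment carries the `id` and `replies` fields shown above, and it sets `parentId` to `None` for top-level comments, which may differ from what `flattenComments: true` emits.

```python
from typing import Any, Iterator, Optional


def flatten_comments(comments: list[dict[str, Any]],
                     parent_id: Optional[str] = None) -> Iterator[dict[str, Any]]:
    """Yield comments depth-first, replacing 'replies' with a 'parentId' field."""
    for comment in comments:
        row = {key: value for key, value in comment.items() if key != "replies"}
        row["parentId"] = parent_id  # assumption: None marks a top-level comment
        yield row
        # Recurse into the children, passing this comment's id as their parent.
        yield from flatten_comments(comment.get("replies", []), comment["id"])


tree = [
    {
        "id": "t1_xyz789", "author": "commenter", "body": "Great post!",
        "score": 45, "depth": 0, "isOp": False,
        "replies": [
            {"id": "t1_abc456", "body": "Thanks!", "depth": 1, "replies": []},
        ],
    }
]
rows = list(flatten_comments(tree))
```

Each entry in `rows` is a plain dictionary with no nested lists, so it can be passed to `csv.DictWriter` or loaded into a dataframe.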

Tips and Best Practices

Rate limiting: Reddit aggressively rate-limits scrapers. The default requestDelay of 1000 ms is a safe baseline; lower values increase speed but risk 429 (Too Many Requests) errors. The actor handles rate-limit responses automatically with exponential backoff.

Proxy selection: Reddit blocks datacenter proxies. Use residential proxies (the default) for reliable scraping. The actor rotates IPs on rate limit responses.

Large subreddits: For subreddits with millions of posts, use sort: "top" with a timeFilter to get manageable result sets. Scraping all of r/AskReddit would take days, so filter first.

NSFW filtering: NSFW content is excluded by default. Set includeNsfw: true to include it. When scraping NSFW subreddits with the default setting, zero results are returned.

Comment threading: Comments are returned as a tree by default. Each comment has a replies array containing child comments. Use flattenComments: true if you need a flat structure for data analysis.

Pricing

Pay-Per-Event: $1.90 per 1,000 results.

Pricing includes all platform compute costs — no hidden fees.

A "result" is one successfully scraped item pushed to the dataset. A post counts as 1 result. Each comment counts as 1 result. A subreddit info object counts as 1 result.

| Scenario | Items | Cost |
| --- | --- | --- |
| 100 posts, no comments | 100 | $0.19 |
| 100 posts + avg 20 comments each | 2,100 | $3.99 |
| Brand monitoring (500 mentions/week) | 500 | $0.95/week |
| Subreddit metadata (10 subreddits) | 10 | $0.02 |
| Full thread (1 post + 5K comments) | 5,001 | $9.50 |

Items filtered out by score, NSFW, author, or flair filters are NOT billed. Failed requests are NOT billed.
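
The scenario costs follow directly from the per-result rate. A small sketch of the arithmetic, assuming the $1.90 per 1,000 results quoted in this section and rounding to the nearest cent; the function names are illustrative, and the rate is a parameter because the price shown on the Store listing may differ.

```python
from decimal import Decimal, ROUND_HALF_UP

PRICE_PER_1000 = Decimal("1.90")  # rate quoted in this Pricing section


def estimate_results(posts: int, avg_comments_per_post: float = 0,
                     subreddits: int = 0) -> int:
    """Count billable items: every post, comment, and subreddit object is 1 result."""
    return posts + round(posts * avg_comments_per_post) + subreddits


def estimate_cost(results: int, price_per_1000: Decimal = PRICE_PER_1000) -> Decimal:
    """Return the cost in dollars, rounded to the nearest cent."""
    cost = Decimal(results) * price_per_1000 / 1000
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# 100 posts with an average of 20 comments each -> 2,100 results
results = estimate_results(posts=100, avg_comments_per_post=20)
cost = estimate_cost(results)
```

Note that includeComments defaults to true, so a post-only estimate applies only when comments are switched off in the input.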

FAQ

Does this require a Reddit API key?

No. This actor scrapes Reddit's web interface directly. No API key, no OAuth setup, no Reddit developer account needed. It works out of the box.

How does it handle Reddit's rate limiting?

The actor uses exponential backoff when Reddit returns 429 (Too Many Requests) responses. It starts with a 5-second delay and increases to 60 seconds, with random jitter to avoid thundering herd patterns. Proxy IPs are rotated on rate limit hits.
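
For readers building similar retry logic in their own tooling, the schedule described above can be sketched as follows. Only the 5-second start, the 60-second ceiling, and the use of random jitter come from this page. The doubling factor, the jitter range (up to 20% shaved off each step), the retry count, and both function names are assumptions, and the sketch omits proxy rotation.

```python
import random
import time
from typing import Any, Callable, Optional


def backoff_delays(base: float = 5.0, cap: float = 60.0, max_retries: int = 6,
                   rng: Optional[random.Random] = None) -> list[float]:
    """Return jittered delays for steps of 5, 10, 20, 40, 60, 60 seconds."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(max_retries):
        step = min(cap, base * 2 ** attempt)  # assumed doubling, capped at 60 s
        delays.append(step * rng.uniform(0.8, 1.0))  # assumed jitter range
    return delays


def call_with_backoff(request: Callable[[], tuple[int, Any]], max_retries: int = 6,
                      sleep: Callable[[float], None] = time.sleep) -> tuple[int, Any]:
    """Call request() until it returns a status other than 429 or retries run out."""
    for delay in backoff_delays(max_retries=max_retries):
        status, payload = request()
        if status != 429:
            return status, payload
        sleep(delay)  # wait before the next attempt
    return request()  # final attempt after the last delay
```

Here `request` is any zero-argument callable returning a `(status_code, payload)` pair, and `sleep` is injectable so the schedule can be exercised in tests without waiting.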

Can I scrape private subreddits?

No. Private subreddits require Reddit account authentication, which this actor does not support. The actor detects private subreddits and reports them as errors without wasting retries.

How do I filter NSFW content?

NSFW content is excluded by default (includeNsfw: false). Set includeNsfw: true to include it. When scraping a known NSFW subreddit with the default setting, you will get zero results.

Does it work for non-English subreddits?

Yes. The actor supports full UTF-8 content including emoji, CJK characters, and other scripts. Content is preserved exactly as posted on Reddit.

Can I download images and videos?

The actor extracts media URLs (images, videos, galleries) by default. Direct download to the Apify key-value store is not currently supported; use the extracted URLs with a separate download tool.

What happens if Reddit blocks the scraper?

The actor detects anti-bot pages and retries with different proxy IPs. With residential proxies (the default), blocks are rare. If all retries fail, the URL is logged in the run summary's failedUrls array.

How are comments structured?

Comments are returned as a tree by default. Each comment has a replies array containing nested child comments, preserving Reddit's thread structure. The depth field indicates nesting level (0 = top-level). Set flattenComments: true for a flat list.