Reddit Scraper

Pricing

Pay per usage

Scrape Reddit posts from any subreddit: search by keyword, browse new/hot/top, and get full post text and comments. No login, no API key, and no browser; extraction is fast and HTTP-only.

Developer

kane liu

Maintained by Community

19 hours ago

Last modified

Extract Reddit post data at scale — browse subreddit feeds, run keyword searches, and pull top comments from public Reddit JSON endpoints. No login, no OAuth, no browser, and no Reddit API key required.

Features

  • Subreddit feed scraping - scrape new, hot, or top posts from any subreddit with automatic pagination
  • Keyword search - search within one or more subreddits using Reddit's public search endpoint
  • Top comments - optionally fetch top-level comments for each post to capture discussion context
  • Multi-subreddit runs - scrape several subreddits in one actor run
  • Fast and lightweight - HTTP-only extraction via curl_cffi, no browser overhead
  • Auto-deduplication - results are deduplicated by post ID across subreddit and query combinations
  • Proxy-ready - supports Apify proxy configuration for larger runs and rate-limit mitigation
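
The feed scraping, pagination, and deduplication listed above can be sketched roughly as follows. This is an illustration only, not the actor's actual code. It assumes Reddit's public listing URL pattern (`/r/<subreddit>/<sort>.json` with `limit`, `after`, and `t` query parameters) and the usual listing shape (`data.children[].data`, `data.after`). The names `listing_url`, `iter_posts`, and `fetch_json` are hypothetical; `fetch_json` stands for whatever HTTP function you supply (the actor itself uses curl_cffi).

```python
from typing import Callable, Iterator, Optional
from urllib.parse import urlencode


def listing_url(subreddit: str, sort: str = "new", after: Optional[str] = None,
                time_filter: Optional[str] = None, limit: int = 100) -> str:
    """Build a public listing URL (assumed URL pattern, see lead-in)."""
    params = {"limit": limit}
    if after:
        params["after"] = after          # cursor returned by the previous page
    if time_filter and sort == "top":
        params["t"] = time_filter        # time range only applies to "top"
    return f"https://www.reddit.com/r/{subreddit}/{sort}.json?{urlencode(params)}"


def iter_posts(subreddit: str, sort: str, max_results: int,
               fetch_json: Callable[[str], dict],
               time_filter: Optional[str] = None) -> Iterator[dict]:
    """Yield up to max_results unique raw posts, following the `after` cursor."""
    seen = set()
    after = None
    while len(seen) < max_results:
        page = fetch_json(listing_url(subreddit, sort, after, time_filter))
        children = page["data"]["children"]
        if not children:
            break
        for child in children:
            post = child["data"]
            if post["id"] in seen:       # deduplicate by post ID
                continue
            seen.add(post["id"])
            yield post
            if len(seen) >= max_results:
                return
        after = page["data"].get("after")
        if not after:                    # no further pages
            break
```

Passing the fetch function in as an argument keeps the pagination logic independent of the HTTP client, so it can be exercised with canned pages before any real request is made.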

Input

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| subreddits | array of strings |  | Subreddit names to scrape, without the r/ prefix. Required. |
| searchQueries | array of strings |  | Keywords to search inside each subreddit. If empty, the actor scrapes the subreddit feed directly. |
| sort | string | "new" | Sort order: new, hot, top, or relevance. relevance is only meaningful when searchQueries is used. |
| timeFilter | string | "week" | Time range filter for top and search results: hour, day, week, month, year, all. |
| maxResults | integer | 100 | Maximum posts to return per subreddit, or per subreddit + query combination. Range: 1-5000. |
| includeComments | boolean | false | Fetch top-level comments for each post. Slower but adds discussion context. |
| maxComments | integer | 10 | Maximum number of top-level comments to include per post when comments are enabled. |
| proxy | object |  | Apify proxy configuration. Recommended for large runs to avoid rate limiting. |
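
To make the defaults and ranges above concrete, here is a minimal sketch of how an input object could be validated and filled in before a run. It is an assumption about reasonable behaviour, not the actor's own validation code: the function name `normalize_input` is hypothetical, and tolerating a leading `r/` on subreddit names is a convenience added here for illustration.

```python
ALLOWED_SORTS = {"new", "hot", "top", "relevance"}
ALLOWED_TIME_FILTERS = {"hour", "day", "week", "month", "year", "all"}


def normalize_input(raw: dict) -> dict:
    """Apply the documented defaults and check the documented ranges."""
    subreddits = []
    for name in raw.get("subreddits") or []:
        name = name.strip()
        if name.lower().startswith("r/"):   # assumption: forgive an r/ prefix
            name = name[2:]
        if name:
            subreddits.append(name)
    if not subreddits:
        raise ValueError("subreddits is required and must not be empty")

    sort = raw.get("sort", "new")
    if sort not in ALLOWED_SORTS:
        raise ValueError(f"unsupported sort: {sort!r}")

    time_filter = raw.get("timeFilter", "week")
    if time_filter not in ALLOWED_TIME_FILTERS:
        raise ValueError(f"unsupported timeFilter: {time_filter!r}")

    max_results = int(raw.get("maxResults", 100))
    if not 1 <= max_results <= 5000:
        raise ValueError("maxResults must be between 1 and 5000")

    queries = [q.strip() for q in raw.get("searchQueries") or [] if q.strip()]
    return {
        "subreddits": subreddits,
        "searchQueries": queries,           # empty list means "scrape the feed"
        "sort": sort,
        "timeFilter": time_filter,
        "maxResults": max_results,
        "includeComments": bool(raw.get("includeComments", False)),
        "maxComments": int(raw.get("maxComments", 10)),
        "proxy": raw.get("proxy"),          # passed through untouched
    }
```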

Output Fields

| Field | Type | Description |
| --- | --- | --- |
| postId | string | Reddit post ID |
| subreddit | string | Source subreddit name |
| title | string | Post title |
| url | string | Full Reddit post URL |
| author | string | Reddit username of the post author |
| body | string | Self-post text body, empty for link posts or removed content |
| score | integer | Reddit score |
| upvoteRatio | number | Upvote ratio reported by Reddit |
| numComments | integer | Total number of comments on the post |
| flair | string | Post flair text if present |
| createdAt | string | Post creation time in ISO 8601 format |
| isNsfw | boolean | Whether the post is marked NSFW |
| isSelf | boolean | Whether the post is a self post |
| thumbnail | string | Thumbnail URL or Reddit thumbnail marker |
| externalUrl | string | External destination URL for link posts |
| scrapedAt | string | Time when this actor scraped the record |
| comments | array | Top-level comments as objects with author, body, score, createdAt |
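
As a rough illustration of where these fields could come from, the sketch below maps a raw post object from Reddit's public JSON onto the output record, and extracts top-level comments from a comment listing. The raw key names (`selftext`, `upvote_ratio`, `link_flair_text`, `created_utc`, `over_18`, and so on) are the ones Reddit's public JSON commonly exposes; the mapping itself and the function names `to_record` and `top_level_comments` are assumptions, not the actor's confirmed implementation.

```python
from datetime import datetime, timezone
from typing import Optional


def iso_utc(epoch_seconds: float) -> str:
    """Convert a Unix timestamp to an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).isoformat()


def to_record(post: dict, scraped_at: Optional[datetime] = None) -> dict:
    """Map one raw post object to the documented output fields."""
    scraped_at = scraped_at or datetime.now(timezone.utc)
    is_self = bool(post.get("is_self", False))
    return {
        "postId": post["id"],
        "subreddit": post.get("subreddit", ""),
        "title": post.get("title", ""),
        "url": "https://www.reddit.com" + post.get("permalink", ""),
        "author": post.get("author", ""),
        "body": post.get("selftext") or "",
        "score": post.get("score", 0),
        "upvoteRatio": post.get("upvote_ratio"),
        "numComments": post.get("num_comments", 0),
        "flair": post.get("link_flair_text") or "",
        "createdAt": iso_utc(post["created_utc"]),
        "isNsfw": bool(post.get("over_18", False)),
        "isSelf": is_self,
        "thumbnail": post.get("thumbnail", ""),
        "externalUrl": "" if is_self else post.get("url", ""),
        "scrapedAt": scraped_at.isoformat(),
        "comments": [],                    # filled in only when comments are enabled
    }


def top_level_comments(comment_listing: dict, max_comments: int = 10) -> list:
    """Keep the first max_comments real comments, skipping "more" placeholders."""
    comments = []
    for child in comment_listing["data"]["children"]:
        if child.get("kind") != "t1":      # t1 = comment; other kinds are skipped
            continue
        data = child["data"]
        comments.append({
            "author": data.get("author", ""),
            "body": data.get("body", ""),
            "score": data.get("score", 0),
            "createdAt": iso_utc(data["created_utc"]),
        })
        if len(comments) >= max_comments:
            break
    return comments
```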

Usage Examples

Scrape New Posts from a Subreddit

{
    "subreddits": ["forhire"],
    "sort": "new",
    "maxResults": 50
}

Search Inside Multiple Subreddits

{
    "subreddits": ["forhire", "freelance", "webscraping"],
    "searchQueries": ["hiring developer", "need help building"],
    "sort": "new",
    "timeFilter": "month",
    "maxResults": 100
}

Scrape Top Posts with Comments

{
    "subreddits": ["startups"],
    "sort": "top",
    "timeFilter": "week",
    "maxResults": 25,
    "includeComments": true,
    "maxComments": 5
}

Large Run with Proxy

{
    "subreddits": ["forhire", "freelance", "entrepreneur", "SaaS"],
    "searchQueries": ["looking for developer", "need automation"],
    "sort": "relevance",
    "timeFilter": "week",
    "maxResults": 200,
    "proxy": {
        "useApifyProxy": true
    }
}
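
Inputs like the ones above can also be submitted programmatically. The sketch below assumes the `apify-client` Python package and its `actor(...).call(run_input=...)` and `dataset(...).iterate_items()` methods. The actor ID shown is a placeholder (this page does not state it), the token is read from a hypothetical `APIFY_TOKEN` environment variable, and `run_and_collect` is an illustrative helper name.

```python
import os


def run_and_collect(client, actor_id: str, run_input: dict) -> list:
    """Start an actor run, wait for it to finish, and return its dataset items."""
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("the actor run did not return run details")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def main() -> None:
    # Imported lazily so the helper above can be used with any compatible client.
    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    items = run_and_collect(
        client,
        "<username>/<actor-name>",          # placeholder: use the ID shown in Apify Console
        {"subreddits": ["forhire"], "sort": "new", "maxResults": 50},
    )
    for item in items:
        print(item["postId"], item["title"])
```

Call `main()` once the token and the real actor ID are in place.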

Pricing

This actor is designed to be lightweight and inexpensive: approximately $2 per 1,000 results, using a simple pricing model of start fee + per-result charge.
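
For budgeting, the approximate figure above can be turned into a quick upper-bound estimate. The sketch below covers only the per-result part: the start fee is not stated on this page, so it is left out, and the helper names are hypothetical. It also assumes that the number of results cannot exceed maxResults per subreddit (or per subreddit + query combination).

```python
PRICE_PER_1000_RESULTS_USD = 2.0   # approximate figure quoted above


def max_result_count(num_subreddits: int, num_queries: int, max_results: int) -> int:
    """Upper bound on results: maxResults applies to each subreddit/query combination."""
    combinations = num_subreddits * max(num_queries, 1)   # no queries = one feed per subreddit
    return combinations * max_results


def estimate_result_cost_usd(result_count: int) -> float:
    """Approximate per-result charge, excluding the (unstated) start fee."""
    return result_count * PRICE_PER_1000_RESULTS_USD / 1000


# 4 subreddits x 2 queries x 200 results each = at most 1,600 results
upper_bound = max_result_count(4, 2, 200)
print(upper_bound, estimate_result_cost_usd(upper_bound))
```

Deduplication and sparse search results will usually bring the real count below this bound.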

Notes

  • Public data only - the actor reads Reddit's public JSON endpoints. No login, OAuth, or private user data is used.
  • Cookie workaround - Reddit's JSON endpoints require a Cookie header to be present, but the cookie value itself does not need to be real. This actor sends the minimal cookie _=1.
  • Rate limits - Reddit applies per-IP rate limiting. Large runs should use Apify proxy rotation for stability.
  • Comments cost extra requests - enabling includeComments adds one extra request per post that has comments, so large runs will be slower.
  • Result scope - maxResults applies independently to each subreddit, or each subreddit + search query combination when search is enabled.
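
The cookie workaround and the result scope described in these notes can be sketched as follows. This is a simplified illustration using Python's standard library `urllib` rather than curl_cffi, which the actor actually uses. The function names and the User-Agent string are hypothetical, and how the actor orders or merges results internally is an assumption.

```python
import urllib.request
from itertools import product
from typing import List, Optional, Tuple


def build_request(url: str) -> urllib.request.Request:
    """Attach the minimal placeholder cookie described in the notes."""
    headers = {
        "Cookie": "_=1",                                  # value does not need to be real
        "User-Agent": "example-reddit-client/0.1",        # hypothetical UA string
    }
    return urllib.request.Request(url, headers=headers)


def plan_jobs(subreddits: List[str],
              search_queries: List[str]) -> List[Tuple[str, Optional[str]]]:
    """One job per subreddit, or per subreddit + query combination when searching."""
    if not search_queries:
        return [(subreddit, None) for subreddit in subreddits]
    return list(product(subreddits, search_queries))


def merge_results(job_results: List[List[dict]], max_results: int) -> List[dict]:
    """Cap each job at max_results independently, then deduplicate by postId."""
    seen = set()
    merged = []
    for posts in job_results:
        for post in posts[:max_results]:                  # per-job cap
            if post["postId"] in seen:                    # same post found by another job
                continue
            seen.add(post["postId"])
            merged.append(post)
    return merged
```

Because the cap is applied per job and deduplication runs across jobs, the final dataset can hold fewer records than the number of jobs multiplied by maxResults.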

This actor extracts publicly available Reddit data from endpoints accessible to any visitor on the public web. No authentication or account access is used.