Reddit Scraper
Pricing
from $5.00 / 1,000 results
Reddit scraper that extracts posts, comments, communities, and user profiles from any subreddit or search query, so marketers and researchers can collect structured Reddit data without API keys or login.
Developer: Kawsar
Reddit Scraper: Extract Posts, Comments, Communities, and Users
Provide Reddit URLs to scrape directly, or use the search fields to find content by keyword.
Reddit Scraper is an Apify actor that pulls data from Reddit's public JSON endpoints without an account or API key. Feed it a list of Start URLs (subreddits, post links, user profiles, or search pages) or provide Search Terms to find relevant content by keyword. Results are returned as structured records ready for analysis, research, or integration.
Use cases
- Market research: track what Reddit communities say about your product, brand, or competitors by scraping keyword searches on demand
- Content discovery: monitor top posts across subreddits to spot trending topics before they go mainstream
- Sentiment analysis: collect post and comment data for NLP pipelines to measure audience sentiment over time
- Academic research: gather Reddit posts and replies for social media studies without hitting the official API's rate limits
- Lead generation: find active users in niche communities and extract profile information for outreach campaigns
- Competitive intelligence: scrape competitor mentions across subreddits to understand how the community perceives them
What data does this actor extract?
Results are stored in a dataset. Each item has a dataType field (post, comment, community, or user) so you can filter by content type after the run.
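As a minimal sketch of that filtering step, the snippet below groups items by their dataType field. The `records` list is a hand-written stand-in shaped like the actor's output, and `split_by_type` is a hypothetical helper, not something the actor ships.

```python
from collections import defaultdict


def split_by_type(items):
    """Group dataset items by their dataType field."""
    groups = defaultdict(list)
    for item in items:
        # Items without a dataType go under "unknown" rather than being dropped.
        groups[item.get("dataType", "unknown")].append(item)
    return dict(groups)


# Hand-written stand-ins for dataset items (not real run output).
records = [
    {"id": "t3_144w7sn", "dataType": "post"},
    {"id": "t1_jnhqrgg", "dataType": "comment"},
    {"id": "c3h2qmv", "dataType": "user"},
    {"id": "t1_example", "dataType": "comment"},
]

groups = split_by_type(records)
comment_ids = [c["id"] for c in groups["comment"]]
print({kind: len(rows) for kind, rows in groups.items()})
```

The same grouping works on a dataset downloaded as JSON from the Apify console.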
Example Reddit post
{"id": "t3_144w7sn","parsedId": "144w7sn","url": "https://www.reddit.com/r/learnprogramming/comments/144w7sn/","username": "kawsarlog","itemTitle": "Best resources for learning web scraping in 2025","communityName": "r/learnprogramming","parsedCommunityName": "learnprogramming","body": "I have been learning Python for 6 months...","numberOfComments": 42,"upVotes": 318,"isVideo": false,"isAd": false,"over18": false,"createdAt": "2024-06-09T05:23:15.000Z","scrapedAt": "2025-03-14T08:00:00.000Z","dataType": "post"}
Example Reddit comment
{"id": "t1_jnhqrgg","parsedId": "jnhqrgg","url": "https://www.reddit.com/r/learnprogramming/comments/144w7sn/.../jnhqrgg/","parentId": "t3_144w7sn","username": "dev_user","category": "learnprogramming","communityName": "r/learnprogramming","body": "httpx and BeautifulSoup are a great combo for static sites.","upVotes": 27,"numberOfReplies": 3,"createdAt": "2024-06-09T06:14:00.000Z","scrapedAt": "2025-03-14T08:00:01.000Z","dataType": "comment"}
Example Reddit community
{"id": "2fwo","name": "t5_2fwo","itemTitle": "Learn Programming","url": "https://www.reddit.com/r/learnprogramming/","communityName": "r/learnprogramming","communityDescription": "A subreddit for all questions related to programming in any language.","numberOfMembers": 5800000,"over18": false,"createdAt": "2010-01-25T00:00:00.000Z","scrapedAt": "2025-03-14T08:00:02.000Z","dataType": "community"}
Example Reddit user
{"id": "c3h2qmv","url": "https://www.reddit.com/user/kawsarlog/","username": "kawsarlog","postKarma": 1204,"commentKarma": 876,"userDescription": "Building things on the web.","over18": false,"createdAt": "2020-04-10T15:13:39.000Z","scrapedAt": "2025-03-14T08:00:03.000Z","dataType": "user"}
Input
| Parameter | Type | Default | Description |
|---|---|---|---|
| startUrls | array | - | Reddit URLs to scrape: subreddits, posts, user profiles, or search pages. |
| searches | array | - | Keywords to search Reddit for. Returns matching posts, communities, and users. |
| scrapeType | string | posts | What data to extract from the provided URLs or keywords: posts, comments, communities, users. |
| searchCommunityName | string | - | Restrict keyword search to a specific subreddit (name only, without r/). |
| sort | string | hot | Sort order: hot, new, top, relevance, comments. |
| time | string | all | Time filter for Top or search: all, hour, day, week, month, year. |
| includeNSFW | boolean | false | Include NSFW content in results. |
| skipComments | boolean | false | Skip comment extraction when scraping posts. Useful when you only need post data. |
| maxItems | integer | 100 | Maximum number of items to collect per run (hard cap: 1000). |
| timeoutSecs | integer | 300 | Overall actor timeout in seconds. |
| requestTimeoutSecs | integer | 30 | Per-request HTTP timeout in seconds. |
| proxyConfiguration | object | Datacenter (Anywhere) | Proxy settings for requests. Supports Datacenter, Residential, Special, and custom proxies. |
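To catch typos before starting a run, the parameters above can be checked locally. This is a rough sketch: `check_input` is a hypothetical helper, the allowed values are copied from the table, and two rules are assumptions (at least one of startUrls or searches must be set, and maxItems has a lower bound of 1).

```python
ALLOWED_SORT = {"hot", "new", "top", "relevance", "comments"}
ALLOWED_TIME = {"all", "hour", "day", "week", "month", "year"}
MAX_ITEMS_CAP = 1000  # hard cap stated in the Input table


def check_input(run_input):
    """Return a list of problems found in a run input dict (empty if none)."""
    problems = []
    # Assumption: the actor needs at least one source of work.
    if not run_input.get("startUrls") and not run_input.get("searches"):
        problems.append("provide startUrls or searches")
    sort = run_input.get("sort", "hot")
    if sort not in ALLOWED_SORT:
        problems.append(f"unknown sort: {sort!r}")
    time_filter = run_input.get("time", "all")
    if time_filter not in ALLOWED_TIME:
        problems.append(f"unknown time: {time_filter!r}")
    max_items = run_input.get("maxItems", 100)
    # bool is a subclass of int in Python, so reject it explicitly.
    is_int = isinstance(max_items, int) and not isinstance(max_items, bool)
    if not is_int or not 1 <= max_items <= MAX_ITEMS_CAP:
        problems.append(f"maxItems must be an integer from 1 to {MAX_ITEMS_CAP}")
    return problems


print(check_input({"searches": ["web scraping python"], "sort": "best"}))
```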
How to scrape Reddit by URL
Pass any Reddit URL to startUrls. The actor detects the page type automatically.
Supported URL formats:
- Subreddit: https://www.reddit.com/r/programming/
- Subreddit with sort: https://www.reddit.com/r/programming/top/
- Post with comments: https://www.reddit.com/r/learnprogramming/comments/144w7sn/
- User profile: https://www.reddit.com/user/spez/
- Search results: https://www.reddit.com/search/?q=data+scraping
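The actor's own detection logic is not published here, but page-type detection for the URL formats above could be sketched roughly as follows. `detect_page_type` and its patterns are illustrative assumptions that only look at the URL path.

```python
import re
from urllib.parse import urlparse

# Order matters: a post URL also starts with /r/<name>/, so check it first.
PATTERNS = [
    ("post", re.compile(r"^/r/[^/]+/comments/[^/]+")),
    ("subreddit", re.compile(r"^/r/[^/]+")),
    ("user", re.compile(r"^/user/[^/]+")),
    ("search", re.compile(r"^/search/?$")),
]


def detect_page_type(url):
    """Classify a Reddit URL by its path; returns None if nothing matches."""
    path = urlparse(url).path
    for page_type, pattern in PATTERNS:
        if pattern.match(path):
            return page_type
    return None


for url in [
    "https://www.reddit.com/r/programming/top/",
    "https://www.reddit.com/r/learnprogramming/comments/144w7sn/",
    "https://www.reddit.com/user/spez/",
    "https://www.reddit.com/search/?q=data+scraping",
]:
    print(detect_page_type(url), url)
```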
How to search Reddit by keyword
Add one or more terms to searches. Each keyword triggers a Reddit search and returns posts, matching communities, and user profiles. Use searchCommunityName to restrict results to a single subreddit.
Example input
{"startUrls": ["https://www.reddit.com/r/datascience/"],"searches": ["web scraping python"],"sort": "top","time": "month","maxItems": 200,"skipComments": false,"includeNSFW": false,"proxyConfiguration": { "useApifyProxy": true }}
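One way to send this input programmatically is the `apify-client` Python package. The sketch below is hedged: `ACTOR_ID` is a placeholder because the actor's ID is not given in this README, `build_run_input` and `run_actor` are hypothetical helpers, and the client import is deferred so the input-building part runs without the package installed.

```python
import json

# Placeholder: the actor's real ID is not given in this README.
ACTOR_ID = "<username>/<actor-name>"


def build_run_input(start_urls=(), searches=(), **options):
    """Assemble a run input dict using the field names from the Input table."""
    run_input = {"startUrls": list(start_urls), "searches": list(searches)}
    run_input.update(options)
    return run_input


def run_actor(token, run_input, actor_id=ACTOR_ID):
    """Start a run and return its dataset items (needs `pip install apify-client`)."""
    from apify_client import ApifyClient  # third-party package, imported lazily

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


run_input = build_run_input(
    start_urls=["https://www.reddit.com/r/datascience/"],
    searches=["web scraping python"],
    sort="top",
    time="month",
    maxItems=200,
    proxyConfiguration={"useApifyProxy": True},
)
print(json.dumps(run_input, indent=2))
```

Calling `run_actor("<your-api-token>", run_input)` would then start a run and collect its items once a real actor ID is filled in.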
Output fields
| Field | Type | Present on | Description |
|---|---|---|---|
| id | string | all | Full Reddit ID with type prefix (e.g. t3_abc). |
| parsedId | string | all | Short Reddit ID without prefix. |
| url | string | all | Permalink to the item on Reddit. |
| username | string | post, comment, user | Reddit username of the author. |
| dataType | string | all | Item type: post, comment, community, or user. |
| itemTitle | string | post, community | Post title or subreddit display name. |
| communityName | string | post, comment, community | Subreddit name with r/ prefix. |
| parsedCommunityName | string | post, comment, community | Subreddit name without r/ prefix. |
| body | string | post, comment | Post text or comment text. Contains URLs for media posts. |
| numberOfComments | integer | post | Number of comments on the post. |
| upVotes | integer | post, comment | Score (upvotes minus downvotes). |
| isVideo | boolean | post | True if the post has a video attachment. |
| isAd | boolean | post | True if the post is a promoted advertisement. |
| over18 | boolean | all | True if the item is marked NSFW. |
| parentId | string | comment | Reddit ID of the parent post or comment. |
| category | string | comment | Subreddit name the comment belongs to. |
| numberOfReplies | integer | comment | Number of direct replies. |
| name | string | community | Full Reddit internal name (e.g. t5_2fwo). |
| headerImage | string | community | URL of the community header or icon image. |
| communityDescription | string | community | Public subreddit description. |
| numberOfMembers | integer | community | Total subscriber count. |
| userIcon | string | user | URL of the user's profile avatar. |
| postKarma | integer | user | Total post karma. |
| commentKarma | integer | user | Total comment karma. |
| userDescription | string | user | User bio text. |
| createdAt | string | all | ISO 8601 timestamp of when the item was created on Reddit. |
| scrapedAt | string | all | ISO 8601 timestamp of when this actor collected the item. |
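The two timestamp fields are plain ISO 8601 strings, so they can be parsed with the standard library. As a small sketch, the snippet below computes how old the example post was when it was scraped; `parse_timestamp` is a hypothetical helper.

```python
from datetime import datetime


def parse_timestamp(value):
    """Parse an ISO 8601 string such as '2024-06-09T05:23:15.000Z'."""
    # Older Python versions do not accept a trailing 'Z', so map it to an offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


# Timestamps copied from the example Reddit post.
post = {
    "createdAt": "2024-06-09T05:23:15.000Z",
    "scrapedAt": "2025-03-14T08:00:00.000Z",
}

age = parse_timestamp(post["scrapedAt"]) - parse_timestamp(post["createdAt"])
print(age.days)  # 278
```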
How it works
- The actor reads your startUrls list and searches keywords from the input.
- For each URL, it detects the page type (subreddit, post, user, or search) and calls the appropriate Reddit public JSON endpoint.
- For each search keyword, it searches Reddit posts, communities, and users via the search API.
- Results are parsed into structured records and pushed to the Apify dataset in real time.
- The run stops when maxItems or timeoutSecs is reached, whichever comes first.
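The stopping rule in the last step can be illustrated with a short loop. This is a simplified sketch, not the actor's source: `collect` is a hypothetical function, the in-memory generators stand in for parsed URL and search results, and the `clock` parameter exists only so the timeout can be tested.

```python
import time


def collect(sources, max_items, timeout_secs, clock=time.monotonic):
    """Drain record iterables until max_items or timeout_secs is reached."""
    deadline = clock() + timeout_secs
    collected = []
    for source in sources:
        for record in source:
            # Stop on whichever limit is hit first.
            if len(collected) >= max_items or clock() >= deadline:
                return collected
            collected.append(record)
    return collected


# Two in-memory "sources" stand in for parsed URL and search results.
url_results = ({"id": f"t3_{n}", "dataType": "post"} for n in range(5))
search_results = ({"id": f"t3_s{n}", "dataType": "post"} for n in range(5))
collected_records = collect([url_results, search_results], max_items=7, timeout_secs=300)
print(len(collected_records))  # 7
```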
FAQ
Do I need a Reddit account or API key to use this actor?
No. Reddit Scraper uses Reddit's public JSON API, which works without authentication. No credentials are required.
Is it legal to scrape Reddit?
Scraping publicly available data is generally legal for personal use, research, and analysis. Review Reddit's terms of service for commercial use cases. Always respect rate limits and avoid collecting personal data beyond what is publicly visible.
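Why am I getting fewer results than my maxItems setting?
Reddit's API paginates results, and the actor stops once no further pages are available. If a subreddit or search has fewer items than your maxItems limit, the run returns everything it found. A run also ends early if timeoutSecs is reached, and maxItems itself is capped at 1000. To gather more data, widen the time filter (for example, from week to all) or add more start URLs.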
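To see why a run can end below maxItems, here is a rough sketch of cursor pagination. Reddit listing responses carry an `after` cursor and a `children` array under `data`. The `paginate` function and the fake two-page `PAGES` data are illustrative assumptions, not the actor's code.

```python
def paginate(fetch_page, max_items):
    """Follow `after` cursors until the listing ends or max_items is reached."""
    items, after = [], None
    while len(items) < max_items:
        listing = fetch_page(after)["data"]
        items.extend(child["data"] for child in listing["children"])
        after = listing.get("after")
        if not after or not listing["children"]:
            break  # no further pages: fewer results than requested
    return items[:max_items]


# Fake two-page listing standing in for HTTP responses.
PAGES = {
    None: {"data": {"after": "t3_b", "children": [{"data": {"id": "a"}}, {"data": {"id": "b"}}]}},
    "t3_b": {"data": {"after": None, "children": [{"data": {"id": "c"}}]}},
}

posts = paginate(lambda after: PAGES[after], max_items=100)
print(len(posts))  # 3
```

Even with max_items set to 100, only three posts come back because the second page has no `after` cursor.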
Can I export scraped Reddit data to CSV or Google Sheets?
Yes. After a run, you can download the dataset in JSON, CSV, XML, or Excel from the Apify console. You can also connect directly to Google Sheets using the Apify Google Sheets integration.
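If you already have the items as JSON and want CSV locally, the standard library is enough. The sketch below assumes flat items like the examples in this README; `items_to_csv` is a hypothetical helper, and the two sample rows are trimmed stand-ins.

```python
import csv
import io


def items_to_csv(items):
    """Serialize dataset items to CSV text; missing fields become empty cells."""
    # Union of keys in first-seen order, so mixed item types share one header.
    fieldnames = list(dict.fromkeys(key for item in items for key in item))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(items)
    return buffer.getvalue()


rows = [
    {"id": "t3_144w7sn", "dataType": "post", "upVotes": 318},
    {"id": "t1_jnhqrgg", "dataType": "comment", "upVotes": 27, "numberOfReplies": 3},
]
csv_text = items_to_csv(rows)
print(csv_text)
```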
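How do I scrape Reddit comments only?
Point startUrls at a specific post URL. Comments are included by default, alongside the post itself. To keep only the comments, filter the dataset for items whose dataType is comment. Set skipComments to true if you only want the post itself.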
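Once a post and its comments are in the dataset, the parentId field links each comment to the post or comment it replies to. A minimal sketch, with `comments_by_parent` as a hypothetical helper and `t1_reply` as a made-up ID for illustration:

```python
from collections import defaultdict


def comments_by_parent(items):
    """Map each parentId to the comments replying to it, ignoring other item types."""
    threads = defaultdict(list)
    for item in items:
        if item.get("dataType") == "comment":
            threads[item["parentId"]].append(item)
    return dict(threads)


# Stand-in items; "t1_reply" is a made-up ID for illustration.
thread_items = [
    {"id": "t3_144w7sn", "dataType": "post"},
    {"id": "t1_jnhqrgg", "parentId": "t3_144w7sn", "dataType": "comment"},
    {"id": "t1_reply", "parentId": "t1_jnhqrgg", "dataType": "comment"},
]

threads = comments_by_parent(thread_items)
top_level = [c["id"] for c in threads["t3_144w7sn"]]
print(top_level)  # ['t1_jnhqrgg']
```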
Do I need proxies to scrape Reddit?
For most use cases, datacenter proxies work fine. If Reddit throttles your requests, switch to Residential proxies in the proxy picker. The default proxyConfiguration uses Apify Datacenter proxies automatically.
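Outside the console's proxy picker, the switch can be made in the run input. The sketch below uses `apifyProxyGroups` with the `RESIDENTIAL` group, which is the shape Apify's proxy input commonly takes. Whether this actor accepts it unchanged is an assumption, and `with_residential_proxy` is a hypothetical helper.

```python
def with_residential_proxy(run_input):
    """Return a copy of run_input that requests Apify Residential proxies."""
    updated = dict(run_input)  # leave the caller's dict untouched
    updated["proxyConfiguration"] = {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"],  # assumption: group name accepted by this actor
    }
    return updated


base_input = {
    "searches": ["web scraping python"],
    "proxyConfiguration": {"useApifyProxy": True},
}
retry_input = with_residential_proxy(base_input)
print(retry_input["proxyConfiguration"])
```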
Integrations
Connect Reddit Scraper with other apps and services using Apify integrations. You can push results directly to Google Sheets, trigger runs from Zapier or Make, send notifications to Slack, or sync data to Airbyte. Use webhooks to fire actions the moment results are ready.
Reddit Scraper is a solid tool for anyone who wants structured Reddit data without dealing with the official API rate limits or authentication setup.