Reddit Posts Scraper (an Apify Actor by Scrapium)
Pricing: $19.99/month + usage
Reddit Posts Scraper
Scrape posts and comments from Reddit by subreddit, URL, or keyword. Get structured data with automatic proxy fallback.
What Is This Actor?
Reddit Posts Scraper is an Apify Actor that extracts public Reddit posts and comments in one run. You can target subreddits, full URLs, or search keywords and receive clean, structured JSON, perfect for research, analytics, NLP, brand monitoring, and automation.
No coding required • Proxy fallback (datacenter → residential) • Retries on blocks and timeouts • Export to JSON, CSV, or API
Why Choose This Actor?
| Feature | What it means |
|---|---|
| Fast & scalable | Handles hundreds of posts per source with parallel comment fetching |
| Flexible inputs | Subreddits, URLs, or keywords: one field, multiple formats |
| Reliable | Automatic proxy fallback and retries on 403, 5xx, and timeouts |
| Structured output | Subreddit, title, author, score, comments, links, timestamps, and more |
| Tunable | Sort order, time filter, post/comment limits, request delay, proxy |
| Beginner-friendly | Simple form in Apify Console; no setup or code needed |
Input Parameters
Input is grouped into four sections in the Apify Console.
Where to scrape
| Field | Type | Description |
|---|---|---|
| Reddit URLs / Subreddits / Keywords | List (required) | One per line: full URLs (e.g. https://www.reddit.com/r/news/), subreddit names (e.g. news or r/news), or search keywords (e.g. artificial intelligence). |
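The accepted input formats above can be illustrated with a small helper. `classify_source` is a hypothetical sketch of the distinction between the three formats, not the actor's actual parser:

```python
# Illustration of the accepted "Where to scrape" formats; a hypothetical
# sketch, not the actor's internal parser.
def classify_source(line: str) -> str:
    """Return 'url', 'subreddit', or 'keyword' for one input line."""
    line = line.strip()
    if line.startswith(("http://", "https://")):
        return "url"
    # Accept both "r/news" and a bare community name like "news".
    name = line[2:] if line.startswith("r/") else line
    if name and " " not in name and name.replace("_", "").isalnum():
        return "subreddit"
    return "keyword"
```

For example, `https://www.reddit.com/r/news/` is treated as a URL, `news` or `r/news` as a subreddit, and `artificial intelligence` as a search keyword.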
Sorting & time range
| Field | Type | Description |
|---|---|---|
| Sort order | Dropdown | Hot, New, Top, or Rising: how posts are ordered. |
| Time filter | Dropdown | Past hour, Past 24 hours, Past week, Past month, Past year, or All time. Only applies when sort order is Top or Rising. |
Limits
| Field | Type | Description |
|---|---|---|
| Maximum posts per source | Number (1–1000) | Max posts to scrape per subreddit/keyword. Default: 50. |
| Maximum comments per post | Number (0–1000) | Max comments to fetch per post. Set to 0 to skip comments. Default: 100. |
Proxy & network
| Field | Type | Description |
|---|---|---|
| Delay between requests (seconds) | Number (0–30) | Pause between requests to reduce rate limiting. A small random delay is added automatically. Default: 1. |
| Proxy configuration | Proxy picker | Choose proxies (e.g. Apify Proxy). If Reddit blocks a request, the actor falls back to residential proxies. Recommended for large runs. |
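The fallback behavior described in this section (no proxy, then datacenter, then residential, with retries on blocks) can be sketched as follows. `fetch_with_fallback`, the injected `fetch` callable, and the tier names are illustrative assumptions, not the actor's real implementation:

```python
import time

# Proxy tiers tried in order when Reddit blocks a request (sketch only).
PROXY_TIERS = [None, "datacenter", "residential"]
RETRYABLE = {403, 429, 500, 502, 503}

def fetch_with_fallback(fetch, url, delay=1.0):
    """Try each proxy tier in turn; retry on block/server-error statuses."""
    last_status = None
    for proxy in PROXY_TIERS:
        status, body = fetch(url, proxy=proxy)
        if status == 200:
            return body
        last_status = status
        if status in RETRYABLE:
            time.sleep(delay)  # corresponds to the configurable request delay
            continue
        break  # non-retryable error: give up early
    raise RuntimeError(f"blocked on all proxy tiers (last status {last_status})")
```

Escalating to a stronger proxy tier only after a block keeps costs down, since residential traffic is typically priced higher than datacenter traffic.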
Output (Dataset)
Results are saved to the Reddit Posts Data dataset. Each row is one post with the following fields:
| Column | Description |
|---|---|
| Subreddit | Community name (e.g. news, technology) |
| Title | Post title |
| Author | Reddit username of the poster |
| Score | Upvotes / score |
| # Comments | Number of comments |
| Posted (UTC) | Unix timestamp (UTC) |
| Link to post | Permalink to the Reddit thread |
| Post text | Body/selftext of the post |
| Thumbnail | Thumbnail image URL |
| Image | Main image URL (if any) |
| Comments | Array of comments (author, body, score, created_utc, replies) |
| Post ID | Reddit post ID |
| Success | Whether the post was scraped successfully |
| Error (if any) | Error message if the post failed |
You can export the dataset as JSON, CSV, or Excel, or use the Apify API to fetch results.
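For programmatic access, the Apify API serves dataset items at `https://api.apify.com/v2/datasets/{datasetId}/items`. A minimal sketch of building that export URL (the dataset ID below is a placeholder):

```python
import urllib.parse

def dataset_items_url(dataset_id, fmt="json", limit=None):
    """Build the Apify dataset-items export URL (format: json, csv, xlsx, ...)."""
    params = {"format": fmt}
    if limit is not None:
        params["limit"] = limit
    return (f"https://api.apify.com/v2/datasets/{dataset_id}/items?"
            + urllib.parse.urlencode(params))
```

Any HTTP client can then download the items; for a private dataset you also need to authenticate, e.g. with a `token` query parameter or an Authorization header.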
How to Use (Apify Console)
- Log in at console.apify.com.
- Find the actor: search for Reddit Posts Scraper, or open it from the store.
- Fill in the input:
  - Under Where to scrape, add subreddits, URLs, or keywords (one per line).
  - Optionally set Sort order, Time filter, Limits, and Proxy & network.
- Run: click Start and watch the run log.
- Get results: open the Output tab, preview the dataset, and Export (JSON/CSV/Excel) or use the API.
Key Features
- Multiple input types: subreddits, full Reddit URLs, or search keywords in one list.
- Sort & filter: Hot, New, Top, or Rising, plus a time range (past hour to all time).
- Scalable limits: up to 1000 posts per source and up to 1000 comments per post (or 0 to skip comments).
- Proxy fallback: no proxy → datacenter → residential if Reddit blocks.
- Retries: automatic retries on 403, 429, 5xx (e.g. UPSTREAM503/502), timeouts, and SSL/connection issues.
- Live saving: data is pushed to the dataset as it is scraped, so partial results are kept if the run stops.
- Structured JSON: ready for analytics, NLP, dashboards, and integrations (n8n, Zapier, Make, etc.).
Best Use Cases
| Use case | How this actor helps |
|---|---|
| Market & trend research | Pull top posts and comments by keyword or subreddit. |
| NLP / ML datasets | Get clean text (title, body, comments) for training or analysis. |
| Content & SEO | Discover what people talk about and find content ideas. |
| Brand monitoring | Track mentions and sentiment across communities. |
| Journalism & research | Gather quotes and discussions from public threads. |
| Automation | Trigger runs via API or connect to n8n, Zapier, or Google Sheets. |
Legal & Ethical Use
- Allowed: scraping publicly available Reddit content for research, analytics, and insights.
- Do not: scrape private subreddits without permission, misuse personal data, or ignore Reddit's terms and rate limits.
- This actor is designed for ethical, compliant use of public data only.
Input / Output Examples
Example input (JSON)

```json
{
  "startUrls": [
    "https://www.reddit.com/r/news/",
    "news",
    "artificial intelligence"
  ],
  "sortOrder": "top",
  "timeFilter": "week",
  "maxPosts": 50,
  "maxComments": 100,
  "requestDelay": 1,
  "proxyConfiguration": { "useApifyProxy": false }
}
```
Example output item (one post)

```json
{
  "subreddit": "news",
  "title": "Example post title",
  "author": "username",
  "score": 156,
  "num_comments": 42,
  "created_utc": 1703123456.789,
  "permalink": "https://www.reddit.com/r/news/comments/abc123/...",
  "body": "Post content...",
  "thumbnail_url": "https://...",
  "image_url": "https://...",
  "comments": [
    {
      "author": "commenter1",
      "body": "Comment text...",
      "score": 23,
      "created_utc": 1703123456.789,
      "replies": []
    }
  ],
  "post_id": "abc123",
  "success": true,
  "error_message": null
}
```
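As a quick post-processing example, an output item like the one above can be reduced to analysis-ready fields, e.g. converting `created_utc` to an ISO timestamp and collecting comment text for NLP. `summarize_post` is a hypothetical helper based on the field names shown:

```python
from datetime import datetime, timezone

def summarize_post(item: dict) -> dict:
    """Reduce one dataset row to analysis-ready fields (hypothetical helper)."""
    return {
        "subreddit": item["subreddit"],
        "title": item["title"],
        # created_utc is a Unix timestamp; render it as an ISO 8601 string.
        "posted_at": datetime.fromtimestamp(
            item["created_utc"], tz=timezone.utc
        ).isoformat(),
        "score": item["score"],
        # Flatten top-level comment bodies for text analysis.
        "comment_texts": [c["body"] for c in item.get("comments", [])],
    }
```

Note that `replies` can nest further comments; a real pipeline would walk that tree recursively rather than taking only the top level.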
Frequently Asked Questions
| Question | Answer |
|---|---|
| Is it free? | You can run it on Apify's free plan for small jobs. |
| Is there an API? | Yes: output is structured JSON, and you can call the actor via the Apify API. |
| Are comments included? | Yes: set Maximum comments per post above 0 (or to 0 to skip them). |
| Multiple subreddits? | Yes: add as many as you want in Reddit URLs / Subreddits / Keywords. |
| What if Reddit blocks? | The actor uses proxy fallback (e.g. residential) and retries. |
| Do I need to code? | No: use the form in the Apify Console or send JSON input via the API. |
Support & Feedback
- Bug reports: use the repository Issues section.
- Custom solutions or feature requests: dev.scraperengine@gmail.com

Summary
Reddit Posts Scraper gives you posts and comments from Reddit by subreddit, URL, or keyword, with sort order, time filter, limits, and proxy support. Output is structured, exportable, and integration-ready: ideal for research, analytics, and automation.