Reddit Posts Scraper

Scrape Reddit posts with ease πŸ§΅πŸ‘½ Extract titles, post text, subreddits, usernames, upvotes, comments, timestamps, and links from Reddit threads. Perfect for trend tracking, sentiment analysis, audience research, and content discovery. Turn Reddit data into actionable insights fast πŸš€

Pricing: $19.99/month + usage
Rating: 0.0 (0 reviews)
Developer: Scrapium (Maintained by Community)
Actor stats: 0 bookmarked β€’ 2 total users β€’ 1 monthly active user β€’ last modified a day ago

πŸ€– Reddit Posts Scraper

Scrape posts and comments from Reddit by subreddit, URL, or keyword. Get structured data with automatic proxy fallback. πŸ“Š


πŸ“– What Is This Actor?

🟠 Reddit Posts Scraper is an Apify Actor that extracts public Reddit posts and comments in one run. You can target subreddits, full URLs, or search keywords and receive clean, structured JSONβ€”perfect for research, analytics, NLP, brand monitoring, and automation.

βœ… No coding required β€’ βœ… Proxy fallback (datacenter β†’ residential) β€’ βœ… Retries on blocks & timeouts β€’ βœ… Export to JSON, CSV, or API


🎯 Why Choose This Actor?

  β€’ ⚑ Fast & scalable β€” Handle hundreds of posts per source with parallel comment fetching.
  β€’ 🧩 Flexible inputs β€” Subreddits, URLs, or keywords; one field, multiple formats.
  β€’ πŸ›‘οΈ Reliable β€” Automatic proxy fallback and retries on 403, 5xx, and timeouts.
  β€’ πŸ“€ Structured output β€” Subreddit, title, author, score, comments, links, timestamps, and more.
  β€’ πŸ”§ Tunable β€” Sort order, time filter, post/comment limits, request delay, proxy.
  β€’ 😊 Beginner-friendly β€” Simple form in Apify Console; no setup or code needed.

πŸ“₯ Input Parameters

Input is grouped into four sections in the Apify Console.

πŸ“ Where to scrape

| Field | Type | Description |
| --- | --- | --- |
| 🏷️ Reddit URLs / Subreddits / Keywords | List (required) | One per line: full URLs (e.g. https://www.reddit.com/r/news/), subreddit names (e.g. news or r/news), or search keywords (e.g. artificial intelligence). |
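Because a single field accepts three formats, it can help to sanity-check your source list before a run. Below is a minimal sketch of such a check; it mirrors the documented formats, but the actor's real detection logic is not published, so treat it as an illustration only.

```python
import re

def classify_source(line: str) -> str:
    """Guess how one input line would be interpreted: url, subreddit, or keyword.
    Illustrative only; the actor's actual parsing is not documented."""
    line = line.strip()
    if line.startswith("http://") or line.startswith("https://"):
        return "url"
    # "r/news" or a bare single token like "news" looks like a subreddit name
    if re.fullmatch(r"(r/)?[A-Za-z0-9_]+", line):
        return "subreddit"
    # Anything else (e.g. multi-word phrases) reads as a search keyword
    return "keyword"
```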

πŸ“Š Sorting & time range

| Field | Type | Description |
| --- | --- | --- |
| πŸ”€ Sort order | Dropdown | Hot β€’ New β€’ Top β€’ Rising. How posts are ordered. |
| ⏱️ Time filter | Dropdown | Past hour β€’ Past 24 hours β€’ Past week β€’ Past month β€’ Past year β€’ All time. Only applies when sort order is Top or Rising. |

πŸ”’ Limits

| Field | Type | Description |
| --- | --- | --- |
| πŸ“„ Maximum posts per source | Number (1–1000) | Max posts to scrape per subreddit/keyword. Default: 50. |
| πŸ’¬ Maximum comments per post | Number (0–1000) | Max comments to fetch per post. Set to 0 to skip comments. Default: 100. |

🌐 Proxy & network

| Field | Type | Description |
| --- | --- | --- |
| ⏳ Delay between requests (seconds) | Number (0–30) | Pause between requests to reduce rate limiting. A small random delay is added automatically. Default: 1. |
| πŸ” Proxy configuration | Proxy picker | Choose proxies (e.g. Apify Proxy). If Reddit blocks a request, the actor falls back to residential proxies. Recommended for large runs. |

πŸ“€ Output (Dataset)

Results are saved to the Reddit Posts Data dataset. Each row is one post with the following fields:

| Column | Description |
| --- | --- |
| 🏷️ Subreddit | Community name (e.g. news, technology) |
| πŸ“ Title | Post title |
| πŸ‘€ Author | Reddit username of the poster |
| ⬆️ Score | Upvotes / score |
| πŸ’¬ # Comments | Number of comments |
| πŸ“… Posted (UTC) | Unix timestamp (UTC) |
| πŸ”— Link to post | Permalink to the Reddit thread |
| πŸ“„ Post text | Body/selftext of the post |
| πŸ–ΌοΈ Thumbnail | Thumbnail image URL |
| πŸ–ΌοΈ Image | Main image URL (if any) |
| πŸ’¬ Comments | Array of comments (author, body, score, created_utc, replies) |
| πŸ†” Post ID | Reddit post ID |
| βœ… Success | Whether the post was scraped successfully |
| ⚠️ Error (if any) | Error message if the post failed |

You can export the dataset as JSON, CSV, or Excel, or use the Apify API to fetch results.


πŸš€ How to Use (Apify Console)

  1. πŸ” Log in at console.apify.com.
  2. πŸ” Find the actor β€” search for Reddit Posts Scraper (or open it from the store).
  3. πŸ“₯ Fill in the input:
    • Under Where to scrape, add subreddits, URLs, or keywords (one per line).
    • Optionally set Sort order, Time filter, Limits, and Proxy & network.
  4. ▢️ Run β€” click Start and watch the run log.
  5. πŸ’Ύ Get results β€” open the Output tab, preview the dataset, and Export (JSON/CSV/Excel) or use the API.
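The same steps can be scripted with Apify's Python client (`pip install apify-client`). The actor ID below is a placeholder; copy the real one from the actor's API tab in Apify Console.

```python
ACTOR_ID = "scrapium/reddit-posts-scraper"  # placeholder; check the actor's API tab

run_input = {
    "startUrls": ["r/news"],
    "sortOrder": "top",
    "timeFilter": "week",
    "maxPosts": 10,
    "maxComments": 0,  # skip comments for a quick test run
}

def run_actor(token: str, actor_id: str = ACTOR_ID, run_input: dict = run_input):
    """Start the actor, wait for it to finish, and return the dataset items."""
    from apify_client import ApifyClient  # pip install apify-client
    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())

# Usage: items = run_actor("<your Apify API token>")
```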

✨ Key Features

  • πŸ“Œ Multiple input types β€” Subreddits, full Reddit URLs, or search keywords in one list.
  • πŸ”„ Sort & filter β€” Hot, New, Top, Rising + time range (hour to all time).
  • πŸ“Š Scalable limits β€” Up to 1000 posts per source, up to 1000 comments per post (or 0 to skip comments).
  • πŸ›‘οΈ Proxy fallback β€” No proxy β†’ Datacenter β†’ Residential if Reddit blocks.
  • πŸ” Retries β€” Automatic retries on 403, 429, 5xx (e.g. UPSTREAM503/502), timeouts, and SSL/connection issues.
  • πŸ’Ύ Live saving β€” Data is pushed to the dataset as it’s scraped (partial results kept if the run stops).
  • πŸ“€ Structured JSON β€” Ready for analytics, NLP, dashboards, and integrations (n8n, Zapier, Make, etc.).
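The fallback-and-retry behavior described above can be pictured as a loop over proxy tiers. This is a generic sketch of the pattern, not the actor's actual source; `fetch(url, proxy)` is a caller-supplied function returning a status code and body.

```python
import time

RETRYABLE = {403, 429, 500, 502, 503, 504}
PROXY_TIERS = [None, "datacenter", "residential"]  # escalation order

def fetch_with_fallback(fetch, url, retries_per_tier=2, backoff=0.0):
    """Try each proxy tier in order, retrying retryable statuses, and
    return the first successful (status, body, proxy) triple.
    Illustrative only; not the actor's implementation."""
    last = None
    for proxy in PROXY_TIERS:
        for _attempt in range(retries_per_tier):
            status, body = fetch(url, proxy)
            if status == 200:
                return status, body, proxy
            last = (status, body, proxy)
            if status not in RETRYABLE:
                break  # non-retryable error: escalate to the next tier
            time.sleep(backoff)  # simple pause before retrying
    return last
```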

🎯 Best Use Cases

| Use case | How this actor helps |
| --- | --- |
| πŸ“Š Market & trend research | Pull top posts and comments by keyword or subreddit. |
| 🧠 NLP / ML datasets | Get clean text (title, body, comments) for training or analysis. |
| πŸ“ Content & SEO | Discover what people talk about and find content ideas. |
| πŸ“ˆ Brand monitoring | Track mentions and sentiment across communities. |
| πŸ“° Journalism & research | Gather quotes and discussions from public threads. |
| πŸ”„ Automation | Trigger runs via API or connect to n8n, Zapier, Google Sheets. |
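For the NLP/ML use case, dataset items can be flattened into plain-text documents. A sketch against the output schema in this README (`title`, `body`, and nested `comments`/`replies`):

```python
def text_corpus(items, include_comments=True):
    """Flatten scraped post items into plain-text documents for NLP work.
    Comment threads are walked recursively through each comment's
    `replies` list. Assumes the field names shown in this README."""
    def walk(comments):
        for c in comments or []:
            body = c.get("body", "")
            if body:
                yield body
            yield from walk(c.get("replies"))

    docs = []
    for item in items:
        if not item.get("success"):
            continue  # skip posts that failed to scrape
        parts = [item.get("title", ""), item.get("body", "")]
        if include_comments:
            parts.extend(walk(item.get("comments")))
        docs.append("\n".join(p for p in parts if p))
    return docs
```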

  • βœ… Allowed: Scraping publicly available Reddit content for research, analytics, and insights.
  • ❌ Do not: Scrape private subreddits without permission, misuse personal data, or ignore Reddit’s terms and rate limits.
  • πŸ›‘οΈ This actor is designed for ethical, compliant use of public data only.

πŸ“‹ Input / Output Examples

πŸ“₯ Example input (JSON)

```json
{
  "startUrls": [
    "https://www.reddit.com/r/news/",
    "news",
    "artificial intelligence"
  ],
  "sortOrder": "top",
  "timeFilter": "week",
  "maxPosts": 50,
  "maxComments": 100,
  "requestDelay": 1,
  "proxyConfiguration": { "useApifyProxy": false }
}
```

πŸ“€ Example output item (one post)

```json
{
  "subreddit": "news",
  "title": "Example post title",
  "author": "username",
  "score": 156,
  "num_comments": 42,
  "created_utc": 1703123456.789,
  "permalink": "https://www.reddit.com/r/news/comments/abc123/...",
  "body": "Post content...",
  "thumbnail_url": "https://...",
  "image_url": "https://...",
  "comments": [
    {
      "author": "commenter1",
      "body": "Comment text...",
      "score": 23,
      "created_utc": 1703123456.789,
      "replies": []
    }
  ],
  "post_id": "abc123",
  "success": true,
  "error_message": null
}
```
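Two small helpers for post-processing items shaped like the example above. `count_comments` is a hypothetical helper that counts only the comments actually fetched, which may be fewer than Reddit's own `num_comments` total.

```python
from datetime import datetime, timezone

def posted_at(item):
    """Convert the item's `created_utc` Unix timestamp to an aware UTC datetime."""
    return datetime.fromtimestamp(item["created_utc"], tz=timezone.utc)

def count_comments(comments):
    """Count fetched comments, including those nested under `replies`."""
    total = 0
    for c in comments or []:
        total += 1 + count_comments(c.get("replies"))
    return total
```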

❓ Frequently Asked Questions

| Question | Answer |
| --- | --- |
| πŸ†“ Is it free? | You can run it on Apify’s free plan for small jobs. |
| πŸ”Œ Can I use it like an API? | Yes β€” output is structured JSON, and you can call the actor via the Apify API. |
| πŸ’¬ Are comments included? | Yes β€” set Maximum comments per post above 0 (or to 0 to skip them). |
| πŸ“‚ Multiple subreddits? | Yes β€” add as many as you want under Reddit URLs / Subreddits / Keywords. |
| πŸ›‘οΈ What if Reddit blocks? | The actor falls back to other proxies (e.g. residential) and retries. |
| πŸ‘Ά Do I need to code? | No β€” use the form in Apify Console or send JSON input via the API. |

πŸ› οΈ Support & Feedback

  • 🐞 Bug reports: Use the repository Issues section.
  • ✨ Custom solutions or feature requests: πŸ“§ dev.scraperengine@gmail.com

βœ… Summary

🟠 Reddit Posts Scraper gives you posts + comments from Reddit by subreddit, URL, or keyword, with sort order, time filter, limits, and proxy support. Output is structured, exportable, and integration-readyβ€”ideal for research, analytics, and automation. πŸš€