Reddit Scraper

🔎 Reddit Scraper (reddit-scraper) extracts posts, comments, authors, flair, upvotes & timestamps from subreddits and threads—fast, real-time & reliable. 📊 Perfect for social listening, market research, trend analysis & sentiment. ⚡ Clean JSON/CSV output. 🚀 API-ready.

Pricing: $19.99/month + usage
Rating: 0.0 (0)
Developer: ScraperX (Maintained by Community)
Actor stats: 0 bookmarked · 2 total users · 0 monthly active users · last modified 16 days ago

Reddit Scraper

The Reddit Scraper is a fast, reliable Reddit scraping tool that extracts public posts, comments, subreddit listings/metadata, and user activity via Reddit’s public JSON endpoints — no login required. It solves the need to scrape subreddit posts and download Reddit comments at scale for social listening, market research, and analytics. Built for marketers, developers, data analysts, and researchers, it works as a production-grade reddit comment scraper and reddit post scraper that plugs into your automation stack — from a reddit scraping script to a full reddit scraper python pipeline. Launch one-off runs or build always-on pipelines for Reddit data extraction at scale.

What data / output can you get?

Below are real fields produced by this Reddit Scraper. During a run, it pushes per-type items (search, subreddit, post, user, subreddit_info) and finally a single aggregated object with metadata and segmented data. You can export results as JSON, CSV, or Excel from the Apify dataset UI (or via API).

| Data field | Description | Example value |
|---|---|---|
| title | Post title included in search and subreddit listings | "Show HN: Simple Python scraper" |
| subreddit | Community where the post or comment belongs | "python" |
| author | Post or comment author (user) | "some_redditor" |
| score | Post/comment score | 342 |
| num_comments | Number of comments on a post | 57 |
| url | Canonical post URL | "https://www.reddit.com/r/python/comments/abc123/…" |
| permalink | Reddit permalink for posts and comments | "https://reddit.com/r/python/comments/abc123/…" |
| created_utc | Creation timestamp (UTC, seconds) | 1711987200 |
| upvote_ratio | Upvote ratio for a post (post items) | 0.95 |
| comments | Parsed comment tree for a post (top-level + nested replies up to depth) | [ { "id": "c1", "author": "another_user", … } ] |
| profile | User profile info (name, karma, created) | { "name": "spez", "total_karma": 1000, … } |
| subreddit_info | Popular/new/specific subreddit metadata | { "popular": [ { "name": "python", … } ], … } |

Notes:

  • Bonus subreddit metadata includes: subscribers, active_users, created_utc, subreddit_type, url, icon_img, banner_img, community_icon.
  • Results are exported to the run’s dataset. Download as JSON, CSV, or Excel, or access via the Apify API for downstream analytics and dashboards.
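Exported items are easy to post-process with a few lines of Python. As a minimal sketch (the sample item simply mirrors the fields documented above), this converts Reddit's epoch-second created_utc values into ISO-8601 timestamps before further analysis:

```python
from datetime import datetime, timezone

# Sample item shaped like the data fields documented above.
items = [
    {"title": "Show HN: Simple Python scraper", "subreddit": "python",
     "author": "some_redditor", "score": 342, "num_comments": 57,
     "created_utc": 1711987200},
]

def to_iso(created_utc):
    """Convert Reddit's UTC epoch seconds to an ISO-8601 string."""
    return datetime.fromtimestamp(created_utc, tz=timezone.utc).isoformat()

rows = [{**item, "created_iso": to_iso(item["created_utc"])} for item in items]
print(rows[0]["created_iso"])  # 2024-04-01T16:00:00+00:00
```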

Key features

  • 🚀 Bulk URL ingestion & smart routing
    Paste lists of post, subreddit, or user URLs. The actor detects each and runs the right module (post + comments, subreddit listing, or user profile) — a scalable subreddit scraper for batch jobs.

  • 🔎 Powerful search with sort & time filters
    Run keyword search across Reddit or within a community. Configure sortSearch (relevance, new, hot, top, comments) and timeFilter (hour, day, week, month, year, all) to target fresh or trending posts.

  • 🧵 Post + comment tree parsing
    Fetch full posts with parsed comment trees up to maxCommentDepth — ideal to scrape subreddit posts and download Reddit comments for sentiment and topic analysis.

  • 👤 User profiles, submissions, and comments
    Collect user profile (karma, created_utc), submitted posts, and comments — a reliable reddit comment scraper for user-level activity analysis.

  • 🏷️ Subreddit listings & metadata
    Scrape subreddit posts by sort (hot, new, top, rising, controversial) and enrich with subreddit_info (popular, new, and specific community metadata) for comprehensive Reddit web scraping.

  • 🧰 Fine‑grained limits & pacing
    Control maxItemsToSave, limitPostsPerPage, maxCommentDepth, maxItemsPerUser, and requestDelaySeconds to balance speed and reliability for your reddit scraping tool.

  • 🌐 Automatic proxy fallback (no login needed)
    Starts direct. On block or rate‑limit, it switches to Apify datacenter proxy; if still blocked, tries residential proxy with up to 3 retries — then continues with whichever works.

  • 🧑‍💻 Developer‑friendly & automation‑ready
    Plug the dataset into your workflows via the Apify API, whether you’re building a reddit scraper python integration, a node.js reddit scraper, or replacing a PRAW reddit scraper / pushshift reddit scraper with a no‑login alternative.

  • 🧱 Production‑ready infrastructure
    Built with resilient retries, pagination, and proxy strategy for reliable, continuous Reddit data extraction in pipelines.
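The nested comment trees mentioned above (replies nested under each comment, up to maxCommentDepth levels) flatten naturally into rows for CSV export. A minimal sketch, assuming the id/author/score/replies shape shown in the output examples below:

```python
def flatten_comments(comments, max_depth=5):
    """Walk a nested comment tree and yield flat rows up to max_depth levels."""
    rows = []

    def walk(nodes, depth):
        if depth >= max_depth:
            return
        for node in nodes:
            rows.append({"id": node["id"], "author": node["author"],
                         "score": node.get("score", 0), "depth": depth})
            walk(node.get("replies", []), depth + 1)

    walk(comments, 0)
    return rows

tree = [{"id": "c1", "author": "another_user", "score": 10,
         "replies": [{"id": "c2", "author": "third_user", "score": 3,
                      "replies": []}]}]
print(flatten_comments(tree))  # two rows, depths 0 and 1
```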

How to use Reddit Scraper - step by step

  1. Sign up or log in to your Apify account.
  2. Open Apify Console → Actors → “Reddit Scraper” and click Try it.
  3. Add input:
    • Start URLs: paste Reddit post (/comments/…), subreddit (/r/…), and/or user (/user/…) URLs.
    • Or run keyword search: add Search Term(s) and set Ignore start URLs to search only; optionally add Community and set sort/time filters.
  4. Configure toggles:
    • Enable subreddit posts / post + comments / user scraper / subreddit info as needed.
    • Use Skip comments or Skip user posts if you want to limit scope.
  5. Set limits and depth:
    • maxItemsToSave, limitPostsPerPage, maxCommentDepth, maxItemsPerUser, and requestDelaySeconds.
  6. Optional: Proxy Configuration
    • By default, the actor starts direct and auto‑fallbacks to datacenter → residential on blocks. You can preconfigure Apify Proxy if desired.
  7. Run the actor.
    • Watch Logs for progress. The dataset will receive per‑type items during the run (e.g., type: “search”, “subreddit”, “post”, “user”, “subreddit_info”).
  8. Download results.
    • Open the Dataset tab to export JSON/CSV/Excel or consume via the Apify API in your reddit scraping script or analytics stack.
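The same input can be assembled programmatically as a plain JSON object. The sketch below uses only parameters documented in this README; the commented-out apify-client call is one way to start a run from Python, and the actor ID shown there is a placeholder you should replace with the one from your Console:

```python
import json

# Run input mirroring the documented parameters; anything omitted
# falls back to the defaults listed in the Parameters table.
run_input = {
    "startUrls": ["https://www.reddit.com/r/python/"],
    "searchTerms": ["python scraping"],
    "ignoreStartUrls": False,
    "sortSearch": "new",
    "timeFilter": "week",
    "maxItemsToSave": 10,
    "maxCommentDepth": 5,
    "requestDelaySeconds": 1,
}

# With the official apify-client package you could then start a run:
# from apify_client import ApifyClient
# client = ApifyClient("YOUR_APIFY_TOKEN")
# run = client.actor("<username>/reddit-scraper").call(run_input=run_input)

print(json.dumps(run_input, indent=2))
```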

Pro Tip: Chain this Reddit scraping tool with webhooks, Make/n8n, or your own API to power dashboards and enrichment — perfect for replacing a brittle reddit scraper GitHub script with a robust backend.

Use cases

| Use case | Description |
|---|---|
| Market research — scrape subreddit posts | Analyze trends by collecting hot/new/top posts from target communities with scores, authors, and timestamps. |
| Conversation mining — reddit comment scraper | Parse comment trees from relevant posts to extract sentiment, themes, and FAQs for content strategy. |
| Competitive intelligence — export reddit posts | Track product discussions and feedback across subreddits over time and export to CSV/JSON for dashboards. |
| User activity analysis — download reddit comments | Aggregate a user's submissions and comments to understand behaviors, interests, or reputation. |
| Academic studies — reddit data extraction | Build reproducible datasets across keywords and subreddits with controlled limits and time filters. |
| Data engineering — API pipeline | Use the Apify API to stream structured JSON into warehouses/lakes; orchestrate with n8n/Make/Zapier. |
| Brand monitoring — reddit web scraping | Schedule runs to monitor mentions and collect recent posts/comments for alerting and reporting. |

Why choose Reddit Scraper?

This Reddit Scraper focuses on precision, scale, and reliability — everything you need for production Reddit web scraping.

  • ✅ Accurate, structured output using Reddit’s public JSON endpoints
  • 🌍 Scalable batch scraping (URLs + keyword search) with robust pagination
  • 🔌 Developer access via Apify API; works seamlessly with reddit scraper python and node.js reddit scraper workflows
  • 🧭 Flexible filters (sortSearch, timeFilter) and depth controls (maxCommentDepth)
  • 🔒 No login required; collects only publicly available data
  • 🛡️ Automatic proxy fallback (direct → datacenter → residential with retries) for resilience
  • 💾 Easy export (JSON/CSV/Excel) for BI tools and downstream analytics

Compared to fragile browser extensions or ad hoc scripts, this production-ready reddit scraping script alternative delivers consistent, structured results without micromanagement.

Is it legal to scrape Reddit?

Yes — when done responsibly. This actor fetches only public data; it does not require login and does not access private content.

  • Only public endpoints are used (Reddit’s public JSON format).
  • Do not attempt to access private profiles or authenticated resources.
  • Respect Reddit’s terms and rate limits; set requestDelaySeconds as needed.
  • Ensure compliance with applicable data protection laws (e.g., GDPR/CCPA) and consult your legal team for your specific use.

Input parameters & output format

Example JSON input

```json
{
  "startUrls": [
    "https://www.reddit.com/r/python/",
    "https://www.reddit.com/r/datascience/comments/abc123/example_post/"
  ],
  "skipComments": false,
  "skipUserPosts": false,
  "skipCommunity": false,
  "searchTerms": ["python scraping", "apify reddit"],
  "searchCommunity": "python",
  "ignoreStartUrls": true,
  "searchForPosts": true,
  "searchForComments": false,
  "searchForCommunities": false,
  "searchForUsers": false,
  "sortSearch": "new",
  "timeFilter": "all",
  "filterByDate": "",
  "enableSearch": true,
  "enableSubreddit": true,
  "sortSubreddit": "hot",
  "enablePost": true,
  "enableUser": true,
  "enableSubredditInfo": true,
  "maxItemsToSave": 10,
  "limitPostsPerPage": 10,
  "postDateLimit": "",
  "limitCommentsPerPage": 10,
  "limitCommunityPages": 2,
  "limitUserPages": 2,
  "pageScrollTimeout": 40,
  "maxCommentDepth": 5,
  "maxItemsPerUser": 20,
  "fetchUserProfile": true,
  "fetchUserSubmitted": true,
  "fetchUserComments": true,
  "fetchUserOverview": false,
  "fetchPopularSubreddits": false,
  "fetchNewSubreddits": false,
  "maxSubredditsInfo": 25,
  "requestDelaySeconds": 1,
  "proxyConfiguration": { "useApifyProxy": false }
}
```

Parameters

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| startUrls | array | Yes | - | One or more Reddit URLs to scrape. Bulk input supported. Accepts post (/comments/…), subreddit (/r/…), and user (/user/…) URLs. |
| skipComments | boolean | No | false | Do not fetch comments for post URLs. Only post-level data is collected. |
| skipUserPosts | boolean | No | false | Ignore user profile URLs; no user data scraped. |
| skipCommunity | boolean | No | false | Do not fetch subreddit/community metadata. |
| searchTerms | array | No | - | One or more search phrases. Used when in search mode (no Start URLs or Ignore start URLs enabled). |
| searchCommunity | string | No | - | Restrict search to a specific community (name without "r/"). |
| ignoreStartUrls | boolean | No | false | Ignore Start URLs and run search-only with Search Term(s). |
| searchForPosts | boolean | No | true | Include posts in search results. |
| searchForComments | boolean | No | false | Reserved for future use: include comments in the search scope. |
| searchForCommunities | boolean | No | false | Reserved for future use: include subreddits in the search scope. |
| searchForUsers | boolean | No | false | Reserved for future use: include users in the search scope. |
| sortSearch | string (enum) | No | "new" | Search ordering: relevance, new, hot, top, comments. |
| timeFilter | string (enum) | No | "all" | Search time window: hour, day, week, month, year, all. |
| filterByDate | string | No | - | Optional absolute or relative date filter (pattern-validated). |
| enableSearch | boolean | No | true | Run Reddit search when in search mode and Search Term(s) provided. |
| enableSubreddit | boolean | No | true | Process subreddit URLs for post listings by sort. |
| sortSubreddit | string (enum) | No | "hot" | Subreddit listing sort: hot, new, top, rising, controversial. |
| enablePost | boolean | No | true | Process post URLs for full post + comment tree. |
| enableUser | boolean | No | true | Process user profile URLs for profile, submitted posts, and comments. |
| enableSubredditInfo | boolean | No | true | Fetch subreddit metadata and optional popular/new lists. |
| maxItemsToSave | integer | No | 10 | Global maximum items to collect (applies where relevant). |
| limitPostsPerPage | integer | No | 10 | Max posts to fetch from a single subreddit listing. |
| postDateLimit | string | No | - | Only include posts on/after this date (YYYY-MM-DD). |
| limitCommentsPerPage | integer | No | 10 | Max comments per post. |
| limitCommunityPages | integer | No | 2 | Max listing pages per subreddit/community. |
| limitUserPages | integer | No | 2 | Max pages for user submitted posts and comments. |
| pageScrollTimeout | integer | No | 40 | Timeout (seconds) for page/scroll-related steps. |
| maxCommentDepth | integer | No | 5 | Maximum nested comment depth to parse (1–20). |
| maxItemsPerUser | integer | No | 20 | Max submitted posts and max comments per user. |
| fetchUserProfile | boolean | No | true | Fetch each user's profile. |
| fetchUserSubmitted | boolean | No | true | Fetch posts submitted by each user. |
| fetchUserComments | boolean | No | true | Fetch comments made by each user. |
| fetchUserOverview | boolean | No | false | Also fetch overview and add a summary per user. |
| fetchPopularSubreddits | boolean | No | false | Fetch Reddit's popular subreddits list for subreddit_info.popular. |
| fetchNewSubreddits | boolean | No | false | Fetch new subreddits list for subreddit_info.new. |
| maxSubredditsInfo | integer | No | 25 | Max subreddits to fetch for popular/new lists (1–100). |
| requestDelaySeconds | integer | No | 1 | Delay between HTTP requests (seconds). |
| proxyConfiguration | object | No | { "useApifyProxy": false } | Configure Apify Proxy. By default, runs direct and auto-fallbacks on block. |

Example JSON output

During the run, the actor pushes per‑type items (search, subreddit, post, user, subreddit_info) and finally a single aggregated object with metadata and segmented data.

```json
[
  {
    "type": "search",
    "query": "python scraping",
    "community": "python",
    "total": 2,
    "posts": [
      {
        "title": "Lightweight Python scraper",
        "subreddit": "python",
        "author": "some_redditor",
        "score": 120,
        "num_comments": 15,
        "url": "https://www.reddit.com/r/python/comments/abc123/lightweight_python_scraper/",
        "permalink": "https://reddit.com/r/python/comments/abc123/lightweight_python_scraper/",
        "created_utc": 1711987200
      }
    ]
  },
  {
    "type": "post",
    "postId": "abc123",
    "post": {
      "id": "abc123",
      "title": "Lightweight Python scraper",
      "author": "some_redditor",
      "subreddit": "python",
      "score": 120,
      "upvote_ratio": 0.95,
      "num_comments": 15,
      "created_utc": 1711987200,
      "url": "https://www.reddit.com/r/python/comments/abc123/lightweight_python_scraper/",
      "permalink": "https://reddit.com/r/python/comments/abc123/lightweight_python_scraper/",
      "is_self": true,
      "selftext": "Here is a simple example...",
      "link_flair_text": "Discussion",
      "over_18": false
    },
    "comments": [
      {
        "id": "c1",
        "author": "another_user",
        "body": "Nice approach!",
        "score": 10,
        "created_utc": 1711988200,
        "depth": 0,
        "replies": []
      }
    ],
    "total_comments_parsed": 1
  },
  {
    "metadata": {
      "timestamp": "2026-04-06T10:00:00Z",
      "config": {
        "maxItemsToSave": 10,
        "limitPostsPerPage": 10,
        "maxCommentDepth": 5
      }
    },
    "data": {
      "search": {
        "python scraping": {
          "total": 2,
          "posts": [
            {
              "title": "Lightweight Python scraper",
              "subreddit": "python",
              "author": "some_redditor",
              "score": 120,
              "num_comments": 15,
              "url": "https://www.reddit.com/r/python/comments/abc123/lightweight_python_scraper/",
              "permalink": "https://reddit.com/r/python/comments/abc123/lightweight_python_scraper/",
              "created_utc": 1711987200
            }
          ]
        }
      },
      "subreddit": {
        "python": {
          "total": 2,
          "posts": [
            {
              "title": "Show HN: Simple Python scraper",
              "author": "some_redditor",
              "score": 342,
              "num_comments": 57,
              "url": "https://www.reddit.com/r/python/comments/def456/show_hn_simple_python_scraper/",
              "permalink": "https://reddit.com/r/python/comments/def456/show_hn_simple_python_scraper/",
              "created_utc": 1711900800,
              "is_self": true,
              "selftext": "This is a demo..."
            }
          ]
        }
      },
      "posts": {
        "abc123": {
          "post": {
            "id": "abc123",
            "title": "Lightweight Python scraper",
            "author": "some_redditor",
            "subreddit": "python",
            "score": 120,
            "upvote_ratio": 0.95,
            "num_comments": 15,
            "created_utc": 1711987200,
            "url": "https://www.reddit.com/r/python/comments/abc123/lightweight_python_scraper/",
            "permalink": "https://reddit.com/r/python/comments/abc123/lightweight_python_scraper/",
            "is_self": true,
            "selftext": "Here is a simple example...",
            "link_flair_text": "Discussion",
            "over_18": false
          },
          "comments": [
            {
              "id": "c1",
              "author": "another_user",
              "body": "Nice approach!",
              "score": 10,
              "created_utc": 1711988200,
              "depth": 0,
              "replies": []
            }
          ],
          "total_comments_parsed": 1
        }
      },
      "users": {
        "spez": {
          "profile": {
            "name": "spez",
            "link_karma": 500,
            "comment_karma": 500,
            "total_karma": 1000,
            "created_utc": 1200000000
          },
          "submitted": [
            {
              "title": "Announcement",
              "subreddit": "redditdev",
              "score": 100,
              "created_utc": 1711000000,
              "permalink": "https://reddit.com/r/redditdev/comments/xyz/announcement/"
            }
          ],
          "comments": [
            {
              "body": "Thanks everyone",
              "subreddit": "redditdev",
              "score": 42,
              "created_utc": 1711100000,
              "permalink": "https://reddit.com/r/redditdev/comments/xyz/announcement/"
            }
          ]
        }
      },
      "subreddit_info": {
        "popular": [
          {
            "name": "python",
            "title": "Python",
            "description": "News and discussions for Python.",
            "subscribers": 2500000,
            "active_users": 12000,
            "created_utc": 1143849600,
            "over_18": false,
            "subreddit_type": "public",
            "url": "https://reddit.com/r/python/",
            "icon_img": "https://styles.redditmedia.com/…",
            "banner_img": "https://styles.redditmedia.com/…",
            "community_icon": "https://styles.redditmedia.com/…"
          }
        ],
        "new": [],
        "specific": {
          "python": {
            "name": "python",
            "title": "Python",
            "description": "News and discussions for Python.",
            "subscribers": 2500000,
            "active_users": 12000,
            "created_utc": 1143849600,
            "over_18": false,
            "subreddit_type": "public",
            "url": "https://reddit.com/r/python/",
            "icon_img": "https://styles.redditmedia.com/…",
            "banner_img": "https://styles.redditmedia.com/…",
            "community_icon": "https://styles.redditmedia.com/…"
          }
        }
      }
    }
  }
]
```

Notes:

  • Some fields are optional and may be missing/null depending on your settings. For example, “overview” appears only if fetchUserOverview is true; user “profile” is included only if fetchUserProfile is true; comment trees are included only if Skip comments is false.
  • The final aggregated object does not include a “type” field; per‑type items do.
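Because per-type items carry a type field and the final aggregated object does not, splitting a downloaded dataset is a one-liner. A small sketch with stand-in items shaped like the output above:

```python
# Stand-in dataset items: two per-type items plus the aggregated object.
dataset_items = [
    {"type": "search", "query": "python scraping", "total": 2, "posts": []},
    {"type": "post", "postId": "abc123", "post": {}, "comments": []},
    {"metadata": {"timestamp": "2026-04-06T10:00:00Z"}, "data": {}},
]

# Per-type items have a "type" key; the aggregated object does not.
per_type = [it for it in dataset_items if "type" in it]
aggregated = next(it for it in dataset_items if "type" not in it)

print(len(per_type), "per-type items; aggregated keys:", sorted(aggregated))
```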

FAQ

Do I need to log in to use this Reddit Scraper?

No. The actor uses Reddit’s public JSON endpoints and does not require login. It collects only publicly available data.

Can it scrape Reddit comments and full threads?

Yes. When Enable post + comments is on and Skip comments is false, the actor fetches full posts and parses the comment tree up to maxCommentDepth.

Can I restrict search to a specific subreddit?

Yes. Provide Search Term(s) and set Community (optional) to a subreddit name (without “r/”). You can also configure sortSearch and timeFilter to fine‑tune results.

How do I control the volume and depth of data?

Use maxItemsToSave (global cap), limitPostsPerPage for subreddit listings, maxCommentDepth for comment trees, and maxItemsPerUser for user activity. requestDelaySeconds helps manage rate limits.

How does proxy handling work if Reddit blocks requests?

The run starts direct. If blocked or rate‑limited, it automatically switches to Apify datacenter proxy. If that still fails, it tries residential proxy with up to 3 retries and continues with the first working option.
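That ladder can be pictured as a small retry loop. This is only an illustrative sketch of the behavior described above, not the actor's actual code; fetch is a stand-in for whatever HTTP call you use, and it is assumed to raise when a request is blocked:

```python
def fetch_with_fallback(fetch, url, max_retries=3):
    """Try direct first (proxy=None), then datacenter, then residential,
    retrying each tier up to max_retries times before moving on."""
    for proxy in (None, "datacenter", "residential"):
        for _ in range(max_retries):
            try:
                return fetch(url, proxy), proxy
            except Exception:
                continue
    raise RuntimeError("all proxy tiers exhausted")

# Stub that simulates blocks on direct and datacenter requests.
calls = []
def stub(url, proxy):
    calls.append(proxy)
    if proxy != "residential":
        raise ConnectionError("blocked")
    return "ok"

result, used = fetch_with_fallback(stub, "https://www.reddit.com/r/python.json")
print(used)  # residential
```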

Can I export data for analysis?

Yes. Download results from the Dataset as JSON, CSV, or Excel, or consume via the Apify API. This makes it easy to export Reddit posts and download Reddit comments into your BI stack.

Is this a good alternative to PRAW or Pushshift?

Yes. If you need a no‑login backend for a reddit scraper python workflow, this actor is a strong alternative to maintaining a PRAW reddit scraper or pushshift reddit scraper. It uses Reddit’s public JSON endpoints and returns clean, structured JSON via API.

Can I integrate it with Node.js or Python?

Yes. Results are available via the Apify API, so you can consume them from a node.js reddit scraper, a reddit scraping script, or any backend that reads JSON.
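For instance, dataset items can be pulled straight from Apify's v2 REST API once you know the run's dataset ID. A minimal helper that builds the download URL (the DATASET_ID placeholder is yours to fill in; the actual HTTP request is left to requests/fetch in your stack):

```python
def dataset_items_url(dataset_id, fmt="json", token=None):
    """Build the Apify API v2 URL for downloading a dataset's items.
    fmt can be e.g. "json" or "csv"; token is optional for public data."""
    url = f"https://api.apify.com/v2/datasets/{dataset_id}/items?format={fmt}"
    if token:
        url += f"&token={token}"
    return url

print(dataset_items_url("DATASET_ID", fmt="csv"))
```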

Final thoughts

Reddit Scraper is built for structured, scalable collection of Reddit posts, comment threads, subreddit listings/metadata, and user activity. With flexible filters, depth controls, and a resilient proxy strategy, it’s ideal for marketers, developers, analysts, and researchers. Connect it to your workflows via the Apify API — whether you run a reddit scraper python stack or a node.js reddit scraper — and start extracting smarter Reddit insights today, reliably and at scale.