Pricing

Pay per usage

Reddit Comments Search Scraper

Scrape Reddit by URL or keyword. Returns structured post records with subreddit, author, score, comment count, content, and timestamps. Falls back automatically from direct to datacenter to residential proxies if a request is rate-limited.

Rating: 0.0 (0)

Developer: Scrapier

Last modified: 19 days ago

🔍 Reddit Search Scraper

Scrape Reddit search results and subreddit listings at scale. Paste any Reddit URL (search, subreddit, or subreddit search) and the actor pulls clean, structured records from public Reddit data archives, with no Reddit login or API key required, and saves each post to the dataset as it is scraped.

ℹ️ How it works: Reddit shut down unauthenticated access to its public .json endpoints. This actor instead reads from two public Reddit data archives — PullPush (primary, full-text + subreddit search) and Arctic Shift (fallback for subreddit/author queries) — so it keeps working without you registering a Reddit OAuth app.

💡 Built for marketers, researchers, AI/LLM data pipelines, and competitive-intelligence teams who need clean, structured Reddit data without scraping headaches.

✨ Why choose this Actor?

🚀 Fast — pure async HTTP, no headless browser overhead.
🔓 No credentials needed — reads public Reddit archives, so there's no OAuth app, client ID, or rate-limited Reddit key to manage.
🛡️ Smart proxy ladder — starts direct, auto-falls-back to datacenter → residential if an archive rate-limits the request IP, and stays on residential once it kicks in.
🔁 Resilient — per-request retries with jittered backoff, and 3 retries on the residential tier before giving up.
💾 Live saving — every post is pushed to the dataset as it's scraped, so a mid-run crash never loses work.
🧱 Bulk URLs — feed it any number of Reddit URLs in one run.
📊 Pre-built dataset views — Overview, Post, Subreddit, Author, Content, and Full Record tabs in the Apify Console.
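
The retry behaviour listed above can be illustrated with a minimal sketch of jittered exponential backoff. This is not the actor's actual code: the function names, the `base` and `cap` values, and the "full jitter" formula are assumptions chosen for illustration.

```python
import random
import time
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def backoff_delays(max_retries: int, base: float = 1.0, cap: float = 30.0,
                   rng: Optional[random.Random] = None) -> List[float]:
    """One 'full jitter' delay per retry: uniform in [0, min(cap, base * 2**attempt)]."""
    rng = rng or random.Random()
    return [rng.uniform(0, min(cap, base * 2 ** attempt))
            for attempt in range(max_retries)]


def retry_with_jitter(fn: Callable[[], T], max_retries: int = 3,
                      base: float = 1.0, cap: float = 30.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn once, then retry up to max_retries times with a jittered pause."""
    delays = backoff_delays(max_retries, base, cap)
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            sleep(delays[attempt])
    raise AssertionError("unreachable")
```

Randomising each delay spreads retries out over time, which helps when many requests hit the same rate limit at once.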

🎯 Key features

🌐 Bulk URL input (search URLs, subreddit URLs, subreddit search URLs)
🔎 Optional keyword fallback when no URLs are supplied
📊 Sort by Relevance / Hot / Top / New / Most Comments
🔞 Safe-search toggle
📦 Hard cap on total items via maxItems
🛡️ Default no-proxy, auto-escalating fallback ladder
📝 Detailed real-time logs so you can watch progress live
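
To make the three supported URL kinds concrete, here is a rough sketch of how they could be told apart. The function name, the returned keys, and the rule of rejecting every other path are hypothetical; the actor's real parsing rules are not documented here.

```python
from urllib.parse import parse_qs, urlparse


def classify_reddit_url(url: str) -> dict:
    """Classify a Reddit URL as 'search', 'subreddit', or 'subreddit_search'."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if host != "reddit.com" and not host.endswith(".reddit.com"):
        raise ValueError(f"not a Reddit URL: {url}")

    parts = [p for p in parsed.path.split("/") if p]
    params = parse_qs(parsed.query)
    query = params.get("q", [None])[0]
    sort = params.get("sort", [None])[0]

    if parts == ["search"]:
        return {"kind": "search", "subreddit": None, "query": query, "sort": sort}
    if len(parts) == 2 and parts[0] == "r":
        return {"kind": "subreddit", "subreddit": parts[1], "query": None, "sort": sort}
    if len(parts) == 3 and parts[0] == "r" and parts[2] == "search":
        return {"kind": "subreddit_search", "subreddit": parts[1],
                "query": query, "sort": sort}
    raise ValueError(f"unsupported Reddit URL shape: {url}")
```

For example, `classify_reddit_url("https://www.reddit.com/r/python/")` yields a record whose `kind` is `"subreddit"` and whose `subreddit` is `"python"`.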

📥 Input

{
  "urls": [
    { "url": "https://www.reddit.com/search/?q=ai&sort=new" },
    { "url": "https://www.reddit.com/r/python/" }
  ],
  "query": "artificial intelligence",
  "sort": "relevance",
  "safeSearch": "off",
  "maxItems": 300,
  "maxRetries": 3,
  "proxyConfiguration": { "useApifyProxy": false }
}

| Field | Type | Description |
| --- | --- | --- |
| `urls` | array | Reddit URLs to scrape (search, subreddit, or subreddit search). |
| `query` | string | Keyword fallback used only when `urls` is empty. |
| `sort` | enum | `relevance` / `hot` / `top` / `new` / `comments`. |
| `safeSearch` | enum | `off` (include NSFW) or `on` (hide NSFW). |
| `maxItems` | integer | Hard cap on total posts across all URLs. |
| `maxRetries` | integer | Per-request retries before escalating proxy tier. |
| `proxyConfiguration` | object | Standard Apify proxy input. Defaults to no proxy. |
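
If you generate the input programmatically, a small helper can catch invalid values before the run starts. The sketch below is one possible way to do it. The helper name and its validation rules are assumptions; only the field names and enum values come from the table above.

```python
ALLOWED_SORTS = {"relevance", "hot", "top", "new", "comments"}
ALLOWED_SAFE_SEARCH = {"off", "on"}


def build_run_input(urls=(), query="", sort="relevance", safe_search="off",
                    max_items=300, max_retries=3) -> dict:
    """Build an input object in the documented shape, rejecting bad values early."""
    if sort not in ALLOWED_SORTS:
        raise ValueError(f"sort must be one of {sorted(ALLOWED_SORTS)}")
    if safe_search not in ALLOWED_SAFE_SEARCH:
        raise ValueError("safeSearch must be 'off' or 'on'")
    if not urls and not query:
        raise ValueError("provide at least one URL or a query")
    if max_items < 1 or max_retries < 0:
        raise ValueError("maxItems must be >= 1 and maxRetries >= 0")
    return {
        "urls": [{"url": u} for u in urls],
        "query": query,  # only used when urls is empty
        "sort": sort,
        "safeSearch": safe_search,
        "maxItems": max_items,
        "maxRetries": max_retries,
        "proxyConfiguration": {"useApifyProxy": False},  # default: no proxy
    }
```

Passing the result through `json.dumps` gives text you can paste into the Console's JSON input editor.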

📤 Output

Each dataset record contains nested `post`, `subreddit`, and `author` objects, plus top-level mirror fields (`title`, `subreddit_name`, `author_name`, `score`, `comment_count`, `url`) so the table views work without nested-path lookups:

{
  "post": {
    "title": "The more young people use AI, the more they hate it",
    "url": "https://www.reddit.com/r/technology/comments/1szusu6/the_more_young_people_use_ai_the_more_they_hate_it/",
    "score": 22036,
    "comment_count": 1612
  },
  "subreddit": { "name": "technology" },
  "author":    { "name": "spherocytes" },
  "contentText": "",
  "content_type": "link",
  "created_timestamp": "2026-04-30T12:34:21.000000+0000",

  "title": "The more young people use AI, the more they hate it",
  "subreddit_name": "technology",
  "author_name": "spherocytes",
  "score": 22036,
  "comment_count": 1612,
  "url": "https://www.reddit.com/r/technology/comments/1szusu6/the_more_young_people_use_ai_the_more_they_hate_it/"
}
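
Because the mirror fields duplicate nested values, a downstream pipeline may want to confirm that the two copies agree before relying on either. A minimal check, assuming the record shape shown above (the `MIRRORS` table and function name are illustrative, not part of the actor):

```python
# Top-level mirror field -> (nested object, key inside it), per the sample record.
MIRRORS = {
    "title": ("post", "title"),
    "url": ("post", "url"),
    "score": ("post", "score"),
    "comment_count": ("post", "comment_count"),
    "subreddit_name": ("subreddit", "name"),
    "author_name": ("author", "name"),
}


def mismatched_mirrors(record: dict) -> list:
    """Return the mirror fields whose value differs from the nested original."""
    bad = []
    for flat_key, (parent, child) in MIRRORS.items():
        nested = (record.get(parent) or {}).get(child)
        if record.get(flat_key) != nested:
            bad.append(flat_key)
    return bad
```

An empty list means every mirror field matches its nested source.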

🚀 How to use the Actor (via Apify Console)

🔐 Log in at console.apify.com → Actors.
🔎 Find Reddit Search Scraper and open it.
📝 Paste one or more Reddit URLs (or type a keyword in the query field).
⚙️ Pick a sort (Relevance / Hot / Top / New / Most Comments) and set maxItems.
🛡️ Leave Proxy on default (no proxy); the scraper auto-escalates if an archive rate-limits the request.
▶️ Click Start.
📊 Watch logs in real time; open the Output tab as records stream in.
📁 Export to JSON / CSV / Excel.
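
The same run can also be started from code instead of the Console. The sketch below assumes the `apify-client` Python package, whose client objects expose `actor(...).call(run_input=...)` and `dataset(...).iterate_items()`. The helper name is hypothetical, and the actor ID and token in the usage comment are placeholders you must fill in yourself.

```python
from typing import Any, List


def run_and_collect(client: Any, actor_id: str, run_input: dict) -> List[dict]:
    """Start the actor, wait for it to finish, and return all dataset items."""
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("actor run did not finish")
    dataset = client.dataset(run["defaultDatasetId"])
    return list(dataset.iterate_items())


# Usage (requires `pip install apify-client`; values below are placeholders):
# from apify_client import ApifyClient
# client = ApifyClient("<YOUR_APIFY_TOKEN>")
# items = run_and_collect(client, "<username>/<actor-name>",
#                         {"query": "artificial intelligence", "maxItems": 300})
```

Taking the client as a parameter keeps the helper easy to test with a stub object.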

🛡️ Proxy strategy

The scraper uses a three-tier ladder (the archives can rate-limit a busy IP):

| Tier | When it's used |
| --- | --- |
| 🌐 Direct | Default. The archives usually serve fine without a proxy. |
| 🏢 Datacenter | Auto-engaged if direct requests get 403 / 429 / rate-limited. |
| 🏠 Residential | Auto-engaged if datacenter still fails. Retries up to 3 times, then stays on residential for the rest of the run. |

You can also start higher up the ladder by selecting a proxy group in the input.
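
The ladder can be pictured as a small state machine that only ever moves upward. The sketch below is an illustration, not the actor's implementation. The class name, the `send` callback, and the choice of one attempt per lower tier are assumptions; the tier order, the 403 / 429 triggers, and the 3 residential retries come from the description above.

```python
from typing import Callable

TIERS = ["direct", "datacenter", "residential"]
BLOCKED_STATUSES = {403, 429}


class ProxyLadder:
    """Tracks the current proxy tier; escalates on blocks and never steps back."""

    def __init__(self, start: str = "direct", residential_retries: int = 3):
        self.index = TIERS.index(start)
        self.residential_retries = residential_retries

    @property
    def tier(self) -> str:
        return TIERS[self.index]

    def fetch(self, send: Callable[[str], int]) -> str:
        """send(tier) performs one request and returns its HTTP status.

        Returns the tier that succeeded, or raises if every tier is blocked.
        """
        while True:
            # Assumption: one attempt on lower tiers, 1 + retries on residential.
            attempts = 1 + self.residential_retries if self.tier == "residential" else 1
            for _ in range(attempts):
                if send(self.tier) not in BLOCKED_STATUSES:
                    return self.tier
            if self.index == len(TIERS) - 1:
                raise RuntimeError("all proxy tiers were rate-limited")
            self.index += 1  # escalate and stay there for later requests
```

Creating the ladder with `start="datacenter"` models the option of starting higher up.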

📊 Sort & data-source notes

Source: PullPush handles global keyword search and subreddit/author search; Arctic Shift serves subreddit- and author-scoped queries as a fast fallback. Both are public Reddit archives.
Sort mapping: each Reddit sort option maps onto one of the archives' sort fields.
- 🎯 Relevance / ⭐ Top / 🔥 Hot → highest score first
- 🆕 New → newest created first
- 💬 Most Comments → highest comment count first
Coverage: archives index publicly posted content; very recent posts (last few minutes) or removed content may not appear. Pagination walks backward in time, so large maxItems runs are ordered newest-to-oldest within each time window.
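
Since large runs arrive ordered by time window, you may want to re-apply the chosen ordering to the finished dataset. A minimal sketch, assuming the top-level field names and timestamp format from the sample output record; the `SORT_KEYS` table and function names are illustrative and do not describe the archives' own query parameters.

```python
from datetime import datetime


def parse_ts(ts: str) -> datetime:
    """Parse timestamps shaped like '2026-04-30T12:34:21.000000+0000'."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%f%z")


# Relevance, Top, and Hot all collapse to "highest score first".
SORT_KEYS = {
    "relevance": lambda r: r["score"],
    "top": lambda r: r["score"],
    "hot": lambda r: r["score"],
    "new": lambda r: parse_ts(r["created_timestamp"]),
    "comments": lambda r: r["comment_count"],
}


def order_records(records: list, sort: str) -> list:
    """Return records in the order the given sort option implies."""
    return sorted(records, key=SORT_KEYS[sort], reverse=True)
```

Note that `relevance`, `top`, and `hot` produce identical orderings under this mapping.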

💼 Best use cases

🤖 Building AI / LLM training datasets from Reddit discussion
📊 Brand monitoring & sentiment analysis
🧠 Market research and competitive intelligence
📝 Content trend discovery
🔬 Academic research on online communities

❓ Frequently asked questions

Q: Does it scrape comments? A: No. This actor returns post-level metadata (title, score, comment count, body text). For per-post comment threads, use an additional actor or extend this one to fetch <permalink>.json, keeping in mind that Reddit's .json endpoints no longer allow unauthenticated access.

Q: Does it support private subreddits? A: No — only publicly accessible subreddits and search results.

Q: Do I need a Reddit account or API key? A: No. The actor reads public Reddit data archives, so there's nothing to register or authenticate.

Q: What happens if an archive rate-limits me? A: The scraper auto-escalates the proxy tier (direct → datacenter → residential) and retries. If every tier still fails, the run ends with a clear status message.

📨 Support and feedback

For issues, custom features, or feedback: dev.scraperengine@gmail.com

⚠️ Legal & ethical use

Only collect data from publicly accessible Reddit pages.
Respect Reddit's terms of service and applicable privacy laws (GDPR / CCPA).
The end user is responsible for downstream use of the data.
