Reddit Scraper

Pricing

Pay per event

Scrape public Reddit posts and comments from subreddit, search, user, and thread RSS feeds without Reddit API credentials.

Developer

Hanna Nosova

Maintained by Community

Collect public Reddit posts and optional comments without Reddit API keys, OAuth, cookies, or a Reddit login.

What does Reddit Scraper do?

Reddit Scraper turns public Reddit pages into clean dataset records that you can export, analyze, or connect to other tools.

It can collect:

  • 🔎 Posts from subreddit listings
  • 🆕 New, top, rising, hot, and relevance-sorted results
  • 🌐 Global Reddit search results
  • 🎯 Search results inside one subreddit
  • 👤 Public submitted posts from user profiles
  • 💬 Comments from public posts when enabled
  • 🧵 Focused comment context from supported Reddit comment URLs

Who is it for?

This Actor is useful for teams that need lightweight public Reddit monitoring.

  • 🧑‍💼 Market researchers tracking conversations around products
  • 🧑‍🔬 Data analysts building public discussion datasets
  • 🧑‍💻 Developers prototyping Reddit-powered workflows
  • 🧑‍🏫 Academic researchers collecting public post metadata
  • 🧑‍🚀 Growth teams monitoring subreddit mentions
  • 🧑‍⚖️ Trust and safety teams checking public discussion trends

Why use this Actor?

Reddit pages can be inconsistent across communities, post ages, and access conditions. This Actor is designed to return useful public records while avoiding guessed or fabricated values.

If Reddit does not make a field available for a public page, the Actor leaves that field empty or null instead of inventing data.

Common use cases

  • Monitor a subreddit for new posts about a brand, product, or competitor
  • Build keyword-based datasets from public Reddit discussions
  • Collect post URLs, titles, authors, timestamps, and comment counts for analysis
  • Pull a small set of comments for thread review or qualitative research
  • Feed public Reddit discussions into dashboards, alerts, or LLM workflows

Supported Reddit sources

You can start runs from several input types:

  • Subreddit URLs such as https://www.reddit.com/r/technology/
  • Reddit post URLs such as https://www.reddit.com/r/technology/comments/.../
  • Reddit comment URLs when you need nearby context
  • Reddit user profile URLs for public submitted posts
  • A global searchQuery
  • A searchQuery restricted to a searchSubreddit
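
To check a list of URLs before starting a run, a rough classifier like the one below can help. It is only a sketch: the path patterns are assumptions based on common public Reddit URL shapes, `classify_reddit_url` is a hypothetical helper, and the Actor's own URL detection is not documented here and may differ.

```python
import re
from urllib.parse import urlparse

# Assumed path shapes for public Reddit URLs (not taken from the Actor itself).
# Order matters: the more specific comment pattern is tried before the post pattern.
_PATTERNS = [
    ("comment", re.compile(r"^/r/[^/]+/comments/[^/]+/[^/]+/[^/]+/?$")),
    ("post", re.compile(r"^/r/[^/]+/comments/[^/]+(/[^/]*)?/?$")),
    ("subreddit", re.compile(r"^/r/[^/]+/?$")),
    ("user", re.compile(r"^/(?:user|u)/[^/]+/?$")),
]


def classify_reddit_url(url: str) -> str:
    """Return 'subreddit', 'post', 'comment', 'user', or 'unknown'."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host != "reddit.com" and not host.endswith(".reddit.com"):
        return "unknown"
    for kind, pattern in _PATTERNS:
        if pattern.match(parsed.path):
            return kind
    return "unknown"


print(classify_reddit_url("https://www.reddit.com/r/technology/"))
```

URLs that come back as `unknown` (for example, profile sub-pages) are worth checking by hand before adding them to `urls`.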

What data can you collect?

| Field | Description |
| --- | --- |
| type | post or comment |
| id | Reddit item id when available |
| thingId | Reddit thing id such as t3_abc or t1_xyz |
| subreddit | Subreddit name |
| title | Post title |
| text | Post text or summary when available |
| selfText | Post text field for compatibility with Reddit-style datasets |
| body | Comment body text |
| author | Public Reddit username |
| url | Post URL |
| permalink | Reddit permalink |
| createdAt | Public timestamp when available |
| score | Score when available; otherwise null |
| commentCount | Comment count when available |
| parentId | Parent post id for comments when inferable |
| depth | Comment nesting depth when available |
| sourceUrl | Input or resolved Reddit source URL |
| scrapedAt | Timestamp when the record was saved |

How much does it cost to scrape Reddit posts?

The Actor uses pay-per-event pricing:

  • Start event: $0.005 one-time run charge
  • Item event: charged per saved post or comment record

| Plan tier | Approx. item event price | Example: 100 posts/comments |
| --- | --- | --- |
| Free | $0.0001329 | $0.0133 + start event |
| Bronze | $0.00011557 | $0.0116 + start event |
| Silver | $0.000090141 | $0.0090 + start event |
| Gold | $0.000069339 | $0.0069 + start event |
| Platinum | $0.000046226 | $0.0046 + start event |
| Diamond | $0.000032358 | $0.0032 + start event |

A small Free-plan test that saves 25 posts costs about $0.0083 total ($0.005 start event + 25 × $0.0001329). A 100-record Free-plan run costs about $0.0183, not counting any Apify platform usage charges shown to you at run time.

Actual price is shown on the Apify Store page before you start a run. Keep maxPostsPerSource and maxCommentsPerPost low for test runs.
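
The arithmetic above can be reproduced with a short estimator. This is a sketch that uses the approximate prices listed in this section; `estimate_run_cost` is a hypothetical helper, the figures exclude platform usage charges, and the price shown on the Apify Store page takes precedence.

```python
# Approximate prices copied from the pricing section; they may change.
START_EVENT_USD = 0.005
ITEM_EVENT_USD = {
    "Free": 0.0001329,
    "Bronze": 0.00011557,
    "Silver": 0.000090141,
    "Gold": 0.000069339,
    "Platinum": 0.000046226,
    "Diamond": 0.000032358,
}


def estimate_run_cost(saved_items: int, tier: str = "Free") -> float:
    """Start event plus one item event per saved post or comment record."""
    if saved_items < 0:
        raise ValueError("saved_items must be non-negative")
    return START_EVENT_USD + saved_items * ITEM_EVENT_USD[tier]


print(f"{estimate_run_cost(25):.4f}")   # 0.0083
print(f"{estimate_run_cost(100):.4f}")  # 0.0183
```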

How to scrape a subreddit

  1. Open the Actor input.
  2. Add a subreddit URL such as https://www.reddit.com/r/technology/.
  3. Choose sort and timeFilter.
  4. Set maxPostsPerSource.
  5. Run the Actor.
  6. Export the dataset as JSON, CSV, Excel, or via API.
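
Steps 2 to 4 correspond to an input object like the one built below. This is a minimal sketch: `subreddit_input` is a hypothetical helper, and its default values (`new`, `week`, 10) are illustrative choices, not the Actor's defaults.

```python
import json


def subreddit_input(subreddit: str, sort: str = "new",
                    time_filter: str = "week", max_posts: int = 10) -> dict:
    """Build an Actor input for one subreddit from a name like 'technology' or 'r/technology'."""
    name = subreddit.strip().strip("/")
    if name.lower().startswith("r/"):
        name = name[2:]
    return {
        "urls": [{"url": f"https://www.reddit.com/r/{name}/"}],
        "sort": sort,
        "timeFilter": time_filter,
        "maxPostsPerSource": max_posts,
    }


# Paste the printed JSON into the Actor input, or pass the dict to an API client.
print(json.dumps(subreddit_input("r/technology", sort="top"), indent=2))
```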

To search all of Reddit instead of a single subreddit, set searchQuery and leave searchSubreddit empty.

Example:

{
  "searchQuery": "web scraping tools",
  "sort": "relevance",
  "timeFilter": "month",
  "maxPostsPerSource": 25
}

How to search inside a subreddit

Set both searchQuery and searchSubreddit.

{
  "searchQuery": "browser automation",
  "searchSubreddit": "webscraping",
  "sort": "new",
  "maxPostsPerSource": 20
}

How to scrape comments

Enable comments only when you need thread data. Comment scraping can add more requests and may take longer than post-only runs.

{
  "urls": [{ "url": "https://www.reddit.com/r/technology/" }],
  "includeComments": true,
  "maxPostsPerSource": 5,
  "maxCommentsPerPost": 10
}
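
Because each saved post and each saved comment is a separate record, comment-enabled runs can grow quickly. The sketch below gives an upper bound on records per source, assuming that `maxPostsPerSource` and `maxCommentsPerPost` act as hard caps; `max_records_per_source` is a hypothetical helper.

```python
def max_records_per_source(max_posts: int, max_comments: int,
                           include_comments: bool = True) -> int:
    """Upper bound: every post is one record, plus up to max_comments comment records."""
    per_post = 1 + (max_comments if include_comments else 0)
    return max_posts * per_post


# The input above: 5 posts, each with up to 10 comments.
print(max_records_per_source(5, 10))  # 55
```

Real runs may save fewer records, since threads can have fewer visible comments than the limit.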

Input options

  • urls - Reddit URLs to scrape.
  • searchQuery - optional Reddit search query.
  • searchSubreddit - optional subreddit restriction.
  • sort - hot, new, top, rising, or relevance.
  • timeFilter - hour, day, week, month, year, or all.
  • maxPostsPerSource - maximum post records per source.
  • includeComments - fetch comments for post records.
  • maxCommentsPerPost - maximum comments for each post.
  • commentContextMode - how comment URLs are handled: focused (context around the linked comment) or post.
  • commentContextDepth - amount of surrounding context for focused comment URLs.
  • proxyConfiguration - Apify proxy settings. Residential proxy is recommended for reliability.
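
A quick pre-flight check can catch typos in these options before a run is charged. This is a sketch under stated assumptions: `validate_input` is a hypothetical helper, the allowed values come from the list above, and the rules that a run needs `urls` or `searchQuery` and that `searchSubreddit` needs `searchQuery` are inferred from this page, not confirmed Actor behavior.

```python
ALLOWED_SORTS = {"hot", "new", "top", "rising", "relevance"}
ALLOWED_TIME_FILTERS = {"hour", "day", "week", "month", "year", "all"}


def validate_input(actor_input: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []
    if not actor_input.get("urls") and not actor_input.get("searchQuery"):
        problems.append("provide urls or searchQuery")
    if actor_input.get("searchSubreddit") and not actor_input.get("searchQuery"):
        problems.append("searchSubreddit needs searchQuery")
    sort = actor_input.get("sort")
    if sort is not None and sort not in ALLOWED_SORTS:
        problems.append(f"unknown sort: {sort!r}")
    time_filter = actor_input.get("timeFilter")
    if time_filter is not None and time_filter not in ALLOWED_TIME_FILTERS:
        problems.append(f"unknown timeFilter: {time_filter!r}")
    for key in ("maxPostsPerSource", "maxCommentsPerPost"):
        value = actor_input.get(key)
        if value is None:
            continue
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            problems.append(f"{key} must be a non-negative integer")
    return problems


print(validate_input({"searchQuery": "web scraping tools", "sort": "best"}))
```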

Output example

{
  "type": "post",
  "id": "abc123",
  "thingId": "t3_abc123",
  "subreddit": "technology",
  "title": "Example Reddit post title",
  "text": "Example public post text or summary",
  "selfText": "Example public post text or summary",
  "body": null,
  "author": "example",
  "url": "https://www.reddit.com/r/technology/comments/abc123/example/",
  "permalink": "https://www.reddit.com/r/technology/comments/abc123/example/",
  "createdAt": "2026-06-13T12:00:00+00:00",
  "score": null,
  "commentCount": 42,
  "parentId": null,
  "depth": null,
  "sourceUrl": "https://www.reddit.com/r/technology/",
  "scrapedAt": "2026-06-13T12:01:00.000Z"
}
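
The two timestamps in the example use slightly different ISO 8601 spellings (`+00:00` versus a trailing `Z`). The sketch below parses both in the same way; `parse_timestamp` is a hypothetical helper, and it assumes real records follow the formats shown in the example.

```python
from datetime import datetime


def parse_timestamp(value):
    """Parse an ISO 8601 string from a record, or return None when the field is null."""
    if value is None:
        return None
    # Python versions before 3.11 reject a trailing "Z", so normalize it first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


record = {
    "createdAt": "2026-06-13T12:00:00+00:00",
    "scrapedAt": "2026-06-13T12:01:00.000Z",
}
created = parse_timestamp(record["createdAt"])
scraped = parse_timestamp(record["scrapedAt"])

# Age of the post at the moment it was saved, in seconds.
print((scraped - created).total_seconds())  # 60.0
```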

Tips for best results

  • Start with small limits while testing.
  • Use subreddit URLs for predictable runs.
  • Use new sort for monitoring recent conversations.
  • Use top with a timeFilter for popular historical posts.
  • Enable comments only for focused collection.
  • Use the default residential proxy setting for reliable Reddit access.
  • For large research jobs, split sources into smaller runs so you can inspect partial results.
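
Splitting sources into smaller runs can be sketched as follows. The batch size and post limit are illustrative, and `batch_inputs` is a hypothetical helper that only builds input objects; it does not start any runs.

```python
def batch_inputs(urls, batch_size=5, max_posts=25):
    """Split source URLs into batches and build one Actor input per batch."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    inputs = []
    for start in range(0, len(urls), batch_size):
        batch = urls[start:start + batch_size]
        inputs.append({
            "urls": [{"url": url} for url in batch],
            "maxPostsPerSource": max_posts,
        })
    return inputs


sources = [f"https://www.reddit.com/r/example{i}/" for i in range(12)]
for run_input in batch_inputs(sources):
    print(len(run_input["urls"]), "sources in this run")
```

Running the batches one at a time makes it possible to inspect each dataset before paying for the next run.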

Limitations

Reddit does not make every field available on every public page.

The Actor may return null for:

  • Score
  • Exact nested comment depth
  • Some deleted or removed text
  • Fields that are hidden, restricted, or unavailable for a given public page

This is intentional. The Actor does not invent unavailable data.
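
Downstream code should therefore treat null as "unknown", not as zero. A minimal sketch, using hypothetical helpers `average_score` and `null_rate` and a few made-up records:

```python
from statistics import mean


def average_score(records):
    """Average over records that have a score; None when no record has one."""
    scores = [r["score"] for r in records if r.get("score") is not None]
    return mean(scores) if scores else None


def null_rate(records, field):
    """Fraction of records where the field is missing or null."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if r.get(field) is None)
    return missing / len(records)


# Made-up records for illustration only.
sample = [{"score": 10}, {"score": None}, {"score": 20}]
print(average_score(sample))
print(null_rate(sample, "score"))
```

Checking the null rate per field first shows whether a metric such as average score is meaningful for a given dataset.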

Availability notes

Results can vary because Reddit controls what is visible for each public URL. Some communities, posts, or comments may be removed, locked, private, quarantined, age-gated, rate-limited, or temporarily unavailable.

For production workflows, use modest limits and enable Apify Residential Proxy in the Actor input.

Proxy and anti-blocking notes

The default input uses Apify Residential Proxy because Reddit may rate-limit shared datacenter or local traffic. Residential proxy settings are more reliable for production runs.

Use the smallest practical limits when testing proxy settings.

Integrations

You can connect the dataset to:

  • Google Sheets for subreddit monitoring
  • Slack alerts for keyword mentions
  • Data warehouses for trend analysis
  • BI dashboards for content tracking
  • LLM workflows for public discussion summaries

API usage with Node.js

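import { ApifyClient } from 'apify-client';

// Read the API token from the environment instead of hard-coding it.
const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

// Start the Actor and wait for the run to finish.
const run = await client.actor('fetch_cat/reddit-scraper').call({
    urls: [{ url: 'https://www.reddit.com/r/technology/' }],
    maxPostsPerSource: 10,
});

// Saved records are stored in the run's default dataset.
console.log(run.defaultDatasetId);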

API usage with Python

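from apify_client import ApifyClient

# Replace the placeholder with your Apify API token.
client = ApifyClient('MY-APIFY-TOKEN')

# Start the Actor and wait for the run to finish.
run = client.actor('fetch_cat/reddit-scraper').call(run_input={
    'searchQuery': 'web scraping tools',
    'maxPostsPerSource': 10,
})

# Saved records are stored in the run's default dataset.
print(run['defaultDatasetId'])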

API usage with cURL

curl "https://api.apify.com/v2/acts/fetch_cat~reddit-scraper/runs?token=$APIFY_TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"urls":[{"url":"https://www.reddit.com/r/technology/"}],"maxPostsPerSource":10}'
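
Once a run has finished, its records can be read from the default dataset. The sketch below uses only the Python standard library and the Apify API dataset items endpoint; `dataset_items_url` and `fetch_items` are hypothetical helper names, and the dataset ID and token in the example are placeholders. A run started with the cURL call above may still be in progress, so check its status before reading items.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id: str, token: str, limit: int = 100) -> str:
    """Build the URL that returns dataset items as a JSON array."""
    query = urllib.parse.urlencode({"token": token, "format": "json", "limit": limit})
    return f"{API_BASE}/datasets/{urllib.parse.quote(dataset_id, safe='')}/items?{query}"


def fetch_items(dataset_id: str, token: str, limit: int = 100) -> list:
    """Download up to `limit` records; this performs a real network request."""
    with urllib.request.urlopen(dataset_items_url(dataset_id, token, limit)) as response:
        return json.load(response)


# Placeholder values; nothing is sent until fetch_items() is called.
print(dataset_items_url("MY-DATASET-ID", "MY-APIFY-TOKEN", limit=10))
```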

MCP usage

Use the Apify MCP server with tools scoped to this Actor:

https://mcp.apify.com/?tools=fetch_cat/reddit-scraper

Claude Code setup:

$ claude mcp add --transport http apify-reddit "https://mcp.apify.com/?tools=fetch_cat/reddit-scraper"

JSON configuration example:

{
  "mcpServers": {
    "apify-reddit": {
      "url": "https://mcp.apify.com/?tools=fetch_cat/reddit-scraper"
    }
  }
}

Example prompts:

  • "Scrape the top posts from r/technology this week."
  • "Find recent Reddit discussions about web scraping tools."
  • "Collect five posts and comments from this Reddit thread."

Data quality guidance

Review saved text before using it for downstream modeling. Public Reddit pages can contain boilerplate, deleted-content markers, link summaries, or moderator notices.

For analytics, filter by type, subreddit, author, and createdAt.
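
A filter over exported records might look like the sketch below. `filter_records` is a hypothetical helper; it assumes `createdAt` carries a UTC offset, as in the output example, and it drops records with a null `createdAt` whenever a date cutoff is given.

```python
from datetime import datetime, timezone


def filter_records(records, record_type=None, subreddit=None,
                   author=None, created_after=None):
    """Keep records that match every filter that was provided."""
    selected = []
    for record in records:
        if record_type and record.get("type") != record_type:
            continue
        if subreddit and (record.get("subreddit") or "").lower() != subreddit.lower():
            continue
        if author and record.get("author") != author:
            continue
        if created_after is not None:
            raw = record.get("createdAt")
            if raw is None:
                continue  # unknown date: cannot confirm it passes the cutoff
            if datetime.fromisoformat(raw.replace("Z", "+00:00")) < created_after:
                continue
        selected.append(record)
    return selected


# Made-up records for illustration only.
rows = [
    {"type": "post", "subreddit": "technology", "author": "a",
     "createdAt": "2026-06-13T12:00:00+00:00"},
    {"type": "comment", "subreddit": "technology", "author": "b",
     "createdAt": "2026-06-14T12:00:00+00:00"},
    {"type": "post", "subreddit": "Technology", "author": "a", "createdAt": None},
]
cutoff = datetime(2026, 6, 13, tzinfo=timezone.utc)
print(len(filter_records(rows, record_type="post", created_after=cutoff)))  # 1
```

The `created_after` value must be timezone-aware; comparing it with an aware timestamp would otherwise raise a `TypeError`.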

Legality and ethics

This Actor collects public Reddit pages available without login. You are responsible for using the data lawfully, respecting privacy, following applicable terms, and avoiding spam or harassment.

Do not use scraped public data to identify, target, or harm individuals.

FAQ

Does this Actor need Reddit API credentials?

No. It collects public Reddit pages and does not require OAuth, cookies, or a Reddit login.

Can it scrape private or quarantined communities?

No. It only collects public content that is available without login.

Can it collect every comment in a large thread?

Not always. Very large, removed, locked, or restricted threads may return fewer comments than requested.

Troubleshooting

Why is score null?

Reddit does not consistently make score values available for every public source. The Actor leaves score null rather than guessing.

Why did I get fewer posts than requested?

Reddit may expose fewer visible entries than requested for that source, sort, search query, or time filter. It may also remove, hide, or temporarily limit posts.

Why are comments missing?

Set includeComments to true and use a post URL or a low number of posts. Some threads have deleted, locked, or unavailable comments.

Other Apify actors from this account may help with social media or monitoring workflows:

  • Website Change Monitor: https://apify.com/fetch_cat/website-change-monitor
  • TikTok Comments Scraper: https://apify.com/fetch_cat/tiktok-comments-scraper

Changelog

0.1

  • Initial Reddit post and comment collection.

Support

If a Reddit URL does not work, report it with the input, the run ID, and the public page you expected to see. Availability can vary by subreddit, post age, access conditions, and Reddit-side restrictions.