Pricing

Pay per usage

Reddit Scraper Pro — Posts, Comments, Subreddits, No API Key

Reddit scraper via public JSON — posts + comments, no login. 20 fields/post (score, ratio, flair, NSFW). CSV/JSON. 101 runs · 6 users · u30d=2 · 27/30d. Trend research + LLM training data. blog.spinov.online · dev.to/0012303 · spinov001@gmail.com

Developer

Alex

Last modified: a day ago

Reddit Discussion Scraper — JSON API, no HTML parsing

Pulls posts (and optionally comment threads) from any public subreddit or Reddit search via Reddit's native JSON endpoint. No login, no Reddit OAuth app, no headless browser.

What you get per post (21 fields verified against `src/main.js`)

{
  "id": "1b2c3d4",
  "title": "What tools do you use for market research?",
  "author": "startup_founder",
  "subreddit": "Entrepreneur",
  "score": 847,
  "upvoteRatio": 0.94,
  "numComments": 234,
  "createdUtc": "2026-03-17T15:30:00.000Z",
  "url": "https://reddit.com/r/Entrepreneur/comments/...",
  "selfText": "I've been looking for affordable tools.",
  "linkUrl": "https://example.com/article",
  "isVideo": false,
  "thumbnail": "https://...",
  "flair": "Discussion",
  "awards": 3,
  "isNSFW": false,
  "isStickied": false,
  "domain": "self.Entrepreneur",
  "source": "r/Entrepreneur",
  "scrapedAt": "2026-04-29T16:32:00.000Z",
  "comments": [
    { "id": "abc123", "author": "data_analyst", "body": "I use a combination of...", "score": 156, "createdUtc": "2026-03-17T16:00:00.000Z", "depth": 0 }
  ]
}

Comment objects carry 6 fields: id, author, body, score, createdUtc, depth.
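For CSV export or row-oriented tools, the nested `comments` array usually has to be flattened. The sketch below assumes the record shape shown above and emits one row per comment, repeating a few parent-post fields. `flatten_post` and the column names are illustrative and not part of the actor.

```python
import csv
import io


def flatten_post(post):
    """Return one row per comment, repeating the parent post's key fields."""
    base = {
        "postId": post["id"],
        "subreddit": post["subreddit"],
        "title": post["title"],
        "postScore": post["score"],
    }
    comments = post.get("comments") or []
    if not comments:
        # Keep comment-less posts as a single row with empty comment columns.
        return [{**base, "commentId": None, "commentAuthor": None,
                 "commentScore": None, "depth": None, "body": None}]
    return [
        {**base,
         "commentId": c["id"],
         "commentAuthor": c["author"],
         "commentScore": c["score"],
         "depth": c["depth"],
         "body": c["body"]}
        for c in comments
    ]


# Trimmed copy of the sample record above (illustrative values only).
sample_post = {
    "id": "1b2c3d4",
    "title": "What tools do you use for market research?",
    "subreddit": "Entrepreneur",
    "score": 847,
    "comments": [
        {"id": "abc123", "author": "data_analyst",
         "body": "I use a combination of...", "score": 156,
         "createdUtc": "2026-03-17T16:00:00.000Z", "depth": 0},
    ],
}

rows = flatten_post(sample_post)
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

The `depth` column is kept so that reply nesting can still be reconstructed or filtered after flattening.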

Input parameters (verified against `.actor/input_schema.json`)

Parameter	Type	Default	Range	Description
`subreddits`	Array	`[]`	—	Subreddit names; the `r/` prefix is optional and stripped
`searchQueries`	Array	`[]`	—	Search terms across all of Reddit
`maxPostsPerSource`	Integer	`50`	1–500	Cap per subreddit / per search query — multi-source runs multiply
`includeComments`	Boolean	`true`	—	Fetch comment threads for each post
`maxCommentsPerPost`	Integer	`20`	1–100	Global cap on comments per post (see depth-first note below)
`sortBy`	String	`"hot"`	hot/new/top/rising	Sort order for post listings. For search queries, `"hot"` is silently rewritten to `"relevance"`
`timeFilter`	String	`"week"`	hour/day/week/month/year/all	Applies primarily to `top` sort

If no subreddits and no searchQueries are provided, the actor falls back to scraping r/technology so the run does not error on empty input.
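The defaults and ranges in the table can also be checked on the client side before a run is started. This is a minimal sketch: `build_run_input` and `normalize_subreddit` are hypothetical helpers written for illustration, and the actor still applies its own input schema regardless of what the caller validates.

```python
SORTS = ("hot", "new", "top", "rising")
TIME_FILTERS = ("hour", "day", "week", "month", "year", "all")


def normalize_subreddit(name):
    """Strip whitespace and an optional leading 'r/' or '/r/' prefix."""
    name = name.strip().lstrip("/")
    if name.lower().startswith("r/"):
        name = name[2:]
    return name.rstrip("/")


def build_run_input(subreddits=(), search_queries=(), max_posts_per_source=50,
                    include_comments=True, max_comments_per_post=20,
                    sort_by="hot", time_filter="week"):
    """Build an input dict using the documented defaults and ranges."""
    if not 1 <= max_posts_per_source <= 500:
        raise ValueError("maxPostsPerSource must be between 1 and 500")
    if not 1 <= max_comments_per_post <= 100:
        raise ValueError("maxCommentsPerPost must be between 1 and 100")
    if sort_by not in SORTS:
        raise ValueError(f"sortBy must be one of {SORTS}")
    if time_filter not in TIME_FILTERS:
        raise ValueError(f"timeFilter must be one of {TIME_FILTERS}")
    return {
        "subreddits": [normalize_subreddit(s) for s in subreddits],
        "searchQueries": list(search_queries),
        "maxPostsPerSource": max_posts_per_source,
        "includeComments": include_comments,
        "maxCommentsPerPost": max_comments_per_post,
        "sortBy": sort_by,
        "timeFilter": time_filter,
    }


run_input = build_run_input(subreddits=["r/SaaS", " startups "],
                            sort_by="top", time_filter="month")
print(run_input["subreddits"])
```

Failing early on an out-of-range value avoids spending a run on input the schema would reject anyway.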

Use cases

Market research — what people say about your product, brand, or industry
Sentiment analysis — collect posts and comments for NLP / LLM pipelines
Trend monitoring — track emerging topics across your target subreddits
Competitive intelligence — monitor competitor mentions and complaints
Content research — find top questions and topics your audience cares about
Lead generation — identify users asking for your kind of product/service

How it works

Fetches https://old.reddit.com/r/<subreddit>/<sortBy>.json and https://old.reddit.com/search.json directly. The old.reddit.com host is less aggressive about IP blocking than www.reddit.com and exposes the same JSON shape.
Uses Apify Residential Proxy (US) when available; falls back to default Apify proxy; falls back to direct fetch in local development.
Random 2–5 second delay before every request (rate-limit hygiene).
Rotates through 4 desktop User-Agent strings per request (Chrome Win/Mac/Linux + Firefox Win).
Cursor-based pagination via Reddit's after parameter.
CheerioCrawler with maxConcurrency: 1, maxRequestRetries: 3, crawler-level cap maxRequestsPerCrawl: 500 (request budget — listing requests + comment requests share this pool).
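The cursor-based pagination described above can be sketched roughly as follows. This is not the actor's code (which lives in `src/main.js`): `listing_url`, `collect_posts`, and `fake_fetch` are hypothetical names, the `limit` and `t` query parameters are assumptions about what gets sent, and the fetcher is a stand-in so the example runs without network access, proxies, or delays.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def listing_url(subreddit, sort_by="hot", time_filter="week", after=None, limit=100):
    """Build an old.reddit.com listing URL; `after` is the pagination cursor."""
    params = {"limit": limit, "t": time_filter}
    if after:
        params["after"] = after
    return f"https://old.reddit.com/r/{subreddit}/{sort_by}.json?{urlencode(params)}"


def collect_posts(fetch_page, subreddit, max_posts, sort_by="hot", time_filter="week"):
    """Follow the `after` cursor until max_posts is reached or the listing ends."""
    posts, after = [], None
    while len(posts) < max_posts:
        limit = min(100, max_posts - len(posts))
        page = fetch_page(listing_url(subreddit, sort_by, time_filter, after, limit))
        children = page["data"]["children"]
        posts.extend(child["data"] for child in children)
        after = page["data"].get("after")
        if not children or after is None:
            break  # no more pages
    return posts[:max_posts]


def fake_fetch(url):
    """Stand-in fetcher: serves 250 fake posts in Listing-shaped pages."""
    query = parse_qs(urlparse(url).query)
    # The fake cursor encodes an offset; real cursors are opaque fullnames.
    start = int(query.get("after", ["t3_0"])[0].split("_")[1])
    end = min(start + int(query["limit"][0]), 250)
    children = [{"kind": "t3", "data": {"id": str(i)}} for i in range(start, end)]
    return {"kind": "Listing",
            "data": {"children": children, "after": f"t3_{end}" if end < 250 else None}}


posts = collect_posts(fake_fetch, "Entrepreneur", max_posts=150)
print(len(posts))
```

Passing the fetcher in as an argument keeps the cursor logic testable separately from proxy selection, User-Agent rotation, and the random delay.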

Honest limitations

Comment cap is depth-first, not breadth-first. extractComments recurses into the first reply chain before walking siblings. With maxCommentsPerPost: 20 and a deep first thread, you may get 20 nested replies from a single root comment and zero from the other roots. To sample more roots, raise the cap.
Comments are fetched with sort=best only. No knob to switch to top/new/controversial for comment ordering — only the post listing sort is configurable.
sortBy: "hot" for search queries silently becomes "relevance". Reddit's /search.json does not honor hot, so the actor rewrites it. If you set top for search, top is preserved.
maxRequestsPerCrawl: 500 is a request budget, not a post budget. One listing request returns up to 100 posts, and each post needs 1 further request for its comments, so 100 posts with comments = 1 + 100 = 101 requests. Multi-source runs share this 500-request pool, so budget accordingly.
maxPostsPerSource is per source. 5 subreddits × 50 = up to 250 posts per run.
Apify Free plan = datacenter proxy only. Some subreddits return 403 from datacenter IPs. The actor falls back gracefully but yield drops. For reliable runs, use a paid Apify plan with residential access, or run locally.
old.reddit.com is a long-lived but not-officially-supported subdomain. Reddit could deprecate it; if that happens, this actor would need a host swap to www.reddit.com and likely OAuth.
No JSON Listing schema validation. Malformed responses (HTML 502 instead of JSON, anti-bot challenge) are caught and skipped silently with a log.warning. A run with all warnings looks like a successful zero-record run.
thumbnail is normalized to null when Reddit returns sentinel values "self" or "default".
selfText, linkUrl, flair, domain can be null for posts without those fields.
Reddit rate limits are unpublished. The 2–5s delay + concurrency=1 makes 429s rare in practice but not impossible; a persistent 429 ends in a silent zero-record result after the 3 internal retries.
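The depth-first comment cap from the first limitation is easier to see on a toy tree. The sketch below is an illustration of that behavior, not the actor's `extractComments`: the tree uses a simplified `id` / `replies` shape rather than Reddit's real comment JSON, and `roots_covered` is a hypothetical helper.

```python
def extract_comments(nodes, cap, depth=0, out=None):
    """Depth-first walk with a global cap shared across the whole tree."""
    out = [] if out is None else out
    for node in nodes:
        if len(out) >= cap:
            break
        out.append({"id": node["id"], "depth": depth})
        # Recurse into this comment's replies before visiting its siblings.
        extract_comments(node.get("replies", []), cap, depth + 1, out)
    return out


def roots_covered(comments):
    """Count how many top-level comments made it into the sample."""
    return sum(1 for c in comments if c["depth"] == 0)


# Three root comments; the first one has a three-level reply chain.
tree = [
    {"id": "A", "replies": [
        {"id": "A1", "replies": [
            {"id": "A2", "replies": [{"id": "A3", "replies": []}]}]}]},
    {"id": "B", "replies": []},
    {"id": "C", "replies": []},
]

small = extract_comments(tree, cap=3)
large = extract_comments(tree, cap=10)
print([c["id"] for c in small], roots_covered(small))
print([c["id"] for c in large], roots_covered(large))
```

With the small cap the whole budget is spent inside the first reply chain, so roots B and C never appear. Raising the cap is what brings them in.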

Quick start

Click Try for free above.
Add subreddit names (the r/ prefix is optional) to subreddits, or search terms to searchQueries.
Set maxPostsPerSource (default 50, max 500).
Run. Download JSON or CSV from Storage → Dataset.

Programmatic example:

from apify_client import ApifyClient

# Authenticate with your Apify API token.
client = ApifyClient("YOUR_APIFY_TOKEN")

# Start the actor and block until the run finishes.
run = client.actor("knotless_cadence/reddit-discussion-scraper").call(
    run_input={
        "subreddits": ["startups", "SaaS", "Entrepreneur"],
        "maxPostsPerSource": 50,  # per subreddit, so up to 150 posts in total
        "sortBy": "top",
        "timeFilter": "month",
        "includeComments": True,
        "maxCommentsPerPost": 30,
    }
)

# Stream the scraped posts from the run's default dataset.
for post in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(post["score"], post["title"])
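Because a run that only hit warnings finishes like a successful zero-record run, it can help to check which sources actually produced items after downloading the dataset. The sketch below is one way to do that. `missing_sources` is a hypothetical helper, and it assumes the `source` field uses the `r/<subreddit>` form shown in the sample record; the label used for search queries is not documented here, so only subreddits are checked.

```python
from collections import Counter


def missing_sources(items, expected_sources):
    """Return the expected sources that produced no items at all."""
    seen = Counter(item.get("source") for item in items)
    return [source for source in expected_sources if seen[source] == 0]


# In a real script, `items` would be the list of records read from the
# run's dataset; a small hand-made list keeps this example self-contained.
items = [
    {"source": "r/SaaS", "title": "Post 1"},
    {"source": "r/SaaS", "title": "Post 2"},
    {"source": "r/Entrepreneur", "title": "Post 3"},
]
expected = ["r/startups", "r/SaaS", "r/Entrepreneur"]

empty = missing_sources(items, expected)
if empty:
    print("No items from:", ", ".join(empty))
```

A source that comes back empty is worth checking against the run log for warnings before concluding that the subreddit had no matching posts.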

FAQ

Why the JSON API instead of HTML scraping? HTML scrapers break every time Reddit updates its design. The .json endpoints return structured data in a format that has been stable for years, the same shape used by Reddit's own apps and third-party clients.

Can I scrape private subreddits? No. Only publicly accessible subreddits and public Reddit search results.

Does it need Reddit credentials? No. All requests go to public endpoints.

How many posts per run? Up to maxPostsPerSource per subreddit / per search query, capped at 500 by input_schema. The whole-run total is at most len(sources) × maxPostsPerSource, further bounded by the 500-request crawler budget (with comments enabled, every post also consumes one request from that budget).
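The request arithmetic can be estimated before a run. The sketch below assumes what the limitations section states: listing pages hold up to 100 posts, each post with comments costs one extra request, and the crawler budget is 500. `estimate_requests` and `fits_budget` are hypothetical helpers. The estimate is an upper bound because it assumes every source fills its cap, and it does not account for retries.

```python
import math

REQUEST_BUDGET = 500  # maxRequestsPerCrawl, shared by listing and comment requests
PAGE_SIZE = 100       # posts per listing page


def estimate_requests(num_sources, max_posts_per_source, include_comments=True):
    """Upper-bound request count for a run, assuming every source fills its cap."""
    listing_requests = num_sources * math.ceil(max_posts_per_source / PAGE_SIZE)
    comment_requests = num_sources * max_posts_per_source if include_comments else 0
    return listing_requests + comment_requests


def fits_budget(num_sources, max_posts_per_source, include_comments=True):
    """True if the estimated request count stays within the crawler budget."""
    return estimate_requests(num_sources, max_posts_per_source,
                             include_comments) <= REQUEST_BUDGET


print(estimate_requests(1, 100))  # 1 listing page + 100 comment requests
print(estimate_requests(5, 50))   # 5 listing pages + 250 comment requests
print(fits_budget(3, 500))        # far more requests than the budget allows
```

When the estimate exceeds the budget, the options are to lower maxPostsPerSource, split the sources across several runs, or disable includeComments.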

Proof of delivery: 31 published Apify actors (78 total in portfolio). The flagship Trustpilot scraper has 951 lifetime production runs; this Reddit scraper has 82+ runs. One paid 3-article series shipped in March 2026 ($150, proxy industry). Pilot pricing locked through May 2026.

Sample request? Email "sample" to spinov001@gmail.com and we'll send 2 published case-study articles within 24 hours.

Need a custom Reddit variant?

Common asks delivered for paying clients:

Sentiment dashboard — daily sentiment scoring on a list of subreddits, fed into Looker/Metabase
Keyword alerts — webhook fires the moment a brand/term appears in target subreddits
Competitor tracking — pull all mentions of competitor names, summarize weekly
Comment-thread expansion — recursive comment graph for any post, exportable as edge list
Cross-source merge — Reddit + HackerNews + Bluesky into one normalized feed

Tier	Price	Includes
Pilot	$97	1 custom actor or modification, 7-day support
Standard	$297	Custom actor + Slack/email alerts on results, 30-day support
Premium	$797	Custom actor + dashboard + 90-day support + 1 modification round

Email: spinov001@gmail.com
Blog (case studies + writeups): https://blog.spinov.online
Telegram channel (scraping & data engineering tips): https://t.me/scraping_ai

Pilot pricing while we grow our public portfolio. Most pilots delivered inside 48–72 hours.

Honest disclosure

Public Reddit data only. We do not scrape private user data, accounts, deleted content from caches, or anything behind a login.
Independent project — not affiliated with Reddit, Inc.
This actor is maintained by the same author who runs apify.com/knotless_cadence (78 actors, 31 public).
