Reddit API Scraper
Pricing
$19.99/month + usage
Reddit API Scraper collects data from Reddit posts, comments, and subreddits using API-based extraction. Gather post titles, text, usernames, scores, timestamps, and engagement metrics to analyze trends, monitor discussions, or build datasets for research, marketing, and insights. 📊💬
Rating: 0.0 (0)
Developer: SimpleAPI
Actor stats: 0 bookmarked, 2 total users, 1 monthly active user
Last modified: a day ago
Reddit API Scraper is an Apify actor that extracts data from Reddit by keyword search. It uses Reddit’s public search API and returns posts in a structured format. No login is required. You can use it as a Reddit scraper or as an alternative to the Reddit API for keyword-based search.
Why Choose Us?
- No proxy by default – Sends requests directly to Reddit; uses proxy only when blocked.
- Automatic proxy fallback – If Reddit blocks the request, the actor falls back to datacenter proxy, then to residential proxy (with retries), and sticks with residential for the rest of the run.
- Bulk keywords – Search multiple keywords in one run.
- Same output shape – Output is a single JSON object: keys = keywords, values = arrays of posts (same structure as the reference output.json).
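The fallback order described above can be sketched as a small state machine. This is an illustrative sketch only, not the actor's actual code: `ProxyFallback`, `BlockedError`, the `do_request` callable, and the residential retry count of 3 are all hypothetical names and values chosen for the example.

```python
from typing import Callable

# Tiers in the order the list above describes: direct, then datacenter, then residential.
PROXY_TIERS = ["none", "datacenter", "residential"]


class BlockedError(Exception):
    """Hypothetical signal that Reddit blocked the request (e.g. HTTP 403)."""


class ProxyFallback:
    """Tracks the current proxy tier for a run and only ever escalates it."""

    def __init__(self, residential_retries: int = 3) -> None:
        self.tier_index = 0  # start without a proxy
        self.residential_retries = residential_retries  # assumed value

    @property
    def tier(self) -> str:
        return PROXY_TIERS[self.tier_index]

    def fetch(self, url: str, do_request: Callable[[str, str], dict]) -> dict:
        """Try the current tier; on a block, move to the next tier and try again."""
        while True:
            # Only the residential tier gets repeated attempts in this sketch.
            attempts = self.residential_retries if self.tier == "residential" else 1
            for _ in range(attempts):
                try:
                    return do_request(url, self.tier)
                except BlockedError:
                    continue
            if self.tier_index == len(PROXY_TIERS) - 1:
                raise BlockedError(f"blocked on every tier for {url}")
            # Escalate and never reset, so later requests stay on the higher tier.
            self.tier_index += 1
```

Because `tier_index` is never lowered, a run that reaches the residential tier keeps using it for every later request, which matches the "sticks with residential" behaviour described above.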
Key Features
| Feature | Description |
|---|---|
| Search by keyword | One or more search terms (bulk input). |
| Multiple strategies | Uses several sort strategies (new, relevance, hot, top, etc.) to maximize results. |
| Rate limiting | Delays and semaphores to reduce blocking. |
| Retries | Up to 3 retries with exponential backoff; special handling for 403. |
| Proxy fallback | No proxy → datacenter → residential, with clear logging. |
| Structured output | Each post includes metaData.keyword, id, subreddit, title, author, permalink, url, selftext, and other Reddit fields. |
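As a rough illustration of the "Retries" row, the helper below retries a call up to 3 times with exponentially growing delays. It is a minimal sketch under assumptions: `retry_with_backoff`, the 1-second base delay, and the doubling factor are hypothetical, and the actor's special handling of 403 responses is not modelled here.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    action: Callable[[], T],
    max_retries: int = 3,       # "up to 3 retries" from the table above
    base_delay: float = 1.0,    # assumed starting delay in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run action; after each failure wait base_delay * 2**attempt, then retry."""
    for attempt in range(max_retries + 1):  # one initial try plus max_retries retries
        try:
            return action()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            sleep(base_delay * (2 ** attempt))
    raise RuntimeError("unreachable")
```

The `sleep` parameter is injectable so the delay schedule can be inspected in tests without actually waiting.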
Input
Configure the actor with these inputs (Form or JSON in Apify Console).
| Field | Type | Required | Description |
|---|---|---|---|
| Search keywords | array (stringList) | Yes | Keywords to search on Reddit (e.g. webscraping, python). Supports bulk edit. |
| Subreddit names | array (stringList) | No | Optional subreddits to limit search. |
| Results limit per keyword and subreddit | integer | No | Max posts per keyword (default: 5, max: 1000). |
| Sorting | string | No | Sort order: new, hot, top, relevance (default: new). |
| Proxy Configuration | object (proxy) | No | By default no proxy. Enable Apify Proxy if you want to force proxy from the start. Fallback (datacenter → residential) runs when Reddit blocks. |
Example input (JSON)
{
  "searchKeywords": ["webscraping", "python"],
  "subredditNames": [],
  "resultsLimitPerKeyword": 5,
  "sorting": "new",
  "proxyConfiguration": { "useApifyProxy": false }
}
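If you build the input programmatically, a small helper can apply the limits from the input table before the run starts. The sketch below is an assumption-laden example: `build_run_input` is a hypothetical helper, and the lower bound of 1 for the results limit is assumed (the table only states the default of 5 and the maximum of 1000).

```python
from typing import Any, Dict, List, Optional

# Sort orders listed in the input table.
ALLOWED_SORTING = {"new", "hot", "top", "relevance"}


def build_run_input(
    keywords: List[str],
    subreddits: Optional[List[str]] = None,
    limit: int = 5,
    sorting: str = "new",
    use_apify_proxy: bool = False,
) -> Dict[str, Any]:
    """Return an input object in the same shape as the example JSON above."""
    cleaned = [k.strip() for k in keywords if k.strip()]
    if not cleaned:
        raise ValueError("at least one search keyword is required")
    if not 1 <= limit <= 1000:  # lower bound of 1 is an assumption
        raise ValueError("limit must be between 1 and 1000")
    if sorting not in ALLOWED_SORTING:
        raise ValueError(f"sorting must be one of {sorted(ALLOWED_SORTING)}")
    return {
        "searchKeywords": cleaned,
        "subredditNames": list(subreddits or []),
        "resultsLimitPerKeyword": limit,
        "sorting": sorting,
        "proxyConfiguration": {"useApifyProxy": use_apify_proxy},
    }
```

The resulting dictionary can be serialized with `json.dumps` and pasted into the JSON input tab.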
Output
The dataset contains one item: a JSON object where each key is a keyword and each value is an array of post objects. It has the same structure as the reference output.json.
Example output structure
{
  "webscraping": [
    {
      "metaData": { "keyword": "webscraping" },
      "id": "abc123",
      "subreddit": "Python",
      "selftext": "...",
      "author_fullname": "t2_xxx",
      "title": "Post title",
      "subreddit_name_prefixed": "r/Python",
      "name": "t3_abc123",
      "link_flair_text_color": "dark",
      "subreddit_type": "public",
      "thumbnail": "self",
      "link_flair_type": "text",
      "author_flair_type": "text",
      "domain": "self.Python",
      "selftext_html": "...",
      "subreddit_id": "t5_xxx",
      "author": "username",
      "permalink": "/r/Python/comments/...",
      "url": "https://www.reddit.com/..."
    }
  ],
  "python": [ ... ]
}
| Field | Description |
|---|---|
| metaData.keyword | Search keyword for this post. |
| id | Reddit post ID. |
| subreddit | Subreddit name. |
| title | Post title. |
| author | Author username. |
| permalink | Relative link to the post. |
| url | Full URL. |
| selftext | Post body text. |
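Since the output is a keyword → posts mapping, it often helps to flatten it into one row per post before analysis. The following is a minimal sketch: `flatten_posts` is a hypothetical helper, and prefixing the relative `permalink` with `https://www.reddit.com` is an assumption about how you may want to build a clickable link.

```python
from typing import Any, Dict, List

REDDIT_BASE = "https://www.reddit.com"  # assumed prefix for relative permalinks


def flatten_posts(item: Dict[str, List[Dict[str, Any]]]) -> List[Dict[str, Any]]:
    """Turn {keyword: [post, ...]} into a flat list with the fields from the table."""
    rows: List[Dict[str, Any]] = []
    for keyword, posts in item.items():
        for post in posts:
            permalink = post.get("permalink")
            rows.append({
                # Prefer the keyword recorded on the post, fall back to the dict key.
                "keyword": post.get("metaData", {}).get("keyword", keyword),
                "id": post.get("id"),
                "subreddit": post.get("subreddit"),
                "title": post.get("title"),
                "author": post.get("author"),
                "link": (REDDIT_BASE + permalink) if permalink else None,
            })
    return rows
```

Each row keeps only the fields documented in the table; other Reddit fields remain available on the original post objects.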
How to Use the Actor (via Apify Console)
- Log in at https://console.apify.com and go to Actors.
- Find Reddit API Scraper (or reddit-api-scraper) and open it.
- Open the Input tab (Form or JSON).
- Enter Search keywords (e.g. webscraping; add more with + Add or Bulk edit).
- Optionally set Results limit per keyword, Sorting, and Proxy Configuration.
- Click Start.
- Watch Log for progress and proxy fallback messages.
- Open the Output tab to see the dataset (one item = object of keywords → posts).
- Export to JSON or use via API.
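For the "use via API" step, the dataset of a finished run can be read through the Apify API's dataset items endpoint. The sketch below uses only the standard library; the helper names are hypothetical, and the dataset ID and API token are placeholders you would copy from your own run and account.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id: str, fmt: str = "json") -> str:
    """Build the URL of the dataset items endpoint for one dataset."""
    query = urllib.parse.urlencode({"format": fmt})
    safe_id = urllib.parse.quote(dataset_id, safe="")
    return f"{API_BASE}/datasets/{safe_id}/items?{query}"


def fetch_dataset_items(dataset_id: str, token: str) -> list:
    """Download the dataset items as parsed JSON (performs a network request)."""
    request = urllib.request.Request(
        dataset_items_url(dataset_id),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

For this actor the returned list should contain a single item, the keyword → posts object described in the Output section.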
Best Use Cases
- Monitoring Reddit for keywords (brand, product, topic).
- Research or sentiment analysis on public discussions.
- Building datasets of Reddit posts by topic.
- Alternative to Reddit API for simple search-based scraping.
Frequently Asked Questions
Do I need a Reddit API key?
No. The actor uses Reddit’s public search endpoint; no authentication is required.
Why did it switch to proxy?
If you see “Falling back to datacenter/residential proxy” in the log, Reddit returned 403 (block). The actor then uses Apify proxies and continues; once it switches to residential, it stays on residential for the rest of the run.
Can I scrape private subreddits?
No. Only publicly available content is accessible.
Support and Feedback
Use the Apify actor’s Issues or Reviews for bugs and feature requests.
Cautions
- Data is collected only from publicly available Reddit content.
- No private accounts or password-protected content are accessed.
- You are responsible for compliance with applicable laws (e.g. privacy, data protection, spam).