Reddit Trends Scraper

Pricing

from $3.99 / 1,000 results

Rating: 0.0 (0)

Developer: ScraperX

Maintained by Community

Actor stats: 0 bookmarked, 2 total users, 1 monthly active user

Last modified: 4 days ago

An Apify Actor for scraping Reddit posts and trends, with automatic proxy fallback. Comment scraping is planned but not implemented yet.

Features

  • Flexible Input: Supports Reddit URLs, usernames, subreddits, or keywords
  • Bulk Processing: Process multiple URLs/keywords in a single run
  • Smart Proxy Fallback: Automatically falls back through proxy types:
    1. No proxy (direct connection)
    2. Datacenter proxy (if blocked)
    3. Residential proxy with 3 retries (if datacenter fails)
  • Detailed Logging: Comprehensive logs to track scraping progress
  • Pagination Support: Follows pagination automatically to collect posts across multiple pages
  • Sort Options: Support for hot, new, top, and rising sort orders

Input Configuration

  • startUrls: Array of Reddit URLs, usernames (e.g., username or u/username), subreddits (e.g., r/popular), or keywords
  • sortOrder: Sort order for posts (hot, new, top, rising) - default: hot
  • maxPosts: Maximum number of posts to scrape - default: 500
  • maxComments: Maximum number of comments per post - default: 0 (not implemented yet)
  • proxyConfiguration: Apify proxy configuration (optional)
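
The options above can be sketched as a plain input object. This is a minimal illustration, not the actor's own code: build_input is a hypothetical helper, and passing startUrls as a list of plain strings is an assumption about the input shape. The defaults are the ones listed above.

```python
import json

# Defaults as documented in the Input Configuration list.
DEFAULTS = {"sortOrder": "hot", "maxPosts": 500, "maxComments": 0}
SORT_ORDERS = {"hot", "new", "top", "rising"}


def build_input(start_urls, **overrides):
    """Hypothetical helper: merge overrides into the documented defaults."""
    # Assumption: startUrls is a list of plain strings (URLs, u/name, r/name, keywords).
    run_input = {"startUrls": list(start_urls), **DEFAULTS, **overrides}
    if not run_input["startUrls"]:
        raise ValueError("startUrls must contain at least one entry")
    if run_input["sortOrder"] not in SORT_ORDERS:
        raise ValueError(f"unsupported sortOrder: {run_input['sortOrder']!r}")
    return run_input


if __name__ == "__main__":
    example = build_input(["r/popular", "u/username"], sortOrder="top", maxPosts=100)
    print(json.dumps(example, indent=2))
```

The printed JSON can be pasted into the actor's input editor, adjusted to whatever shape the actor's input schema actually requires.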

Output

The actor outputs structured data to the Apify dataset with the following fields:

  • title: Post title
  • postUrl: Full URL to the post
  • upvotes: Number of upvotes
  • comments: Number of comments
  • subreddit: Subreddit name (e.g., r/popular)
  • subredditUrl: URL to the subreddit
  • subredditType: Type of subreddit (usually link)
  • author: Post author username
  • authorProfile: URL to author profile
  • postTime: Post timestamp in YYYY-MM-DD HH:MM:SS format
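
To make the record shape concrete, here is a small sketch of one dataset item and of the postTime format. All values are placeholders rather than real scraped data. The helper format_post_time is hypothetical, and treating timestamps as UTC is an assumption; the README does not state a timezone.

```python
from datetime import datetime, timezone

# Field names as listed in the Output section.
FIELDS = (
    "title", "postUrl", "upvotes", "comments", "subreddit",
    "subredditUrl", "subredditType", "author", "authorProfile", "postTime",
)


def format_post_time(epoch_seconds: float) -> str:
    """Render a Unix timestamp as YYYY-MM-DD HH:MM:SS (UTC is an assumption)."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


# Placeholder record for illustration only.
example_record = {
    "title": "Example post title",
    "postUrl": "https://www.reddit.com/r/popular/comments/abc123/example_post/",
    "upvotes": 1234,
    "comments": 56,
    "subreddit": "r/popular",
    "subredditUrl": "https://www.reddit.com/r/popular/",
    "subredditType": "link",
    "author": "username",
    "authorProfile": "https://www.reddit.com/user/username",
    "postTime": format_post_time(0),
}

if __name__ == "__main__":
    for field in FIELDS:
        print(f"{field}: {example_record[field]}")
```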

Proxy Fallback Logic

The actor implements intelligent proxy fallback:

  1. No Proxy: Starts with direct connection
  2. Datacenter Proxy: If blocked, automatically switches to datacenter proxy
  3. Residential Proxy: If datacenter fails, switches to residential proxy with 3 retries
  4. Sticky Proxy: Once a proxy type works, the actor keeps using it for all subsequent requests

All proxy events are logged clearly for monitoring.
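
The fallback steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the actor's actual implementation: ProxyFallback, Blocked, and the fetch callable are hypothetical names, "3 retries" is read as three attempts in total, and block detection is left to the caller-supplied fetch function.

```python
import logging
from typing import Callable

logger = logging.getLogger("proxy_fallback")

# (tier name, attempts) in the order described above; "none" is a direct connection.
# Assumption: "3 retries" means three attempts on the residential tier.
TIERS = [("none", 1), ("datacenter", 1), ("residential", 3)]


class Blocked(Exception):
    """Hypothetical error raised by fetch() when a request is blocked."""


class ProxyFallback:
    def __init__(self, fetch: Callable[[str, str], str]):
        self.fetch = fetch    # fetch(url, tier_name) -> response body
        self.tier_index = 0   # sticky: later requests start at the last working tier

    def get(self, url: str) -> str:
        for index in range(self.tier_index, len(TIERS)):
            name, attempts = TIERS[index]
            for attempt in range(1, attempts + 1):
                try:
                    body = self.fetch(url, name)
                except Blocked:
                    logger.warning("%s blocked (attempt %d/%d)", name, attempt, attempts)
                    continue
                if index != self.tier_index:
                    logger.info("switching to %s for subsequent requests", name)
                self.tier_index = index  # remember the tier that worked
                return body
        raise Blocked(f"all proxy tiers exhausted for {url}")
```

Passing the fetch function in keeps the escalation logic independent of the HTTP client, so it can be exercised with a fake fetcher before wiring in real proxy URLs.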

Usage

  1. Configure the input on the Apify platform
  2. Add your Reddit URLs, usernames, or keywords
  3. Set sort order and maximum posts
  4. Optionally configure proxy settings
  5. Run the actor

Development

# Install dependencies
pip install -r requirements.txt
# Run locally
python -m src

Notes

  • The actor scrapes old.reddit.com but rewrites the URLs in its output to use reddit.com
  • Rate limiting is implemented to be respectful to Reddit's servers
  • All errors are logged and the actor continues processing remaining URLs
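
As a rough sketch of how the three notes above might fit together: to_public_url and process_all are hypothetical helpers, the www.reddit.com host and the one-second delay are assumptions, and the actor's real rate-limiting strategy is not documented here.

```python
import logging
import time
from urllib.parse import urlsplit, urlunsplit

logger = logging.getLogger("reddit_scraper")


def to_public_url(url: str, public_host: str = "www.reddit.com") -> str:
    """Swap the old.reddit.com host for the public one; other URLs pass through."""
    parts = urlsplit(url)
    if parts.netloc.lower() == "old.reddit.com":
        parts = parts._replace(netloc=public_host)
    return urlunsplit(parts)


def process_all(targets, scrape_one, delay_seconds=1.0, sleep=time.sleep):
    """Scrape each target in turn, pausing between requests and surviving failures."""
    results, failures = [], []
    for position, target in enumerate(targets):
        if position:
            sleep(delay_seconds)  # fixed delay between targets (assumed policy)
        try:
            results.extend(scrape_one(target))
        except Exception as exc:  # log the error and continue with the next target
            logger.error("failed to scrape %s: %s", target, exc)
            failures.append((target, exc))
    return results, failures
```

Here scrape_one stands in for whatever function fetches and parses a single target; it is expected to return a list of records.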