Reddit Post Scraper

Extract detailed information from Reddit posts including title, content, media, and hierarchical comment threads. Supports various comment sorting methods and customizable limits.

📄 Reddit Post Scraper

🆓 100% FREE | 💬 Full Comment Threads | 🔒 No API Key Required

A detailed scraper for collecting information about specific Reddit posts and their comments with support for hierarchical reply structure.


✨ Why Choose This Scraper?

  • ✅ Completely Free - No hidden costs or subscriptions
  • ✅ No Reddit API Key Needed - Works out of the box
  • ✅ Full Comment Trees - Preserves reply hierarchy
  • ✅ Residential Proxies Included - Avoid rate limits
  • ✅ Rich Media Support - Images, videos, and links

🚀 Key Features

  • Complete Post Information: Title, author, text, media, statistics
  • Comment Hierarchy: Preserves reply structure and nested comments
  • Flexible Sorting: Various comment sorting methods
  • Limits: Control the number of comments to collect
  • Metadata: Upvotes, timestamps, authors

📋 Input Parameters

{
  "startUrls": [
    "https://www.reddit.com/r/Python/comments/15t8eq0/guido_van_rossum_on_why_python_uses_0based/"
  ],
  "maxComments": 100,
  "skipComments": false,
  "sort": "confidence",
  "includeNSFW": true,
  "debugMode": false
}

Parameters:

  • startUrls (array, required): List of post URLs to scrape
  • maxComments (integer): Maximum number of comments to collect
  • skipComments (boolean): Skip comment collection (post only)
  • sort (string): Comment sort order; one of confidence, new, top, old, qa
  • includeNSFW (boolean): Process NSFW content
  • debugMode (boolean): Detailed logging
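
The parameter list above can be turned into a small input builder. This is a minimal sketch, not part of the scraper itself: the function name `build_input` is hypothetical, and the default values simply mirror the sample input shown earlier, since the source does not state the scraper's actual defaults.

```python
from typing import Any, Dict, List

# Sort values listed in the parameter reference above.
ALLOWED_SORTS = {"confidence", "new", "top", "old", "qa"}


def build_input(
    start_urls: List[str],
    max_comments: int = 100,      # assumption: mirrors the sample input
    skip_comments: bool = False,  # assumption: mirrors the sample input
    sort: str = "confidence",     # assumption: mirrors the sample input
    include_nsfw: bool = True,    # assumption: mirrors the sample input
    debug_mode: bool = False,     # assumption: mirrors the sample input
) -> Dict[str, Any]:
    """Return a run-input dict, rejecting obviously invalid values early."""
    if not start_urls:
        raise ValueError("startUrls is required and must not be empty")
    if sort not in ALLOWED_SORTS:
        raise ValueError(f"sort must be one of {sorted(ALLOWED_SORTS)}")
    if max_comments < 0:
        raise ValueError("maxComments must not be negative")
    return {
        "startUrls": list(start_urls),
        "maxComments": max_comments,
        "skipComments": skip_comments,
        "sort": sort,
        "includeNSFW": include_nsfw,
        "debugMode": debug_mode,
    }
```

Validating locally catches a mistyped sort value before a run is started.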

📤 Output Data

Post Structure:

{
  "dataType": "post",
  "id": "t3_15t8eq0",
  "url": "https://www.reddit.com/r/Python/comments/...",
  "title": "Post title",
  "subreddit": "Python",
  "author": "username",
  "body": "Post text content",
  "body_html": "<div>HTML content</div>",
  "media": {
    "type": "image",
    "url": "https://..."
  },
  "stats": {
    "upvotes": 1234,
    "upvote_ratio": 0.95,
    "comments_total": 56,
    "comments_scraped": 50
  },
  "created_utc": "1234567890",
  "scrapedAt": "2025-01-20T10:30:00.000Z",
  "comments": [...]
}

Comment Structure:

{
  "dataType": "comment",
  "id": "t1_abc123",
  "author": "username",
  "body": "Comment text",
  "upvotes": "42",
  "created_utc": "1234567890",
  "scrapedAt": "2025-01-20T10:30:00.000Z",
  "replies": [...]
}
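
Because replies nest inside each comment, flat analysis usually starts with a tree walk. The sketch below assumes each entry in `replies` has the same shape as a top-level comment, and that a leaf may carry an empty list, a null, or no `replies` key at all; the function name `walk_comments` is hypothetical.

```python
from typing import Any, Dict, Iterable, Iterator, Tuple


def walk_comments(
    comments: Iterable[Dict[str, Any]], depth: int = 0
) -> Iterator[Tuple[int, Dict[str, Any]]]:
    """Yield (depth, comment) pairs in depth-first, top-to-bottom order."""
    for comment in comments:
        yield depth, comment
        # Treat a missing or null "replies" field as "no replies".
        yield from walk_comments(comment.get("replies") or [], depth + 1)


# Illustrative data only; the ids and bodies are made up.
sample = [
    {"id": "t1_a", "body": "Top-level", "replies": [
        {"id": "t1_b", "body": "First reply", "replies": []},
    ]},
]

for depth, comment in walk_comments(sample):
    print("  " * depth + comment["body"])
```

The depth value makes it easy to indent output or to filter, for example keeping only top-level comments with `depth == 0`.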

💡 Usage Examples

Collect post with top 100 comments

{
  "startUrls": ["https://www.reddit.com/r/AskReddit/comments/xyz/..."],
  "maxComments": 100,
  "sort": "top"
}

Post information only (no comments)

{
  "startUrls": ["https://www.reddit.com/r/news/comments/..."],
  "skipComments": true
}

Newest comments, capped at 50

{
  "startUrls": ["https://www.reddit.com/r/technology/comments/..."],
  "maxComments": 50,
  "sort": "new"
}
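
Any of the inputs above can also be submitted from a script. The following is a sketch under assumptions: it relies on the `apify-client` Python package (its `actor(...).call(run_input=...)` and `dataset(...).iterate_items()` methods), the helper name `run_scraper` is hypothetical, and the actor ID and token are placeholders you would need to supply yourself.

```python
from typing import Any, Dict, List


def run_scraper(
    client: Any, actor_id: str, run_input: Dict[str, Any]
) -> List[Dict[str, Any]]:
    """Start a run, wait for it to finish, and return the dataset items."""
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("The run did not return any run information")
    # Each finished run exposes the ID of its default dataset.
    dataset = client.dataset(run["defaultDatasetId"])
    return list(dataset.iterate_items())


# Usage (needs `pip install apify-client` and network access):
#
#   from apify_client import ApifyClient
#   client = ApifyClient("<YOUR_APIFY_TOKEN>")   # placeholder token
#   items = run_scraper(
#       client,
#       "<ACTOR_ID>",                            # placeholder actor ID
#       {"startUrls": ["<POST_URL>"], "maxComments": 50, "sort": "new"},
#   )
```

Passing the client in as an argument keeps the helper easy to exercise with a stub when no network is available.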

⚙️ Technical Details

  • Memory: 256 MB (optimized for efficiency)
  • Proxy: Residential (automatic, included free)
  • Timeout: 3600 seconds
  • Source: old.reddit.com
  • Structure: Hierarchical comment tree

📝 Notes

  • Comments preserve full hierarchy (replies)
  • Support for various media types (images, videos, links)
  • Automatic handling of deleted comments
  • All timestamps are in UTC (in the sample output, created_utc is a Unix epoch value and scrapedAt is an ISO 8601 string)
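
The two timestamp fields in the sample output use different encodings, so a small conversion step helps before comparing them. This sketch assumes `created_utc` is a Unix epoch in seconds delivered as a string, and `scrapedAt` is an ISO 8601 string ending in `Z`, as the samples suggest; both function names are hypothetical.

```python
from datetime import datetime, timezone


def parse_created_utc(value: str) -> datetime:
    """Convert an epoch-seconds string such as "1234567890" to an aware datetime."""
    return datetime.fromtimestamp(int(float(value)), tz=timezone.utc)


def parse_scraped_at(value: str) -> datetime:
    """Convert an ISO 8601 string such as "2025-01-20T10:30:00.000Z"."""
    # Older Python versions reject a trailing "Z" in fromisoformat,
    # so it is rewritten as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


created = parse_created_utc("1234567890")
scraped = parse_scraped_at("2025-01-20T10:30:00.000Z")
print(created.isoformat())  # 2009-02-13T23:31:30+00:00
print(scraped - created)    # age of the post at scrape time
```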

🎯 Perfect For

  • πŸ’¬ Sentiment analysis and opinion mining
  • πŸ“Š Discussion thread analysis
  • πŸ”¬ Social research and studies
  • πŸ€– Training chatbots and AI models
  • πŸ“° Content moderation and monitoring

πŸ’¬ Questions or Issues? Feel free to reach out or check the documentation!