Reddit Post Media Downloader and Metadata Scraper⚡

Pricing

from $2.50 / 1,000 posts scraped

📥 Extract full metadata + media download URLs from any Reddit post — videos, images, galleries, external embeds, promoted ads & more. 🎬 Video MP4 with separate audio streams, 🖼️ high-res images, 📸 full carousel/gallery sets, 📢 promoted ad data, complete post details and comments

Developer

APIHarvest

Maintained by Community

Reddit Post Media Downloader and Metadata Scraper

The Reddit Post Media Downloader and Metadata Scraper is an Apify actor that extracts complete metadata, direct media download links, and optional comments from any Reddit post. Whether you need structured post data or download links for Reddit videos, images, or gallery content, it detects and handles every post type automatically.


What Is the Reddit Post Media Downloader and Metadata Scraper?

The Reddit Post Media Downloader and Metadata Scraper is a production-grade Apify actor that takes one or more individual Reddit post URLs and returns:

  • Complete post metadata — title, author, score, comments count, flair, timestamps, and 60+ data fields
  • Direct media download links — video (MP4), audio (separate DASH stream), images (full resolution + all sizes), gallery items
  • Optional post comments — toggle ON/OFF via a checkbox in the actor input
  • External embed data — YouTube, Vimeo, and other embedded media with provider info, thumbnails, and embed HTML
  • Post type auto-classification — every post is tagged as video, image, gallery, text, link, rich_video, or ad
  • Promoted post metadata — call-to-action, destination URL, ad-specific fields

The Reddit Post Media Downloader and Metadata Scraper processes each URL through several extraction channels in parallel, so a throttled or failing channel is less likely to leave gaps in the output.


Key Features of the Reddit Post Media Downloader and Metadata Scraper

  • All post types supported — video, image, gallery/carousel, text, external link, rich video, promoted/ad posts
  • Individual post URLs only — paste any Reddit post URL (subreddit or profile) and get structured data back
  • Comments toggle — include or exclude comments from output with a single checkbox
  • Video download data — fallback MP4, DASH manifest, HLS manifest, separate audio stream, all quality formats
  • Gallery extraction — every image/video/GIF in a carousel extracted as a separate media item
  • External embed support — YouTube, Vimeo, and other oembed providers with full metadata
  • Residential US proxy — automatic proxy rotation for reliable access
  • No subreddit/profile scraping — this actor processes individual post URLs only, not entire subreddits or profiles

What This Actor Does NOT Do

The Reddit Post Media Downloader and Metadata Scraper does not:

  • Scrape entire subreddits (e.g., all posts from r/python)
  • Scrape entire user profiles (e.g., all posts from u/username)
  • Search Reddit by keyword
  • Download the actual media files — it provides direct download URLs you can use

It only processes individual post URLs that you provide.


Supported Post Types

The Reddit Post Media Downloader and Metadata Scraper automatically detects and handles:

| Post Type | What You Get |
| --- | --- |
| Video Posts | Direct MP4 download URL, separate audio stream, DASH/HLS manifests, all quality formats, dimensions, duration |
| Image Posts | Full-resolution image URL, all resolution variants, dimensions |
| Gallery/Carousel Posts | Individual download link for every image/video/GIF, per-item dimensions, captions, media type |
| Text Posts | Full selftext (plain + HTML), complete post metadata |
| External Link Posts | External URL, oembed data (YouTube/Vimeo), embed HTML, thumbnail, provider info |
| Rich Video Posts | External video embed URL, provider name, iframe HTML, oembed metadata |
| Promoted/Ad Posts | Call-to-action text, destination URL, outbound link data, ad classification |
| Mixed Carousels | Each gallery item separately with correct media type (image/gif/video) |

How to Use the Reddit Post Media Downloader and Metadata Scraper

Input

Provide one or more Reddit post URLs:

https://www.reddit.com/r/subreddit/comments/postid/title/
https://www.reddit.com/user/username/comments/postid/title/
https://v.redd.it/videoid

Input Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| post_urls | Array | required | One or more Reddit post URLs to process |
| include_comments | Boolean | false | Toggle ON to include comments in output, OFF to exclude |
| max_results | Integer | 0 | Maximum items per bucket (posts, media, comments). 0 = unlimited |

Example Input

{
  "post_urls": [
    { "url": "https://www.reddit.com/r/Damnthatsinteresting/comments/1sa33fi/" },
    { "url": "https://www.reddit.com/r/Wellthatsucks/comments/1s9q97r/" },
    { "url": "https://www.reddit.com/r/AITAH/comments/1s9vzep/" },
    { "url": "https://www.reddit.com/user/DavidFromNeo/comments/1rj3crp/" }
  ],
  "include_comments": false,
  "max_results": 0
}
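
The input above can also be assembled programmatically. The following is a minimal sketch, not part of the actor: the helper name `build_run_input` and the URL patterns are assumptions derived from the three URL shapes listed under "Input", and the actor itself may accept more variants than these patterns do.

```python
import json
import re

# Patterns for the three URL shapes listed under "Input". These are an
# assumption for client-side checking; the actor may be more lenient.
POST_URL_PATTERNS = [
    re.compile(r"^https://(www\.)?reddit\.com/r/[^/]+/comments/[A-Za-z0-9]+(/.*)?$"),
    re.compile(r"^https://(www\.)?reddit\.com/user/[^/]+/comments/[A-Za-z0-9]+(/.*)?$"),
    re.compile(r"^https://v\.redd\.it/[A-Za-z0-9]+/?$"),
]


def build_run_input(urls, include_comments=False, max_results=0):
    """Return an input dict shaped like the Example Input (hypothetical helper)."""
    bad = [u for u in urls if not any(p.match(u) for p in POST_URL_PATTERNS)]
    if bad:
        raise ValueError(f"Not an individual Reddit post URL: {bad}")
    if max_results < 0:
        raise ValueError("max_results must be 0 (unlimited) or a positive integer")
    return {
        "post_urls": [{"url": u} for u in urls],
        "include_comments": include_comments,
        "max_results": max_results,
    }


run_input = build_run_input(
    [
        "https://www.reddit.com/r/Damnthatsinteresting/comments/1sa33fi/",
        "https://www.reddit.com/user/DavidFromNeo/comments/1rj3crp/",
    ]
)
print(json.dumps(run_input, indent=2))
```

Rejecting subreddit or profile listing URLs before a run starts keeps the input within the "individual post URLs only" scope described earlier.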

Output Structure

The Reddit Post Media Downloader and Metadata Scraper produces records tagged with _item_type for easy filtering.

Post Record (_item_type: "post")

Every URL produces a post record with 60+ metadata fields, including:

| Field | Type | Description |
| --- | --- | --- |
| post_id | string | Reddit post ID |
| title | string | Post title |
| author | string | Username |
| subreddit | string | Subreddit name |
| post_type | string | Auto-detected: video, image, gallery, text, link, rich_video, ad |
| url | string | Target URL |
| permalink | string | Full Reddit permalink |
| selftext | string | Post body text |
| score | integer | Net upvotes |
| upvote_ratio | float | Upvote ratio (0.0–1.0) |
| num_comments | integer | Comment count |
| created_at | string | ISO-8601 creation time |
| domain | string | Content domain |
| media_url | string | Media destination URL |
| is_video | boolean | Video post flag |
| is_gallery | boolean | Gallery post flag |
| over_18 | boolean | NSFW flag |
| flair | string | Post flair text |
| is_promoted | boolean | Promoted post flag |
| is_ad | boolean | Ad post flag |
| video_info | object | Video: fallback_url, dash_url, hls_url, width, height, duration, has_audio |
| preview_images | array | All resolution preview images |
| gallery_items | array | Gallery items: url, width, height, caption, media_type |
| oembed | object | External embed: provider_name, title, html, thumbnail_url |
| call_to_action | string | Ad CTA text |
| href_url | string | Ad destination URL |

Media Record (_item_type: "media")

The actor emits a separate media record for every extractable media element:

| Field | Type | Description |
| --- | --- | --- |
| download_url | string | Best quality direct download URL |
| audio_url | string | Separate audio stream (Reddit videos) |
| media_type | string | video, image, gif, gallery, rich_video |
| source_url | string | Original source URL |
| width | integer | Width in pixels |
| height | integer | Height in pixels |
| duration | float | Duration in seconds (video) |
| ext | string | File extension |
| thumbnail_url | string | Thumbnail URL |
| title | string | Media title |
| uploader | string | Post author |
| subreddit | string | Subreddit name |
| reddit_post_url | string | Source Reddit post |
| embed_url | string | External embed URL |
| provider_name | string | External provider (YouTube, Vimeo) |
| is_reddit_hosted | boolean | True = Reddit-hosted, False = external |
| all_formats | array | All available quality options |

Comment Record (_item_type: "comment") — Only When Toggle Is ON

| Field | Type | Description |
| --- | --- | --- |
| comment_id | string | Comment ID |
| author | string | Comment author |
| body | string | Comment text |
| score | integer | Comment score |
| created_at | string | ISO-8601 time |
| depth | integer | Nesting depth |
| is_submitter | boolean | True if comment by post author |
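
Because every record carries an `_item_type` tag, a downloaded dataset can be split into the three record kinds with a few lines of code. The sketch below is illustrative only: the helper name `split_by_item_type` and the sample records are hypothetical, and it assumes that `depth` 0 marks a top-level comment.

```python
from collections import defaultdict


def split_by_item_type(records):
    """Group dataset records into buckets keyed by their _item_type tag."""
    buckets = defaultdict(list)
    for record in records:
        buckets[record.get("_item_type", "unknown")].append(record)
    return dict(buckets)


# Hypothetical records, trimmed to a few of the documented fields.
records = [
    {"_item_type": "post", "post_id": "abc", "post_type": "video"},
    {"_item_type": "media", "media_type": "video", "is_reddit_hosted": True},
    {"_item_type": "comment", "comment_id": "c1", "depth": 0, "is_submitter": False},
    {"_item_type": "comment", "comment_id": "c2", "depth": 1, "is_submitter": True},
]

buckets = split_by_item_type(records)

# Assumption: depth 0 means a top-level comment.
top_level = [c for c in buckets.get("comment", []) if c["depth"] == 0]
print({kind: len(items) for kind, items in buckets.items()}, len(top_level))
```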

How the Comments Toggle Works

The Reddit Post Media Downloader and Metadata Scraper includes a 💬 Include Comments checkbox:

  • OFF (default) — No comments in output. Only posts and media records are returned.
  • ON — Comments are included as separate records with _item_type: "comment".

This lets you control output size. A popular post can have thousands of comments — toggle OFF to keep results focused on media data.


How max_results Works

The max_results filter limits the number of items returned per bucket (posts, media, comments). It does NOT specifically control comment count — it applies equally to all output types.

  • max_results: 0 — Return everything (default)
  • max_results: 10 — Return at most 10 posts, 10 media items, and 10 comments

If you don't need comments at all, use the include_comments toggle instead.
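
The per-bucket behaviour described above can be pictured with a small sketch. This is an illustration of the documented semantics, not the actor's implementation; the function name `cap_per_bucket` and the sample records are hypothetical.

```python
from collections import Counter


def cap_per_bucket(records, max_results=0):
    """Keep at most max_results records per _item_type; 0 means no limit."""
    if max_results == 0:
        return list(records)
    seen = Counter()
    kept = []
    for record in records:
        bucket = record["_item_type"]
        if seen[bucket] < max_results:
            seen[bucket] += 1
            kept.append(record)
    return kept


# Hypothetical run: 3 posts, 5 media items, 20 comments.
records = (
    [{"_item_type": "post", "n": i} for i in range(3)]
    + [{"_item_type": "media", "n": i} for i in range(5)]
    + [{"_item_type": "comment", "n": i} for i in range(20)]
)

capped = cap_per_bucket(records, max_results=4)
counts = Counter(r["_item_type"] for r in capped)
print(dict(counts))
```

With a limit of 4, the three posts all survive while media and comments are each cut to four, which shows why `max_results` is not a comment-specific control.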


Video Download Example

{
  "_item_type": "media",
  "media_type": "video",
  "download_url": "https://v.redd.it/93jz90kreosg1/CMAF_720.mp4?source=fallback",
  "audio_url": "https://v.redd.it/93jz90kreosg1/DASH_audio.mp4",
  "width": 1280,
  "height": 720,
  "duration": 123.0,
  "all_formats": [
    { "label": "fallback", "url": "https://v.redd.it/.../CMAF_720.mp4?source=fallback", "type": "mp4/video" },
    { "label": "dash", "url": "https://v.redd.it/.../DASHPlaylist.mpd", "type": "dash" },
    { "label": "hls", "url": "https://v.redd.it/.../HLSPlaylist.m3u8", "type": "hls" },
    { "label": "audio", "url": "https://v.redd.it/.../DASH_audio.mp4", "type": "mp4/audio" }
  ]
}
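
Since the video and audio arrive as separate URLs, they need to be combined into one file after the run. One common way is ffmpeg with stream copy. The sketch below only builds the command line and assumes ffmpeg is installed locally; the helper name `build_ffmpeg_command` is hypothetical and this step is not something the actor performs.

```python
import shlex


def build_ffmpeg_command(media, output_path):
    """Build an ffmpeg argument list that muxes video and audio without re-encoding."""
    args = ["ffmpeg", "-i", media["download_url"]]
    audio_url = media.get("audio_url")
    if audio_url:
        # Second input is the separate audio stream; map video from the
        # first input and audio from the second.
        args += ["-i", audio_url, "-map", "0:v:0", "-map", "1:a:0"]
    # "-c copy" copies the streams as they are instead of re-encoding them.
    args += ["-c", "copy", output_path]
    return args


# Media record fields taken from the Video Download Example.
media = {
    "download_url": "https://v.redd.it/93jz90kreosg1/CMAF_720.mp4?source=fallback",
    "audio_url": "https://v.redd.it/93jz90kreosg1/DASH_audio.mp4",
}

cmd = build_ffmpeg_command(media, "post.mp4")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Records without an `audio_url` fall back to copying the video stream alone.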

Gallery/Carousel Example

For a post with 5 images, the Reddit Post Media Downloader and Metadata Scraper returns 5 separate media records — one per gallery item:

{
  "_item_type": "media",
  "media_type": "image",
  "download_url": "https://i.redd.it/abc123.jpg",
  "width": 1920,
  "height": 1080,
  "ext": "jpeg",
  "title": "Gallery item caption"
}

For gallery and carousel posts, the Reddit Post Media Downloader creates a separate media record for every item in the carousel. This includes:

  • Multi-image carousels — each image gets its own download URL
  • Mixed media carousels — images and videos in the same post are extracted with the correct media type
  • Captions and outbound links per item

[
  {
    "media_id": "abc123",
    "title": "Gallery image 1",
    "media_type": "image",
    "download_url": "https://preview.redd.it/abc123.jpeg?...",
    "width": 4284,
    "height": 5712
  },
  {
    "media_id": "def456",
    "title": "Gallery video",
    "media_type": "video",
    "download_url": "https://preview.redd.it/def456.mp4?...",
    "width": 1920,
    "height": 1080
  }
]
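
When saving gallery items locally, it helps to derive stable filenames that preserve carousel order. The sketch below is one possible approach, not part of the actor: the helper name `plan_filenames` and the naming scheme are assumptions, and the file extension is read from the URL path so that any query string is ignored.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def plan_filenames(items, prefix="gallery"):
    """Map each gallery media record to a local filename, preserving order."""
    plan = []
    for index, item in enumerate(items, start=1):
        # Take the extension from the URL path, ignoring the query string.
        path = urlsplit(item["download_url"]).path
        suffix = PurePosixPath(path).suffix or ".bin"
        name = f"{prefix}_{index:02d}_{item.get('media_id', 'item')}{suffix}"
        plan.append((item["download_url"], name))
    return plan


# Items copied from the example above; the "?..." query is elided in the source.
items = [
    {"media_id": "abc123", "media_type": "image",
     "download_url": "https://preview.redd.it/abc123.jpeg?..."},
    {"media_id": "def456", "media_type": "video",
     "download_url": "https://preview.redd.it/def456.mp4?..."},
]

for url, name in plan_filenames(items):
    print(name)
```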

When the Reddit Post Media Downloader finds a post linking to YouTube, Vimeo, or other external sites, it extracts the full oembed metadata including:

  • The external video URL
  • Embed iframe HTML (for embedding on your own site)
  • Provider info (name, author, channel URL)
  • Thumbnail image URL and dimensions
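
If only the player URL is needed rather than the whole embed snippet, it can be pulled out of the `html` field of the `oembed` object. The sketch below uses Python's standard `html.parser`. The sample `oembed` value is hypothetical (`VIDEO_ID` is a placeholder), and the unescape step assumes the embed HTML may arrive entity-escaped, which is not confirmed by the actor's documentation.

```python
from html import unescape
from html.parser import HTMLParser


class IframeSrcParser(HTMLParser):
    """Collect the src attribute of every <iframe> in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def iframe_sources(oembed):
    """Return the iframe src URLs found in oembed["html"] (hypothetical helper)."""
    fragment = oembed.get("html") or ""
    # Assumption: the fragment may be entity-escaped ("&lt;iframe ...&gt;").
    if "&lt;iframe" in fragment:
        fragment = unescape(fragment)
    parser = IframeSrcParser()
    parser.feed(fragment)
    parser.close()
    return parser.sources


# Hypothetical oembed object using the documented field names.
oembed = {
    "provider_name": "YouTube",
    "title": "Example video",
    "html": '<iframe width="356" height="200" '
            'src="https://www.youtube.com/embed/VIDEO_ID" allowfullscreen></iframe>',
    "thumbnail_url": "https://i.ytimg.com/vi/VIDEO_ID/hqdefault.jpg",
}
print(iframe_sources(oembed))
```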

⚡ Performance & Reliability

The Reddit Post Media Downloader runs several extraction channels in parallel, which provides:

  • Maximum data completeness — if one channel is throttled, others compensate
  • Fast extraction — parallel processing means results in seconds, not minutes
  • Automatic retry — built-in retry logic with fresh request signatures
  • Anti-detection — randomized request patterns to avoid blocks
  • Proxy support — US residential proxy enabled automatically on Apify

📋 Output Metadata Fields

Every record pushed to the dataset includes these system fields:

| Field | Type | Description |
| --- | --- | --- |
| _item_type | string | Record type: "post", "media", "comment", "trophy", "subreddit_rule" |
| _scraper_type | string | The extraction mode used |
| _reddit_input | string | The original input URL |
| _scraped_at | string | ISO-8601 timestamp of when the data was collected |
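
Because `created_at` and `_scraped_at` are both ISO-8601 strings, the age of a post at collection time can be computed directly, which is useful when comparing scores across runs. This is a minimal sketch with hypothetical timestamps and helper names; it assumes values without an explicit offset are in UTC.

```python
from datetime import datetime, timezone


def parse_iso8601(value):
    """Parse an ISO-8601 string; treat a trailing 'Z' and naive values as UTC."""
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def age_at_scrape_hours(post):
    """Hours between post creation and the moment the record was collected."""
    delta = parse_iso8601(post["_scraped_at"]) - parse_iso8601(post["created_at"])
    return delta.total_seconds() / 3600


# Hypothetical post record with only the two timestamp fields.
post = {
    "created_at": "2024-01-01T00:00:00Z",
    "_scraped_at": "2024-01-01T06:30:00+00:00",
}
print(age_at_scrape_hours(post))
```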

💡 Use Cases for Reddit Post Media Downloader

  • Content archival — Save complete post data and media before deletion
  • Media collection — Download Reddit videos, images, galleries in bulk
  • Research & analytics — Collect structured post metadata for analysis
  • Social monitoring — Track post performance (score, comments, awards)
  • Content curation — Filter and collect media by subreddit, type, or engagement
  • Ad intelligence — Extract promoted post details and targeting data

🔒 Privacy & Compliance

The Reddit Post Media Downloader only accesses publicly available data through Reddit's public endpoints. No authentication or login credentials are required. Please ensure your use of this tool complies with Reddit's Terms of Service and applicable data privacy regulations.


📄 License

This project is provided as-is. Use responsibly and in accordance with applicable terms of service.