# Reddit Media & Images Scraper (`scrapers_lat/reddit-media-scraper`) Actor

Extract every image, video, GIF and gallery URL from Reddit subreddits and post URLs as JSON, CSV or Excel for ML datasets and content curation.

- **URL**: https://apify.com/scrapers_lat/reddit-media-scraper.md
- **Developed by:** [Scrapers Lat](https://apify.com/scrapers_lat) (community)
- **Categories:** Social media, Automation
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

from $10.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases as your subscription plan tier increases.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
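
As a rough illustration of per-result pricing, the sketch below estimates a run's charge at the listed starting rate. It assumes each dataset item is billed as one result at a flat $10.00 per 1,000, with no plan discount and no other billable events (this page does not detail those). `estimate_cost_usd` is a hypothetical helper, not part of any Apify library.

```python
from decimal import Decimal

# Assumption: the listed "from $10.00 / 1,000 results" rate applies flat.
# The rate you actually pay depends on your subscription plan.
PRICE_PER_1000_USD = Decimal("10.00")


def estimate_cost_usd(result_count: int,
                      price_per_1000: Decimal = PRICE_PER_1000_USD) -> Decimal:
    """Return the estimated charge in USD for `result_count` results."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    return (price_per_1000 * result_count / 1000).quantize(Decimal("0.01"))


# Estimate for a run capped at maxItems = 50.
print(estimate_cost_usd(50))
```

Since `maxItems` counts media items, it also acts as a ceiling on the number of billed results under this assumption.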

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

<!-- actor-banner -->
[![Reddit Media & Images Scraper](https://scrapers.lat/banners/reddit-media-scraper.png)](https://console.apify.com/actors/cnhAQoPkmzJasI5aF/input)
<!-- /actor-banner -->

## Reddit Media & Images Scraper

> Pull every image, video, GIF and gallery URL from any subreddit or post, one clean record per media item, ready for ML datasets and content curation.

![Apify](https://img.shields.io/badge/Platform-Apify-1CE1CE?logo=apify&logoColor=white)
![Coverage](https://img.shields.io/badge/Coverage-All%20Reddit-blue)
![Maintained](https://img.shields.io/badge/Maintained-Yes-brightgreen)
![Output](https://img.shields.io/badge/Output-JSON%20%7C%20CSV%20%7C%20Excel-orange)

<table><tr>
<td align="center"><strong>15 fields</strong><br>per media item</td>
<td align="center"><strong>All Reddit</strong><br>coverage</td>
<td align="center"><strong>JSON / CSV / Excel</strong><br>output formats</td>
<td align="center"><strong>Updated</strong><br>2026-06-28</td>
</tr></table>

<br>

### What you get

One record per media item, not per post, so a single gallery post expands into one row per image. Every direct image, video, GIF and gallery image is captured, with the source post attached for attribution. Use it to build training datasets, curate content, mirror media or audit what a community is posting.

- **mediaUrl**: the direct URL of the image, video, GIF or gallery image
- **mediaType**: image, video, gif or gallery
- **sourceDomain**: the host serving the media (i.redd.it, v.redd.it, imgur.com and so on)
- **width**: pixel width when the source exposes it
- **height**: pixel height when the source exposes it
- **postId**: the Reddit post id the media came from
- **postTitle**: the title of the source post
- **subreddit**: the community the post is in
- **author**: the poster's username
- **score**: net upvotes on the source post
- **numComments**: comment count on the source post
- **permalink**: the canonical reddit.com link to the source post
- **isNsfw**: whether the source post is marked NSFW
- **createdAt**: when the source post was submitted
- **observedAt**: when this media item was last seen by the scraper
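
To make the record shape concrete, here is a minimal sketch of the 15 fields as a Python `TypedDict`. The field names come from the list above; the value types (for example, timestamps as strings and `width`/`height` as optional integers) are assumptions, and `missing_fields` is a hypothetical helper for checking rows you download.

```python
from typing import List, Optional, TypedDict


class MediaRecord(TypedDict, total=False):
    """One dataset row. Value types are assumed, not confirmed by the Actor."""
    mediaUrl: str
    mediaType: str           # "image", "video", "gif" or "gallery"
    sourceDomain: str        # e.g. "i.redd.it"
    width: Optional[int]     # only present when the source exposes it
    height: Optional[int]
    postId: str
    postTitle: str
    subreddit: str
    author: str
    score: int
    numComments: int
    permalink: str
    isNsfw: bool
    createdAt: str           # assumed to be a timestamp string
    observedAt: str


EXPECTED_FIELDS = tuple(MediaRecord.__annotations__)


def missing_fields(item: dict) -> List[str]:
    """List the documented field names that are absent from a row."""
    return [name for name in EXPECTED_FIELDS if name not in item]


# Made-up row for illustration only.
sample = {"mediaUrl": "https://i.redd.it/example.jpg", "mediaType": "image", "postId": "abc123"}
print(missing_fields(sample))
```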

### Two ways to scrape

| Mode | Input | Use it for |
|---|---|---|
| Subreddit feeds | `subreddits` | Sweep hot/new/top/rising media from communities |
| Post URLs | `postUrls` | Pull every media item from exact posts you already have links to |

Combine them in one run. `maxItems` caps the total media items; `mediaTypes` narrows to the kinds you want; `includeExternal` toggles off-Reddit hosts.
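
A run that combines both modes could be assembled as in the sketch below. `build_input` and `normalize_subreddit` are hypothetical helpers; only the input keys (`subreddits`, `postUrls`, `maxItems`, `mediaTypes`, `includeExternal`) come from this page. Stripping a leading `r/` reflects the input schema's note that names are entered without the prefix.

```python
from typing import Iterable, Sequence


def normalize_subreddit(name: str) -> str:
    """Strip whitespace and a leading 'r/' or '/r/' from a subreddit name."""
    name = name.strip()
    for prefix in ("/r/", "r/"):
        if name.startswith(prefix):
            return name[len(prefix):]
    return name


def build_input(subreddits: Iterable[str] = (),
                post_urls: Iterable[str] = (),
                max_items: int = 50,
                media_types: Sequence[str] = ("image", "video", "gif", "gallery"),
                include_external: bool = True) -> dict:
    """Build an Actor input dict that mixes subreddit feeds and post URLs."""
    return {
        "maxItems": max_items,
        "subreddits": [normalize_subreddit(s) for s in subreddits],
        "postUrls": list(post_urls),
        "mediaTypes": list(media_types),
        "includeExternal": include_external,
    }


run_input = build_input(subreddits=["r/pics", " aww "], media_types=["image", "gallery"])
print(run_input["subreddits"])
```

The resulting dict can be passed as `run_input` in the Python example in the API section.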

### Who is it for

| Use case | Who benefits |
|---|---|
| Building image and video training sets | ML and computer vision engineers |
| Content curation and moodboards | Designers and creative teams |
| Trend and meme tracking | Researchers and social analysts |
| Media archiving and mirroring | Community archivists |
| Brand and asset monitoring | Marketing and PR teams |

### Frequently Asked Questions

**What media does it pull from Reddit?**
Direct images and GIFs from i.redd.it, videos from v.redd.it (using the highest-quality mp4 fallback when available), and every image inside a multi-image gallery. Posts with no media are skipped, not returned as errors.

**Does it include media hosted off Reddit?**
Yes, by default. With `includeExternal` enabled it captures links to imgur, gfycat, redgifs and streamable. Turn it off to keep only Reddit-hosted media on i.redd.it and v.redd.it.
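
If you have already collected a dataset with `includeExternal` enabled, the same split can be approximated afterwards on the client side. This sketch assumes `sourceDomain` holds a bare hostname such as `i.redd.it`, as the field list suggests; `split_by_host` is a hypothetical helper.

```python
from typing import Iterable, List, Tuple

# Hosts this page names as Reddit-hosted.
REDDIT_HOSTS = {"i.redd.it", "v.redd.it"}


def split_by_host(items: Iterable[dict]) -> Tuple[List[dict], List[dict]]:
    """Return (reddit_hosted, external) based on each row's sourceDomain."""
    reddit_hosted, external = [], []
    for item in items:
        if item.get("sourceDomain") in REDDIT_HOSTS:
            reddit_hosted.append(item)
        else:
            external.append(item)
    return reddit_hosted, external


# Made-up rows for illustration only.
rows = [
    {"mediaUrl": "https://i.redd.it/a.jpg", "sourceDomain": "i.redd.it"},
    {"mediaUrl": "https://imgur.com/b.png", "sourceDomain": "imgur.com"},
]
hosted, off_site = split_by_host(rows)
print(len(hosted), len(off_site))
```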

**How is maxItems counted?**
It counts media items, not posts. A single gallery post with ten images counts as ten toward your `maxItems`, since the scraper emits one record per image.
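
To see how media items relate to posts in a finished dataset, you could tally rows by `postId`. This is a minimal sketch with made-up rows; `media_per_post` is a hypothetical helper.

```python
from collections import Counter
from typing import Iterable


def media_per_post(items: Iterable[dict]) -> Counter:
    """Count how many media rows each source post contributed."""
    return Counter(item["postId"] for item in items if "postId" in item)


# Made-up data: one ten-image gallery post and one single-image post.
rows = [{"postId": "gallery1", "mediaType": "gallery"} for _ in range(10)]
rows.append({"postId": "single1", "mediaType": "image"})

counts = media_per_post(rows)
print(counts["gallery1"], sum(counts.values()))
```

Here two posts account for eleven media items, so both would fit under `maxItems: 50`, while a cap of 10 would be used up by the gallery alone.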

**Can I limit it to only images or only videos?**
Yes. Set `mediaTypes` to any subset of image, video, gif and gallery. Leave all four selected to capture everything a community posts.

**What happens with a post that fails to load?**
A post that fails to load is written as a record with an `error` field instead of being silently dropped. Posts that simply have no media are skipped without an error.
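
When post-processing a dataset, it helps to separate these error records from media rows first. The sketch below assumes a failed post is marked by a non-empty `error` value and that successful rows either omit the field or leave it empty; `partition_errors` is a hypothetical helper.

```python
from typing import Iterable, List, Tuple


def partition_errors(items: Iterable[dict]) -> Tuple[List[dict], List[dict]]:
    """Return (media_rows, error_rows) based on a non-empty `error` field."""
    media_rows, error_rows = [], []
    for item in items:
        if item.get("error"):
            error_rows.append(item)
        else:
            media_rows.append(item)
    return media_rows, error_rows


# Made-up rows for illustration only.
rows = [
    {"mediaUrl": "https://i.redd.it/a.jpg", "postId": "abc"},
    {"error": "failed to load post", "postId": "def"},
]
media, failed = partition_errors(rows)
print(len(media), len(failed))
```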

<!-- example-tasks -->
### Example use cases

Ready-to-run example tasks, each preconfigured for a common scenario. Open one and press run, or use it as a template:

- [Reddit Art and Wallpapers](https://apify.com/scrapers_lat/reddit-media-scraper/examples/reddit-art-wallpapers): Collect images from r/Art and r/wallpapers with direct media URLs, title, author and score for datasets.
- [Reddit Videos and GIFs from r/aww](https://apify.com/scrapers_lat/reddit-media-scraper/examples/reddit-aww-videos-gifs): Pull videos and gifs from r/aww with direct media URLs, post title, author and engagement scores.
- [Reddit Top Images from r/pics](https://apify.com/scrapers_lat/reddit-media-scraper/examples/reddit-pics-top-images): Scrape top images and galleries from r/pics and r/EarthPorn with direct media URLs, title and score.

<!-- /example-tasks -->

<!-- related-actors -->
### Related scrapers

Need data from the same space? Here are other scrapers we build and maintain:

- [Reddit Posts Scraper](https://apify.com/scrapers_lat/reddit-posts-scraper): Extract Reddit posts with full text, scores, awards and gallery data from subreddits, search and post URLs.
- [Reddit Posts & Comments Scraper](https://apify.com/scrapers_lat/reddit-scraper): Extract Reddit posts and comments from subreddits and search results using the public Reddit feeds.
- [X (Twitter) Profiles & Tweets Scraper](https://apify.com/scrapers_lat/x-twitter-scraper): Extract public X (Twitter) tweets by tweet ID: text, author, likes, replies, media and timestamps.
- [Instagram Profile & Posts Scraper](https://apify.com/scrapers_lat/instagram-scraper): Extract public Instagram profiles and recent posts by username without login.
- [YouTube Scraper](https://apify.com/scrapers_lat/youtube-scraper): Scrape YouTube videos and channels by search query, video URL or channel URL.
- [TikTok Creative Center Top Ads Scraper](https://apify.com/scrapers_lat/tiktok-creative-center-scraper): Scrape top-performing TikTok ads from the public Creative Center by country and time period.

<!-- /related-actors -->

<!-- scrapers-lat-cta -->
### More scrapers at scrapers.lat

This Actor is built and maintained by [scrapers.lat](https://scrapers.lat), where we publish scrapers for Latin American and US public platforms: real estate, jobs, e-commerce, company registries and government data. Browse the full catalog, see live sample output for each one, or ask us for a custom scraper at [scrapers.lat](https://scrapers.lat).

---

> This Actor is an independent tool and has no affiliation with Reddit, Inc. It only accesses data that is publicly available on Reddit. Use it in accordance with Reddit's terms of service.
<!-- /scrapers-lat-cta -->

# Actor input Schema

## `maxItems` (type: `integer`):

Maximum number of media items (images, videos, GIFs, gallery images) to collect. Note this counts media items, not posts. Optional.
## `subreddits` (type: `array`):

Subreddit names to pull media from. Enter names without the r/ prefix, for example: pics, aww.
## `postUrls` (type: `array`):

Optional. Exact Reddit post URLs to pull media from directly (https://www.reddit.com/r/.../comments/...).
## `sort` (type: `string`):

Sort order for subreddit feeds.
## `timeFilter` (type: `string`):

Time window for the 'top' sort. Ignored for hot, new and rising.
## `mediaTypes` (type: `array`):

Which kinds of media to keep. Leave all selected to capture everything.
## `includeExternal` (type: `boolean`):

When enabled, also capture media hosted off Reddit (imgur, gfycat, redgifs, streamable). Disable to keep only Reddit-hosted media (i.redd.it, v.redd.it).
## `proxyConfiguration` (type: `object`):

Proxy settings. Reddit blocks datacenter ranges, so the Apify Residential proxy is required and set as the default.
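
Before starting a run, you might check an input dict against the limits and enum values listed in the OpenAPI specification on this page (`maxItems` between 1 and 1,000,000, plus the `sort`, `timeFilter` and `mediaTypes` enums). `validate_input` is a hypothetical client-side helper; the platform performs its own validation regardless.

```python
from typing import List

SORTS = {"hot", "new", "top", "rising"}
TIME_FILTERS = {"hour", "day", "week", "month", "year", "all"}
MEDIA_TYPES = {"image", "video", "gif", "gallery"}


def validate_input(run_input: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    max_items = run_input.get("maxItems")
    if max_items is not None:
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_int = isinstance(max_items, int) and not isinstance(max_items, bool)
        if not is_int or not 1 <= max_items <= 1_000_000:
            problems.append("maxItems must be an integer from 1 to 1000000")
    if run_input.get("sort", "hot") not in SORTS:
        problems.append("sort must be one of: " + ", ".join(sorted(SORTS)))
    if run_input.get("timeFilter", "week") not in TIME_FILTERS:
        problems.append("timeFilter must be one of: " + ", ".join(sorted(TIME_FILTERS)))
    unknown = set(run_input.get("mediaTypes", [])) - MEDIA_TYPES
    if unknown:
        problems.append("unknown mediaTypes: " + ", ".join(sorted(unknown)))
    if not isinstance(run_input.get("includeExternal", True), bool):
        problems.append("includeExternal must be a boolean")
    return problems


print(validate_input({"maxItems": 0, "sort": "best", "mediaTypes": ["image", "audio"]}))
```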

## Actor input object example

```json
{
  "maxItems": 50,
  "subreddits": [
    "pics"
  ],
  "postUrls": [],
  "sort": "hot",
  "timeFilter": "week",
  "mediaTypes": [
    "image",
    "video",
    "gif",
    "gallery"
  ],
  "includeExternal": true,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  }
}
````

# Actor output Schema

## `results` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "maxItems": 50,
    "subreddits": [
        "pics"
    ],
    "postUrls": []
};

// Run the Actor and wait for it to finish
const run = await client.actor("scrapers_lat/reddit-media-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "maxItems": 50,
    "subreddits": ["pics"],
    "postUrls": [],
}

# Run the Actor and wait for it to finish
run = client.actor("scrapers_lat/reddit-media-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "maxItems": 50,
  "subreddits": [
    "pics"
  ],
  "postUrls": []
}' |
apify call scrapers_lat/reddit-media-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=scrapers_lat/reddit-media-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Reddit Media & Images Scraper",
        "description": "Extract every image, video, GIF and gallery URL from Reddit subreddits and post URLs as JSON, CSV or Excel for ML datasets and content curation.",
        "version": "0.1",
        "x-build-id": "xBQpCIHTwkqb5kOgS"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/scrapers_lat~reddit-media-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-scrapers_lat-reddit-media-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/scrapers_lat~reddit-media-scraper/runs": {
            "post": {
                "operationId": "runs-sync-scrapers_lat-reddit-media-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/scrapers_lat~reddit-media-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-scrapers_lat-reddit-media-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "proxyConfiguration"
                ],
                "properties": {
                    "maxItems": {
                        "title": "Max Items",
                        "minimum": 1,
                        "maximum": 1000000,
                        "type": "integer",
                        "description": "Maximum number of media items (images, videos, GIFs, gallery images) to collect. Note this counts media items, not posts. Optional."
                    },
                    "subreddits": {
                        "title": "Subreddits",
                        "type": "array",
                        "description": "Subreddit names to pull media from. Enter names without the r/ prefix, for example: pics, aww.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "postUrls": {
                        "title": "Post URLs",
                        "type": "array",
                        "description": "Optional. Exact Reddit post URLs to pull media from directly (https://www.reddit.com/r/.../comments/...).",
                        "items": {
                            "type": "string"
                        }
                    },
                    "sort": {
                        "title": "Sort",
                        "enum": [
                            "hot",
                            "new",
                            "top",
                            "rising"
                        ],
                        "type": "string",
                        "description": "Sort order for subreddit feeds.",
                        "default": "hot"
                    },
                    "timeFilter": {
                        "title": "Time Range",
                        "enum": [
                            "hour",
                            "day",
                            "week",
                            "month",
                            "year",
                            "all"
                        ],
                        "type": "string",
                        "description": "Time window for the 'top' sort. Ignored for hot, new and rising.",
                        "default": "week"
                    },
                    "mediaTypes": {
                        "title": "Media Types",
                        "type": "array",
                        "description": "Which kinds of media to keep. Leave all selected to capture everything.",
                        "items": {
                            "type": "string",
                            "enum": [
                                "image",
                                "video",
                                "gif",
                                "gallery"
                            ],
                            "enumTitles": [
                                "Images",
                                "Videos",
                                "GIFs",
                                "Gallery images"
                            ]
                        },
                        "default": [
                            "image",
                            "video",
                            "gif",
                            "gallery"
                        ]
                    },
                    "includeExternal": {
                        "title": "Include External Media",
                        "type": "boolean",
                        "description": "When enabled, also capture media hosted off Reddit (imgur, gfycat, redgifs, streamable). Disable to keep only Reddit-hosted media (i.redd.it, v.redd.it).",
                        "default": true
                    },
                    "proxyConfiguration": {
                        "title": "Proxy Configuration",
                        "type": "object",
                        "description": "Proxy settings. Reddit blocks datacenter ranges, so the Apify Residential proxy is required and set as the default.",
                        "default": {
                            "useApifyProxy": true,
                            "apifyProxyGroups": [
                                "RESIDENTIAL"
                            ]
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
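
For projects that call the REST API directly, the `run-sync-get-dataset-items` path from the specification above can be addressed with nothing but the Python standard library. This sketch only builds the request; `build_sync_request` is a hypothetical helper, and the token is passed as a query parameter as the specification describes.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "scrapers_lat~reddit-media-scraper"


def build_sync_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build a POST request that runs the Actor and returns its dataset items."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_sync_request(
    "<YOUR_API_TOKEN>",
    {"maxItems": 50, "subreddits": ["pics"], "postUrls": []},
)
print(request.get_method(), request.full_url.split("?")[0])

# To actually send it (needs a real token and network access); the response
# body is assumed here to be a JSON array of dataset items:
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)
```

Because the call blocks until the run finishes, the asynchronous `/runs` endpoint may be a better fit for long runs.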