# 404 Media Articles Scraper | Tech Investigative News (`parseforge/404media-articles-scraper`) Actor

Collect the latest 404 Media articles with title, author, publication date, summary, and categories from the public RSS feed. Built for tech journalists, AI researchers, and media monitoring teams tracking investigative tech reporting on platforms, AI, and digital culture.

- **URL**: https://apify.com/parseforge/404media-articles-scraper.md
- **Developed by:** [ParseForge](https://apify.com/parseforge) (community)
- **Categories:** News, Other
- **Stats:** 2 total users, 1 monthly user, 100.0% of runs succeeded
- **User rating**: No ratings yet

## Pricing

from $19.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
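
As a rough illustration of the pay-per-event arithmetic, the sketch below estimates a run's cost from the listed starting price of $19.00 per 1,000 results. It assumes one billed event per returned record and ignores plan discounts and any other event types, none of which are itemized on this page. `estimate_cost_usd` is a hypothetical helper, not part of any Apify library.

```python
from decimal import Decimal

# Listed starting price on this page: $19.00 per 1,000 results.
PRICE_PER_1000_RESULTS_USD = Decimal("19.00")


def estimate_cost_usd(result_count: int) -> Decimal:
    """Estimate the charge for `result_count` results at the listed starting price.

    Assumption: one billed event per returned record and no plan discount.
    """
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    cost = PRICE_PER_1000_RESULTS_USD * result_count / 1000
    return cost.quantize(Decimal("0.01"))


for count in (10, 30, 1000):
    print(f"{count} results: about ${estimate_cost_usd(count)}")
```

Under these assumptions a 30-article run costs about $0.57. Treat the figure as an upper-level estimate and check the pricing page for your plan.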

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server that can be used as a website, an API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

![ParseForge Banner](https://github.com/ParseForge/apify-assets/blob/ad35ccc13ddd068b9d6cba33f323962e39aed5b2/banner.jpg?raw=true)

## 🛰️ 404 Media Articles Scraper

> 🚀 **Export 404 Media articles in seconds.** Pull the latest tech investigative journalism with title, link, author, image, and summary in one run.

> 🕒 **Last updated:** 2026-05-25 · **📊 10 fields** per record · **Latest 15 to 30 articles** per run · **Worldwide tech coverage**

404 Media is a journalist-owned tech publication, founded by former Motherboard staff, that covers AI, surveillance, hacking, online culture, and tech labor. This Actor exports the latest 404 Media articles with title, link, author, hero image, categories, publish date, and summary, all from the public RSS feed in real time.

| 🎯 Target Audience | 💡 Primary Use Cases |
|---|---|
| Tech journalists, PR teams | Track 404 Media coverage of AI, platforms, surveillance |
| Newsletter editors | Curate weekly tech investigative reads |
| Researchers, academics | Compile reading lists on tech labor and culture |
| Sentiment analysts | Feed article summaries into NLP pipelines |

### 📋 What the 404 Media Articles Scraper does

- Fetches the live 404 Media RSS feed in real time
- Extracts hero image, title, URL, author, categories, publish date, and summary
- Returns clean records ready for CSV, Excel, JSON, or XML export
- Decodes HTML entities and strips inline markup
- Limits records to the count you choose with `maxItems`

> 💡 **Why it matters:** 404 Media breaks original tech stories that often shape the broader news cycle. This scraper turns the feed into a structured dataset you can index, search, or pipe into automations.
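
To make the "decodes HTML entities and strips inline markup" step concrete, here is a minimal sketch using only the Python standard library. It is not the Actor's actual implementation, which is not published here; `clean_summary` and `_TextExtractor` are hypothetical names, and the whitespace normalization is an added assumption.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; before handle_data is called.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def clean_summary(raw_html: str) -> str:
    """Drop tags, decode entities, and collapse runs of whitespace."""
    extractor = _TextExtractor()
    extractor.feed(raw_html)
    extractor.close()  # flush any text still buffered by the parser
    return " ".join("".join(extractor.parts).split())


print(clean_summary("<p>AI &amp; surveillance <em>reporting</em></p>"))
```

Using a parser rather than a regular expression avoids mangling text that merely contains angle brackets or nested tags.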

### 🎬 Full Demo

_🚧 Coming soon_

### ⚙️ Input

<table>
<thead><tr><th>Field</th><th>Type</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>maxItems</code></td><td>integer</td><td>Free users limited to 10 items. Paid users up to 1,000,000. Defaults to 10.</td></tr>
</tbody>
</table>

```json
{ "maxItems": 5 }
````

```json
{ "maxItems": 30 }
```

> ⚠️ **Good to Know:** The 404 Media RSS feed exposes only the most recent ~20-30 articles. Schedule the Actor to run regularly for ongoing coverage.
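
Because each scheduled run returns the same rolling window of recent articles, consecutive runs will overlap. The sketch below shows one way to keep only unseen records by keying on the `url` field. `merge_new_articles` is a hypothetical helper, and the in-memory `seen` dictionary stands in for whatever store you actually persist between runs.

```python
def merge_new_articles(seen: dict, batch: list) -> list:
    """Add unseen records from `batch` to `seen` (keyed by `url`) and return the new ones."""
    new_records = []
    for record in batch:
        url = record.get("url")
        # Skip records without a URL (for example error rows) and ones already stored.
        if not url or url in seen:
            continue
        seen[url] = record
        new_records.append(record)
    return new_records


seen_articles = {}
first_run = [{"url": "https://www.404media.co/a/", "title": "A"}]
second_run = [
    {"url": "https://www.404media.co/a/", "title": "A"},
    {"url": "https://www.404media.co/b/", "title": "B"},
]
merge_new_articles(seen_articles, first_run)
print([r["title"] for r in merge_new_articles(seen_articles, second_run)])
```

The URLs in the demo are placeholders. In practice you would load `seen` from a database or key-value store before each run and save it afterwards.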

### 📊 Output

| Field | Type | Description |
|---|---|---|
| 🖼️ `imageUrl` | string | Hero image URL |
| 📌 `title` | string | Article headline |
| 🔗 `url` | string | Canonical 404 Media article URL |
| ✍️ `author` | string | Article author |
| 🏷️ `categories` | string[] | Article categories |
| 🗓️ `publishedAt` | ISO date | Original publish timestamp |
| 📝 `summary` | string | Article description, HTML stripped |
| 📡 `source` | string | Always "404 Media" |
| 🕒 `scrapedAt` | ISO date | When this record was collected |
| ❌ `error` | string | Null on success |

Real sample records:

```json
{
  "imageUrl": "https://images.unsplash.com/photo-1715026323270-564a788bbadc?...",
  "title": "An Incomplete List of Successful Anti-Data Center Legislation",
  "url": "https://www.404media.co/an-incomplete-list-of-successful-anti-data-center-legislation/",
  "author": "Matthew Gault",
  "categories": ["News"],
  "publishedAt": "2026-05-25T13:00:30.000Z",
  "summary": "No one wants to live next to a noisy computer warehouse and communities across the country are successfully fighting them.",
  "source": "404 Media"
}
```

```json
{
  "title": "Corpse Point in the Arctic Is Melting, Disturbing Centuries-Old Bodies",
  "url": "https://www.404media.co/corpse-point-in-the-arctic-is-melting-disturbing-centuries-old-bodies/",
  "publishedAt": "2026-05-23T13:00:55.000Z",
  "source": "404 Media"
}
```

```json
{
  "title": "Here's the Bodycam Footage of the Cybertruck That Drove Into a Lake",
  "url": "https://www.404media.co/heres-the-bodycam-footage-of-the-cybertruck-that-drove-into-a-lake/",
  "publishedAt": "2026-05-22T21:03:03.000Z",
  "source": "404 Media"
}
```
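
For downstream code it can help to load records like those above into typed objects. The following is a minimal sketch, assuming `title`, `url`, and `publishedAt` are always present on successful records (as in the samples) and that the other fields may be missing. `Article`, `parse_iso_utc`, and `to_article` are hypothetical names.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Article:
    title: str
    url: str
    published_at: datetime
    author: Optional[str] = None
    categories: List[str] = field(default_factory=list)
    summary: Optional[str] = None


def parse_iso_utc(value: str) -> datetime:
    """Parse a timestamp such as 2026-05-25T13:00:30.000Z into an aware UTC datetime."""
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so rewrite it as an explicit UTC offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value).astimezone(timezone.utc)


def to_article(record: dict) -> Article:
    """Convert one dataset record into an Article, tolerating missing optional fields."""
    return Article(
        title=record["title"],
        url=record["url"],
        published_at=parse_iso_utc(record["publishedAt"]),
        author=record.get("author"),
        categories=list(record.get("categories") or []),
        summary=record.get("summary"),
    )
```

Parsing `publishedAt` into a timezone-aware `datetime` makes it straightforward to sort records or select a date range.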

### ✨ Why choose this Actor

- ⚡ Live data, no caching, ~3-second runs
- 🧼 Clean text, decoded entities, no inline HTML
- 🪶 No proxy required, lightweight footprint
- 🧪 Stable schema
- 🆓 Free tier: 10 articles per run

### 📈 How it compares to alternatives

| Approach | Speed | Setup | Structured | Cost |
|---|---|---|---|---|
| ParseForge 404 Media Scraper | Fast | None | Yes | Pay-per-event |
| Manual copy from 404media.co | Slow | None | No | Free |
| RSS reader app | Fast | Account | Partial | Free / paid |
| Custom scraper | Slow | Code | Yes | Dev time |

### 🚀 How to use

1. [Create a free Apify account](https://console.apify.com/sign-up?fpr=vmoqkp) with $5 starter credit
2. Open the Actor page on Apify Store
3. Set `maxItems`
4. Click "Run"
5. Download as CSV, Excel, JSON, or XML

### 💼 Business use cases

**PR and brand monitoring** — Track 404 Media coverage of your company, products, or industry.

**Editorial intelligence** — Auto-curate weekly tech investigative reading lists.

**Research and academia** — Study tech journalism patterns and topic distribution.

**Sentiment analysis** — Feed article summaries into NLP models.

### 🔌 Automating 404 Media Articles Scraper

Connect to Make, Zapier, n8n, Airbyte, Slack, Google Drive, GitHub Actions, or any HTTP-capable platform.

### 🌟 Beyond business use cases

**Academic research** — Study tech journalism economics over time.

**Personal reading** — Build a weekly digest of 404 Media stories.

**Non-profit advocacy** — Track coverage of surveillance, AI ethics, and tech labor.

**Creative experimentation** — Use headlines as story prompts.

### 🤖 Ask an AI assistant about this scraper

Paste this README into [ChatGPT](https://chat.openai.com), [Claude](https://claude.ai), [Perplexity](https://www.perplexity.ai), or [Microsoft Copilot](https://copilot.microsoft.com).

### ❓ Frequently Asked Questions

**❓ Is this affiliated with 404 Media?** No. Independent tool, public RSS data only.

**❓ How many articles can I get per run?** Up to ~30 most recent from the RSS feed.

**❓ Full article body?** No, summaries only. 404 Media is subscriber-supported.

**❓ Freshness?** Real-time, every run hits the live feed.

**❓ API key?** Not needed.

**❓ Filter by author?** Filter the output dataset in your BI tool.

**❓ Proxy?** Not required.

**❓ Clean summaries?** Yes, HTML stripped, entities decoded.

**❓ Scheduling?** Use Apify Schedules.

**❓ Output format?** CSV, Excel, JSON, XML.
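
If you prefer to filter in code rather than in a BI tool, the author question above can be handled with a few lines of Python over the exported records. This is a sketch only: `filter_articles` is a hypothetical helper, and the case-insensitive matching is an assumption about what is convenient, not something the Actor does.

```python
def filter_articles(records, author=None, category=None):
    """Return records matching the given author and/or category, ignoring case."""

    def matches(record):
        if author is not None:
            if (record.get("author") or "").casefold() != author.casefold():
                return False
        if category is not None:
            categories = [c.casefold() for c in record.get("categories") or []]
            if category.casefold() not in categories:
                return False
        return True

    return [record for record in records if matches(record)]


records = [
    {"title": "First", "author": "Matthew Gault", "categories": ["News"]},
    {"title": "Second"},
]
print([r["title"] for r in filter_articles(records, author="matthew gault")])
```

Records that lack an `author` or `categories` field simply fail the corresponding filter instead of raising an error.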

### 🔌 Integrate with any app

Make, Zapier, n8n, Airbyte, Slack, Google Sheets, Google Drive, Microsoft Teams, Notion, Airtable, BigQuery, Snowflake, GitHub Actions, AWS Lambda, plus any REST-capable system.

### 🔗 Recommended Actors

| Actor | What it does |
|---|---|
| [Defector Articles Scraper](https://apify.com/parseforge/defector-articles-scraper) | Independent sports and culture journalism |
| [Hacker News Scraper](https://apify.com/parseforge/hacker-news-scraper) | Hacker News front page |
| [Techmeme Scraper](https://apify.com/parseforge/techmeme-scraper) | Tech industry news aggregator |
| [Slashdot Scraper](https://apify.com/parseforge/slashdot-scraper) | Slashdot stories |
| [Reddit Scraper](https://apify.com/parseforge/reddit-scraper) | Reddit posts and subreddits |

> 💡 **Pro Tip:** browse the complete [ParseForge collection](https://apify.com/parseforge) for more news scrapers.

**🆘 Need Help?** [Open our contact form](https://tally.so/r/BzdKgA)

> **⚠️ Disclaimer:** independent tool, not affiliated with 404 Media. Only publicly available RSS data is collected.

# Actor input Schema

## `maxItems` (type: `integer`):

Free users: Limited to 10 items (preview). Paid users: Optional, max 1,000,000

## Actor input object example

```json
{
  "maxItems": 10
}
```
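
Before starting a run, you may want to validate the input locally. The sketch below checks `maxItems` against the bounds given in the OpenAPI specification in this document (minimum 1, maximum 1,000,000). `build_input` is a hypothetical helper; the platform performs its own validation regardless.

```python
MIN_ITEMS = 1
MAX_ITEMS = 1_000_000


def build_input(max_items: int) -> dict:
    """Return an Actor input object after checking `maxItems` against the schema bounds."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_items, bool) or not isinstance(max_items, int):
        raise TypeError("maxItems must be an integer")
    if not MIN_ITEMS <= max_items <= MAX_ITEMS:
        raise ValueError(f"maxItems must be between {MIN_ITEMS} and {MAX_ITEMS}")
    return {"maxItems": max_items}


print(build_input(10))
```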

# Actor output Schema

## `results` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "maxItems": 10
};

// Run the Actor and wait for it to finish
const run = await client.actor("parseforge/404media-articles-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "maxItems": 10 }

# Run the Actor and wait for it to finish
run = client.actor("parseforge/404media-articles-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "maxItems": 10
}' |
apify call parseforge/404media-articles-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=parseforge/404media-articles-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "404 Media Articles Scraper | Tech Investigative News",
        "description": "Collect 404 Media articles with title, author, publication date, full body, and tags. Filter by section, topic, or keyword. Built for tech journalists, AI researchers, and media monitoring teams tracking investigative tech reporting on platforms, AI, and digital culture.",
        "version": "0.1",
        "x-build-id": "4Kbqt8dP2RRfeof6X"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/parseforge~404media-articles-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-parseforge-404media-articles-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/parseforge~404media-articles-scraper/runs": {
            "post": {
                "operationId": "runs-sync-parseforge-404media-articles-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/parseforge~404media-articles-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-parseforge-404media-articles-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "maxItems": {
                        "title": "Max Items",
                        "minimum": 1,
                        "maximum": 1000000,
                        "type": "integer",
                        "description": "Free users: Limited to 10 items (preview). Paid users: Optional, max 1,000,000"
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
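
For projects that call the REST API directly instead of using a client library, the `run-sync-get-dataset-items` path from the specification above can be exercised with the Python standard library alone. The sketch below is hedged: it passes the token as the `token` query parameter exactly as the specification describes, and it assumes the response body is a JSON array of dataset items, which the endpoint summary implies but the specification does not spell out. `build_request` and `fetch_articles` are hypothetical names.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
RUN_SYNC_PATH = "/acts/parseforge~404media-articles-scraper/run-sync-get-dataset-items"


def build_request(token: str, max_items: int = 10) -> urllib.request.Request:
    """Build the POST request for the run-sync-get-dataset-items endpoint."""
    query = urllib.parse.urlencode({"token": token})
    body = json.dumps({"maxItems": max_items}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}{RUN_SYNC_PATH}?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_articles(token: str, max_items: int = 10, timeout: float = 120.0) -> list:
    """Run the Actor synchronously and return the parsed JSON response.

    Assumption: the response is a JSON array of dataset items.
    """
    request = build_request(token, max_items)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_articles("<YOUR_API_TOKEN>", max_items=5)` starts a real, billable run, so keep the token out of source control and read it from an environment variable in practice.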
