# PR Newswire Scraper (`hgservices/pr-newswire-scraper`) Actor

Extract press releases from PR Newswire by keyword, company, industry, or date range. Get full text, publication dates, media contacts, and more.

- **URL**: https://apify.com/hgservices/pr-newswire-scraper.md
- **Developed by:** [Harish Garg](https://apify.com/hgservices) (community)
- **Categories:** Lead generation, News, Automation
- **Stats:** 3 total users, 2 monthly users, 100.0% runs succeeded, 1 bookmark
- **User rating**: No ratings yet

## Pricing

from $2.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
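
As a rough illustration of the pay-per-event arithmetic, the sketch below estimates the charge for a run from the listed rate. It assumes that each returned result is the only billed event and that the listed "from $2.00 / 1,000 results" rate applies to your plan; `estimate_cost` and `LISTED_RATE_PER_1000` are hypothetical names, and the actual charge depends on your subscription and on the events the Actor defines.

```python
from decimal import Decimal

# Assumption: the listed "from" rate, in USD per 1,000 results.
LISTED_RATE_PER_1000 = Decimal("2.00")


def estimate_cost(results: int, rate_per_1000: Decimal = LISTED_RATE_PER_1000) -> Decimal:
    """Estimate the charge in USD for `results` rows at a flat per-result rate."""
    if results < 0:
        raise ValueError("results must be non-negative")
    # Scale the per-1,000 rate to the row count and round to whole cents.
    return (Decimal(results) * rate_per_1000 / Decimal(1000)).quantize(Decimal("0.01"))


print(estimate_cost(1000))  # a run that returns 1,000 rows
print(estimate_cost(250))   # a smaller run
```

Treat the result as an estimate rather than a quote, since the rate that applies to a given account may differ from the listed one.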

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server that can be used as a website, an API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see the Apify documentation as a [Markdown index](https://docs.apify.com/llms.txt) or as [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## PR Newswire Scraper

Search and export press releases from **PR Newswire** in seconds. This Actor gives you a **full-text search engine** over a continuously updated index of PR Newswire press releases, with date filtering, BM25 relevance ranking, and clean structured output ready for CSV, Excel, JSON, or API consumption.

Use it to monitor competitors, track M&A and earnings announcements, build financial datasets, power media monitoring tools, or feed downstream LLM pipelines with grounded, sourceable press release data.

### Why this Actor

Get clean, structured press releases from PR Newswire in seconds — searchable by keyword, filterable by date, and ready to drop into a spreadsheet, database, or LLM pipeline.

- **Press releases, on demand.** Pull the latest releases, a date range, or every release matching a keyword — without running a browser or parsing HTML yourself.
- **Search like Google, not like grep.** Prefix matching, diacritic folding, and BM25 ranking mean `cafe merger` finds `Café Holdings announces merger` without you wrestling with regex.
- **Title-weighted relevance.** Headline matches rank 10× higher than body matches, so the most relevant releases surface first.
- **Date range filtering.** Pull every release between any two dates with one parameter — perfect for quarterly research, event windows, or backfilling datasets.
- **Structured output.** Every row is clean JSON with stable field names. Pipe straight into Sheets, BigQuery, a vector DB, or your own ETL.
- **No scraping fragility on your side.** You don't run a browser, manage proxies, or parse HTML. You query an index. Runs finish in seconds, not hours.

### Common use cases

- **Competitive intelligence** — track product launches, leadership changes, and partnership announcements across an industry.
- **Financial & investor research** — surface earnings, guidance updates, M&A activity, and IPO filings on a date range.
- **Media monitoring & PR analytics** — measure share-of-voice for a brand, executive, or topic over time.
- **Dataset building for AI/ML** — bulk-export press releases as a labeled corpus for fine-tuning, RAG, or sentiment analysis.
- **Lead generation** — find companies announcing funding rounds, expansions, or hiring sprees.
- **Compliance & legal discovery** — pull every public statement from a company in a defined window.

### Output: what you get back

Each row in the dataset contains:

| Field       | Description                                                              |
| ----------- | ------------------------------------------------------------------------ |
| `id`        | Internal press release ID                                                |
| `url`       | Canonical PR Newswire URL                                                |
| `title`     | Press release headline                                                   |
| `pubDate`   | Publication date (ISO `YYYY-MM-DD`)                                      |
| `timestamp` | Publication timestamp as scraped from the source                         |
| `body`      | Full press release body text                                             |

Export with one click as **JSON, CSV, Excel, XML, RSS, or HTML** from the Apify dataset view, or pull programmatically via the Apify API.

### Input parameters

All fields are optional — combine them freely.

| Field         | Type    | Description                                                                                                          |
| ------------- | ------- | -------------------------------------------------------------------------------------------------------------------- |
| `query`       | string  | Full-text search across title and content. Tokens are prefix-matched and AND-combined. Leave empty to browse by date. |
| `titleOnly`   | boolean | Restrict `query` to the title column. Default `false`.                                                               |
| `fromDate`    | string  | Lower bound for `pubDate` (inclusive), `YYYY-MM-DD`.                                                                 |
| `toDate`      | string  | Upper bound for `pubDate` (inclusive), `YYYY-MM-DD`.                                                                 |
| `maxRecords`  | integer | Cap on rows returned. Default `100`, max `1000`.                                                                     |

### Example queries

**Browse the 50 most recent press releases:**

```json
{ "maxRecords": 50 }
````

**Search for earnings news in 2026:**

```json
{ "query": "earnings", "fromDate": "2026-01-01", "maxRecords": 20 }
```

**Find M&A headlines only (title match):**

```json
{ "query": "merger acquisition", "titleOnly": true, "maxRecords": 10 }
```

**Pull every release mentioning a company in Q1:**

```json
{ "query": "tesla", "fromDate": "2026-01-01", "toDate": "2026-03-31", "maxRecords": 1000 }
```

**Get full body text for downstream LLM/RAG use:**

```json
{ "query": "guidance raised", "maxRecords": 25 }
```

### How search works

- **No query** → results sorted by `pubDate` descending (newest first).
- **With query** → results sorted by **BM25 relevance**, with title hits weighted 10× heavier than body hits.
- **Prefix matching** — `trump rally` matches rows containing both `trump*` and `rally*`, in any order, in any column.
- **Diacritic-insensitive** — `cafe` matches `Café`. The tokenizer is `unicode61` with diacritic folding.
- **Date filters** apply to the ISO `pubDate` field; rows with a missing publication date are excluded when a date filter is set.
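
To make the matching rules above concrete, here is a minimal pure-Python sketch of diacritic folding combined with AND-combined prefix matching. It is an approximation for illustration only, not the Actor's implementation: the names `fold`, `tokenize`, and `matches` are hypothetical, the tokenizer is a simple word split rather than `unicode61`, and BM25 ranking is not modeled.

```python
from __future__ import annotations

import re
import unicodedata


def fold(text: str) -> str:
    """Lower-case the text and strip combining marks, so 'Café' becomes 'cafe'."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_marks.casefold()


def tokenize(text: str) -> list[str]:
    """Split folded text into word tokens (a stand-in for the real tokenizer)."""
    return re.findall(r"\w+", fold(text))


def matches(query: str, document: str) -> bool:
    """True when every query token is a prefix of at least one document token."""
    doc_tokens = tokenize(document)
    return all(
        any(token.startswith(prefix) for token in doc_tokens)
        for prefix in tokenize(query)
    )


print(matches("cafe merger", "Café Holdings announces merger"))
print(matches("cafe tesla", "Café Holdings announces merger"))
```

The first call prints `True` because both tokens match after folding; the second prints `False` because `tesla` has no matching token, which reflects the AND semantics described above.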

### Limits & pricing

- `maxRecords` is capped at **1,000 rows per run**. To pull larger archives, slice by date range across multiple runs.
- Queries against the index typically complete in under a second. Billing is per event, as described in the Pricing section, rather than per unit of compute time.
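
The date-slicing approach for larger archives can be sketched as follows. This is one possible way to generate non-overlapping inputs, assuming fixed-width windows are acceptable; `date_slices`, `build_inputs`, and the seven-day default are hypothetical choices. Because both `fromDate` and `toDate` are inclusive, each window starts the day after the previous one ends.

```python
from __future__ import annotations

from datetime import date, timedelta
from typing import Iterator


def date_slices(start: date, end: date, days: int = 7) -> Iterator[tuple[str, str]]:
    """Yield inclusive (fromDate, toDate) ISO pairs that cover start..end without overlap."""
    if days < 1:
        raise ValueError("days must be at least 1")
    cursor = start
    while cursor <= end:
        window_end = min(cursor + timedelta(days=days - 1), end)
        yield cursor.isoformat(), window_end.isoformat()
        cursor = window_end + timedelta(days=1)


def build_inputs(query: str, start: date, end: date, days: int = 7) -> list[dict]:
    """Build one Actor input per window, each requesting the 1,000-row maximum."""
    return [
        {"query": query, "fromDate": lower, "toDate": upper, "maxRecords": 1000}
        for lower, upper in date_slices(start, end, days)
    ]


for actor_input in build_inputs("tesla", date(2026, 1, 1), date(2026, 1, 10)):
    print(actor_input)
```

If a single window returns exactly 1,000 rows, it may have been truncated by the cap, so rerunning that window with a smaller `days` value is a reasonable precaution.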

### FAQ

**Can I integrate this with my app?**
Yes — every Apify Actor exposes a REST API and webhooks. Trigger runs from your backend and read results from the dataset endpoint.
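
For backends that do not use an Apify client library, the request can be assembled with the standard library alone. The sketch below targets the `run-sync-get-dataset-items` endpoint listed in the OpenAPI specification of this page and passes the token as a query parameter, as that specification describes. The helper names `build_request` and `fetch_releases` are hypothetical, and the assumption that the response body is a JSON array of dataset items should be checked against the Apify API reference.

```python
from __future__ import annotations

import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/hgservices~pr-newswire-scraper/run-sync-get-dataset-items"


def build_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Prepare a POST request that carries the Actor input as a JSON body."""
    query = urllib.parse.urlencode({"token": token})
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}{ACTOR_PATH}?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_releases(token: str, actor_input: dict, timeout: float = 120.0) -> list:
    """Send the request and decode the response (assumed to be a JSON array of items)."""
    with urllib.request.urlopen(build_request(token, actor_input), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Replace the placeholder with a real token before running.
    for release in fetch_releases("<YOUR_API_TOKEN>", {"query": "earnings", "maxRecords": 5}):
        print(release.get("title"))
```

Keeping request construction separate from sending makes the URL and body easy to inspect or unit-test without network access.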

***

Questions, feature requests, or want a custom Actor? Reach out via the Apify Console; feedback drives the roadmap.

# Actor input Schema

## `query` (type: `string`):

Full-text search across press release titles and content. Leave empty to browse by date only.

## `titleOnly` (type: `boolean`):

When enabled, the search query is matched only against press release titles (ignored if no query is given).

## `fromDate` (type: `string`):

Lower bound for `pub_date` (inclusive). ISO format `YYYY-MM-DD`.

## `toDate` (type: `string`):

Upper bound for `pub_date` (inclusive). ISO format `YYYY-MM-DD`.

## `maxRecords` (type: `integer`):

Maximum number of press releases to return per run. Capped at 1000 to keep runs predictable; narrow the date range or query to find more.

## Actor input object example

```json
{
  "query": "earnings",
  "titleOnly": false,
  "fromDate": "2026-01-01",
  "toDate": "2026-04-30",
  "maxRecords": 100
}
```
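
Checking an input object on the client before starting a run can catch mistakes early. The following is a minimal sketch based on the constraints documented in this schema (dates in `YYYY-MM-DD`, `maxRecords` between 1 and 1000). The `build_input` helper and its snake_case parameters are hypothetical, and the Actor's own validation may be stricter or more lenient than this.

```python
from __future__ import annotations

import re
from datetime import date
from typing import Any, Optional

ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def _check_date(name: str, value: str) -> date:
    """Require the YYYY-MM-DD shape and a real calendar date."""
    if not ISO_DATE.fullmatch(value):
        raise ValueError(f"{name} must use YYYY-MM-DD, got {value!r}")
    return date.fromisoformat(value)  # raises ValueError for dates such as 2026-02-30


def build_input(
    query: Optional[str] = None,
    title_only: bool = False,
    from_date: Optional[str] = None,
    to_date: Optional[str] = None,
    max_records: int = 100,
) -> dict[str, Any]:
    """Return an Actor input dict, omitting optional string fields that were not given."""
    if not 1 <= max_records <= 1000:
        raise ValueError("maxRecords must be between 1 and 1000")
    start = _check_date("fromDate", from_date) if from_date else None
    end = _check_date("toDate", to_date) if to_date else None
    if start and end and start > end:
        raise ValueError("fromDate must not be after toDate")

    actor_input: dict[str, Any] = {"titleOnly": title_only, "maxRecords": max_records}
    if query:
        actor_input["query"] = query
    if from_date:
        actor_input["fromDate"] = from_date
    if to_date:
        actor_input["toDate"] = to_date
    return actor_input


print(build_input(query="earnings", from_date="2026-01-01", to_date="2026-04-30"))
```

The returned dictionary can be passed as the run input in any of the integration examples in the API section.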

# Actor output Schema

## `dataset` (type: `string`):

Matching press releases sorted by relevance (when a query is given) or by publication date.
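
When consuming the dataset in typed code, it can help to map each item onto a small record type. The sketch below uses the six field names from the README's output table. The `PressRelease` class is hypothetical, the field types are assumptions (the type of `id` is not documented, so it is left untyped), and the sample item is invented for illustration rather than taken from a real run.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from typing import Any, Optional


@dataclass(frozen=True)
class PressRelease:
    id: Any                   # type not documented; kept as returned
    url: str
    title: str
    pub_date: Optional[date]  # parsed from the ISO `pubDate` field, if present
    timestamp: Optional[str]  # kept as the raw string scraped from the source
    body: str

    @classmethod
    def from_item(cls, item: dict) -> "PressRelease":
        """Convert one dataset item (a dict) into a PressRelease."""
        raw_date = item.get("pubDate")
        return cls(
            id=item.get("id"),
            url=item.get("url", ""),
            title=item.get("title", ""),
            pub_date=date.fromisoformat(raw_date) if raw_date else None,
            timestamp=item.get("timestamp"),
            body=item.get("body", ""),
        )


# Invented sample item, shaped like the documented output fields.
sample = {
    "id": 1,
    "url": "https://www.prnewswire.com/news-releases/example.html",
    "title": "Example Corp Reports First Quarter Results",
    "pubDate": "2026-01-15",
    "timestamp": "Jan 15, 2026",
    "body": "Example body text.",
}
release = PressRelease.from_item(sample)
print(release.title, release.pub_date)
```

Parsing `pubDate` into a `date` makes sorting and window checks straightforward, while a missing value is kept as `None` rather than raising an error.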

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "query": "earnings"
};

// Run the Actor and wait for it to finish
const run = await client.actor("hgservices/pr-newswire-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "query": "earnings" }

# Run the Actor and wait for it to finish
run = client.actor("hgservices/pr-newswire-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "query": "earnings"
}' |
apify call hgservices/pr-newswire-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=hgservices/pr-newswire-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "PR Newswire Scraper",
        "description": "Extract press releases from PR Newswire by keyword, company, industry, or date range. Get full text, publication dates, media contacts, and more.",
        "version": "0.1",
        "x-build-id": "jfpXWdDzEzpSDDStJ"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/hgservices~pr-newswire-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-hgservices-pr-newswire-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/hgservices~pr-newswire-scraper/runs": {
            "post": {
                "operationId": "runs-sync-hgservices-pr-newswire-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/hgservices~pr-newswire-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-hgservices-pr-newswire-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "query": {
                        "title": "Search query",
                        "type": "string",
                        "description": "Full-text search across press release titles and content. Leave empty to browse by date only."
                    },
                    "titleOnly": {
                        "title": "Search titles only",
                        "type": "boolean",
                        "description": "When enabled, the search query is matched only against press release titles (ignored if no query is given).",
                        "default": false
                    },
                    "fromDate": {
                        "title": "From date",
                        "type": "string",
                        "description": "Lower bound for pub_date (inclusive). ISO format YYYY-MM-DD."
                    },
                    "toDate": {
                        "title": "To date",
                        "type": "string",
                        "description": "Upper bound for pub_date (inclusive). ISO format YYYY-MM-DD."
                    },
                    "maxRecords": {
                        "title": "Max records",
                        "minimum": 1,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "Maximum number of press releases to return per run. Capped at 1000 to keep runs predictable; narrow the date range or query to find more.",
                        "default": 100
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
