# The Guardian Article Search & Archive Scraper (`parseforge/guardian-content-search-scraper`) Actor

Search The Guardian's full article archive (2.6M+ articles since 1999). Filter by query, section, tag, contributor, date, or production office. Returns headline, byline, body, tags, contributors, and publication metadata.

- **URL**: https://apify.com/parseforge/guardian-content-search-scraper.md
- **Developed by:** [ParseForge](https://apify.com/parseforge) (community)
- **Categories:** News, Business, Automation
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $29.62 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
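
As a rough budgeting aid, the listed starting price can be turned into a per-run estimate. The sketch below assumes every result is billed at the listed $29.62 per 1,000 rate; the helper name `estimate_cost` is hypothetical, and actual charges may differ with plan discounts or other billed events.

```python
from decimal import Decimal, ROUND_HALF_UP

# Listed starting price from this page; treat it as an assumption for any given plan.
PRICE_PER_1000_RESULTS = Decimal("29.62")


def estimate_cost(results: int, price_per_1000: Decimal = PRICE_PER_1000_RESULTS) -> Decimal:
    """Estimate the charge in USD for a number of results (hypothetical helper)."""
    if results < 0:
        raise ValueError("results must be non-negative")
    cost = price_per_1000 * Decimal(results) / Decimal(1000)
    # Round to whole cents for display.
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

Calling `estimate_cost(100)` gives a ballpark figure for a 100-article run at the listed rate.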

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

![ParseForge Banner](https://github.com/ParseForge/apify-assets/blob/ad35ccc13ddd068b9d6cba33f323962e39aed5b2/banner.jpg?raw=true)

## 📰 The Guardian Article Search Scraper

> 🚀 **Search 2.6 million Guardian articles in seconds.** Headlines, bylines, full body text, tags, contributors, star ratings, and section metadata across the complete archive since 1999. No sign-up, no manual scraping.

> 🕒 **Last updated:** 2026-05-15 · **📊 30 fields** per article · **📰 2.6M+ articles** · **📂 32 sections** · **📅 Archive since 1999**

The **Guardian Article Search Scraper** exports articles from **The Guardian** and returns **30 fields per record**, including headline, byline, full body text and HTML, contributors, tags, section metadata, star ratings for reviews, and image gallery URLs. The Guardian archive is one of the most-cited English-language news corpora in academic research, NLP training, and media-trends analysis.

The catalogue covers **2.6 million-plus articles across 32 sections**, including World, UK, US, Australia, Politics, Business, Technology, Science, Environment, Sport, Culture, and Opinion, with full archive coverage from 1999 onward. This Actor makes the corpus searchable and exportable as CSV, Excel, JSON, or XML in under a minute. Filtering by section, tag, contributor, date, language, production office, and minimum star rating runs server-side.

| 🎯 Target Audience | 💡 Primary Use Cases |
|---|---|
| Media-monitoring teams, NLP researchers, journalism students, data scientists, content strategists, OSINT analysts, librarians | Brand-mention tracking, sentiment & topic models, journalism research, media-bias studies, archival queries, training corpora for LLMs |

---

### 📋 What the Guardian Article Search Scraper does

Six powerful filters in a single run:

- 🔍 **Free-text search.** Operators include AND, OR, NOT, and quoted phrases.
- 📂 **Section filter.** Pick one of 32 sections or search every section.
- 🏷️ **Tag filter.** Combine multiple Guardian tags (e.g. `environment/climate-change`, `football/premierleague`).
- 📅 **Date range.** Restrict by `fromDate` and `toDate`.
- 🌍 **Production office.** Filter by UK, US, Australia, or international edition.
- ⭐ **Minimum star rating.** Pull only 4-star and above film, TV, music, or restaurant reviews.

Each record includes the article ID, section, pillar, byline, contributors, full body text and HTML, image gallery URLs, word count, star rating (where applicable), and live-blog status.

> 💡 **Why it matters:** The Guardian is one of the most influential English-language newsrooms. Its archive is cited in NLP papers, media-bias studies, and journalism education. Building your own pipeline means parsing the article search response and reconstructing tag taxonomies. This Actor skips all of that.

---

### 🎬 Full Demo

_🚧 Coming soon: a 3-minute walkthrough showing climate-change coverage exported to a research notebook._

---

### ⚙️ Input

<table>
<thead>
<tr><th>Input</th><th>Type</th><th>Default</th><th>Behavior</th></tr>
</thead>
<tbody>
<tr><td><code>maxItems</code></td><td>integer</td><td><code>10</code></td><td>Articles to return. Free plan caps at 10, paid plan at 1,000,000.</td></tr>
<tr><td><code>query</code></td><td>string</td><td><code>"climate change"</code></td><td>Free-text search with AND, OR, NOT, and quoted phrases.</td></tr>
<tr><td><code>section</code></td><td>string</td><td><code>"all"</code></td><td>One of 32 sections or <code>all</code>.</td></tr>
<tr><td><code>tag</code></td><td>string</td><td><code>""</code></td><td>Comma-separated Guardian tags.</td></tr>
<tr><td><code>fromDate</code>, <code>toDate</code></td><td>string</td><td><code>""</code></td><td>YYYY-MM-DD bounds.</td></tr>
<tr><td><code>productionOffice</code></td><td>string</td><td><code>"any"</code></td><td>UK, US, Australia, or international edition.</td></tr>
<tr><td><code>orderBy</code></td><td>string</td><td><code>"newest"</code></td><td><code>newest</code>, <code>oldest</code>, or <code>relevance</code>.</td></tr>
<tr><td><code>lang</code></td><td>string</td><td><code>""</code></td><td>Language code (en, fr, es, de, ar, etc.).</td></tr>
<tr><td><code>starRating</code></td><td>integer</td><td>—</td><td>Minimum star rating (1-5) for reviews.</td></tr>
<tr><td><code>includeBodyText</code></td><td>boolean</td><td><code>true</code></td><td>Include the full article body text and HTML.</td></tr>
</tbody>
</table>

**Example: latest 100 climate-change articles in the Environment section.**

```json
{
    "maxItems": 100,
    "query": "climate change",
    "section": "environment",
    "orderBy": "newest"
}
````

**Example: 4-star and above film reviews from 2024.**

```json
{
    "maxItems": 50,
    "query": "",
    "section": "film",
    "starRating": 4,
    "fromDate": "2024-01-01",
    "toDate": "2024-12-31",
    "orderBy": "relevance"
}
```
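
The input objects above can also be assembled in code. The following is a minimal sketch, assuming only the field names and value ranges listed in the input table; the helper `build_input` and its parameter names are hypothetical and not part of the Actor.

```python
from datetime import date
from typing import Any, Dict, Iterable, Optional


def build_input(
    query: str = "",
    section: str = "all",
    tags: Iterable[str] = (),
    from_date: Optional[str] = None,
    to_date: Optional[str] = None,
    star_rating: Optional[int] = None,
    max_items: int = 10,
    order_by: str = "newest",
) -> Dict[str, Any]:
    """Assemble an input object using the field names from the input table."""
    if order_by not in {"newest", "oldest", "relevance"}:
        raise ValueError(f"unsupported orderBy: {order_by!r}")
    if star_rating is not None and not 1 <= star_rating <= 5:
        raise ValueError("starRating must be between 1 and 5")

    run_input: Dict[str, Any] = {
        "maxItems": max_items,
        "query": query,
        "section": section,
        "tag": ",".join(tags),  # the input table describes tags as comma-separated
        "orderBy": order_by,
    }
    # date.fromisoformat rejects malformed dates; isoformat() normalises to YYYY-MM-DD.
    if from_date:
        run_input["fromDate"] = date.fromisoformat(from_date).isoformat()
    if to_date:
        run_input["toDate"] = date.fromisoformat(to_date).isoformat()
    if from_date and to_date and run_input["fromDate"] > run_input["toDate"]:
        raise ValueError("fromDate must not be later than toDate")
    if star_rating is not None:
        run_input["starRating"] = star_rating
    return run_input
```

Validating locally catches a swapped date range or an out-of-range rating before a run is started.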

> ⚠️ **Good to Know:** Guardian tag IDs follow the pattern `section/topic` (e.g. `football/premierleague`, `profile/jonathan-freedland`). Reviews live in sections like `film`, `tv-and-radio`, `music`, `books`, and `food`. Star ratings only appear on review-type articles.

***

### 📊 Output

Each article record contains **30 fields**. Download the dataset as CSV, Excel, JSON, or XML.

#### 🧾 Schema

| Field | Type | Example |
|---|---|---|
| 🖼️ `imageUrl` | string \| null | `"https://i.guim.co.uk/img/.../1000.jpg"` |
| 🆔 `id` | string | `"environment/2026/may/14/climate-policy-..."` |
| 📌 `webTitle` | string | `"Climate policy review prompts ..."` |
| 📌 `headline` | string | `"Climate policy review prompts ..."` |
| 🔗 `webUrl` | string | `"https://www.theguardian.com/..."` |
| 📂 `type` | string | `"article"`, `"liveblog"`, `"video"` |
| 📂 `sectionId` | string | `"environment"` |
| 📂 `sectionName` | string | `"Environment"` |
| 📂 `pillarId` | string | `"pillar/news"` |
| 📂 `pillarName` | string | `"News"` |
| 📅 `webPublicationDate` | ISO 8601 | `"2026-05-14T18:00:00Z"` |
| 📅 `firstPublicationDate` | ISO 8601 | `"2026-05-14T17:30:00Z"` |
| 🕒 `lastModified` | ISO 8601 | `"2026-05-14T19:42:00Z"` |
| 🏢 `productionOffice` | string | `"UK"` |
| 🌍 `language` | string | `"en"` |
| 📰 `publication` | string | `"The Guardian"` |
| 👤 `byline` | string | `"Damian Carrington"` |
| 👤 `contributors` | array | `[{ "id": "...", "webTitle": "Damian Carrington" }]` |
| 🔢 `wordCount` | number | `812` |
| ⭐ `starRating` | number \| null | `4` |
| 📺 `liveBloggingNow` | boolean | `false` |
| 📝 `standfirst` | string | `"Government's first climate review ..."` |
| 📝 `trailText` | string | trail snippet |
| 📝 `bodyText` | string | full article text |
| 📝 `bodyHtml` | string | full article HTML |
| 🏷️ `keywords` | array | `["environment/climate-change", "world/world"]` |
| 📦 `series` | array | `[{ "id": "...", "webTitle": "Climate countdown" }]` |
| 🏷️ `tones` | array | `[{ "id": "tone/news", "webTitle": "News" }]` |
| 📦 `imageGallery` | array | image asset URLs and captions |
| 🕒 `snapshotTime` | ISO 8601 | `"2026-05-15T00:00:00.000Z"` |
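
Nested fields such as `contributors` and `keywords` often need flattening before analysis. The sketch below assumes records shaped like the schema above; `parse_timestamp` and `flatten_record` are hypothetical helper names, and the choice of output columns is illustrative only.

```python
from datetime import datetime
from typing import Any, Dict


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as "2026-05-14T18:00:00Z"."""
    # Older Python versions do not accept a trailing "Z", so map it to an offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def flatten_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce one article record to flat, spreadsheet-friendly columns."""
    return {
        "id": record["id"],
        "headline": record.get("headline") or record.get("webTitle"),
        "section": record.get("sectionId"),
        "published": parse_timestamp(record["webPublicationDate"]),
        "authors": "; ".join(c["webTitle"] for c in record.get("contributors", [])),
        "keywords": "; ".join(record.get("keywords", [])),
        "wordCount": record.get("wordCount"),
        "starRating": record.get("starRating"),  # None for non-review articles
    }
```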

#### 📦 Sample records

<details>
<summary><strong>📰 News article: Environment section</strong></summary>

```json
{
    "imageUrl": "https://i.guim.co.uk/img/media/abcd/0_0_4500_2700/1000.jpg",
    "id": "environment/2026/may/14/climate-policy-review-uk",
    "webTitle": "Climate policy review prompts call for stronger action on emissions",
    "headline": "Climate policy review prompts call for stronger action on emissions",
    "webUrl": "https://www.theguardian.com/environment/2026/may/14/climate-policy-review-uk",
    "type": "article",
    "sectionId": "environment",
    "sectionName": "Environment",
    "pillarId": "pillar/news",
    "pillarName": "News",
    "webPublicationDate": "2026-05-14T18:00:00Z",
    "firstPublicationDate": "2026-05-14T17:30:00Z",
    "lastModified": "2026-05-14T19:42:00Z",
    "productionOffice": "UK",
    "language": "en",
    "publication": "The Guardian",
    "byline": "Damian Carrington",
    "contributors": [{ "id": "profile/damian-carrington", "webTitle": "Damian Carrington" }],
    "wordCount": 812,
    "starRating": null,
    "liveBloggingNow": false,
    "standfirst": "Government's first climate review prompts call for stronger emissions action.",
    "trailText": "Government's first climate review prompts call for stronger emissions action.",
    "keywords": ["environment/climate-change", "politics/politics", "world/world"],
    "series": [{ "id": "environment/series/climate-countdown", "webTitle": "Climate countdown" }],
    "tones": [{ "id": "tone/news", "webTitle": "News" }],
    "imageGallery": [],
    "snapshotTime": "2026-05-15T00:00:00.000Z"
}
```

</details>

<details>
<summary><strong>⭐ Film review: 4-star</strong></summary>

```json
{
    "imageUrl": "https://i.guim.co.uk/img/media/efgh/0_0_4000_2400/1000.jpg",
    "id": "film/2024/sep/05/the-substance-review",
    "webTitle": "The Substance review – body-horror satire delivers a 4-star punch",
    "headline": "The Substance review – body-horror satire delivers a 4-star punch",
    "webUrl": "https://www.theguardian.com/film/2024/sep/05/the-substance-review",
    "type": "review",
    "sectionId": "film",
    "sectionName": "Film",
    "pillarId": "pillar/arts",
    "pillarName": "Arts",
    "webPublicationDate": "2024-09-05T13:30:00Z",
    "firstPublicationDate": "2024-09-05T13:15:00Z",
    "lastModified": "2024-09-05T15:00:00Z",
    "productionOffice": "UK",
    "language": "en",
    "publication": "The Guardian",
    "byline": "Peter Bradshaw",
    "contributors": [{ "id": "profile/peterbradshaw", "webTitle": "Peter Bradshaw" }],
    "wordCount": 612,
    "starRating": 4,
    "liveBloggingNow": false,
    "standfirst": "Coralie Fargeat's body-horror satire is a deliciously gory, knowing skewering of beauty culture.",
    "trailText": "Coralie Fargeat's body-horror satire delivers gore and meaning in equal measure.",
    "keywords": ["film/film", "film/horror"],
    "series": [],
    "tones": [{ "id": "tone/reviews", "webTitle": "Reviews" }],
    "imageGallery": [],
    "snapshotTime": "2026-05-15T00:00:00.000Z"
}
```

</details>

<details>
<summary><strong>📺 Live blog: politics</strong></summary>

```json
{
    "imageUrl": "https://i.guim.co.uk/img/media/ijkl/0_0_4500_2700/1000.jpg",
    "id": "politics/live/2026/may/15/uk-politics-live-budget",
    "webTitle": "UK politics live: budget statement coverage",
    "headline": "UK politics live: budget statement coverage",
    "webUrl": "https://www.theguardian.com/politics/live/2026/may/15/uk-politics-live-budget",
    "type": "liveblog",
    "sectionId": "politics",
    "sectionName": "Politics",
    "pillarId": "pillar/news",
    "pillarName": "News",
    "webPublicationDate": "2026-05-15T08:00:00Z",
    "firstPublicationDate": "2026-05-15T07:55:00Z",
    "lastModified": "2026-05-15T18:00:00Z",
    "productionOffice": "UK",
    "language": "en",
    "publication": "The Guardian",
    "byline": "Andrew Sparrow",
    "contributors": [{ "id": "profile/andrewsparrow", "webTitle": "Andrew Sparrow" }],
    "wordCount": 4280,
    "starRating": null,
    "liveBloggingNow": true,
    "standfirst": "Live coverage of the chancellor's spring budget statement.",
    "trailText": "Live coverage of the chancellor's spring budget statement.",
    "keywords": ["politics/politics", "uk-news/uk-news"],
    "series": [{ "id": "politics/series/uk-politics-live", "webTitle": "UK politics live" }],
    "tones": [{ "id": "tone/minutebyminute", "webTitle": "Minute by minute" }],
    "imageGallery": [],
    "snapshotTime": "2026-05-15T00:00:00.000Z"
}
```

</details>

***

### ✨ Why choose this Actor

| | Capability |
|---|---|
| 📰 | **2.6M+ articles.** Full Guardian archive from 1999 onward. |
| 🔍 | **Boolean search.** AND, OR, and NOT operators plus quoted phrases. |
| 📂 | **32 sections.** From World and Politics to Sport, Culture, and Opinion. |
| ⭐ | **Star-rating filter.** Pull only top-rated film, TV, music, restaurant, and book reviews. |
| 📝 | **Full body text.** Body text and HTML are included by default; toggle off for lighter records. |
| 🌍 | **Multi-edition.** UK, US, Australian, and international production offices. |
| 🚫 | **No sign-up.** Works against the public Guardian content source. |

> 📊 The Guardian archive is among the most-used English-language corpora in NLP research and a frequent reference in journalism studies.

***

### 📈 How it compares to alternatives

| Approach | Cost | Coverage | Refresh | Filters | Setup |
|---|---|---|---|---|---|
| **⭐ Guardian Article Search Scraper** *(this Actor)* | $5 free credit, then pay-per-use | **2.6M+ articles** | **Live per run** | section, tag, date, office, language, star rating | ⚡ 2 min |
| Manual Guardian site search | Free | Same archive | Live | UI-only filters | 🚫 Not bulk-friendly |
| News-aggregator APIs | $99+/month | Multi-source | Live | Many | ⏳ Integration |
| Build your own scraper | Free time | Variable | Manual | None | 🐢 Days |

Pick this Actor when you want filtered, bulk Guardian data without writing a scraper or paying for a multi-source aggregator.

***

### 🚀 How to use

1. 📝 **Sign up.** [Create a free account with $5 credit](https://console.apify.com/sign-up?fpr=vmoqkp) (takes 2 minutes).
2. 🌐 **Open the Actor.** Go to the Guardian Article Search Scraper page on the Apify Store.
3. 🎯 **Set input.** Enter a search query, optionally pick a section, tag, date range, and star rating.
4. 🚀 **Run it.** Click **Start** and let the Actor pull matching articles.
5. 📥 **Download.** Grab your results in the **Dataset** tab as CSV, Excel, JSON, or XML.

> ⏱️ Total time from signup to downloaded archive: **3-5 minutes.** No coding required.

***

### 💼 Business use cases

<table>
<tr>
<td width="50%" valign="top">

#### 📊 Media Monitoring & PR

- Track brand mentions across the Guardian archive
- Monitor competitor coverage in Business and Tech
- Build alerts for crisis-comms triggers
- Audit narrative shifts on a topic over time

</td>
<td width="50%" valign="top">

#### 🔬 NLP & Data Science Research

- Build training corpora for sentiment models
- Generate topic-modeling datasets across decades
- Replicate published media-bias studies
- Train text-summarization models with full body text

</td>
</tr>
<tr>
<td width="50%" valign="top">

#### 📰 Journalism & Editorial

- Source background research with Boolean operators
- Build series timelines from tag-filtered archives
- Compare US, UK, and AU edition coverage
- Pull star-rated reviews for "best of" round-ups

</td>
<td width="50%" valign="top">

#### 🎯 Content Strategy & SEO

- Map topic coverage to identify white-space opportunities
- Benchmark headlines and standfirst length
- Audit byline and contributor mix on a beat
- Build content-trend dashboards

</td>
</tr>
</table>

***

### 🔌 Automating Guardian Article Search Scraper

Control the scraper programmatically for scheduled runs and pipeline integrations:

- 🟢 **Node.js.** Install the `apify-client` NPM package.
- 🐍 **Python.** Use the `apify-client` PyPI package.
- 📚 See the [Apify documentation](https://docs.apify.com/api/v2) for full details.

The [Apify Schedules feature](https://docs.apify.com/platform/schedules) lets you trigger this Actor every hour for breaking-news monitoring or daily for editorial research.
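
For pipeline use, the run-and-fetch steps can be wrapped in one function. This is a minimal sketch that relies only on the `actor(...).call(run_input=...)` and `dataset(...).iterate_items()` calls of the Python `apify-client` package; the function name `run_search` is hypothetical, and the client is passed in so the function can be exercised with a stub.

```python
from typing import Any, Dict, List

ACTOR_ID = "parseforge/guardian-content-search-scraper"


def run_search(client: Any, run_input: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Run the Actor, wait for it to finish, and return all dataset items."""
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if not run:  # assumption: a falsy result means no run object was returned
        raise RuntimeError("Actor run did not return a run object")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Usage (requires `pip install apify-client` and a real token):
#   from apify_client import ApifyClient
#   articles = run_search(ApifyClient("<YOUR_API_TOKEN>"),
#                         {"query": "climate change", "maxItems": 10})
```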

***

### 🌟 Beyond business use cases

A high-quality news archive powers more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.

<table>
<tr>
<td width="50%">

#### 🎓 Research and academia

- Train NLP models on multi-decade English-language corpora
- Replicate media-bias and framing studies
- Teach computational journalism with real archives
- Power dissertation research on long-running topics

</td>
<td width="50%">

#### 🎨 Personal and creative

- Build a personal "year in news" newsletter
- Render data art from headlines over time
- Curate a hobbyist film-review database by star rating
- Power a fan-site with structured author archives

</td>
</tr>
<tr>
<td width="50%">

#### 🤝 Non-profit and civic

- Provide free archival research to community newsrooms
- Audit coverage of a public-interest topic
- Track climate-coverage volume for advocacy briefs
- Inform civic dashboards on policy debates

</td>
<td width="50%">

#### 🧪 Experimentation

- Train LLM fine-tuning sets on long-form journalism
- Validate sentiment classifiers against tone tags
- Prototype a topic-trend visualizer
- Test AI summarizers on real article text

</td>
</tr>
</table>

***

### 🤖 Ask an AI assistant about this scraper

Open a ready-to-send prompt about this ParseForge Actor in the AI of your choice:

- 💬 [**ChatGPT**](https://chat.openai.com/?q=How%20do%20I%20use%20the%20Guardian%20Article%20Search%20Scraper%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)
- 🧠 [**Claude**](https://claude.ai/new?q=How%20do%20I%20use%20the%20Guardian%20Article%20Search%20Scraper%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)
- 🔍 [**Perplexity**](https://perplexity.ai/search?q=How%20do%20I%20use%20the%20Guardian%20Article%20Search%20Scraper%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)
- 🅒 [**Copilot**](https://copilot.microsoft.com/?q=How%20do%20I%20use%20the%20Guardian%20Article%20Search%20Scraper%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)

***

### ❓ Frequently Asked Questions

#### 🧩 How does it work?

Enter a Boolean query, set optional filters (section, tag, date range, language, production office, star rating), and run. The Actor pulls matching articles from The Guardian and writes one clean record per article with full body text by default.

#### 📏 How accurate is the data?

Records mirror the official Guardian content source exactly. Headlines, bylines, and tags are pulled verbatim from each article. Body text is the full published text, with no paywall gating.

#### 🔁 How fresh is the archive?

Live. Each Actor run reflects the current state of The Guardian's content source, including just-published articles and live blogs.

#### 📅 How far back does coverage go?

The full archive is searchable from 1999 onward, with selective coverage for older material. Use <code>fromDate</code> and <code>toDate</code> to bound your query.

#### 🔍 What Boolean operators are supported?

AND, OR, NOT, and quoted phrases. For example: <code>"climate change" AND policy NOT denial</code>.
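
Query strings of this shape can be generated rather than typed by hand. The sketch below is an assumption-laden illustration: `build_query` is a hypothetical helper that covers only AND, NOT, and phrase quoting, and it does not attempt OR or operator grouping.

```python
from typing import Iterable


def _quote(term: str) -> str:
    """Wrap multi-word terms in double quotes so they match as a phrase."""
    term = term.strip()
    return f'"{term}"' if " " in term else term


def build_query(required: Iterable[str], excluded: Iterable[str] = ()) -> str:
    """Join required terms with AND and append each excluded term with NOT."""
    query = " AND ".join(_quote(t) for t in required)
    for term in excluded:
        query += f" NOT {_quote(term)}"
    return query.strip()
```

With `required=["climate change", "policy"]` and `excluded=["denial"]`, the helper reproduces the example query shown above.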

#### ⭐ How does the star-rating filter work?

Set <code>starRating</code> to 1-5 to filter reviews with at least that rating. Star ratings only appear on review-type articles in sections like Film, TV, Music, Books, and Food.
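
The same "at least" semantics can be mirrored client-side, for example when re-filtering a dataset that was already downloaded. This is a sketch under the assumption that non-review records carry `starRating: null`, as in the sample records; `at_least_stars` is a hypothetical helper name.

```python
from typing import Any, Dict, Iterable, List


def at_least_stars(records: Iterable[Dict[str, Any]], minimum: int) -> List[Dict[str, Any]]:
    """Keep only records whose starRating is present and >= minimum."""
    if not 1 <= minimum <= 5:
        raise ValueError("minimum must be between 1 and 5")
    return [
        r for r in records
        if r.get("starRating") is not None and r["starRating"] >= minimum
    ]
```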

#### ⏰ Can I schedule regular runs?

Yes. Use Apify Schedules to run the Actor hourly for breaking-news monitoring or daily for archival research.

#### ⚖️ Is this data legal to use?

The Guardian publishes its content under open content terms via its developer programme. Verify your downstream use case against The Guardian's content licensing terms.

#### 💼 Can I use this data commercially?

Some commercial uses require additional licensing from The Guardian. Always review their content licensing terms before deploying in a paid product.

#### 💳 Do I need a paid Apify plan to use this Actor?

No. The free Apify plan is enough for testing and small runs (10 articles per run). A paid plan lifts the limit and enables scheduling.

#### 🆘 What if I need help?

Our support team is here to help. Contact us through the Apify platform or use the Tally form linked below.

***

### 🔌 Integrate with any app

Guardian Article Search Scraper connects to any cloud service via [Apify integrations](https://apify.com/integrations):

- [**Make**](https://docs.apify.com/platform/integrations/make) - Automate multi-step workflows
- [**Zapier**](https://docs.apify.com/platform/integrations/zapier) - Connect with 5,000+ apps
- [**Slack**](https://docs.apify.com/platform/integrations/slack) - Push breaking-news alerts to channels
- [**Airbyte**](https://docs.apify.com/platform/integrations/airbyte) - Pipe article data into your warehouse
- [**GitHub**](https://docs.apify.com/platform/integrations/github) - Trigger runs from commits and releases
- [**Google Drive**](https://docs.apify.com/platform/integrations/drive) - Export datasets straight to Sheets

You can also use webhooks to trigger downstream actions when a run finishes. Push fresh climate articles into a research notebook, or alert a Slack channel when a brand mention surfaces.
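
A webhook receiver usually needs the dataset ID of the finished run before it can fetch results. The sketch below assumes Apify's default webhook payload, with an `eventType` string and the run object under `resource`; verify the shape against your own webhook configuration, since custom payload templates will differ. The function name is hypothetical.

```python
import json
from typing import Optional


def dataset_id_from_webhook(body: str) -> Optional[str]:
    """Return the run's default dataset ID for a succeeded run, else None."""
    payload = json.loads(body)
    # Assumed payload shape: {"eventType": ..., "resource": {<run object>}}.
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    return (payload.get("resource") or {}).get("defaultDatasetId")
```

The returned ID can then be passed to the dataset endpoint or a client library to download the new articles.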

***

### 🔗 Recommended Actors

- [**🚦 TfL London Live Status Scraper**](https://apify.com/parseforge/tfl-london-status-scraper) - Live London transport status and disruptions
- [**🌍 Carbon Intensity UK Scraper**](https://apify.com/parseforge/carbon-intensity-uk-scraper) - UK grid carbon intensity in gCO2/kWh
- [**🇬🇧 Hansard UK Debates Scraper**](https://apify.com/parseforge/hansard-uk-debates-scraper) - Search the UK Parliament debate record
- [**📰 BBC News Search Scraper**](https://apify.com/parseforge/bbc-news-search-scraper) - Search the BBC news archive
- [**📊 Federal Reserve H.15 Rates Scraper**](https://apify.com/parseforge/federalreserve-h15-rates-scraper) - U.S. Treasury yield-curve history

> 💡 **Pro Tip:** browse the complete [ParseForge collection](https://apify.com/parseforge) for more news and reference-data scrapers.

***

**🆘 Need Help?** [**Open our contact form**](https://tally.so/r/BzdKgA) to request a new scraper, propose a custom data project, or report an issue.

***

> **⚠️ Disclaimer:** this Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by Guardian News & Media or any of its affiliates. All trademarks mentioned are the property of their respective owners. Only publicly available content is collected.

# Actor input Schema

## `query` (type: `string`):

Free-text search across The Guardian's archive. Supports operators: AND, OR, NOT, quoted phrases. Leave empty to browse the latest articles.

## `maxItems` (type: `integer`):

Free users: limited to 10 items (preview). Paid users: optional, maximum 1,000,000.

## `sectionCaptionFilters` (type: `string`):

Refine the search by section, tag, contributor, date, or production office.

## `section` (type: `string`):

Pick one section. Choose 'all' to search every section.

## `tag` (type: `string`):

Filter by tag (e.g. 'environment/climate-change', 'football/premierleague', 'profile/jonathan-freedland'). Use comma to combine.

## `fromDate` (type: `string`):

Earliest publication date (YYYY-MM-DD). Articles older than this are excluded.

## `toDate` (type: `string`):

Latest publication date (YYYY-MM-DD). Articles newer than this are excluded.

## `productionOffice` (type: `string`):

Edition responsible for the article: UK, US, AU, or international.

## `orderBy` (type: `string`):

Sort articles by newest, oldest, or relevance to the search query.

## `lang` (type: `string`):

Limit to articles in a specific language code (en, fr, es, de, ar, etc.).

## `starRating` (type: `integer`):

Only include reviews with this minimum star rating (1-5). Best for film, TV, music, and restaurant reviews.

## `includeBodyText` (type: `boolean`):

Include the article's full body text and HTML in each record. Disable for lighter records.

## `includeBlocks` (type: `boolean`):

Include the structured 'blocks' object (liveblog updates, body blocks, embedded media) for each article. Useful for liveblog scraping.

## `includeReferences` (type: `boolean`):

Include the article's external reference IDs (ISBN, MusicBrainz, IMDb, etc.) when present.

## `includeRights` (type: `boolean`):

Include each article's syndication and reuse rights metadata.

## Actor input object example

```json
{
  "query": "",
  "maxItems": 10,
  "section": "all",
  "tag": "",
  "productionOffice": "any",
  "orderBy": "newest",
  "lang": "",
  "includeBodyText": true,
  "includeBlocks": false,
  "includeReferences": false,
  "includeRights": false
}
```

# Actor output Schema

## `overview` (type: `string`):

Overview of scraped data

## `fullData` (type: `string`):

Complete dataset

# API

You can run this Actor programmatically using our API. Below are code examples for JavaScript, Python, and the CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "maxItems": 10,
    "tag": "",
    "lang": ""
};

// Run the Actor and wait for it to finish
const run = await client.actor("parseforge/guardian-content-search-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "maxItems": 10,
    "tag": "",
    "lang": "",
}

# Run the Actor and wait for it to finish
run = client.actor("parseforge/guardian-content-search-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "maxItems": 10,
  "tag": "",
  "lang": ""
}' |
apify call parseforge/guardian-content-search-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=parseforge/guardian-content-search-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "The Guardian Article Search & Archive Scraper",
        "description": "Search The Guardian's full article archive (2.6M+ articles since 1999). Filter by query, section, tag, contributor, date, or production office. Returns headline, byline, body, tags, contributors, and publication metadata.",
        "version": "0.0",
        "x-build-id": "5Py51mci4jPtyBo98"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/parseforge~guardian-content-search-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-parseforge-guardian-content-search-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/parseforge~guardian-content-search-scraper/runs": {
            "post": {
                "operationId": "runs-sync-parseforge-guardian-content-search-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/parseforge~guardian-content-search-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-parseforge-guardian-content-search-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "query": {
                        "title": "Search Query",
                        "type": "string",
                        "description": "Free-text search across The Guardian's archive. Supports operators: AND, OR, NOT, quoted phrases. Leave empty to browse the latest articles.",
                        "default": ""
                    },
                    "maxItems": {
                        "title": "Max Items",
                        "minimum": 1,
                        "maximum": 1000000,
                        "type": "integer",
                        "description": "Free users: Limited to 10 items (preview). Paid users: Optional, max 1,000,000"
                    },
                    "sectionCaptionFilters": {
                        "title": "Filters",
                        "type": "string",
                        "description": "Refine the search by section, tag, contributor, date, or production office."
                    },
                    "section": {
                        "title": "Section Filter",
                        "enum": [
                            "all",
                            "world",
                            "uk-news",
                            "us-news",
                            "australia-news",
                            "politics",
                            "business",
                            "money",
                            "technology",
                            "science",
                            "environment",
                            "sport",
                            "football",
                            "culture",
                            "books",
                            "music",
                            "film",
                            "tv-and-radio",
                            "stage",
                            "fashion",
                            "lifeandstyle",
                            "food",
                            "travel",
                            "media",
                            "education",
                            "society",
                            "law",
                            "global-development",
                            "opinion",
                            "commentisfree",
                            "artanddesign",
                            "games"
                        ],
                        "type": "string",
                        "description": "Pick one section. Choose 'all' to search every section.",
                        "default": "all"
                    },
                    "tag": {
                        "title": "Tag Filter",
                        "type": "string",
                        "description": "Filter by tag (e.g. 'environment/climate-change', 'football/premierleague', 'profile/jonathan-freedland'). Separate multiple tags with commas.",
                        "default": ""
                    },
                    "fromDate": {
                        "title": "From Date",
                        "type": "string",
                        "description": "Earliest publication date (YYYY-MM-DD). Articles older than this are excluded."
                    },
                    "toDate": {
                        "title": "To Date",
                        "type": "string",
                        "description": "Latest publication date (YYYY-MM-DD). Articles newer than this are excluded."
                    },
                    "productionOffice": {
                        "title": "Production Office",
                        "enum": [
                            "any",
                            "uk",
                            "us",
                            "aus",
                            "int"
                        ],
                        "type": "string",
                        "description": "Edition responsible for the article: 'uk', 'us', 'aus' (Australia), or 'int' (international).",
                        "default": "any"
                    },
                    "orderBy": {
                        "title": "Sort Order",
                        "enum": [
                            "newest",
                            "oldest",
                            "relevance"
                        ],
                        "type": "string",
                        "description": "Sort articles by newest, oldest, or relevance to the search query.",
                        "default": "newest"
                    },
                    "lang": {
                        "title": "Language",
                        "type": "string",
                        "description": "Limit to articles in a specific language code (en, fr, es, de, ar, etc.).",
                        "default": ""
                    },
                    "starRating": {
                        "title": "Minimum Star Rating",
                        "minimum": 1,
                        "maximum": 5,
                        "type": "integer",
                        "description": "Only include reviews with this minimum star rating (1-5). Best for film, TV, music, restaurant reviews."
                    },
                    "includeBodyText": {
                        "title": "Include Full Body Text",
                        "type": "boolean",
                        "description": "Include the article's full body text and HTML in each record. Disable for lighter records.",
                        "default": true
                    },
                    "includeBlocks": {
                        "title": "Include Liveblog Blocks",
                        "type": "boolean",
                        "description": "Include the structured 'blocks' object (liveblog updates, body blocks, embedded media) for each article. Useful for liveblog scraping.",
                        "default": false
                    },
                    "includeReferences": {
                        "title": "Include References",
                        "type": "boolean",
                        "description": "Include the article's external reference IDs (ISBN, MusicBrainz, IMDb, etc.) when present.",
                        "default": false
                    },
                    "includeRights": {
                        "title": "Include Syndication Rights",
                        "type": "boolean",
                        "description": "Include each article's syndication and reuse rights metadata.",
                        "default": false
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
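
As a rough illustration of how the definition above maps onto an HTTP call, the sketch below builds (but does not send) a `POST` request for the `run-sync-get-dataset-items` path. It uses only the Python standard library. The helper names `build_input` and `build_request` are hypothetical and not part of any Apify client; the checks simply mirror a few of the `enum` and range constraints listed in `inputSchema`, and `MY_APIFY_TOKEN` is a placeholder.

```python
import json
import urllib.parse
import urllib.request

# Base URL and path copied from the "servers" and "paths" entries of the spec.
BASE_URL = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/parseforge~guardian-content-search-scraper/run-sync-get-dataset-items"

# Enum values copied from inputSchema.
ORDER_BY = {"newest", "oldest", "relevance"}
PRODUCTION_OFFICES = {"any", "uk", "us", "aus", "int"}


def build_input(query="", max_items=None, section="all", production_office="any",
                order_by="newest", from_date=None, to_date=None,
                include_body_text=True):
    """Hypothetical helper: assemble an input object and check a few constraints."""
    if order_by not in ORDER_BY:
        raise ValueError(f"orderBy must be one of {sorted(ORDER_BY)}")
    if production_office not in PRODUCTION_OFFICES:
        raise ValueError(f"productionOffice must be one of {sorted(PRODUCTION_OFFICES)}")
    if max_items is not None and not 1 <= max_items <= 1_000_000:
        raise ValueError("maxItems must be between 1 and 1,000,000")

    payload = {
        "query": query,
        "section": section,
        "productionOffice": production_office,
        "orderBy": order_by,
        "includeBodyText": include_body_text,
    }
    # Optional fields are only sent when the caller supplies them.
    if max_items is not None:
        payload["maxItems"] = max_items
    if from_date:
        payload["fromDate"] = from_date  # YYYY-MM-DD
    if to_date:
        payload["toDate"] = to_date  # YYYY-MM-DD
    return payload


def build_request(token, payload):
    """Hypothetical helper: build the POST request with the token as a query parameter."""
    url = f"{BASE_URL}{ACTOR_PATH}?{urllib.parse.urlencode({'token': token})}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request(
    "MY_APIFY_TOKEN",  # placeholder, replace with a real token
    build_input(query="climate AND policy", max_items=5, section="environment"),
)
print(request.get_method(), request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would start the run and block until it finishes; per the summary in the spec, the response carries the Actor's dataset items. Disabling `includeBodyText` is probably worthwhile for large result sets, since the schema describes it as producing lighter records.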

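The asynchronous `/runs` path returns an object shaped like `runsResponseSchema`. A minimal sketch of reading the fields most callers need is shown below. `RunInfo` and `parse_run_response` are hypothetical names, and the sample payload is a placeholder assembled from the `example` values in the schema, not real API output. The schema does not mark any property as required, so the sketch assumes `id` and `status` are always present and treats the rest as optional.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunInfo:
    """Hypothetical container for a few fields of runsResponseSchema."""
    run_id: str
    status: str
    dataset_id: Optional[str]
    usage_total_usd: Optional[float]


def parse_run_response(body: str) -> RunInfo:
    """Extract run metadata from the JSON body returned by the /runs path."""
    data = json.loads(body)["data"]
    return RunInfo(
        run_id=data["id"],          # assumed to be present
        status=data["status"],      # assumed to be present
        dataset_id=data.get("defaultDatasetId"),
        usage_total_usd=data.get("usageTotalUsd"),
    )


# Placeholder payload built from the schema's example values (not real output).
sample_body = json.dumps({
    "data": {
        "id": "RUN_ID_PLACEHOLDER",
        "status": "READY",
        "defaultDatasetId": "DATASET_ID_PLACEHOLDER",
        "usageTotalUsd": 0.00005,
    }
})

run = parse_run_response(sample_body)
print(run.status, run.dataset_id)
```

The `defaultDatasetId` value is what a caller would use afterwards to fetch the scraped articles once the run has finished.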