# BOE Scraper — Spanish Official Gazette (`studio-amba/boe-scraper`) Actor

Extract laws, royal decrees, appointments, public notices, and procurement announcements from Spain's Boletín Oficial del Estado (BOE). Filter by date, section, department, or keyword. Returns structured data with direct links to PDF, HTML, and XML documents. No cookies, no login.

- **URL**: https://apify.com/studio-amba/boe-scraper.md
- **Developed by:** [Studio Amba](https://apify.com/studio-amba) (community)
- **Categories:** E-commerce
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor is free to use, and you only pay for the Apify platform usage, which gets cheaper the higher subscription plan you have.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Spanish Official Gazette Scraper -- BOE Laws & Decrees

Extract laws, royal decrees, appointments, public procurement notices, and official announcements from Spain's Boletín Oficial del Estado (BOE). Filter by date range, section, department, or keyword. Returns structured data with direct links to PDF, HTML, and XML documents. No cookies, no login required.

### How to scrape BOE data

This Actor queries the official BOE Open Data API to deliver structured Spanish gazette data. The BOE is Spain's mandatory publication channel for all legislation, government appointments, public procurement, and official notices. Published daily (Monday through Saturday), the BOE is the definitive legal record of the Spanish state.

#### Search by keyword

Search in Spanish for best results. Examples: "vivienda" (housing), "empleo" (employment), "impuesto" (tax), "energia" (energy), "educacion" (education). The keyword filter matches against document titles and headings.
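The example keywords above are written without accents ("energia", "educacion"), while BOE titles may carry them. Whether the Actor folds accents before matching is not stated here, so the following is only a sketch of a post-processing filter you could apply to returned items yourself. The helper names `fold` and `matches_keyword` and the two sample items are hypothetical; only the `titulo` and `epigrafe` field names come from the output schema.

```python
import unicodedata


def fold(text: str) -> str:
    """Case-fold text and strip combining accents ('Energía' -> 'energia')."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def matches_keyword(item: dict, keyword: str) -> bool:
    """Return True if the keyword occurs in the item's title or heading."""
    # Missing or null fields are treated as empty strings.
    haystack = f"{item.get('titulo') or ''} {item.get('epigrafe') or ''}"
    return fold(keyword) in fold(haystack)


# Made-up sample items, for illustration only.
items = [
    {"titulo": "Orden sobre eficiencia energética y energía renovable", "epigrafe": "Energía"},
    {"titulo": "Orden sobre vivienda protegida", "epigrafe": "Vivienda"},
]

hits = [item for item in items if matches_keyword(item, "energia")]
print(len(hits))
```

Note that this folding also maps "ñ" to "n", which may or may not be what you want for Spanish text.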

#### Filter by section

The BOE is organized into sections. The `section` input takes the short code in the last column (for example `"5A"` for procurement); it also accepts `"all"` (the default) and `"T"`.

| Section | Content | `section` input value |
|---------|---------|-----------------------|
| I | General provisions -- laws, royal decrees, ministerial orders | `1` |
| II.A | Appointments, promotions, transfers of civil servants | `2A` |
| II.B | Public exams and competitive positions (oposiciones) | `2B` |
| III | Other provisions -- subsidies, grants, regional regulations | `3` |
| IV | Administration of Justice | `4` |
| V.A | Public procurement announcements | `5A` |
| V.B | Other official announcements | `5B` |
| V.C | Private announcements | `5C` |

#### Filter by department

Narrow results to a specific ministry or institution: "HACIENDA" (Finance), "DEFENSA" (Defense), "INTERIOR", "TRABAJO" (Labor), "EDUCACION" (Education). Partial match, case-insensitive.

#### Date range monitoring

Set a date range to monitor specific periods. The BOE publishes Monday through Saturday (no Sunday edition). Perfect for daily or weekly monitoring of new regulations.
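Because there is no Sunday edition, a scheduler that runs every day may want to skip Sundays or fall back to Saturday. The sketch below shows one way to compute the expected publication days in a range. The function names are hypothetical, and it only accounts for the Sunday rule stated above, not for any other day on which no edition might appear.

```python
from datetime import date, timedelta


def publication_days(start: date, end: date) -> list[date]:
    """Return every date from start to end (inclusive) except Sundays."""
    days = []
    current = start
    while current <= end:
        if current.weekday() != 6:  # Monday is 0, Sunday is 6
            days.append(current)
        current += timedelta(days=1)
    return days


def latest_publication_day(today: date) -> date:
    """Return today, or the preceding Saturday when today is a Sunday."""
    if today.weekday() == 6:
        return today - timedelta(days=1)
    return today


# Monday 18 May 2026 through Sunday 24 May 2026.
week = publication_days(date(2026, 5, 18), date(2026, 5, 24))
print([d.isoformat() for d in week])
print(latest_publication_day(date(2026, 5, 24)).isoformat())
```

The ISO strings produced by `isoformat()` match the `YYYY-MM-DD` format expected by `dateFrom` and `dateTo`.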

### What data does BOE Scraper extract?

| Field | Type | Description |
|-------|------|-------------|
| **identificador** | String | BOE document ID (e.g., BOE-A-2026-10986) |
| **titulo** | String | Full document title in Spanish |
| **seccion** | String | BOE section name |
| **departamento** | String | Issuing department or ministry |
| **epigrafe** | String | Subject heading (e.g., "Medidas urgentes") |
| **fechaPublicacion** | String | Publication date (YYYY-MM-DD) |
| **numeroDiario** | String | Daily issue number |
| **urlPdf** | String | Direct PDF download link |
| **urlHtml** | String | HTML version link |
| **urlXml** | String | XML version link |
| **paginaInicial** | String | Starting page in the printed edition |
| **paginaFinal** | String | Ending page in the printed edition |
| **control** | String | Internal reference number |
| **url** | String | Direct link to BOE document |
| **scrapedAt** | String | ISO timestamp of extraction |

### Example output

```json
{
    "identificador": "BOE-A-2026-10986",
    "titulo": "Resolucion de 20 de mayo de 2026, del Congreso de los Diputados, por la que se ordena la publicacion del Acuerdo de convalidacion del Real Decreto-ley 10/2026",
    "seccion": "I. Disposiciones generales",
    "departamento": "CORTES GENERALES",
    "epigrafe": "Medidas urgentes",
    "fechaPublicacion": "2026-05-22",
    "numeroDiario": "125",
    "urlPdf": "https://www.boe.es/boe/dias/2026/05/22/pdfs/BOE-A-2026-10986.pdf",
    "urlHtml": "https://www.boe.es/diario_boe/txt.php?id=BOE-A-2026-10986",
    "urlXml": "https://www.boe.es/diario_boe/xml.php?id=BOE-A-2026-10986",
    "paginaInicial": "69602",
    "paginaFinal": "69602",
    "control": "2026/8251",
    "url": "https://www.boe.es/diario_boe/txt.php?id=BOE-A-2026-10986",
    "scrapedAt": "2026-05-23T10:00:00.000Z"
}
````
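
The example record above stores the publication date and page numbers as strings. If you want typed values downstream, a small parsing layer can help. The following is a minimal sketch: the `BoeItem` class and `parse_item` function are hypothetical names, only a subset of fields is modelled, and it assumes `paginaInicial` and `paginaFinal` always hold numeric strings, which this README does not guarantee for every section.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class BoeItem:
    """Typed view over a subset of the fields in one dataset item."""

    identificador: str
    titulo: str
    fecha_publicacion: date
    pagina_inicial: int
    pagina_final: int
    url_pdf: str

    @property
    def page_count(self) -> int:
        """Number of printed pages the document spans."""
        return self.pagina_final - self.pagina_inicial + 1


def parse_item(raw: dict) -> BoeItem:
    """Convert a raw dataset item into a BoeItem, parsing dates and pages."""
    return BoeItem(
        identificador=raw["identificador"],
        titulo=raw["titulo"],
        fecha_publicacion=date.fromisoformat(raw["fechaPublicacion"]),
        pagina_inicial=int(raw["paginaInicial"]),
        pagina_final=int(raw["paginaFinal"]),
        url_pdf=raw["urlPdf"],
    )


# Values taken from the example output above (title shortened).
raw = {
    "identificador": "BOE-A-2026-10986",
    "titulo": "Resolucion de 20 de mayo de 2026, del Congreso de los Diputados",
    "fechaPublicacion": "2026-05-22",
    "paginaInicial": "69602",
    "paginaFinal": "69602",
    "urlPdf": "https://www.boe.es/boe/dias/2026/05/22/pdfs/BOE-A-2026-10986.pdf",
}

item = parse_item(raw)
print(item.fecha_publicacion.isoformat(), item.page_count)
```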

### Use cases

- **Legal compliance monitoring** -- Track new regulations affecting your industry by section and keyword.
- **Public procurement** -- Monitor section V.A for new government contract opportunities.
- **HR and recruitment** -- Follow section II.B for public exam announcements (oposiciones).
- **Academic research** -- Build a corpus of Spanish legislation over time.
- **Government affairs** -- Track ministerial appointments and institutional changes.

### Tips for best results

- **Daily monitoring**: Set `dateFrom` and `dateTo` to the same date for a single day's gazette.
- **Multi-day scan**: Use a date range to catch up on recent publications.
- **Procurement focus**: Set `section` to `"5A"` to only see public procurement announcements.
- **Scheduled runs**: Run daily to build a comprehensive database of Spanish legislation.
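
The tips above can be combined into a small helper that builds the input for a single-day run. This is a sketch, not part of the Actor: `build_input` is a hypothetical name, and the allowed section codes and the `maxResults` bounds are copied from the input schema in the OpenAPI specification further down.

```python
from datetime import date
from typing import Optional

# Section codes copied from the `section` enum in the Actor's input schema.
SECTIONS = {"all", "1", "2A", "2B", "3", "4", "5A", "5B", "5C", "T"}


def build_input(
    day: date,
    section: str = "all",
    keyword: Optional[str] = None,
    max_results: int = 100,
) -> dict:
    """Build an Actor input that covers exactly one day's gazette."""
    if section not in SECTIONS:
        raise ValueError(f"unknown section code: {section!r}")
    if not 1 <= max_results <= 50000:
        raise ValueError("max_results must be between 1 and 50000")
    run_input = {
        "section": section,
        "dateFrom": day.isoformat(),  # same day for both bounds
        "dateTo": day.isoformat(),
        "maxResults": max_results,
    }
    if keyword:
        run_input["searchQuery"] = keyword
    return run_input


# Procurement announcements for a single day.
procurement = build_input(date(2026, 5, 22), section="5A")
print(procurement)
```

Validating the section code locally catches a common slip, passing the Roman-numeral label such as `"V.A"` instead of the input code `"5A"`, before a run is started.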

### How much does it cost?

| Search size | Estimated time | Estimated cost |
|-------------|---------------|----------------|
| 1 day (20-80 items) | ~3 seconds | ~$0.002 |
| 1 week (~200 items) | ~10 seconds | ~$0.008 |
| 1 month (~1,000 items) | ~1 minute | ~$0.05 |

### Can I use it as an API?

```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("studio-amba/boe-scraper").call(run_input={
    "searchQuery": "vivienda",
    "section": "1",
    "dateFrom": "2026-05-01",
    "dateTo": "2026-05-22",
    "maxResults": 50,
})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{item['identificador']} | {item['titulo'][:80]}")
```

### Limitations

- The BOE API returns summaries by date. Keyword filtering is done client-side after fetching.
- No Sunday editions -- the BOE is not published on Sundays.
- Very large date ranges (several months) trigger many API calls; use `maxResults` to cap the output.
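
One way to work with long periods is to split them into shorter windows and start one run per window, so each run stays small and a failed window can be retried on its own. The sketch below only computes the windows; `split_range` is a hypothetical helper, and whether chunking changes the total cost is not something this README states.

```python
from datetime import date, timedelta


def split_range(start: date, end: date, chunk_days: int = 7) -> list[tuple[date, date]]:
    """Split [start, end] into consecutive inclusive windows of chunk_days."""
    if chunk_days < 1:
        raise ValueError("chunk_days must be at least 1")
    if start > end:
        raise ValueError("start must not be after end")
    chunks = []
    chunk_start = start
    while chunk_start <= end:
        # The last window is shortened so it never runs past `end`.
        chunk_end = min(chunk_start + timedelta(days=chunk_days - 1), end)
        chunks.append((chunk_start, chunk_end))
        chunk_start = chunk_end + timedelta(days=1)
    return chunks


chunks = split_range(date(2026, 5, 1), date(2026, 5, 22))
for chunk_start, chunk_end in chunks:
    # Each pair maps directly onto the Actor's date inputs.
    print({"dateFrom": chunk_start.isoformat(), "dateTo": chunk_end.isoformat()})
```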

### Related scrapers

- **[Subastas BOE Scraper](https://apify.com/studio-amba/subastas-boe-scraper)** -- Spanish government & judicial auctions
- **[BOAMP Scraper](https://apify.com/studio-amba/boamp-scraper)** -- French public procurement tenders

### Your feedback

Found a bug or want a feature? Open an issue on the [Issues tab](https://console.apify.com/actors/studio-amba~boe-scraper/issues).

# Actor input Schema

## `searchQuery` (type: `string`):

Filter results by keyword in title. Search in Spanish for best results. Example: 'vivienda', 'empleo', 'impuesto'.

## `section` (type: `string`):

Filter by BOE section. Allowed values: `all` (default), `1`, `2A`, `2B`, `3`, `4`, `5A`, `5B`, `5C`, `T`.

## `department` (type: `string`):

Filter by department name (partial match). Example: 'HACIENDA', 'DEFENSA', 'INTERIOR'.

## `dateFrom` (type: `string`):

Only items published on or after this date. Format: YYYY-MM-DD. Defaults to today.

## `dateTo` (type: `string`):

Only items published on or before this date. Format: YYYY-MM-DD. Defaults to today.

## `maxResults` (type: `integer`):

Maximum number of items to return. Integer from 1 to 50000; defaults to 100.

## `proxyConfiguration` (type: `object`):

Select proxies to use for the scraper.

## Actor input object example

```json
{
  "searchQuery": "vivienda",
  "section": "all",
  "dateFrom": "2026-05-22",
  "maxResults": 20,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ],
    "apifyProxyCountry": "ES"
  }
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "searchQuery": "vivienda",
    "dateFrom": "2026-05-22",
    "maxResults": 20,
    "proxyConfiguration": {
        "useApifyProxy": true,
        "apifyProxyGroups": [
            "RESIDENTIAL"
        ],
        "apifyProxyCountry": "ES"
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("studio-amba/boe-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "searchQuery": "vivienda",
    "dateFrom": "2026-05-22",
    "maxResults": 20,
    "proxyConfiguration": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"],
        "apifyProxyCountry": "ES",
    },
}

# Run the Actor and wait for it to finish
run = client.actor("studio-amba/boe-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "searchQuery": "vivienda",
  "dateFrom": "2026-05-22",
  "maxResults": 20,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ],
    "apifyProxyCountry": "ES"
  }
}' |
apify call studio-amba/boe-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=studio-amba/boe-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "BOE Scraper — Spanish Official Gazette",
        "description": "Extract laws, royal decrees, appointments, public notices, and procurement announcements from Spain's Boletin Oficial del Estado (BOE). Filter by date, section, department, or keyword. Returns structured data with direct links to PDF, HTML, and XML documents. No cookies, no login.",
        "version": "0.0",
        "x-build-id": "EIHMUGaUmVYcos0Q3"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/studio-amba~boe-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-studio-amba-boe-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/studio-amba~boe-scraper/runs": {
            "post": {
                "operationId": "runs-sync-studio-amba-boe-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/studio-amba~boe-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-studio-amba-boe-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "searchQuery": {
                        "title": "Keywords",
                        "type": "string",
                        "description": "Filter results by keyword in title. Search in Spanish for best results. Example: 'vivienda', 'empleo', 'impuesto'."
                    },
                    "section": {
                        "title": "BOE Section",
                        "enum": [
                            "all",
                            "1",
                            "2A",
                            "2B",
                            "3",
                            "4",
                            "5A",
                            "5B",
                            "5C",
                            "T"
                        ],
                        "type": "string",
                        "description": "Filter by BOE section.",
                        "default": "all"
                    },
                    "department": {
                        "title": "Department",
                        "type": "string",
                        "description": "Filter by department name (partial match). Example: 'HACIENDA', 'DEFENSA', 'INTERIOR'."
                    },
                    "dateFrom": {
                        "title": "Published After",
                        "type": "string",
                        "description": "Only items published on or after this date. Format: YYYY-MM-DD. Defaults to today."
                    },
                    "dateTo": {
                        "title": "Published Before",
                        "type": "string",
                        "description": "Only items published on or before this date. Format: YYYY-MM-DD. Defaults to today."
                    },
                    "maxResults": {
                        "title": "Max Results",
                        "minimum": 1,
                        "maximum": 50000,
                        "type": "integer",
                        "description": "Maximum number of items to return.",
                        "default": 100
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Select proxies to use for the scraper.",
                        "default": {
                            "useApifyProxy": true,
                            "apifyProxyGroups": [
                                "RESIDENTIAL"
                            ],
                            "apifyProxyCountry": "ES"
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
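
For projects that do not use a client library, the `run-sync-get-dataset-items` path in the specification above can be called directly over HTTP. The following is a minimal sketch using only the Python standard library. The `build_request` helper is a hypothetical name, the token is a placeholder, and the commented-out part assumes the response body is a JSON array of dataset items.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"  # from the `servers` entry in the spec
ACTOR_PATH = "studio-amba~boe-scraper"


def build_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build a POST request for the run-sync-get-dataset-items endpoint."""
    # The spec declares `token` as a required query parameter.
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request("<YOUR_API_TOKEN>", {"searchQuery": "vivienda", "maxResults": 20})
print(request.get_method(), request.full_url.split("?")[0])

# Uncomment to send the request; this starts a billable Actor run.
# with urllib.request.urlopen(request) as response:
#     items = json.loads(response.read())
#     print(len(items))
```

Building the request separately from sending it makes the URL and body easy to inspect or log before any platform usage is incurred.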
