# Wikiquote Scraper (`devilscrapes/wikiquote-quotes-scraper`) Actor

Extract quotes from any Wikiquote page — by person, work, or topic — via the Wikiquote MediaWiki API. Returns each quote with attribution, source work, year, and language, exported to JSON or CSV. Free, multilingual.

- **URL**: https://apify.com/devilscrapes/wikiquote-quotes-scraper.md
- **Developed by:** [DevilScrapes](https://apify.com/devilscrapes) (community)
- **Categories:** AI, News
- **Stats:** 2 total users, 1 monthly user, 100.0% of runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per event

This Actor is priced per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

<div align="center">
  <img src=".actor/icon.svg" width="160" alt="Devil Scrapes mark" />

## Wikiquote Scraper — Attributed Quotes to JSON

**💰 $1.00 / 1 000 results** &nbsp;·&nbsp; pay only for results &nbsp;·&nbsp; no credit card to try

_We do the dirty work so your dataset stays clean._ 😈

Extract properly-attributed quotes from any Wikiquote page — by author, work, or topic — and receive each quote with its source text, year, section, and language code. Multilingual, citation-real, and PPE-priced.

</div>

---

### 🎯 What this scrapes

Wikiquote is the world's largest community-edited quotes library — a sister project of Wikipedia with strict citation requirements. This Actor accepts a list of Wikiquote article titles (or full URLs) and writes one dataset row per quote, with full attribution metadata and — when the page supplies it — the source work and year.

Works across every Wikiquote language subdomain: pass `language: "de"` and get German-language quotes. Unsourced, disputed, and misattributed sections are labelled separately in the `section` field so you can filter on attribution quality rather than guessing.

### 🔥 Features

- 🛡️ **Browser fingerprint rotation** — `curl-cffi` replays real Chrome / Firefox / Safari TLS handshakes so the target sees a real browser, not a Python script.
- 🌐 **Residential proxy rotation** via Apify Proxy — fresh session and exit IP on every block signal.
- 🔁 **Retries with exponential backoff** on `408 / 429 / 5xx` — up to 5 attempts per page, `Retry-After` honoured.
- 🧱 **Rate-limit-aware pacing** — when the target pushes back, we slow down and wait rather than triggering a ban.
- 🧊 **Clean, typed dataset rows** — Pydantic-validated, ISO-8601 timestamps, stable IDs, export-ready as JSON / CSV / Excel from the Apify Console.
- 💰 **Pay-Per-Event pricing** — you pay only when a result lands in your dataset. No data, no charge.
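
The retry behaviour listed above can be illustrated with a small delay calculator. This is a minimal sketch, not the Actor's actual code: the names `should_retry` and `backoff_delay`, the 1-second base, and the 60-second cap are assumptions chosen for illustration. Only the retryable status codes, the five-attempt limit, and honouring `Retry-After` come from the feature list.

```python
from typing import Optional

MAX_ATTEMPTS = 5                 # from the feature list: up to 5 attempts per page
RETRYABLE_STATUSES = {408, 429}  # 5xx responses are checked separately below


def should_retry(status: int, attempt: int) -> bool:
    """Return True if a failed response deserves another attempt.

    `attempt` is 1-based and refers to the attempt that just failed.
    """
    retryable = status in RETRYABLE_STATUSES or 500 <= status <= 599
    return retryable and attempt < MAX_ATTEMPTS


def backoff_delay(attempt: int, retry_after: Optional[float] = None,
                  base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before the next attempt.

    The delay doubles after each failed attempt (base, 2*base, 4*base, ...)
    and is bounded by `cap`. A server-supplied Retry-After value (seconds)
    wins when it is longer than the computed delay.
    `base` and `cap` are hypothetical defaults, not values from the Actor.
    """
    delay = min(cap, base * (2 ** (attempt - 1)))
    if retry_after is not None:
        delay = max(delay, retry_after)
    return delay
```

Production retry loops usually add random jitter to the delay so that parallel workers do not retry in lockstep; it is left out here to keep the arithmetic easy to follow.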

### 💡 Use cases

- **Daily-quote service** — schedule a run for a curated list and push one quote per day to your app or newsletter.
- **Citation enrichment** — find a properly sourced quote when you have only the speaker's name.
- **Multilingual analysis** — pull quotes on the same topic across 5 or more language editions.
- **Movie / book reference assembly** — extract every quote from a film or novel's Wikiquote page for a study guide or quiz app.
- **Attribution-real RAG corpus** — small, clean, citation-grounded text for LLM retrieval demos where hallucinated attributions are unacceptable.
- **Education / language-learning apps** — real sourced quotes in the target language, with section labels for difficulty filtering.

### ⚙️ How to use it

1. Click **Try for free** at the top of the Store page.
2. Enter your list of Wikiquote article titles — use the exact name as shown on the Wikiquote page (e.g. `Albert Einstein`, `The Dark Knight`).
3. Set your `language` code if you need a non-English edition.
4. Click **Start**. Results stream into the run's dataset in real time.
5. Export from **Storage → Dataset** as JSON, CSV, or Excel — or pull via the Apify dataset API.

### 📥 Input

| Field | Type | Required | Default | Notes |
|---|---|:--:|---|---|
| `pages` | `array` | **yes** | `["Albert Einstein", "Oscar Wilde"]` | List of Wikiquote article titles or full page URLs. Use the exact title as shown on Wikiquote. |
| `language` | `string` | no | `"en"` | Wikiquote subdomain ISO code (`en`, `de`, `fr`, `es`, etc.). |
| `maxQuotesPerPage` | `integer` | no | `50` | Cap on quotes extracted per page. Some pages have hundreds; default keeps cost predictable. |
| `proxyConfiguration` | `object` | no | `{"useApifyProxy": false}` | Wikiquote serves programmatic clients. Proxy is optional but available if needed. |

#### Example input

```json
{
  "pages": [
    "Albert Einstein",
    "Marcus Aurelius"
  ],
  "language": "en",
  "maxQuotesPerPage": 50,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}
```
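
If you assemble the input in code, it can help to validate it before starting a run. The helper below is a hypothetical sketch (`build_input` is not part of any Apify library). The 1 to 500 range for `maxQuotesPerPage` is taken from the OpenAPI input schema in this document; the whitespace trimming is an assumption about what you would want client-side.

```python
from typing import Any, Dict, List


def build_input(pages: List[str], language: str = "en",
                max_quotes_per_page: int = 50) -> Dict[str, Any]:
    """Assemble an Actor input dict, rejecting obviously invalid values."""
    # Drop empty entries and surrounding whitespace (client-side convenience).
    cleaned = [p.strip() for p in pages if p and p.strip()]
    if not cleaned:
        raise ValueError("pages must contain at least one title or URL")
    # Bounds from the OpenAPI input schema: minimum 1, maximum 500.
    if not 1 <= max_quotes_per_page <= 500:
        raise ValueError("maxQuotesPerPage must be between 1 and 500")
    return {
        "pages": cleaned,
        "language": language,
        "maxQuotesPerPage": max_quotes_per_page,
        "proxyConfiguration": {"useApifyProxy": False},
    }
```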

### 📤 Output

Every row is one dataset item.

| Field | Type | Notes |
|---|---|---|
| `quote` | `string` | The quote text, plain string. |
| `attribution` | `string` | Who the quote is attributed to (the Wikiquote page title). |
| `source` | `string \| null` | Source work or context when Wikiquote supplies it. |
| `year` | `string \| null` | Year of the quote when detectable from the page. |
| `section` | `string \| null` | Section heading the quote appeared under (e.g. `"Sourced"`, `"Disputed"`, `"Misattributed"`). |
| `page_url` | `string` | Full URL of the source Wikiquote page. |
| `language` | `string` | Wikiquote language code used for this page. |
| `scraped_at` | `string` | ISO-8601 timestamp when this row was recorded. |

#### Example output

```json
{
  "quote": "Everything should be made as simple as possible, but not simpler.",
  "attribution": "Albert Einstein",
  "source": "Reader's Digest, October 1977",
  "year": "1933",
  "section": "Sourced",
  "page_url": "https://en.wikiquote.org/wiki/Albert_Einstein",
  "language": "en",
  "scraped_at": "2026-06-01T09:00:00Z"
}
```
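
For downstream code, the row shape above maps naturally onto a typed record. The following is a sketch under assumptions: `QuoteRow` and `parse_row` are illustrative names, and treating `source`, `year`, and `section` as optional follows the output table rather than any published schema.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class QuoteRow:
    quote: str
    attribution: str
    source: Optional[str]
    year: Optional[str]
    section: Optional[str]
    page_url: str
    language: str
    scraped_at: datetime


def parse_row(item: Dict[str, Any]) -> QuoteRow:
    """Convert one dataset item (a plain dict) into a QuoteRow."""
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so normalise it to an explicit UTC offset first.
    stamp = item["scraped_at"].replace("Z", "+00:00")
    return QuoteRow(
        quote=item["quote"],
        attribution=item["attribution"],
        source=item.get("source"),
        year=item.get("year"),
        section=item.get("section"),
        page_url=item["page_url"],
        language=item["language"],
        scraped_at=datetime.fromisoformat(stamp),
    )
```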

### 💰 Pricing

Pay-Per-Event — you pay only when these events fire:

| Event | USD | What it is |
|---|---:|---|
| `actor-start` | $0.005 | One-off warm-up charge per run |
| `result` | $0.001 | Per dataset item pushed |

Example: a single run that returns 1 000 results costs $1.005 ($0.005 start + 1 000 × $0.001), roughly **$1.00**. No subscription, no minimum, no card to start: Apify gives every new account $5 of free credit.
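
The two event prices turn into a simple formula. The sketch below assumes one `actor-start` event per run and one `result` event per dataset item, as the table describes; the function names are illustrative. `max_cost` gives an upper bound for a run, assuming no page yields more rows than `maxQuotesPerPage`.

```python
from decimal import Decimal

ACTOR_START_USD = Decimal("0.005")  # charged once per run
RESULT_USD = Decimal("0.001")       # charged per dataset item


def estimate_cost(results: int) -> Decimal:
    """Cost in USD of one run that pushes `results` dataset items."""
    if results < 0:
        raise ValueError("results must be non-negative")
    return ACTOR_START_USD + RESULT_USD * results


def max_cost(page_count: int, max_quotes_per_page: int = 50) -> Decimal:
    """Upper bound for one run: every page hits the per-page cap."""
    return estimate_cost(page_count * max_quotes_per_page)
```

`Decimal` is used instead of `float` so that sums of tenths of a cent stay exact.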

### 🚧 Limitations

- We parse the HTML rendered by Wikiquote's MediaWiki engine. Pages that use unusual templates may surface a quote without a source or year — the `section` field tells you whether it came from a verified, attributed, disputed, or misattributed section.
- Themed list pages (`List of quotes about X`) are supported, but the `attribution` field will be the page title rather than an individual speaker.
- `Category:` pages (index pages listing many articles) are not yet supported — pass individual article titles. Category enumeration is on the roadmap.
- Wikiquote markup varies across language editions; rare edge-case pages may parse with reduced fidelity. We surface the `section` label so you can filter on quality.

### ❓ FAQ

**Is this legal?**

Yes. Wikiquote content is published under the CC BY-SA licence. Attribute the source when reusing quotes in a commercial product.

**Is there a Wikiquote API I can use directly instead?**

Wikiquote exposes the standard MediaWiki API, but it returns raw WikiText — a brittle, per-page markup that varies wildly across language editions and page authors. Parsing it correctly requires handling dozens of template variants, nested sections, and unsourced markers. This Actor absorbs that complexity so you receive clean, structured JSON rows.
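
For comparison, this is roughly what a direct request for a page's wikitext looks like. The sketch only builds the URL; `action=parse` with `prop=wikitext` is a standard MediaWiki Action API call, but whether the Actor uses this exact request is not documented here, and `wikitext_api_url` is an illustrative name.

```python
from urllib.parse import urlencode


def wikitext_api_url(title: str, language: str = "en") -> str:
    """Build a MediaWiki Action API URL that returns a page's raw wikitext."""
    params = {
        "action": "parse",   # parse a page
        "page": title,       # article title as shown on Wikiquote
        "prop": "wikitext",  # return the raw wikitext only
        "format": "json",
    }
    return f"https://{language}.wikiquote.org/w/api.php?{urlencode(params)}"
```

The JSON response carries the page's markup as one string; splitting it into individual quotes with sources and sections is the part this Actor handles for you.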

**Can I use this as a free `famous quotes API`?**

Yes — for attribution-real, citation-grounded quotes it's the best free option. Wikiquote is the only community-edited source that requires citations; random "famous quotes" APIs typically contain hallucinated or misattributed text. Export your results as JSON, host them behind a Cloudflare Worker, and you have a `GET /random` endpoint backed by real sources.

**Why are some quotes missing a source?**

Wikiquote contributors don't always supply a citation. The `section` field tells you which attribution tier the quote is in — filter to `section: "Sourced"` for citation-confirmed quotes only.
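
A post-processing filter along these lines keeps only one attribution tier. It is a sketch: `only_section` is a hypothetical helper, and the case-insensitive exact match is an assumption, since the actual `section` values follow each page's headings.

```python
from typing import Any, Dict, Iterable, List


def only_section(rows: Iterable[Dict[str, Any]], wanted: str = "Sourced",
                 require_source: bool = False) -> List[Dict[str, Any]]:
    """Keep rows whose `section` matches `wanted` (case-insensitive).

    With require_source=True, rows with a null or empty `source` are dropped too.
    """
    target = wanted.casefold()
    kept = []
    for row in rows:
        # `section` may be null, so fall back to an empty string.
        if (row.get("section") or "").casefold() != target:
            continue
        if require_source and not row.get("source"):
            continue
        kept.append(row)
    return kept
```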

**What if a page is huge?**

Use `maxQuotesPerPage` to cap output. Some pages have hundreds of quotes; the default of 50 keeps cost predictable. Raise the cap (the input schema allows up to 500) if you need more of the page.

**Do you support multilingual pages?**

Yes. Set `language` to any ISO code with a Wikiquote subdomain — `"de"`, `"fr"`, `"es"`, `"pt"`, `"it"`, `"ru"`, and many more. Each language edition is a separate subdomain with its own article set.
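
The subdomain scheme can be shown with a small URL builder. This is a sketch of the usual MediaWiki convention (spaces become underscores, other characters are percent-encoded); `page_url` is an illustrative name and the Actor's own normalisation may differ.

```python
from urllib.parse import quote


def page_url(title: str, language: str = "en") -> str:
    """Build the article URL for a title on a given language edition."""
    slug = title.strip().replace(" ", "_")
    # quote() percent-encodes non-ASCII characters as UTF-8 bytes.
    return f"https://{language}.wikiquote.org/wiki/{quote(slug)}"
```

Note that the same person or work often has a different article title in each edition, so pass the title as it appears on the edition you are querying.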

**Do you support `Category:` pages?**

Not yet — pass individual article titles. Category enumeration is on the roadmap.

### 💬 Your feedback

Spotted a bug, hit a weird parse edge case, or need a new field? Open an issue on the Actor's **Issues** tab in the Apify Console — we ship fixes weekly and we read every report.

***

<div align="center">

Built by **[Devil Scrapes](https://apify.com/DevilScrapes)** 😈 — a small fleet of
opinionated public-data Actors. Honest pricing, real engineering, zero fine print.

</div>

# Actor input Schema

## `pages` (type: `array`):

List of Wikiquote article titles (e.g. <code>Albert Einstein</code>) or full URLs. Use the exact title as shown on Wikiquote.

## `language` (type: `string`):

Wikiquote subdomain ISO code (<code>en</code>, <code>de</code>, <code>fr</code>, etc.).

## `maxQuotesPerPage` (type: `integer`):

Cap on quotes extracted from a single Wikiquote page.

## `proxyConfiguration` (type: `object`):

Wikiquote is open to programmatic clients. Proxy optional.

## Actor input object example

```json
{
  "pages": [
    "Albert Einstein",
    "Oscar Wilde"
  ],
  "language": "en",
  "maxQuotesPerPage": 50,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}
```

# Actor output Schema

## `datasetItems` (type: `string`):

All dataset items as JSON.

## `datasetItemsCsv` (type: `string`):

Same data exported to CSV.
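
If you already hold the items in memory (for example after fetching them with the Python client), a local CSV can be produced with the standard library. This is a sketch, not the platform's exporter: the column order below is an assumption based on the output table, and `rows_to_csv` is an illustrative name.

```python
import csv
import io
from typing import Any, Dict, Iterable

FIELDS = ["quote", "attribution", "source", "year", "section",
          "page_url", "language", "scraped_at"]


def rows_to_csv(rows: Iterable[Dict[str, Any]]) -> str:
    """Serialise dataset items to CSV text with a header row."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops any keys not listed in FIELDS.
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        # Missing keys and None (JSON null) both become empty cells.
        writer.writerow({k: ("" if row.get(k) is None else row[k]) for k in FIELDS})
    return buffer.getvalue()
```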

## `datasetView` (type: `string`):

Open the run dataset in the Console.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "pages": [
        "Albert Einstein",
        "Oscar Wilde"
    ],
    "proxyConfiguration": {
        "useApifyProxy": false
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("devilscrapes/wikiquote-quotes-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "pages": [
        "Albert Einstein",
        "Oscar Wilde",
    ],
    "proxyConfiguration": { "useApifyProxy": False },
}

# Run the Actor and wait for it to finish
run = client.actor("devilscrapes/wikiquote-quotes-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
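
The example above relies on the defaults for `language` and `maxQuotesPerPage`. A variant that sets them explicitly could look like the sketch below. `fetch_quotes` is a hypothetical wrapper; it uses only the client calls already shown above and takes the client as a parameter so it can be exercised with a stub.

```python
from typing import Any, Dict, List

ACTOR_ID = "devilscrapes/wikiquote-quotes-scraper"


def fetch_quotes(client: Any, pages: List[str], language: str = "en",
                 max_quotes_per_page: int = 50) -> List[Dict[str, Any]]:
    """Run the Actor with explicit options and return its dataset items.

    `client` is expected to be an `apify_client.ApifyClient` instance.
    """
    run_input = {
        "pages": pages,
        "language": language,
        "maxQuotesPerPage": max_quotes_per_page,
        "proxyConfiguration": {"useApifyProxy": False},
    }
    # Start the run and block until it finishes.
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    # Defensive check in case no run object comes back.
    if run is None:
        raise RuntimeError("Actor run did not return run details")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Usage, assuming a valid token: `fetch_quotes(ApifyClient("<YOUR_API_TOKEN>"), ["Albert Einstein"], language="de", max_quotes_per_page=20)`.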

## CLI example

```bash
echo '{
  "pages": [
    "Albert Einstein",
    "Oscar Wilde"
  ],
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}' |
apify call devilscrapes/wikiquote-quotes-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=devilscrapes/wikiquote-quotes-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Wikiquote Scraper",
        "description": "Extract quotes from any Wikiquote page — by person, work, or topic — via the Wikiquote MediaWiki API. Returns each quote with attribution, source work, year, and language, exported to JSON or CSV. Free, multilingual.",
        "version": "0.4",
        "x-build-id": "9CJiYaA3Kp8Ar7s2A"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/devilscrapes~wikiquote-quotes-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-devilscrapes-wikiquote-quotes-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/devilscrapes~wikiquote-quotes-scraper/runs": {
            "post": {
                "operationId": "runs-sync-devilscrapes-wikiquote-quotes-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/devilscrapes~wikiquote-quotes-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-devilscrapes-wikiquote-quotes-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "pages"
                ],
                "properties": {
                    "pages": {
                        "title": "Wikiquote page titles or URLs",
                        "type": "array",
                        "description": "List of Wikiquote article titles (e.g. <code>Albert Einstein</code>) or full URLs. Use the exact title as shown on Wikiquote.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "language": {
                        "title": "Language code",
                        "type": "string",
                        "description": "Wikiquote subdomain ISO code (<code>en</code>, <code>de</code>, <code>fr</code>, etc.).",
                        "default": "en"
                    },
                    "maxQuotesPerPage": {
                        "title": "Max quotes per page",
                        "minimum": 1,
                        "maximum": 500,
                        "type": "integer",
                        "description": "Cap on quotes extracted from a single Wikiquote page.",
                        "default": 50
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Wikiquote is open to programmatic clients. Proxy optional.",
                        "default": {
                            "useApifyProxy": false
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
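
As an illustration of the specification above, the `run-sync-get-dataset-items` operation can be called without any client library. The sketch below only constructs the request with the standard library; the server URL, path, and `token` query parameter come from the specification, while `build_run_request` is an illustrative name.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/devilscrapes~wikiquote-quotes-scraper/run-sync-get-dataset-items"


def build_run_request(token: str, actor_input: dict) -> Request:
    """Build a POST request that runs the Actor and returns its dataset items."""
    # The specification passes the API token as a query parameter.
    url = f"{BASE_URL}{ACTOR_PATH}?{urlencode({'token': token})}"
    body = json.dumps(actor_input).encode("utf-8")
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/json"})
```

Passing the returned object to `urllib.request.urlopen` sends it. Because the call waits for the run to finish, a generous timeout is advisable for long page lists.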
