# Wikipedia Article Extractor (`glassventures/wikipedia-article-extractor`) Actor

Extract Wikipedia articles via MediaWiki API. Get full text, summaries, sections, categories, images, links. Multi-language. Perfect for AI/ML training data and RAG.

- **URL**: https://apify.com/glassventures/wikipedia-article-extractor.md
- **Developed by:** [Glass Ventures](https://apify.com/glassventures) (community)
- **Categories:** AI, Developer tools
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor itself is free to use; you only pay for Apify platform usage, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
```

```powershell
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Wikipedia Article Extractor

Extract structured data from any Wikipedia article using the official MediaWiki API. Get full article text, summaries, sections, categories, images, links, and metadata in any of 12+ languages.

### What does Wikipedia Article Extractor do?

Wikipedia Article Extractor lets you pull clean, structured data from Wikipedia at scale. Unlike web scrapers that parse HTML, this Actor uses the official MediaWiki API -- the same API that powers Wikipedia itself. This means you get reliable, well-formatted plain text without HTML artifacts, broken layouts, or anti-bot issues.

The Actor supports four flexible input methods: direct article URLs, article titles, search queries, and category names. You can mix and match these to build exactly the dataset you need. Need every article in the "Machine learning" category? Just enter the category name. Want specific articles in Japanese? Switch the language and enter titles.

This is the ideal tool for AI/ML engineers building training datasets, researchers collecting knowledge bases, and developers building RAG (Retrieval-Augmented Generation) pipelines. The output is clean plain text optimized for LLM consumption.

### Use Cases

- **AI/ML engineers** -- Build high-quality training datasets from Wikipedia's vast knowledge base. Clean plain text output is ready for tokenization.
- **RAG pipeline developers** -- Extract structured articles with sections for chunk-based retrieval in vector databases.
- **Researchers** -- Collect articles on specific topics or entire categories for academic analysis, NLP research, or corpus building.
- **Content creators** -- Research topics with comprehensive summaries, section breakdowns, and reference counts.
- **SEO professionals** -- Analyze Wikipedia content structure, internal linking patterns, and category relationships.
- **Fact-checkers** -- Quickly pull article text, reference counts, and last-modified dates for verification workflows.
- **Knowledge base builders** -- Create structured knowledge bases from Wikipedia categories with full metadata.

### Features

- **4 input methods**: URLs, article titles, search terms, category names -- or combine them all
- **Official MediaWiki API**: No scraping needed. Reliable, fast, and respects Wikipedia's infrastructure
- **12+ languages**: English, Spanish, French, German, Japanese, Portuguese, Italian, Russian, Chinese, Korean, Arabic, Hindi
- **AI-friendly output**: Clean plain text perfect for LLM training data, RAG pipelines, and NLP tasks
- **Rich metadata**: Word count, reference count, last modified date, page ID, categories
- **Structured sections**: Article broken down by heading with hierarchy levels
- **Batch processing**: Extract hundreds of articles in a single run
- **Category crawling**: Automatically fetch all articles from a Wikipedia category
- **No proxy required**: Wikipedia API is public and generous with rate limits
- **Exports to JSON, CSV, Excel, or connect via API**

### How much will it cost?

Wikipedia Article Extractor is **free to use** -- you only pay for Apify platform compute time, which is minimal since the Actor uses the lightweight MediaWiki API (no browser needed).

| Articles | Estimated Cost | Time |
|----------|---------------|------|
| 100      | ~$0.01        | ~1 min |
| 1,000    | ~$0.05        | ~5 min |
| 10,000   | ~$0.50        | ~30 min |

| Cost Component | Per 1,000 Articles |
|----------------|-------------------|
| Platform compute (256 MB) | ~$0.05 |
| Proxy (optional) | $0.00 |
| **Total** | **~$0.05** |

### How to use

1. Go to the Wikipedia Article Extractor page on Apify Store
2. Click "Start" or "Try for free"
3. Enter article URLs, titles, search terms, or category names
4. Select the Wikipedia language edition
5. Choose what data to include (full text, sections, categories, etc.)
6. Set the maximum number of articles
7. Click "Start" and wait for the results

#### Multi-language examples

Extract articles in different languages:

- **English**: Enter title "Artificial intelligence" with language "en"
- **Spanish**: Enter title "Inteligencia artificial" with language "es"
- **Japanese**: Enter title "人工知能" with language "ja"
- **German**: Enter title "Künstliche Intelligenz" with language "de"

Or use URLs directly -- the language is auto-detected:
- `https://fr.wikipedia.org/wiki/Intelligence_artificielle`
- `https://zh.wikipedia.org/wiki/人工智能`
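Auto-detection works because the language edition is encoded in the URL's subdomain. As an illustrative sketch (the helper name here is hypothetical, not part of the Actor), the detection logic amounts to:

```python
from urllib.parse import urlparse

def detect_wikipedia_language(url: str) -> str:
    """Return the language code from a Wikipedia article URL's subdomain."""
    host = urlparse(url).hostname or ""
    parts = host.split(".")
    # e.g. "fr.wikipedia.org" -> ["fr", "wikipedia", "org"]
    if len(parts) == 3 and parts[1] == "wikipedia":
        return parts[0]
    raise ValueError(f"Not a recognized Wikipedia URL: {url}")

print(detect_wikipedia_language("https://fr.wikipedia.org/wiki/Intelligence_artificielle"))  # fr
print(detect_wikipedia_language("https://zh.wikipedia.org/wiki/人工智能"))  # zh
```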

### Input parameters

| Parameter | Type | Description | Default |
|-----------|------|-------------|---------|
| startUrls | array | Direct Wikipedia article URLs | - |
| articleTitles | array | Article titles (e.g., "Albert Einstein") | - |
| searchTerms | array | Search queries to find articles | - |
| categories | array | Category names to extract all articles from | - |
| language | string | Wikipedia language edition (en, es, fr, de, ja, etc.) | en |
| includeFullText | boolean | Extract complete article text | true |
| includeSections | boolean | Extract sections with headings | true |
| includeCategories | boolean | Extract article categories | true |
| includeLinks | boolean | Extract internal Wikipedia links | false |
| includeImages | boolean | Extract image URLs | false |
| maxItems | number | Maximum articles to extract (0 = unlimited) | 100 |
| proxyConfig | object | Optional proxy settings | - |

### Output

The Actor produces a dataset with the following fields:

```json
{
    "url": "https://en.wikipedia.org/wiki/Web_scraping",
    "title": "Web scraping",
    "pageId": 2696619,
    "language": "en",
    "summary": "Web scraping, web harvesting, or web data extraction is data scraping used for extracting data from websites...",
    "fullText": "Web scraping, web harvesting, or web data extraction is data scraping used for extracting data from websites. Web scraping software may directly access the World Wide Web using the Hypertext Transfer Protocol or a web browser...",
    "sections": [
        {
            "heading": "Introduction",
            "text": "Web scraping, web harvesting, or web data extraction...",
            "level": 1
        },
        {
            "heading": "Techniques",
            "text": "Web scraping is the process of automatically mining data...",
            "level": 2
        }
    ],
    "categories": [
        "Web scraping",
        "Data mining",
        "Web technology"
    ],
    "links": ["Data scraping", "Website", "Hypertext Transfer Protocol"],
    "images": ["https://commons.wikimedia.org/wiki/Special:FilePath/Example.png"],
    "lastModified": "2024-12-01T15:30:00Z",
    "wordCount": 4523,
    "referencesCount": 87,
    "scrapedAt": "2025-01-15T10:30:00.000Z"
}
```

| Field | Type | Description |
|-------|------|-------------|
| url | string | Wikipedia article URL |
| title | string | Article title |
| pageId | integer | Wikipedia internal page ID |
| language | string | Language code (en, es, fr, etc.) |
| summary | string | Article introduction/summary in plain text |
| fullText | string | Complete article text in plain text |
| sections | array | Sections with heading, text, and level |
| categories | array | Article categories |
| links | array | Internal Wikipedia links |
| images | array | Image URLs from Wikimedia Commons |
| lastModified | string | Last edit timestamp |
| wordCount | integer | Total word count |
| referencesCount | integer | Number of citations/references |
| scrapedAt | string | ISO 8601 extraction timestamp |

### How it works -- MediaWiki API

This Actor uses the official [MediaWiki API](https://www.mediawiki.org/wiki/API:Main_page), which is the same API that powers Wikipedia's own interface, mobile apps, and third-party tools. Key endpoints used:

- **`action=query&prop=extracts`** -- Retrieves article text as clean plain text (no HTML)
- **`action=query&prop=categories|links|images`** -- Fetches article metadata
- **`action=parse&prop=sections|wikitext`** -- Parses article structure and raw wikitext
- **`action=query&list=search`** -- Searches for articles by keyword
- **`action=query&list=categorymembers`** -- Lists all articles in a category

The MediaWiki API is public, free, and does not require authentication. It has generous rate limits and is the most reliable way to access Wikipedia data.
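To see what one of these requests looks like, here is a minimal sketch that builds the `prop=extracts` query URL for a plain-text extract (the `explaintext` flag tells the TextExtracts extension to strip HTML). The helper name is illustrative; no network call is made:

```python
from urllib.parse import urlencode

def build_extract_url(title: str, language: str = "en") -> str:
    """Build a MediaWiki API URL that returns an article's plain-text extract."""
    params = {
        "action": "query",
        "prop": "extracts",
        "explaintext": 1,   # return plain text instead of HTML
        "titles": title,
        "format": "json",
    }
    return f"https://{language}.wikipedia.org/w/api.php?{urlencode(params)}"

print(build_extract_url("Web scraping"))
```

Fetching that URL with any HTTP client returns the article's extract as JSON.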

### Integrations

Connect Wikipedia Article Extractor with other tools:

- **Apify API** -- REST API for programmatic access
- **Webhooks** -- Get notified when a run finishes
- **Zapier / Make** -- Connect to 5,000+ apps
- **Google Sheets** -- Export directly to spreadsheets
- **Vector databases** -- Feed extracted text into Pinecone, Weaviate, Qdrant for RAG

#### API Example (Node.js)

```javascript
import { ApifyClient } from 'apify-client';
const client = new ApifyClient({ token: 'YOUR_TOKEN' });
const run = await client.actor('YOUR_USERNAME/wikipedia-article-extractor').call({
    articleTitles: ['Artificial intelligence', 'Machine learning', 'Deep learning'],
    language: 'en',
    includeFullText: true,
    maxItems: 100,
});
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(`Extracted ${items.length} articles`);
```

#### API Example (Python)

```python
from apify_client import ApifyClient
client = ApifyClient('YOUR_TOKEN')
run = client.actor('YOUR_USERNAME/wikipedia-article-extractor').call(run_input={
    'articleTitles': ['Artificial intelligence', 'Machine learning', 'Deep learning'],
    'language': 'en',
    'includeFullText': True,
    'maxItems': 100,
})
items = client.dataset(run['defaultDatasetId']).list_items().items
print(f'Extracted {len(items)} articles')
```

#### API Example (cURL)

```bash
curl "https://api.apify.com/v2/acts/YOUR_USERNAME~wikipedia-article-extractor/runs" \
  -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{
    "articleTitles": ["Artificial intelligence", "Machine learning"],
    "language": "en",
    "includeFullText": true,
    "maxItems": 100
  }'
```

### Tips and tricks

- Start with a small `maxItems` (5-10) to test your configuration before running large extractions
- Use **article titles** for best reliability -- they map directly to the API with no ambiguity
- **Category extraction** is powerful: a single category like "Machine learning" can yield hundreds of articles
- Combine input methods: search for a topic, then extract entire categories found in the results
- For **AI training data**, enable `includeFullText` and disable `includeLinks` and `includeImages` for clean text output
- For **RAG pipelines**, enable `includeSections` to get pre-chunked content with headings
- Wikipedia URLs auto-detect language, so you can mix English and French URLs in the same run
- **No proxy needed** for most use cases -- the MediaWiki API is public and generous with rate limits
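For the RAG tip above, the `sections` array maps naturally onto retrieval chunks. A minimal sketch (the function and chunk shape are illustrative, not part of the Actor) that turns one dataset item into heading-prefixed chunks:

```python
def sections_to_chunks(item: dict) -> list[dict]:
    """Turn one dataset item into retrieval chunks, one per section,
    prefixing each chunk's text with the article title and section heading."""
    chunks = []
    for section in item.get("sections", []):
        chunks.append({
            "id": f'{item["pageId"]}-{len(chunks)}',
            "text": f'{item["title"]} / {section["heading"]}\n\n{section["text"]}',
            "metadata": {"url": item["url"], "level": section["level"]},
        })
    return chunks

item = {
    "url": "https://en.wikipedia.org/wiki/Web_scraping",
    "title": "Web scraping",
    "pageId": 2696619,
    "sections": [
        {"heading": "Techniques", "text": "Web scraping is the process...", "level": 2},
    ],
}
print(sections_to_chunks(item)[0]["id"])  # 2696619-0
```

Each chunk can then be embedded and upserted into your vector database of choice.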

### FAQ

**Q: Does this actor require login credentials?**
A: No. The MediaWiki API is completely public and free to use. No authentication needed.

**Q: How fast is the extraction?**
A: Approximately 100-200 articles per minute, depending on article size and the data options selected. The Actor makes multiple API calls per article (text, metadata, sections).

**Q: Can I extract articles in any language?**
A: The UI offers 12 popular languages, but you can use any Wikipedia language by providing URLs directly (e.g., `https://sv.wikipedia.org/wiki/...` for Swedish).

**Q: What about rate limits?**
A: Wikipedia's API has generous rate limits. For very large extractions (10,000+ articles), the Actor automatically paces requests. You can optionally configure a proxy to distribute requests.

**Q: Can I extract talk pages or user pages?**
A: This Actor is optimized for article (main namespace) pages. Talk pages and other namespaces may work via direct URLs but are not officially supported.

**Q: Is the output suitable for LLM training?**
A: Yes. The plain text output is clean, well-structured, and free of HTML artifacts. It is ideal for tokenization and training.

### Is it legal to extract data from Wikipedia?

Wikipedia content is released under the [Creative Commons Attribution-ShareAlike License](https://en.wikipedia.org/wiki/Wikipedia:Text_of_the_Creative_Commons_Attribution-ShareAlike_4.0_International_License) and the [GNU Free Documentation License](https://en.wikipedia.org/wiki/Wikipedia:Text_of_the_GNU_Free_Documentation_License). This means you are free to share and adapt Wikipedia content, even for commercial purposes, as long as you provide attribution and share derivatives under the same license.

The MediaWiki API is the officially supported way to programmatically access Wikipedia data. Wikipedia actively encourages bulk data access through its API and database dumps. For more information, see [Apify's blog on web scraping legality](https://blog.apify.com/is-web-scraping-legal/).

### Limitations

- Article text is plain text only (no HTML formatting, tables, or mathematical formulas)
- Infobox data is not extracted as structured key-value pairs (raw wikitext can be complex)
- Maximum of ~500 category members per category in a single pagination cycle
- Very large articles (100,000+ words) may take longer to process
- Search results are limited to 50 per query (Wikipedia API limit)
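The ~500-member pagination limit comes from the MediaWiki `categorymembers` endpoint, which returns at most 500 results per request and a `cmcontinue` token for the next page. A sketch of the continuation loop, with the fetch function injected so it runs without network access (the fake two-page responses are illustrative):

```python
def iter_category_members(fetch, category: str):
    """Yield all pages in a category, following MediaWiki 'cmcontinue'
    continuation tokens across paginated responses.

    `fetch` is any callable taking a params dict and returning the parsed
    JSON response (injected here so the sketch runs without network access)."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": f"Category:{category}",
        "cmlimit": 500,          # API maximum per request
        "format": "json",
    }
    while True:
        data = fetch(params)
        yield from data["query"]["categorymembers"]
        cont = data.get("continue")
        if not cont:
            break
        params.update(cont)      # carries 'cmcontinue' into the next request

# Fake two-page response to demonstrate the continuation loop:
pages = [
    {"query": {"categorymembers": [{"title": "A"}, {"title": "B"}]},
     "continue": {"cmcontinue": "page|x", "continue": "-||"}},
    {"query": {"categorymembers": [{"title": "C"}]}},
]
responses = iter(pages)
titles = [m["title"] for m in iter_category_members(lambda p: next(responses), "Machine learning")]
print(titles)  # ['A', 'B', 'C']
```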

### Changelog

- **v0.1** (2026-04-23) -- Initial release with URL, title, search, and category input methods. Multi-language support. Full text, sections, categories, links, images, and metadata extraction.

# Actor input Schema

## `startUrls` (type: `array`):

Direct URLs to Wikipedia articles. Supports any language Wikipedia.

## `articleTitles` (type: `array`):

Wikipedia article titles to extract (e.g., "Albert Einstein", "Machine learning"). Uses the selected language.

## `searchTerms` (type: `array`):

Search Wikipedia for articles matching these terms. Returns the top results for each query.

## `categories` (type: `array`):

Extract all articles from Wikipedia categories (e.g., "Machine learning", "Natural language processing"). Don't include the "Category:" prefix.

## `language` (type: `string`):

Wikipedia language edition to use.

## `includeFullText` (type: `boolean`):

Extract the complete article text in plain text format. Ideal for AI/ML training data and RAG pipelines.

## `includeSections` (type: `boolean`):

Extract article sections with headings and text. Useful for structured content analysis.

## `includeCategories` (type: `boolean`):

Extract the list of categories the article belongs to.

## `includeLinks` (type: `boolean`):

Extract internal Wikipedia links from the article. Can produce large output.

## `includeImages` (type: `boolean`):

Extract image URLs from the article via Wikimedia Commons.

## `maxItems` (type: `integer`):

Maximum number of articles to extract. Use 0 for unlimited.

## `proxyConfig` (type: `object`):

Proxy settings. Wikipedia API is public so proxies are optional, but can help avoid rate limits at scale.

## Actor input object example

```json
{
  "startUrls": [
    {
      "url": "https://en.wikipedia.org/wiki/Web_scraping"
    },
    {
      "url": "https://en.wikipedia.org/wiki/Artificial_intelligence"
    }
  ],
  "articleTitles": [
    "Web scraping",
    "Artificial intelligence"
  ],
  "language": "en",
  "includeFullText": true,
  "includeSections": true,
  "includeCategories": true,
  "includeLinks": false,
  "includeImages": false,
  "maxItems": 100
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        {
            "url": "https://en.wikipedia.org/wiki/Web_scraping"
        },
        {
            "url": "https://en.wikipedia.org/wiki/Artificial_intelligence"
        }
    ],
    "articleTitles": [
        "Web scraping",
        "Artificial intelligence"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("glassventures/wikipedia-article-extractor").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": [
        { "url": "https://en.wikipedia.org/wiki/Web_scraping" },
        { "url": "https://en.wikipedia.org/wiki/Artificial_intelligence" },
    ],
    "articleTitles": [
        "Web scraping",
        "Artificial intelligence",
    ],
}

# Run the Actor and wait for it to finish
run = client.actor("glassventures/wikipedia-article-extractor").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    {
      "url": "https://en.wikipedia.org/wiki/Web_scraping"
    },
    {
      "url": "https://en.wikipedia.org/wiki/Artificial_intelligence"
    }
  ],
  "articleTitles": [
    "Web scraping",
    "Artificial intelligence"
  ]
}' |
apify call glassventures/wikipedia-article-extractor --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=glassventures/wikipedia-article-extractor",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Wikipedia Article Extractor",
        "description": "Extract Wikipedia articles via MediaWiki API. Get full text, summaries, sections, categories, images, links. Multi-language. Perfect for AI/ML training data and RAG.",
        "version": "0.1",
        "x-build-id": "YtXv5EeUHuouPEB8o"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/glassventures~wikipedia-article-extractor/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-glassventures-wikipedia-article-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/glassventures~wikipedia-article-extractor/runs": {
            "post": {
                "operationId": "runs-sync-glassventures-wikipedia-article-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/glassventures~wikipedia-article-extractor/run-sync": {
            "post": {
                "operationId": "run-sync-glassventures-wikipedia-article-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "startUrls": {
                        "title": "Wikipedia URLs",
                        "type": "array",
                        "description": "Direct URLs to Wikipedia articles. Supports any language Wikipedia.",
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "articleTitles": {
                        "title": "Article Titles",
                        "type": "array",
                        "description": "Wikipedia article titles to extract (e.g., \"Albert Einstein\", \"Machine learning\"). Uses the selected language.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "searchTerms": {
                        "title": "Search Terms",
                        "type": "array",
                        "description": "Search Wikipedia for articles matching these terms. Returns the top results for each query.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "categories": {
                        "title": "Category Names",
                        "type": "array",
                        "description": "Extract all articles from Wikipedia categories (e.g., \"Machine learning\", \"Natural language processing\"). Don't include the \"Category:\" prefix.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "language": {
                        "title": "Language",
                        "enum": [
                            "en",
                            "es",
                            "fr",
                            "de",
                            "ja",
                            "pt",
                            "it",
                            "ru",
                            "zh",
                            "ko",
                            "ar",
                            "hi"
                        ],
                        "type": "string",
                        "description": "Wikipedia language edition to use.",
                        "default": "en"
                    },
                    "includeFullText": {
                        "title": "Include Full Text",
                        "type": "boolean",
                        "description": "Extract the complete article text in plain text format. Ideal for AI/ML training data and RAG pipelines.",
                        "default": true
                    },
                    "includeSections": {
                        "title": "Include Sections",
                        "type": "boolean",
                        "description": "Extract article sections with headings and text. Useful for structured content analysis.",
                        "default": true
                    },
                    "includeCategories": {
                        "title": "Include Categories",
                        "type": "boolean",
                        "description": "Extract the list of categories the article belongs to.",
                        "default": true
                    },
                    "includeLinks": {
                        "title": "Include Links",
                        "type": "boolean",
                        "description": "Extract internal Wikipedia links from the article. Can produce large output.",
                        "default": false
                    },
                    "includeImages": {
                        "title": "Include Images",
                        "type": "boolean",
                        "description": "Extract image URLs from the article via Wikimedia Commons.",
                        "default": false
                    },
                    "maxItems": {
                        "title": "Max Articles",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Maximum number of articles to extract. Use 0 for unlimited.",
                        "default": 100
                    },
                    "proxyConfig": {
                        "title": "Proxy Configuration",
                        "type": "object",
                        "description": "Proxy settings. Wikipedia API is public so proxies are optional, but can help avoid rate limits at scale."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
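
For reference, here is a minimal sketch (Python, standard library only) of how a client might read the fields described by `runsResponseSchema` above. The payload, the `summarize` helper, and the `TERMINAL` status set are illustrative assumptions for this example, not real API output or an official client API.

```python
import json

# Illustrative run-response payload shaped like runsResponseSchema above;
# the values are made up for this example, not real API output.
sample = json.loads("""
{
  "data": {
    "id": "run123",
    "status": "SUCCEEDED",
    "startedAt": "2025-01-08T00:00:00.000Z",
    "finishedAt": "2025-01-08T00:01:30.000Z",
    "defaultDatasetId": "ds123",
    "usageTotalUsd": 0.00005
  }
}
""")

run = sample["data"]

# Statuses assumed terminal for this sketch; check the Apify run
# lifecycle documentation for the authoritative list.
TERMINAL = {"SUCCEEDED", "FAILED", "ABORTED", "TIMED-OUT"}

def summarize(run: dict) -> str:
    """Return a one-line summary of a run-response `data` object."""
    state = "finished" if run.get("status") in TERMINAL else "in progress"
    return f"Run {run['id']} is {state}; usage ${run.get('usageTotalUsd', 0):.5f}"

print(summarize(run))
```

In a real integration you would obtain this payload from the run endpoint (for example via `apify-client`) rather than a hard-coded string, then branch on `status` and read `defaultDatasetId` to fetch the extracted articles.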
