# Stack Overflow Scraper (`leftwinglautus/stack-overflow-scraper`) Actor

Search and scrape Stack Overflow questions via the Stack Exchange API with filters for tags, sorting, and accepted answers.

- **URL**: https://apify.com/leftwinglautus/stack-overflow-scraper.md
- **Developed by:** [Moeeze Hassan](https://apify.com/leftwinglautus) (community)
- **Categories:** Developer tools, AI
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor is free to use, and you only pay for the Apify platform usage, which gets cheaper the higher subscription plan you have.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are a software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use [Apify CLI](https://docs.apify.com/cli/docs.md):

````bash
# MacOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```bash

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Stack Overflow Scraper

Search and scrape [Stack Overflow](https://stackoverflow.com/) questions via the Stack Exchange API. The actor supports tag filtering, sorting by relevance/votes/creation date, and an accepted-answer filter, returning structured question records with body excerpts.

### Features

- Full-text search across question titles and bodies
- Comma-separated tag filtering (e.g. `python,asyncio`)
- Sort by relevance, votes, or creation date
- Optional filter to return only questions with an accepted answer
- HTML-stripped body excerpts for clean output

### Input Fields

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `query` | string | ✅\* | `""` | Search query to match in question titles and bodies. |
| `tags` | string | ❌ | `""` | Comma-separated tags to filter by (e.g. `python,asyncio`). |
| `maxResults` | integer | ❌ | `30` | Maximum number of questions to return (1–100). |
| `sort` | string (enum) | ❌ | `relevance` | Sort questions by `relevance`, `votes`, or `creation`. |
| `order` | string (enum) | ❌ | `desc` | Sort order: `desc` or `asc`. |
| `acceptedOnly` | boolean | ❌ | `false` | If `true`, only return questions that have an accepted answer. |

> \* At least a `query` or `tags` must be provided.

### Output Fields

Each pushed dataset item is a question record with the following fields:

| Field | Type | Description |
| --- | --- | --- |
| `id` | integer | Stack Overflow question ID. |
| `title` | string | Question title. |
| `body` | string | Plain-text excerpt of the question body (max ~500 chars). |
| `tags` | array&lt;string&gt; | Tags applied to the question. |
| `score` | integer | Question score (upvotes − downvotes). |
| `answerCount` | integer | Number of answers. |
| `viewCount` | integer | Number of views. |
| `isAnswered` | boolean | Whether the question has at least one answer. |
| `link` | string | Direct URL to the question. |
| `author` | string | Display name of the question author. |
| `creationDate` | string | Creation date (ISO-8601). |

### Usage Example

#### Search for Python asyncio questions with accepted answers

```json
{
    "query": "asyncio",
    "tags": "python",
    "maxResults": 30,
    "sort": "votes",
    "order": "desc",
    "acceptedOnly": true
}
````

#### Find the most recent questions tagged `react-hooks`

```json
{
    "tags": "react-hooks",
    "maxResults": 20,
    "sort": "creation",
    "order": "desc"
}
```

#### Run via the Apify CLI

```bash
apify run stack-overflow-scraper --input '{"query":"asyncio","tags":"python","sort":"votes","acceptedOnly":true}'
```

### Data Source

Questions are fetched from the Stack Exchange API at `https://api.stackexchange.com/2.3/search` using the `withbody` filter for full body content.

# Actor input Schema

## `query` (type: `string`):

Search query to match in question titles and bodies.

## `tags` (type: `string`):

Comma-separated tags to filter by (e.g. python,asyncio).

## `maxResults` (type: `integer`):

Maximum number of questions to return.

## `sort` (type: `string`):

Sort questions by relevance, votes, or creation date.

## `order` (type: `string`):

Sort order: desc or asc.

## `acceptedOnly` (type: `boolean`):

If true, only return questions that have an accepted answer.

## Actor input object example

```json
{
  "query": "",
  "tags": "",
  "maxResults": 30,
  "sort": "relevance",
  "order": "desc",
  "acceptedOnly": false
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {};

// Run the Actor and wait for it to finish
const run = await client.actor("leftwinglautus/stack-overflow-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {}

# Run the Actor and wait for it to finish
run = client.actor("leftwinglautus/stack-overflow-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{}' |
apify call leftwinglautus/stack-overflow-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=leftwinglautus/stack-overflow-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Stack Overflow Scraper",
        "description": "Search and scrape Stack Overflow questions via the Stack Exchange API with filters for tags, sorting, and accepted answers.",
        "version": "0.1",
        "x-build-id": "8wzRuF0qrHakUBp6j"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/leftwinglautus~stack-overflow-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-leftwinglautus-stack-overflow-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/leftwinglautus~stack-overflow-scraper/runs": {
            "post": {
                "operationId": "runs-sync-leftwinglautus-stack-overflow-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/leftwinglautus~stack-overflow-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-leftwinglautus-stack-overflow-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "query"
                ],
                "properties": {
                    "query": {
                        "title": "Query",
                        "type": "string",
                        "description": "Search query to match in question titles and bodies.",
                        "default": ""
                    },
                    "tags": {
                        "title": "Tags",
                        "type": "string",
                        "description": "Comma-separated tags to filter by (e.g. python,asyncio).",
                        "default": ""
                    },
                    "maxResults": {
                        "title": "Max Results",
                        "minimum": 1,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Maximum number of questions to return.",
                        "default": 30
                    },
                    "sort": {
                        "title": "Sort",
                        "enum": [
                            "relevance",
                            "votes",
                            "creation"
                        ],
                        "type": "string",
                        "description": "Sort questions by relevance, votes, or creation date.",
                        "default": "relevance"
                    },
                    "order": {
                        "title": "Order",
                        "enum": [
                            "desc",
                            "asc"
                        ],
                        "type": "string",
                        "description": "Sort order: desc or asc.",
                        "default": "desc"
                    },
                    "acceptedOnly": {
                        "title": "Accepted Only",
                        "type": "boolean",
                        "description": "If true, only return questions that have an accepted answer.",
                        "default": false
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
