Stack Exchange Scraper — Questions, Answers & Search API

Pricing

from $1.30 / 1,000 extracted questions

Scrape Stack Overflow & the Stack Exchange network into clean structured data — questions, answers, scores, views, tags, authors. Search by keyword, tag, or paste a URL; pull full Q&A threads by id. JSON/CSV/Excel. No login or API key needed.

Developer

SIÁN OÜ

Maintained by Community

Stack Exchange Scraper — Questions, Answers & Search Data 🚀

🎉 Turn Stack Overflow & the entire Stack Exchange network into a clean, structured Q&A dataset — in seconds

Built for developers, data teams, and researchers who need questions, answers, scores, tags, and authors at scale


📋 Overview

Need Stack Overflow data without writing a single line of API code? This actor pulls public questions and answers from Stack Overflow and every Stack Exchange site (Super User, Server Fault, Ask Ubuntu, Math Overflow, Unix, DBA, Security, and 170+ more) into a tidy dataset you can export to JSON, CSV, or Excel.

Why choose this actor:

  • Whole network coverage: one actor, any Stack Exchange site — just set the slug
  • Fast & paginated: pull hundreds of questions per run with rich fields, no rate-limit headaches
  • 🎯 Full Q&A threads: fetch a question by id and get its complete answer list — scores, accepted flag, authors
  • 💰 Pay-per-result: only pay for records you keep — transparent, best-in-class pricing
  • 💎 30+ structured fields: titles, bodies, scores, views, tags, author reputation, timestamps, license
  • No login, no API key: paste a tag, a keyword, or a URL and go

✨ Features

  • 🔎 Keyword Search: search any site by phrase and collect matching questions
  • 🏷️ Tag & Listing Mode: pull a tag's questions sorted by votes, activity, hot, week, or month
  • 🔗 Paste-a-URL: drop a listing, tag, or search URL — the site and filters are read automatically
  • 📄 Full Detail Mode: fetch specific questions by id (or URL) with their complete answer threads
  • 💬 Answer Extraction: scores, accepted-answer flag, bodies, and authors for every answer
  • Author Insights: reputation, accept rate, profile link, and avatar for each poster
  • 📊 Field Coverage Score: every record carries a 0–1 completeness signal
  • 📦 Clean Exports: JSON, CSV, and Excel straight from the Apify dataset

🎬 Quick Start

Choose a mode, give it a site (or a URL), and run. Results stream into the dataset as they're scraped. Export when done.

curl -X POST 'https://api.apify.com/v2/acts/sian.agency~stack-exchange-scraper/runs?token=YOUR_TOKEN' \
-H 'Content-Type: application/json' \
-d '{"scrapeMode": "overview", "site": "stackoverflow", "tagged": "python", "sort": "votes"}'

🚀 Getting Started (3 Simple Steps)

Step 1: Pick a mode

Choose Overview (many questions from a site/search) or Detail (full Q&A for specific question ids).

Step 2: Set your target

Enter a site slug + tag/keyword, paste a Stack Exchange URL, or list question ids.

Step 3: Run & export

Start the actor and download your dataset as JSON, CSV, or Excel.

That's it! In under a minute, you'll have:

  • A clean table of questions with scores, views, and tags
  • Author details for every post
  • Optional full answer threads for deep dives

📥 Input Configuration

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| scrapeMode | string | No | overview (list/search) or detail (full Q&A by id) |
| site | string | No | Stack Exchange site slug (default stackoverflow) |
| searchQuery | string | No | Keyword phrase to search (Overview mode) |
| tagged | string | No | Tag filter, e.g. python or python;pandas |
| sort | string | No | Order: votes, activity, creation, hot, week, month, relevance |
| overviewUrl | string | No | Paste a listing/tag/search URL instead of fields |
| questionId | string | No | A single question id or URL (Detail mode) |
| questionIds | array | No | Bulk question ids/URLs (Detail mode) |
| fetchAnswers | boolean | No | Also fetch each question's answers (Detail mode) |
| maxResults | integer | No | Cap on records per run (FREE: 25, PAID: unlimited) |
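
The input fields above can be checked locally before a run is started. The sketch below is not part of the actor: `build_input` is a hypothetical helper that only encodes the field names, modes, and sort values listed above. The rule that Detail mode needs at least one question id is an assumption, since every field is marked as not required.

```python
# Hypothetical helper, not part of the actor: builds a run input from the
# documented fields and rejects values that the field list does not mention.
ALLOWED_MODES = {"overview", "detail"}
ALLOWED_SORTS = {"votes", "activity", "creation", "hot", "week", "month", "relevance"}


def build_input(scrape_mode="overview", site="stackoverflow", **fields):
    """Return a dict that can be sent as the actor's JSON input."""
    if scrape_mode not in ALLOWED_MODES:
        raise ValueError(f"scrapeMode must be one of {sorted(ALLOWED_MODES)}")
    sort = fields.get("sort")
    if sort is not None and sort not in ALLOWED_SORTS:
        raise ValueError(f"sort must be one of {sorted(ALLOWED_SORTS)}")
    # Assumption: a Detail run without any question id has nothing to fetch.
    if scrape_mode == "detail" and not (fields.get("questionId") or fields.get("questionIds")):
        raise ValueError("detail mode needs questionId or questionIds")
    # Drop unset fields so the actor's own defaults apply.
    payload = {"scrapeMode": scrape_mode, "site": site}
    payload.update({key: value for key, value in fields.items() if value is not None})
    return payload


print(build_input(tagged="python;pandas", sort="votes", maxResults=100))
```

Catching a misspelled sort value or mode locally avoids spending a run on an input the actor may reject or silently ignore.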

Example — search Stack Overflow:

{
  "scrapeMode": "overview",
  "site": "stackoverflow",
  "searchQuery": "branch prediction",
  "sort": "votes",
  "maxResults": 100
}

Example — full Q&A by id:

{
  "scrapeMode": "detail",
  "site": "stackoverflow",
  "questionIds": ["11227809", "927358"],
  "fetchAnswers": true
}
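
Detail mode accepts both ids and URLs, so a mixed list can contain the same question twice. The sketch below shows one way to reduce such a list to unique ids before a run. The helper names and the two URL shapes it recognizes (`/questions/<id>` and `/q/<id>`) are assumptions for illustration, not something the actor requires.

```python
import re

# Assumed URL shapes: https://<site>/questions/<id>/<slug> and https://<site>/q/<id>
_QUESTION_URL = re.compile(r"/(?:questions|q)/(\d+)")


def to_question_id(value):
    """Return the numeric question id as a string, or None if none is found."""
    text = str(value).strip()
    if text.isdigit():
        return text
    match = _QUESTION_URL.search(text)
    return match.group(1) if match else None


def normalize_ids(values):
    """Convert a mixed list of ids and URLs to unique ids, preserving order."""
    unique = []
    for value in values:
        question_id = to_question_id(value)
        if question_id and question_id not in unique:
            unique.append(question_id)
    return unique


print(normalize_ids([
    "11227809",
    "https://stackoverflow.com/questions/927358",
    "https://stackoverflow.com/q/11227809",
]))  # ['11227809', '927358']
```

Because ids are site-specific, this only makes sense for a list whose entries all belong to the site given in the input.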

📤 Output

Results are saved to the Apify dataset with 30+ fields including:

| Field | Type | Description |
| --- | --- | --- |
| questionTitle | string | The question title |
| body | string | Question body (HTML) |
| score | number | Net votes on the question |
| view_count | number | Total views |
| answer_count | number | Number of answers |
| is_answered | boolean | Whether it has an accepted/upvoted answer |
| tags | array | Tags applied to the question |
| owner_display_name | string | Author name |
| owner_reputation | number | Author reputation |
| answers | array | Full answer list (Detail mode) |
| url | string | Canonical question URL |

Example:

{
  "id": 11227809,
  "url": "https://stackoverflow.com/questions/11227809/...",
  "questionTitle": "Why is processing a sorted array faster than an unsorted array?",
  "score": 27536,
  "view_count": 1986979,
  "answer_count": 26,
  "is_answered": true,
  "accepted_answer_id": 11227902,
  "tags": ["java", "c++", "performance", "cpu-architecture"],
  "owner_display_name": "GManNickG",
  "owner_reputation": 507097,
  "answers": [
    { "id": 11227902, "score": 35286, "is_accepted": true, "owner_display_name": "Mysticial" }
  ]
}
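
A record shaped like the example above can be processed with plain dictionary access. The sketch below picks the accepted answer out of a Detail-mode record. `accepted_answer` is a hypothetical helper, and it assumes the `answers`, `is_accepted`, and `accepted_answer_id` fields appear as in the sample.

```python
def accepted_answer(record):
    """Return the accepted answer dict from a Detail-mode record, or None."""
    accepted_id = record.get("accepted_answer_id")
    for answer in record.get("answers") or []:
        # Match on the per-answer flag first, then fall back to the id reference.
        if answer.get("is_accepted"):
            return answer
        if accepted_id is not None and answer.get("id") == accepted_id:
            return answer
    return None


# Trimmed copy of the example record above.
record = {
    "id": 11227809,
    "accepted_answer_id": 11227902,
    "answers": [
        {"id": 11227902, "score": 35286, "is_accepted": True, "owner_display_name": "Mysticial"}
    ],
}

best = accepted_answer(record)
print(best["owner_display_name"], best["score"])  # Mysticial 35286
```

Overview-mode records may have no `answers` list at all, which is why the helper treats a missing or empty list as "no accepted answer" instead of raising.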

💼 Use Cases & Examples

1. Developer Research

Engineers tracking solutions to a recurring error or library.

  • Input: A keyword or tag like kubernetes on Server Fault
  • Output: Top-voted questions + accepted answers
  • Use: Build an internal knowledge base of vetted fixes.

2. Tag & Topic Monitoring

DevRel and community teams watching a tag's activity.

  • Input: site=stackoverflow, tagged=your-product, sort=creation
  • Output: Newest questions mentioning the topic
  • Use: Spot unanswered questions and emerging issues early.
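
Spotting unanswered questions in a monitoring run comes down to a filter on `is_answered` and `answer_count`. The sketch below is a minimal illustration; `needs_attention` is a hypothetical helper and the sample records are made up, using only field names from the output section.

```python
def needs_attention(records, max_answers=0):
    """Keep questions that are not marked answered and have few answers."""
    return [
        record for record in records
        if not record.get("is_answered", False)
        and record.get("answer_count", 0) <= max_answers
    ]


# Made-up sample records using the documented field names.
sample = [
    {"questionTitle": "Install fails on ARM", "is_answered": False, "answer_count": 0},
    {"questionTitle": "How to paginate results", "is_answered": True, "answer_count": 3},
    {"questionTitle": "Timeout behind proxy", "is_answered": False, "answer_count": 1},
]

for question in needs_attention(sample):
    print(question["questionTitle"])  # Install fails on ARM
```

Raising `max_answers` widens the net to questions that have replies but no accepted or upvoted answer yet.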

3. Dataset Building for LLMs & Research

Data scientists assembling high-quality Q&A pairs.

  • Input: A tag or search across one or many sites
  • Output: Questions + full answer threads with scores
  • Use: Curate training/eval data filtered by votes and acceptance.

4. Competitive & Market Intelligence

Product teams mining pain points around competitors.

  • Input: Keyword searches for competitor tools
  • Output: Questions revealing gaps and complaints
  • Use: Inform roadmap and positioning.

5. Content & SEO Research

Writers finding the highest-demand developer questions.

  • Input: sort=votes or sort=week on a tag
  • Output: Questions ranked by engagement
  • Use: Prioritize tutorials and docs that people actually search for.

6. Academic & Trend Analysis

Researchers studying developer behavior over time.

  • Input: Creation-sorted listings with timestamps
  • Output: Time-stamped questions, view counts, tags
  • Use: Quantify topic growth and answer dynamics.
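
As a starting point for topic analysis, tag frequencies can be counted directly from the `tags` array on each record. The sketch below uses made-up sample records, and `tag_counts` is a hypothetical helper. Timestamp fields are left out because their names are not listed in the output section.

```python
from collections import Counter


def tag_counts(records):
    """Count how many questions carry each tag."""
    counts = Counter()
    for record in records:
        # set() guards against a tag being counted twice for one question.
        counts.update(set(record.get("tags") or []))
    return counts


# Made-up sample records using the documented "tags" field.
sample = [
    {"tags": ["python", "pandas"]},
    {"tags": ["python", "numpy"]},
    {"tags": ["rust"]},
]

print(tag_counts(sample).most_common(1))  # [('python', 2)]
```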


🔗 Integration Examples

JavaScript/Node.js

import { ApifyClient } from 'apify-client';

// Authenticate with your Apify API token.
const client = new ApifyClient({ token: 'YOUR_TOKEN' });
// Start the actor and wait for the run to finish.
const run = await client.actor('sian.agency/stack-exchange-scraper').call({
  scrapeMode: 'overview',
  site: 'stackoverflow',
  tagged: 'python',
  sort: 'votes',
});
// Read the scraped questions from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items[0]);

Python

from apify_client import ApifyClient

client = ApifyClient('YOUR_TOKEN')
# Start the actor in Detail mode and wait for the run to finish.
run = client.actor('sian.agency/stack-exchange-scraper').call(
    run_input={'scrapeMode': 'detail', 'questionIds': ['11227809'], 'fetchAnswers': True}
)
# Iterate over the records stored in the run's default dataset.
for item in client.dataset(run['defaultDatasetId']).iterate_items():
    print(item)

cURL

curl -X POST 'https://api.apify.com/v2/acts/sian.agency~stack-exchange-scraper/runs?token=YOUR_TOKEN' \
-H 'Content-Type: application/json' \
-d '{"scrapeMode": "overview", "site": "superuser", "searchQuery": "ssh tunnel"}'

Automation Workflows (N8N / Zapier / Make)

  1. Trigger: Schedule or webhook
  2. HTTP Request: Call actor API
  3. Process: Handle JSON results
  4. Action: Save, notify, or transform
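
For readers wiring these steps up by hand, the sketch below shows roughly what the HTTP Request and Process steps amount to, using only the Python standard library. It reuses the endpoint from the cURL example. `build_run_request` and `summarize` are hypothetical helpers, and the network call is left commented out so the sketch runs offline.

```python
import json
import urllib.parse
import urllib.request

ACTOR_ID = "sian.agency~stack-exchange-scraper"  # taken from the cURL example


def build_run_request(token, actor_input):
    """Step 2: prepare the POST request that starts an actor run."""
    query = urllib.parse.urlencode({"token": token})
    url = f"https://api.apify.com/v2/acts/{ACTOR_ID}/runs?{query}"
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )


def summarize(items):
    """Steps 3-4 stand-in: reduce dataset items to one line per question."""
    return [f'{item.get("score", 0)}\t{item.get("questionTitle", "")}' for item in items]


request = build_run_request(
    "YOUR_TOKEN",
    {"scrapeMode": "overview", "site": "superuser", "searchQuery": "ssh tunnel"},
)
print(request.get_method(), request.full_url)

# Sending the request needs a real token and network access:
# with urllib.request.urlopen(request) as response:
#     run = json.load(response)
```

In n8n, Zapier, or Make the same URL, header, and JSON body go into the HTTP request node, and the later steps read the run's dataset once it has finished.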

📊 Performance & Pricing

FREE Tier (Try It Now)

  • 25 records per run, with full feature access and the same quality
  • No credit card required
  • Perfect for testing and small projects

PAID Tier

  • Unlimited records per run
  • Faster processing, no delays
  • Pay-per-result: only charged for successful results
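
For rough budgeting, the listed starting price of $1.30 per 1,000 extracted questions gives a lower-bound estimate for a run. The sketch below is plain arithmetic. `estimate_cost` is a hypothetical helper, and it assumes the "from" price applies to every record and ignores any other platform charges.

```python
from decimal import Decimal

# "From" price on the listing; the rate actually charged may be higher.
PRICE_PER_1000 = Decimal("1.30")


def estimate_cost(records):
    """Return a lower-bound cost in USD for the given number of questions."""
    return (Decimal(records) / 1000 * PRICE_PER_1000).quantize(Decimal("0.01"))


print(estimate_cost(2500))  # 3.25
```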

💰 Best price on the market — transparent pay-per-record pricing with no subscriptions.

🔗 View current pricing


❓ Frequently Asked Questions

Q: How many records can I process? A: FREE tier: 25 per run. PAID tier: unlimited.

Q: Which Stack Exchange sites are supported? A: All of them — set the site slug (stackoverflow, superuser, serverfault, askubuntu, mathoverflow, unix, dba, security, and 170+ more).

Q: Can I get the answers, not just questions? A: Yes — use Detail mode with Fetch Answers on to get each question's full answer list.

Q: Does it work with private or deleted content? A: No, only publicly accessible questions and answers are supported.

Q: What output formats are available? A: JSON, CSV, Excel — export directly from the Apify dataset.

Q: Do I need an API key or login? A: No. Just paste a tag, keyword, or URL and run.

Q: Is this legal? A: Yes — we only extract publicly available data. See the legal section below.


🐛 Troubleshooting

No results returned

  • Check the site slug is correct (e.g. stackoverflow, not stack-overflow)
  • Make sure your tag/keyword actually matches questions on that site

Fewer records than expected

  • FREE tier is capped at 25 per run — upgrade for unlimited
  • Increase maxPages and maxResults for larger pulls

Detail mode returns nothing for an id

  • Confirm the question id exists on the chosen site (ids are site-specific)

⚖️ Legal

Our actors are ethical and do not extract any private user data, such as email addresses, gender, or location. They only extract what the user has chosen to share publicly. We therefore believe that our actors, when used for ethical purposes by Apify users, are safe.

However, you should be aware that your results could contain personal data. Personal data is protected by the GDPR in the European Union and by other regulations around the world. You should not scrape personal data unless you have a legitimate reason to do so. If you're unsure whether your reason is legitimate, consult your lawyers.

You can also read Apify's blog post on the legality of web scraping.

Stack Overflow, Stack Exchange, Super User, Server Fault, Ask Ubuntu, and related marks are trademarks of Stack Exchange Inc. This actor is an independent tool and is not affiliated with, endorsed by, or sponsored by Stack Exchange Inc.


🤝 Support

Telegram Support

Join our active support community


Built by SIÁN Agency | More Tools