Stack Exchange Scraper — Questions, Answers & Search API
Pricing
from $1.30 / 1,000 questions extracted
Scrape Stack Overflow & the Stack Exchange network into clean structured data — questions, answers, scores, views, tags, authors. Search by keyword, tag, or paste a URL; pull full Q&A threads by id. JSON/CSV/Excel. No login or API key needed.
Developer
SIÁN OÜ
Maintained by Community
Stack Exchange Scraper — Questions, Answers & Search Data 🚀
🎉 Turn Stack Overflow & the entire Stack Exchange network into a clean, structured Q&A dataset — in seconds
Built for developers, data teams, and researchers who need questions, answers, scores, tags, and authors at scale
📋 Overview
Need Stack Overflow data without writing a single line of API code? This actor pulls public questions and answers from Stack Overflow and every Stack Exchange site (Super User, Server Fault, Ask Ubuntu, MathOverflow, Unix, DBA, Security, and 170+ more) into a tidy dataset you can export to JSON, CSV, or Excel.
Why choose this actor:
- ✅ Whole network coverage: one actor, any Stack Exchange site — just set the slug
- ⚡ Fast & paginated: pull hundreds of questions per run with rich fields, no rate-limit headaches
- 🎯 Full Q&A threads: fetch a question by id and get its complete answer list — scores, accepted flag, authors
- 💰 Pay-per-result: only pay for records you keep — transparent, best-in-class pricing
- 💎 30+ structured fields: titles, bodies, scores, views, tags, author reputation, timestamps, license
- ✨ No login, no API key: paste a tag, a keyword, or a URL and go
✨ Features
- 🔎 Keyword Search: search any site by phrase and collect matching questions
- 🏷️ Tag & Listing Mode: pull a tag's questions sorted by votes, activity, hot, week, or month
- 🔗 Paste-a-URL: drop a listing, tag, or search URL — the site and filters are read automatically
- 📄 Full Detail Mode: fetch specific questions by id (or URL) with their complete answer threads
- 💬 Answer Extraction: scores, accepted-answer flag, bodies, and authors for every answer
- ⭐ Author Insights: reputation, accept rate, profile link, and avatar for each poster
- 📊 Field Coverage Score: every record carries a 0–1 completeness signal
- 📦 Clean Exports: JSON, CSV, and Excel straight from the Apify dataset
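The Field Coverage Score can be pictured as the share of expected fields that are actually filled in. The sketch below is an illustration only: the `EXPECTED_FIELDS` list and the `field_coverage` helper are hypothetical, and the actor's real scoring rules are not documented here.

```python
from typing import Any, Iterable, Mapping

# Hypothetical field list, taken from the output table on this page.
EXPECTED_FIELDS = (
    "questionTitle", "body", "score", "view_count",
    "answer_count", "tags", "owner_display_name", "url",
)


def field_coverage(record: Mapping[str, Any],
                   fields: Iterable[str] = EXPECTED_FIELDS) -> float:
    """Return the fraction (0 to 1) of expected fields that are non-empty."""
    fields = tuple(fields)
    if not fields:
        return 0.0

    def is_filled(value: Any) -> bool:
        # 0 and False are real values; only None and empty
        # strings/lists/dicts count as missing.
        return value is not None and value != "" and value != [] and value != {}

    filled = sum(1 for name in fields if is_filled(record.get(name)))
    return filled / len(fields)


sample = {"questionTitle": "x", "score": 0, "tags": [], "url": "u", "body": ""}
print(field_coverage(sample))  # 3 of 8 fields are filled
```

A score like this is handy for dropping sparse records before export, for example keeping only items above a threshold you choose.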
🎬 Quick Start
Choose a mode, give it a site (or a URL), and run. Results stream into the dataset as they're scraped. Export when done.
```bash
curl -X POST 'https://api.apify.com/v2/acts/sian.agency~stack-exchange-scraper/runs?token=YOUR_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{"scrapeMode": "overview", "site": "stackoverflow", "tagged": "python", "sort": "votes"}'
```
🚀 Getting Started (3 Simple Steps)
Step 1: Pick a mode
Choose Overview (many questions from a site/search) or Detail (full Q&A for specific question ids).
Step 2: Set your target
Enter a site slug + tag/keyword, paste a Stack Exchange URL, or list question ids.
Step 3: Run & export
Start the actor and download your dataset as JSON, CSV, or Excel.
That's it! In under a minute, you'll have:
- A clean table of questions with scores, views, and tags
- Author details for every post
- Optional full answer threads for deep dives
📥 Input Configuration
| Field | Type | Required | Description |
|---|---|---|---|
| scrapeMode | string | No | overview (list/search) or detail (full Q&A by id) |
| site | string | No | Stack Exchange site slug (default stackoverflow) |
| searchQuery | string | No | Keyword phrase to search (Overview mode) |
| tagged | string | No | Tag filter, e.g. python or python;pandas |
| sort | string | No | Order: votes, activity, creation, hot, week, month, relevance |
| overviewUrl | string | No | Paste a listing/tag/search URL instead of fields |
| questionId | string | No | A single question id or URL (Detail mode) |
| questionIds | array | No | Bulk question ids/URLs (Detail mode) |
| fetchAnswers | boolean | No | Also fetch each question's answers (Detail mode) |
| maxResults | integer | No | Cap on records per run (FREE: 25, PAID: unlimited) |
Example — search Stack Overflow:
```json
{
  "scrapeMode": "overview",
  "site": "stackoverflow",
  "searchQuery": "branch prediction",
  "sort": "votes",
  "maxResults": 100
}
```
Example — full Q&A by id:
```json
{
  "scrapeMode": "detail",
  "site": "stackoverflow",
  "questionIds": ["11227809", "927358"],
  "fetchAnswers": true
}
```
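Before starting a run, it can help to check an input object against the table above. The following is a minimal client-side sketch, not part of the actor: `check_input` is a hypothetical helper, and the rule that Detail mode needs at least one question id is an assumption inferred from the mode descriptions.

```python
ALLOWED_MODES = {"overview", "detail"}
ALLOWED_SORTS = {"votes", "activity", "creation", "hot", "week", "month", "relevance"}


def check_input(actor_input: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []

    mode = actor_input.get("scrapeMode")
    if mode is not None and mode not in ALLOWED_MODES:
        problems.append(f"scrapeMode must be one of {sorted(ALLOWED_MODES)}")

    sort = actor_input.get("sort")
    if sort is not None and sort not in ALLOWED_SORTS:
        problems.append(f"sort must be one of {sorted(ALLOWED_SORTS)}")

    # Assumption: Detail mode is only useful with questionId or questionIds set.
    if mode == "detail" and not (actor_input.get("questionId")
                                 or actor_input.get("questionIds")):
        problems.append("detail mode needs questionId or questionIds")

    max_results = actor_input.get("maxResults")
    if max_results is not None:
        is_int = isinstance(max_results, int) and not isinstance(max_results, bool)
        if not is_int or max_results <= 0:
            problems.append("maxResults must be a positive integer")

    return problems


print(check_input({"scrapeMode": "detail"}))
```

Running the check locally catches typos such as an unsupported `sort` value before a run is started.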
📤 Output
Results are saved to the Apify dataset with 30+ fields including:
| Field | Type | Description |
|---|---|---|
| questionTitle | string | The question title |
| body | string | Question body (HTML) |
| score | number | Net votes on the question |
| view_count | number | Total views |
| answer_count | number | Number of answers |
| is_answered | boolean | Whether it has an accepted/upvoted answer |
| tags | array | Tags applied to the question |
| owner_display_name | string | Author name |
| owner_reputation | number | Author reputation |
| answers | array | Full answer list (Detail mode) |
| url | string | Canonical question URL |
Example:
```json
{
  "id": 11227809,
  "url": "https://stackoverflow.com/questions/11227809/...",
  "questionTitle": "Why is processing a sorted array faster than an unsorted array?",
  "score": 27536,
  "view_count": 1986979,
  "answer_count": 26,
  "is_answered": true,
  "accepted_answer_id": 11227902,
  "tags": ["java", "c++", "performance", "cpu-architecture"],
  "owner_display_name": "GManNickG",
  "owner_reputation": 507097,
  "answers": [
    { "id": 11227902, "score": 35286, "is_accepted": true, "owner_display_name": "Mysticial" }
  ]
}
```
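A record shaped like the example above can be reduced to its accepted answer with a few lines of Python. This is a sketch that relies only on the field names shown in the example (`accepted_answer_id`, `answers`, `is_accepted`, `id`); the `accepted_answer` helper itself is hypothetical.

```python
from typing import Any, Optional


def accepted_answer(record: dict) -> Optional[dict[str, Any]]:
    """Return the accepted answer of a Detail-mode record, or None."""
    accepted_id = record.get("accepted_answer_id")
    for answer in record.get("answers") or []:
        # Match on the is_accepted flag, or fall back to the id on the question.
        if answer.get("is_accepted") or (
            accepted_id is not None and answer.get("id") == accepted_id
        ):
            return answer
    return None


record = {
    "id": 11227809,
    "accepted_answer_id": 11227902,
    "answers": [
        {"id": 11227902, "score": 35286, "is_accepted": True,
         "owner_display_name": "Mysticial"},
    ],
}
best = accepted_answer(record)
print(best["owner_display_name"] if best else "no accepted answer")
```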
💼 Use Cases & Examples
1. Developer Research
Engineers tracking solutions to a recurring error or library.
Input: A keyword or tag like kubernetes on Server Fault
Output: Top-voted questions + accepted answers
Use: Build an internal knowledge base of vetted fixes.
2. Tag & Topic Monitoring
DevRel and community teams watching a tag's activity.
Input: site=stackoverflow, tagged=your-product, sort=creation
Output: Newest questions mentioning the topic
Use: Spot unanswered questions and emerging issues early.
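For tag monitoring, the exported records can be filtered locally for questions that still need attention. The sketch below assumes the `is_answered` and `view_count` fields from the output table; `unanswered_first` is a hypothetical helper name.

```python
def unanswered_first(records: list[dict]) -> list[dict]:
    """Keep questions without an accepted/upvoted answer, busiest first."""
    open_questions = [r for r in records if not r.get("is_answered", False)]
    # Sort by views so high-traffic unanswered questions surface at the top.
    return sorted(open_questions, key=lambda r: r.get("view_count", 0), reverse=True)


records = [
    {"id": 1, "is_answered": True, "view_count": 900},
    {"id": 2, "is_answered": False, "view_count": 40},
    {"id": 3, "is_answered": False, "view_count": 300},
]
print([r["id"] for r in unanswered_first(records)])
```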
3. Dataset Building for LLMs & Research
Data scientists assembling high-quality Q&A pairs.
Input: A tag or search across one or many sites
Output: Questions + full answer threads with scores
Use: Curate training/eval data filtered by votes and acceptance.
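Curating Q&A pairs by votes and acceptance might look like the sketch below. It assumes Detail-mode records shaped like the output example; the `qa_pairs` helper and its thresholds are hypothetical, and the answer `body` key is an assumption because the sample output does not show it.

```python
def qa_pairs(records: list[dict],
             min_question_score: int = 10,
             min_answer_score: int = 5,
             accepted_only: bool = False) -> list[dict]:
    """Flatten records into question/answer pairs that pass the score filters."""
    pairs = []
    for record in records:
        if record.get("score", 0) < min_question_score:
            continue
        for answer in record.get("answers") or []:
            if accepted_only and not answer.get("is_accepted"):
                continue
            if answer.get("score", 0) < min_answer_score:
                continue
            pairs.append({
                "question_id": record.get("id"),
                "question": record.get("questionTitle"),
                "answer_id": answer.get("id"),
                "answer": answer.get("body"),  # assumed key, may be absent
                "answer_score": answer.get("score", 0),
                "accepted": bool(answer.get("is_accepted")),
            })
    return pairs


records = [
    {"id": 1, "questionTitle": "Q1", "score": 50, "answers": [
        {"id": 10, "score": 80, "is_accepted": True},
        {"id": 11, "score": 2, "is_accepted": False},
    ]},
    {"id": 2, "questionTitle": "Q2", "score": 1, "answers": [
        {"id": 20, "score": 99, "is_accepted": True},
    ]},
]
print([p["answer_id"] for p in qa_pairs(records)])
```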
4. Competitive & Market Intelligence
Product teams mining pain points around competitors.
Input: Keyword searches for competitor tools
Output: Questions revealing gaps and complaints
Use: Inform roadmap and positioning.
5. Content & SEO Research
Writers finding the highest-demand developer questions.
Input: sort=votes or sort=week on a tag
Output: Questions ranked by engagement
Use: Prioritize tutorials and docs that people actually search for.
6. Academic & Trend Analysis
Researchers studying developer behavior over time.
Input: Creation-sorted listings with timestamps
Output: Time-stamped questions, view counts, tags
Use: Quantify topic growth and answer dynamics.
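Topic growth can be estimated by counting questions per calendar month. The sketch below assumes each record carries a Unix-epoch timestamp under a `creation_date` key; that field name is an assumption (this page only says timestamps are included), so adjust `date_field` to match your dataset.

```python
from collections import Counter
from datetime import datetime, timezone


def questions_per_month(records: list[dict],
                        date_field: str = "creation_date") -> dict[str, int]:
    """Count records per UTC month (YYYY-MM), skipping records with no timestamp."""
    counts: Counter = Counter()
    for record in records:
        stamp = record.get(date_field)
        if stamp is None:
            continue
        month = datetime.fromtimestamp(stamp, tz=timezone.utc).strftime("%Y-%m")
        counts[month] += 1
    # Return months in chronological order.
    return dict(sorted(counts.items()))


records = [
    {"creation_date": 1325376000},  # 2012-01-01 00:00:00 UTC
    {"creation_date": 0},           # 1970-01-01 00:00:00 UTC
    {"creation_date": 86400},       # 1970-01-02 00:00:00 UTC
    {},                             # no timestamp, skipped
]
print(questions_per_month(records))
```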
🔗 Integration Examples
JavaScript/Node.js
```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_TOKEN' });

const run = await client.actor('sian.agency/stack-exchange-scraper').call({
    scrapeMode: 'overview',
    site: 'stackoverflow',
    tagged: 'python',
    sort: 'votes',
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items[0]);
```
Python
```python
from apify_client import ApifyClient

client = ApifyClient('YOUR_TOKEN')
run = client.actor('sian.agency/stack-exchange-scraper').call(
    run_input={'scrapeMode': 'detail', 'questionIds': ['11227809'], 'fetchAnswers': True}
)
for item in client.dataset(run['defaultDatasetId']).iterate_items():
    print(item)
```
cURL
```bash
curl -X POST 'https://api.apify.com/v2/acts/sian.agency~stack-exchange-scraper/runs?token=YOUR_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{"scrapeMode": "overview", "site": "superuser", "searchQuery": "ssh tunnel"}'
```
Automation Workflows (N8N / Zapier / Make)
- Trigger: Schedule or webhook
- HTTP Request: Call actor API
- Process: Handle JSON results
- Action: Save, notify, or transform
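The "HTTP Request" step in a workflow tool boils down to the same POST call as the cURL example. As a rough sketch using only the Python standard library, the request can be built like this; `build_run_request` is a hypothetical helper, and the snippet only constructs the request without sending it.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, actor_input: dict) -> Request:
    """Build (but do not send) the POST request that starts an actor run."""
    query = urlencode({"token": token})
    url = f"{API_BASE}/acts/{quote(actor_id, safe='~')}/runs?{query}"
    body = json.dumps(actor_input).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_run_request(
    "sian.agency~stack-exchange-scraper",
    "YOUR_TOKEN",
    {"scrapeMode": "overview", "site": "superuser", "searchQuery": "ssh tunnel"},
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; the JSON response can then feed the "Process" and "Action" steps.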
📊 Performance & Pricing
FREE Tier (Try It Now)
- 25 records per run — full feature access, same quality
- No credit card required
- Perfect for testing and small projects
PAID Tier (Production Ready)
- Unlimited records per run
- Faster processing, no delays
- Pay-per-result: only charged for successful results
💰 Best price on the market — transparent pay-per-record pricing with no subscriptions.
❓ Frequently Asked Questions
Q: How many records can I process?
A: FREE tier: 25 per run. PAID tier: unlimited.
Q: Which Stack Exchange sites are supported?
A: All of them — set the site slug (stackoverflow, superuser, serverfault, askubuntu, mathoverflow, unix, dba, security, and 170+ more).
Q: Can I get the answers, not just questions?
A: Yes. Use Detail mode with Fetch Answers on to get each question's full answer list.
Q: Does it work with private or deleted content?
A: No, only publicly accessible questions and answers are supported.
Q: What output formats are available?
A: JSON, CSV, and Excel. Export directly from the Apify dataset.
Q: Do I need an API key or login?
A: No. Just paste a tag, keyword, or URL and run.
Q: Is this legal?
A: Yes, we only extract publicly available data. See the legal section below.
🐛 Troubleshooting
No results returned
- Check the `site` slug is correct (e.g. `stackoverflow`, not `stack-overflow`)
- Make sure your tag/keyword actually matches questions on that site
Fewer records than expected
- FREE tier is capped at 25 per run — upgrade for unlimited
- Increase `maxPages` and `maxResults` for larger pulls
Detail mode returns nothing for an id
- Confirm the question id exists on the chosen `site` (ids are site-specific)
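Because ids are site-specific, it can be useful to pull both the host and the id out of a question URL and compare the host with your `site` setting. The helper below is a hypothetical sketch and not the actor's own parser; it returns the URL's hostname rather than guessing a slug.

```python
import re
from typing import Optional
from urllib.parse import urlparse


def parse_question_ref(ref: str) -> tuple[Optional[str], int]:
    """Return (host, question_id); host is None when ref is a bare id."""
    ref = ref.strip()
    if ref.isdigit():
        return None, int(ref)

    parsed = urlparse(ref)
    # Question URLs look like /questions/<id>/<slug> or the short form /q/<id>.
    match = re.match(r"^/(?:questions|q)/(\d+)", parsed.path)
    if not parsed.netloc or not match:
        raise ValueError(f"not a question id or question URL: {ref!r}")
    return parsed.netloc.lower(), int(match.group(1))


print(parse_question_ref("https://stackoverflow.com/questions/11227809/why-is-processing"))
print(parse_question_ref("927358"))
```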
⚖️ Is it legal to scrape data?
Our actors are ethical and do not extract any private user data, such as email addresses, gender, or location. They only extract what the user has chosen to share publicly. We therefore believe that our actors, when used for ethical purposes by Apify users, are safe.
However, you should be aware that your results could contain personal data. Personal data is protected by the GDPR in the European Union and by other regulations around the world. You should not scrape personal data unless you have a legitimate reason to do so. If you're unsure whether your reason is legitimate, consult your lawyers.
You can also read Apify's blog post on the legality of web scraping.
Stack Overflow, Stack Exchange, Super User, Server Fault, Ask Ubuntu, and related marks are trademarks of Stack Exchange Inc. This actor is an independent tool and is not affiliated with, endorsed by, or sponsored by Stack Exchange Inc.
🤝 Support
Join our active support community
- For issues or questions, open an issue in the actor's repository
- Check SIÁN Agency Store for more automation tools
- 📧 apify@sian-agency.online
Built by SIÁN Agency | More Tools