Stack Overflow Scraper - Questions, Answers & Comments

Pricing

from $0.30 / 1,000 results

Scrape questions, answers, and comments from Stack Overflow and the Stack Exchange network. Filter by tag, search, or user. Returns body, score, votes, accepted-answer flag. Built for AI/LLM training datasets, dev research, and tag-trend monitoring.

Developer

NIJ KANANI

Maintained by Community

🟧 Stack Overflow Scraper

Scrape Stack Overflow and the rest of the Stack Exchange network, including Server Fault, Super User, Math.SE, Ask Ubuntu, and Data Science. Pull questions by tag, search query, user, or top-of-period ranking. Optionally fetch full answers and comments.

🎯 One of the largest sources of high-quality technical Q&A on the internet, well suited to AI/LLM training, sentiment analysis, and dev research.


✨ What you can do

  • 🏷️ Tag-based pulls: python, machine-learning, react-native, etc.
  • 🔍 Full-text search: Stack Exchange's /search/advanced
  • 👤 User activity: fetch any user's questions
  • 🔥 Top of period: top of week/month/all-time
  • 📅 Date range filter: fromDate / toDate
  • 🎯 Min score filter: only quality content
  • 💬 Optionally fetch answers + comments for each question
  • 🌐 Works on any Stack Exchange site (site parameter)

🚀 Quick start

{
  "site": "stackoverflow",
  "mode": "tag",
  "tags": ["python", "machine-learning"],
  "sort": "votes",
  "fromDate": "2026-01-01",
  "minScore": 5,
  "includeAnswers": true,
  "includeComments": false,
  "maxItems": 500
}

📥 Input

| Field | Description |
| --- | --- |
| site | SE site (stackoverflow, serverfault, superuser, math, datascience, ...) |
| mode | tag / search / user / top |
| tags | Tag names (mode = tag) |
| searchQueries | Free-text queries (mode = search) |
| userIds | Numeric SE user IDs (mode = user) |
| sort | activity / creation / votes / hot / week / month |
| fromDate, toDate | ISO date range filter |
| minScore | Skip questions below this score |
| includeAnswers | Fetch all answers per question |
| includeComments | Fetch comments on questions (and on answers if includeAnswers is set) |
| maxItems | Cap per target |
| apiKey | (optional) Free key from https://stackapps.com; raises the quota from 300 to 10,000 requests/day |
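
To make the input fields more concrete, here is a minimal Python sketch of how a tag-mode input could translate into a Stack Exchange API v2.3 `/questions` request. The mapping itself is an assumption for illustration, not the actor's actual code: `build_tag_request` and `to_epoch` are hypothetical helpers, while `tagged`, `sort`, `order`, `min`, `fromdate`, `todate`, `pagesize`, `site`, and `key` are real API parameters. In the API, `min` applies to whatever field `sort` selects, so it only acts as a score threshold when sorting by votes.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

API_ROOT = "https://api.stackexchange.com/2.3"


def to_epoch(iso_date: str) -> int:
    """Convert an ISO date such as '2026-01-01' to Unix seconds (UTC midnight)."""
    dt = datetime.strptime(iso_date, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(dt.timestamp())


def build_tag_request(actor_input: dict, page: int = 1) -> str:
    """Build a /questions URL for mode == 'tag' (hypothetical mapping)."""
    sort = actor_input.get("sort", "activity")
    params = {
        "site": actor_input.get("site", "stackoverflow"),
        # Semicolon-delimited tags select questions carrying all listed tags.
        "tagged": ";".join(actor_input["tags"]),
        "sort": sort,
        "order": "desc",
        "page": page,
        "pagesize": 100,  # largest page size the API accepts
    }
    # The API dates are Unix epoch seconds, not ISO strings.
    if "fromDate" in actor_input:
        params["fromdate"] = to_epoch(actor_input["fromDate"])
    if "toDate" in actor_input:
        params["todate"] = to_epoch(actor_input["toDate"])
    # `min` filters on the sort field, so it is a score filter only for sort=votes;
    # for other sorts a client-side score check would be needed instead.
    if sort == "votes" and "minScore" in actor_input:
        params["min"] = actor_input["minScore"]
    if "apiKey" in actor_input:
        params["key"] = actor_input["apiKey"]
    return f"{API_ROOT}/questions?{urlencode(params)}"


url = build_tag_request({
    "site": "stackoverflow",
    "tags": ["python", "machine-learning"],
    "sort": "votes",
    "fromDate": "2026-01-01",
    "minScore": 5,
})
print(url)
```
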

📤 Output (per question)

{
  "site": "stackoverflow",
  "type": "question",
  "questionId": 12345678,
  "title": "How to do X in Python?",
  "body": "<p>HTML body</p>",
  "bodyMarkdown": "Markdown body",
  "tags": ["python", "machine-learning"],
  "score": 42,
  "viewCount": 9876,
  "answerCount": 3,
  "isAnswered": true,
  "acceptedAnswerId": 99999,
  "creationDate": "2026-04-15T...",
  "owner": { "userId": 123, "displayName": "user", "reputation": 50000, "profileUrl": "..." },
  "link": "https://stackoverflow.com/questions/12345678/...",
  "answers": [
    {
      "answerId": 99999,
      "body": "<p>Answer HTML</p>",
      "score": 78,
      "isAccepted": true,
      "creationDate": "...",
      "owner": { "userId": 456, "displayName": "expert", "reputation": 100000 }
    }
  ],
  "comments": [
    { "commentId": 555, "body": "Comment text", "score": 5, "creationDate": "...", "owner": {...} }
  ]
}
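
For dataset work, a record shaped like the one above can be reduced to a prompt/completion pair. The following is a rough sketch under stated assumptions: it relies only on the field names shown in the sample output, and `strip_html`, `best_answer`, and `to_training_pair` are hypothetical helper names. The regex-based tag stripping is deliberately crude; a proper HTML parser would be a safer choice for real data.

```python
import html
import re
from typing import Optional

_TAG_RE = re.compile(r"<[^>]+>")


def strip_html(fragment: str) -> str:
    """Crude tag stripper for illustration only."""
    return html.unescape(_TAG_RE.sub("", fragment)).strip()


def best_answer(question: dict) -> Optional[dict]:
    """Prefer the accepted answer; otherwise fall back to the highest-scored one."""
    answers = question.get("answers") or []
    for answer in answers:
        if answer.get("isAccepted"):
            return answer
    return max(answers, key=lambda a: a.get("score", 0), default=None)


def to_training_pair(question: dict) -> Optional[dict]:
    """Turn one scraped question into a prompt/completion record, or None if unanswered."""
    answer = best_answer(question)
    if answer is None:
        return None
    # Use the Markdown body when present, else strip tags from the HTML body.
    prompt_body = question.get("bodyMarkdown") or strip_html(question.get("body", ""))
    return {
        "prompt": f"{question['title']}\n\n{prompt_body}",
        "completion": strip_html(answer["body"]),
        "source": question.get("link"),
    }


sample = {
    "title": "How to do X in Python?",
    "body": "<p>HTML body</p>",
    "bodyMarkdown": "Markdown body",
    "link": "https://stackoverflow.com/questions/12345678/...",
    "answers": [
        {"answerId": 1, "body": "<p>Other answer</p>", "score": 90, "isAccepted": False},
        {"answerId": 99999, "body": "<p>Answer HTML</p>", "score": 78, "isAccepted": True},
    ],
}
pair = to_training_pair(sample)
print(pair)
```

Choosing the accepted answer over the top-scored one is a design choice; for some datasets the highest score may be the better quality signal.
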

🎯 Use cases

| Who | Why |
| --- | --- |
| 🤖 AI / LLM teams | Best-in-class technical Q&A for fine-tuning code/expert models |
| 📊 Dev relations | Track which language/framework tags are heating up |
| 🎓 Researchers | Code-discussion datasets, error-pattern analysis |
| 🛠️ Tool builders | Mine common pain points around your stack |

⚙️ Tech notes

  • Uses Stack Exchange API v2.3
  • Without an API key: 300 requests/day quota (per IP)
  • With a free API key: 10,000 requests/day, strongly recommended for large pulls
  • Automatically honors the backoff field when the API asks clients to slow down
  • Includes both the HTML body and body_markdown
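
The quota and backoff behavior above can be illustrated with a small pagination loop. This is a sketch of the general pattern, not the actor's implementation: `iter_items` and the injected `fetch_page` callable are hypothetical, while `items`, `has_more`, `backoff`, and `quota_remaining` are fields of the Stack Exchange API response wrapper. Injecting `fetch_page` and `sleep` keeps the example runnable without network access.

```python
import time
from typing import Callable, Iterator


def iter_items(
    fetch_page: Callable[[int], dict],
    sleep: Callable[[float], None] = time.sleep,
    max_pages: int = 10,
) -> Iterator[dict]:
    """Yield items page by page, pausing whenever the API returns a backoff value."""
    page = 1
    while page <= max_pages:
        wrapper = fetch_page(page)
        yield from wrapper.get("items", [])
        # Stop when there is nothing more to fetch or the daily quota is spent.
        if not wrapper.get("has_more") or wrapper.get("quota_remaining", 1) <= 0:
            break
        # `backoff` is the number of seconds to wait before the next request.
        backoff = wrapper.get("backoff")
        if backoff:
            sleep(backoff)
        page += 1


# Usage with canned responses standing in for real API calls.
fake_pages = {
    1: {"items": [{"id": 1}], "has_more": True, "backoff": 3, "quota_remaining": 299},
    2: {"items": [{"id": 2}], "has_more": False, "quota_remaining": 298},
}
slept = []
items = list(iter_items(lambda p: fake_pages[p], sleep=slept.append))
print(items, slept)
```
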

โ“ FAQ

Where do I get an API key? Free at https://stackapps.com/apps/oauth/register โ€” fill in any name, leave most fields blank. Takes 30 seconds.

Are deleted/closed questions included? The API skips deleted; closed questions are returned but flagged in the data.

Schedule it? Yes. Daily pulls of new questions on your favorite tags is a perfect Apify Schedule use case.
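
For a daily schedule, the input needs a rolling date window. A minimal sketch of how one might be generated is shown here; `daily_input` is a hypothetical helper, the field names come from the input table, and whether the actor treats `toDate` as inclusive is an assumption that should be checked against a real run.

```python
from datetime import date, timedelta
from typing import List, Optional


def daily_input(tags: List[str], today: Optional[date] = None) -> dict:
    """Build an actor input covering the previous day for the given tags."""
    today = today or date.today()
    yesterday = today - timedelta(days=1)
    return {
        "site": "stackoverflow",
        "mode": "tag",
        "tags": tags,
        "sort": "creation",  # newest questions first suits incremental pulls
        "fromDate": yesterday.isoformat(),
        "toDate": today.isoformat(),
        "includeAnswers": False,
        "includeComments": False,
        "maxItems": 500,
    }


run_input = daily_input(["python"], today=date(2026, 4, 16))
print(run_input)
```
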