Stack Overflow Scraper: Questions, Answers, Users & Tags

Pricing

$1.00 / 1,000 result items


Scrape any Stack Exchange site (stackoverflow, superuser, askubuntu, math.stackexchange and 170+ more) via the official Stack Exchange API. Questions, answers with full body, user profiles with reputation and badges, top tags, search. No auth, no proxies, no cookies. Pay only per result item.

Developer

Perconey

Maintained by Community

What does Stack Overflow Scraper do?

Stack Overflow Scraper pulls structured data from any Stack Exchange site through the official Stack Exchange API v2.3. Get top questions, full search results, every answer to a single question, user profiles with reputation and badges, and top tags. The actor calls the documented public API directly, so it needs no browser, proxies, or cookies, and there is no anti-bot protection to work around. One actor covers 170+ Stack Exchange sites through a single site parameter: stackoverflow, superuser, askubuntu, math.stackexchange, codereview.stackexchange, gaming.stackexchange, parenting.stackexchange, and many more.

Try it instantly: pick getQuestions, leave site as stackoverflow, click Start. You get the top 100 highest-voted Stack Overflow questions of all time with full metadata, in under 10 seconds, for about $0.10.
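For orientation, the default getQuestions run described above plausibly corresponds to a single request against the public /questions endpoint. The sketch below only builds that URL; the exact parameters the actor sends are an assumption, and top_questions_url is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Root of the public Stack Exchange API, version 2.3.
API_ROOT = "https://api.stackexchange.com/2.3"


def top_questions_url(site: str = "stackoverflow", page_size: int = 100) -> str:
    """Build the URL for the highest-voted questions on one site.

    Assumption: the actor's getQuestions action maps to /questions
    sorted by votes in descending order.
    """
    params = {
        "site": site,
        "sort": "votes",
        "order": "desc",
        "pagesize": page_size,  # the API allows at most 100 items per page
    }
    return f"{API_ROOT}/questions?{urlencode(params)}"


print(top_questions_url())
```

Opening the printed URL in a browser returns the same kind of JSON the actor reshapes into dataset items.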

Why use Stack Overflow Scraper?

  • Recruiters and sourcers: Stack Overflow Talent shut down in 2022, so you have to build your own developer pipeline. Use searchUsers and getUserAnswers to find high-reputation engineers in a specific tag, then follow the links on their profiles (for example website_url) to find contact details.
  • DevRel and product teams: Monitor questions tagged with your product (e.g. tensorflow, langchain, kubernetes) using getQuestionsByTag. Set up an Apify schedule to alert you on new high-vote questions about your SDK.
  • Content marketers: Use searchQuestions with sort=hot or sort=week to find trending questions worth writing about. Use getTopTags to discover where developer attention is shifting.
  • Q&A dataset builders: With getQuestionDetail plus includeAnswers: true and includeBody: true, you get clean markdown Q&A pairs suited to fine-tuning or RAG, without downloading and parsing Stack Overflow's full data dump (whose access terms were tightened in 2024).
  • Competitive intelligence: How active is the community around a competitor's stack? Run getQuestionsByTag for their products and yours over the same date window.

How to use Stack Overflow Scraper

  1. Open the Input tab.
  2. Pick an action from the dropdown. getQuestions is the simplest starting point.
  3. Set site (default stackoverflow). To scrape a different Stack Exchange site, type its short name without .com (e.g. superuser, askubuntu).
  4. For search/profile/tag/detail actions, fill queries (one entry per line). For top-questions and top-tags actions, leave queries empty.
  5. Set maxItems to cap the run. Default 100.
  6. (Optional) Paste a free Stack Apps API key to lift the quota from 300 to 10,000 requests per day. Register at https://stackapps.com/apps/oauth/register.
  7. Click Start. Results stream to the dataset and you can preview them on the Output tab.
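The same steps can also be run from code rather than the Input tab. The following is a minimal sketch using the apify-client Python package; ACTOR_ID is a placeholder to be replaced with the ID shown on the actor's page, the APIFY_TOKEN environment variable is an assumed convention, and passing queries as a list of strings is an assumption about how the "one entry per line" field is stored.

```python
import os

# Placeholder, not a real ID: copy the actual value from the actor's page.
ACTOR_ID = "<username>/<actor-name>"


def build_run_input(action="getQuestions", site="stackoverflow",
                    queries=None, max_items=100):
    """Assemble the input object using the field names from the Input tab."""
    run_input = {"action": action, "site": site, "maxItems": max_items}
    if queries:
        # Assumption: "one entry per line" becomes a list of strings.
        run_input["queries"] = list(queries)
    return run_input


def run_actor(run_input):
    """Start the actor, wait for it to finish, and return its dataset items."""
    # Imported here so the input helper works without the package installed.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Preview the input that would be sent for a tag search.
print(build_run_input("getQuestionsByTag", queries=["kubernetes"], max_items=50))
```

Calling run_actor(build_run_input()) would then perform the equivalent of steps 1 to 7 with the default settings.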

Input

| Field | Required | Description |
| --- | --- | --- |
| action | yes | Which API call to make. See the dropdown for the eight options. |
| site | yes | Stack Exchange site short name. Default stackoverflow. |
| queries | sometimes | Required for search/detail/profile actions. Free text for searchQuestions/searchUsers; a tag for getQuestionsByTag; an id or full URL for getQuestionDetail, getUserProfile, getUserAnswers. |
| maxItems | no | Max items per query. Default 100. |
| sort | no | API sort key (e.g. votes, activity, hot, week, reputation). |
| order | no | desc (default) or asc. |
| tagged | no | For getQuestions / searchQuestions: limit to questions with this tag. |
| since / until | no | ISO date filters for question-listing actions. |
| includeBody | no | If true, also fetch full question/answer body as markdown. |
| includeAnswers | no | For getQuestionDetail: also fetch all answers. Default true. |
| apiKey | no | Stack Apps app key to lift the daily quota. |
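The rules in this table can be checked before a run is started. The sketch below is a hypothetical client-side validator, not part of the actor; the eight action names are the ones mentioned elsewhere on this page, and their split into "needs queries" and "no queries" follows the descriptions above.

```python
# Actions that need at least one entry in queries (per the table above).
QUERY_ACTIONS = {
    "searchQuestions", "searchUsers", "getQuestionsByTag",
    "getQuestionDetail", "getUserProfile", "getUserAnswers",
}
# Actions that run with queries left empty.
NO_QUERY_ACTIONS = {"getQuestions", "getTopTags"}


def validate_input(run_input: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    action = run_input.get("action")
    if action not in QUERY_ACTIONS | NO_QUERY_ACTIONS:
        problems.append(f"unknown action: {action!r}")
    if not run_input.get("site"):
        problems.append("site is required")
    if action in QUERY_ACTIONS and not run_input.get("queries"):
        problems.append(f"{action} needs at least one entry in queries")
    if run_input.get("order", "desc") not in ("asc", "desc"):
        problems.append("order must be 'asc' or 'desc'")
    max_items = run_input.get("maxItems", 100)
    if not isinstance(max_items, int) or max_items < 1:
        problems.append("maxItems must be a positive integer")
    return problems


print(validate_input({"action": "searchUsers", "site": "stackoverflow"}))
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue in one pass.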

Output

Every dataset item carries a _type field (question, answer, user, tag, or error) plus _action and _site for filtering when one run mixes actions. Field names match the Stack Exchange API types, with Unix timestamps converted to ISO 8601.

{
  "_type": "question",
  "_action": "getQuestions",
  "_site": "stackoverflow",
  "question_id": 11227809,
  "title": "Why is processing a sorted array faster than processing an unsorted array?",
  "link": "https://stackoverflow.com/questions/11227809/...",
  "tags": ["java", "c++", "performance", "cpu-architecture", "branch-prediction"],
  "score": 27000,
  "view_count": 1900000,
  "answer_count": 27,
  "comment_count": 4,
  "is_answered": true,
  "accepted_answer_id": 11227902,
  "creation_date": "2012-06-27T13:51:36.000Z",
  "owner": {
    "user_id": 1539405,
    "display_name": "GManNickG",
    "reputation": 510000,
    "link": "https://stackoverflow.com/users/1539405/gmannickg"
  }
}
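As a rough illustration of working with items shaped like the one above, the sketch below groups a mixed dataset by _type and shows the timestamp conversion in both directions. The helper names are hypothetical, and the millisecond "Z" format is inferred from the sample item rather than from the actor's code.

```python
from collections import defaultdict
from datetime import datetime, timezone


def unix_to_iso(ts: int) -> str:
    """Convert a Unix timestamp to the ISO 8601 style seen in the sample item."""
    dt = datetime.fromtimestamp(ts, tz=timezone.utc)
    return dt.isoformat(timespec="milliseconds").replace("+00:00", "Z")


def parse_iso(value: str) -> datetime:
    """Parse an ISO 8601 string with a trailing Z into an aware datetime."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def group_by_type(items):
    """Split dataset items into lists keyed by their _type field."""
    groups = defaultdict(list)
    for item in items:
        groups[item.get("_type", "unknown")].append(item)
    return dict(groups)


# Trimmed-down items for illustration only.
items = [
    {"_type": "question", "question_id": 11227809,
     "creation_date": "2012-06-27T13:51:36.000Z"},
    {"_type": "answer", "answer_id": 11227902, "question_id": 11227809},
]
groups = group_by_type(items)
print(sorted(groups))
print(parse_iso(items[0]["creation_date"]).year)
```

Grouping first makes it easy to send questions, answers, and error items to separate downstream steps.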

You can download the dataset in JSON, CSV, XML, Excel, RSS or HTML format from the Output tab or the Apify API.

Data fields

| Type | Key fields |
| --- | --- |
| question | question_id, title, link, tags, score, view_count, answer_count, is_answered, accepted_answer_id, creation_date, owner, body (optional) |
| answer | answer_id, question_id, is_accepted, score, link, creation_date, owner, body (optional) |
| user | user_id, display_name, reputation, link, location, website_url, about_me, badge_counts, question_count, answer_count |
| tag | name, count (total questions tagged), is_required, has_synonyms |

Pricing

Pay-per-result: $0.001 per item. One question = one event. One answer = one event. One user profile = one event. There is no flat monthly fee or rental, and the actor itself adds no charge for run time; you pay only Apify's standard compute, about $0.0002 for a typical run at 512 MB.

Cost examples:

  • Top 100 questions on a tag: $0.10
  • 500-user shortlist for a recruiting campaign (searchUsers + getUserAnswers, ~6 answers per user): $3.50
  • 1,000 Q&A pairs for a fine-tuning dataset (getQuestionDetail with includeAnswers, ~5 answers per question): $6.00
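The arithmetic behind these examples is simply the item count multiplied by $0.001. A small sketch, using a hypothetical estimate_cost helper and ignoring Apify's compute charge:

```python
from decimal import Decimal

# Price per dataset item, as stated in the Pricing section.
PRICE_PER_ITEM = Decimal("0.001")


def estimate_cost(parents: int, children_per_parent: int = 0) -> Decimal:
    """Estimate the result-item cost in USD.

    parents: questions or users returned.
    children_per_parent: average answers fetched for each of them.
    """
    items = parents * (1 + children_per_parent)
    return items * PRICE_PER_ITEM


print(f"{estimate_cost(100):.2f}")      # top 100 questions on a tag
print(f"{estimate_cost(500, 6):.2f}")   # 500 users, ~6 answers each
print(f"{estimate_cost(1000, 5):.2f}")  # 1,000 questions, ~5 answers each
```

Decimal is used instead of float so that small per-item prices add up without rounding noise.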

Tips

  • Get a Stack Apps key if you plan to run more than ~30 small runs per day from the same IP. It lifts the API quota from 300 to 10,000 requests/day. Free to register, no review process.
  • Use includeBody: false unless you actually need the markdown body. Responses are roughly 10x smaller and return faster.
  • getQuestionDetail is the cheapest way to get Q&A pairs: one API call returns the question and a second returns its answers, page by page. With maxItems: 50 you cap the run at 50 answers per question.
  • Cross-site research: schedule the same input with different site values to compare communities (e.g. react tag on stackoverflow vs. softwareengineering.stackexchange).
  • Date windows: combine tagged: tensorflow with since: 2025-01-01 to see only what users have asked since the start of 2025.
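The quota tip above comes down to simple arithmetic. The sketch below assumes the actor requests pages of 100 items (the API's maximum page size) and that a "small run" uses about 10 requests, which is what 300 requests for roughly 30 runs implies; both figures are assumptions, not documented behaviour.

```python
import math

# Largest page size the Stack Exchange API allows.
API_MAX_PAGE_SIZE = 100


def requests_needed(max_items: int, queries: int = 1,
                    page_size: int = API_MAX_PAGE_SIZE) -> int:
    """Estimate API requests for a run: one request per page, per query."""
    return queries * math.ceil(max_items / page_size)


def runs_per_day(daily_quota: int, requests_per_run: int) -> int:
    """How many runs of a given size fit into one day's quota."""
    return daily_quota // requests_per_run


print(runs_per_day(300, 10))     # anonymous quota
print(runs_per_day(10_000, 10))  # quota with a Stack Apps key
print(requests_needed(250))      # pages needed for 250 items, one query
```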

FAQ, disclaimers, support

Is this legal? The actor calls Stack Exchange's official public API (api.stackexchange.com) with documented endpoints. Public read access is explicitly permitted by Stack Exchange's API Terms of Service. We send a clear User-Agent string identifying the actor. Stack Exchange content is licensed under CC BY-SA, so attribution is required when republishing.

Why the 300-request anonymous limit? That's Stack Exchange's policy, not Apify's. Register a free app at https://stackapps.com/apps/oauth/register for 10,000/day per IP.

Will I get rate-limited? The actor reads the API's backoff hint and sleeps automatically. We also retry on 429/502/503/504 with exponential backoff. Quota and backoff are logged for transparency.
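The behaviour described in this answer could look roughly like the following. This is a sketch of the general technique, not the actor's actual code: retry_delay, the 1-second base, and the 60-second cap are illustrative assumptions, while backoff is the field the Stack Exchange API includes in a response when it wants clients to pause.

```python
from typing import Optional

# HTTP statuses the answer above says are retried.
RETRYABLE_STATUSES = {429, 502, 503, 504}


def retry_delay(status: int, body: dict, attempt: int,
                base: float = 1.0, cap: float = 60.0) -> Optional[float]:
    """Seconds to wait before the next request, or None to give up.

    status: HTTP status of the last response.
    body: decoded JSON wrapper, which may carry a "backoff" hint in seconds.
    attempt: zero-based retry counter for the current request.
    """
    backoff = float(body.get("backoff", 0))
    if status == 200:
        # Success: honour the API's hint before the next call, if any.
        return backoff
    if status in RETRYABLE_STATUSES:
        # Exponential backoff, never shorter than the API's own hint.
        return max(backoff, min(cap, base * 2 ** attempt))
    return None  # other errors are not retried


print(retry_delay(200, {"backoff": 10}, attempt=0))
print(retry_delay(503, {}, attempt=3))
print(retry_delay(404, {}, attempt=0))
```

A caller would sleep for the returned number of seconds and stop retrying when the function returns None.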

Does it cover Reddit / Quora / GitHub Discussions? No. This actor is only for Stack Exchange's 170+ sites. Each platform deserves its own actor for clean data shapes.

Bug or feature request? Open an Issue on the actor's Issues tab. I usually respond within a day.

Need a custom scraper for another platform, such as Bluesky, Substack, or Mastodon? See my other actors at https://apify.com/perconey, or open an Issue.