❓ Zhihu Question Answers Scraper
Pricing
$3.00 / 1,000 results
Extract Zhihu question answers data — title, author, engagement, and more. Scrape by keyword, URL or ID. Export to JSON, CSV & Excel, use the API, schedule runs and integrate. No code required.
Developer
Jackie Chen
Maintained by Community
Zhihu Question Answers Scraper

Scrape every answer on a Zhihu (知乎) question — with the full answer body, not just a snippet. Each answer becomes one structured item: complete HTML content, a plain-text excerpt, the author's name / handle / follower count, upvotes, comment count, and timestamps. This is ideal raw material for LLM training data, Q&A-pair datasets, and content research.
Unofficial. This Actor is not affiliated with, authorized, or endorsed by Zhihu (知乎 / Zhihu Inc.). It is an independent tool that retrieves publicly available data via a third-party API. Use it in compliance with Zhihu's terms and all applicable laws; you are responsible for how you use the retrieved data.
What it does
- Question answers (primary) — give one or more Zhihu question IDs; the Actor paginates through every answer on each question and emits one item per answer with the full body text.
- Answer comments (optional) — give answer IDs to also pull their top-level comments, one item per comment.
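
The per-question pagination described above can be sketched roughly as follows. This is a minimal illustration, not the Actor's actual code: `fetch_page`, the offset/limit scheme, the page size, and the "empty page means done" rule are all assumptions, and `fake_fetch` merely stands in for the upstream API.

```python
from typing import Callable, Iterator

Page = list[dict]


def iter_answers(
    question_id: str,
    fetch_page: Callable[[str, int, int], Page],  # hypothetical page fetcher
    page_size: int = 20,                          # assumed page size
) -> Iterator[dict]:
    """Yield every answer of one question by walking offset-based pages."""
    offset = 0
    while True:
        page = fetch_page(question_id, offset, page_size)
        if not page:  # assumption: an empty page marks the end
            break
        yield from page
        offset += len(page)


# Stand-in for the real upstream call: 45 fake answers served in slices.
_FAKE_ANSWERS = [{"answerId": str(i), "type": "answer"} for i in range(45)]


def fake_fetch(question_id: str, offset: int, limit: int) -> Page:
    return _FAKE_ANSWERS[offset:offset + limit]


answers = list(iter_answers("19550517", fake_fetch))
print(len(answers))
```

Because each question is walked by its own generator, several question IDs can be chained one after another without sharing any pagination state.
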
Input
| Field | Type | Default | Description |
|---|---|---|---|
| `questionIds` | string[] | `["19550517"]` | Zhihu question IDs (the numeric ID from `zhihu.com/question/<id>`). Each question is paginated independently. |
| `order` | enum | `default` | `default` (Zhihu ranking) or `updated` (most recently updated). |
| `answerIds` | string[] | `[]` | Optional. Answer IDs to pull top-level comments for. |
| `maxItems` | integer | `50` | Maximum total items (answers + comments) across all inputs. |
| `maxCommentsPerAnswer` | integer | `50` | Cap on comments fetched per answer ID. |
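
A small helper can merge overrides with the defaults from the table and catch obvious mistakes before a run is started. This is only a sketch: `build_input` is a hypothetical name, and the validation rules (digit-only ID strings, positive integer limits) are assumptions inferred from the table rather than the Actor's documented schema.

```python
import copy
import json

# Defaults copied from the input table above.
DEFAULTS = {
    "questionIds": ["19550517"],
    "order": "default",
    "answerIds": [],
    "maxItems": 50,
    "maxCommentsPerAnswer": 50,
}
ALLOWED_ORDERS = ("default", "updated")


def build_input(**overrides) -> dict:
    """Merge overrides into the defaults and apply basic sanity checks."""
    unknown = sorted(set(overrides) - set(DEFAULTS))
    if unknown:
        raise ValueError(f"unknown input fields: {unknown}")
    run_input = {**copy.deepcopy(DEFAULTS), **overrides}

    if run_input["order"] not in ALLOWED_ORDERS:
        raise ValueError(f"order must be one of {ALLOWED_ORDERS}")

    # Assumption: both ID lists hold digit-only strings.
    for field in ("questionIds", "answerIds"):
        ids = run_input[field]
        if not isinstance(ids, list) or not all(
            isinstance(i, str) and i.isdigit() for i in ids
        ):
            raise ValueError(f"{field} must be a list of numeric strings")

    # Assumption: limits must be positive integers.
    for field in ("maxItems", "maxCommentsPerAnswer"):
        value = run_input[field]
        if isinstance(value, bool) or not isinstance(value, int) or value < 1:
            raise ValueError(f"{field} must be a positive integer")
    return run_input


example = build_input(questionIds=["19550517", "20278289"], maxItems=200)
print(json.dumps(example, indent=2))
```
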
Example input

    {
      "questionIds": ["19550517", "20278289"],
      "order": "default",
      "maxItems": 200
    }

Output
One dataset item per answer:

    {
      "answerId": "12202199",
      "type": "answer",
      "url": "https://www.zhihu.com/question/19550517/answer/12202199",
      "questionId": 19550517,
      "questionTitle": "Instagram 有多少用户?",
      "excerpt": "1500 万。今年 3 月份报出来的数字 ...",
      "content": "<p>... full answer body as HTML ...</p>",
      "voteupCount": 2,
      "commentCount": 7,
      "createdTime": 1334057938,
      "updatedTime": 1334057938,
      "authorName": "黄继新",
      "authorUrlToken": "jixin",
      "authorId": "...",
      "authorFollowerCount": 1006585,
      "authorHeadline": "知乎 联合创始人",
      "authorUrl": "https://www.zhihu.com/people/jixin",
      "source": "question:19550517"
    }

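
For Q&A-pair datasets, an answer item usually needs its HTML body flattened to text and its timestamps made readable. The sketch below uses only the Python standard library. `to_qa_pair` and the output keys are hypothetical, and it assumes `createdTime` is a Unix timestamp in seconds, which the sample value suggests but the listing does not state.

```python
from datetime import datetime, timezone
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes, inserting line breaks at <p> and <br> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("p", "br"):
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html: str) -> str:
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    lines = (line.strip() for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)


def to_qa_pair(item: dict) -> dict:
    """Map one answer item to a compact question/answer record."""
    created = datetime.fromtimestamp(item["createdTime"], tz=timezone.utc)
    return {
        "question": item["questionTitle"],
        "answer": html_to_text(item["content"]),
        "author": item["authorName"],
        "upvotes": item["voteupCount"],
        "created": created.isoformat(),  # assumes Unix seconds
    }


sample = {
    "questionTitle": "Instagram 有多少用户?",
    "content": "<p>First paragraph.</p><p>Second<br>line &amp; more.</p>",
    "authorName": "黄继新",
    "voteupCount": 2,
    "createdTime": 1334057938,
}
pair = to_qa_pair(sample)
print(pair["answer"])
```
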
When `answerIds` is supplied, comment items are also emitted with `type: "comment"`, `commentId`, `answerId`, `content`, `voteupCount`, `childCommentCount`, and author fields.
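
Since answers and comments share one dataset, a consumer will typically split them by `type` and join comments to their answer through `answerId`. A possible sketch follows; `attach_comments` is a hypothetical helper, and comments whose parent answer is not in the dataset are simply dropped here.

```python
from collections import defaultdict


def attach_comments(items: list[dict]) -> list[dict]:
    """Return answer items, each with a 'comments' list joined by answerId."""
    answers: dict[str, dict] = {}
    comments: dict[str, list[dict]] = defaultdict(list)
    for item in items:
        # IDs are compared as strings in case their types differ between items.
        answer_id = str(item["answerId"])
        if item.get("type") == "answer":
            answers[answer_id] = {**item, "comments": []}
        elif item.get("type") == "comment":
            comments[answer_id].append(item)
    for answer_id, answer in answers.items():
        answer["comments"] = comments.get(answer_id, [])
    return list(answers.values())


dataset = [
    {"type": "answer", "answerId": "12202199", "voteupCount": 2},
    {"type": "comment", "commentId": "c1", "answerId": "12202199", "content": "hi"},
    {"type": "comment", "commentId": "c2", "answerId": "12202199", "content": "ok"},
    {"type": "comment", "commentId": "c3", "answerId": "999", "content": "orphan"},
]
merged = attach_comments(dataset)
print(len(merged[0]["comments"]))
```
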
Notes
- Data is sourced live; the upstream API occasionally returns a transient retry signal, so the Actor retries with exponential backoff and a browser User-Agent.
- Items are de-duplicated by id within a run.
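
The two behaviours in these notes can be illustrated with a short sketch. It is not the Actor's implementation: `TransientError`, the attempt limit, the one-second base delay, and the choice of de-duplication key (`commentId` for comments, `answerId` otherwise) are assumptions made for the example.

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for a retryable upstream response."""


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,      # assumed limit
    base_delay: float = 1.0,    # assumed base delay in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` on TransientError, doubling the wait after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")


def dedupe(items: Iterable[dict]) -> Iterator[dict]:
    """Yield each item once per run, keyed by its type and ID."""
    seen: set[tuple] = set()
    for item in items:
        id_field = "commentId" if item.get("type") == "comment" else "answerId"
        key = (item.get("type"), str(item[id_field]))
        if key not in seen:
            seen.add(key)
            yield item


# Demo: a call that fails twice, then returns a page with a duplicate answer.
state = {"calls": 0}


def flaky_fetch() -> list[dict]:
    state["calls"] += 1
    if state["calls"] < 3:
        raise TransientError("retry later")
    return [
        {"type": "answer", "answerId": "1"},
        {"type": "answer", "answerId": "1"},
        {"type": "comment", "commentId": "1", "answerId": "1"},
    ]


delays: list[float] = []
page = retry_with_backoff(flaky_fetch, sleep=delays.append)
unique = list(dedupe(page))
print(delays, len(unique))
```

Passing `sleep` as a parameter keeps the demo instant and makes the delay sequence easy to inspect. A production version would normally add random jitter to each delay.
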