Zhihu Q&A Tracker - China Hot List & Knowledge Mining
Pricing
from $100.00 / 1,000 zhihu q&a records
Scrape Zhihu (知乎), China's Quora: the daily hot list plus keyword Q&A search. Each record has the question, top-answer excerpt, voteup count, view count and category. For China social listening, consumer research and brand monitoring. No CN account needed.
Developer
NexGenData
Maintained by Community
🇨🇳 Zhihu Q&A Tracker — Hot List & Knowledge-Mining for China Research
Pull Zhihu (知乎) — China's Quora — as structured data: the daily hot list (热榜) plus keyword-driven Q&A search, each record carrying the question, top-answer excerpt, voteup count, view count and category.
Zhihu is where mainland China goes for long-form, expert-leaning answers — the highest-signal Chinese surface for why people think something, not just that a topic is trending. This actor turns Zhihu's hot list and content search into clean JSON for China social-listening, consumer research and brand-monitoring work, without you needing a Zhihu account, a Chinese phone number or a CN VPN.
Built for market researchers, consumer-insight teams, brand monitors and offshore China-watchers who want the Zhihu signal as rows in a dataset, not a screenshot.
What you can do with it
- Consumer & market research: mine high-voteup answers for genuine expert opinion on a product, category or brand in China.
- Brand monitoring: run keyword search on your brand / competitors and read the top-answer excerpt and voteup count to gauge sentiment and depth of discussion.
- Trend tracking / social listening: capture the daily hot list (`mode: "hot"`) on a schedule to see what questions China is asking right now.
- Offshore China-watching: read long-form Chinese knowledge content, surfaced and categorised, filterable by a voteup quality threshold.
What you get per record
| Field | Type | Description |
|---|---|---|
| `question_id` | `string` | Zhihu question ID |
| `question_title` | `string` | The question (verbatim Chinese) |
| `question_url` | `string` | Canonical `zhihu.com/question/{id}` URL |
| `category` | `string` | Canonical category classifier: tech / finance / business / education / career / lifestyle / health / entertainment / science / politics |
| `answer_count` | `int \| null` | Number of answers on the question, when surfaced |
| `view_count` | `int \| null` | Question view / heat metric, when surfaced |
| `top_answer_excerpt` | `string \| null` | Up to 500 chars of the highest-voted answer (HTML stripped) |
| `top_answer_author` | `string \| null` | Handle of the top answer's author (匿名用户 if anonymous) |
| `top_answer_voteup_count` | `int \| null` | Voteup count of the top answer, Zhihu's strongest quality signal |
| `is_hot` | `bool` | `true` if the record appeared on Zhihu's hot list |
| `created_at` | `string \| null` | Question creation time, when surfaced |
| `data_source` | `string` | Provenance: the exact probe path used (e.g. `api.zhihu.com/topstory/hot-list`, `zhihu.com/hot (initialData)`, `zhihu.com/search`) |
Most metric fields are nullable on purpose — Zhihu's anti-bot wall means not every path exposes every field on every run.
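Because several fields can be null, downstream code is easier to keep correct if the record shape is explicit. The following is a minimal sketch of a typed record, not part of the actor itself: the names `ZhihuRecord` and `parse_record` are hypothetical, and the split between required and optional fields is taken only from the table above.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class ZhihuRecord:
    # Fields the table lists without "| null".
    question_id: str
    question_title: str
    question_url: str
    category: str
    is_hot: bool
    data_source: str
    # Fields the table lists as nullable; None means "not surfaced on this run".
    answer_count: Optional[int] = None
    view_count: Optional[int] = None
    top_answer_excerpt: Optional[str] = None
    top_answer_author: Optional[str] = None
    top_answer_voteup_count: Optional[int] = None
    created_at: Optional[str] = None


def parse_record(item: dict[str, Any]) -> ZhihuRecord:
    """Build a ZhihuRecord from one dataset item, keeping nulls as None."""
    return ZhihuRecord(
        question_id=item["question_id"],
        question_title=item["question_title"],
        question_url=item["question_url"],
        category=item["category"],
        is_hot=bool(item["is_hot"]),
        data_source=item["data_source"],
        # .get() covers both an explicit null and a missing key.
        answer_count=item.get("answer_count"),
        view_count=item.get("view_count"),
        top_answer_excerpt=item.get("top_answer_excerpt"),
        top_answer_author=item.get("top_answer_author"),
        top_answer_voteup_count=item.get("top_answer_voteup_count"),
        created_at=item.get("created_at"),
    )
```

Keeping missing metrics as `None` rather than coercing them to `0` preserves the difference between "no votes" and "not exposed by this probe path".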
Input
| Parameter | Type | Default | Description |
|---|---|---|---|
| `mode` | string | `both` | `hot` = daily hot list only; `search` = keyword search only; `both` = hot list, then top up with keyword search |
| `keywords` | array | `["人工智能","新能源汽车","投资"]` | Search terms (Chinese gives native-quality hits). Ignored when `mode: "hot"` |
| `categories` | array | `["tech","finance","business"]` | Restrict output to these category slugs. Empty = no filter |
| `limit` | integer | `30` | Max Q&A records (1–500) |
| `min_voteup` | integer | `0` | Drop records whose top answer has fewer than N voteups (0 = keep all) |
| `include_hot_only` | boolean | `false` | When true, keep only records that appeared on the hot list |
| `proxyConfiguration` | object | `RESIDENTIAL` | Apify proxy. RESIDENTIAL strongly recommended; Zhihu blocks datacenter IPs |
Sample input
```json
{
  "mode": "both",
  "keywords": ["人工智能", "新能源汽车", "投资"],
  "categories": ["tech", "finance", "business"],
  "limit": 30,
  "min_voteup": 0,
  "include_hot_only": false,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
```
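If you build the input programmatically, it can help to check it against the documented ranges before starting a run. The sketch below is an illustration only: `build_run_input` is a hypothetical helper, and the checks mirror the parameter table (three modes, `limit` between 1 and 500) rather than any validation the actor is known to perform.

```python
from typing import Any, Optional

VALID_MODES = ("hot", "search", "both")


def build_run_input(
    mode: str = "both",
    keywords: Optional[list[str]] = None,
    categories: Optional[list[str]] = None,
    limit: int = 30,
    min_voteup: int = 0,
    include_hot_only: bool = False,
) -> dict[str, Any]:
    """Return an input dict shaped like the sample input."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    if not 1 <= limit <= 500:
        raise ValueError("limit must be between 1 and 500")
    if min_voteup < 0:
        raise ValueError("min_voteup must be >= 0")

    run_input: dict[str, Any] = {
        "mode": mode,
        "limit": limit,
        "min_voteup": min_voteup,
        "include_hot_only": include_hot_only,
        "proxyConfiguration": {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
        },
    }
    # keywords are documented as ignored in "hot" mode, so leave them out.
    if keywords and mode != "hot":
        run_input["keywords"] = list(keywords)
    # An explicit empty list means "no filter"; None leaves the actor default.
    if categories is not None:
        run_input["categories"] = list(categories)
    return run_input
```

For example, `build_run_input(mode="hot", categories=[])` yields a hot-list-only input with no category filter and no `keywords` key.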
Sample output (truncated, schema-accurate)
```json
[
  {
    "question_id": "650000001",
    "question_title": "如何看待新能源汽车 2026 年的价格战?",
    "question_url": "https://www.zhihu.com/question/650000001",
    "category": "finance",
    "answer_count": 412,
    "view_count": 3800000,
    "top_answer_excerpt": "价格战的本质是产能过剩下的份额博弈……(节选)",
    "top_answer_author": "汽车行业分析师",
    "top_answer_voteup_count": 5821,
    "is_hot": true,
    "created_at": "2026-04-02T09:15:00Z",
    "data_source": "api.zhihu.com/topstory/hot-list"
  }
]
```
How it gets the data
Zhihu is mainland-hosted and applies aggressive anti-bot detection (403 / interstitial captcha / empty payloads to datacenter IPs). The actor uses a probe waterfall so one blocked path does not kill the run:
- Mobile API hot list: `api.zhihu.com/topstory/hot-list`.
- Mobile API search: `api.zhihu.com/search_v3` for keyword queries.
- Public hot-list pages: `zhihu.com/hot` and `zhihu.com/billboard`, parsing the embedded `initialData` blob.
- Public content search: `zhihu.com/search?type=content`.
- Question-detail enrichment: `zhihu.com/question/{id}` to fill in answer count, top-answer excerpt, author and voteup.
- Maintenance-stub fallback: a single `status: "maintenance"` row if every path is blocked, so pipelines never crash.
All paths run behind Apify's RESIDENTIAL proxy pool by default.
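The waterfall idea can be sketched as a loop that tries each path in order and falls back to a stub row. This is an illustration of the pattern only, not the actor's source code: `ProbeBlocked`, `run_waterfall` and the probe callables are hypothetical names, and the sketch ignores how the actor combines hot-list and search results in `both` mode.

```python
from typing import Any, Callable

Record = dict[str, Any]
Probe = Callable[[], list[Record]]


class ProbeBlocked(Exception):
    """Raised by a probe when its path is blocked (403, captcha, ...)."""


def run_waterfall(probes: list[tuple[str, Probe]]) -> list[Record]:
    """Try each (name, probe) in order and return the first non-empty result."""
    for name, probe in probes:
        try:
            records = probe()
        except ProbeBlocked:
            continue  # this path is blocked, move on to the next one
        if records:
            # Stamp every row with the path that actually produced it.
            return [{**record, "data_source": name} for record in records]
    # Every path was blocked or empty: emit the stub row instead of failing.
    return [{"status": "maintenance"}]
```

Treating an empty payload the same as a block matches the failure modes listed above, where datacenter IPs may receive empty responses rather than an explicit error.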
FAQ
Can I scrape Zhihu without an account or Chinese phone number?
Yes. The actor runs server-side behind Apify proxies and returns JSON; you need no Zhihu login, phone number or VPN on your side.
What is Zhihu?
Zhihu (知乎) is China's largest long-form Q&A / knowledge community: the closest mainland equivalent to Quora, but with far deeper expert participation.
How do I get only the trending hot list?
Set mode: "hot" (or include_hot_only: true) to keep only hot-list questions and drop keyword-search noise.
How do I filter for high-quality answers?
Use min_voteup — 100+ usually means substantive expert content, 1000+ usually means a viral / canonical answer.
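The same thresholds can also be applied after the run, for instance to bucket an already-collected dataset. The sketch below is hypothetical: the tier labels are illustrative names, and treating a null voteup count as "below any positive threshold" is an assumption, since the source does not say how the actor handles nulls for `min_voteup`.

```python
from typing import Any, Iterable, Optional


def voteup_tier(count: Optional[int]) -> str:
    """Map a voteup count to a rough quality tier (labels are illustrative)."""
    if count is None:
        return "unknown"  # metric not surfaced on this run
    if count >= 1000:
        return "viral"
    if count >= 100:
        return "substantive"
    return "low"


def filter_by_voteup(
    records: Iterable[dict[str, Any]], min_voteup: int
) -> list[dict[str, Any]]:
    """Keep records whose top answer has at least min_voteup voteups."""
    if min_voteup <= 0:
        return list(records)  # 0 = keep all, including null counts
    return [
        record
        for record in records
        # Assumption: a null count is treated as 0 and therefore dropped.
        if (record.get("top_answer_voteup_count") or 0) >= min_voteup
    ]
```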
How fresh is the data?
Each run captures the hot list / search live and the record's data_source shows the exact path used. The actor has no internal scheduler — schedule it in Apify for a continuous feed.
Related actors — NexGenData China social-listening fleet
Pair Zhihu with the rest of the NexGenData Chinese-social fleet for full cross-platform coverage:
- Weibo Hot Search Tracker — China's #1 social-trending barometer (微博热搜榜): the fast real-time counterpart to Zhihu's long-form signal.
- RedNote (Xiaohongshu) Scraper — trending posts, feeds and notes from Xiaohongshu (小红书 / RedNote) for beauty/fashion/lifestyle consumer intent.
- Bilibili Video Search — keyword search across Bilibili (B站), the long-form video, gaming and education hub of mainland China.
- Douyin Trending Tracker — the live Douyin (抖音 / Chinese TikTok) hot list for short-video trend signal.
- Kuaishou Trending Tracker — trending short-video signal from Kuaishou (快手), skewing lower-tier-city audiences.
- Douban Tracker — China movie/TV ratings and hot lists from Douban (豆瓣), the cultural-taste signal.
- China Trends Tracker — cross-platform Chinese trend roll-up (Weibo / Baidu / Toutiao / Douyin).
- Chinese Social Signals MCP — MCP server that plugs the whole Chinese-social fleet directly into Claude, ChatGPT and Cursor.
Notes & limits
- Residential proxy strongly recommended. Datacenter IPs are blocked / captcha-gated; the default proxy group is RESIDENTIAL.
- Maintenance fallback is by design. A `status: "maintenance"` row means the feed was temporarily blocked; retry shortly.
- Excerpt is capped at 500 characters of the highest-voted answer, HTML stripped.
- Pay-per-event billing. You pay per delivered Q&A record.
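
On the consuming side, the maintenance row noted above should be separated from real records before analysis. A minimal sketch, assuming the stub row is identified by a `status` field equal to `"maintenance"` as described; `split_maintenance` is a hypothetical helper name.

```python
from typing import Any, Iterable


def split_maintenance(
    items: Iterable[dict[str, Any]],
) -> tuple[list[dict[str, Any]], bool]:
    """Return (real records, whether a maintenance stub row was seen)."""
    records: list[dict[str, Any]] = []
    blocked = False
    for item in items:
        if item.get("status") == "maintenance":
            blocked = True  # the feed was blocked for this run
        else:
            records.append(item)
    return records, blocked
```

A pipeline can then skip the load step and schedule a retry when `blocked` is true and `records` is empty.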