❓ Zhihu User Content Scraper

Pricing

$3.00 / 1,000 results

Extract Zhihu user content data: titles and more. Scrape by keyword, URL or ID. Export to JSON, CSV and Excel, use the API, schedule runs and integrate with other tools. No code required.

Rating

0.0

(0)

Developer

Jackie Chen

Maintained by Community

Actor stats

0

Bookmarked

2

Total users

1

Monthly active users

a day ago

Last modified

Zhihu User Content Scraper

zhihu-user-content-scraper

Scrape Zhihu (知乎) users by their url_token. Get every published article (with the full HTML body), or switch modes to pull a user's profile, their followers, their followees, or the columns they follow. Each record is returned as clean, structured data: title, content, excerpt, vote-up and comment counts, author info, and the canonical URL.

Unofficial. This Actor is not affiliated with, authorized, or endorsed by Zhihu (知乎 / 智者天下科技). It is an independent tool that retrieves publicly available data via a third-party API. Use it in compliance with Zhihu's terms and all applicable laws; you are responsible for how you use the retrieved data.

What it does

Provide one or more Zhihu url_tokens (the slug in a profile URL of the form zhihu.com/people/<token>, e.g. kaifulee for 李开复) and pick a content type:

  • Articles (default) — every article the user has published, including the full HTML content, title, excerpt, vote-up / comment counts and timestamps. Paginated automatically. Sort by newest or most up-voted.
  • Profile — the user's profile object (name, headline, avatar, follower / answer counts, etc.). One record per user.
  • Followers / Followees — the people who follow, or are followed by, the user.
  • Columns — the columns the user follows.
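Since the url_token is just the slug in a profile URL, it can be pulled out of a pasted link before a run. The following is a minimal sketch, not part of the Actor: the helper name extract_url_token is hypothetical, and it assumes profile URLs always have the form zhihu.com/people/<token>, optionally followed by further path segments.

```python
from urllib.parse import urlparse


def extract_url_token(profile: str) -> str:
    """Return the url_token from a Zhihu profile URL (hypothetical helper).

    A value without any "/" is assumed to be a bare token already.
    """
    text = profile.strip()
    if "/" not in text:
        return text  # e.g. "kaifulee"
    # urlparse only recognises the host when a scheme is present.
    if "://" not in text:
        text = "https://" + text
    parsed = urlparse(text)
    host = parsed.hostname or ""
    if host != "zhihu.com" and not host.endswith(".zhihu.com"):
        raise ValueError(f"not a zhihu.com URL: {profile!r}")
    parts = [p for p in parsed.path.split("/") if p]
    # Assumed path shape: /people/<token>[/...]
    if len(parts) >= 2 and parts[0] == "people":
        return parts[1]
    raise ValueError(f"not a Zhihu profile URL: {profile!r}")


print(extract_url_token("https://www.zhihu.com/people/kaifulee"))
print(extract_url_token("zhihu.com/people/kaifulee/posts"))
```

Both calls print kaifulee; anything that does not match the assumed path shape raises ValueError rather than guessing.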

Input

Field | Type | Default | Description
userUrlTokens | string[] | ["kaifulee"] | Zhihu user url_tokens. Each is scraped independently.
contentType | enum | articles | articles / profile / followers / followees / columns.
sortType | enum | created | Article sort: created (newest) or voteups (most up-voted). Articles only.
maxItems | integer | 10 | Max total records across all users. Ignored for profile.

Example input

{
  "userUrlTokens": ["kaifulee"],
  "contentType": "articles",
  "sortType": "voteups",
  "maxItems": 50
}
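The field constraints listed under Input could be checked on the caller's side before a run is started. This is only a sketch: build_input is a hypothetical helper, the lower bound of 1 for maxItems is an assumption, and leaving sortType out for non-article modes is a design choice here rather than documented Actor behaviour.

```python
import json

# Allowed values as listed in the Input section.
CONTENT_TYPES = {"articles", "profile", "followers", "followees", "columns"}
SORT_TYPES = {"created", "voteups"}


def build_input(user_url_tokens, content_type="articles",
                sort_type="created", max_items=10):
    """Validate values and return an Actor input dict (hypothetical helper)."""
    tokens = [t.strip() for t in user_url_tokens if t and t.strip()]
    if not tokens:
        raise ValueError("userUrlTokens needs at least one url_token")
    if content_type not in CONTENT_TYPES:
        raise ValueError(f"unknown contentType: {content_type!r}")
    if sort_type not in SORT_TYPES:
        raise ValueError(f"unknown sortType: {sort_type!r}")
    # Assumption: maxItems must be a positive integer (bool is rejected).
    if isinstance(max_items, bool) or not isinstance(max_items, int) or max_items < 1:
        raise ValueError("maxItems must be a positive integer")

    run_input = {
        "userUrlTokens": tokens,
        "contentType": content_type,
        "maxItems": max_items,
    }
    # sortType applies to articles only, so it is omitted for other modes.
    if content_type == "articles":
        run_input["sortType"] = sort_type
    return run_input


print(json.dumps(build_input(["kaifulee"], sort_type="voteups", max_items=50), indent=2))
```

The printed object carries the same four fields as the example input and can be pasted into the Actor's input editor or sent through the API.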

Output

One dataset item per record. Articles look like:

{
  "id": "606602766",
  "type": "article",
  "title": "ChatGPT引发失业恐慌?这20种工作要避开!",
  "excerpt": "OpenAI新近推出的ChatGPT已经爆火出圈 ...",
  "content": "<p>...full HTML body...</p>",
  "url": "http://zhuanlan.zhihu.com/p/606602766",
  "articleType": "normal",
  "voteupCount": 354,
  "commentCount": 124,
  "created": 1676559521,
  "updated": 1676559615,
  "author": { "id": "...", "name": "李开复", "urlToken": "kaifulee", "headline": "..." },
  "authorName": "李开复",
  "userUrlToken": "kaifulee",
  "source": "articles:kaifulee"
}
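Downstream code will often want a readable date and plain text instead of the raw fields. The sketch below post-processes one article record using only the Python standard library. It assumes that created is a Unix timestamp in seconds, which the example values suggest but the listing does not state, and the content value in the sample is a placeholder, not real article text. The names html_to_text and summarize are hypothetical.

```python
from datetime import datetime, timezone
from html.parser import HTMLParser

BLOCK_TAGS = {"p", "div", "li", "br", "h1", "h2", "h3", "blockquote"}


class _TextExtractor(HTMLParser):
    """Collects text nodes and inserts a break after block-level tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.chunks.append("\n")


def html_to_text(html: str) -> str:
    """Strip tags and collapse all whitespace runs to single spaces."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.chunks).split())


def summarize(record: dict) -> dict:
    """Return a compact view of an article record (hypothetical helper)."""
    # Assumption: "created" is seconds since the Unix epoch.
    created = datetime.fromtimestamp(record["created"], tz=timezone.utc)
    return {
        "id": record["id"],
        "title": record["title"],
        "created": created.isoformat(),
        "text": html_to_text(record.get("content", "")),
    }


sample = {
    "id": "606602766",
    "title": "ChatGPT引发失业恐慌?这20种工作要避开!",
    "created": 1676559521,
    "content": "<p>Hello <b>world</b></p><p>Next</p>",  # placeholder body
}
print(summarize(sample))
```

Under that assumption the example's created value corresponds to 2023-02-16T14:58:41+00:00.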

Other content types share the same flat shape (id, type, title, url, authorName, …) with mode-specific fields (e.g. followerCount for profiles, itemsCount for columns).

Notes

  • Data is sourced live. The upstream API occasionally returns transient blocks and intermittent HTTP 400 responses, so the Actor retries failed requests with exponential backoff.
  • Records are de-duplicated by id within a run.
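The two notes above describe common patterns, and a generic version is sketched here for illustration. This is not the Actor's source code: TransientError, with_backoff and dedupe_by_id are hypothetical names, and the attempt count, base delay and jitter are assumed values.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for a retryable upstream failure."""


def with_backoff(fetch, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch(), retrying on TransientError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            # Delay doubles each time: 1s, 2s, 4s, ... plus up to 10% jitter.
            delay = base_delay * 2 ** attempt
            sleep(delay + random.uniform(0, delay * 0.1))


def dedupe_by_id(records):
    """Yield each record once, keeping the first occurrence of every id."""
    seen = set()
    for record in records:
        if record["id"] in seen:
            continue
        seen.add(record["id"])
        yield record


# Demo: a fetch that fails twice, with sleeping replaced by a recorder.
calls = {"n": 0}


def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("simulated upstream block")
    return [{"id": "1"}, {"id": "2"}, {"id": "1"}]


delays = []
items = list(dedupe_by_id(with_backoff(flaky_fetch, sleep=delays.append)))
print(items, len(delays))
```

Passing sleep as a parameter keeps the demo instant; in real use the default time.sleep would apply.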