Douban Pro Scraper — Reviews, Discussions & Subject Data
Pricing
from $30.00 / 1,000 reviews scraped
Scrape long-form reviews, comments, and group discussions from Douban (豆瓣) — China's leading reviews + interest community. Movies, books, music, plus subject search. Built for Chinese-LLM training corpus, sentiment analysis, and academic NLP research. Pure HTTP, no auth.
Developer
Sami
Maintained by Community
Douban Scraper — Reviews, Comments & Group Discussions
Extract long-form reviews, ratings, comments, and group discussions from Douban (豆瓣) — China's leading reviews + interest community. Movies, books, and music. No API key, no browser, no VPN. Best Douban data source for Chinese AI training corpora and consumer research in 2026.
How to scrape Douban in 3 easy steps
- Go to the Douban Scraper page on Apify and click "Try for free".
- Configure your input: choose a mode (`subject_reviews`, `subject_comments`, `group_topic`, or `subject_search`), enter your Douban URLs or query, and set the number of results.
- Click "Run", wait for the scraper to finish, then download your data in JSON, CSV, or Excel format.
No coding required. No API key. Works with Apify's free plan.
🏢 Production pipeline running 1,000+ items per week?
I offer custom output schemas matched to your data warehouse, dedicated proxy infrastructure for sustained throughput, schema stability SLA (no breaking changes without 30-day notice), and volume pricing above 50K items/month.
DM me on Apify, open an Issue with subject "Enterprise inquiry", or email samimassis2002@gmail.com with subject "Douban enterprise".
Part of the Chinese Digital Intelligence Suite
Built by Zhorex — the only developer on Apify specializing in Chinese platforms:
- Weibo Scraper — China's Twitter (microblogging, hot search, public opinion)
- Bilibili Scraper — China's YouTube (video, danmaku, creator analytics)
- RedNote (Xiaohongshu) Scraper — China's Instagram + Pinterest
- RedNote Shop Scraper — RedShop e-commerce data
- Douban Scraper — You are here (reviews, ratings, group discussions)
Together, these cover the five pillars of Chinese digital intelligence: microblogging, video, social commerce, e-commerce, and reviews.
What is Douban?
Douban (豆瓣) is China's reviews and interest-community platform — Goodreads + Letterboxd + Rate Your Music + niche-Reddit fused into one site, with 200M+ monthly users. It's where Chinese readers, cinephiles, music fans, and hobby communities post the longest-form opinion content on the Chinese internet. Movies, books, music, TV shows, and tens of thousands of user-run discussion groups.
For anyone building a Chinese-language LLM, sentiment classifier, or consumer research dataset, Douban is the densest source of opinion-rich long-form Chinese text outside of Zhihu.
Modes
| Mode | What it does | Records |
|---|---|---|
| `subject_reviews` | Long-form reviews (500-5,000+ Chinese chars each) for a movie/book/music album | One per review |
| `subject_comments` | Short comments + star ratings under a subject's discussion page | One per comment |
| `subject_search` | Search Douban for movies / books / music by keyword | One per result |
| `group_topic` ⚠️ Beta | Pull a discussion thread + its replies from a Douban Group | One per topic (with nested replies) |
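Each mode takes its targets in a different input field. The sketch below shows one way to assemble a `run_input` dict per mode. The field names come from the input examples in this README; the `build_input` helper and the `TARGET_FIELD` table are hypothetical and not part of the Actor.

```python
from typing import Any

# Input field that carries the targets for each mode. Field names are taken
# from this README's input examples; the mapping itself is an illustration.
TARGET_FIELD = {
    "subject_reviews": "subjectUrls",
    "subject_comments": "subjectUrls",
    "subject_search": "searchQuery",
    "group_topic": "topicUrls",
}


def build_input(mode: str, target: Any, **options: Any) -> dict[str, Any]:
    """Return a run_input dict for the given mode, rejecting unknown modes."""
    if mode not in TARGET_FIELD:
        raise ValueError(f"unknown mode: {mode!r}")
    run_input = {"mode": mode, TARGET_FIELD[mode]: target}
    # Extra options (maxResults, searchType, ...) are passed through unchanged.
    run_input.update(options)
    return run_input
```

For example, `build_input("subject_search", "三体", searchType="all", maxResults=30)` produces the same shape as the subject search input example.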
v1.0 Known Limitations (read before buying)
- Movie comments require browser rendering. Douban serves movie short-comments through a JS-only mobile widget that v1.0 cannot extract, so `subject_comments` for movies returns a diagnostic record explaining the limitation. Use `subject_reviews` for movie data instead; long-form movie reviews are richer for AI training anyway. Books and music short-comments work normally.
- Movie review list pages are excerpt-only. Mobile movie list pages don't expose author, publication date, or full body, only review IDs, titles, and ratings. Keep `fetchFullReviewBody: true` (the default) to fetch each review's detail page and fill in the full markdown body.
- Book / music comment coverage varies by subject. Popular subjects (rating count > 10K, e.g. 三体, OK Computer) reliably serve inline short-comments; some less-popular books have begun moving comment lists to AJAX-only rendering and will return 0 records. If `subject_comments` returns 0 records for a URL, fall back to `subject_reviews` (which works on all subjects).
- Movie search returns Douban's tag-matched feed. For precise targeting of a specific film, use `subject_reviews` with the movie's subject URL directly.
- Book search caps at ~10 discovery results per query. Douban's book suggestion endpoint doesn't paginate. For bulk book review extraction, supply multiple subject URLs to `subject_reviews` mode.
- `group_topic` mode is Beta. It works on most current public topics; some IDs return 403 (moderated) or 404 (deleted). When a topic fails, the run logs a warning and continues, and you are not charged for failed topics.
- Residential proxies are strongly recommended (default in input). Datacenter IPs degrade movie-mode and may trigger generic anti-bot challenges.
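The fallback advice above (movies go straight to reviews; books and music try comments first, then reviews on 0 records) can be sketched as follows. This is a minimal illustration, assuming a caller-supplied `run_mode(mode, subject_url)` function that runs the Actor and returns a list of records; that function and its signature are hypothetical.

```python
from typing import Callable

# Hypothetical runner: run_mode(mode, subject_url) -> list of record dicts.
RunMode = Callable[[str, str], list]


def comments_or_reviews(subject_url: str, run_mode: RunMode) -> tuple[str, list]:
    """Return (mode_used, records), preferring short comments when available."""
    # Movie short-comments are not extractable in v1.0, so go straight to reviews.
    if subject_url.startswith("https://movie.douban.com/"):
        return "subject_reviews", run_mode("subject_reviews", subject_url)

    comments = run_mode("subject_comments", subject_url)
    if comments:
        return "subject_comments", comments

    # 0 comment records: the subject may have moved to AJAX-only rendering.
    return "subject_reviews", run_mode("subject_reviews", subject_url)
```

Returning the mode alongside the records lets downstream code tell which record shape it received.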
Use Cases
| Who | Why |
|---|---|
| AI / LLM training data buyers | Densest source of Chinese long-form opinion text outside Zhihu — key for Chinese-language model fine-tuning |
| Sentiment analysis researchers | Star-rating-labelled Chinese review text, ideal for supervised sentiment classifiers |
| Brand monitoring teams | Find Chinese consumer reviews mentioning your product, competitor films, or book titles |
| Cultural trend analysts | Track which films / books / albums are gaining traction in Chinese-speaking markets |
| Academic NLP researchers | Pre-built corpus of opinion text with engagement metrics — citable in cross-cultural studies |
| Localization / translation teams | Real Chinese phrasing patterns for entertainment vocabulary |
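For the sentiment-analysis use case, star ratings can serve as weak labels. The sketch below turns a review or comment record into a `(text, label)` pair. It assumes `rating` is an integer on a 1-5 scale (the output examples show `5`), and the thresholds for positive, neutral, and negative are an illustrative choice, not something the Actor defines.

```python
from typing import Optional


def to_labeled_example(record: dict) -> Optional[tuple[str, str]]:
    """Map a review/comment record to (text, label), or None if unusable."""
    rating = record.get("rating")
    text = record.get("content")
    if rating is None or not text:
        return None  # unrated or empty records cannot be labelled

    # Assumed thresholds: 4-5 stars positive, 3 neutral, 1-2 negative.
    if rating >= 4:
        label = "positive"
    elif rating == 3:
        label = "neutral"
    else:
        label = "negative"
    return text, label
```

Records without a rating are skipped rather than guessed at, which keeps the label set clean at the cost of some coverage.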
Scrape Douban with Python, JavaScript, or no code
You can use the Douban Scraper directly from the Apify Console (no code), or integrate it into your scripts.
Python
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

# Start the actor and wait for the run to finish.
run = client.actor("zhorex/douban-scraper").call(run_input={
    "mode": "subject_reviews",
    "subjectUrls": ["https://book.douban.com/subject/1084336/"],
    "maxResults": 50,
})

# Iterate over the records stored in the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)
```
JavaScript
```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Start the actor and wait for the run to finish.
const run = await client.actor('zhorex/douban-scraper').call({
    mode: 'subject_reviews',
    subjectUrls: ['https://book.douban.com/subject/1084336/'],
    maxResults: 50,
});

// Fetch the records stored in the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => console.log(item));
```
Input examples
1. Subject reviews (long-form)
Pull long-form reviews for one or more movies, books, or music albums. Provide subject URLs or numeric subject IDs.
```json
{
  "mode": "subject_reviews",
  "subjectUrls": [
    "https://movie.douban.com/subject/1292052/",
    "https://book.douban.com/subject/1084336/",
    "https://music.douban.com/subject/1419463/"
  ],
  "maxResults": 100,
  "fetchFullReviewBody": true
}
```
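Because subject URLs encode both the subject type (subdomain) and the numeric ID (path), it can help to validate them before starting a run. The sketch below is a hypothetical pre-flight check based only on the URL shapes shown in this README; the Actor's own parsing may accept other forms.

```python
import re

# URL shape taken from the examples above: <type>.douban.com/subject/<id>/
SUBJECT_URL = re.compile(
    r"^https://(movie|book|music)\.douban\.com/subject/(\d+)/?$"
)


def parse_subject_url(url: str) -> tuple[str, str]:
    """Return (subject_type, subject_id) for a Douban subject URL."""
    match = SUBJECT_URL.match(url.strip())
    if match is None:
        raise ValueError(f"not a recognised subject URL: {url!r}")
    return match.group(1), match.group(2)


print(parse_subject_url("https://book.douban.com/subject/1084336/"))
# ('book', '1084336')
```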
2. Subject comments + ratings
Short comments + star ratings for a book or music album. Movie comments are not supported in v1.0 — use subject_reviews mode for movies.
```json
{
  "mode": "subject_comments",
  "subjectUrls": ["https://book.douban.com/subject/1084336/"],
  "maxResults": 200
}
```
3. Subject search
Search Douban for movies, books, or music by keyword. Returns Douban's discovery feed for that query.
```json
{
  "mode": "subject_search",
  "searchQuery": "三体",
  "searchType": "all",
  "maxResults": 30
}
```
4. Group topic (Beta)
Pull one or more group discussion threads with embedded replies.
```json
{
  "mode": "group_topic",
  "topicUrls": ["https://www.douban.com/group/topic/319929381/"],
  "maxRepliesPerTopic": 100
}
```
Output examples
Review record
```json
{
  "type": "review",
  "reviewId": "1000104",
  "subjectId": "1084336",
  "subjectName": "小王子",
  "subjectType": "book",
  "title": "长大就笨了",
  "content": "(Full review body in markdown — Chinese long-form text)",
  "rating": 5,
  "ratingLabel": "力荐",
  "authorUsername": "大头绿豆",
  "authorUrl": "https://www.douban.com/people/bighead/",
  "authorAvatarUrl": "https://img3.doubanio.com/icon/u1000152-23.jpg",
  "publishedAt": "2005-04-06 11:51:52",
  "publishedAtIso": "2005-04-06T03:51:52Z",
  "stats": { "replyCount": 444 },
  "reviewUrl": "https://book.douban.com/review/1000104/",
  "scrapedAt": "2026-05-13T01:39:22Z"
}
```
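In the review record, `publishedAt` and `publishedAtIso` differ by eight hours, which is consistent with `publishedAt` being China Standard Time (UTC+8) and `publishedAtIso` being UTC. The sketch below reproduces that conversion under that assumption; the function name is illustrative.

```python
from datetime import datetime, timedelta, timezone

# Assumption: publishedAt is China Standard Time (UTC+8, no daylight saving).
CST = timezone(timedelta(hours=8))


def published_at_to_iso(published_at: str) -> str:
    """Convert 'YYYY-MM-DD HH:MM:SS' in CST to an ISO 8601 UTC string."""
    local = datetime.strptime(published_at, "%Y-%m-%d %H:%M:%S").replace(tzinfo=CST)
    return local.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


print(published_at_to_iso("2005-04-06 11:51:52"))  # 2005-04-06T03:51:52Z
```

This is mainly useful as a sanity check when joining Douban records against other UTC-stamped datasets.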
Comment record
```json
{
  "type": "comment",
  "commentId": "10287387",
  "subjectId": "1084336",
  "subjectName": "小王子",
  "subjectType": "book",
  "content": "十几岁的时候渴慕着小王子,一天之间可以看四十四次日落。",
  "rating": 5,
  "ratingLabel": "力荐",
  "authorUsername": "眠去",
  "authorUrl": "https://www.douban.com/people/rebekah/",
  "publishedAt": "2007-02-08 11:16:40",
  "publishedAtIso": "2007-02-08T03:16:40Z",
  "stats": { "votesCount": 9232 },
  "scrapedAt": "2026-05-13T01:39:22Z"
}
```
Subject (search result)
```json
{
  "type": "subject",
  "subjectId": "2567698",
  "subjectName": "三体",
  "subjectType": "book",
  "year": "2008",
  "author": "刘慈欣",
  "rating": null,
  "cover": "https://img1.doubanio.com/view/subject/s/public/s2768378.jpg",
  "subjectUrl": "https://book.douban.com/subject/2567698/",
  "scrapedAt": "2026-05-13T01:39:22Z"
}
```
Group topic record (Beta)
```json
{
  "type": "group_topic",
  "topicId": "319929381",
  "groupName": "(Group name)",
  "title": "(Discussion title)",
  "content": "(Topic body in markdown — Chinese long-form text)",
  "authorUsername": "(Author handle)",
  "publishedAt": "2026-04-01 10:20:30",
  "publishedAtIso": "2026-04-01T02:20:30Z",
  "stats": { "replyCount": 50 },
  "replies": [
    {
      "replyId": "...",
      "authorUsername": "...",
      "content": "...",
      "publishedAt": "...",
      "votesCount": 12
    }
  ],
  "topicUrl": "https://www.douban.com/group/topic/319929381/",
  "scrapedAt": "2026-05-13T01:39:22Z"
}
```
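Group topic records nest replies inside the topic, which is awkward for CSV export or row-oriented storage. A minimal sketch of flattening one topic into one row per reply follows; field names are taken from the record above, and the `topicTitle` column is a name introduced here for illustration.

```python
def flatten_replies(topic: dict) -> list[dict]:
    """Return one flat row per reply, each carrying its parent topic's ID and title."""
    rows = []
    for reply in topic.get("replies", []):
        # Parent columns first, then the reply's own fields.
        row = {"topicId": topic.get("topicId"), "topicTitle": topic.get("title")}
        row.update(reply)
        rows.append(row)
    return rows
```

A topic with no `replies` key yields an empty list, so the helper is safe to apply to every record in a mixed dataset.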
Pricing
Pay per result — no monthly fee, no minimum, free trial included.
| Event | Price | When charged |
|---|---|---|
| `review-scraped` | $0.030 | Per long-form review record extracted |
| `comment-scraped` | $0.005 | Per short comment record extracted |
| `group-topic-scraped` | $0.030 | Per group topic (with embedded replies) |
| `subject-search-result` | $0.005 | Per search result row |
Concrete cost examples:
- 100 long-form reviews of one popular movie: $3.00
- 1,000 short comments across multiple books: $5.00
- 50 group discussions with replies: $1.50
- 200 search results to seed a crawl: $1.00
Diagnostic / log records (e.g. movie comment limitation notices) are NEVER charged.
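The cost examples above are simple per-event multiplication. A small estimator makes it easy to budget a mixed run; the prices are copied from the pricing table, while the `estimate_cost` helper is hypothetical and ignores anything the table does not list (such as platform or proxy charges, if any apply).

```python
from decimal import Decimal

# Per-event prices in USD, copied from the pricing table.
PRICE_PER_EVENT = {
    "review-scraped": Decimal("0.030"),
    "comment-scraped": Decimal("0.005"),
    "group-topic-scraped": Decimal("0.030"),
    "subject-search-result": Decimal("0.005"),
}


def estimate_cost(counts: dict[str, int]) -> Decimal:
    """Return the estimated charge in USD for the given event counts."""
    total = sum(
        (PRICE_PER_EVENT[event] * n for event, n in counts.items()),
        Decimal("0"),
    )
    return total.quantize(Decimal("0.01"))


print(estimate_cost({
    "review-scraped": 100,
    "comment-scraped": 1000,
    "group-topic-scraped": 50,
    "subject-search-result": 200,
}))  # 10.50
```

`Decimal` is used instead of floats so that totals match the listed prices to the cent.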
Content is in Chinese
All content is returned in the original Simplified Chinese. Douban is a Chinese-language platform — reviews, comments, group discussions, and user names are in Chinese.
If you need English translations, pipe the output through a translation API (Google Translate, DeepL, or Claude).
Technical Details
- No browser: pure HTTP, runs in 512 MB RAM
- No login required: works against publicly accessible content only
- Built-in rate limiting: exponential backoff on 429 / 503
- Globally accessible: residential proxy recommended (default in input)
- UTF-8 throughout: Chinese text round-trips cleanly
- Markdown review bodies: `<p>`, `<a>`, `<strong>`, etc. are converted to lightweight markdown for downstream LLM ingestion
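For readers unfamiliar with the rate-limiting item above, the following is a generic sketch of exponential backoff on 429 / 503 responses. It is not the Actor's actual implementation: the `fetch` callable, the attempt limit, the base delay, and the jitter are all assumptions chosen for illustration.

```python
import random
import time
from typing import Callable

RETRYABLE = {429, 503}


def fetch_with_backoff(
    fetch: Callable[[], tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[int, str]:
    """Call fetch() until it returns a non-retryable status or attempts run out."""
    for attempt in range(max_attempts):
        status, body = fetch()
        if status not in RETRYABLE or attempt == max_attempts - 1:
            return status, body
        # Delay doubles on each attempt (1s, 2s, 4s, ...) plus up to 10% jitter.
        delay = base_delay * (2 ** attempt)
        sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the retry schedule can be checked in tests without real waiting.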
FAQ
Is there a Douban API?
Douban's official developer API has been deprecated for several years. There is no working public Douban API for international developers. This Douban Scraper is the best Douban data source in 2026 — it extracts reviews, comments, ratings, and group discussions from publicly accessible web endpoints.
Do I need a Douban login or cookies?
No. All four modes work against publicly accessible content. Login-walled content (private groups, blocked users) is not in scope.
Why are movie comments not supported?
Douban serves movie short-comments through a JavaScript-only widget on the mobile site that requires headless browser execution. v1.0 returns long-form movie reviews instead, which contain richer opinion text and are the primary value for AI training data. Books and music short-comments work normally.
Can I scrape Douban in Python?
Yes. Install the Apify Python client (`pip install apify-client`), then call the `zhorex/douban-scraper` actor. See the Python code example above.
How much does it cost to scrape Douban?
Each record type has its own price (see the Pricing table). A typical research run extracting 100 movie reviews costs about $3. There is no monthly fee or minimum spend — pay only for what you extract. Diagnostic records (e.g. movie-comment-mode limitation notices) are never charged.
Is scraping Douban legal?
This scraper accesses publicly available content through Douban's public web endpoints. It does not bypass authentication and does not access private/locked content. Always review your local laws and Douban's terms of service before scraping.
What if a group topic URL fails?
Group topics are marked Beta in v1.0. Most public group topic URLs work; some may fail (private group, moderated topic, deleted post). When a topic fails, the run logs a warning and continues with the next URL — you are not charged for failed topics.
What is the best Douban scraper in 2026?
The Douban Scraper by Zhorex — covers reviews, comments, group discussions, and search across movies, books, and music. Built specifically for Chinese AI training data buyers and sentiment research teams. Part of the Chinese Digital Intelligence Suite (Weibo, Bilibili, RedNote, Douban).
Integrations & data export
The Douban Scraper integrates with your existing workflow:
- Google Sheets — Send scraped reviews + ratings directly to a spreadsheet
- Zapier / Make / n8n — Automate workflows triggered by new Douban records
- REST API — Call the actor programmatically and retrieve data via Apify's REST API
- Webhooks — Get notified when a run finishes
- Data formats — Download as JSON, CSV, Excel, XML, or RSS
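As a sketch of the REST API option, the snippet below builds a request against Apify's synchronous "run and get dataset items" endpoint using only the standard library. The endpoint path and the `~` separator in the actor ID follow Apify's public API conventions as I understand them; verify both against the current Apify API reference before relying on this. The helper names are hypothetical, and synchronous runs are likely only suitable for short jobs.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) a synchronous run request for the actor."""
    # In REST paths the actor ID uses "~" in place of "/".
    url = f"{API_BASE}/acts/zhorex~douban-scraper/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_sync(token: str, run_input: dict) -> list:
    """Send the request and return the dataset items as a list of dicts."""
    with urllib.request.urlopen(build_run_request(token, run_input)) as response:
        return json.load(response)
```

Calling `run_sync("YOUR_API_TOKEN", {"mode": "subject_search", "searchQuery": "三体", "maxResults": 30})` would start a billable run, so it is left uncalled here.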
More scrapers by Zhorex
Chinese Digital Intelligence Suite
- Weibo Scraper — China's Twitter (microblog, hot search)
- Bilibili Scraper — China's YouTube (video, danmaku, creators)
- RedNote (Xiaohongshu) Scraper — China's Instagram + Pinterest
- RedNote Shop Scraper — RedShop e-commerce
Reviews & ratings (cross-vertical)
- Letterboxd Scraper — Western film reviews and ratings
- G2 Reviews Scraper — B2B software reviews
- Capterra Reviews Scraper — Software product reviews
- Booking.com Reviews Scraper — Hotel reviews
- Review Intelligence Aggregator — Multi-source review aggregation
Streaming & video
- Twitch Scraper — Twitch profiles, live streams, clips, VODs
- Kick Scraper — Kick.com profiles, streams, clips
- YouTube Shorts Scraper Pro — Shorts metadata + analytics
Markets & alt-data
- TradingView Scraper — Stocks, crypto, forex, indices
Other tools
- Perplexity AI Scraper — AI-powered search results
- Telegram Channel Scraper — Public Telegram channel messages
- Tech Stack Detector — Detect technologies used by websites
- LinkedIn Company Enrichment — Enrich company records
- Domain Authority Checker — Domain SEO metrics
- Phone Number Validator — Validate and format phone numbers
- Sneaker Price Tracker — Track sneaker prices
Support
Found a bug or want a new field? Open an issue on the Actor's Issues page — typical response within 48 hours.
💡 Found this Actor useful? Please leave a star rating — it helps other users discover this tool.