Baidu Search Scraper
Pricing
Pay per event
Scrape Baidu (百度) search results including web, news, and Baike entries. Supports multiple queries per run with pagination, SERP feature extraction, and related query harvesting. Ideal for Chinese-market SEO research, brand monitoring, and AI training data collection.
Developer
BowTiedRaccoon
Maintained by Community
4 days ago
Last modified
Scrape Baidu (百度) search engine results for any search query. Extract organic results, news articles, Baike entries, and SERP metadata including related queries, total result estimates, and detected SERP features.
Baidu is China's dominant search engine with roughly 70% domestic market share. This scraper is the equivalent of a Google Search Scraper for the Chinese-language web — essential for any team doing SEO research, competitive intelligence, or data collection targeting the Chinese market.
What this scraper collects
For each search result, the scraper extracts:
- Title — the result headline as shown on Baidu
- URL — the real destination URL (resolved from Baidu's redirect wrapper)
- Displayed URL — the shortened URL displayed on the SERP
- Snippet — the descriptive text shown below the title
- Result type — `organic`, `news`, `baike`, `video`
- Source site — hostname of the result
- Published date — date shown for news results
- Thumbnail URL — image thumbnail when present
- Is Baidu-owned — flags results pointing to Baidu properties (Baike, Zhidao, Tieba, etc.)
- Related queries — the "related searches" shown on the SERP page
- Total results estimate — Baidu's stated result count for the query
- SERP features — detected features: `answer_box`, `knowledge_panel`, `image_pack`, `video_pack`, `news_box`
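
As a rough illustration of how the "Source site" and "Is Baidu-owned" fields could be derived from a resolved URL, here is a minimal sketch. The helper names and the rule "any host under `baidu.com` counts as Baidu-owned" are assumptions made for this example. The scraper's actual logic is not documented here.

```python
from urllib.parse import urlparse

# Assumption: Baidu properties (Baike, Zhidao, Tieba, ...) live under baidu.com.
BAIDU_ROOT = "baidu.com"


def source_site(url: str) -> str:
    """Return the lowercase hostname of a result URL ('' if it has none)."""
    return (urlparse(url).hostname or "").lower()


def is_baidu_owned(url: str) -> bool:
    """True when the URL's host is baidu.com or one of its subdomains."""
    host = source_site(url)
    return host == BAIDU_ROOT or host.endswith("." + BAIDU_ROOT)
```

Matching on the `.baidu.com` suffix rather than a plain substring avoids flagging unrelated hosts that merely contain the word "baidu".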
Supported search modes
| Mode | URL pattern | Use case |
|---|---|---|
| `web` | `www.baidu.com/s?wd=...` | Standard web search results |
| `news` | `news.baidu.com/ns?word=...` | News articles from Chinese media |
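
The URL patterns in the table can be assembled with the standard library, which also takes care of percent-encoding Chinese queries. This is a sketch only: the `https` scheme and the function name `build_search_url` are assumptions, and pagination parameters are left out because the table does not list them.

```python
from urllib.parse import urlencode

# Host, path, and query parameter name per mode, taken from the table above.
# The https:// scheme is an assumption.
SEARCH_MODES = {
    "web": ("https://www.baidu.com/s", "wd"),
    "news": ("https://news.baidu.com/ns", "word"),
}


def build_search_url(query: str, search_type: str = "web") -> str:
    """Build a first-page search URL for the given query and mode."""
    if search_type not in SEARCH_MODES:
        raise ValueError(f"unsupported search type: {search_type!r}")
    base, param = SEARCH_MODES[search_type]
    # urlencode percent-encodes non-ASCII text as UTF-8 and spaces as '+'.
    return f"{base}?{urlencode({param: query})}"
```

For example, `build_search_url("machine learning")` yields `https://www.baidu.com/s?wd=machine+learning`.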
Usage
Configure one or more search queries and set the search mode:
    {
        "queries": ["人工智能", "machine learning", "python编程"],
        "searchType": "web",
        "maxPages": 3,
        "maxItems": 100
    }
Input parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `queries` | array | required | Search query strings (Chinese or English) |
| `searchType` | string | `web` | Search mode: `web` or `news` |
| `resultsPerPage` | integer | 10 | Results per page (10 or 50) |
| `maxPages` | integer | 3 | Max SERP pages per query |
| `maxItems` | integer | 15 | Maximum total results across all queries |
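
To see how the defaults and limits in the table interact, the sketch below fills in missing parameters and estimates an upper bound on returned results. The functions `normalize_input` and `result_ceiling` are hypothetical helpers written for this example, not part of the scraper. The ceiling assumes every page comes back full, which real SERPs may not.

```python
# Defaults copied from the parameter table above.
DEFAULTS = {"searchType": "web", "resultsPerPage": 10, "maxPages": 3, "maxItems": 15}


def normalize_input(raw: dict) -> dict:
    """Validate a run input and apply the documented defaults."""
    queries = [q.strip() for q in raw.get("queries", [])
               if isinstance(q, str) and q.strip()]
    if not queries:
        raise ValueError("queries is required and needs at least one string")
    cfg = {**DEFAULTS, **{k: v for k, v in raw.items() if k in DEFAULTS}}
    if cfg["searchType"] not in ("web", "news"):
        raise ValueError("searchType must be 'web' or 'news'")
    if cfg["resultsPerPage"] not in (10, 50):
        raise ValueError("resultsPerPage must be 10 or 50")
    for key in ("maxPages", "maxItems"):
        if not isinstance(cfg[key], int) or cfg[key] < 1:
            raise ValueError(f"{key} must be a positive integer")
    cfg["queries"] = queries
    return cfg


def result_ceiling(cfg: dict) -> int:
    """Upper bound on results: full pages for every query, capped by maxItems."""
    fetched = len(cfg["queries"]) * cfg["maxPages"] * cfg["resultsPerPage"]
    return min(fetched, cfg["maxItems"])
```

With three queries, `maxPages` of 3, and 10 results per page, at most 90 results can be fetched, so a `maxItems` of 100 never becomes the limiting factor. With the default `maxItems` of 15, a single query is already capped below its 30 possible results.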
Performance and anti-bot notes
Baidu's WAF blocks all data center IP addresses. This scraper uses residential proxies to bypass the block and deliver real search results. Pacing is configured to avoid triggering rate-limit responses.
Politically sensitive queries on Baidu may return reduced result counts or redirected results — this is inherent to Baidu's content policies, not a scraper defect.
Use cases
- Chinese SEO research — track keyword rankings and SERP features on Baidu
- Brand monitoring — monitor how a brand appears in Chinese search results
- Competitive intelligence — analyze which Chinese sites rank for industry keywords
- AI training data — collect Baidu SERPs as retrieval anchors for Chinese-language models
- Academic research — study information retrieval and content availability on the Chinese web