Baidu Search Scraper
Pricing
from $4.99 / 1,000 results
Scrape Baidu search results at scale. Extract organic listings, answer boxes, related videos, related searches, and top searches. Supports bulk queries, proxy fallback, date filters, and device/language options for SEO and market research.
Developer
Scraper Engine
The Baidu Search Scraper is a production-ready Baidu search engine scraper that extracts structured SERP data (organic listings, answer boxes, related videos, related searches, and top searches) at scale. It handles blocked requests with an automatic proxy fallback strategy and parses each result type into structured rows. Built for marketers, developers, data analysts, and researchers, it supports keyword tracking, market intelligence, and research workflows.
What is Baidu Search Scraper?
The Baidu Search Scraper collects structured Baidu SERP data programmatically. It addresses roadblocks such as geo/language differences and anti-bot challenges with an automatic proxy fallback and device/language options. For SEO teams, growth marketers, analysts, and researchers, it enables repeatable, large-scale SERP monitoring for competitive insights, content planning, and research.
What data / output can you get?
Below are the fields pushed to the Apify dataset during the run. Each row represents one SERP element (organic result, answer box, related video, related/"people also search for"/top searches).
| Data field | Description | Example value |
|---|---|---|
| query | The search term processed | python tutorial |
| resultType | Result category: organic, answer_box, related_video, people_also_search_for, related_search, top_search | organic |
| title | Title of organic/answer/video items | Learn Python - Official Tutorial |
| link | URL for organic/video/related items | https://www.python.org/about/gettingstarted/ |
| snippet | Organic result snippet/description | Python is an easy to learn, powerful programming language... |
| displayedLink | Host/domain shown with the organic result | www.python.org |
| thumbnail | Image URL (when present for organic/video) | https://example.com/thumb.jpg |
| position | Organic ranking position (1-based across fetched pages) | 1 |
| richSnippet | Additional highlighted text extracted from organic result | Beginner-friendly resources |
| content | Answer box content/body | Python is a programming language... |
| source | Source citation for answer box (when available) | Baidu Baike |
| searchTerm | The related search term (for related_search, people_also_search_for, top_search) | python basics |
Notes:
- Results stream to the Apify dataset in real time and can be exported (e.g., JSON, CSV, Excel) from the platform.
- If you set the outputFile input, the actor also saves a consolidated JSON to the key-value store with summary and results_by_query for each term.
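Because every row carries a resultType, a mixed export can be split back into per-type lists. The following is a minimal sketch, assuming the rows have already been loaded as Python dictionaries; the helper name `group_by_result_type` is illustrative and not part of the actor.

```python
from collections import defaultdict

def group_by_result_type(rows):
    """Split flat dataset rows into lists keyed by their resultType."""
    grouped = defaultdict(list)
    for row in rows:
        # Rows without a resultType are kept under "unknown" rather than dropped.
        grouped[row.get("resultType", "unknown")].append(row)
    return dict(grouped)

# Sample rows shaped like the fields in the table above.
sample_rows = [
    {"query": "python tutorial", "resultType": "organic", "position": 1,
     "title": "Learn Python - Official Tutorial"},
    {"query": "python tutorial", "resultType": "answer_box",
     "title": "What is Python?", "source": "Baidu Baike"},
    {"query": "python tutorial", "resultType": "related_search",
     "searchTerm": "learn python online"},
    {"query": "python tutorial", "resultType": "related_search",
     "searchTerm": "python basics"},
]

grouped = group_by_result_type(sample_rows)
related_terms = [row["searchTerm"] for row in grouped["related_search"]]
```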
Key features
- Intelligent proxy fallback: Starts with no proxy by default; automatically falls back to Apify datacenter and then RESIDENTIAL proxies (up to 3 retries) if Baidu blocks requests. Once residential works, it sticks with it for all remaining requests.
- Bulk queries at scale: Paste multiple Baidu search URLs or plain search terms into urls and process them all in a single run, which suits keyword ranking workflows and large campaigns.
- Device & language controls: Choose deviceType (desktop/mobile/tablet) for different SERP layouts and set languageLocalization (1–3) to align with regional/language preferences.
- Time period filtering: Flexible timePeriod with startDate/endDate or daysAgo enables date-scoped searches and trend analysis.
- Real-time dataset streaming: Results are flattened and pushed row by row for immediate visibility (organic results, answer boxes, videos, related/"people also search for"/top searches). Great for dashboards and pipelines.
- Fine-grained result limits: Control results with numResults per page and maxPagination (0–10). Start from any startPage to continue pagination.
- Optional consolidated JSON export: Set outputFile to also save a summary + results_by_query object to the key-value store for easy retrieval or downstream processing.
- Developer-friendly on Apify: Designed for programmatic use as a Baidu SERP API via the Apify platform. Integrate with Python scripts, workflows, or data pipelines.
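The proxy fallback described above (no proxy, then datacenter, then residential, staying on the tier that works) can be pictured with a small sketch. This is not the actor's code: the `ProxyFallback` class, the `Blocked` exception, and the fetch callback are hypothetical, and applying the 3 retries only to the residential tier is an assumption about how the retry limit is used.

```python
from typing import Callable, Optional

# Tier order taken from the description above; None means "no proxy".
PROXY_TIERS = (None, "datacenter", "residential")
RESIDENTIAL_RETRIES = 3  # assumption: the retry limit applies to the last tier

class Blocked(Exception):
    """Raised by the fetch callback when Baidu blocks a request."""

class ProxyFallback:
    def __init__(self, fetch: Callable[[str, Optional[str]], str]):
        self.fetch = fetch      # fetch(url, tier) -> page HTML, or raises Blocked
        self.tier_index = 0     # only ever moves forward, so a working tier is kept

    def get(self, url: str) -> str:
        while True:
            tier = PROXY_TIERS[self.tier_index]
            attempts = RESIDENTIAL_RETRIES if tier == "residential" else 1
            for _ in range(attempts):
                try:
                    return self.fetch(url, tier)
                except Blocked:
                    continue
            if self.tier_index == len(PROXY_TIERS) - 1:
                raise Blocked(f"all proxy tiers blocked for {url}")
            self.tier_index += 1

# Fake fetcher for demonstration: only the second residential attempt succeeds.
calls = []

def fake_fetch(url, tier):
    calls.append(tier)
    if tier != "residential" or calls.count("residential") < 2:
        raise Blocked(str(tier))
    return f"<html>{url}</html>"

fallback = ProxyFallback(fake_fetch)
first_page = fallback.get("https://www.baidu.com/s?wd=python")
second_page = fallback.get("https://www.baidu.com/s?wd=python&pn=10")
```

After the first request reaches the residential tier, the second request goes straight to residential without probing the earlier tiers again.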
How to use Baidu Search Scraper - step by step
- Create or log in to your Apify account.
- Open the actor named baidu-search-scraper.
- Add input data in urls: either Baidu search URLs (e.g., https://www.baidu.com/s?wd=python) or plain search terms (e.g., python tutorial).
- Configure settings:
  - deviceType: desktop (default), mobile, or tablet.
  - languageLocalization: 1 (all languages, default), 2 (Simplified Chinese), 3 (Traditional Chinese).
  - numResults and maxPagination to control volume; startPage to set the starting page.
  - timePeriod with startDate/endDate or daysAgo for date filtering.
  - proxyConfiguration (optional): leave unset to start without proxy; fallback kicks in automatically on block.
  - outputFile (optional): set a key to save the consolidated JSON to the key-value store.
- Start the run. The actor probes connectivity and automatically applies proxy fallback if needed.
- Watch logs for progress, page fetches, and proxy events.
- Download results from the dataset as JSON/CSV/Excel, or retrieve the key-value store record (if outputFile was set).
Pro Tip: Use deviceType and languageLocalization together to compare desktop vs. mobile rankings by region and build a robust Baidu keyword research scraper workflow.
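Since urls accepts both full Baidu search URLs and plain terms, it can help to see how the two forms reduce to the same query string. The sketch below is one possible normalization, assuming the term lives in the wd parameter as in the example URL above; `to_query` is a hypothetical helper, not a function exposed by the actor.

```python
from urllib.parse import parse_qs, urlparse

def to_query(entry: str) -> str:
    """Return the search term for a urls entry (Baidu URL or plain term)."""
    entry = entry.strip()
    parsed = urlparse(entry)
    host = parsed.hostname or ""
    is_baidu = host == "baidu.com" or host.endswith(".baidu.com")
    if parsed.scheme in ("http", "https") and is_baidu:
        values = parse_qs(parsed.query).get("wd", [])
        if not values:
            raise ValueError(f"no wd parameter in {entry!r}")
        return values[0]  # parse_qs already percent-decodes the value
    # Anything that is not a Baidu URL is treated as a plain search term.
    return entry

queries = [to_query(e) for e in [
    "python tutorial",
    "https://www.baidu.com/s?wd=machine%20learning",
]]
```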
Use cases
| Use case | Description |
|---|---|
| SEO teams - keyword ranking tracking | Monitor organic positions, answer boxes, and related searches for target keywords using a reliable Baidu SERP crawler. |
| Market research - trend analysis | Analyze top searches and "people also search for" to identify rising topics and market signals. |
| Content strategy - SERP feature mapping | Extract answer boxes and related videos to understand content formats that surface for your topics. |
| Localization testing - desktop vs mobile | Compare SERPs across deviceType and languageLocalization for accurate regional SEO strategies. |
| Data pipelines - API ingestion | Stream row-based results into data lakes or analytics tools via the Apify dataset for Baidu search automation. |
| Academic research - search behavior | Study query relationships via related_search and people_also_search_for for research on information retrieval. |
| Competitive monitoring - SERP visibility | Track competitor visibility, links, and snippets to inform strategic decisions. |
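For the keyword ranking use case, the organic rows already contain what a rank tracker needs. A minimal sketch, assuming rows are plain dictionaries with the query, resultType, displayedLink, and position fields listed earlier; `best_position` is an illustrative helper name.

```python
def best_position(rows, query, domain):
    """Return the best (lowest) organic position of a domain for a query, or None."""
    positions = [
        row["position"]
        for row in rows
        if row.get("resultType") == "organic"
        and row.get("query") == query
        and domain in row.get("displayedLink", "")
        and "position" in row
    ]
    return min(positions) if positions else None

serp_rows = [
    {"query": "python tutorial", "resultType": "organic",
     "displayedLink": "www.python.org", "position": 1},
    {"query": "python tutorial", "resultType": "organic",
     "displayedLink": "docs.python.org", "position": 4},
    {"query": "python tutorial", "resultType": "related_search",
     "searchTerm": "learn python online"},
]

rank = best_position(serp_rows, "python tutorial", "python.org")
missing = best_position(serp_rows, "python tutorial", "example.com")
```

Matching on displayedLink is a design choice here: it is the host shown with the result, so a simple substring check is enough for a first pass.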
Why choose Baidu Search Scraper?
This Baidu search results API solution is built for precision, automation, and reliability at scale.
- Accurate SERP parsing: Extracts organic fields, answer boxes, related videos, and query suggestions cleanly.
- Multilingual/regional support: languageLocalization and deviceType mirror real SERPs for better coverage.
- Scales with bulk queries: Process many terms in one run for keyword ranking workflows.
- Developer access: Runs on Apify with programmatic access for pipelines and Python integrations.
- Robust & resilient: Automatic proxy fallback (none → datacenter → residential) with retries keeps runs stable.
- Flexible output: Real-time row streaming to dataset plus optional consolidated JSON to key-value store.
- Better than extensions: Avoid brittle browser add-ons; use a production-grade Baidu search engine scraper with logs and infrastructure.
Bottom line: A reliable Baidu results parser and Baidu SERP scraper that balances accuracy, flexibility, and scale.
Is it legal / ethical to use Baidu Search Scraper?
Yes, when used responsibly. This actor collects data from publicly available Baidu SERPs and does not require a login or access to private content.
Guidelines for compliant use:
- Collect only public SERP data and respect platform terms.
- Ensure your use complies with data protection regulations (e.g., GDPR, CCPA) and local laws.
- Do not attempt to access private or authenticated resources.
- Consult your legal team for edge cases and jurisdiction-specific requirements.
Input parameters & output format
Example JSON input
    {
      "urls": [
        "python tutorial",
        "https://www.baidu.com/s?wd=machine%20learning"
      ],
      "deviceType": "desktop",
      "languageLocalization": 1,
      "startPage": 1,
      "numResults": 10,
      "timePeriod": {
        "startDate": "",
        "endDate": "",
        "daysAgo": 0
      },
      "maxPagination": 3,
      "outputFile": "baidu_serp_summary",
      "proxyConfiguration": {
        "useApifyProxy": false
      }
    }
Parameters (from the actor input schema):
- urls (array, required): Baidu search URLs (e.g., https://www.baidu.com/s?wd=python) OR plain search terms. Default: none (required).
- deviceType (string, optional): Desktop/mobile/tablet targeting. Default: "desktop".
- languageLocalization (integer, optional): 1 = All languages; 2 = Simplified Chinese; 3 = Traditional Chinese. Default: 1.
- startPage (integer, optional): Starting page number (1-based). Default: 1.
- numResults (integer, optional): Results per page (1–50). Default: 10.
- timePeriod (object, optional): Date filter. Use:
  - startDate (string): YYYY-MM-DD. Default: "".
  - endDate (string): YYYY-MM-DD. Default: "".
  - daysAgo (integer): Last N days (0 disables). Default: 0.
- maxPagination (integer, optional): Max pages per query (0–10; 0 treated as up to 10 in code). Default: 3.
- outputFile (string, optional): If set, also saves the consolidated JSON to the key-value store under this key. Default: "".
- proxyConfiguration (object, optional): Apify proxy config. By default: no proxy; automatic fallback applies on block. Default: not set (no proxy).
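The ranges and defaults in the parameter list can be checked locally before a run is started. The sketch below builds an input object with the documented defaults and rejects out-of-range values; `build_input` and its snake_case arguments are hypothetical, and the actor itself may validate differently.

```python
DEVICE_TYPES = {"desktop", "mobile", "tablet"}

def build_input(urls, device_type="desktop", language_localization=1,
                start_page=1, num_results=10, max_pagination=3,
                days_ago=0, output_file=""):
    """Build an actor input dict, enforcing the documented ranges."""
    if not urls:
        raise ValueError("urls is required and must not be empty")
    if device_type not in DEVICE_TYPES:
        raise ValueError(f"deviceType must be one of {sorted(DEVICE_TYPES)}")
    if language_localization not in (1, 2, 3):
        raise ValueError("languageLocalization must be 1, 2, or 3")
    if start_page < 1:
        raise ValueError("startPage is 1-based")
    if not 1 <= num_results <= 50:
        raise ValueError("numResults must be between 1 and 50")
    if not 0 <= max_pagination <= 10:
        raise ValueError("maxPagination must be between 0 and 10")
    if days_ago < 0:
        raise ValueError("daysAgo must be 0 or greater")
    return {
        "urls": list(urls),
        "deviceType": device_type,
        "languageLocalization": language_localization,
        "startPage": start_page,
        "numResults": num_results,
        "timePeriod": {"startDate": "", "endDate": "", "daysAgo": days_ago},
        "maxPagination": max_pagination,
        "outputFile": output_file,
    }

run_input = build_input(["python tutorial"], device_type="mobile", days_ago=7)
```

proxyConfiguration is left out on purpose, which matches the documented default of starting without a proxy.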
Example dataset items (primary output)
This is what the actor pushes to the Apify dataset during the run:
    [
      {
        "query": "python tutorial",
        "resultType": "organic",
        "title": "Learn Python - Official Tutorial",
        "link": "https://www.python.org/about/gettingstarted/",
        "snippet": "Python is an easy to learn, powerful programming language...",
        "displayedLink": "www.python.org",
        "thumbnail": "https://example.com/thumb.jpg",
        "position": 1,
        "richSnippet": "Beginner-friendly resources"
      },
      {
        "query": "python tutorial",
        "resultType": "answer_box",
        "title": "What is Python?",
        "content": "Python is a programming language...",
        "source": "Baidu Baike"
      },
      {
        "query": "python tutorial",
        "resultType": "related_video",
        "title": "Python Basics in 15 Minutes",
        "link": "https://www.baidu.com/video/xyz",
        "thumbnail": "https://example.com/video.jpg"
      },
      {
        "query": "python tutorial",
        "resultType": "people_also_search_for",
        "searchTerm": "python basics",
        "link": "https://www.baidu.com/s?wd=python%20basics"
      },
      {
        "query": "python tutorial",
        "resultType": "related_search",
        "searchTerm": "learn python online"
      },
      {
        "query": "python tutorial",
        "resultType": "top_search",
        "searchTerm": "python download",
        "link": "https://www.baidu.com/s?wd=python%20download"
      }
    ]
Optional consolidated JSON (when outputFile is set)
If you provide outputFile, the actor also saves the following structure to the key-value store:
    {
      "summary": {
        "total_queries": 2,
        "queries": ["python tutorial", "machine learning"],
        "total_organic_results": 20,
        "total_answer_boxes": 2,
        "total_related_videos": 3,
        "total_people_also_search_for": 10,
        "total_related_searches": 12,
        "total_top_searches": 6
      },
      "results_by_query": {
        "python tutorial": {
          "query": "python tutorial",
          "organic_results": [...],
          "answer_box": [...],
          "related_videos": [...],
          "people_also_search_for": [...],
          "related_searches": [...],
          "top_searches": [...]
        },
        "machine learning": {
          "query": "machine learning",
          "organic_results": [...],
          "answer_box": [...],
          "related_videos": [...],
          "people_also_search_for": [...],
          "related_searches": [...],
          "top_searches": [...]
        }
      }
    }
Note: Arrays above contain the corresponding structures as parsed from Baidu SERPs during the run.
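If outputFile was not set, a summary with the same keys can be recomputed from the flat dataset rows. The sketch below is an independent reconstruction under that assumption, useful as a cross-check; it is not the actor's own aggregation code, and `summarize` is a hypothetical name.

```python
from collections import Counter

# Maps each resultType to the summary key shown in the structure above.
SUMMARY_KEYS = {
    "organic": "total_organic_results",
    "answer_box": "total_answer_boxes",
    "related_video": "total_related_videos",
    "people_also_search_for": "total_people_also_search_for",
    "related_search": "total_related_searches",
    "top_search": "total_top_searches",
}

def summarize(rows):
    """Recompute the summary block from flat dataset rows."""
    queries = list(dict.fromkeys(row["query"] for row in rows))  # keeps first-seen order
    counts = Counter(row["resultType"] for row in rows)
    summary = {"total_queries": len(queries), "queries": queries}
    for result_type, key in SUMMARY_KEYS.items():
        summary[key] = counts.get(result_type, 0)
    return summary

dataset_rows = [
    {"query": "python tutorial", "resultType": "organic", "position": 1},
    {"query": "python tutorial", "resultType": "organic", "position": 2},
    {"query": "python tutorial", "resultType": "answer_box"},
    {"query": "machine learning", "resultType": "organic", "position": 1},
    {"query": "machine learning", "resultType": "top_search"},
]

summary = summarize(dataset_rows)
```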
FAQ
Does the Baidu Search Scraper work without a proxy?
Yes. By default, it starts with no proxy. If Baidu blocks a request, it automatically falls back to Apify datacenter and then RESIDENTIAL proxies with retries.
Can I start with a proxy from the beginning?
Yes. Set proxyConfiguration to enable the Apify proxy at the start. The automatic fallback still applies if a block is detected.
How do language and device settings affect results?
languageLocalization maps to Baidu's rqlang parameter and influences regional/language results. deviceType selects between www.baidu.com (desktop) and m.baidu.com (mobile/tablet), which can change SERP layout and content.
How do I limit or expand the number of results per keyword?
Use numResults (1–50) and maxPagination (0–10; 0 is treated as up to 10 in the scraper). startPage lets you begin from a later page for continuation workflows.
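To estimate how many pages and organic results a configuration can produce, the arithmetic can be sketched as follows. It assumes pages are fetched consecutively starting at startPage and that maxPagination counts pages from there; both helper names are illustrative.

```python
def pages_to_fetch(start_page=1, max_pagination=3):
    """List the page numbers a run would request for one query."""
    if start_page < 1:
        raise ValueError("startPage is 1-based")
    if not 0 <= max_pagination <= 10:
        raise ValueError("maxPagination must be between 0 and 10")
    # Documented behaviour: 0 is treated as up to 10 pages.
    page_count = 10 if max_pagination == 0 else max_pagination
    return list(range(start_page, start_page + page_count))

def max_organic_results(num_results=10, max_pagination=3):
    """Upper bound on organic rows per query; Baidu may return fewer."""
    return num_results * len(pages_to_fetch(1, max_pagination))

default_pages = pages_to_fetch()          # defaults: startPage 1, maxPagination 3
continued_pages = pages_to_fetch(4, 0)    # continue from page 4 with the 10-page cap
upper_bound = max_organic_results(50, 10)
```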
Can I filter results by date?
Yes. Use timePeriod with either startDate/endDate or daysAgo. The scraper converts these to Baidu's stf/stftype parameters to scope the SERP.
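The two ways of specifying a date filter resolve to the same thing: a start and end date. The sketch below shows one way to resolve a timePeriod object, assuming explicit dates take precedence over daysAgo when both are present (the source does not state the precedence). The conversion to stf/stftype is deliberately left out because its exact format is not documented here; `resolve_time_period` is a hypothetical helper.

```python
from datetime import date, timedelta

def resolve_time_period(time_period, today):
    """Return (start, end) dates for a timePeriod object, or None if disabled."""
    start = time_period.get("startDate", "")
    end = time_period.get("endDate", "")
    days_ago = time_period.get("daysAgo", 0)
    if start and end:
        start_date = date.fromisoformat(start)  # expects YYYY-MM-DD
        end_date = date.fromisoformat(end)
        if start_date > end_date:
            raise ValueError("startDate must not be after endDate")
        return start_date, end_date
    if days_ago > 0:
        return today - timedelta(days=days_ago), today
    return None  # empty dates and daysAgo == 0 mean no date filter

reference_day = date(2024, 1, 10)
last_week = resolve_time_period({"startDate": "", "endDate": "", "daysAgo": 7}, reference_day)
explicit = resolve_time_period({"startDate": "2023-12-01", "endDate": "2023-12-31"}, reference_day)
disabled = resolve_time_period({"startDate": "", "endDate": "", "daysAgo": 0}, reference_day)
```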
What data types does it capture?
It extracts organic results (title, link, snippet, displayedLink, thumbnail, position, richSnippet), answer boxes (title, content, source), related videos (title, link, thumbnail), people also search for, related searches, and top searches.
Is there an API to run this as part of a pipeline?
Yes. As an Apify actor, it can be triggered via the Apify API and integrated into data pipelines and automation workflows.
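A run can be started from Python with the apify-client package. The sketch below is a minimal example under stated assumptions: the actor ID is a placeholder (the publisher's username is not given here, so replace it with the ID shown in the Apify Console), and the token is whatever API token your account provides.

```python
# Placeholder: replace <username> with the actor owner shown in the Apify Console.
ACTOR_ID = "<username>/baidu-search-scraper"

def run_scraper(run_input, token, actor_id=ACTOR_ID):
    """Start the actor, wait for it to finish, and return its dataset rows."""
    # Imported lazily so this module loads even without the package installed
    # (pip install apify-client).
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Calling `run_scraper({"urls": ["python tutorial"], "maxPagination": 1}, token)` would return the rows described in the output section as a list of dictionaries, ready for filtering or export.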
Can I export results to CSV or Excel?
Yes. Dataset items can be exported from the Apify platform in multiple formats such as JSON, CSV, or Excel for downstream analysis.
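When rows are fetched through the API rather than downloaded from the platform, a CSV can also be produced locally. The sketch below uses the field names from the output table as columns and leaves cells empty where a row type lacks a field; `rows_to_csv` is an illustrative helper.

```python
import csv
import io

# Column order follows the data field table in the output section.
FIELDS = ["query", "resultType", "title", "link", "snippet", "displayedLink",
          "thumbnail", "position", "richSnippet", "content", "source", "searchTerm"]

def rows_to_csv(rows):
    """Serialize flat dataset rows to CSV text with a fixed column set."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, restval="",
                            extrasaction="ignore")  # skip fields not in FIELDS
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = rows_to_csv([
    {"query": "python tutorial", "resultType": "organic", "position": 1,
     "title": "Learn Python - Official Tutorial", "displayedLink": "www.python.org"},
    {"query": "python tutorial", "resultType": "related_search",
     "searchTerm": "learn python online"},
])
```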
Final thoughts
The Baidu Search Scraper is built for accurate, scalable Baidu SERP data extraction. With intelligent proxy fallback, device/language controls, and real-time dataset streaming, it suits marketers, developers, analysts, and researchers. Use it for SEO tracking, trend analysis, and Baidu keyword research, and optionally save consolidated summaries via outputFile. Developers can run it programmatically via the Apify API to power automation pipelines.