Baidu Search Scraper
Pricing
from $5.99 / 1,000 results
Scrape Baidu search results at scale. Extract organic listings, answer boxes, related videos, related searches, and top searches. Supports bulk queries, proxy fallback, date filters, and device/language options for SEO and market research.
Developer
Scrapier
Baidu Search Scraper
Baidu Search Scraper is a fast, scalable Baidu SERP scraper that extracts organic listings, answer boxes, related videos, "people also search for" terms, related searches, and top searches from public Baidu results pages. It collects clean, structured SERP data at scale, using a proxy fallback strategy and bulk query support. Built for marketers, developers, data analysts, and researchers, it supports SEO analysis, market research, and competitor tracking with consistent, repeatable, automation-ready output.
What data / output can you get?
Below are the structured fields pushed to the Apify dataset in real time for each SERP element:
| Data field | Description | Example value |
|---|---|---|
| query | The search query associated with the result row | "python tutorial" |
| resultType | Result category: organic, answer_box, related_video, people_also_search_for, related_search, top_search | "organic" |
| title | Title of the organic result or video | "Python Tutorial - W3Schools" |
| link | Canonical URL to the result (decoded from Baidu redirect when applicable) | "https://www.w3schools.com/python/" |
| snippet | Text snippet/description for organic results | "Learn Python with examples, exercises, and projects..." |
| displayedLink | Displayed domain or path extracted from the link | "www.w3schools.com" |
| thumbnail | Image URL (if present for organic or video blocks) | "https://img.example.com/thumb.jpg" |
| position | Calculated position among organic results across pages | 1 |
| content | Answer/content text for answer boxes | "Python is a high-level programming language..." |
| source | Source attribution for answer boxes | "Baidu Baike" |
| searchTerm | Related search term (for people_also_search_for, related_search, top_search) | "learn python online" |
| richSnippet | Additional rich text extracted from organic blocks (if present) | "Beginner-friendly · Free certificate" |
Notes:
- Results are stored as individual rows for real-time visibility during the run.
- You can export data to JSON or CSV from the dataset.
- Optionally, if you set outputFile, a summary JSON with totals and grouped results is saved to the key-value store.
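Because rows of different result types carry different fields, flattening a JSON export yourself needs a fixed column list. The sketch below is one way to do that locally; the column order simply follows the field table above, and the sample rows are illustrative, not real actor output.

```python
import csv
import io

# Column order taken from the field table above.
COLUMNS = [
    "query", "resultType", "title", "link", "snippet", "displayedLink",
    "thumbnail", "position", "content", "source", "searchTerm", "richSnippet",
]

def rows_to_csv(rows):
    """Write heterogeneous dataset rows to CSV text with a stable header."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=COLUMNS,
        restval="",             # fields a row does not have become empty cells
        extrasaction="ignore",  # unknown keys are dropped instead of raising
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

# Illustrative rows shaped like the documented fields.
sample_rows = [
    {"query": "python tutorial", "resultType": "organic",
     "title": "Python Tutorial - W3Schools", "position": 1},
    {"query": "python tutorial", "resultType": "related_search",
     "searchTerm": "learn python online"},
]

csv_text = rows_to_csv(sample_rows)
print(csv_text)
```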
Key features
- Intelligent proxy fallback
  Starts with no proxy to save cost. If Baidu blocks, automatically falls back to datacenter and then residential proxies (3 retries). Once residential works, it sticks with it for the remaining requests.
- Bulk keyword scraping
  Supply multiple Baidu URLs or plain search terms and process them in a single run for high-throughput workflows.
- Device & language targeting
  Control deviceType (desktop, mobile, tablet) and languageLocalization (All, Simplified Chinese, Traditional Chinese) to compare different SERP layouts and regions.
- Time period filtering
  Use timePeriod to filter by startDate/endDate or daysAgo and narrow results to recent content.
- Structured SERP coverage
  Extracts organic results, answer boxes, related videos, people also search for, related searches, and top searches with clean fields.
- Real-time dataset output
  Pushes each result row to the Apify dataset during the run, so you can monitor progress live and export JSON/CSV at completion.
- Optional summary export
  Set outputFile to also save a consolidated JSON (with totals and results_by_query) to the key-value store for easy retrieval.
- Production-ready robustness
  Retries, fallbacks, and clear logging help keep your runs successful and predictable, even for large batches.
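The proxy fallback described above can be pictured with a small sketch. This is not the actor's actual code: the `FallbackFetcher` class, the `fetch` callback, and the choice to apply the 3 retries only to the residential tier are all assumptions made for illustration.

```python
from typing import Callable, Optional

# Escalation order; None stands for "no proxy".
TIERS = [None, "datacenter", "residential"]
RESIDENTIAL_RETRIES = 3  # assumption: the documented 3 retries apply to this tier

class FallbackFetcher:
    """Hypothetical helper: escalate proxy tiers until a request succeeds."""

    def __init__(self, fetch: Callable[[str, Optional[str]], Optional[str]]):
        # fetch(url, tier) returns the page text, or None when blocked.
        self.fetch = fetch
        self.sticky = False  # set once the residential tier has worked

    def get(self, url: str) -> Optional[str]:
        tiers = ["residential"] if self.sticky else TIERS
        for tier in tiers:
            attempts = RESIDENTIAL_RETRIES if tier == "residential" else 1
            for _ in range(attempts):
                page = self.fetch(url, tier)
                if page is not None:
                    if tier == "residential":
                        self.sticky = True  # stay on residential from now on
                    return page
        return None  # every tier was blocked

# Fake transport for demonstration: only residential requests get through.
calls = []

def fake_fetch(url, tier):
    calls.append(tier)
    return "<html>ok</html>" if tier == "residential" else None

fetcher = FallbackFetcher(fake_fetch)
first = fetcher.get("https://www.baidu.com/s?wd=python")
second = fetcher.get("https://www.baidu.com/s?wd=java")
print(calls)
```

In this simulation the first request walks through all three tiers, while the second goes straight to residential because the fetcher has become sticky.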
How to use Baidu Search Scraper - step by step
- Create or log in to your Apify account at https://console.apify.com.
- Go to Actors and open "baidu-search-scraper".
- Add your input:
  - Paste Baidu search URLs or plain terms into urls (one per line).
  - Choose deviceType (desktop/mobile/tablet) and languageLocalization.
  - Set numResults and maxPagination to control depth.
  - Optionally configure timePeriod and proxyConfiguration.
- Click Start to run the actor.
- Watch progress in real time; rows appear in the dataset as they're extracted.
- Open the OUTPUT tab to view the dataset and export to JSON or CSV.
- (Optional) Set outputFile to save a consolidated summary JSON to the key-value store.
Pro Tip: To compare mobile vs. desktop rankings programmatically, run two jobs with different deviceType values and diff results by position.
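As a rough sketch of that comparison, the helper below joins organic rows from two runs on (query, link) and reports the position difference. The function names and the sample rows are hypothetical; only the query, resultType, link, and position fields come from the documented output.

```python
def positions_by_key(rows):
    """Map (query, link) -> position for organic rows only."""
    return {
        (r["query"], r["link"]): r["position"]
        for r in rows
        if r.get("resultType") == "organic" and "link" in r and "position" in r
    }

def diff_positions(desktop_rows, mobile_rows):
    """Return one record per (query, link) seen in either run."""
    desktop = positions_by_key(desktop_rows)
    mobile = positions_by_key(mobile_rows)
    report = []
    for key in sorted(desktop.keys() | mobile.keys()):
        d, m = desktop.get(key), mobile.get(key)
        # delta < 0 means the link ranks higher (closer to 1) on mobile.
        delta = m - d if d is not None and m is not None else None
        report.append({"query": key[0], "link": key[1],
                       "desktop": d, "mobile": m, "delta": delta})
    return report

# Illustrative rows from two runs with different deviceType values.
desktop_rows = [
    {"query": "python tutorial", "resultType": "organic",
     "link": "https://a.example/", "position": 1},
    {"query": "python tutorial", "resultType": "organic",
     "link": "https://b.example/", "position": 2},
]
mobile_rows = [
    {"query": "python tutorial", "resultType": "organic",
     "link": "https://b.example/", "position": 1},
    {"query": "python tutorial", "resultType": "organic",
     "link": "https://c.example/", "position": 2},
]

report = diff_positions(desktop_rows, mobile_rows)
for entry in report:
    print(entry)
```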
Use cases
| Use case | Description |
|---|---|
| SEO research & competitor analysis | Track competitor rankings and SERP features using a reliable Baidu ranking scraper with device and language targeting. |
| Market research & trend monitoring | Monitor "top searches" and "people also search for" to identify rising topics and audience interests. |
| Content discovery & topic planning | Gather related searches to inform content briefs, clusters, and internal linking strategies. |
| Academic/behavioral research | Analyze SERP structures and related queries for research into search behavior in Chinese markets. |
| Bulk keyword auditing | Run large keyword sets in one batch to audit performance and identify low-competition opportunities. |
| SERP feature mapping | Capture answer boxes and related videos to understand how Baidu SERP features influence visibility. |
Why choose Baidu Search Scraper?
Built for precision and scale, this Baidu search engine scraper delivers structured SERP data with smart proxy management and clean output.
- Accurate, structured output with clearly defined fields per result type
- Language and device controls for regional and layout comparisons
- Scales to large keyword lists with consistent performance
- Developer-friendly JSON/CSV exports via the Apify dataset
- Safe and ethical: collects only publicly available data
- Cost-aware: no proxy by default, with automatic fallback only when needed
- More reliable than browser extensions or ad-hoc tools, with robust retries and logging
Bottom line: a dependable Baidu SERP data extractor that's production-ready for recurring workflows.
Is it legal / ethical to use Baidu Search Scraper?
Yes, when used responsibly. This actor collects data from publicly available Baidu search results and does not access private or password-protected content. As with any web data collection:
- Only scrape publicly available information.
- Ensure compliance with applicable regulations (e.g., GDPR, CCPA).
- Review Baidu's terms and your organization's policies.
- Do not use the tool for spam or misuse of data.
Users are responsible for ensuring legal compliance for their specific use case.
Input parameters & output format
Example JSON input
{
  "urls": ["python tutorial", "machine learning"],
  "deviceType": "desktop",
  "languageLocalization": 1,
  "startPage": 1,
  "numResults": 10,
  "timePeriod": {"daysAgo": 7},
  "maxPagination": 3,
  "outputFile": "baidu_serp_summary",
  "proxyConfiguration": {"useApifyProxy": false}
}
Input parameters
- urls (array, required)
  Description: Baidu search URLs (e.g., https://www.baidu.com/s?wd=python) OR plain search terms. Add one per line for bulk scraping. Default: none
- deviceType (string)
  Description: Desktop = www.baidu.com (default). Mobile/Tablet = m.baidu.com. Use to scrape mobile vs desktop SERP. Default: "desktop"
- languageLocalization (integer)
  Description: 1 = All languages (default). 2 = Simplified Chinese (简体中文). 3 = Traditional Chinese (繁體中文). Default: 1
- startPage (integer)
  Description: Page number to start scraping from. 1 = first page. Default: 1
- numResults (integer)
  Description: Number of results per page (1–50). Baidu typically shows 10. Default: 10
- timePeriod (object)
  Description: Optional date range filter. Use startDate + endDate (YYYY-MM-DD) for a custom range, or daysAgo for "last N days". Default: empty object with defaults
  - startDate (string): From date (YYYY-MM-DD). Default: ""
  - endDate (string): To date (YYYY-MM-DD). Default: ""
  - daysAgo (integer): Alternative: filter to last N days. Set 0 to disable. Default: 0
- maxPagination (integer)
  Description: Max pages to scrape per query. 0 = no explicit limit (still capped at 10 pages). Default: 3
- outputFile (string)
  Description: Optional custom key for key-value store. Results are always saved to the Apify dataset; if set, also saves a consolidated JSON to KVS with this name. Default: ""
- proxyConfiguration (object)
  Description: By default: no proxy. If Baidu blocks requests, the actor falls back to datacenter and then residential proxies (3 retries). Enable Apify proxy here if you want to start with proxy (fallback still applies). Default: { "useApifyProxy": false }
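To catch out-of-range values before starting a run, the documented ranges can be checked client-side. The sketch below is an assumption-laden helper, not part of the actor: `build_input` and `max_organic_rows` are hypothetical names, and the row estimate assumes Baidu actually returns numResults items on every page.

```python
def build_input(urls, device_type="desktop", language=1, start_page=1,
                num_results=10, max_pagination=3, days_ago=0, output_file=""):
    """Validate values against the documented ranges and return an input dict."""
    if not urls:
        raise ValueError("urls is required")
    if device_type not in ("desktop", "mobile", "tablet"):
        raise ValueError(f"unsupported deviceType: {device_type!r}")
    if language not in (1, 2, 3):
        raise ValueError("languageLocalization must be 1, 2, or 3")
    if start_page < 1:
        raise ValueError("startPage must be >= 1")
    if not 1 <= num_results <= 50:
        raise ValueError("numResults must be between 1 and 50")
    if not 0 <= max_pagination <= 10:
        raise ValueError("maxPagination must be between 0 and 10")
    if days_ago < 0:
        raise ValueError("daysAgo must be >= 0")
    return {
        "urls": list(urls),
        "deviceType": device_type,
        "languageLocalization": language,
        "startPage": start_page,
        "numResults": num_results,
        "timePeriod": {"startDate": "", "endDate": "", "daysAgo": days_ago},
        "maxPagination": max_pagination,
        "outputFile": output_file,
        "proxyConfiguration": {"useApifyProxy": False},
    }

def max_organic_rows(run_input):
    """Upper bound on organic rows per query: results per page times pages."""
    pages = run_input["maxPagination"] or 10  # 0 means no explicit limit, capped at 10
    return run_input["numResults"] * pages

example_input = build_input(["python tutorial", "machine learning"], days_ago=7)
print(max_organic_rows(example_input))
```

With the defaults (10 results per page, 3 pages) the estimate is 30 organic rows per query; other result types are counted separately.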
Example dataset row output (pushed during the run)
{
  "query": "python tutorial",
  "resultType": "organic",
  "title": "Python Tutorial - W3Schools",
  "link": "https://www.w3schools.com/python/",
  "snippet": "Learn Python with examples, exercises, and projects...",
  "displayedLink": "www.w3schools.com",
  "thumbnail": "https://img.example.com/thumb.jpg",
  "position": 1,
  "richSnippet": "Beginner-friendly · Free certificate"
}
Other result types use the same row structure with type-specific fields:
- answer_box rows include: title, content, source.
- related_video rows include: title, link, thumbnail.
- people_also_search_for, related_search, top_search rows include: searchTerm and (when available) link.
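When post-processing mixed rows, it can help to keep only the fields that apply to each result type. The mapping below is a sketch assembled from the lists above and the organic example row; `FIELDS_BY_TYPE` and `compact_row` are hypothetical names, and the actor may emit additional fields.

```python
# Type-specific fields, as listed in this section.
FIELDS_BY_TYPE = {
    "organic": ["title", "link", "snippet", "displayedLink",
                "thumbnail", "position", "richSnippet"],
    "answer_box": ["title", "content", "source"],
    "related_video": ["title", "link", "thumbnail"],
    "people_also_search_for": ["searchTerm", "link"],
    "related_search": ["searchTerm", "link"],
    "top_search": ["searchTerm", "link"],
}

def compact_row(row):
    """Keep query, resultType, and the non-empty fields relevant to the row's type."""
    keep = ["query", "resultType"] + FIELDS_BY_TYPE.get(row.get("resultType"), [])
    return {key: row[key] for key in keep if row.get(key) is not None}

# Illustrative answer_box row padded with empty fields.
raw = {
    "query": "python tutorial",
    "resultType": "answer_box",
    "title": "Python",
    "content": "Python is a high-level programming language...",
    "source": "Baidu Baike",
    "snippet": None,
    "position": None,
}
compact = compact_row(raw)
print(compact)
```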
Optional summary JSON (saved to key-value store when outputFile is set)
{
  "summary": {
    "total_queries": 2,
    "queries": ["python tutorial", "machine learning"],
    "total_organic_results": 20,
    "total_answer_boxes": 2,
    "total_related_videos": 1,
    "total_people_also_search_for": 8,
    "total_related_searches": 10,
    "total_top_searches": 6
  },
  "results_by_query": {
    "python tutorial": {
      "query": "python tutorial",
      "organic_results": [],
      "answer_box": [],
      "related_videos": [],
      "people_also_search_for": [],
      "related_searches": [],
      "top_searches": []
    }
  }
}
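If outputFile was not set, a structure with the same keys can be rebuilt from the dataset rows. The following is a sketch that mirrors the key names in the example above; it is not the actor's own aggregation code, and `build_summary` is a hypothetical helper.

```python
# resultType -> (section name in results_by_query, counter name in summary)
SECTION_BY_TYPE = {
    "organic": ("organic_results", "total_organic_results"),
    "answer_box": ("answer_box", "total_answer_boxes"),
    "related_video": ("related_videos", "total_related_videos"),
    "people_also_search_for": ("people_also_search_for", "total_people_also_search_for"),
    "related_search": ("related_searches", "total_related_searches"),
    "top_search": ("top_searches", "total_top_searches"),
}

def build_summary(rows):
    """Aggregate dataset rows into a summary shaped like the example above."""
    queries = []
    results_by_query = {}
    totals = {total_key: 0 for _, total_key in SECTION_BY_TYPE.values()}
    for row in rows:
        query = row["query"]
        if query not in results_by_query:
            queries.append(query)
            results_by_query[query] = {
                "query": query,
                **{section: [] for section, _ in SECTION_BY_TYPE.values()},
            }
        mapping = SECTION_BY_TYPE.get(row.get("resultType"))
        if mapping is None:
            continue  # skip result types this sketch does not know about
        section, total_key = mapping
        results_by_query[query][section].append(row)
        totals[total_key] += 1
    summary = {"total_queries": len(queries), "queries": queries, **totals}
    return {"summary": summary, "results_by_query": results_by_query}

# Illustrative rows, not real output.
demo_rows = [
    {"query": "python tutorial", "resultType": "organic", "position": 1},
    {"query": "python tutorial", "resultType": "top_search", "searchTerm": "python"},
    {"query": "machine learning", "resultType": "organic", "position": 1},
]
demo_summary = build_summary(demo_rows)
print(demo_summary["summary"])
```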
FAQ
Does it work without a proxy?
Yes. By default it uses no proxy. If Baidu blocks requests, it automatically falls back to Apify datacenter proxy and then residential proxy with up to 3 retries.
Can I use my own proxy or start with a proxy?
Yes. Configure proxyConfiguration in the input to enable Apify Proxy from the start. The automatic fallback still applies if a block occurs.
Can I target mobile vs. desktop SERPs?
Yes. Set deviceType to desktop, mobile, or tablet. Mobile/Tablet uses m.baidu.com, which can produce different SERP layouts and results.
How do I filter results by date?
Use timePeriod. Provide startDate and endDate for a custom range, or set daysAgo (e.g., 7 for "last week"). Leave it empty to disable filtering.
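A rolling daysAgo window shifts with every run, so pinning explicit dates may be preferable for reproducible comparisons. The helpers below are a small sketch of both forms; the function names are hypothetical, and treating "last N days" as the N days before a given date is an assumption about how the filter is interpreted.

```python
from datetime import date, timedelta

def time_period_last_days(days):
    """Rolling window: let the actor resolve 'last N days' at run time."""
    return {"daysAgo": days}

def time_period_range(start, end):
    """Fixed window using the documented YYYY-MM-DD format."""
    if start > end:
        raise ValueError("startDate must not be after endDate")
    return {"startDate": start.isoformat(), "endDate": end.isoformat()}

def pinned_last_days(days, today):
    """Assumption: approximate 'last N days' as an explicit range ending today."""
    return time_period_range(today - timedelta(days=days), today)

rolling = time_period_last_days(7)
pinned = pinned_last_days(7, date(2024, 3, 10))
print(rolling)
print(pinned)
```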
How many results can I extract per query?
Control depth with numResults (1–50 per page) and maxPagination (up to 10 pages; 0 means no explicit limit and is still capped at 10). The actor aggregates organic positions across pages.
What data types are included beyond organic results?
In addition to organic results, the scraper extracts answer boxes, related videos, people also search for, related searches, and top searches when present.
Where do results go and how can I export them?
Rows are pushed to the Apify dataset during the run. You can view them in the OUTPUT tab and export to JSON or CSV. If you set outputFile, a consolidated summary JSON is also saved to the key-value store.
Is this a Baidu SERP API I can use with Python?
You can run the actor on Apify and access results programmatically via the dataset (download JSON/CSV) to integrate with Python or other workflows, effectively using it as a Baidu search results API for your pipelines.
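One way to do this from Python is the official `apify-client` package (`pip install apify-client`). The sketch below assumes that package is installed and that a token and actor ID are supplied through the `APIFY_TOKEN` and `BAIDU_ACTOR_ID` environment variables; both variable names are made up for this example, and the actor's full ID is not stated on this page, so it is left for you to fill in.

```python
import os

def run_baidu_scraper(token, actor_id, run_input):
    """Start the actor, wait for it to finish, and return its dataset rows."""
    # Imported lazily so the helpers below work without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())

def top_organic(rows, limit=3):
    """Return the first `limit` organic rows ordered by position."""
    organic = [r for r in rows
               if r.get("resultType") == "organic" and r.get("position") is not None]
    return sorted(organic, key=lambda r: r["position"])[:limit]

if __name__ == "__main__":
    token = os.environ.get("APIFY_TOKEN")        # hypothetical variable name
    actor_id = os.environ.get("BAIDU_ACTOR_ID")  # hypothetical variable name
    if token and actor_id:
        rows = run_baidu_scraper(token, actor_id, {
            "urls": ["python tutorial"],
            "deviceType": "desktop",
            "numResults": 10,
            "maxPagination": 1,
        })
        for row in top_organic(rows):
            print(row["position"], row.get("title"), row.get("link"))
```

The network call only runs when both environment variables are set, so `top_organic` can be reused on rows loaded from an exported JSON file as well.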
Final thoughts
Baidu Search Scraper is built for structured, scalable Baidu SERP data extraction. With intelligent proxy fallback, bulk query support, and precise output fields, it's well suited to marketers, developers, analysts, and researchers. Export clean JSON/CSV from the dataset or save a consolidated summary to the key-value store for downstream automation, and build repeatable workflows for analysis, enrichment, and reporting.