Pricing

Pay per usage

🔍 Baidu Search Scraper

Scrape Baidu search results at scale. Extract organic listings, answer boxes, related videos, related searches, and top searches. Supports bulk queries, proxy fallback, date filters, and device/language options for SEO and market research.

Developer

Scrapier

Last modified: 24 days ago

The 🔍 Baidu Search Scraper is a fast, scalable Baidu SERP scraper that extracts organic listings, answer boxes, related videos, "people also search for" terms, related searches, and top searches from public Baidu results pages. It collects clean, structured SERP data at scale, using a proxy fallback strategy and bulk query support. Built for marketers, developers, data analysts, and researchers, it supports SEO analysis, market research, and competitor tracking with consistent, repeatable output that is ready for automation.

What data / output can you get?

Below are the structured fields pushed to the Apify dataset in real time for each SERP element:

Data field	Description	Example value
query	The search query associated with the result row	"python tutorial"
resultType	Result category: organic, answer_box, related_video, people_also_search_for, related_search, top_search	"organic"
title	Title of the organic result or video	"Python Tutorial - W3Schools"
link	Canonical URL to the result (decoded from Baidu redirect when applicable)	"https://www.w3schools.com/python/"
snippet	Text snippet/description for organic results	"Learn Python with examples, exercises, and projects..."
displayedLink	Displayed domain or path extracted from the link	"www.w3schools.com"
thumbnail	Image URL (if present for organic or video blocks)	"https://img.example.com/thumb.jpg"
position	Calculated position among organic results across pages	1
content	Answer/content text for answer boxes	"Python is a high-level programming language..."
source	Source attribution for answer boxes	"Baidu Baike"
searchTerm	Related search term (for people_also_search_for, related_search, top_search)	"learn python online"
richSnippet	Additional rich text extracted from organic blocks (if present)	"Beginner-friendly · Free certificate"

Notes:

Results are stored as individual rows for real-time visibility during the run.
You can export data to JSON or CSV from the dataset.
Optionally, if you set outputFile, a summary JSON with totals and grouped results is saved to the key-value store.

Key features

🧠 Intelligent proxy fallback
Starts without a proxy to save cost. If Baidu blocks a request, the actor falls back to datacenter proxies and then to residential proxies (3 retries). Once a residential proxy works, it keeps using residential proxies for the remaining requests.
📦 Bulk keyword scraping
Supply multiple Baidu URLs or plain search terms and process them in a single run for high-throughput workflows.
📱 Device & language targeting
Control deviceType (desktop, mobile, tablet) and languageLocalization (All, Simplified Chinese, Traditional Chinese) to compare different SERP layouts and regions.
📅 Time period filtering
Use timePeriod to filter by startDate/endDate or daysAgo and narrow results to recent content.
📊 Structured SERP coverage
Extracts organic results, answer boxes, related videos, people also search for, related searches, and top searches with clean fields.
⚡ Real-time dataset output
Pushes each result row to the Apify dataset during the run, so you can monitor progress live and export JSON/CSV at completion.
💾 Optional summary export
Set outputFile to also save a consolidated JSON (with totals and results_by_query) to the key-value store for easy retrieval.
🛡️ Production-ready robustness
Retries, proxy fallback, and clear logging help runs finish predictably, even for large batches.
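
The proxy fallback described in the feature list can be sketched as follows. This is a minimal illustration, not the actor's actual code: the `fetch` callable, the `BlockedError` exception, and the tier names are hypothetical, and applying the 3 retries to the residential tier only is an assumption.

```python
from typing import Callable, Optional

# Tier order from the feature list: direct first, then datacenter, then residential.
TIERS = ["none", "datacenter", "residential"]
RESIDENTIAL_RETRIES = 3  # assumption: the "3 retries" apply to the residential tier


class BlockedError(Exception):
    """Raised by the (hypothetical) fetch callable when Baidu blocks a request."""


class ProxyFallback:
    def __init__(self, fetch: Callable[[str, str], str]):
        self.fetch = fetch  # fetch(url, tier) returns HTML or raises BlockedError
        self.sticky: Optional[str] = None  # set once the residential tier succeeds

    def get(self, url: str) -> str:
        # After residential has worked once, skip the cheaper tiers entirely.
        tiers = [self.sticky] if self.sticky else TIERS
        for tier in tiers:
            attempts = RESIDENTIAL_RETRIES if tier == "residential" else 1
            for _ in range(attempts):
                try:
                    html = self.fetch(url, tier)
                except BlockedError:
                    continue  # retry this tier, or move on when attempts run out
                if tier == "residential":
                    self.sticky = tier
                return html
        raise BlockedError(f"all proxy tiers failed for {url}")
```

Trying the cheapest tier first keeps proxy cost low for queries that are never blocked, while the sticky flag avoids re-testing tiers that have already failed.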

How to use Baidu Search Scraper - step by step

Create or log in to your Apify account at https://console.apify.com.
Go to Actors and open “baidu-search-scraper”.
Add your input:
- Paste Baidu search URLs or plain terms into urls (one per line).
- Choose deviceType (desktop/mobile/tablet) and languageLocalization.
- Set numResults and maxPagination to control depth.
- Optionally configure timePeriod and proxyConfiguration.
Click Start to run the actor.
Watch progress in real time—rows appear in the dataset as they’re extracted.
Open the OUTPUT tab to view the dataset and export to JSON or CSV.
(Optional) Set outputFile to save a consolidated summary JSON to the key-value store.

Pro Tip: To compare mobile vs. desktop rankings programmatically, run two jobs with different deviceType values and diff results by position.
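
One possible way to do the diff from the Pro Tip, assuming you have already loaded the rows of both runs as lists of dictionaries. The function name and the report layout are illustrative, and matching results on the pair (query, link) is an assumption about what counts as "the same" result.

```python
def position_diff(desktop_rows, mobile_rows):
    """Compare organic positions of the same (query, link) pair across two runs."""

    def organic_positions(rows):
        return {
            (r["query"], r["link"]): r["position"]
            for r in rows
            if r.get("resultType") == "organic" and r.get("link")
        }

    desktop = organic_positions(desktop_rows)
    mobile = organic_positions(mobile_rows)
    report = []
    for key in sorted(desktop.keys() | mobile.keys()):
        d, m = desktop.get(key), mobile.get(key)
        # delta > 0 means the result ranks lower on mobile than on desktop.
        delta = m - d if d is not None and m is not None else None
        report.append(
            {"query": key[0], "link": key[1], "desktop": d, "mobile": m, "delta": delta}
        )
    return report
```

Entries with a `None` position appear in only one of the two SERPs, which is often the more interesting finding.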

Use cases

Use case	Description
SEO research & competitor analysis	Track competitor rankings and SERP features using a reliable Baidu ranking scraper with device and language targeting.
Market research & trend monitoring	Monitor “top searches” and “people also search for” to identify rising topics and audience interests.
Content discovery & topic planning	Gather related searches to inform content briefs, clusters, and internal linking strategies.
Academic/behavioral research	Analyze SERP structures and related queries for research into search behavior in Chinese markets.
Bulk keyword auditing	Run large keyword sets in one batch to audit performance and identify low-competition opportunities.
SERP feature mapping	Capture answer boxes and related videos to understand how Baidu SERP features influence visibility.

Why choose Baidu Search Scraper?

Built for precision and scale, this Baidu search engine scraper delivers structured SERP data with smart proxy management and clean output.

🎯 Accurate, structured output with clearly defined fields per result type
🌍 Language and device controls for regional and layout comparisons
📈 Scales to large keyword lists with consistent performance
👨‍💻 Developer-friendly JSON/CSV exports via the Apify dataset
🛡️ Safe and ethical: collects only publicly available data
💸 Cost-aware: no proxy by default, with automatic fallback only when needed
🧱 More reliable than browser extensions or ad-hoc tools, with robust retries and logging

Bottom line: a dependable Baidu SERP data extractor that’s production-ready for recurring workflows.

Is it legal / ethical to use Baidu Search Scraper?

Yes—when used responsibly. This actor collects data from publicly available Baidu search results and does not access private or password-protected content. As with any web data collection:

Only scrape publicly available information.
Ensure compliance with applicable regulations (e.g., GDPR, CCPA).
Review Baidu’s terms and your organization’s policies.
Do not use the tool for spam or misuse of data.

Users are responsible for ensuring legal compliance for their specific use case.

Input parameters & output format

Example JSON input

{
  "urls": [
    "python tutorial",
    "machine learning"
  ],
  "deviceType": "desktop",
  "languageLocalization": 1,
  "startPage": 1,
  "numResults": 10,
  "timePeriod": {
    "daysAgo": 7
  },
  "maxPagination": 3,
  "outputFile": "baidu_serp_summary",
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}

Input parameters

urls (array, required)
Description: Baidu search URLs (e.g., https://www.baidu.com/s?wd=python) OR plain search terms. Add one per line for bulk scraping. Default: none
deviceType (string)
Description: Desktop = www.baidu.com (default). Mobile/Tablet = m.baidu.com. Use to scrape mobile vs desktop SERP. Default: "desktop"
languageLocalization (integer)
Description: 1 = All languages (default). 2 = Simplified Chinese (简体中文). 3 = Traditional Chinese (繁體中文). Default: 1
startPage (integer)
Description: Page number to start scraping from. 1 = first page. Default: 1
numResults (integer)
Description: Number of results per page (1–50). Baidu typically shows 10. Default: 10
timePeriod (object)
Description: Optional date range filter. Use startDate + endDate (YYYY-MM-DD) for a custom range, or daysAgo for “last N days”. Default: an object with the sub-field defaults listed below
- startDate (string) – From date (YYYY-MM-DD). Default: ""
- endDate (string) – To date (YYYY-MM-DD). Default: ""
- daysAgo (integer) – Alternative: filter to last N days. Set 0 to disable. Default: 0
maxPagination (integer)
Description: Max pages to scrape per query. 0 = no explicit limit, in which case the actor still stops at the hard cap of 10 pages. Default: 3
outputFile (string)
Description: Optional custom key for key-value store. Results are always saved to the Apify dataset; if set, also saves a consolidated JSON to KVS with this name. Default: ""
proxyConfiguration (object)
Description: By default, no proxy is used. If Baidu blocks a request, the actor falls back to a datacenter proxy and then to a residential proxy (3 retries). Enable Apify Proxy here if you want to start with a proxy; the fallback still applies. Default: { "useApifyProxy": false }
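
Before starting a large batch, it can help to check an input object against the constraints listed above on your own side. The sketch below is a client-side helper, not part of the actor; the function names are hypothetical. How the actor resolves an input that sets both a date range and daysAgo is not documented here, so letting the explicit range win is an assumption.

```python
from datetime import date, timedelta

DEVICE_TYPES = {"desktop", "mobile", "tablet"}
LANGUAGES = {1, 2, 3}  # 1 = all, 2 = Simplified Chinese, 3 = Traditional Chinese


def validate_input(run_input):
    """Raise ValueError if a documented constraint is violated."""
    if not run_input.get("urls"):
        raise ValueError("urls is required and must not be empty")
    if run_input.get("deviceType", "desktop") not in DEVICE_TYPES:
        raise ValueError("deviceType must be desktop, mobile, or tablet")
    if run_input.get("languageLocalization", 1) not in LANGUAGES:
        raise ValueError("languageLocalization must be 1, 2, or 3")
    if not 1 <= run_input.get("numResults", 10) <= 50:
        raise ValueError("numResults must be between 1 and 50")
    if not 0 <= run_input.get("maxPagination", 3) <= 10:
        raise ValueError("maxPagination must be between 0 and 10")


def resolve_time_period(time_period, today=None):
    """Return a (start, end) date pair, or None when no filter is set."""
    today = today or date.today()
    tp = time_period or {}
    start, end = tp.get("startDate") or "", tp.get("endDate") or ""
    if start and end:  # assumption: an explicit range takes precedence over daysAgo
        s, e = date.fromisoformat(start), date.fromisoformat(end)
        if s > e:
            raise ValueError("startDate must not be after endDate")
        return s, e
    days = tp.get("daysAgo") or 0
    if days > 0:
        return today - timedelta(days=days), today
    return None  # empty strings and daysAgo = 0 mean "no date filter"
```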

Example dataset row output (pushed during the run)

{
  "query": "python tutorial",
  "resultType": "organic",
  "title": "Python Tutorial - W3Schools",
  "link": "https://www.w3schools.com/python/",
  "snippet": "Learn Python with examples, exercises, and projects...",
  "displayedLink": "www.w3schools.com",
  "thumbnail": "https://img.example.com/thumb.jpg",
  "position": 1,
  "richSnippet": "Beginner-friendly · Free certificate"
}

Other result types use the same row structure with type-specific fields:

answer_box rows include: title, content, source.
related_video rows include: title, link, thumbnail.
people_also_search_for, related_search, top_search rows include: searchTerm and (when available) link.
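
Because every row shares one structure, downstream code usually wants only the fields that matter for a given resultType. A small sketch of that projection follows; the field lists mirror the ones named in this section, while `FIELDS_BY_TYPE` and `compact_row` are illustrative names.

```python
# Fields that carry data for each resultType, as listed in this section.
FIELDS_BY_TYPE = {
    "organic": ["title", "link", "snippet", "displayedLink",
                "thumbnail", "position", "richSnippet"],
    "answer_box": ["title", "content", "source"],
    "related_video": ["title", "link", "thumbnail"],
    "people_also_search_for": ["searchTerm", "link"],
    "related_search": ["searchTerm", "link"],
    "top_search": ["searchTerm", "link"],
}


def compact_row(row):
    """Keep query, resultType, and the non-empty fields relevant to the row's type."""
    fields = FIELDS_BY_TYPE.get(row.get("resultType"), [])
    out = {"query": row.get("query"), "resultType": row.get("resultType")}
    out.update({f: row[f] for f in fields if row.get(f) is not None})
    return out
```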

Optional summary JSON (saved to key-value store when outputFile is set)

{
  "summary": {
    "total_queries": 2,
    "queries": ["python tutorial", "machine learning"],
    "total_organic_results": 20,
    "total_answer_boxes": 2,
    "total_related_videos": 1,
    "total_people_also_search_for": 8,
    "total_related_searches": 10,
    "total_top_searches": 6
  },
  "results_by_query": {
    "python tutorial": {
      "query": "python tutorial",
      "organic_results": [],
      "answer_box": [],
      "related_videos": [],
      "people_also_search_for": [],
      "related_searches": [],
      "top_searches": []
    }
  }
}
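
If you did not set outputFile, a summary of the same shape can be rebuilt from the dataset rows. The sketch below follows the key names shown in the example above; it is a client-side reconstruction under that assumption, not the actor's own implementation.

```python
# resultType -> (list key inside results_by_query, counter key inside summary)
BUCKETS = {
    "organic": ("organic_results", "total_organic_results"),
    "answer_box": ("answer_box", "total_answer_boxes"),
    "related_video": ("related_videos", "total_related_videos"),
    "people_also_search_for": ("people_also_search_for", "total_people_also_search_for"),
    "related_search": ("related_searches", "total_related_searches"),
    "top_search": ("top_searches", "total_top_searches"),
}


def build_summary(rows):
    """Group dataset rows by query and result type, and count each type."""
    by_query = {}
    totals = {total_key: 0 for _, total_key in BUCKETS.values()}
    for row in rows:
        bucket = BUCKETS.get(row.get("resultType"))
        if bucket is None:
            continue  # ignore rows with an unknown resultType
        group_key, total_key = bucket
        query = row["query"]
        entry = by_query.setdefault(
            query, {"query": query, **{g: [] for g, _ in BUCKETS.values()}}
        )
        entry[group_key].append(row)
        totals[total_key] += 1
    summary = {"total_queries": len(by_query), "queries": list(by_query), **totals}
    return {"summary": summary, "results_by_query": by_query}
```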

FAQ

Does it work without a proxy?

Yes. By default it uses no proxy. If Baidu blocks requests, it automatically falls back to Apify datacenter proxy and then residential proxy with up to 3 retries.

Can I use my own proxy or start with a proxy?

Yes. Configure proxyConfiguration in the input to enable Apify Proxy from the start. The automatic fallback still applies if a block occurs.

Can I target mobile vs. desktop SERPs?

Yes. Set deviceType to desktop, mobile, or tablet. Mobile/Tablet uses m.baidu.com, which can produce different SERP layouts and results.
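
To illustrate how a plain search term and a deviceType might map to a search URL, here is a hedged sketch. The desktop form follows the example URL given for the urls parameter (https://www.baidu.com/s?wd=python). Reusing the same path and wd parameter on m.baidu.com is an assumption, and `to_search_url` is a hypothetical helper, not an actor function.

```python
from urllib.parse import quote_plus

# Host per deviceType, as described for the deviceType parameter.
HOSTS = {"desktop": "www.baidu.com", "mobile": "m.baidu.com", "tablet": "m.baidu.com"}


def to_search_url(entry, device_type="desktop"):
    """Return a Baidu search URL for a plain term; pass full URLs through unchanged."""
    entry = entry.strip()
    if entry.startswith(("http://", "https://")):
        return entry
    host = HOSTS[device_type]
    # Assumption: the mobile host accepts the same /s?wd= query form as desktop.
    return f"https://{host}/s?wd={quote_plus(entry)}"
```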

How do I filter results by date?

Use timePeriod. Provide startDate and endDate for a custom range, or set daysAgo (e.g., 7 for “last week”). Leave it empty to disable filtering.

How many results can I extract per query?

Control depth with numResults (1–50 per page) and maxPagination (1–10 pages, or 0 to use the 10-page cap). Organic positions are numbered continuously across pages.
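
As a quick worked example of these limits, the upper bound on organic rows per query is numResults multiplied by the number of pages actually scraped. The helper below is only arithmetic on the documented ranges; real runs may return fewer rows when Baidu has fewer results.

```python
PAGE_CAP = 10  # documented hard cap on pages per query


def max_organic_rows(num_results=10, max_pagination=3):
    """Upper bound on organic rows for one query."""
    pages = PAGE_CAP if max_pagination == 0 else min(max_pagination, PAGE_CAP)
    return num_results * pages
```

With the defaults (10 results per page, 3 pages) the bound is 30 rows per query; with numResults set to 50 and maxPagination set to 0 it is 500.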

What data types are included beyond organic results?

In addition to organic results, the scraper extracts answer boxes, related videos, people also search for, related searches, and top searches when present.

Where do results go and how can I export them?

Rows are pushed to the Apify dataset during the run. You can view them in the OUTPUT tab and export to JSON or CSV. If you set outputFile, a consolidated summary JSON is also saved to the key-value store.

Is this a Baidu SERP API I can use with Python?

Yes, in effect. Run the actor on Apify, then read the results programmatically from the dataset (as JSON or CSV) in Python or any other workflow. Used this way, it serves as a Baidu search results API for your pipelines.
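
A minimal sketch of that workflow with the apify-client Python package (installed separately with pip install apify-client). The actor ID is left as a parameter because the exact ID is not stated here; `build_run_input` and `run_scraper` are illustrative helper names, and the input keys follow the example JSON input above.

```python
def build_run_input(queries, device_type="desktop", days_ago=0, max_pagination=3):
    """Assemble an input object using the parameter names documented above."""
    return {
        "urls": list(queries),
        "deviceType": device_type,
        "languageLocalization": 1,
        "startPage": 1,
        "numResults": 10,
        "timePeriod": {"daysAgo": days_ago},
        "maxPagination": max_pagination,
        "proxyConfiguration": {"useApifyProxy": False},
    }


def run_scraper(token, actor_id, run_input):
    """Start the actor, wait for it to finish, and return all dataset rows."""
    from apify_client import ApifyClient  # imported lazily; requires apify-client

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

A call would look like run_scraper(token, "<username>/baidu-search-scraper", build_run_input(["python tutorial"])), where the token comes from your Apify account and the actor ID placeholder must be replaced with the real one.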

Final thoughts

The 🔍 Baidu Search Scraper is built for structured, scalable Baidu SERP data extraction. With intelligent proxy fallback, bulk query support, and precise output fields, it’s ideal for marketers, developers, analysts, and researchers. Export clean JSON/CSV from the dataset or save a consolidated summary to the key-value store for downstream automation. Start extracting smarter Baidu SEO insights and build repeatable workflows for analysis, enrichment, and reporting.
