Baidu Search Scraper

Pricing

from $1.80 / 1,000 results

Scrapes Baidu search results with all major filters and pagination.

Rating

0.0

(0)

Developer

Lofomachines

Maintained by Community

Actor stats

Bookmarked: 0

Total users: 2

Monthly active users: 1

Last modified: 3 days ago

Baidu Search Scraper | Extract Baidu Search Results

Baidu Search Scraper is a powerful web scraping tool designed to extract search engine results pages (SERPs) from Baidu (China's leading search engine). This scraper is built to reliably bypass captchas, anti-scraping systems, and bot detection mechanisms.

Whether you need to collect search results for SEO monitoring, brand protection, sentiment analysis, academic research, or market intelligence, this Baidu Scraper handles the complexities of pagination.


🚀 Key Features

  • Sequential Multi-Query Search: Input multiple search terms (one per line) and process them in a single run. The scraper reuses the browser container for ultra-fast execution.
  • Real Destination URL Resolution: Baidu search results use encrypted redirect links (baidu.com/link?url=...). This scraper automatically follows redirects to extract and output the real destination URLs.
  • Advanced Baidu Search Operators: Fully supports filters such as:
    • Domain filtering (site:domain.com) - supports multiple domains combined with OR.
    • Recency/Time range (last 24 hours, week, month, year).
    • File type filtering (PDF, Word, Excel, PPT, RTF).
    • Language Script Selection (Simplified or Traditional Chinese).
    • Search in page titles only (intitle:) and Exact phrase matching ("phrase").
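The way these operators combine into a single search string can be sketched as follows. This is a minimal illustration, not the Actor's actual implementation: the function name `build_baidu_query` is hypothetical, the `OR` joiner follows the wording in the list above, and both the minus prefix for excluded words and the per-word `intitle:` handling are assumptions.

```python
from typing import Iterable, Optional


def build_baidu_query(
    keywords: str,
    sites: Iterable[str] = (),
    exact_phrase: str = "",
    exclude_words: Iterable[str] = (),
    title_only: bool = False,
    filetype: Optional[str] = None,
) -> str:
    """Combine search operators into one query string (illustrative only)."""
    parts = []

    # Assumption: intitle: is applied to each whitespace-separated term.
    if title_only:
        parts.append(" ".join(f"intitle:{term}" for term in keywords.split()))
    else:
        parts.append(keywords)

    # Exact phrase matching wraps the phrase in double quotes.
    if exact_phrase:
        parts.append(f'"{exact_phrase}"')

    # A single domain is used as-is; several domains are joined with OR.
    site_terms = [f"site:{site}" for site in sites]
    if len(site_terms) == 1:
        parts.append(site_terms[0])
    elif site_terms:
        parts.append("(" + " OR ".join(site_terms) + ")")

    if filetype:
        parts.append(f"filetype:{filetype}")

    # Assumption: a leading minus excludes a word.
    parts.extend(f"-{word}" for word in exclude_words)

    return " ".join(parts)
```

Calling `build_baidu_query("ai", sites=["a.com", "b.com"], filetype="pdf")` yields a string that restricts the keyword to PDF files on either domain.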

🎯 Use Cases

  1. Chinese Market SEO Monitoring: Track organic rankings, indexation status, and SERP visibility for keywords on Baidu.
  2. Brand Protection & Infringement Tracking: Search for unauthorized sellers, trademark violations, or fake brand representations on Chinese web properties.
  3. Competitor Intelligence: Analyze competitor landing pages, display domains, and the search snippets that rank for specific terms.
  4. Academic & Sentiment Analysis Research: Extract historical data, news snippets, and online discussions relevant to Chinese culture, business, or politics.

🛠️ How to Use

  1. Configure Queries: Enter one or more keywords/queries in the Queries / Search Terms input box (one per line).
  2. Define Max Results: Set the maximum number of results you want to retrieve per query (e.g., 100).
  3. Apply Filters: (Optional) Restrict results by site/domain, publication date (time range), language script, or file type.
  4. Enable URL Resolution: Keep Resolve real URLs checked to follow redirects and get the actual target URLs instead of raw Baidu redirect links.
  5. Configure Proxy: For heavy usage, enable the Apify Proxy (using residential proxies is recommended to avoid IP bans).
  6. Run the Actor: Click the Run button. The scraper will collect the data and store it in your default dataset.
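The same steps can also be run from code rather than the web console. The sketch below assumes the `apify-client` Python package and an API token in the `APIFY_TOKEN` environment variable. `ACTOR_ID` is a placeholder, since the Actor's real identifier is not given here, and `run_scraper` is a hypothetical helper name.

```python
from typing import Any, Dict, List

# Placeholder: copy the real Actor ID from the Actor's page in Apify Store.
ACTOR_ID = "<username>/<actor-name>"


def run_scraper(client: Any, actor_id: str, run_input: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Start the Actor, wait for it to finish, and return the dataset items."""
    run = client.actor(actor_id).call(run_input=run_input)
    dataset = client.dataset(run["defaultDatasetId"])
    return list(dataset.iterate_items())


def main() -> None:
    # Imported here so the module loads even if apify-client is not installed.
    import os

    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    run_input = {
        "queries": ["claude anthropic"],
        "maxResults": 100,
        "resolveRealUrls": True,
        # Residential proxies, as recommended in step 5.
        "proxyConfiguration": {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
        },
    }
    for item in run_scraper(client, ACTOR_ID, run_input):
        print(item["position"], item["title"], item["url"])
```

`main()` is deliberately not called at import time. Call it yourself once `ACTOR_ID` and the token are set.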

📥 Input Configuration

Here is a list of the available input parameters:

| Field Name | Type | Description | Default |
| --- | --- | --- | --- |
| queries | array | List of search queries to run sequentially (one per line). | ["claude anthropic"] |
| maxResults | integer | Max results to collect for each query. | 100 |
| timeRange | string | Filter results by date: any, day, week, month, or year. | "any" |
| sites | array | Limit search to specific domains (e.g. wikipedia.org). | [] |
| filetype | string | Limit results to specific file types: pdf, doc, xls, ppt, rtf. | "any" |
| language | string | Chinese script: any, simplified, or traditional. | "any" |
| exactPhrase | string | Require results to contain this exact phrase. | "" |
| excludeWords | array | Exclude results containing these words. | [] |
| titleOnly | boolean | Restrict search matches to page titles only. | false |
| resolveRealUrls | boolean | Follow Baidu redirect links to get the real target URL. | true |
| proxyConfiguration | object | Proxy settings (apify proxy, custom proxies). | None |
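When inputs are assembled in code, a small helper can fill in the defaults and reject values the table does not list. This is a sketch under stated assumptions: `build_input` is a hypothetical name, the allowed values and defaults are copied from the table above, and the lower bound on `maxResults` is a guess, not a documented limit.

```python
import copy
from typing import Any, Dict

# Allowed values copied from the input table.
ALLOWED_VALUES = {
    "timeRange": {"any", "day", "week", "month", "year"},
    "filetype": {"any", "pdf", "doc", "xls", "ppt", "rtf"},
    "language": {"any", "simplified", "traditional"},
}

# Defaults copied from the input table (proxyConfiguration has none).
DEFAULTS: Dict[str, Any] = {
    "queries": ["claude anthropic"],
    "maxResults": 100,
    "timeRange": "any",
    "sites": [],
    "filetype": "any",
    "language": "any",
    "exactPhrase": "",
    "excludeWords": [],
    "titleOnly": False,
    "resolveRealUrls": True,
}


def build_input(**overrides: Any) -> Dict[str, Any]:
    """Merge overrides into the defaults and validate enumerated fields."""
    unknown = set(overrides) - set(DEFAULTS) - {"proxyConfiguration"}
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")

    # Deep copy so callers never mutate the shared default lists.
    run_input = copy.deepcopy(DEFAULTS)
    run_input.update(overrides)

    for field, allowed in ALLOWED_VALUES.items():
        if run_input[field] not in allowed:
            raise ValueError(f"{field} must be one of {sorted(allowed)}")

    # Assumption: at least one result per query must be requested.
    if run_input["maxResults"] < 1:
        raise ValueError("maxResults must be at least 1")
    return run_input
```

For example, `build_input(timeRange="week", sites=["wikipedia.org"])` returns a complete input object, while `build_input(filetype="exe")` raises `ValueError`.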

📤 Output Format

Each scraped search result item is stored as an object in the Apify dataset. The scraper outputs the following fields:

| Field | Type | Description |
| --- | --- | --- |
| query | string | The search query term. |
| position | integer | 1-based ranking position of the result for this query. |
| page | integer | The page number on Baidu where the result was found. |
| title | string | The title of the search result page. |
| url | string | The resolved, final destination URL (e.g., https://example.com/page). |
| baiduUrl | string | The original Baidu redirect URL. |
| displayUrl | string | The display domain name shown on Baidu. |
| snippet | string | Description snippet text matching your search terms. |
| date | string | Publication date of the page (if shown on Baidu). |
| siteName | string | Displayed name of the website (if shown on Baidu). |

Output JSON Example

{
"query": "apple",
"position": 1,
"page": 1,
"title": "Apple (ไธญๅ›ฝๅคง้™†) - ๅฎ˜ๆ–น็ฝ‘็ซ™",
"url": "https://www.apple.com.cn/",
"baiduUrl": "http://www.baidu.com/link?url=6lHipUPotM6NN3efDPvd4gZk1ZSQhtVwsIBdG3DGtmFUBe5LzfEdru89qaxDmtNy",
"displayUrl": "www.apple.com.cn/",
"snippet": "ๆŽข็ดขApple ็š„ๅˆ›ๆ–ฐไธ–็•Œ,้€‰่ดญๅ„ๅผ iPhoneใ€iPadใ€Apple Watch ๅ’Œ Mac,ๆต่งˆๅ„็ฑป้…ไปถใ€ๅจฑไนไบงๅ“,ๅนถ่Žทๅพ—็›ธๅ…ณไบงๅ“็š„ไธ“ๅฎถๆœๅŠกๆ”ฏๆŒใ€‚",
"date": null,
"siteName": null
}
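For downstream processing, each dataset item can be loaded into a typed record. The sketch below mirrors the field names listed above; the class `BaiduResult`, the helper `parse_result`, and the `was_resolved` property are hypothetical names introduced for illustration. The property relies on the documented fallback behaviour, where `url` equals `baiduUrl` when resolution fails.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class BaiduResult:
    query: str
    position: int
    page: int
    title: str
    url: str
    baiduUrl: str
    displayUrl: str
    snippet: str
    # date and siteName are null when Baidu does not show them.
    date: Optional[str] = None
    siteName: Optional[str] = None

    @property
    def was_resolved(self) -> bool:
        """True when url differs from the raw Baidu redirect link."""
        return self.url != self.baiduUrl


def parse_result(raw: str) -> BaiduResult:
    """Parse one JSON item, ignoring any fields not declared above."""
    data = json.loads(raw)
    known = {f.name for f in fields(BaiduResult)}
    return BaiduResult(**{k: v for k, v in data.items() if k in known})
```

Filtering to known fields keeps the parser tolerant if items ever carry extra keys.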

💡 Troubleshooting & Performance Tips

  • Speeding up Runs: Setting resolveRealUrls to false makes the scraper significantly faster because it no longer sends a HEAD/GET HTTP request to every target website. If you only need domain names or the raw Baidu redirect links, turn this option off.
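If only domain names are needed, they can be derived from the displayUrl field without any redirect resolution. This is a minimal sketch: `display_domain` is a hypothetical helper, and it assumes displayUrl holds a host optionally followed by a path, as in the output example. Baidu may show truncated or non-URL text in that field, in which case the result will not be a usable domain.

```python
from urllib.parse import urlsplit


def display_domain(display_url: str) -> str:
    """Return the lowercase host part of a displayUrl value, or ""."""
    text = display_url.strip()
    # urlsplit only recognises a host when the string has a scheme or
    # starts with "//", so scheme-less values get a "//" prefix.
    if "://" not in text:
        text = "//" + text
    return (urlsplit(text).hostname or "").lower()
```

With the output example above, `display_domain("www.apple.com.cn/")` gives `"www.apple.com.cn"`.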

❓ FAQ

Q: Can I scrape thousands of keywords?
A: Yes! You can input a large list of keywords in the queries field.

Q: Why are some destination URLs identical to the Baidu redirect URLs?
A: If the target website is offline, slow to respond, or blocks redirect resolution requests, the scraper falls back to the original Baidu redirect link to ensure you do not lose data.
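The fallback described in this answer can be sketched as follows. This is an illustration using only the Python standard library, not the Actor's actual code: `fetch_final_url` and `resolve_or_fallback` are hypothetical names, and the 10-second timeout is an arbitrary choice.

```python
import urllib.request
from typing import Callable


def fetch_final_url(url: str, timeout: float = 10.0) -> str:
    """Send a HEAD request; urllib follows redirects automatically."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # The URL of the response after all redirects were followed.
        return response.url


def resolve_or_fallback(
    baidu_url: str,
    fetch: Callable[[str], str] = fetch_final_url,
) -> str:
    """Return the real destination, or the Baidu link if resolution fails."""
    try:
        final_url = fetch(baidu_url)
    except Exception:
        # Offline, slow, or blocking targets: keep the redirect link so
        # the result row is not lost.
        return baidu_url
    return final_url or baidu_url
```

Passing the fetcher as a parameter keeps the fallback logic testable without network access. Rows where the returned value equals the input are the ones whose resolution failed.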