CSDN Article Detail Scraper

Scrape the full content of CSDN blog articles by URL: title, body text, HTML content, publish date, view count, and tags. Optionally translate the content into English, Indonesian, or another language. No login required.

Developer: Romy (maintained by Community)

What does CSDN Article Detail Scraper do?

This Actor fetches the full content of CSDN blog articles from a list of URLs. Each article is parsed from the server-rendered HTML, so no JavaScript rendering is needed. It also supports optional automatic translation powered by Google Translate (no API key required).

Use cases

  • Content archiving — save full article text for offline storage or analysis
  • NLP / AI training data — collect Chinese technical articles as a dataset
  • Knowledge extraction — parse structured content from CSDN posts
  • International research — read Chinese technical content in your language via built-in translation
  • Pipeline integration — use alongside CSDN Article Search Scraper to go from keyword → full article content → translated

How to use

Step 1: Configure input

{
  "urls": [
    "https://blog.csdn.net/m0_58523831/article/details/120851261"
  ],
  "translateTo": "en"
}

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| urls | string[] | Yes | List of CSDN article URLs to scrape |
| translateTo | string | No | Target language code for translation. Leave empty to skip. Examples: en, id, ja, ko, de, fr |
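
If the input is generated by a script rather than typed by hand, a small helper can catch malformed URLs before a run is started. The following is a minimal sketch, not part of the Actor: the function name `build_input` and the URL pattern are assumptions inferred from the example above, and leaving `translateTo` out entirely is assumed to be equivalent to leaving it empty.

```python
import json
import re

# Assumed URL shape, inferred from the example input:
# https://blog.csdn.net/<user>/article/details/<numeric id>, optionally
# followed by a trailing slash, query string, or fragment.
CSDN_ARTICLE_URL = re.compile(
    r"^https://blog\.csdn\.net/[^/]+/article/details/\d+(?:[/?#].*)?$"
)


def build_input(urls, translate_to=""):
    """Return an input dict with the two documented fields (hypothetical helper)."""
    cleaned = []
    for url in urls:
        url = url.strip()
        if not CSDN_ARTICLE_URL.match(url):
            raise ValueError(f"not a recognised CSDN article URL: {url}")
        if url not in cleaned:  # drop duplicates, keep the original order
            cleaned.append(url)
    if not cleaned:
        raise ValueError("urls is required and must not be empty")
    run_input = {"urls": cleaned}
    if translate_to:  # assumption: omitting the field skips translation
        run_input["translateTo"] = translate_to
    return run_input


if __name__ == "__main__":
    example = build_input(
        ["https://blog.csdn.net/m0_58523831/article/details/120851261"],
        translate_to="en",
    )
    print(json.dumps(example, indent=2))
```

The resulting dict has the same shape as the JSON input shown above, so it can be pasted into the input editor or passed as the run input by whatever client starts the Actor.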

Step 2: Run and download results

Click Start and download results as JSON, CSV, or Excel from the Output tab.

Output

One object per URL:

{
  "url": "https://blog.csdn.net/m0_58523831/article/details/120851261",
  "title": "python从入门到精通——完整教程【转载】",
  "titleTranslated": "Python from beginner to proficient - complete tutorial [Reprinted]",
  "contentText": "文章目录\n一、pycharm下载安装...",
  "contentTextTranslated": "Article directory\n1. Download and install pycharm...",
  "contentHtml": "<div id=\"content_views\">...</div>",
  "publishedAt": "2021-10-19 09:50:05",
  "viewCount": "123456",
  "tags": ["Python", "入门"],
  "isVip": false,
  "contentLength": 23399,
  "translateTo": "en"
}

titleTranslated and contentTextTranslated are only present when translateTo is set.
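
Downstream code usually wants typed values rather than the raw strings. The sketch below shows one way to normalize a single output object; the helper name `normalize_article` is hypothetical, and it assumes `publishedAt` and `viewCount` always follow the formats seen in the sample above, which may not hold for every article.

```python
from datetime import datetime


def normalize_article(item):
    """Convert one output object into typed Python values (hypothetical helper)."""
    return {
        "url": item["url"],
        # Prefer the translated fields when translateTo was set.
        "title": item.get("titleTranslated") or item["title"],
        "text": item.get("contentTextTranslated") or item["contentText"],
        # Assumed format, taken from the sample: "2021-10-19 09:50:05".
        "published_at": datetime.strptime(item["publishedAt"], "%Y-%m-%d %H:%M:%S"),
        # viewCount is a string in the sample output, so convert it.
        "view_count": int(item["viewCount"]),
        "tags": list(item.get("tags", [])),
        "is_vip": bool(item.get("isVip", False)),
    }


# Trimmed copy of the sample output object.
sample = {
    "url": "https://blog.csdn.net/m0_58523831/article/details/120851261",
    "title": "python从入门到精通——完整教程【转载】",
    "titleTranslated": "Python from beginner to proficient - complete tutorial [Reprinted]",
    "contentText": "文章目录\n一、pycharm下载安装...",
    "contentTextTranslated": "Article directory\n1. Download and install pycharm...",
    "publishedAt": "2021-10-19 09:50:05",
    "viewCount": "123456",
    "tags": ["Python", "入门"],
    "isVip": False,
}

article = normalize_article(sample)
print(article["title"], article["published_at"].isoformat(), article["view_count"])
```

The resulting datetime is naive (no time zone attached), since the sample output does not state one.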

Pricing

| Event | Price |
| --- | --- |
| Actor start | $0.05 |
| Per 1,000 articles (no translation) | $3.00 |
| Per 1,000 articles (with translation) | $6.00 |

Examples (with translation):

  • 100 articles → $0.05 + $0.60 = $0.65
  • 1,000 articles → $0.05 + $6.00 = $6.05
  • 5,000 articles → $0.05 + $30.00 = $30.05
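
The arithmetic behind these examples can be written as a short function for estimating other batch sizes. This is a sketch based only on the prices in the table above; `estimate_cost` is a hypothetical name, and it assumes the per-1,000 rate is charged pro rata per article, as the 100-article example implies.

```python
from decimal import Decimal

ACTOR_START = Decimal("0.05")          # flat fee per run
PER_1000_PLAIN = Decimal("3.00")       # per 1,000 articles, no translation
PER_1000_TRANSLATED = Decimal("6.00")  # per 1,000 articles, with translation


def estimate_cost(articles, translate=False):
    """Estimate the cost in USD of one run, rounded to cents."""
    rate = PER_1000_TRANSLATED if translate else PER_1000_PLAIN
    total = ACTOR_START + rate * articles / Decimal(1000)
    return total.quantize(Decimal("0.01"))


for count in (100, 1000, 5000):
    print(f"{count} articles with translation: ${estimate_cost(count, translate=True)}")
```

Decimal is used instead of float so that the cent values stay exact.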

FAQ

Does this require login or cookies? No. CSDN renders the full article HTML on the server, so plain HTTP requests are sufficient and no JavaScript execution is needed.

Does translation require an API key? No. Translation uses Google Translate via an unofficial free endpoint, so no API key or account is needed.

Which languages are supported for translation? Any language supported by Google Translate. Common codes: en (English), id (Indonesian), ja (Japanese), ko (Korean), de (German), fr (French), es (Spanish).

What does isVip mean? Articles with isVip: true are behind CSDN's VIP paywall. For these, the scraper returns only a partial preview (about 500 characters), not the full content.
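
When building a dataset, it can help to set the paywalled previews aside so that truncated text is not mixed with complete articles. A minimal sketch, assuming the results have already been loaded as a list of dicts; `split_by_paywall` is a hypothetical helper, not something the Actor provides.

```python
def split_by_paywall(items):
    """Separate complete articles from VIP previews, based on the isVip flag."""
    full, previews = [], []
    for item in items:
        # Items without an isVip field are treated as complete articles.
        (previews if item.get("isVip") else full).append(item)
    return full, previews


results = [
    {"url": "https://blog.csdn.net/a/article/details/1", "isVip": False},
    {"url": "https://blog.csdn.net/b/article/details/2", "isVip": True},
]
full, previews = split_by_paywall(results)
print(len(full), "complete,", len(previews), "preview only")
```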

What is contentHtml? The raw HTML of the article body (the #content_views element). Useful if you need to preserve formatting, code blocks, or images.
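
As an illustration of what can be done with contentHtml, the sketch below pulls code blocks and image URLs out of it using only the Python standard library. It assumes code samples are wrapped in `<pre>` elements and images are ordinary `<img src="...">` tags; the actual CSDN markup may differ, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class CodeAndImageExtractor(HTMLParser):
    """Collect the text of <pre> blocks and the src of <img> tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.code_blocks = []
        self.images = []
        self._pre_depth = 0  # greater than 0 while inside a <pre> element
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self._pre_depth += 1
        elif tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.images.append(src)

    def handle_endtag(self, tag):
        if tag == "pre" and self._pre_depth > 0:
            self._pre_depth -= 1
            if self._pre_depth == 0:
                self.code_blocks.append("".join(self._buffer))
                self._buffer = []

    def handle_data(self, data):
        if self._pre_depth > 0:
            self._buffer.append(data)


def extract_code_and_images(content_html):
    """Return (code_blocks, image_urls) found in an article's contentHtml."""
    parser = CodeAndImageExtractor()
    parser.feed(content_html)
    parser.close()  # flush any text still buffered by the parser
    return parser.code_blocks, parser.images


# Made-up HTML for illustration, not real CSDN output.
html = (
    '<div id="content_views"><p>Intro</p>'
    '<pre><code class="language-python">print(&quot;hi&quot;)\n</code></pre>'
    '<img src="https://example.com/a.png" alt="x"/></div>'
)
code_blocks, images = extract_code_and_images(html)
print(code_blocks, images)
```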