CSDN Article Detail Scraper
Pricing: from $3.00 per 1,000 articles scraped
Scrape the full content of CSDN blog articles by URL: title, body text, HTML content, publish date, view count, and tags. Optionally translate the content to English, Indonesian, or any other language supported by Google Translate. No login required.
Developer: Romy (maintained by Community)
What does CSDN Article Detail Scraper do?
This Actor fetches the full content of CSDN blog articles from a list of URLs. CSDN renders each article server-side, so the Actor parses the returned HTML directly and no JavaScript rendering is needed. Optional auto-translation is powered by Google Translate (no API key required).
Use cases
- Content archiving — save full article text for offline storage or analysis
- NLP / AI training data — collect Chinese technical articles as a dataset
- Knowledge extraction — parse structured content from CSDN posts
- International research — read Chinese technical content in your language via built-in translation
- Pipeline integration — use alongside CSDN Article Search Scraper to go from keyword → full article content → translated
How to use
Step 1: Configure input
{
  "urls": ["https://blog.csdn.net/m0_58523831/article/details/120851261"],
  "translateTo": "en"
}
| Field | Type | Required | Description |
|---|---|---|---|
| `urls` | string[] | Yes | List of CSDN article URLs to scrape |
| `translateTo` | string | No | Target language code for translation. Leave empty to skip. Examples: `en`, `id`, `ja`, `ko`, `de`, `fr` |
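The input can also be assembled programmatically. The following is a minimal sketch, not part of the Actor: `build_input` is a hypothetical helper, the URL pattern is an assumption inferred from the single example URL above, and omitting `translateTo` (rather than sending an empty string) is one way to read "leave empty to skip".

```python
import json
import re

# Assumed URL shape, inferred from the one example URL; real CSDN URLs may vary.
CSDN_ARTICLE_URL = re.compile(r"^https://blog\.csdn\.net/[^/]+/article/details/\d+/?$")


def build_input(urls, translate_to=None):
    """Build the Actor input dict (hypothetical helper, not an Actor API)."""
    bad = [u for u in urls if not CSDN_ARTICLE_URL.match(u)]
    if bad:
        raise ValueError(f"Not recognised as CSDN article URLs: {bad}")
    run_input = {"urls": list(urls)}
    # translateTo is optional; this sketch omits it when no translation is wanted.
    if translate_to:
        run_input["translateTo"] = translate_to
    return run_input


if __name__ == "__main__":
    example = build_input(
        ["https://blog.csdn.net/m0_58523831/article/details/120851261"],
        translate_to="en",
    )
    print(json.dumps(example, indent=2))
```

Validating URLs before the run avoids paying for requests that cannot succeed; loosen the pattern if your URLs use a different shape.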
Step 2: Run and download results
Click Start and download results as JSON, CSV, or Excel from the Output tab.
Output
One object per URL:
{
  "url": "https://blog.csdn.net/m0_58523831/article/details/120851261",
  "title": "python从入门到精通——完整教程【转载】",
  "titleTranslated": "Python from beginner to proficient - complete tutorial [Reprinted]",
  "contentText": "文章目录\n一、pycharm下载安装...",
  "contentTextTranslated": "Article directory\n1. Download and install pycharm...",
  "contentHtml": "<div id=\"content_views\">...</div>",
  "publishedAt": "2021-10-19 09:50:05",
  "viewCount": "123456",
  "tags": ["Python", "入门"],
  "isVip": false,
  "contentLength": 23399,
  "translateTo": "en"
}
`titleTranslated` and `contentTextTranslated` are only present when `translateTo` is set.
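Because the translated fields are optional and `viewCount` arrives as a string in the sample, downstream code usually needs a small normalization step. The sketch below is one possible approach; `normalize` is a hypothetical helper, and it assumes `viewCount` is always a plain digit string and `publishedAt` always uses the format shown in the sample output.

```python
from datetime import datetime


def normalize(item):
    """Return a flat record with typed fields (hypothetical helper)."""
    return {
        "url": item["url"],
        # Prefer the translated fields, fall back to the originals.
        "title": item.get("titleTranslated", item["title"]),
        "text": item.get("contentTextTranslated", item["contentText"]),
        # Assumption: viewCount is a digit string, as in the sample output.
        "views": int(item["viewCount"]),
        # The sample carries no time zone, so this is a naive datetime.
        "published": datetime.strptime(item["publishedAt"], "%Y-%m-%d %H:%M:%S"),
        "tags": list(item.get("tags", [])),
    }


if __name__ == "__main__":
    sample = {
        "url": "https://blog.csdn.net/m0_58523831/article/details/120851261",
        "title": "python从入门到精通——完整教程【转载】",
        "contentText": "文章目录\n一、pycharm下载安装...",
        "publishedAt": "2021-10-19 09:50:05",
        "viewCount": "123456",
        "tags": ["Python", "入门"],
    }
    print(normalize(sample))
```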
Pricing
| Event | Price |
|---|---|
| Actor start | $0.05 |
| Per 1,000 articles (no translation) | $3.00 |
| Per 1,000 articles (with translation) | $6.00 |
Examples (with translation):
- 100 articles → $0.05 + $0.60 = $0.65
- 1,000 articles → $0.05 + $6.00 = $6.05
- 5,000 articles → $0.05 + $30.00 = $30.05
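The arithmetic behind these examples can be reproduced with a short script. This is a sketch based only on the price table above; `estimate_cost` is a hypothetical helper, and it assumes one start fee per run and per-article prices that scale linearly, as the 100-article example suggests.

```python
from decimal import Decimal

# Prices copied from the table above (USD).
START_FEE = Decimal("0.05")
PER_1000 = {False: Decimal("3.00"), True: Decimal("6.00")}


def estimate_cost(articles, translate=False):
    """Estimated cost in USD of one run that scrapes `articles` articles."""
    return START_FEE + PER_1000[translate] * articles / 1000


if __name__ == "__main__":
    for n in (100, 1000, 5000):
        print(n, "articles with translation:", estimate_cost(n, translate=True))
```

`Decimal` is used instead of `float` so that amounts such as $0.65 compare exactly.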
FAQ
Does this require login or cookies?
No. CSDN serves full article HTML server-side without JavaScript. Plain HTTP requests are sufficient.
Does translation require an API key?
No. Translation uses Google Translate via an unofficial free endpoint. No API key or account needed.
Which languages are supported for translation?
Any language supported by Google Translate. Common codes: en (English), id (Indonesian), ja (Japanese), ko (Korean), de (German), fr (French), es (Spanish).
What does isVip mean?
Articles with `isVip: true` are behind CSDN's VIP paywall. The scraper still returns a partial preview (about 500 characters), but not the full content.
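If preview-only articles should not end up in a dataset, they can be separated after the run. A minimal sketch, assuming the items are the output objects described above; `split_by_paywall` is a hypothetical helper.

```python
def split_by_paywall(items):
    """Separate full articles from VIP previews (hypothetical helper)."""
    full, previews = [], []
    for item in items:
        # Items with isVip set to true contain only a partial preview.
        (previews if item.get("isVip") else full).append(item)
    return full, previews


if __name__ == "__main__":
    items = [
        {"url": "a", "isVip": False, "contentLength": 23399},
        {"url": "b", "isVip": True, "contentLength": 500},
    ]
    full, previews = split_by_paywall(items)
    print(len(full), "full,", len(previews), "preview-only")
```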
What is contentHtml?
The raw HTML of the article body (the #content_views element). Useful if you need to preserve formatting, code blocks, or images.
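As an illustration of working with `contentHtml`, the sketch below pulls code snippets out of the article body using only the Python standard library. It assumes that code blocks are wrapped in `<pre>` elements, which is common for blog platforms but is not confirmed by this README; `PreExtractor` and `code_blocks` are hypothetical names.

```python
from html.parser import HTMLParser


class PreExtractor(HTMLParser):
    """Collect the text inside every <pre> element."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.blocks = []
        self._depth = 0  # nesting depth of open <pre> tags
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "pre" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                self.blocks.append("".join(self._buf))
                self._buf = []

    def handle_data(self, data):
        # Keep text only while inside a <pre> element.
        if self._depth:
            self._buf.append(data)


def code_blocks(content_html):
    """Return the text of each <pre> block found in contentHtml."""
    parser = PreExtractor()
    parser.feed(content_html)
    parser.close()
    return parser.blocks


if __name__ == "__main__":
    html = (
        '<div id="content_views"><p>Intro</p>'
        '<pre><code>print(&quot;hi&quot;)\n</code></pre></div>'
    )
    print(code_blocks(html))
```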