CASC SpaceChina Corporate News Scraper
Pricing
Pay per event
Scrapes press releases and corporate news from SpaceChina.com — the public news portal of CASC (China Aerospace Science and Technology Corporation). Extracts articles from the 集团要闻, 媒体聚焦, and 专题报道 subchannels with full body text, publish date, and subsidiary mentions.
Developer: BowTiedRaccoon (maintained by Community)
Last modified: 22 days ago
Scrapes press releases and corporate news from SpaceChina.com — the public news portal of CASC (China Aerospace Science and Technology Corporation / 中国航天科技集团有限公司). Extracts structured articles from three news subchannels with full body text, publication date, image URLs, PDF attachments, and automatic CASC subsidiary detection.
What it collects
| Field | Description |
|---|---|
| `article_id` | Unique numeric ID from the article URL (e.g. 4632103) |
| `subchannel` | Source subchannel: 集团要闻, 媒体聚焦, or 专题报道 |
| `title_zh` | Article title in Chinese |
| `title_en` | English title. Always `null` in this release; populating it from the english.spacechina.com mirror is a future enhancement |
| `body_html` | Full article body HTML |
| `body_text` | Full article body as plain text |
| `publish_date` | Publication date (ISO 8601, e.g. 2026-06-11) |
| `source_url` | Canonical article URL |
| `mentioned_subsidiaries` | CASC academy/subsidiary names detected in the body (一院 through 八院, CALT, CAST, SAST) |
| `images` | Absolute URLs of embedded article images |
| `attachments` | Absolute URLs of PDF attachments (e.g. annual social-responsibility reports) |
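For downstream processing, the fields above can be mirrored in a small typed record. The following is a minimal sketch, not the actor's own code: the `Article` class and its `from_item` and `mentions` helpers are hypothetical names, and it assumes each dataset item is a JSON object keyed exactly as in the table, with `publish_date` in the `YYYY-MM-DD` form shown there.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Any, Optional


@dataclass
class Article:
    """Hypothetical typed view of one dataset item; not part of the actor."""

    article_id: str
    subchannel: str
    title_zh: str
    source_url: str
    publish_date: Optional[date] = None
    title_en: Optional[str] = None
    body_html: str = ""
    body_text: str = ""
    mentioned_subsidiaries: list[str] = field(default_factory=list)
    images: list[str] = field(default_factory=list)
    attachments: list[str] = field(default_factory=list)

    @classmethod
    def from_item(cls, item: dict[str, Any]) -> "Article":
        """Build an Article from a raw item, tolerating missing optional keys."""
        raw_date = item.get("publish_date")
        return cls(
            # The ID is numeric in the URL; it is stored as text here so the
            # sketch works whether the item carries an int or a string.
            article_id=str(item["article_id"]),
            subchannel=item["subchannel"],
            title_zh=item["title_zh"],
            source_url=item["source_url"],
            # Assumes the YYYY-MM-DD form shown in the table.
            publish_date=date.fromisoformat(raw_date) if raw_date else None,
            title_en=item.get("title_en"),
            body_html=item.get("body_html") or "",
            body_text=item.get("body_text") or "",
            mentioned_subsidiaries=list(item.get("mentioned_subsidiaries") or []),
            images=list(item.get("images") or []),
            attachments=list(item.get("attachments") or []),
        )

    def mentions(self, name: str) -> bool:
        """True if `name` appears in the detected-subsidiary list."""
        return name in self.mentioned_subsidiaries
```

With records in this shape, a filter such as `[a for a in articles if a.mentions("八院")]` selects articles where that academy was detected.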
Subchannels covered
| Key | Chinese | English |
|---|---|---|
| `jtyw` | 集团要闻 | Group News: primary launch and operations press releases |
| `mjjj` | 媒体聚焦 | Media Focus: aggregated external press coverage |
| `ztbd` | 专题报道 | Special Reports: themed coverage (missions, events, policy) |
The actor crawls all pages within each selected subchannel, following the site's paginated listing structure automatically.
Input
| Parameter | Type | Required | Description |
|---|---|---|---|
| `maxItems` | integer | Yes | Maximum number of articles to scrape. Set to a high value (or remove the cap) for a full historical crawl (~3,000+ articles across all subchannels). |
| `subchannels` | array | Yes | Which subchannels to include. Accepts any combination of `jtyw`, `mjjj`, `ztbd`. Default: all three. |
Example input
    {
      "maxItems": 100,
      "subchannels": ["jtyw"]
    }
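Before submitting a run, it can help to check the input locally against the documented parameters. The sketch below is illustrative only: `build_run_input` and `VALID_SUBCHANNELS` are hypothetical names, and the rule that `maxItems` must be a positive integer is an assumption, since the README does not state a minimum.

```python
import json

# Subchannel keys documented in the "Subchannels covered" table.
VALID_SUBCHANNELS = ("jtyw", "mjjj", "ztbd")


def build_run_input(max_items: int, subchannels=VALID_SUBCHANNELS) -> dict:
    """Return a run-input dict, rejecting values the README does not document."""
    # Assumption: maxItems must be a positive integer (bool is excluded
    # explicitly because it is a subclass of int in Python).
    if isinstance(max_items, bool) or not isinstance(max_items, int) or max_items < 1:
        raise ValueError("maxItems must be a positive integer")

    chosen = list(dict.fromkeys(subchannels))  # drop duplicates, keep order
    unknown = [key for key in chosen if key not in VALID_SUBCHANNELS]
    if unknown:
        raise ValueError(f"unknown subchannel keys: {unknown}")
    if not chosen:
        raise ValueError("select at least one subchannel")

    return {"maxItems": max_items, "subchannels": chosen}


if __name__ == "__main__":
    # Reproduces the example input shown above.
    print(json.dumps(build_run_input(100, ["jtyw"])))
```

Calling `build_run_input(500)` with no second argument selects all three subchannels, matching the documented default.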
Use cases
- Defense and aerospace intelligence: Track every CASC press release mentioning specific launch vehicles, academies, or programs.
- ESG / sanctions screening: Identify CASC subsidiaries (一院 through 八院) named in corporate announcements for military-civil fusion exposure mapping.
- Trade compliance: Monitor export-control-relevant announcements (new satellite programs, foreign partnerships, dual-use technology disclosures).
- Annual reports: The 专题报道 channel carries annual social-responsibility reports back to 2013 as PDF attachments.
- Research and journalism: Build a full-text searchable archive of CASC's public-facing communications.
Notes
- Chinese-language content: All articles are in Simplified Chinese. The `body_text` field is suitable for NLP pipelines and translation workflows.
- English mirror: The `english.spacechina.com` mirror exists but has minimal content. `title_en` is always `null` in this release.
- Subsidiary detection: The `mentioned_subsidiaries` field uses pattern-matching on the body text for the eight CASC academies and their common abbreviations. It is heuristic and may miss references using full official names.
- Historical depth: The site retains articles back to at least 2013 across all subchannels, representing the full accessible archive of CASC's public news.
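To illustrate what heuristic pattern-matching of this kind can look like, and why full official names slip through, here is a minimal sketch. It is not the actor's actual implementation: `ACADEMY_PATTERN` and `detect_subsidiaries` are hypothetical, and the pattern covers only the short forms named in this README (一院 through 八院, CALT, CAST, SAST).

```python
import re

# Hypothetical pattern, assumed for illustration only.
# - (?<!十) stops "十一院" from being read as "一院".
# - ASCII lookarounds are used instead of \b because Python treats CJK
#   characters as word characters, so \b would fail in text like "由CALT研制".
ACADEMY_PATTERN = re.compile(
    r"(?<!十)[一二三四五六七八]院"
    r"|(?<![A-Za-z])(?:CALT|CAST|SAST)(?![A-Za-z])"
)


def detect_subsidiaries(body_text: str) -> list[str]:
    """Return distinct matches in order of first appearance."""
    seen: dict[str, None] = {}
    for match in ACADEMY_PATTERN.finditer(body_text):
        seen.setdefault(match.group(0), None)
    return list(seen)


if __name__ == "__main__":
    sample = "由一院(CALT)抓总研制,八院参与。一院再次执行发射任务。"
    print(detect_subsidiaries(sample))
    # A full official name contains none of the short forms, so nothing matches.
    print(detect_subsidiaries("中国运载火箭技术研究院"))
```

The second call returns an empty list, which is the limitation described in the subsidiary-detection note: a matcher built on short forms does not see full official names unless they are added explicitly.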