CSRC China Securities Regulator Disclosure Scraper

Pricing

Pay per event

Scrapes enforcement actions, administrative penalties, licensing approvals, regulatory notices, and IPO registration results from China's securities regulator (CSRC, 中国证监会). Covers all disclosure categories, including 行政处罚 (administrative penalties), 行政许可 (administrative licensing), 监管通知 (regulatory notices), and provincial bureau announcements.

Developer

BowTiedRaccoon

Maintained by Community

Last modified

22 days ago

Scrapes regulatory disclosures from the China Securities Regulatory Commission (CSRC, 中国证监会) at www.csrc.gov.cn. The CSRC is China's principal national securities regulator, analogous to the SEC in the United States. This actor covers the CSRC news and announcement sections, which include enforcement notices, policy interpretations, press conferences, and administrative actions.

What this actor does

Crawls CSRC static-HTML listing pages and extracts individual disclosure records from detail pages. Each record includes:

  • Disclosure ID, title, and canonical URL
  • Category (证监会要闻, 新闻发布会, 政策解读, and others via startUrls)
  • Publishing date and issuing office
  • Enforcement metadata: penalty type, penalty amount (CNY), case number
  • Violation summary (first substantive paragraph)
  • PDF attachment URL (when present)
  • Source listing URL
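As an illustration of how the disclosure ID relates to the canonical URL, here is a minimal sketch. It assumes detail URLs end in `/<id>/content.shtml`, with an ID such as `c1615676` as the last path segment before the file name. That URL shape, and the helper name `disclosure_id_from_url`, are assumptions for illustration and not the actor's confirmed logic.

```python
import re
from urllib.parse import urlparse

# Assumed detail-page shape: .../<section id>/<disclosure id>/content.shtml
DETAIL_PATH = re.compile(r"/(c\d+)/content\.shtml$")


def disclosure_id_from_url(detail_url):
    """Return the trailing c-prefixed ID from a detail URL, or None."""
    match = DETAIL_PATH.search(urlparse(detail_url).path)
    return match.group(1) if match else None


# Hypothetical example URL built from the ID format mentioned in this README.
example = "http://www.csrc.gov.cn/csrc/c100028/c1615676/content.shtml"
print(disclosure_id_from_url(example))
```

A listing URL such as `common_list.shtml` does not match the pattern, so the helper returns `None` for it.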

Use cases

  • Compliance and AML/KYC screening — Monitor enforcement actions against firms and individuals in Chinese securities markets
  • Sanctions and regulatory intelligence — Track market bans, fines, and license revocations
  • EM equity research — Follow regulatory trends affecting listed companies, brokers, and fund managers
  • Journalism — Monitor CSRC enforcement trends (financial fraud, market manipulation, insider trading)
  • Academic research — Build longitudinal datasets of Chinese securities enforcement

Input

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| startUrls | array | CSRC news + press + policy pages | Override the default listing URLs. Use any common_list.shtml URL from csrc.gov.cn |
| maxItems | integer | required | Maximum number of records to scrape. Set to 0 for no limit |

Default categories crawled

| Section | URL |
| --- | --- |
| 证监会要闻 (CSRC News) | http://www.csrc.gov.cn/csrc/c100028/common_list.shtml |
| 新闻发布会 (Press Conferences) | http://www.csrc.gov.cn/csrc/c100029/common_list.shtml |
| 政策解读 (Policy Interpretation) | http://www.csrc.gov.cn/csrc/c100039/common_list.shtml |

Custom categories via startUrls

Supply any CSRC listing URL to target specific sections. Pagination is handled automatically.

    {
      "startUrls": [
        { "url": "http://www.csrc.gov.cn/csrc/c100028/common_list.shtml" }
      ],
      "maxItems": 100
    }
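When the input is generated programmatically, a small builder can catch malformed URLs before a run starts. The sketch below produces an object in the shape shown above. The helper name `build_run_input` and its host and file-name checks are assumptions drawn from the parameter table, not rules the actor is known to enforce.

```python
import json
from urllib.parse import urlparse


def build_run_input(listing_urls, max_items=100):
    """Build an actor input dict from a list of CSRC listing URLs."""
    if max_items < 0:
        raise ValueError("maxItems must be 0 (no limit) or a positive integer")
    start_urls = []
    for url in listing_urls:
        parsed = urlparse(url)
        host = parsed.hostname or ""
        # Assumption: listing pages live on csrc.gov.cn or one of its subdomains.
        if host != "csrc.gov.cn" and not host.endswith(".csrc.gov.cn"):
            raise ValueError(f"not a csrc.gov.cn URL: {url}")
        # Assumption: a valid entry point is always named common_list.shtml.
        if not parsed.path.endswith("/common_list.shtml"):
            raise ValueError(f"not a common_list.shtml listing URL: {url}")
        start_urls.append({"url": url})
    return {"startUrls": start_urls, "maxItems": max_items}


run_input = build_run_input(
    ["http://www.csrc.gov.cn/csrc/c100029/common_list.shtml"], max_items=50
)
print(json.dumps(run_input, indent=2))
```

The resulting dictionary can be pasted into the actor's input editor as JSON or passed to whichever client is used to start the run.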

Output schema

Each item in the dataset has the following fields:

| Field | Type | Description |
| --- | --- | --- |
| disclosure_id | string | CSRC internal ID (e.g. c1615676) |
| url | string | Detail page URL |
| title | string | Disclosure title (Chinese) |
| category | string | Section label (证监会要闻, etc.) |
| issuing_office | string | 证监会 or provincial bureau |
| publish_date | string | ISO-8601 publication date |
| effective_date | string | Effective date (where present) |
| subject_entity | string | Named subject (company/individual) |
| subject_role | string | Role of subject |
| penalty_type | string | Penalty types (pipe-delimited) |
| penalty_amount_cny | number | Fine amount in CNY |
| violation_summary | string | First paragraph of disclosure text |
| pdf_url | string | Attached PDF URL |
| case_number | string | Case reference (〔YYYY〕XX号) |
| source_url | string | Listing page URL |
| scrapedAt | string | ISO-8601 scrape timestamp |
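Downstream code usually needs to split the pipe-delimited `penalty_type` field and filter on `penalty_amount_cny`. The sketch below shows one way to do that on exported dataset items. The sample records and penalty labels are invented for illustration. The schema does not say whether missing values arrive as null, an empty string, or an absent key, so the helpers treat all three the same way.

```python
def split_penalty_types(item):
    """Return the pipe-delimited penalty_type field as a list of labels."""
    raw = item.get("penalty_type") or ""
    return [part.strip() for part in raw.split("|") if part.strip()]


def fines_at_least(items, threshold_cny):
    """Yield items whose penalty_amount_cny is numeric and >= threshold_cny."""
    for item in items:
        amount = item.get("penalty_amount_cny")
        if isinstance(amount, (int, float)) and amount >= threshold_cny:
            yield item


# Invented sample records; real values depend on what the actor emits.
sample = [
    {"disclosure_id": "c0000001", "penalty_type": "fine|market ban",
     "penalty_amount_cny": 500000},
    {"disclosure_id": "c0000002", "penalty_type": None,
     "penalty_amount_cny": None},
    {"disclosure_id": "c0000003", "penalty_type": "warning",
     "penalty_amount_cny": 30000},
]

large = list(fines_at_least(sample, 100000))
print([item["disclosure_id"] for item in large])
print(split_penalty_types(sample[0]))
```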

Technical notes

  • No proxy required — CSRC is a Chinese government portal accessible directly without proxy
  • Static HTML — All listing pages use static pagination (common_list_N.shtml), no JavaScript rendering needed
  • Pagination — Automatically detected from createPageHTML() calls; up to 200 pages per category
  • Enforcement detail pages — Full content is extracted from content.shtml detail pages; metadata (penalty amounts, case numbers) is parsed from body text using regex patterns
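The pagination and regex notes above can be sketched as follows. This is an illustration under stated assumptions, not the actor's source. It assumes the second argument of `createPageHTML()` is the total page count, that page 1 is `common_list.shtml`, and that later pages are `common_list_2.shtml`, `common_list_3.shtml`, and so on. The names `page_urls` and `find_case_number` are hypothetical.

```python
import re

MAX_PAGES = 200  # per-category cap stated in the notes above

# Assumed call shape: createPageHTML('<div id>', <total pages>, <current page>, ...)
PAGE_CALL = re.compile(r"createPageHTML\(\s*['\"][^'\"]*['\"]\s*,\s*(\d+)")

# Case references follow the 〔YYYY〕XX号 pattern from the output schema.
CASE_NUMBER = re.compile(r"〔(\d{4})〕\s*(\d+)\s*号")


def page_urls(listing_url, listing_html):
    """List every page URL for a category, capped at MAX_PAGES."""
    match = PAGE_CALL.search(listing_html)
    total = min(int(match.group(1)), MAX_PAGES) if match else 1
    base, ext = listing_url.rsplit(".", 1)
    # Assumption: page 1 has no numeric suffix; numbering starts at _2.
    return [listing_url] + [f"{base}_{n}.{ext}" for n in range(2, total + 1)]


def find_case_number(body_text):
    """Return the first 〔YYYY〕XX号 reference in the text, or None."""
    match = CASE_NUMBER.search(body_text)
    return f"〔{match.group(1)}〕{match.group(2)}号" if match else None


# Invented inputs for illustration only.
html = "<script>createPageHTML('page_div', 3, 1, 'common_list', 'shtml', 45);</script>"
urls = page_urls("http://www.csrc.gov.cn/csrc/c100028/common_list.shtml", html)
print(urls)
print(find_case_number("中国证监会行政处罚决定书〔2023〕12号"))
```

If a listing page contains no `createPageHTML()` call, the sketch falls back to treating the category as a single page.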