Pricing

Pay per event

CUC China Media University — Dance & Performing Arts Scraper

Scrapes faculty rosters, admissions announcements, news, and program pages from Communication University of China (中国传媒大学 / CUC). Covers the dance and performing-arts pipeline that feeds CCTV, Mango TV, and provincial state broadcasters.

Developer

BowTiedRaccoon

Last modified: a month ago

What It Scrapes

Four configurable section categories, all from www.cuc.edu.cn:

Category	Content
`admissions`	招生就业 (Admissions and Employment): enrollment notices, admission policies, exam requirements
`faculty`	Leadership rosters, special collections, departmental faculty pages
`programs`	Academic affairs notices, curriculum docs, departmental announcements
`news`	Main news feed, school of arts culture network, academic exchanges

Articles follow a predictable URL pattern (/YYYY/MMDD/c<channel>a<id>/page.htm). Pagination uses numbered .psp pages. No JavaScript rendering required.
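The URL pattern above can be recognized with a regular expression. The following is a minimal sketch, not the actor's own code: the regex, the `ArticleRef` type, and the `parse_article_url` helper are illustrative names, and the sketch assumes both the channel code and the article id are purely numeric.

```python
import re
from datetime import date
from typing import NamedTuple, Optional

# Assumed shape of /YYYY/MMDD/c<channel>a<id>/page.htm, with numeric channel and id.
ARTICLE_PATH = re.compile(
    r"/(?P<year>\d{4})/(?P<mmdd>\d{4})/c(?P<channel>\d+)a(?P<article_id>\d+)/page\.htm$"
)


class ArticleRef(NamedTuple):
    """Hypothetical container for the parts encoded in an article URL."""
    published: date
    channel: str
    article_id: str


def parse_article_url(url: str) -> Optional[ArticleRef]:
    """Return the date, channel, and id encoded in an article URL, or None."""
    match = ARTICLE_PATH.search(url)
    if match is None:
        return None  # listing pages and other URLs do not match
    mmdd = match["mmdd"]
    try:
        published = date(int(match["year"]), int(mmdd[:2]), int(mmdd[2:]))
    except ValueError:
        return None  # digits matched, but they do not form a real calendar date
    return ArticleRef(published, match["channel"], match["article_id"])
```

A helper like this is one way to tell article pages apart from listing pages while crawling, and to cross-check the URL date against publish_date.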

Output Fields

Field	Type	Description
`page_url`	String	Canonical URL of the scraped page
`title`	String	Article or page title (Chinese characters preserved)
`title_zh`	String	Chinese title — identical to `title` for CUC pages
`category`	String	Section: `admissions`, `faculty`, `programs`, or `news`
`publish_date`	String	Publication date as shown on the page (e.g. `2024-05-10`)
`body_html`	String	Full article HTML including embedded content references
`body_text`	String	Plain-text article body
`department`	String	Channel code identifying the originating department
`attachments`	String	PDF/DOC attachment URLs, pipe-separated (admissions docs, curriculum PDFs)
`source_url`	String	Originating article URL
`scrapedAt`	String	ISO-8601 timestamp of the scrape

On faculty name-card pages, body_html and body_text are empty by design, because the page contains only a name and a date. The title and department fields are always populated.

Input Parameters

Parameter	Type	Default	Description
`maxItems`	Integer	10	Maximum number of article pages to scrape (0 = unlimited)
`categories`	Array	all four	Which sections to crawl: `admissions`, `faculty`, `programs`, `news`

Run with maxItems: 0 and all four categories for a full archive crawl. The main news channel alone has hundreds of pages going back several years.
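As a rough illustration of the two parameters, the sketch below builds and validates an input object before a run. The `build_run_input` helper and its validation rules are assumptions made for this example; only the `maxItems` and `categories` keys, their defaults, and the four category names come from the table above.

```python
from typing import Any, Dict, Iterable, Optional

# The four section categories listed in the input table.
VALID_CATEGORIES = ("admissions", "faculty", "programs", "news")


def build_run_input(
    max_items: int = 10, categories: Optional[Iterable[str]] = None
) -> Dict[str, Any]:
    """Hypothetical helper: return an input dict using the documented keys."""
    chosen = list(VALID_CATEGORIES) if categories is None else list(categories)
    unknown = [name for name in chosen if name not in VALID_CATEGORIES]
    if unknown:
        raise ValueError(f"unknown categories: {unknown}")
    if max_items < 0:
        raise ValueError("maxItems must be 0 (unlimited) or a positive integer")
    return {"maxItems": max_items, "categories": chosen}


# Full archive crawl: no item cap, all four sections.
full_archive = build_run_input(max_items=0)

# Small test run limited to admissions notices.
sample_run = build_run_input(max_items=5, categories=["admissions"])
```

The resulting dictionary can be serialized to JSON and supplied as the actor input.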

How It Works

The actor uses a hierarchical crawl. It seeds from the section entry points (/zsjy/list.htm, /9996/list.htm, etc.), follows pagination forward, and enqueues every article URL it finds. Article pages get a full extraction pass — title, date, body content, and any PDF attachments.
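The crawl order described above can be sketched as a simple queue over listing pages. This is an illustrative outline under stated assumptions, not the actor's implementation: `crawl` and `fetch_listing` are hypothetical names, and `fetch_listing` stands in for whatever downloads a listing page and returns its article URLs plus the next listing URL (or None on the last page).

```python
from collections import deque
from typing import Callable, Iterable, List, Optional, Tuple

# (article URLs found on a listing page, next listing URL or None)
ListingPage = Tuple[List[str], Optional[str]]


def crawl(
    seeds: Iterable[str],
    fetch_listing: Callable[[str], ListingPage],
    max_items: int = 10,
) -> List[str]:
    """Collect article URLs from seed listings, following pagination forward."""
    queue = deque(seeds)
    seen_listings = set()
    seen_articles = set()
    articles: List[str] = []
    while queue:
        listing_url = queue.popleft()
        if listing_url in seen_listings:
            continue  # guard against pagination loops
        seen_listings.add(listing_url)
        article_urls, next_page = fetch_listing(listing_url)
        for url in article_urls:
            if url in seen_articles:
                continue  # the same article can appear on several listings
            seen_articles.add(url)
            articles.append(url)
            if max_items and len(articles) >= max_items:
                return articles  # max_items == 0 means unlimited
        if next_page is not None:
            queue.append(next_page)
    return articles
```

Passing the fetcher in as a parameter keeps the traversal logic testable without network access.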

No proxy required. CUC's servers respond cleanly to datacenter IPs. No Cloudflare and no anti-bot protection. Concurrency is capped at 5 to stay polite with a university web server.

Use Cases

Chinese-language NLP training corpora (performing arts domain)
Talent-pipeline research tracking which departments feed CCTV and Mango TV
Competitive analysis for Chinese broadcasting education programs
Admissions document archives for research on Chinese university policy

Notes

The site CMS (WebPlus) uses numeric channel codes for some departments. The department field preserves the raw channel code. Map to human-readable names using the CUC department directory as needed.
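One way to apply such a mapping after a run is shown below. The channel codes and names in `DEPARTMENT_NAMES` are placeholders invented for this sketch, not real CUC values, and `with_department_name` is a hypothetical helper; the real table would have to be filled in from the CUC department directory.

```python
from typing import Any, Dict, Mapping

# Placeholder entries only: replace with codes and names from the CUC directory.
DEPARTMENT_NAMES: Dict[str, str] = {
    "1001": "Example Department A",
    "1002": "Example Department B",
}


def with_department_name(
    record: Mapping[str, Any], names: Mapping[str, str] = DEPARTMENT_NAMES
) -> Dict[str, Any]:
    """Return a copy of a scraped record with a readable department_name added.

    Unknown codes fall back to the raw channel code so no record loses data.
    """
    code = str(record.get("department", ""))
    enriched = dict(record)
    enriched["department_name"] = names.get(code, code)
    return enriched
```

Keeping the raw code in department and writing the label to a separate field means the mapping can be corrected later without re-scraping.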

CUC's admissions section includes PDFs of curriculum plans and judging panel documents for performance programs. These appear as pipe-separated URLs in the attachments field.
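Because attachments is a single pipe-separated string, downstream code usually needs to split it first. A minimal sketch, assuming the separator is a plain `|` with optional surrounding whitespace; `split_attachments` and `pdf_only` are hypothetical helper names.

```python
from typing import List, Optional
from urllib.parse import urlparse


def split_attachments(raw: Optional[str]) -> List[str]:
    """Split the pipe-separated attachments field into a list of URLs."""
    if not raw:
        return []  # field is empty when a page has no attachments
    return [part.strip() for part in raw.split("|") if part.strip()]


def pdf_only(urls: List[str]) -> List[str]:
    """Keep only URLs whose path ends in .pdf (case-insensitive)."""
    return [url for url in urls if urlparse(url).path.lower().endswith(".pdf")]
```

For example, `pdf_only(split_attachments(record["attachments"]))` would isolate the curriculum and panel PDFs from any DOC files in the same record.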

Part of the OrbTop Chinese media education dataset — companion to the BDA Beijing Dance Academy Scraper.
