Japan Kokkai Diet Proceedings Scraper - NDL Speech Records

Pricing

Pay per event

Extract speech records from Japan's National Diet Library (NDL) Kokkai API. Search 1M+ speeches across both chambers and all committees (1947–present) by keyword, speaker, committee, or date. Output includes full Japanese speech text, speaker party, Gregorian and wareki dates, and NDL citation URLs.

Developer

BowTiedRaccoon

Maintained by Community

Japan Diet Kokkai Proceedings Scraper — NDL Speeches

Extract speech transcripts from the National Diet Library (NDL) Kokkai proceedings API — over 1 million speeches from 1947 to the present, covering both chambers and all committees of the Japanese parliament. Returns speaker name, reading (yomi), party affiliation, position, speech text, and meeting metadata. Data is public domain under Article 13 of Japan's Copyright Law.

What does the Kokkai Diet Proceedings Scraper do?

  • Queries the NDL Kokkai proceedings API for speeches from the House of Representatives (衆議院), House of Councillors (参議院), or both chambers
  • Filters by speaker name (partial match), committee name, session number, and date range
  • Returns full speech text plus speaker yomi (reading), party group, and official position
  • Returns PDF links, speech URLs, and meeting URLs for document-level access
  • Exits with zero records and a logged error if no filter is set — the NDL API requires at least one search parameter
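For orientation, the kind of request described in the list above can be sketched in a few lines of Python. This is not the actor's code. It assumes the public NDL speech-search endpoint at `https://kokkai.ndl.go.jp/api/speech` and its documented query parameters (`any`, `speaker`, `nameOfMeeting`, `nameOfHouse`, `from`, `until`, `maximumRecords`, `startRecord`, `recordPacking`). The helper name `build_speech_url` and the mapping from filters to parameters are illustrative assumptions.

```python
from urllib.parse import urlencode

# Assumed public NDL speech-search endpoint; the actor's own requests may differ.
NDL_SPEECH_ENDPOINT = "https://kokkai.ndl.go.jp/api/speech"


def build_speech_url(keyword=None, speaker=None, meeting=None, house=None,
                     date_from=None, date_until=None,
                     max_records=30, start_record=1):
    """Build an NDL speech-search URL from optional filters (hypothetical helper)."""
    filters = {
        "any": keyword,            # full-text keyword
        "speaker": speaker,        # speaker name, partial match
        "nameOfMeeting": meeting,  # committee or meeting name
        "nameOfHouse": house,      # e.g. 衆議院 or 参議院
        "from": date_from,         # YYYY-MM-DD
        "until": date_until,       # YYYY-MM-DD
    }
    params = {key: value for key, value in filters.items() if value}
    if not params:
        # Mirrors the rule above: an unrestricted query is rejected.
        raise ValueError("at least one search filter is required")
    params.update({
        "maximumRecords": max_records,
        "startRecord": start_record,
        "recordPacking": "json",
    })
    return f"{NDL_SPEECH_ENDPOINT}?{urlencode(params)}"


url = build_speech_url(speaker="植田和男", date_from="2023-04-01")
```

`urlencode` percent-encodes the Japanese speaker name as UTF-8, so the resulting URL is safe to pass to any HTTP client.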

What data does it extract?

| Field | Description |
|---|---|
| speech_id | Unique speech identifier |
| issue_id | Meeting issue identifier |
| session | Diet session number |
| chamber | 衆議院 (House of Representatives) or 参議院 (House of Councillors) |
| committee | Committee or meeting name |
| issue_number | Issue number within the session |
| meeting_date | Meeting date (ISO 8601) |
| meeting_date_wareki | Meeting date in Japanese imperial calendar format |
| speech_order | Order of this speech within the meeting |
| speaker | Speaker name in Japanese |
| speaker_yomi | Speaker name reading (hiragana) |
| speaker_group | Political party or group affiliation |
| speaker_position | Official position or title of the speaker |
| speech_text | Full transcript text of the speech |
| speech_url | URL to the individual speech record |
| meeting_url | URL to the full meeting record |
| pdf_url | URL to the meeting PDF where available |
| search_query | The search query used to retrieve this record |
| source_api_endpoint | NDL API endpoint used |
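To illustrate how meeting_date and meeting_date_wareki relate, here is a minimal sketch that converts an ISO 8601 date into an era-based (wareki) string. The era start dates are the standard ones. The output format (for example `令和5年3月1日`, with `元` for an era's first year) is an assumption, and the actor's field may be formatted differently. The function name `to_wareki` is illustrative.

```python
from datetime import date

# Era start dates in the Gregorian calendar, newest first. Eras before
# Showa are omitted because the corpus starts in 1947.
ERAS = [
    ("令和", date(2019, 5, 1)),    # Reiwa
    ("平成", date(1989, 1, 8)),    # Heisei
    ("昭和", date(1926, 12, 25)),  # Showa
]


def to_wareki(iso_date: str) -> str:
    """Convert 'YYYY-MM-DD' to an era-based date string (assumed format)."""
    d = date.fromisoformat(iso_date)
    for era, start in ERAS:
        if d >= start:
            year = d.year - start.year + 1
            # The first year of an era is conventionally written 元年.
            year_label = "元" if year == 1 else str(year)
            return f"{era}{year_label}年{d.month}月{d.day}日"
    raise ValueError(f"date is before the supported eras: {iso_date}")


print(to_wareki("2023-03-01"))  # 令和5年3月1日
print(to_wareki("1947-05-20"))  # 昭和22年5月20日
```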

How to use it

At least one filter (searchQuery, speakerName, nameOfMeeting, chamber, sessionNumber, dateFrom, or dateTo) is required. If no filter is set, the NDL API returns an error — the actor will log the error and exit with zero records rather than crashing.

| Field | Type | Default | Description |
|---|---|---|---|
| searchQuery | string | | Full-text search query across speech text (Japanese or romaji) |
| speakerName | string | | Speaker name (partial match). Japanese characters recommended for accuracy. |
| nameOfMeeting | string | | Committee or meeting name filter |
| chamber | string | | 衆議院, 参議院, or 両院 (both chambers) |
| sessionNumber | integer | 0 | Diet session number. 0 = all sessions. |
| dateFrom | string | | Start date in YYYY-MM-DD format |
| dateTo | string | | End date in YYYY-MM-DD format |
| maxItems | integer | | Maximum speeches to return |
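Before starting a run, the input can be checked locally against the rules above. The sketch below is a hypothetical pre-flight check, not part of the actor. It assumes that a sessionNumber of 0 does not count as a filter, since the table describes 0 as "all sessions". The name `check_input` is illustrative.

```python
from datetime import date

FILTER_FIELDS = ("searchQuery", "speakerName", "nameOfMeeting", "chamber",
                 "sessionNumber", "dateFrom", "dateTo")


def check_input(run_input: dict) -> list[str]:
    """Return a list of problems with the input; an empty list means none found."""
    problems = []
    # Assumption: empty strings and sessionNumber 0 do not count as filters.
    active = [f for f in FILTER_FIELDS if run_input.get(f) not in (None, "", 0)]
    if not active:
        problems.append("set at least one filter")
    for field in ("dateFrom", "dateTo"):
        value = run_input.get(field)
        if value:
            try:
                date.fromisoformat(value)
            except ValueError:
                problems.append(f"{field} must be in YYYY-MM-DD format")
    return problems


run_input = {
    "speakerName": "植田和男",
    "dateFrom": "2023-04-01",
    "maxItems": 100,
}
print(check_input(run_input))  # []
```

Note that maxItems alone is not a filter: an input containing only maxItems still fails the first check.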

Use cases

  • Quantitative finance and monetary policy research: Track central bank governor testimony. Set speakerName to 植田和男 for current BOJ Governor Ueda's speeches, or to 黒田東彦 for former Governor Kuroda. Filter by date range and committee to isolate parliamentary appearances close to a Monetary Policy Meeting (MPM).
  • Policy trend analysis: Set searchQuery to a specific policy term (fiscal policy, social security, defense) and run it across sessions and chambers to track how legislative debate has evolved over decades.
  • Political science research: Analyze party-affiliation patterns in speech frequency and committee participation using the speaker_group and committee fields across the full 1947–present corpus.
  • Journalism and transparency: Retrieve transcripts of committee hearings on specific legislation by meeting name and date range, then link to meeting_url and pdf_url for source verification.
  • Natural language processing: Build Japanese political speech corpora for NLP training from the 1M+ speech dataset, all public domain under Article 13 of Japan's Copyright Law.
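As one way to start on the party-affiliation analysis mentioned above, the sketch below counts speeches per year and speaker_group from a list of output records. It relies only on the meeting_date and speaker_group fields from the output table. The sample records are made-up placeholders rather than real results, and `speeches_per_group_by_year` is an illustrative name.

```python
from collections import Counter


def speeches_per_group_by_year(records):
    """Count speeches keyed by (year, speaker_group)."""
    counts = Counter()
    for rec in records:
        year = rec["meeting_date"][:4]  # ISO 8601 date starts with the year
        group = rec.get("speaker_group") or "(no group)"
        counts[(year, group)] += 1
    return counts


# Placeholder records for illustration only.
sample = [
    {"meeting_date": "2023-04-05", "speaker_group": "Party A"},
    {"meeting_date": "2023-06-12", "speaker_group": "Party A"},
    {"meeting_date": "2023-06-12", "speaker_group": ""},
    {"meeting_date": "2024-02-01", "speaker_group": "Party B"},
]

for (year, group), n in sorted(speeches_per_group_by_year(sample).items()):
    print(year, group, n)
```

Records with an empty speaker_group (for example officials or witnesses without a party) are grouped under a single label so they are not silently dropped.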

FAQ

How do I filter for a specific politician's speeches?

Use speakerName with the politician's name in Japanese characters. For BOJ Governor Ueda: speakerName: 植田和男. For former Governor Kuroda: speakerName: 黒田東彦. Partial name matches are supported, because the NDL API searches within the speaker name field.

What happens if I run the actor without any filters?

The NDL API requires at least one search parameter and returns an error for unrestricted queries. The actor detects this, logs the error with guidance, and exits with zero records rather than crashing the run. Add at least one of: searchQuery, speakerName, nameOfMeeting, chamber, sessionNumber, dateFrom, or dateTo.

Is the speech text complete?

Yes. The NDL API returns the full verbatim transcript of each speech as recorded in the official Diet stenographic record. The speech_text field contains the complete text. Some historical sessions (especially early postwar) may have shorter records due to original transcription gaps in the source archive.

Results are available for export in JSON, CSV, and Excel formats from the Apify dataset tab.
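If the full export is too wide for a spreadsheet (speech_text can be very long), a JSON export can be trimmed to a few columns locally. The sketch below assumes the JSON export is an array of record objects with the field names from the output table. The helper name `export_to_csv` and the sample record are illustrative.

```python
import csv
import io
import json


def export_to_csv(json_text: str, columns: list[str]) -> str:
    """Turn a JSON array of records into CSV text with only the given columns."""
    records = json.loads(json_text)
    buffer = io.StringIO()
    # extrasaction="ignore" drops fields that are not listed in `columns`;
    # missing fields are written as empty cells.
    writer = csv.DictWriter(buffer, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Placeholder record for illustration only.
exported = json.dumps([
    {"speech_id": "example-1", "meeting_date": "2023-04-05",
     "speaker": "植田和男", "speech_text": "..."},
])
print(export_to_csv(exported, ["speech_id", "meeting_date", "speaker"]))
```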