Hansard UK Parliament Debates Scraper
Pricing
from $23.63 / 1,000 results
Export the official transcripts of UK Parliament debates and speeches from Hansard. Filter by House (Commons or Lords), search term, member, and date range. Each record includes the full speech text, speaker, debate section, and a permalink to the official Hansard transcript.
Developer
ParseForge

Hansard UK Parliament Debates Scraper
Export UK Parliament debate transcripts in seconds. Pull every spoken contribution from the House of Commons and House of Lords, filtered by topic, member, date, or department. Each record is a clean structured speech with full text, speaker, debate section, and a permalink to the official Hansard transcript. No sign-up, no manual paging, no parser to maintain.
Last updated: 2026-05-15 · 17 fields per record · Millions of contributions · 200+ years of debates · Both Houses
The Hansard UK Parliament Debates Scraper queries the official Hansard transcript catalogue and returns up to 17 structured fields per record, including the contribution ID, speaker name and member ID, House, debate section, sitting date, full speech text, word count, ordering metadata, and a deep permalink back to the official Hansard page.
The catalogue covers the official record of every spoken contribution in the UK Parliament, including ministerial statements, backbench speeches, oral questions, urgent questions, and full debates. Hansard has tracked the proceedings of the UK Parliament since 1803 and is the canonical record cited by historians, journalists, and political researchers.
| Target Audience | Primary Use Cases |
|---|---|
| Political analysts and researchers, journalists, NLP and machine-learning teams, public-affairs and lobbying firms, civic-tech projects, academic political scientists, content creators | Speech and rhetoric analysis, member voting-context research, topic mining, NLP training corpora, ministerial statement monitoring, lobbyist due diligence, civic-tech transparency tools |
What the Hansard UK Debates Scraper does
Six filtering workflows in a single run:
- Free-text search. Match a keyword or phrase across every spoken contribution (e.g. "climate change", "NHS funding", "AUKUS").
- House filter. Restrict to House of Commons, House of Lords, or both.
- Member filter. Substring match on the speaker name (e.g. "Keir Starmer", "Lord Hannan").
- Date range. Scope to any sitting-date window with `startDate` and `endDate`.
- Department filter. Substring match on the responsible government department (e.g. "Treasury", "Department for Education").
- Page-driven sample. Pull the latest contributions across all topics when no query is set.
Each record includes the contribution ID, the member's name and (when available) their member ID, the House, the section ("Commons Chamber", "Westminster Hall", etc.), the debate section title, the Hansard internal section code, the sitting date, the timecode of the contribution, the full speech text (HTML preserved), a word count, the order in the debate, the paragraph tag, and a deep permalink back to the official Hansard transcript page.
Why it matters: Hansard transcripts power policy analysis, NLP corpora, civic transparency, and political journalism. Building your own pipeline means writing a paginated client, mapping debate identifiers to permalinks, normalising HTML across sittings, and refreshing daily. This Actor skips all of that and gives you a clean refreshed snapshot on every run.
Full Demo
Coming soon: a 3-minute walkthrough showing how to go from sign-up to a downloaded Hansard dataset.
Input
| Input | Type | Default | Behavior |
|---|---|---|---|
| `searchTerm` | string | "" | Keyword or phrase to search across UK Parliament transcripts. Empty = most recent contributions across all topics. |
| `house` | string | "Both" | One of `Both`, `Commons`, or `Lords`. |
| `memberName` | string | "" | Substring match on the speaker name. |
| `startDate` | string | "" | Earliest sitting date (YYYY-MM-DD). |
| `endDate` | string | "" | Latest sitting date (YYYY-MM-DD). |
| `department` | string | "" | Substring match on the responsible department. |
| `maxItems` | integer | 10 | Records to return. Free plan caps at 10, paid plan at 1,000,000. |
Example: every Commons contribution mentioning "Heathrow" since 2026-01-01.
{"maxItems": 200,"searchTerm": "Heathrow","house": "Commons","startDate": "2026-01-01"}
Example: latest 50 contributions by Keir Starmer in either House.
{"maxItems": 50,"memberName": "Keir Starmer"}
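If you build inputs in code rather than by hand, a small helper can catch typos before a run is started. The sketch below is only an illustration: the function name `build_input` and its validation rules are assumptions, and only the field names and the allowed `house` values come from the input table.

```python
from datetime import date

# Allowed values taken from the input table.
ALLOWED_HOUSES = {"Both", "Commons", "Lords"}


def build_input(search_term="", house="Both", member_name="",
                start_date="", end_date="", department="", max_items=10):
    """Return an input dict using the field names from the input table.

    Hypothetical helper; the checks are local sanity checks only and say
    nothing about how the Actor itself validates its input.
    """
    if house not in ALLOWED_HOUSES:
        raise ValueError(f"house must be one of {sorted(ALLOWED_HOUSES)}")
    # date.fromisoformat raises ValueError on strings that are not ISO dates.
    parsed = {name: date.fromisoformat(value)
              for name, value in (("startDate", start_date), ("endDate", end_date))
              if value}
    if len(parsed) == 2 and parsed["startDate"] > parsed["endDate"]:
        raise ValueError("startDate must not be after endDate")
    if max_items < 1:
        raise ValueError("maxItems must be at least 1")
    candidate = {
        "searchTerm": search_term,
        "house": house,
        "memberName": member_name,
        "startDate": start_date,
        "endDate": end_date,
        "department": department,
    }
    # Drop filters left at their defaults so the payload stays minimal.
    payload = {key: value for key, value in candidate.items()
               if value and not (key == "house" and value == "Both")}
    payload["maxItems"] = max_items
    return payload


print(build_input(search_term="Heathrow", house="Commons",
                  start_date="2026-01-01", max_items=200))
```

Calling it with only `member_name="Keir Starmer"` and `max_items=50` yields the same fields as the second example above.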
Good to Know: the `text` field preserves the original Hansard markup, including column-number `<span>` tags and inline subscripts. That keeps the record faithful to the official transcript. If you need plain text, strip the HTML once downstream.
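One way to do that downstream stripping is with Python's standard-library `html.parser`. This is a minimal sketch, not part of the Actor: the helper name `strip_html` and the sample markup are made up for illustration. It keeps all text nodes, so the contents of column-number spans would remain in the output unless you filter them separately.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and ignores every tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(markup):
    """Return the text content of `markup` with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return " ".join("".join(parser.parts).split())


# Illustrative markup only; real Hansard markup will differ.
sample = "<p>The <span>economic</span> case &amp; the runway</p>"
print(strip_html(sample))
```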
Output
Each contribution record carries up to 17 fields. Download the dataset as CSV, Excel, JSON, or XML.
Schema
| Field | Type | Example |
|---|---|---|
| `contributionId` | string | "0D8CEA45-19F1-4BF6-83D3-6688C26C01B9" |
| `memberName` | string | "Sarah Olney" |
| `attributedTo` | string | "Sarah Olney" |
| `memberId` | number | 4591 |
| `house` | string | "Commons" |
| `section` | string | "Commons Chamber" |
| `debateSection` | string | " Heathrow Airport: Third Runway" |
| `debateSectionId` | string | "15106B6A-3101-426D-89E3-0544452BD096" |
| `hansardSection` | string | "CP-CR1" |
| `sittingDate` | YYYY-MM-DD | "2026-05-14" |
| `timecode` | string | "2026-05-14T15:03:57" |
| `text` | string | "The hon. Gentleman is absolutely right that we need to see the economic case..." |
| `wordCount` | number | 806 |
| `orderInDebate` | number | 6 |
| `paragraphTag` | string | "hs_Para" |
| `url` | string | "https://hansard.parliament.uk/Commons/2026-05-14/debates/.../HeathrowAirport%3AThirdRunway#contribution-..." |
| `scrapedAt` | ISO 8601 | "2026-05-15T20:10:51.113Z" |
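When loading exported records into Python, a typed container makes the fields easier to work with. The sketch below covers a subset of the schema. The class name `Contribution`, the helper `parse_contribution`, and the split between required and optional fields are assumptions, since the schema only says records carry "up to" 17 fields.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Contribution:
    """Subset of the schema; which fields are always present is assumed."""
    contributionId: str
    memberName: str
    house: str
    sittingDate: str
    text: str
    url: str
    memberId: Optional[int] = None
    debateSection: Optional[str] = None
    wordCount: Optional[int] = None
    orderInDebate: Optional[int] = None


def parse_contribution(raw):
    """Build a Contribution from one exported record, ignoring other keys."""
    known = {field.name for field in fields(Contribution)}
    return Contribution(**{key: value for key, value in raw.items() if key in known})


# Values copied from the Example column of the schema table.
record = parse_contribution({
    "contributionId": "0D8CEA45-19F1-4BF6-83D3-6688C26C01B9",
    "memberName": "Sarah Olney",
    "house": "Commons",
    "sittingDate": "2026-05-14",
    "text": "The hon. Gentleman is absolutely right that we need to see the economic case...",
    "url": "https://hansard.parliament.uk/Commons/2026-05-14/debates/.../HeathrowAirport%3AThirdRunway#contribution-...",
    "wordCount": 806,
    "scrapedAt": "2026-05-15T20:10:51.113Z",
})
print(record.memberName, record.wordCount)
```

A record that lacks one of the fields treated as required here raises `TypeError`, which is a convenient signal to revisit the assumption.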
Sample record
Why choose this Actor
| Capability | Details |
|---|---|
| Both Houses, full text | Spoken contributions from Commons and Lords with the complete speech body. |
| Multi-dimensional filters | Search term, House, member, date range, and department combine freely. |
| Permalinks per row | Every record links back to the canonical Hansard page anchor for citation. |
| Historic depth | Indexed transcripts spanning decades of UK parliamentary debate. |
| Fast | 100 contributions in seconds, 10,000 records in a few minutes. |
| Always fresh | Every run hits the live transcript catalogue, so the dataset reflects the latest sittings. |
| No authentication | Public open-government data. No login needed. |
Searchable Hansard transcripts are the foundation of every political-journalism dashboard, NLP corpus on UK politics, and lobbyist briefing pack.
How it compares to alternatives
| Approach | Cost | Coverage | Refresh | Filters | Setup |
|---|---|---|---|---|---|
| Hansard UK Debates Scraper (this Actor) | $5 free credit, then pay-per-use | Both Houses, full text | Live per run | search term, House, member, date, department | 2 min |
| Commercial parliamentary monitoring | $10k - $80k/year | Comparable + voting records | Daily | Many | Weeks (procurement) |
| TheyWorkForYou scraping | Free | Commons-leaning, derived | Daily | Few | Days |
| Manual hansard.parliament.uk browsing | Free | Whole catalogue | Live | Site-side | Forever |
Pick this Actor when you want structured speech-level records with permalinks and zero pipeline maintenance.
How to use
- Sign up. Create a free account with $5 credit (takes 2 minutes).
- Open the Actor. Go to the Hansard UK Parliament Debates Scraper page on the Apify Store.
- Set input. Type a search term or member name, optionally pick a House and date range, and set `maxItems`.
- Run it. Click Start and let the Actor collect your contributions.
- Download. Grab your dataset in the Dataset tab as CSV, Excel, JSON, or XML.
Total time from signup to a downloaded Hansard dataset: 3-5 minutes. No coding required.
Business use cases
Automating Hansard Scraper
Control the scraper programmatically for scheduled runs and pipeline integrations:
- Node.js. Install the `apify-client` NPM package.
- Python. Use the `apify-client` PyPI package.
- See the Apify API documentation for full details.
The Apify Schedules feature lets you trigger this Actor on any cron interval. Hourly or daily refreshes keep your political monitoring dashboards in sync with each new sitting.
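For a dependency-free alternative to the client packages, a run can also be started through the Apify REST API. The sketch below uses the public `run-sync-get-dataset-items` endpoint with only the standard library. The Actor ID shown is a placeholder, not this Actor's real identifier, and the helper names are hypothetical; check the Actor's API tab for the actual ID.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
# Placeholder: replace with the real "username~actor-name" from the API tab.
ACTOR_ID = "your-username~your-actor-name"


def build_run_request(actor_id, token, run_input):
    """Prepare a POST that runs the Actor and returns its dataset items."""
    path = urllib.parse.quote(actor_id, safe="~")
    url = f"{API_BASE}/acts/{path}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(actor_id, token, run_input, timeout=300):
    """Send the request and decode the JSON list of records."""
    request = build_run_request(actor_id, token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example call (needs a valid API token and the real Actor ID):
# items = run_actor(ACTOR_ID, "<APIFY_TOKEN>", {"searchTerm": "Heathrow", "maxItems": 10})
```

The synchronous endpoint suits short runs. For large `maxItems` values, starting the run asynchronously and reading the dataset afterwards is likely the safer pattern.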
Beyond business use cases
Open parliamentary transcripts power more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.
Ask an AI assistant about this scraper
Open a ready-to-send prompt about this ParseForge actor in the AI of your choice:
- ChatGPT
- Claude
- Perplexity
- Copilot
Frequently Asked Questions
How does it work?
Type a search term, optionally pick a member or date window, click Start, and the Actor pages through the official Hansard transcript catalogue, applies your filters, and emits a clean structured row per spoken contribution. No browser automation, no captchas, no setup.
How accurate is the data?
Every record comes from the official Hansard catalogue used by hansard.parliament.uk itself, so the speech text, member, and debate references match the canonical record line for line.
How often is the dataset refreshed?
Hansard is updated as sittings are transcribed and published, typically within hours of a debate. Every run hits the live catalogue.
Does it cover both Houses?
Yes. Set `house` to `Both` (default), `Commons`, or `Lords`.
Can I get every speech by a single MP or peer?
Yes. Set the `memberName` filter to a substring of their name (e.g. "Starmer", "Lord Hannan"). Combine with `startDate` and `endDate` for a session-bounded view.
How far back does the catalogue go?
The official Hansard archive runs back to the early 19th century, with full digital coverage of recent sessions and increasingly complete coverage going back decades.
Why does the speech text contain HTML tags?
The `text` field preserves Hansard's original markup (column-number `<span>`s, inline subscripts, paragraph anchors) so the record stays faithful to the official transcript. Strip HTML downstream if you need plain text.
Can I schedule regular runs?
Yes. Use Apify Schedules to run this Actor on any cron interval (hourly during sittings, daily otherwise) and keep your political monitoring dashboards in sync.
Is this data legal to use?
Hansard transcripts are published under the Open Parliament Licence, which permits commercial reuse with attribution. Review the licence terms for your specific application.
Can I use this data commercially?
Yes. The Open Parliament Licence explicitly allows commercial reuse with attribution. You remain responsible for following the licence in your product.
Do I need a paid Apify plan to use this Actor?
No. The free Apify plan is enough for testing and small runs (10 records per run). A paid plan lifts the limit and gives you scheduling, higher concurrency, and larger datasets.
What if I need help?
Our support team is here to help. Contact us through the Apify platform or use the Tally form linked below.
Integrate with any app
Hansard UK Debates Scraper connects to any cloud service via Apify integrations:
- Make - Automate multi-step monitoring workflows
- Zapier - Connect with 5,000+ apps
- Slack - Get debate-mention alerts in your channels
- Airbyte - Pipe transcripts into your warehouse
- GitHub - Trigger runs from commits and releases
- Google Drive - Export datasets straight to Sheets
You can also use webhooks to trigger downstream actions when a run finishes. Push fresh transcripts into your NLP pipeline, or alert your political-research team in Slack.
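On the receiving side, a webhook handler mostly needs to find the finished run's dataset. The sketch below assumes the default Apify webhook payload (with `eventType` and a `resource` object that carries `defaultDatasetId`). A custom payload template would change these keys, and the function name `dataset_items_url` is hypothetical.

```python
import json

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(webhook_body):
    """Return the dataset items URL for a succeeded run, else None.

    Assumes the default webhook payload shape; adjust the keys if the
    webhook uses a custom payload template.
    """
    payload = json.loads(webhook_body)
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    dataset_id = (payload.get("resource") or {}).get("defaultDatasetId")
    if not dataset_id:
        return None
    return f"{API_BASE}/datasets/{dataset_id}/items?format=json&clean=true"


# Illustrative payload with a made-up dataset ID.
body = json.dumps({
    "eventType": "ACTOR.RUN.SUCCEEDED",
    "resource": {"defaultDatasetId": "abc123"},
})
print(dataset_items_url(body))
```

The returned URL can then be fetched with an API token and handed to an NLP pipeline or a Slack notifier.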
Recommended Actors
- UK Parliament Members Scraper - MPs and Lords with biographies, committees, and contact details
- GOV.UK Content Search Scraper - Search the entire UK government publications catalogue
- OpenSanctions Sanctions & PEP Scraper - Sanctioned entities and politically exposed persons
- Carbon Intensity UK Scraper - National Grid carbon intensity feed
- GovTrack U.S. Congress Scraper - U.S. legislative bills and votes
Pro Tip: browse the complete ParseForge collection for more reference-data scrapers.
Need Help? Open our contact form to request a new scraper, propose a custom data project, or report an issue.
Disclaimer: this Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by the UK Parliament, the House of Commons, the House of Lords, or the Hansard Society. All trademarks mentioned are the property of their respective owners. Only publicly available open Hansard transcript data is collected, under the Open Parliament Licence.