GOV.UK Content Search Scraper
Pricing
from $22.87 / 1,000 results
Scrape GOV.UK: search the entire UK government publications catalogue (policies, guidance, news, statistics). Filter by query, organisation, format or date. Returns titles, descriptions, URLs, organisations and publication dates.
Developer
ParseForge

GOV.UK Content Search Scraper
Search the entire UK government catalogue in seconds. Pull 600,000+ pages, publications, news stories, statistics, consultations, and services straight from GOV.UK. Filter by query, organisation, format, or date. No sign-up, no manual paging, no parser to maintain.
Last updated: 2026-05-15 · 11 fields per record · 600,000+ pages · 1,400+ government bodies · 130+ document formats
The GOV.UK Content Search Scraper queries the official UK government content index and returns up to 11 structured fields per record, including titles, descriptions, URLs, document formats, publication dates, and the publishing organisations and world locations. GOV.UK is the canonical home for almost every UK government publication, news story, statistical release, consultation, and citizen service.
The catalogue covers the entire central UK government estate, from the Cabinet Office and HM Treasury to HMRC, the Ministry of Defence, the Department for Transport, and over a thousand executive agencies, arm's-length bodies, and tribunals. This Actor makes that data downloadable as CSV, Excel, JSON, or XML in under five minutes. Filters run server-side, so you skip the parser engineering entirely.
| Target Audience | Primary Use Cases |
|---|---|
| Policy analysts, regulatory and compliance teams, journalists, lobbyists, GovTech vendors, academic researchers, market-intelligence firms | Policy monitoring, regulatory horizon-scanning, consultation tracking, FOI release feeds, ministerial speech analysis, statistics release pipelines, press monitoring |
What the GOV.UK Content Search Scraper does
Six filtering workflows in a single run:
- Free-text search. Query any keyword or phrase across titles, descriptions, and body text of every GOV.UK page.
- Format filter. Restrict to a single GOV.UK document format from a list of 130+ (news_story, press_release, guidance, official_statistics, consultation_outcome, statutory_guidance, FOI release, and many more).
- Organisation filter. Restrict to a single department or agency by slug (e.g. `cabinet-office`, `hm-revenue-customs`, `department-for-transport`).
- Date range filter. `publishedAfter` and `publishedBefore` scope to any window on the public timestamp.
- Sort order. Relevance (default), newest first, oldest first, title A-Z, or most popular.
- World locations and topical events. Returned per record so you can pivot by country or government event.
Each record includes the document title, description, canonical GOV.UK URL, content ID, format and document type, publication date, every publishing organisation (with slug and acronym), associated world locations, and topical events.
Why it matters: UK government publications drive regulation, market opportunities, public-sector procurement, and the news cycle. Building your own GOV.UK pipeline means writing a paginated search client, mapping 130+ formats, joining organisation slugs, and refreshing daily. This Actor skips all of that and gives you a clean, refreshed snapshot on every run.
Full Demo
Coming soon: a 3-minute walkthrough showing how to go from sign-up to a downloaded GOV.UK dataset.
Input
| Input | Type | Default | Behavior |
|---|---|---|---|
| `query` | string | "vehicle tax" | Free-text search across the full GOV.UK catalogue. Empty = browse all pages of the chosen format / organisation. |
| `format` | string | "" | One of 130+ GOV.UK document formats. Empty = all formats. |
| `organisation` | string | "" | Organisation slug (e.g. cabinet-office, hm-revenue-customs). Empty = all organisations. |
| `publishedAfter` | string | "" | Earliest publication date (YYYY-MM-DD). |
| `publishedBefore` | string | "" | Latest publication date (YYYY-MM-DD). |
| `orderBy` | string | "" | Sort order: relevance, newest, oldest, title A-Z, or most popular. |
| `maxItems` | integer | 10 | Records to return. Free plan caps at 10, paid plan at 1,000,000. |
Example: 50 most recent HMRC press releases.
{"maxItems": 50,"format": "press_release","organisation": "hm-revenue-customs","orderBy": "-public_timestamp"}
Example: every Department for Transport publication mentioning "low traffic neighbourhood" since 2024.
{"maxItems": 200,"query": "low traffic neighbourhood","organisation": "department-for-transport","publishedAfter": "2024-01-01"}
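If you assemble inputs in code rather than by hand, a small helper can catch malformed dates before a run starts. The sketch below is only an illustration: `build_run_input` and its parameter names are hypothetical, and the only things taken from this page are the input field names and the YYYY-MM-DD date format.

```python
from datetime import date


def build_run_input(query="", organisation="", doc_format="",
                    published_after=None, published_before=None, max_items=10):
    """Assemble an Actor input dict using the field names from the input table.

    Hypothetical helper: it validates locally and does not contact Apify.
    """
    if max_items < 1:
        raise ValueError("max_items must be at least 1")
    run_input = {"maxItems": max_items}
    # Empty strings mean "no filter", so they are simply left out.
    for key, value in (("query", query),
                       ("organisation", organisation),
                       ("format", doc_format)):
        if value:
            run_input[key] = value
    # date.fromisoformat raises ValueError for anything that is not a real date.
    after = date.fromisoformat(published_after) if published_after else None
    before = date.fromisoformat(published_before) if published_before else None
    if after and before and after > before:
        raise ValueError("published_after must not be later than published_before")
    if after:
        run_input["publishedAfter"] = after.isoformat()
    if before:
        run_input["publishedBefore"] = before.isoformat()
    return run_input


# Rebuilds the Department for Transport example shown above.
dft_input = build_run_input(
    query="low traffic neighbourhood",
    organisation="department-for-transport",
    published_after="2024-01-01",
    max_items=200,
)
print(dft_input)
```

Leaving unset filters out of the dict relies on the documented defaults (empty string = no filter); whether the Actor treats a missing key and an empty string identically is an assumption.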
Good to Know: GOV.UK formats evolve over time. Older content may carry a legacy `format` value while newer content uses `documentType`. Both are emitted per record so you can pivot on whichever is most useful for your analysis. The `url` field always returns the canonical absolute GOV.UK link.
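Because both fields are emitted, one way to pivot is to prefer `documentType` and fall back to `format` when it is missing. This is a minimal sketch under that assumption; `effective_type` and `count_by_type` are illustrative names, and the sample records are made up for the demonstration.

```python
from collections import Counter


def effective_type(record):
    """Prefer documentType, fall back to the legacy format value."""
    return record.get("documentType") or record.get("format") or "unknown"


def count_by_type(records):
    """Count records per effective document type."""
    return Counter(effective_type(r) for r in records)


# Made-up records that only carry the two fields relevant here.
sample = [
    {"format": "transaction", "documentType": "transaction"},
    {"format": "publication", "documentType": "guidance"},
    {"format": "news_story"},
    {},
]
counts = count_by_type(sample)
print(counts)
```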
Output
Each record carries up to 11 fields. Download the dataset as CSV, Excel, JSON, or XML.
Schema
| Field | Type | Example |
|---|---|---|
| `title` | string | "Tax your vehicle" |
| `description` | string | "Renew or tax your vehicle for the first time using a reminder letter..." |
| `url` | string | "https://www.gov.uk/vehicle-tax" |
| `contentId` | string | "fa748fae-3de4-4266-ae85-0797ada3f40c" |
| `format` | string | "transaction" |
| `documentType` | string | "transaction" |
| `publishedAt` | ISO 8601 | "2017-12-07T12:54:39Z" |
| `organisations` | array | [{"title": "Driver and Vehicle Licensing Agency", "slug": "driver-and-vehicle-licensing-agency", "acronym": "DVLA"}] |
| `worldLocations` | array | [{"title": "France", "slug": "france"}] |
| `topicalEvents` | array | [{"title": "Spring Budget 2024", "slug": "spring-budget-2024"}] |
| `scrapedAt` | ISO 8601 | "2026-05-15T18:29:40.375Z" |
Sample record
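The record below is not real Actor output. It is assembled from the example column of the schema table purely to show the shape, so the values do not necessarily belong together on one GOV.UK page. The `missing_fields` helper is a hypothetical check you could run on downloaded records.

```python
# The 11 field names listed in the schema table.
EXPECTED_FIELDS = (
    "title", "description", "url", "contentId", "format", "documentType",
    "publishedAt", "organisations", "worldLocations", "topicalEvents", "scrapedAt",
)

# Illustrative only: stitched together from the schema examples, not scraped.
SAMPLE_RECORD = {
    "title": "Tax your vehicle",
    "description": "Renew or tax your vehicle for the first time using a reminder letter...",
    "url": "https://www.gov.uk/vehicle-tax",
    "contentId": "fa748fae-3de4-4266-ae85-0797ada3f40c",
    "format": "transaction",
    "documentType": "transaction",
    "publishedAt": "2017-12-07T12:54:39Z",
    "organisations": [
        {"title": "Driver and Vehicle Licensing Agency",
         "slug": "driver-and-vehicle-licensing-agency",
         "acronym": "DVLA"},
    ],
    "worldLocations": [{"title": "France", "slug": "france"}],
    "topicalEvents": [{"title": "Spring Budget 2024", "slug": "spring-budget-2024"}],
    "scrapedAt": "2026-05-15T18:29:40.375Z",
}


def missing_fields(record):
    """Return the schema fields absent from a record, in schema order."""
    return [name for name in EXPECTED_FIELDS if name not in record]


print(missing_fields(SAMPLE_RECORD))
```

Since records carry "up to" 11 fields, a non-empty result is not necessarily an error; it only tells you which columns to treat as optional downstream.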
Why choose this Actor
| Capability |
|---|
| Whole-of-government coverage. 600,000+ pages from 1,400+ central government bodies, agencies, and tribunals. |
| Multi-dimensional filters. Query, format, organisation, date range, and sort order combine freely. |
| Organisation joins. Each record names every publishing body with slug and acronym for clean joins to your CRM. |
| Fast. 50 pages in seconds, 10,000 records in a few minutes. |
| Authoritative source. Cited by policy researchers, lobbyists, and regulatory teams across the UK. |
| Always fresh. Every run hits the live catalogue, so your dataset reflects current publications. |
| No authentication. Works with public open-government data. No login needed. |
Searchable government publications underpin many of the UK's regulatory horizon-scanning tools, policy newsletters, and procurement dashboards.
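To make the organisation join concrete, the nested `organisations` array can be flattened into one row per (page, organisation) pair, which is the shape most CRMs and SQL joins expect. A minimal sketch, assuming the record layout from the schema table; the function name and the output column names are illustrative.

```python
def explode_by_organisation(records):
    """Return one flat row per (record, publishing organisation) pair."""
    rows = []
    for record in records:
        # Records without organisations produce no rows.
        for org in record.get("organisations") or []:
            rows.append({
                "url": record.get("url"),
                "title": record.get("title"),
                "orgSlug": org.get("slug"),
                "orgAcronym": org.get("acronym"),
            })
    return rows


# Made-up input: one page with one publisher, one page with none.
records = [
    {
        "title": "Tax your vehicle",
        "url": "https://www.gov.uk/vehicle-tax",
        "organisations": [
            {"title": "Driver and Vehicle Licensing Agency",
             "slug": "driver-and-vehicle-licensing-agency",
             "acronym": "DVLA"},
        ],
    },
    {"title": "Page without a publisher", "url": "https://www.gov.uk/example"},
]
rows = explode_by_organisation(records)
print(rows)
```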
How it compares to alternatives
| Approach | Cost | Coverage | Refresh | Filters | Setup |
|---|---|---|---|---|---|
| GOV.UK Content Search Scraper (this Actor) | $5 free credit, then pay-per-use | 600,000+ pages, 130+ formats | Live per run | query, format, organisation, date range, sort | 2 min |
| Commercial policy-monitoring platforms | $10k - $100k/year | Comparable + summaries | Daily | Many | Weeks (procurement) |
| RSS feeds per organisation | Free | Limited per feed | Hourly | Few | Hours (one feed at a time) |
| Manual GOV.UK browsing | Free | Whole site | Live | Same as the website | Forever (no automation) |
Pick this Actor when you want server-side cross-organisation search, structured records, and zero pipeline maintenance.
How to use
- Sign up. Create a free account with $5 of credit (takes 2 minutes).
- Open the Actor. Go to the GOV.UK Content Search Scraper page on the Apify Store.
- Set input. Type a query (or leave it empty), pick an organisation or format, set a date window if you need one, and set `maxItems`.
- Run it. Click Start and let the Actor collect your results.
- Download. Grab your dataset in the Dataset tab as CSV, Excel, JSON, or XML.
Total time from signup to a downloaded GOV.UK dataset: 3-5 minutes. No coding required.
Business use cases
Automating GOV.UK Scraper
Control the scraper programmatically for scheduled runs and pipeline integrations:
- Node.js. Install the `apify-client` NPM package.
- Python. Use the `apify-client` PyPI package.
- See the Apify API documentation for full details.
The Apify Schedules feature lets you trigger this Actor on any cron interval. Hourly or daily refreshes keep downstream policy monitors and dashboards in sync automatically.
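As a rough sketch of the Python route, the snippet below starts a run with the `apify-client` package and reads back the default dataset. `ACTOR_ID` is a placeholder, not this Actor's real ID (copy that from the Store page), `fetch_records` is a hypothetical wrapper, and the input simply repeats the HMRC example from the Input section.

```python
# Placeholder: replace with the ID shown on this Actor's Apify Store page.
ACTOR_ID = "<username>/<actor-name>"

# Same input as the "50 most recent HMRC press releases" example.
RUN_INPUT = {
    "maxItems": 50,
    "format": "press_release",
    "organisation": "hm-revenue-customs",
    "orderBy": "-public_timestamp",
}


def fetch_records(token, actor_id=ACTOR_ID, run_input=RUN_INPUT):
    """Start the Actor, wait for it to finish, and return its dataset items."""
    # Imported here so the module loads even where apify-client is not installed.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token)
    # call() blocks until the run finishes and returns the run details.
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("The Actor run did not return any run details")
    # Assumes the run details expose defaultDatasetId as a dict key.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

A typical call would be `fetch_records(os.environ["APIFY_TOKEN"])`, with the token stored in an environment variable rather than in source code.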
Beyond business use cases
Open government data powers more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.
Frequently Asked Questions
How does it work?
Type a query, optionally pick an organisation or format, click Start, and the Actor pages through the official GOV.UK content index, applies your filters, and emits a clean structured record per page. No browser automation, no captchas, no setup.
How accurate is the data?
Every record comes from the canonical GOV.UK content index used by the gov.uk website itself, so titles, descriptions, dates, and organisation references match what you see on the page.
How often is the dataset refreshed?
GOV.UK is updated continuously as departments publish new pages. Every run of this Actor hits the live catalogue.
Which document formats are supported?
130+ GOV.UK formats including news_story, press_release, guidance, official_statistics, consultation_outcome, statutory_guidance, FOI release, statistical_data_set, transparency, corporate_report, decision, and many more. Use the format filter to scope to one type.
How do I find an organisation slug?
The slug is the last segment of the organisation page URL. For example, the Driver and Vehicle Licensing Agency lives at https://www.gov.uk/government/organisations/driver-and-vehicle-licensing-agency, so its slug is driver-and-vehicle-licensing-agency.
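If you already have a list of organisation page URLs, the slug can be pulled out programmatically. A minimal sketch using only the standard library; `organisation_slug` is an illustrative name, and the path prefix is taken from the DVLA example above.

```python
from urllib.parse import urlparse

# Path prefix seen in the organisation URL example above.
ORG_PREFIX = "/government/organisations/"


def organisation_slug(url):
    """Extract the organisation slug from a GOV.UK organisation page URL."""
    path = urlparse(url).path
    if not path.startswith(ORG_PREFIX):
        raise ValueError(f"Not an organisation page URL: {url}")
    # Take the first segment after the prefix, so sub-pages and
    # trailing slashes still resolve to the organisation itself.
    slug = path[len(ORG_PREFIX):].strip("/").split("/")[0]
    if not slug:
        raise ValueError(f"No slug found in URL: {url}")
    return slug


slug = organisation_slug(
    "https://www.gov.uk/government/organisations/driver-and-vehicle-licensing-agency"
)
print(slug)
```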
Can I scope to a date range?
Yes. Use publishedAfter and publishedBefore (YYYY-MM-DD). Both are inclusive and run against the document's public timestamp.
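To double-check a downloaded dataset against the window you asked for, the inclusive bounds can be re-applied client-side to each record's `publishedAt`. This sketch compares UTC calendar dates, which is an assumption about how the filter treats times of day; `in_window` is a hypothetical helper.

```python
from datetime import date, datetime


def in_window(published_at, published_after=None, published_before=None):
    """Return True if an ISO 8601 timestamp falls inside an inclusive date window."""
    # Older Python versions reject a trailing "Z", so rewrite it as an offset.
    day = datetime.fromisoformat(published_at.replace("Z", "+00:00")).date()
    if published_after and day < date.fromisoformat(published_after):
        return False
    if published_before and day > date.fromisoformat(published_before):
        return False
    return True


# The publishedAt example from the schema table, checked against a one-day window.
same_day = in_window("2017-12-07T12:54:39Z", "2017-12-07", "2017-12-07")
too_early = in_window("2017-12-07T12:54:39Z", published_after="2017-12-08")
print(same_day, too_early)
```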
Can I schedule regular runs?
Yes. Use Apify Schedules to run this Actor on any cron interval (hourly, daily, weekly) and keep downstream policy monitors in sync.
Is this data legal to use?
Except where otherwise stated, GOV.UK content is published under the Open Government Licence v3.0, which permits commercial reuse with attribution. Review the licence terms for your specific application.
Can I use this data commercially?
Yes. The Open Government Licence v3.0 explicitly allows commercial reuse with attribution. You remain responsible for following the licence terms in your product.
Do I need a paid Apify plan to use this Actor?
No. The free Apify plan is enough for testing and small runs (10 records per run). A paid plan lifts the limit and gives you scheduling, higher concurrency, and larger datasets.
What happens if a run fails or gets interrupted?
Apify automatically retries transient errors. If a run still fails, you can inspect the log in the Runs tab, fix the input, and re-run. Partial datasets from failed runs are preserved, so results collected before the failure are not lost.
What if I need help?
Our support team is here to help. Contact us through the Apify platform or use the Tally form linked below.
Integrate with any app
GOV.UK Content Search Scraper connects to any cloud service via Apify integrations:
- Make - Automate multi-step monitoring workflows
- Zapier - Connect with 5,000+ apps
- Slack - Get new-publication alerts in your channels
- Airbyte - Pipe GOV.UK pages into your warehouse
- GitHub - Trigger runs from commits and releases
- Google Drive - Export datasets straight to Sheets
You can also use webhooks to trigger downstream actions when a run finishes. Push fresh policy publications into your CRM, or alert your communications team in Slack.
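On the receiving end of a webhook, the first job is usually to find which dataset the finished run wrote to. The sketch below assumes Apify's default webhook payload, with an `eventType` string and a `resource` object that carries the run's `defaultDatasetId`. Treat that layout as an assumption and check it against the payload you actually receive, especially if you use a custom payload template.

```python
def dataset_id_from_webhook(payload):
    """Return the default dataset ID for a succeeded run, or None otherwise."""
    # Assumed event name for a successfully finished Actor run.
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")


# Hand-written payloads for illustration; the IDs are made up.
succeeded = {
    "eventType": "ACTOR.RUN.SUCCEEDED",
    "resource": {"id": "run-123", "defaultDatasetId": "dataset-456"},
}
failed = {"eventType": "ACTOR.RUN.FAILED", "resource": {"id": "run-124"}}

print(dataset_id_from_webhook(succeeded))
print(dataset_id_from_webhook(failed))
```

The returned ID can then be used to download the run's items before pushing them to a CRM or Slack.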
Recommended Actors
- UK Parliament Members Scraper - MPs and Lords with biographies, committees, and contact details
- Hansard UK Parliament Debates Scraper - Full transcripts of Commons and Lords debates
- OpenSanctions Sanctions & PEP Scraper - 280k+ sanctioned entities and PEPs
- Carbon Intensity UK Scraper - National Grid carbon intensity forecasts
- OurAirports Global Airport Database Scraper - 85,000+ airports worldwide
Pro Tip: browse the complete ParseForge collection for more reference-data scrapers.
Need Help? Open our contact form to request a new scraper, propose a custom data project, or report an issue.
Disclaimer: this Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by the UK Government, the Government Digital Service, or any UK department. All trademarks mentioned are the property of their respective owners. Only publicly available open-government content is collected, under the Open Government Licence.