GOV.UK Content Search Scraper

Pricing

from $22.87 / 1,000 results

Scrape GOV.UK: search the entire UK government publications catalogue (policies, guidance, news, statistics). Filter by query, organisation, format or date. Returns titles, descriptions, URLs, organisations and publication dates.

Developer

ParseForge

Maintained by Community

🇬🇧 GOV.UK Content Search Scraper

🚀 Search the entire UK government catalogue in seconds. Pull 600,000+ pages, publications, news stories, statistics, consultations, and services straight from GOV.UK. Filter by query, organisation, format, or date. No sign-up, no manual paging, no parser to maintain.

🕒 Last updated: 2026-05-15 · 📊 11 fields per record · 📚 600,000+ pages · 🏛️ 1,400+ government bodies · 📂 130+ document formats

The GOV.UK Content Search Scraper queries the official UK government content index and returns up to 11 structured fields per record: title, description, URL, content ID, format, document type, publication date, publishing organisations, world locations, topical events, and a scrape timestamp. GOV.UK is the canonical home for almost every UK government publication, news story, statistical release, consultation, and citizen service.

The catalogue covers the entire central UK government estate, from the Cabinet Office and HM Treasury to HMRC, the Ministry of Defence, the Department for Transport, and over a thousand executive agencies, arm's-length bodies, and tribunals. This Actor makes that data downloadable as CSV, Excel, JSON, or XML in under five minutes. Filters run server-side, so you skip the parser engineering entirely.

| 🎯 Target Audience | 💡 Primary Use Cases |
| --- | --- |
| Policy analysts, regulatory and compliance teams, journalists, lobbyists, GovTech vendors, academic researchers, market-intelligence firms | Policy monitoring, regulatory horizon-scanning, consultation tracking, FOI release feeds, ministerial speech analysis, statistics release pipelines, press monitoring |

📋 What the GOV.UK Content Search Scraper does

Six filtering workflows in a single run:

  • 🔎 Free-text search. Query any keyword or phrase across titles, descriptions, and body text of every GOV.UK page.
  • 📂 Format filter. Restrict to a single GOV.UK document format from a list of 130+ (news_story, press_release, guidance, official_statistics, consultation_outcome, statutory_guidance, FOI release, and many more).
  • 🏛️ Organisation filter. Restrict to a single department or agency by slug (e.g. cabinet-office, hm-revenue-customs, department-for-transport).
  • 📅 Date range filter. publishedAfter and publishedBefore scope to any window on the public timestamp.
  • 🔢 Sort order. Relevance (default), newest first, oldest first, title A-Z, or most popular.
  • 🌍 World locations and topical events. Returned per record so you can pivot by country or government event.

Each record includes the document title, description, canonical GOV.UK URL, content ID, format and document type, publication date, every publishing organisation (with slug and acronym), associated world locations, and topical events.

💡 Why it matters: UK government publications drive regulation, market opportunities, public-sector procurement, and the news cycle. Building your own GOV.UK pipeline means writing a paginated search client, mapping 130+ formats, joining organisation slugs, and refreshing daily. This Actor skips all of that and gives you a clean refreshed snapshot on every run.


🎬 Full Demo

🚧 Coming soon: a 3-minute walkthrough showing how to go from sign-up to a downloaded GOV.UK dataset.


⚙️ Input

| Input | Type | Default | Behavior |
| --- | --- | --- | --- |
| query | string | "vehicle tax" | Free-text search across the full GOV.UK catalogue. Empty = browse all pages of the chosen format / organisation. |
| format | string | "" | One of 130+ GOV.UK document formats. Empty = all formats. |
| organisation | string | "" | Organisation slug (e.g. cabinet-office, hm-revenue-customs). Empty = all organisations. |
| publishedAfter | string | "" | Earliest publication date (YYYY-MM-DD). |
| publishedBefore | string | "" | Latest publication date (YYYY-MM-DD). |
| orderBy | string | "" | Sort order: relevance, newest, oldest, title A-Z, or most popular. |
| maxItems | integer | 10 | Records to return. Free plan caps at 10, paid plan at 1,000,000. |
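
The input fields above can be assembled and sanity-checked before a run. The sketch below is one way to do that in Python; the helper name build_input and its snake_case parameters are illustrative assumptions, not part of the Actor. Every field is sent explicitly, so an empty query stays empty instead of falling back to the "vehicle tax" default.

```python
from datetime import date
from typing import Any


def build_input(
    query: str = "",
    doc_format: str = "",
    organisation: str = "",
    published_after: str = "",
    published_before: str = "",
    order_by: str = "",
    max_items: int = 10,
) -> dict[str, Any]:
    """Assemble an Actor input dict (hypothetical helper, not an Actor API)."""
    if max_items < 1:
        raise ValueError("maxItems must be at least 1")

    # date.fromisoformat raises ValueError for strings that are not valid ISO dates.
    after = date.fromisoformat(published_after) if published_after else None
    before = date.fromisoformat(published_before) if published_before else None
    if after and before and after > before:
        raise ValueError("publishedAfter must not be later than publishedBefore")

    # Keys match the Input table; empty strings mean "no filter".
    return {
        "query": query,
        "format": doc_format,
        "organisation": organisation,
        "publishedAfter": published_after,
        "publishedBefore": published_before,
        "orderBy": order_by,
        "maxItems": max_items,
    }
```

Calling build_input(doc_format="press_release", organisation="hm-revenue-customs", max_items=50) produces the same filters as the HMRC example, minus the sort order.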

Example: 50 most recent HMRC press releases.

{
    "maxItems": 50,
    "format": "press_release",
    "organisation": "hm-revenue-customs",
    "orderBy": "-public_timestamp"
}

Example: every Department for Transport publication mentioning "low traffic neighbourhood" since 2024.

{
    "maxItems": 200,
    "query": "low traffic neighbourhood",
    "organisation": "department-for-transport",
    "publishedAfter": "2024-01-01"
}

⚠️ Good to Know: GOV.UK formats evolve over time. Older content may carry a legacy format value while newer content uses documentType. Both are emitted per record so you can pivot on whichever is most useful for your analysis. The url field always returns the canonical absolute GOV.UK link.
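
Because both format and documentType are emitted, downstream code may want a single value to group on. A minimal sketch follows, assuming records arrive as plain dictionaries; preferring documentType over format is a choice made here for illustration, not something the Actor prescribes.

```python
from typing import Mapping, Optional


def effective_type(record: Mapping[str, object]) -> Optional[str]:
    """Return documentType when present, otherwise the legacy format value."""
    for key in ("documentType", "format"):
        value = record.get(key)
        # Skip missing keys and empty strings.
        if isinstance(value, str) and value:
            return value
    return None
```

Swap the order of the two keys if your analysis is built around the legacy format values.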


📊 Output

Each record carries up to 11 fields. Download the dataset as CSV, Excel, JSON, or XML.

🧾 Schema

| Field | Type | Example |
| --- | --- | --- |
| 📌 title | string | "Tax your vehicle" |
| 📝 description | string | "Renew or tax your vehicle for the first time using a reminder letter..." |
| 🔗 url | string | "https://www.gov.uk/vehicle-tax" |
| 🆔 contentId | string | "fa748fae-3de4-4266-ae85-0797ada3f40c" |
| 📂 format | string | "transaction" |
| 📂 documentType | string | "transaction" |
| 📅 publishedAt | ISO 8601 | "2017-12-07T12:54:39Z" |
| 🏛️ organisations | array | [{"title": "Driver and Vehicle Licensing Agency", "slug": "driver-and-vehicle-licensing-agency", "acronym": "DVLA"}] |
| 🌍 worldLocations | array | [{"title": "France", "slug": "france"}] |
| 🏷️ topicalEvents | array | [{"title": "Spring Budget 2024", "slug": "spring-budget-2024"}] |
| 🕒 scrapedAt | ISO 8601 | "2026-05-15T18:29:40.375Z" |

📦 Sample record
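
For typed pipelines, the 11 fields in the schema can be modelled roughly as below. This is a sketch only: the class names are assumptions, and the example value is assembled from the Example column of the schema table rather than captured from a real run. The description is left out because the schema shows it truncated, and total=False reflects the "up to 11 fields" wording.

```python
from typing import TypedDict


class Organisation(TypedDict):
    title: str
    slug: str
    acronym: str


class Tag(TypedDict):
    """Shape shared by worldLocations and topicalEvents entries."""
    title: str
    slug: str


class GovUkRecord(TypedDict, total=False):
    title: str
    description: str
    url: str
    contentId: str
    format: str
    documentType: str
    publishedAt: str  # ISO 8601
    organisations: list[Organisation]
    worldLocations: list[Tag]
    topicalEvents: list[Tag]
    scrapedAt: str  # ISO 8601


# Illustrative record built from the schema table's example values.
example: GovUkRecord = {
    "title": "Tax your vehicle",
    "url": "https://www.gov.uk/vehicle-tax",
    "contentId": "fa748fae-3de4-4266-ae85-0797ada3f40c",
    "format": "transaction",
    "documentType": "transaction",
    "publishedAt": "2017-12-07T12:54:39Z",
    "organisations": [
        {
            "title": "Driver and Vehicle Licensing Agency",
            "slug": "driver-and-vehicle-licensing-agency",
            "acronym": "DVLA",
        }
    ],
    "scrapedAt": "2026-05-15T18:29:40.375Z",
}
```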


✨ Why choose this Actor

| | Capability |
| --- | --- |
| 📚 | Whole-of-government coverage. 600,000+ pages from 1,400+ central government bodies, agencies, and tribunals. |
| 🎯 | Multi-dimensional filters. Query, format, organisation, date range, and sort order combine freely. |
| 🏛️ | Organisation joins. Each record names every publishing body with slug and acronym for clean joins to your CRM. |
| ⚡ | Fast. 50 pages in seconds, 10,000 records in a few minutes. |
| 🌐 | Authoritative source. Cited by policy researchers, lobbyists, and regulatory teams across the UK. |
| 🔁 | Always fresh. Every run hits the live catalogue, so your dataset reflects current publications. |
| 🚫 | No authentication. Works with public open-government data. No login needed. |

📊 Searchable government publications are the foundation of every regulatory horizon-scanning tool, policy newsletter, and procurement dashboard in the UK.


📈 How it compares to alternatives

| Approach | Cost | Coverage | Refresh | Filters | Setup |
| --- | --- | --- | --- | --- | --- |
| ⭐ GOV.UK Content Search Scraper (this Actor) | $5 free credit, then pay-per-use | 600,000+ pages, 130+ formats | Live per run | query, format, organisation, date range, sort | ⚡ 2 min |
| Commercial policy-monitoring platforms | $10k - $100k/year | Comparable + summaries | Daily | Many | 🐢 Weeks (procurement) |
| RSS feeds per organisation | Free | Limited per feed | Hourly | Few | 🕒 Hours (one feed at a time) |
| Manual GOV.UK browsing | Free | Whole site | Live | Same as the website | ⏳ Forever (no automation) |

Pick this Actor when you want server-side cross-organisation search, structured records, and zero pipeline maintenance.


🚀 How to use

  1. 📝 Sign up. Create a free account with $5 of credit (takes 2 minutes).
  2. 🌐 Open the Actor. Go to the GOV.UK Content Search Scraper page on the Apify Store.
  3. 🎯 Set input. Type a query (or leave empty), pick an organisation or format, set a date window if you need one, and set maxItems.
  4. 🚀 Run it. Click Start and let the Actor collect your results.
  5. 📥 Download. Grab your dataset in the Dataset tab as CSV, Excel, JSON, or XML.

⏱️ Total time from signup to a downloaded GOV.UK dataset: 3-5 minutes. No coding required.
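
If you would rather script the download step than click through the Dataset tab, the Apify API serves dataset items at /v2/datasets/{datasetId}/items with a format query parameter. The sketch below only builds that URL; the helper name and the dataset ID in the usage note are placeholders, and a dataset that is not public will also need your API token.

```python
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"
# Export formats named in this README; "xlsx" is the API value for Excel.
EXPORT_FORMATS = {"csv", "xlsx", "json", "xml"}


def dataset_export_url(dataset_id: str, fmt: str = "csv") -> str:
    """Build the download URL for a finished run's dataset (hypothetical helper)."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported export format: {fmt}")
    return f"{API_BASE}/datasets/{dataset_id}/items?{urlencode({'format': fmt})}"
```

For example, dataset_export_url("YOUR_DATASET_ID", "xlsx") yields a link you can pass to any HTTP client or spreadsheet import.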


💼 Business use cases

🏛️ Policy & Regulatory Monitoring

  • Daily horizon-scanning across every UK department
  • Filter by consultation_outcome to track ended consultations
  • Pivot on topicalEvents for budget and statement coverage
  • Build a structured policy-change feed for compliance teams

📰 News, PR & Communications

  • Press monitoring for ministerial speeches and statements
  • Track press releases by department for media briefings
  • Catch FOI releases the moment they go public
  • Power newsroom dashboards with structured GOV.UK feeds

📊 Statistics & Open Data

  • Pull every official_statistics release for analytics pipelines
  • Track statistics_announcement for upcoming publications
  • Cross-join with publishing organisation for sector views
  • Replace brittle RSS parsing with structured records
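
The cross-join with publishing organisations can start as a simple tally. The sketch below assumes records are dictionaries shaped like the output schema; the function name is illustrative. A record published jointly by two bodies counts once for each.

```python
from collections import Counter
from typing import Iterable, Mapping


def count_by_organisation(records: Iterable[Mapping]) -> Counter:
    """Count records per organisation slug."""
    counts: Counter = Counter()
    for record in records:
        # organisations may be missing or empty on some records.
        for org in record.get("organisations") or []:
            counts[org["slug"]] += 1
    return counts
```

Use counts.most_common(10) for a quick sector view before loading the data into a warehouse.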

🛒 Public Procurement & GovTech

  • Monitor procurement and contract notices by organisation
  • Spot upcoming consultations that signal policy direction
  • Build vendor dashboards from corporate_report releases
  • Feed CRM enrichment with department slugs and acronyms

🔌 Automating GOV.UK Scraper

Control the scraper programmatically for scheduled runs and pipeline integrations:

  • 🟢 Node.js. Install the apify-client NPM package.
  • 🐍 Python. Use the apify-client PyPI package.
  • 📚 See the Apify API documentation for full details.

The Apify Schedules feature lets you trigger this Actor on any cron interval. Hourly or daily refreshes keep downstream policy monitors and dashboards in sync automatically.
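
A scheduled refresh usually wants a rolling date window rather than fixed dates. The following is a minimal Python sketch using the apify-client package. The ACTOR_ID value is a placeholder (copy the real ID from the Actor page), rolling_window_input is a hypothetical helper, and the dictionary-style access to the run assumes a client version that returns plain dicts.

```python
from datetime import date, timedelta
from typing import Any, Iterator

# Placeholder: replace with the ID shown on the Actor's Apify Store page.
ACTOR_ID = "<username>/<actor-name>"


def rolling_window_input(today: date, days: int = 1, max_items: int = 200) -> dict[str, Any]:
    """Input covering the last `days` days up to and including `today`."""
    return {
        "query": "",
        "publishedAfter": (today - timedelta(days=days)).isoformat(),
        "publishedBefore": today.isoformat(),
        "maxItems": max_items,
    }


def run_actor(token: str, actor_input: dict[str, Any]) -> Iterator[dict[str, Any]]:
    """Start a run, wait for it to finish, and yield its dataset items."""
    # Imported lazily so the helper above works without apify-client installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=actor_input)
    if run is None:
        raise RuntimeError("Actor run did not return a result")
    yield from client.dataset(run["defaultDatasetId"]).iterate_items()
```

A daily job could then call run_actor(token, rolling_window_input(date.today())) and append the results to its own store.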


🌟 Beyond business use cases

Open government data powers more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.

🎓 Research and academia

  • Quantitative studies of policy publication patterns
  • Public-administration coursework on government output
  • Historical archives of consultations and outcomes
  • Reproducible datasets for political-science research

🎨 Personal and creative

  • Side projects mapping departmental publication volumes
  • Newsletter generators that summarise weekly releases
  • Personal RSS-style feeds across multiple organisations
  • Visualisations of policy themes over time

🤝 Non-profit and civic

  • Civic-tech tools that surface relevant consultations to citizens
  • Investigative journalism on department-level disclosure patterns
  • Watchdog dashboards tracking ministerial communications
  • Accessibility projects that reformat GOV.UK content

🧪 Experimentation

  • Train classifiers that auto-tag policy areas
  • Build agent pipelines that summarise daily releases
  • Prototype recommender systems for citizen services
  • Stress-test search infrastructure with real volume data

❓ Frequently Asked Questions

🧩 How does it work?

Type a query, optionally pick an organisation or format, click Start, and the Actor pages through the official GOV.UK content index, applies your filters, and emits a clean structured record per page. No browser automation, no captchas, no setup.
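
The Actor's internals are not documented here, but "paging through" an index generally means requesting fixed-size windows until maxItems is reached. The sketch below shows that arithmetic under the assumption of an offset-and-count style index; the function name and the page size of 100 are illustrative only.

```python
from typing import Iterator


def page_windows(max_items: int, page_size: int = 100) -> Iterator[tuple[int, int]]:
    """Yield (start, count) pairs that cover max_items without overshooting."""
    if max_items < 0 or page_size < 1:
        raise ValueError("max_items must be >= 0 and page_size >= 1")
    start = 0
    while start < max_items:
        # The final window is trimmed so the total never exceeds max_items.
        count = min(page_size, max_items - start)
        yield start, count
        start += count
```

With max_items=250 this produces three windows, the last one asking for only 50 records.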

📏 How accurate is the data?

Every record comes from the canonical GOV.UK content index used by the gov.uk website itself, so titles, descriptions, dates, and organisation references match what you see on the page.

🔁 How often is the dataset refreshed?

GOV.UK is updated continuously as departments publish new pages. Every run of this Actor hits the live catalogue.

📂 Which document formats are supported?

130+ GOV.UK formats including news_story, press_release, guidance, official_statistics, consultation_outcome, statutory_guidance, FOI release, statistical_data_set, transparency, corporate_report, decision, and many more. Use the format filter to scope to one type.

🏛️ How do I find an organisation slug?

The slug is the last segment of the organisation page URL. For example, the Driver and Vehicle Licensing Agency lives at https://www.gov.uk/government/organisations/driver-and-vehicle-licensing-agency, so its slug is driver-and-vehicle-licensing-agency.
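
If you keep a list of organisation page URLs, the slug can be pulled out in code. This is a small sketch using only the standard library; the function name is an assumption, and it takes the path segment right after /government/organisations/ so that sub-pages and trailing slashes still resolve to the same slug.

```python
from urllib.parse import urlparse

ORG_PREFIX = "/government/organisations/"


def organisation_slug(url: str) -> str:
    """Extract the organisation slug from a GOV.UK organisation page URL."""
    path = urlparse(url).path
    if not path.startswith(ORG_PREFIX):
        raise ValueError(f"not an organisation page URL: {url}")
    # First segment after the prefix; ignores trailing slashes and sub-pages.
    slug = path[len(ORG_PREFIX):].strip("/").split("/")[0]
    if not slug:
        raise ValueError(f"no slug found in URL: {url}")
    return slug
```

Passing the DVLA URL above returns driver-and-vehicle-licensing-agency, ready for the organisation input field.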

📅 Can I scope to a date range?

Yes. Use publishedAfter and publishedBefore (YYYY-MM-DD). Both are inclusive and run against the document's public timestamp.

⏰ Can I schedule regular runs?

Yes. Use Apify Schedules to run this Actor on any cron interval (hourly, daily, weekly) and keep downstream policy monitors in sync.

💼 Can I use this data commercially?

Yes. GOV.UK content is published under the Open Government Licence v3.0, which explicitly permits commercial reuse with attribution. You remain responsible for following the licence terms in your product, so review them for your specific application.

💳 Do I need a paid Apify plan to use this Actor?

No. The free Apify plan is enough for testing and small runs (10 records per run). A paid plan lifts the limit and gives you scheduling, higher concurrency, and larger datasets.

🔁 What happens if a run fails or gets interrupted?

Apify automatically retries transient errors. If a run still fails, you can inspect the log in the Runs tab, fix the input, and re-run. Partial datasets from failed runs are preserved so you never lose progress.
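
When a re-run overlaps with a partial dataset from a failed run, the two can be combined without duplicates by keying on contentId. The sketch below assumes every record carries a contentId and that the most recent run should win; the function name is illustrative.

```python
from typing import Iterable, Mapping


def merge_runs(*runs: Iterable[Mapping]) -> list[Mapping]:
    """Merge datasets from several runs, keeping one record per contentId."""
    merged: dict[str, Mapping] = {}
    for run in runs:
        for record in run:
            # Later runs overwrite earlier ones for the same contentId.
            merged[record["contentId"]] = record
    return list(merged.values())
```

Pass the partial dataset first and the re-run second, for example merge_runs(partial_items, rerun_items).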

🆘 What if I need help?

Our support team is here to help. Contact us through the Apify platform or use the Tally form linked below.


🔌 Integrate with any app

GOV.UK Content Search Scraper connects to any cloud service via Apify integrations:

  • Make - Automate multi-step monitoring workflows
  • Zapier - Connect with 5,000+ apps
  • Slack - Get new-publication alerts in your channels
  • Airbyte - Pipe GOV.UK pages into your warehouse
  • GitHub - Trigger runs from commits and releases
  • Google Drive - Export datasets straight to Sheets

You can also use webhooks to trigger downstream actions when a run finishes. Push fresh policy publications into your CRM, or alert your communications team in Slack.
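
On the receiving side, a webhook handler mainly needs the ID of the dataset the run produced. The sketch below assumes Apify's default webhook payload, in which eventType names the event and resource holds the run object with its defaultDatasetId; if you customise the payload template, adjust the keys accordingly. The function name is illustrative.

```python
from typing import Any, Mapping, Optional


def dataset_id_from_webhook(payload: Mapping[str, Any]) -> Optional[str]:
    """Return the run's dataset ID for a successful run, otherwise None."""
    # Only react to successful runs; failed or aborted runs are ignored.
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")
```

The returned ID can then be used to fetch the new records and forward them to a CRM or a Slack channel.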


💡 Pro Tip: browse the complete ParseForge collection for more reference-data scrapers.


🆘 Need Help? Open our contact form to request a new scraper, propose a custom data project, or report an issue.


⚠️ Disclaimer: this Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by the UK Government, the Government Digital Service, or any UK department. All trademarks mentioned are the property of their respective owners. Only publicly available open-government content is collected, under the Open Government Licence.