UK Companies House API Scraper — PSC, iXBRL Accounts, Filings

Pricing

from $2.00 / 1,000 company records

Scrape UK Companies House data via 54 modes: company profiles, PSC ownership graphs, officer networks, iXBRL-parsed accounts (FRS 102/105), filings, charges, gazette notices, and AI director briefs. Pooled API key included — first 20 rows free.

Developer

Domin Vo

Maintained by Community

Pull clean, structured data from UK Companies House — without writing a single line of API code. This Actor delivers company profiles, beneficial ownership (PSC) graphs, officer networks, parsed annual accounts, filings, mortgages, charges, and Gazette notices for any of the 5 million+ UK companies on file.

You enter a company number or a name. We handle the rest: rate limits, retries, key rotation, pagination, and parsing. Results land in your Apify dataset as JSON, CSV, NDJSON, Excel — or LLM-ready Markdown — ready for your spreadsheet, BI tool, or AI agent.

What does Companies House API Scraper do?

One Actor, 54 modes, every UK company. Pick a mode (e.g. company_profile, psc_list, ixbrl_accounts), feed it a CRN (09446231 = Monzo Bank) or a company name, and get structured rows back in seconds.

Every row uses the same envelope — sha256, mode, crn, transaction_id, filing_date, payload — so a PSC node and an iXBRL accounts fact ingest the same way downstream.
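
For typed downstream code, the envelope can be modelled once and reused for every mode. The sketch below is an illustration under assumptions, not the Actor's published schema: `Envelope`, `REQUIRED_KEYS`, and `normalize_row` are hypothetical names, and treating `transaction_id` and `filing_date` as optional is inferred from the field list rather than confirmed.

```python
from typing import Any, Optional, TypedDict


class Envelope(TypedDict):
    """Hypothetical typed view of one dataset row."""
    sha256: str
    mode: str
    crn: str
    transaction_id: Optional[str]  # assumed to be absent for some modes
    filing_date: Optional[str]     # assumed to be absent for some modes
    payload: dict[str, Any]        # mode-specific content


# Assumption: these four keys are present on every row.
REQUIRED_KEYS = ("sha256", "mode", "crn", "payload")


def normalize_row(row: dict[str, Any]) -> Envelope:
    """Check the required keys and fill the optional ones with None."""
    missing = [key for key in REQUIRED_KEYS if key not in row]
    if missing:
        raise ValueError(f"row is missing envelope keys: {missing}")
    return Envelope(
        sha256=row["sha256"],
        mode=row["mode"],
        crn=row["crn"],
        transaction_id=row.get("transaction_id"),
        filing_date=row.get("filing_date"),
        payload=row["payload"],
    )
```

Normalizing at the ingestion boundary means later code can read `row["transaction_id"]` without guarding against a missing key.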

Three things make this Actor different from a plain Companies House scraper:

  1. PSC ownership graphs, ready-made. Beneficial ownership trees, co-directorship networks, and shared-address clusters — pre-built. No multi-call recursion to write yourself.
  2. iXBRL accounts as typed JSON. UK statutory accounts (FRS 102, FRS 105, UK-IFRS) parsed into 59 canonical statement lines (Revenue, Operating Profit, Total Assets, Cash, …). No PDF wrangling. No other Apify Actor ships this.
  3. Pay only for rows you keep. Pay-per-event pricing from $0.00005 to $0.04. The first 20 rows of every run are free — prototype at zero cost.

First 20 result rows per run are free. No credit card to start.

Why scrape UK Companies House data?

  • KYC & onboarding — verify a company, surface its ultimate beneficial owners, and screen for disqualified directors or insolvency history in one API call. Skip the ComplyAdvantage seat fee.
  • M&A & due diligence — pull parsed accounts year-over-year and compare peers without opening a single PDF.
  • Compliance monitoring — track Gazette strike-off and winding-up notices, late filings, and director changes as they happen via SSE streams.
  • Quant & risk signals — feed PSC churn, charge frequency, and incorporation bursts into your risk model.
  • AI agents — every row carries optional LLM-ready Markdown, so your agent cites Companies House facts with provenance.

How to use the Companies House API Scraper

  1. Click Try for free on the Actor page.
  2. Pick a mode (e.g. company_profile, psc_list, ixbrl_accounts).
  3. Enter a CRN (e.g. 09446231 for Monzo Bank) — or a company name. We resolve it for you.
  4. Click Start.
  5. Grab your data from the Output tab.

That's it. No API key to register. No rate-limit code to write. No proxies to configure.

Or call it from Python:

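from apify_client import ApifyClient

# Authenticate with your Apify API token.
client = ApifyClient("<APIFY_TOKEN>")

# Run the Actor in psc_list mode for Monzo Bank and wait for the run to finish.
run = client.actor("dominvo/uk-companies-house-api-ai-scraper").call(run_input={
    "mode": "psc_list",
    "crn": "09446231",
    "maxResults": 50,
})

# Print each PSC's name and natures of control from the run's default dataset.
for row in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(row["payload"]["name"], row["payload"]["natures_of_control"])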

Input — picking a mode

54 modes, one Actor. Pick the mode that matches what you need:

| Goal | Mode |
| --- | --- |
| Company profile, status, SIC codes | company_profile |
| Search the register | company_search, advanced_search_companies |
| Who controls this company? (PSC) | psc_list, psc_individual, psc_corporate_entity |
| Directors and officers | company_officers, officer_appointments |
| Disqualified directors | disqualified_officer_search |
| Annual accounts (parsed iXBRL) | ixbrl_accounts, accounts_section_diff |
| Filing history | filing_history, filing_document, filing_chunks |
| Mortgages & charges | charges_list, charge_detail |
| Insolvency cases | insolvency_cases |
| Live change feeds (SSE) | companies_stream, filings_stream, psc_stream, charges_stream |
| Gazette strike-off notices | gazette_notices |
| Co-directorship & address graphs | officer_network, address_cluster |
| AI director risk brief | ai_director_brief, ai_summary |
| Composite risk score | strike_off_risk_score, director_risk_score |
| Bulk register dumps (millions of rows) | bulk_company_snapshot, bulk_psc_snapshot, bulk_accounts_archive |

Common inputs: crn (8-character UK company number), name (auto-resolved), query (search text), maxResults (cap rows per run), outputFormat (json / csv / xlsx / ndjson / md).
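
If you build inputs programmatically, a small helper can catch malformed values before a run starts. This is a minimal sketch under assumptions: `normalize_crn` and `build_run_input` are hypothetical helpers, the rule that exactly one of `crn` or `name` is passed is an assumption for company-scoped modes (search modes take `query` instead), and the Actor may already perform its own normalization.

```python
import re
from typing import Any, Optional

# Formats listed for the outputFormat input.
OUTPUT_FORMATS = {"json", "csv", "xlsx", "ndjson", "md"}


def normalize_crn(raw: str) -> str:
    """Upper-case the value, left-pad all-digit numbers to 8 characters, then validate."""
    crn = raw.strip().upper()
    if crn.isdigit():
        crn = crn.zfill(8)
    if not re.fullmatch(r"[A-Z0-9]{8}", crn):
        raise ValueError(f"not an 8-character company number: {raw!r}")
    return crn


def build_run_input(mode: str, crn: Optional[str] = None, name: Optional[str] = None,
                    max_results: int = 50, output_format: str = "json") -> dict[str, Any]:
    """Assemble a run_input dict for a company-scoped mode."""
    # Assumption: company-scoped modes take either a CRN or a name, not both.
    if (crn is None) == (name is None):
        raise ValueError("pass exactly one of crn or name")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported outputFormat: {output_format!r}")
    run_input: dict[str, Any] = {
        "mode": mode,
        "maxResults": max_results,
        "outputFormat": output_format,
    }
    if crn is not None:
        run_input["crn"] = normalize_crn(crn)
    else:
        run_input["name"] = name
    return run_input
```

The returned dict can be passed as `run_input` to the Apify client's `call` method.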

Output — JSON, CSV, Excel, Markdown

Every row uses the same envelope with a mode-specific payload:

{
  "sha256": "a3b9c2d1…",
  "mode": "company_profile",
  "crn": "09446231",
  "filing_date": null,
  "payload": {
    "company_number": "09446231",
    "company_name": "MONZO BANK LIMITED",
    "company_status": "active",
    "date_of_creation": "2015-02-24",
    "registered_office_address": {
      "address_line_1": "Broadwalk House 5 Appold Street",
      "locality": "London",
      "postal_code": "EC2A 2AG"
    },
    "sic_codes": ["64191"],
    "has_insolvency_history": false,
    "has_charges": false
  }
}

Switch outputFormat per run:

  • json: Apify dataset rows (default).
  • ndjson / csv / xlsx: single file streamed to the key-value store at end of run.
  • md: LLM-ready output.md bundle with one ## heading per record, ready to paste into Claude, ChatGPT, or your RAG pipeline. The only Companies House Actor that ships native Markdown output.
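
If you pull the default json rows and want a comparable Markdown view locally, the "one ## heading per record" idea can be approximated in a few lines. The sketch below is not the Actor's renderer and its layout is an assumption: `rows_to_markdown` is a hypothetical helper that writes each payload key as a bullet under the record's heading.

```python
import json
from typing import Any


def rows_to_markdown(rows: list[dict[str, Any]]) -> str:
    """Render envelope rows as Markdown, one '## ' section per record."""
    sections = []
    for row in rows:
        payload = row["payload"]
        # Fall back to the CRN when the payload carries no company name.
        title = payload.get("company_name") or row["crn"]
        lines = [f"## {title} ({row['mode']})", ""]
        for key, value in payload.items():
            # json.dumps keeps nested values (lists, dicts) on a single line.
            lines.append(f"- {key}: {json.dumps(value, ensure_ascii=False)}")
        sections.append("\n".join(lines))
    return "\n\n".join(sections) + "\n"
```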

Data fields extracted

| Field | Description |
| --- | --- |
| sha256 | Stable record fingerprint. Re-run the same input and only changed rows are charged. |
| mode | The mode that produced this row. |
| crn | 8-character UK company number. |
| transaction_id | PSC ID, charge ID, or filing reference (where applicable). |
| filing_date | Date of the underlying filing, if any. |
| payload.company_name | Registered company name. |
| payload.company_status | active, dissolved, liquidation, etc. |
| payload.sic_codes | UK SIC industry classification. |
| payload.kind | PSC type (individual, corporate, legal person…). |
| payload.natures_of_control | How a PSC controls the company. |
| payload.score | Composite risk score (0–100) for signal modes. |
| payload.financial_highlights | LLM-generated highlights for AI modes. |
| markdown | Optional LLM-ready Markdown rendering of the row. |
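
Because sha256 is a stable fingerprint, you can also detect changes on your own side by comparing two runs of the same input. A minimal sketch, assuming a record's identity is the tuple (mode, crn, transaction_id); that key choice and the `changed_rows` helper are illustrative, not part of the Actor.

```python
from typing import Any

Row = dict[str, Any]


def _identity(row: Row) -> tuple:
    """Assumed identity of a record across runs."""
    return (row["mode"], row["crn"], row.get("transaction_id"))


def changed_rows(previous: list[Row], current: list[Row]) -> list[Row]:
    """Return rows from `current` that are new or whose sha256 differs from `previous`."""
    seen = {_identity(row): row["sha256"] for row in previous}
    return [row for row in current if seen.get(_identity(row)) != row["sha256"]]
```

Rows that disappeared between runs are not reported by this sketch; compare the key sets in the other direction if you need removals.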

How much does it cost to scrape Companies House?

The first 20 rows of every run are free. After that, you only pay for what you receive — not for compute time, retries, or rate-limit waits.

| Data shape | Example modes | Price per row |
| --- | --- | --- |
| Bulk register row | bulk_company_snapshot, bulk_accounts_archive | $0.00005 – $0.00025 |
| Stream event | companies_stream, filings_stream, psc_stream | $0.0003 |
| iXBRL accounting fact | ixbrl_accounts, accounts_section_diff | $0.001 |
| Company record | company_profile, company_search | $0.002 |
| Officer record | company_officers, officer_appointments | $0.003 |
| Charges / insolvency | charges_list, insolvency_cases | $0.004 |
| PSC node | psc_list, psc_individual | $0.005 |
| Composite signal | strike_off_risk_score, director_risk_score | $0.005 |
| Filing section | filing_history, filing_chunks | $0.008 |
| Graph edge | officer_network, address_cluster | $0.01 |
| AI brief / summary | ai_director_brief, ai_summary | $0.04 |
| Incremental delta | only when a record changes between runs | $0.0003 |

Worked examples:

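  • PSC ownership for 500 companies with one PSC each → 500 PSC nodes × $0.005 = $2.50 (minus the first 20 free rows).
  • Parsed iXBRL accounts → 1,000 accounting-fact rows × $0.001 = $1.00. iXBRL is priced per fact, so the cost per company depends on how many facts its filing yields.
  • AI director brief for one company (~5 LLM outputs) → 5 × $0.04 = $0.20.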
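
The arithmetic behind these examples can be scripted for budgeting. This is a rough estimator, not a billing calculator: the shape labels in `PRICE_PER_ROW` are hypothetical keys for the per-row prices in the table, bulk rows are left out because they are priced as a range, and how the 20 free rows are applied to a run that mixes data shapes is not specified here, so the sketch covers one shape per run.

```python
from decimal import Decimal

# Per-row prices copied from the pricing table; the keys are illustrative labels.
PRICE_PER_ROW = {
    "stream_event": Decimal("0.0003"),
    "ixbrl_fact": Decimal("0.001"),
    "company_record": Decimal("0.002"),
    "officer_record": Decimal("0.003"),
    "charges_insolvency": Decimal("0.004"),
    "psc_node": Decimal("0.005"),
    "composite_signal": Decimal("0.005"),
    "filing_section": Decimal("0.008"),
    "graph_edge": Decimal("0.01"),
    "ai_brief": Decimal("0.04"),
}


def estimate_cost(shape: str, rows: int, free_rows: int = 20) -> Decimal:
    """Estimate the USD cost of one single-shape run after the free-row allowance."""
    billable = max(0, rows - free_rows)
    return PRICE_PER_ROW[shape] * billable
```

For example, `estimate_cost("psc_node", 500)` bills 480 rows, while passing `free_rows=0` reproduces the headline figures in the worked examples. `Decimal` is used so that sub-cent prices do not pick up binary floating-point rounding error.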

PSC ownership graphs — beneficial ownership lookup

Find the ultimate beneficial owner of any UK company in one call. Companies House publishes the People with Significant Control (PSC) register — but stitching it into an ownership tree is brittle, multi-call work. We do it for you.

  • psc_list — every PSC declared by a company, with control nature, percentage ranges, and notification dates.
  • psc_individual / psc_corporate_entity / psc_legal_person — typed detail for each PSC kind.
  • officer_network — co-directorship graph: who else sits on the same boards.
  • address_cluster — companies sharing a registered office, useful for shell-company detection.

Drop-in for KYC, AML screening, and ultimate beneficial owner (UBO) workflows.
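
Once PSC rows are in your dataset, they can be indexed in both directions for screening. A minimal sketch, assuming each PSC row carries `payload.name`, `payload.kind`, and `payload.natures_of_control` as in the field list; the two helpers are hypothetical, and matching controllers by name alone is a simplification that real UBO work would replace with a stronger identifier.

```python
from collections import defaultdict
from typing import Any

Row = dict[str, Any]

# Modes whose rows are assumed to describe a single PSC.
PSC_MODES = {"psc_list", "psc_individual", "psc_corporate_entity", "psc_legal_person"}


def controllers_by_company(rows: list[Row]) -> dict[str, list[dict[str, Any]]]:
    """Map each CRN to the PSCs declared for it."""
    graph: dict[str, list[dict[str, Any]]] = defaultdict(list)
    for row in rows:
        if row["mode"] not in PSC_MODES:
            continue
        payload = row["payload"]
        graph[row["crn"]].append({
            "name": payload.get("name"),
            "kind": payload.get("kind"),
            "natures_of_control": payload.get("natures_of_control", []),
        })
    return dict(graph)


def companies_by_controller(graph: dict[str, list[dict[str, Any]]]) -> dict[str, list[str]]:
    """Invert the mapping: PSC name to the sorted CRNs it controls."""
    reverse: dict[str, set[str]] = defaultdict(set)
    for crn, pscs in graph.items():
        for psc in pscs:
            if psc["name"]:
                reverse[psc["name"]].add(crn)
    return {name: sorted(crns) for name, crns in reverse.items()}
```

A controller that maps to many CRNs in the inverted index is a natural starting point for a closer look.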

iXBRL accounts — FRS 102 / FRS 105 / UK-IFRS

Get UK statutory accounts as typed JSON, not PDFs. Most UK companies file iXBRL accounts under FRS 102, FRS 105, or UK-IFRS. We parse them into 59 canonical statement lines (Revenue, Operating Profit, Total Assets, Cash, Liabilities, …) so you can compare and screen without writing an XBRL parser.

  • ixbrl_accounts — full parsed accounts with canonical statement-line tags.
  • accounts_section_diff — year-over-year deltas on the same line items.
  • accounts_estimates — derived size and health metrics.
  • accounts_peer_benchmark — same-SIC peer comparison.
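
To make "year-over-year deltas" concrete, here is the calculation on a toy input. The fact shape is an assumption made for illustration only: each fact is taken to be a dict with hypothetical `line`, `period_end` (ISO date string), and `value` keys, which may not match the actual `ixbrl_accounts` payload, and `year_over_year` is not part of the Actor.

```python
from typing import Any, Optional

Fact = dict[str, Any]


def year_over_year(facts: list[Fact], line: str) -> list[tuple[str, float, Optional[float]]]:
    """Return (period_end, value, delta_vs_previous_period) for one statement line."""
    # ISO date strings sort chronologically, so a plain sort orders the periods.
    series = sorted((f["period_end"], f["value"]) for f in facts if f["line"] == line)
    result = []
    previous: Optional[float] = None
    for period_end, value in series:
        delta = None if previous is None else value - previous
        result.append((period_end, value, delta))
        previous = value
    return result
```

The first period has no predecessor, so its delta is `None` rather than zero.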

This is the moat: no other Apify Actor ships iXBRL parsing for UK companies.

Companies House streaming API

Subscribe to real-time changes — companies, filings, PSCs, officers, charges, insolvency — via the Companies House Server-Sent Events streams, repackaged as flat dataset rows. Useful for compliance monitoring and change-detection alerts.

  • companies_stream — every new and updated company.
  • filings_stream — every new filing.
  • psc_stream — PSC additions, ceases, and changes.
  • charges_stream, officers_stream, insolvency_stream — narrower feeds.

Pair with the change_detected incremental event: re-run the same input and you only pay $0.0003 per row that actually changed since last time.

FAQ & support

Is this scraping? No. We use the official Companies House REST and SSE APIs with registered application keys.

Do I need my own API key? No. We pool registered keys in the background so you can run the Actor with zero setup. Just pick a mode and click Start.

What is the rate limit? None you need to think about. We pool multiple registered keys and rotate them to give you smooth throughput on bulk runs.

Is the data fresh? Yes — every call hits Companies House live, except the bulk modes which use the official daily snapshots (refreshed every 24 hours).

What formats can I download? JSON, NDJSON, CSV, Excel, and Markdown — switch via the outputFormat input field. Markdown bundles every row into one LLM-ready output.md file.

Can I schedule recurring runs? Yes, via the Apify Schedules tab. Pair with change_detected to pay only for deltas.

Known limits. Scanned image-only filings require OCR and are not yet supported (planned). Documents older than ~2009 have inconsistent metadata at source.

For bugs, feature requests, or a custom data pipeline, open an issue in the Apify Issues tab.


Built on the Apify SDK, Companies House REST + SSE APIs, and an iXBRL parser tuned for FRS 102/105/UK-IFRS. Runs on a 2 GB memory tier.