Scrape HNW in the UK
Pricing
Pay per usage
Developer
Saad Belcaid
HNW UK — Find British Millionaires by Name, Every Morning
Built for Haris King (SSM) by Saad Belcaid.
This scraper finds UK millionaires. By name. With the company they own. With a one-line reason to call them today.
The dumb-simple version
Every morning, the UK Government publishes the financial accounts of thousands of British companies — for free. Inside those filings: profit, turnover, who owns the shares.
This scraper reads all of it. Then it answers one question: "Whose company made enough profit last year that this person is now rich?"
It outputs a list. One row per rich person. Their name. Their company. Roughly how rich. And one line that says why now.
Three things you should know:
- Same profit, different wealth. A £2M-profit software business puts its majority owner (75–100% band, scored at the 0.875 midpoint) at roughly £14M. The same £2M profit in a restaurant puts them at roughly £6M. Identical profit. The difference is what the market pays per £1 of profit in that industry: 8× for software, 3.5× for restaurants. Industry is the dial.
- The signal is the filing. When a company files fresh accounts, that owner's wealth just got re-papered today. That's why the row exists — not last year, not last month, today.
- Ceased = freshly liquid. When someone was a major owner but no longer is, they sold. They got cash. They're in money-decision mode. The scraper surfaces them too: you'll find these in the column `psc_status = ceased`. Most lead-gen actors throw these away. We don't.
UK only. Free data only. No browsers, no proxies, no AI in the hot path.
Max-value playbook (do this in order)
Day 1 — first run, get volume
    companiesHouseApiKey: <your free CH key>
    backfillDays: 14            # process the last 2 weeks of filings in one run
    minPersonalWealth: 5000000  # default £5M floor
    # everything else: defaults
Hit run. Wait ~15 min. Expect 200–500 rows in your dataset. That's your first batch.
Day 1, hour 2 — triage by signal type
Open the dataset. Sort or filter by psc_status. Three lists, in this order:
| List | Filter | Why first |
|---|---|---|
| Hot — fresh exits | psc_status = ceased AND ceased_on in last 12 months | They sold, they got paid, they're picking advisors right now. Call this week. |
| Warm — top wealth band, decision-makers | estimated_personal_wealth_band ∈ {£25M+, £50M+, £100M+} AND ownership_band = 75-100% | Top of the pyramid. Sole/majority owners are the decision-maker, not a junior partner. Call within the month. |
| Steady — your industry, sole/majority | targetSicPrefixes = your sector AND ownership_band ∈ {50-75%, 75-100%} | Bread-and-butter. Sequence them across your normal cadence. |
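The three lists above can also be produced in code once the dataset is exported. This is a minimal sketch, not part of the actor: the function name `triage` and the `sector_prefixes` argument are hypothetical, and it assumes rows carry the output fields documented in this README with `ceased_on` as an ISO date string.

```python
from datetime import date, timedelta

TOP_BANDS = {"£25M+", "£50M+", "£100M+"}
MAJORITY_BANDS = {"50-75%", "75-100%"}


def triage(row: dict, today: date, sector_prefixes: tuple[str, ...]) -> str:
    """Classify one dataset row as 'hot', 'warm', 'steady', or 'other'."""
    # Hot: ceased PSC whose exit date falls within the last 12 months.
    ceased_on = row.get("ceased_on")
    if row.get("psc_status") == "ceased" and ceased_on:
        if date.fromisoformat(ceased_on) >= today - timedelta(days=365):
            return "hot"

    # Warm: top wealth bands held by a 75-100% owner.
    if (row.get("estimated_personal_wealth_band") in TOP_BANDS
            and row.get("ownership_band") == "75-100%"):
        return "warm"

    # Steady: your sector (by SIC prefix) with a sole/majority owner.
    in_sector = any(
        code.startswith(sector_prefixes) for code in row.get("sic_codes", [])
    )
    if in_sector and row.get("ownership_band") in MAJORITY_BANDS:
        return "steady"

    return "other"
```

Checking the lists in this order means a row that qualifies for more than one list lands in the most urgent one.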
Day 2 onwards — daily drip
    companiesHouseApiKey: <your key>
    backfillDays: 0  # today only
    # everything else: same defaults
Set up a Schedule with cron 0 8 * * 2-6 (8am UTC, Tue–Sat). The actor's persistent dedup means only NEW people surface every morning — typically 10–40 fresh leads per day. Steady drip into your CRM with zero overlap with yesterday's batch.
Filter recipes (copy-paste)
"I want freshly-liquid Brits this month"
    pscStatus: ceased
    ukResidentOnly: true
    backfillDays: 30
    minPersonalWealth: 5000000
"I want central London £10M+ founders"
    wealthBands: ["£10M+", "£25M+", "£50M+", "£100M+"]
    ownershipBands: ["75-100%", "50-75%"]
    regionPostcodePrefixes: ["SW", "W1", "EC", "WC", "N1", "NW1"]
"I want UK SaaS / IT founders"
    targetSicPrefixes: ["62", "63"]
    ownershipBands: ["75-100%"]
    minPersonalWealth: 5000000
"I want pharma / biotech / healthcare sellers (recent exits)"
    targetSicPrefixes: ["21", "72", "86", "87"]
    pscStatus: ceased
    backfillDays: 60
"I only care about ultra-HNW (£25M+)"
    minPersonalWealth: 25000000
    ownershipBands: ["75-100%", "50-75%"]
"I want Manchester / NW England HNW"
regionPostcodePrefixes: ["M", "BL", "OL", "SK", "WN"]
"I want Scottish HNW"
regionPostcodePrefixes: ["EH", "G", "AB", "DD", "PA", "FK", "KY", "ML"]
Why "ceased = gold" deserves its own paragraph
Most actors filter ceased PSCs as "old data." They drop them. We don't.
A ceased PSC is someone who held significant control of a profitable UK business and no longer does. There's basically one reason that happens: they sold. And in the 6–18 months after a sale, that person is doing every one of these things:
- Picking a new wealth manager
- Buying property (sometimes multiple)
- Setting up a family office or trust structure
- Looking at their next venture or angel deals
- Giving to charity strategically (tax-driven)
- Restructuring inheritance plans
If you sell any of those services — wealth management, real estate, tax/legal, M&A advisory, charity advisory, business consulting, executive coaching, family office services — ceased PSCs from the last 12–18 months are the warmest lead category in your dataset. Not just warm — time-sensitive. Their decisions get made, they sign with someone, and the window closes.
Use pscStatus: ceased + filter the dataset by ceased_on year ≥ (this year − 1). That's your urgent-call list.
Output — one row per HNW individual
Every row has these fields. The ones in bold are the ones you'll use most.
| Field | Example |
|---|---|
| `person_name` | John Smith |
| `psc_status` | active (still owns) or ceased (sold, call urgently) |
| `psc_relationship` | direct (PSC of this co) or via_chain (PSC of a parent holdco) |
| `ceased_on` | 2024-07-06 (date they exited, if ceased) |
| `nationality` | British |
| `country_of_residence` | United Kingdom |
| `ownership_band` | 75-100% |
| `control_summary` | "ownership of shares 75 to 100 percent, voting rights 75 to 100 percent" |
| `holding_chain` | ["Acme Holdings", "Acme Topco"] (set when discovered through a parent chain) |
| `company_name` | Acme Software Ltd |
| `company_number` | 12345678 |
| `registered_address` | 123 High Street, Manchester, M1 1AA |
| `locality` | Manchester |
| `postal_code` | M1 1AA |
| `country` | England |
| `sic_codes` | ["62012"] |
| `sic_descriptions` | ["Business and domestic software development"] |
| `industry_multiple_applied` | 8.0 |
| `company_ebitda` | 4,200,000 |
| `company_turnover` | 18,500,000 |
| `company_employees` | 45 |
| `estimated_personal_wealth` | 29,400,000 |
| `estimated_personal_wealth_band` | £25M+ |
| `signal` | "Majority/sole owner of Acme Software Ltd — business and domestic software development, £4.2M EBITDA, fresh filing 2026-04-29" |
| `filing_date` | 2026-04-29 |
direct vs via_chain
When you see the same person in two rows for two related companies (e.g. an operating co AND its holding co), one row will be direct and the other via_chain. The direct row is the canonical one — it's where the person is actually listed as PSC. The via_chain row was found by tracing through ownership.
Keep both for full coverage; or filter psc_relationship = direct to de-double-count.
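If you want the de-double-counted view without losing people who only appear through a holding chain, a slightly softer filter than `psc_relationship = direct` is possible. The sketch below is an illustration only: `drop_chain_duplicates` is a hypothetical helper, and it assumes that matching on `person_name` is good enough to tell people apart, which will not hold for common names.

```python
def drop_chain_duplicates(rows: list[dict]) -> list[dict]:
    """Keep every direct row; keep a via_chain row only when that
    person has no direct row anywhere in the batch."""
    direct_people = {
        r["person_name"] for r in rows if r.get("psc_relationship") == "direct"
    }
    return [
        r for r in rows
        if r.get("psc_relationship") == "direct"
        or r["person_name"] not in direct_people
    ]


rows = [
    {"person_name": "John Smith", "psc_relationship": "direct", "company_name": "Acme Software Ltd"},
    {"person_name": "John Smith", "psc_relationship": "via_chain", "company_name": "Acme Holdings"},
    {"person_name": "Jane Doe", "psc_relationship": "via_chain", "company_name": "Beta Topco"},
]
deduped = drop_chain_duplicates(rows)
```

Here John Smith's holdco row is dropped because his direct row exists, while Jane Doe's chain-only row survives.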
How we estimate wealth (in plain English)
One rule. Three inputs. No AI, no magic.
wealth = profit × industry multiple × % they own
Translated:
"How much profit did the business make × how much someone would pay to buy a business like that × what slice of it this person owns."
That's the back-of-envelope every UK M&A advisor scribbles before they pick up the phone. It's not a valuation. It's a triage signal — good enough to know who's worth calling.
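As a rough illustration of the rule, here it is in a few lines of Python. This is a sketch under stated assumptions, not the code in `src/wealth.py`: the function name is hypothetical, and the stake is taken as the ownership-band midpoint listed later in this README (0.875, 0.625, 0.375).

```python
# Ownership-band midpoints as documented in this README.
OWNERSHIP_MIDPOINTS = {"75-100%": 0.875, "50-75%": 0.625, "25-50%": 0.375}


def estimate_personal_wealth(profit: float, industry_multiple: float,
                             ownership_band: str) -> float:
    """wealth = profit x industry multiple x share of the company owned."""
    return profit * industry_multiple * OWNERSHIP_MIDPOINTS[ownership_band]


# The sample output row: £4.2M EBITDA, 8.0x multiple, 75-100% band.
sample = estimate_personal_wealth(4_200_000, 8.0, "75-100%")
print(f"£{sample:,.0f}")  # £29,400,000
```

That reproduces the `estimated_personal_wealth` value of 29,400,000 shown in the example output row.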
What's a "multiple"? (the SIC story)
When someone buys a private business, they pay a multiple of its yearly profit. Software gets 8×. Restaurants get 3.5×. Why? Because software profit is recurring and scalable. Restaurant profit shows up if you're standing in the kitchen at 11pm. The market knows the difference and pays accordingly.
To pick the right multiple per company, we use the company's SIC code — the UK Government's industry tag, declared on every annual filing (5 digits, e.g. 62012 = software, 56101 = restaurants). The scraper matches the SIC code against an internal table of ~85 industry prefixes and uses the most specific match.
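A "most specific match" lookup is usually done by trying the longest prefix first and shortening until something hits. The sketch below shows that idea with a handful of entries copied from this README's selected-examples table. The table contents, the function name `lookup_multiple`, and the exact matching order are assumptions for illustration, not the actual implementation.

```python
# A few prefixes and multiples from this README; the real table has ~85.
MULTIPLES = {
    "62": 6.5,    # IT services
    "6201": 8.0,  # software development
    "6202": 6.5,  # IT consultancy
    "56": 4.0,    # restaurants
    "5630": 3.5,  # pubs
}
DEFAULT_MULTIPLE = 4.0  # fallback when no prefix matches


def lookup_multiple(sic_code: str) -> float:
    """Return the multiple for the longest table prefix of sic_code."""
    for length in range(len(sic_code), 0, -1):
        multiple = MULTIPLES.get(sic_code[:length])
        if multiple is not None:
            return multiple
    return DEFAULT_MULTIPLE


print(lookup_multiple("62012"))  # 8.0, matched on "6201" rather than "62"
print(lookup_multiple("62090"))  # 6.5, only "62" matches
print(lookup_multiple("01110"))  # 4.0, default fallback
```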
Same profit, very different millionaire
| Owner | Industry | Profit | Multiple | Stake | Estimated wealth |
|---|---|---|---|---|---|
| Software founder | 62012 (software) | £2M | 8× | 87.5% (75–100% band) | £14M → £10M+ band |
| Restaurateur | 56101 (restaurant) | £2M | 3.5× | 87.5% (75–100% band) | £6.1M → £5M+ band |
| Restaurateur (50% partner) | 56101 (restaurant) | £2M | 3.5× | 37.5% | £2.6M → £1M+ band, dropped |
Identical profit. The difference is which industry they're in and how much they own.
Industry multiples — selected examples
(Full coverage of ~85 SIC prefixes lives in src/wealth.py. Lookup matches the LONGEST prefix, so 62012 software gets 8× even though 62 IT services is 6.5×.)
| Sector | Multiple | Why |
|---|---|---|
| Biotech / pharma R&D (7211) | 9.0× | IP-heavy, clinical assets, pharma buyer demand |
| Fund management (6630) | 8.5× | Recurring AUM fees, sticky |
| Software development (6201) | 8.0× | SaaS scaling economics |
| Insurance brokers (6622) | 8.0× | Subscription-like renewals |
| Pharma manufacturing (21) | 8.0× | Regulated, branded |
| Data / cloud / hosting (6311) | 7.5× | Cloud-services premium |
| Specialist medical / dental (8622, 8623) | 7.5× | Private healthcare consolidation |
| Hospitals (861) | 7.0× | Recurring care revenue |
| Insurance (65) | 7.5× | High recurrence |
| Veterinary (75) | 7.0× | Active consolidation play |
| Medical instruments (3250) | 7.0× | High-margin specialty mfg |
| Pre-primary education (8510) | 7.0× | Nursery roll-up trend |
| IT consultancy (6202) | 6.5× | Tech services premium |
| Banking (641) | 5.5× | Regulated, capital-heavy |
| Mgmt consultancy (7022) | 6.0× | Repeat-engagement |
| Real-estate letting (6820) | 5.5× | Rental recurrence |
| Wholesale (46) | 4.5× | B2B recurrence |
| Manufacturing (general) (28, 29) | 5.5× | Capital-heavy operations |
| Construction (41–43) | 4.0–4.5× | Project-based, cyclical |
| Retail (47) | 4.0× | Margin pressure |
| Restaurants (56) | 4.0× | Fixed-cost squeeze |
| Pubs (5630) | 3.5× | Toughest |
| Default fallback | 4.0× | Used when SIC isn't in the table |
These are conservative. Real M&A deals trade above and below these depending on growth, recurrence, scale, customer concentration. We tune low on purpose — the doctrine is under-estimate wealth, never over-estimate. A £14M owner-operator on this scoring is, in reality, almost always richer than that.
Ownership-band midpoints (from Companies House PSC API)
| Band | What we use |
|---|---|
| 75–100% | 0.875 |
| 50–75% | 0.625 |
| 25–50% | 0.375 |
Companies House only reports owners who hold more than 25% of the shares or voting rights, so anyone who shows up with an ownership band holds more than a quarter of the company.
Wealth bands the scraper emits
<£1M, £1M+, £5M+, £10M+, £25M+, £50M+, £100M+
The default floor is £5M. Raise it to £10M+ if you only want serious money. Drop it to £1M+ if you want a wider net of owner-operators. Or use wealthBands to keep specific bands only.
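Mapping an estimate to one of these bands is a simple threshold check. The sketch below assumes each label is an inclusive floor (so exactly £5,000,000 counts as £5M+); the README does not spell out the boundary behaviour, and `wealth_band` is a hypothetical name.

```python
# Band floors, highest first, so the first hit is the right band.
BAND_FLOORS = [
    (100_000_000, "£100M+"),
    (50_000_000, "£50M+"),
    (25_000_000, "£25M+"),
    (10_000_000, "£10M+"),
    (5_000_000, "£5M+"),
    (1_000_000, "£1M+"),
]


def wealth_band(wealth: float) -> str:
    """Return the band label for an estimated personal wealth in £."""
    for floor, label in BAND_FLOORS:
        if wealth >= floor:
            return label
    return "<£1M"


print(wealth_band(29_400_000))  # £25M+
print(wealth_band(2_625_000))   # £1M+
```

With the default `minPersonalWealth` of 5000000, a £1M+ row like the second one would be dropped before it reaches the dataset.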
Input options
| Field | Default | Description |
|---|---|---|
| `companiesHouseApiKey` | (required) | Free REST key from CH developer portal |
| Wealth filters | | |
| `minPersonalWealth` | 5000000 | Wealth floor in £ |
| `wealthBands` | [] | Restrict to bands, e.g. ["£25M+","£50M+","£100M+"] |
| `ebitdaPreThreshold` | 0 | Auto-derived from minPersonalWealth when 0. Override only for tuning. |
| Person filters | | |
| `ownershipBands` | [] | e.g. ["75-100%"] for sole/majority owners only |
| `pscStatus` | all | active / ceased / all. Use ceased for fresh-exit lists. |
| `ukResidentOnly` | false | Drop non-UK residents |
| Geographic filter | | |
| `regionPostcodePrefixes` | [] | e.g. ["SW","W1","EC"] for central London |
| Industry filters | | |
| `targetSicCodes` | [] | Exact 5-digit SIC matches |
| `targetSicPrefixes` | [] | SIC prefixes, e.g. ["62"] for IT |
| Date range | | |
| `backfillDays` | 0 | Process last N days in addition to today (max 60) |
| `bulkFileDate` | (today) | Override the end date YYYY-MM-DD |
| Advanced | | |
| `maxCorporatePscHops` | 3 | How deep to trace holding-co ownership chains |
| `skipWeekendCheck` | false | Run anyway on Sun/Mon (no-op since CH won't have published) |
Use cases by industry
Wealth-management & private-banking outreach
minPersonalWealth: 10000000 + pscStatus: ceased for the past 12 months. These are the people picking a new advisor right now.
Estate planning / tax / legal
minPersonalWealth: 25000000 + ownershipBands: ["75-100%"]. Patient outreach, but high-LTV.
Family offices
minPersonalWealth: 50000000 + pscStatus: all. UK ultra-HNW universe is ~12,000 individuals; this surfaces the ones whose wealth is fresh-papered today or who just exited.
Real estate / property advisory
pscStatus: ceased + regionPostcodePrefixes matching where you sell. Recent exits buy second/third homes; they self-identify by location.
Connector OS Station
Pipe the dataset into Station as the demand side. The signal field plugs straight into the I Layer for evaluation against your supply network (advisors, brokers, vendors).
Flow: scrape → dataset → paste dataset ID into Station → match against your supply → scored introductions.
How it works under the hood
- Parses XBRL filings using the UK Government's own `stream-read-xbrl`
- EBITDA = operating profit + depreciation (conservative floor estimate)
- Rate-limited to stay within Companies House's 600 req/5min limit
- Normalizes UK CH company numbers (zero-pad, strip prefixes) so malformed bulk-feed values don't lose leads
- Resolves corporate PSC chains by `registration_number`, falling back to exact-name search when CH didn't store the number, so PE/holdco structures still surface real humans
- Tags each row `direct` vs `via_chain` so operators can de-double-count when an operating co AND its holdco both file separately
- Surfaces both active and ceased PSCs: the ceased ones are the highest-value HNW signal in the dataset
- Strips honorifics (`Mr`/`Mrs`/`Dr`/`Sir`) before deduping so the same person doesn't appear under multiple titles
- Stores `(person + company)` keys in a persistent KV store so duplicates never reappear day-over-day
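The last two bullets (honorific stripping and person-plus-company dedup) can be sketched as follows. Everything here is illustrative: the helper names are hypothetical, the key format is a guess, the company is identified by `company_number` as an assumption, and a plain Python `set` stands in for the persistent KV store.

```python
import re

# Leading honorifics named in this README, with an optional trailing dot.
HONORIFICS = re.compile(r"^(?:(?:mr|mrs|dr|sir)\.?\s+)+", re.IGNORECASE)


def normalize_person_name(name: str) -> str:
    """Collapse whitespace, strip leading honorifics, lowercase."""
    collapsed = " ".join(name.split())
    return HONORIFICS.sub("", collapsed).lower()


def dedup_key(person_name: str, company_number: str) -> str:
    """Build a (person + company) key; the format is an assumption."""
    return f"{normalize_person_name(person_name)}|{company_number}"


def filter_new(rows: list[dict], seen: set[str]) -> list[dict]:
    """Return rows whose key is not yet in `seen`, recording them as seen."""
    fresh = []
    for row in rows:
        key = dedup_key(row["person_name"], row["company_number"])
        if key not in seen:
            seen.add(key)
            fresh.append(row)
    return fresh


seen: set[str] = set()  # stand-in for the persistent KV store
day1 = filter_new([{"person_name": "Mr John Smith", "company_number": "12345678"}], seen)
day2 = filter_new(
    [
        {"person_name": "Dr John Smith", "company_number": "12345678"},
        {"person_name": "Jane Doe", "company_number": "87654321"},
    ],
    seen,
)
```

On day 2 only Jane Doe comes through, because "Dr John Smith" normalizes to the same key as "Mr John Smith" from day 1.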
Costs
- Companies House API: free (600 req / 5 min)
- Apify compute: ~10–15 min per run at 1024 MB, roughly 0.17–0.25 compute units (1 CU = 1 GB of memory for 1 hour), which is trivial on the standard plan
- Multi-day backfill (`backfillDays > 0`): N× the per-day cost
Built by Saad Belcaid for Haris King's SSM workflow. Data sourced from Companies House (public record). Wealth estimates are heuristic, not valuations.