Joel, I don't regard myself as especially old at 69, given that my Dad survived to 99, but I was attending protest marches outside the US Embassy against the bombing of Cambodia when I was a student, (eh-hem, 50 years ago, LOL) and I see no especial reason to stop thinking or protesting nowadays: after all, I have zero to lose!
I don't do politics or controversy on my own substack: but you are welcome to subscribe if you enjoy gentle things like wild birds, Scottish culture, scenery, adventure, and foraging for mushrooms. We all need that too!
Many thanks for this detailed analysis. I must ask a question regarding this point: "The most fundamental question of all is whether any people trying to match identifying details in different datasets were aware of the vaccination status of the person who had died when they did that?"
Taking into account that ONS refused to provide any data for seven months and then returned with data arranged into new categories that do not line up helpfully with the historical data, the most parsimonious explanation appears to be that ONS are intentionally gerrymandering the data. What I still do not quite understand is why they would do this - they are not funded from pharmaceutical money like The British Heart Foundation, who have a fiduciary motive for their deceit, and they cannot protect the government from criticism because the 'opposition' party is even more rabidly pro-magical talismans than the government and wouldn't dream of calling them on this topic.
What motive do the ONS have to be purposefully distorting the presentation of this data? A few candidates are worth mentioning:
1. ONS contains a sufficient number of Covidian true believers who are certain the vaccines work and thus consciously or unconsciously skew the data towards the result they expect or want
2. ONS is afraid of causing confidence in vaccination to slip even further than the recent nonsense already did, and is therefore intentionally engaging in a pernicious 'noble lie'
3. There is pressure on the ONS from somewhere else (where...?)
I suspect this is more of a question for the other people loitering in the comments than for you, Dr Craig. However, let me take the opportunity to thank you for everything you have been doing since 2020. It is greatly appreciated!
It is almost impossible to work within the UK establishment and its many quangos unless you 'toe the line'. You see the same trend in the BBC, the Police, the Judiciary, and the Media: all have their snouts in the same trough, which is the money that they extract from us, the hapless taxpayers.
John Dee's part 3 on the ONS data ruminates some on this. I lack the knowledge to succinctly summarize it here, so I recommend checking it out on his substack.
I wish I could fully understand all of this, that's my fault, but I get the gist of it.
Wouldn't it be simple if one had accurate details of the entire UK population, total numbers and total numbers in age bands?
Then, voila - deduct the ever jabbed in each band (mRNA/Astra), i.e. the NIMS total.
Remaining balance are the never jabbed.
Add up the deaths in each group.
Even I could maybe understand that.
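The subtraction described above can be sketched in a few lines of Python. Every number below is a made-up placeholder, not ONS, NIMS or census data, and the function and field names are hypothetical.

```python
# Hypothetical illustration only: the figures below are invented placeholders,
# not ONS, NIMS or census numbers.

def rates_by_status(population, ever_jabbed, deaths_jabbed, deaths_unjabbed):
    """Return deaths per 100,000 for the ever-jabbed and never-jabbed groups."""
    never_jabbed = population - ever_jabbed  # the "remaining balance"
    if ever_jabbed <= 0 or never_jabbed <= 0:
        # A zero or negative balance means the two sources disagree about the population.
        raise ValueError("population and vaccination counts are inconsistent")
    return {
        "ever_jabbed": 100_000 * deaths_jabbed / ever_jabbed,
        "never_jabbed": 100_000 * deaths_unjabbed / never_jabbed,
    }

age_bands = {
    "40-49": dict(population=1_000_000, ever_jabbed=850_000, deaths_jabbed=1_700, deaths_unjabbed=300),
    "50-59": dict(population=1_000_000, ever_jabbed=900_000, deaths_jabbed=4_500, deaths_unjabbed=500),
}

for band, counts in age_bands.items():
    print(band, rates_by_status(**counts))
```

The arithmetic is trivial; the `ValueError` branch is where the trouble starts, because a vaccination count that exceeds the population estimate leaves a zero or negative "never jabbed" balance.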
Thank you Clare and others for all your hard work and no little courage over the last three years or so. Keep going and never forget, there are far more of us than they would have you believe. And, importantly, we have right on our side.
I look forward to your next appearance on the BBC when you can present your findings to a wider audience who might learn something valuable.
I'm sure the invitation is in the post...along with Mike Yeadon's, Carl Heneghan's, Aseem Malhotra's etc.
But that's the problem! There is no such accurate record for any country (except maybe North Korea?!) so any analysis that relies on a population denominator has to be so heavily caveated as to be useless for making any reliable population-wide inferences. No-one told the MSM this though! Or, if they did, they deliberately ignored that because the headline supporting the narrative is all that matters, not that people can make properly informed decisions about their health.
Some arguments for 'better data needed' could play into the hands of those who want compulsory digital IDs. The deadline for the 'consultation' on this is very shortly.
Most countries do have compulsory ID cards. The Common Law and stress on individual rights seem to have slowed the trend here. It's hilarious though if supermarkets know the UK population more accurately, i.e. from food sales, than the Home Office does.
On the whole, I think the existing UK data bases kept on people are perfectly adequate. HMRC deals with taxpayers, DWP deals with state pension recipients, DVLA deals with drivers, etc. If people weren't becoming more alienated from the police and other authorities, more people might fill in the ONS census form. 'The state' has only itself to blame.
The U.K. population is much larger than they will openly admit; the pressure on infrastructure and the non-stop building tell us it far exceeds the numbers we are fed.
The amnesty announced yesterday by Sunak will swell the numbers even more with those who arrived and have remained hidden. You can expect that none of the over a million who came through legal channels just last year will have been vaxxed; in some of the countries they’re from it was never pushed as it was in western countries.
We are losing people too: many EU citizens are leaving due to Brexit or because their own countries are booming, many UK pensioners want to live in the sun for six months a year, many young Brits are working in California, Florida, Canada, New Zealand or even China.
My daughter has been working in New Zealand for the last decade: I don't suppose she 'deregistered' from the local GP clinic, and we still get bank letters for her dormant account, so she is another 'ghost' in the system.
Hi Dr. Craig, I think I can shed some light on the cryptic sentence,
“We linked deidentified Census 2021 records to NHS numbers using the personal demographics service to obtain NHS numbers for census identification numbers.”
IT systems of late try to protect Personally Identifiable Information (PII) by keeping it in a single database for the whole enterprise, which is only accessible through a protected service. The protected service ("personal demographics service", perhaps) issues a generated key which other systems can store in their own database, and use to access (but not store) the PII.
So, if someone were to hack the ONS database, they would not get all the personal demographic info as well.
This begs the question (if this is the correct interpretation), why does the ONS claim they cannot release the raw data due to the PII in it? (as if they couldn't simply leave out the PII in the first place).
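As a toy model of the pattern guessed at above (PII held in one protected place, with other systems storing only a generated key), something like the following would do. This is an assumption about the general design, not a description of the real PDS or of any ONS system; the class and field names are invented.

```python
# A toy model of the pattern guessed at above, not the real PDS or ONS design.
import secrets

class DemographicsVault:
    """Hypothetical protected service: the only place PII is stored."""

    def __init__(self):
        self._pii_by_key = {}

    def register(self, name, date_of_birth, postcode):
        key = secrets.token_hex(8)  # random key carrying no personal information
        self._pii_by_key[key] = {"name": name, "dob": date_of_birth, "postcode": postcode}
        return key

    def lookup(self, key):
        # Callers fetch PII on demand and are expected not to store it.
        return dict(self._pii_by_key[key])

vault = DemographicsVault()
key = vault.register("Jane Example", "1970-01-01", "AB1 2CD")

# An analysis database holds only the key plus non-identifying fields.
analysis_row = {"person_key": key, "age_band": "50-59", "vaccinated": True}
print(analysis_row)
```

Under this design a leaked `analysis_row` reveals nothing personal unless the vault itself is also compromised.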
That would imply that they must have written rules for matching and there would be little flexibility in terms of matching records with slight differences. Is that a fair interpretation?
I checked around to see if my guesswork above seemed reasonable. The cryptic sentence is indeed about hiding personal identity, but it works differently than I guessed and seems fairly convoluted. (Also, now I can't see how the leap is made from de-identified to identified either :)
Suffice it to say I don't have a complete picture. So it's not really feasible to conjecture on who matched the census data to NHS records, when, or how.
Still, it seems quite likely that the method - whatever it is - is automated, i.e. cast in concrete, in software code. I found several references to automated matching routines/calls to link ONS to NHS data. And the dataset to be matched is huge. So I think matching is likely to be automated, and potentially in a way that is standardized across many ONS statistics teams.
Automation means little flexibility in terms of results.
Even though there may be flexibility and complexity in the matching *method*, if it's automated it should work the same way every time (barring bugs, or AI :).
I will post some detailed notes and links in a separate comment, in case they might be of interest.
Notes on what I found, for the record. Feel free to ignore!
*** NHS Side
The Personal Demographics Service (PDS) belongs to NHS Digital. It's also used by doctors and hospitals to get patient information. It contains personal patient data (including many personal identifiers), but not medical records.
Neither of these would explain our cryptic sentence!
*** ONS Side
The cryptic sentence may be talking about de-identification in the result, rather than in the source. Note the context just before it:
"The Census 2021 linked dataset is based on the population in the Census 2021 ... We linked deidentified Census 2021 records to NHS numbers using the personal demographics service to obtain NHS numbers for census identification numbers."
I gather that by "Census 2021 linked dataset", they are referring to their working analysis dataset. In other words, it contains de-identified data extracted from the full Census 2021, linked to NHS number from PDS. I saw this "linked dataset" terminology in many other descriptions. Some had several sources, so it was perhaps a little more clear.
So this could simply be a case of confusing wording. For example it could be trying to describe a process like this:
1) Link each census identification number to an NHS number using the NHS Personal Demographics Service (e.g. "search"), passing identifying criteria from the full Census 2021 dataset.
2) Stuff the matches away in a temporary result.
3) De-identify the Census 2021 data, join it to our temporary result with NHS number, and extract it all out as our analysis dataset.
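The three steps above can be sketched as follows. The records are invented, the field names are hypothetical, and the dictionary lookup merely stands in for a PDS "search" call; nothing here is taken from ONS code.

```python
# Sketch of the three steps above with invented records; the field names are
# hypothetical and the lookup stands in for a PDS "search" call.

full_census = [
    {"census_id": "C1", "name": "Jane Example", "dob": "1970-01-01", "postcode": "AB1 2CD", "age_band": "50-59"},
    {"census_id": "C2", "name": "John Sample", "dob": "1985-06-15", "postcode": "EF3 4GH", "age_band": "30-39"},
]

# Stand-in for the demographics service, keyed on the identifying details.
pds_index = {("Jane Example", "1970-01-01", "AB1 2CD"): "NHS-0001"}

def search_pds(record):
    return pds_index.get((record["name"], record["dob"], record["postcode"]))

# Steps 1 and 2: look up an NHS number per census ID and keep the matches.
matches = {}
for record in full_census:
    nhs_number = search_pds(record)
    if nhs_number is not None:
        matches[record["census_id"]] = nhs_number

# Step 3: drop the identifying fields, then join on census ID.
IDENTIFYING = {"name", "dob", "postcode"}
analysis = [
    {**{k: v for k, v in record.items() if k not in IDENTIFYING},
     "nhs_number": matches[record["census_id"]]}
    for record in full_census
    if record["census_id"] in matches
]
print(analysis)
```

Note that census record C2, which finds no match, silently drops out of the analysis dataset.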
Or it could be describing a very complicated architecture with all kinds of components that I would rather not imagine. This kind of approach is hinted at on this older web page:
"We are supplied PDS data as an extract from the PDS system"
In the first case, NHS# would likely be linked at the time the death by vaccination status dataset is created. In the second case, NHS# might have been linked back when the full 2021 Census data was loaded (and perhaps on an ongoing basis as well).
Either way, here is the general matching policy for linked datasets, which may be of interest:
"Datasets used for statistical research are usually de-identified. The approach taken by the ONS for requests to include personal information in a research project is covered in the "Statistical research using personal information" section of this policy.
There is a clear separation of duties meaning that staff handling personal information for data processing (for example, ingestion, data engineering) are not actively involved in statistical research. ONS staff conducting statistical research do not have access to personal information."
"As part of the ONS's commitment to the highest ethical standards, we are taking a leading role in safeguarding confidentiality by ensuring the efficacy and proportionality of controls applied to de-identify data."
"When working with de-identified data is feasible, a proportionate level of de-identification ensures that ONS staff and accredited researchers can access the data they require without directly or indirectly disclosing the identity of individuals (re-identification)."
1) Example description of another linked dataset from the ONS website, a little clearer I think:
"Age-standardised mortality rates are calculated for vaccination status groups using the Public Health Data Asset (PHDA) dataset. The PHDA is a linked dataset combining the 2011 Census, the General Practice Extraction Service (GPES) data for pandemic planning and research, and the Hospital Episode Statistics (HES). We linked vaccination data from the National Immunisation Management Service (NIMS) to the PHDA based on NHS number, and linked data on positive coronavirus (COVID-19) Polymerase Chain Reaction (PCR) tests from Test and Trace to the PHDA, also based on NHS number."
(The link is actually given in the *Deaths by vaccination status Dec 2022* spreadsheet - on the Notes tab, cell [B4]. Might be a mistake though, doesn't seem to fit.)
2) Proposed ONS dataset to roll Census 2021 data forward into the future
Interesting for context, terminology, and ambition.
"The PDS system is the master demographics database for the NHS in England, Wales and the Isle of Man. It is the primary source of information on a patient's NHS number, name, address and date of birth. It does not hold any clinical information. The master database contains approximately 74 million patient records. Records are created for newborns or when a patient makes contact with an NHS service, primarily by registering with a General Practitioner (GP) practice, but also through accessing A&E or attending hospital. The PDS is used by NHS organisations and enables a patient to be readily identified by a healthcare professional to quickly and accurately obtain their correct medical details."
That could certainly account for some of the wild deviations in rates.
It seems if they ever did give out "the raw data", it would already have all of these mismatch issues built in. In a sense there is no "raw data", because they are cobbling it together from disparate sources, and the result will already include a lot of assumptions about matching, errors, etc.
Thank you for all you do, and the clarity. I'm delighted that you have spent the time to detail how one can become a ghost, preferably without dying! Love it! 👏👏👏
Great review. The data is very weak, by design or through incompetence. Perhaps the greater question that isn't being asked is "Why are the data systems universally so bad?" Money certainly wasn't an issue. Didn't Fauci have 40 years to plan Covid data systems? Didn't Fauci have a year to put in place a state-of-the-art vaccination tracking system? Weren't there many, many pandemic simulations that could have developed pandemic data systems? Many were funded by a guy named Gates, who I believe had something to do with data systems. Weren't the Masters of the Data Universe (Social, Big Tech, Deep State Tech) all on board for controlling covid data? Aren't data systems their baby? Didn't Google come in and save Obamacare's website? Where were they with the Covid data?
Why are the Covid data systems so bad when they had so much time, money, technical ability, and global planning to make them right? That is the question.
Again, great review. I figure just like a ruler can't measure itself, people can't always either. One good metric of who you are is your friends. If you have great friends, you must measure up. Looking at who likes your work, you have a really good group of professional admirers. Keep up the good work.
Surely we/they should establish a baseline pre-vaccines of the general health (i.e. ER visits, disease records etc.) of the two groups, i.e. person who did vs. did NOT get at least one covid vaccine?
Matthew Crawford has written extensively about this and it is very clear from looking at flu vaccine that the healthy vaccinee effect could explain the totality of positive effects observed.
Wow. This is much more complicated than I imagined.
Another couple of reasons to distrust the data. Firstly, the graphs don't seem to exhibit the normal seasonal pattern of mortality. Secondly, the death rates tail off in 2022, particularly at the younger ages and this may be because of some delays in notification. John Dee wrote about this in his cause of death analysis.
Suppose the ONS has received a steer from above to avoid producing numbers that make the vaccines look bad. One way to do that is to produce a range of draft "findings" based on a range of alternative but plausible ways to do the calculations. Then from this range, the draft findings that make the vaccines look best could be selected, and rationalisation for the way the calculations were done put in place.
Peter Watt, they’ve made it clear as mud. If there were any real benefit they would roll out those charts, as in the early days of the rollout, and bombard the public with the information.
Their silence speaks volumes since it was dropped in favour of Ukraine a year ago.
The people in the government who are getting protection are not just the elected politicians but the civil servants, nudge unit members and hospital officials ... the people who lied, and the people who gave the orders to lie.
1) Please keep reminding us of the definition of "vaccinated" which has a two or three week lag from the date of injection, or a few weeks if you require two shots to be defined as "vaccinated."
2) In presenting your graphs for the older cohorts where the "unvaxxed" have higher mortality rates than the "vaxxed," it is helpful to recall that those who are about to die are usually excused from the Covid injections. Otherwise the Vaxx-fans use them to "prove" that we need to inject the elderly.
Brilliant work, thank you. Intentional or unintentional, this bias makes the data useless. Not that that will ever be noticed by the media and official channels, of course. Which makes it all very convenient. I'm not a believer in such coincidences.
I'm also wondering whether we still need to be concerned about the definition of "unvaccinated" with respect to mortality. We know that those vaccinated but within the window of (variously) 10, 21 and 30 days after inoculation were reported as "unvaccinated" in many official data sets. (Like being only "a little bit pregnant"?)
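To make the window effect concrete, here is a minimal sketch of how the reported status of one and the same death changes with the lag applied. The 10, 21 and 30 day windows are the ones mentioned above; the function name and dates are invented, and whether any particular ONS table applies such a window is not established here.

```python
from datetime import date

def reported_status(dose_date, death_date, lag_days):
    """Status a death is reported under if the first `lag_days` after a dose count as unvaccinated."""
    if dose_date is None or death_date < dose_date:
        return "unvaccinated"
    days_since_dose = (death_date - dose_date).days
    return "vaccinated" if days_since_dose >= lag_days else "unvaccinated"

dose = date(2021, 4, 1)
death = date(2021, 4, 15)  # 14 days after the dose
for lag in (0, 10, 21, 30):
    print(lag, reported_status(dose, death, lag))
```

With no lag or a 10 day lag this death counts as vaccinated; with a 21 or 30 day lag the same death is booked to the unvaccinated column.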
There is little doubt that the ONS have been skewing their reports on deaths with regards to covid and impact of the pseudo vaccines on all cause mortality. As Clare points out it gets so absurd that if these data are to be accepted then the covid vaccines apparently stop people dying from causes other than covid.
The ONS report Clare refers to includes a Table 4 which lists mortality rates by age group and vaccination status. The month of March 2022 is the newest data. This has a section for Non-COVID 19 Deaths.
It is peculiar, to say the least, that for the 40-49s the share of the population (based on the people-years proportions for the whole age group) is 10.4% for the unvaccinated but the share of deaths is 13.4%.
A similar oddity exists for the 50-59s where the figures are 6.2% of the population with 9.3% of the deaths. For the 60-69s it's 4.1% of the population and 6.2% of the deaths.
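The shares quoted above can be turned into an implied rate ratio with a little arithmetic. This is a rough sketch: it lumps everyone who is not unvaccinated into a single "vaccinated" group and does no age standardisation within a band, both of which are simplifying assumptions of mine.

```python
# Shares quoted above for the unvaccinated (non-COVID deaths, March 2022):
# (share of population, share of deaths).
shares = {
    "40-49": (0.104, 0.134),
    "50-59": (0.062, 0.093),
    "60-69": (0.041, 0.062),
}

def implied_rate_ratio(pop_share, death_share):
    """Unvaccinated rate divided by vaccinated rate implied by the two shares."""
    unvaccinated = death_share / pop_share
    vaccinated = (1 - death_share) / (1 - pop_share)
    return unvaccinated / vaccinated

for band, (pop_share, death_share) in shares.items():
    print(band, round(implied_rate_ratio(pop_share, death_share), 2))
```

On these figures the unvaccinated come out with roughly a third to a half higher non-COVID mortality than everyone else in the same band.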
Even Pfizer doesn't have the nerve to try to suggest that their experimental injections can stop you dying from causes other than just covid but apparently the ONS is making such claims on their behalf.
Thank you for all of your efforts these past three years to defend children especially from the harms of this vaccine, and to explain why in layman's terms.
"The more recent data seems to have bias such that deaths in the unvaccinated are more likely to be included in the ONS sample whereas deaths in the vaccinated have the opposite bias and are more likely to be excluded from the ONS sample." (Not unlike the lack of true blinding when confirming symptomatic covid in the Pfizer trials per Brook Jackson's testimony.)
So. The layman might wonder if the dead unvaxed were matched, and the dead vaxed were de-matched, to hit a target.
Looking at that final 40-49 yo mortality rate graph,
If one did not trust one's public health authorities to report data honestly, it could look like the authorities set a goal to de-match (ghost) enough unvaxed dead, and match (de-ghost) enough vaxed dead to achieve the target of reporting a relative risk reduction for the vaxed as being HALF as likely to die from covid as unvaxed in 2021.
And that would certainly support a topline public health goal of reducing vaccine hesitancy. And that seems to have been the topline goal for those who decide what is and is not public health.
On the subject of obtaining NHS numbers for those in the census to then get their vax status from NIMS to get to the ONS dataset, this old ONS document from July 2021 gives a few hints on how they might do that, perhaps.
While that's talking about linking to the 2011 census such as was done for the earlier reports, I think the principles must be the same (?). They say:
"Linked on names, date of birth, postcode, etc. Of the 53,483,502 Census records 50,019,451 were linked deterministically, 555,291 additional matches were obtained using probabilistic matching. Total linkage rate: 94.6%"
So most records seem to be linked on the basis that the names, month and year of birth and postcode, say, are identical, and a few on the basis that, on the balance of probabilities, the records are the same. It would be interesting to know what match criteria the ONS use. Might there be scope for bias in choosing those rules (?). Might an FOI be possible on the information the ONS hold about the rules used to match census records to NHS numbers through the Personal Demographics Service, both deterministically and through probabilistic matching?
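For anyone unfamiliar with the two terms, here is a minimal sketch of deterministic matching followed by a probabilistic fallback, plus a check of the quoted linkage rate. The ONS rules are not known to me; the similarity measure (Python's `difflib`), the equal weighting of fields and the 0.9 threshold are all assumptions chosen purely for illustration, and the records are invented.

```python
from difflib import SequenceMatcher

FIELDS = ("name", "dob", "postcode")

def deterministic_match(a, b):
    """Exact agreement on every linking field."""
    return all(a[f] == b[f] for f in FIELDS)

def probabilistic_score(a, b):
    """Crude similarity in [0, 1]; real linkage would use field-specific weights."""
    return sum(SequenceMatcher(None, a[f].lower(), b[f].lower()).ratio() for f in FIELDS) / len(FIELDS)

def link(census_record, nhs_records, threshold=0.9):
    if not nhs_records:
        return None, "unlinked"
    for candidate in nhs_records:
        if deterministic_match(census_record, candidate):
            return candidate["nhs_number"], "deterministic"
    best = max(nhs_records, key=lambda c: probabilistic_score(census_record, c))
    if probabilistic_score(census_record, best) >= threshold:
        return best["nhs_number"], "probabilistic"
    return None, "unlinked"

nhs_records = [{"nhs_number": "NHS-0001", "name": "Catherine Smith", "dob": "1970-01-01", "postcode": "AB1 2CD"}]
print(link({"name": "Catherine Smith", "dob": "1970-01-01", "postcode": "AB1 2CD"}, nhs_records))
print(link({"name": "Katherine Smith", "dob": "1970-01-01", "postcode": "AB1 2CD"}, nhs_records))
print(link({"name": "John Sample", "dob": "1985-06-15", "postcode": "EF3 4GH"}, nhs_records))

# Check of the quoted linkage rate from the two quoted match counts.
linked = 50_019_451 + 555_291
print(round(100 * linked / 53_483_502, 1))
```

The threshold is where discretion enters: raise it and more records go unlinked, lower it and more false matches get through.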
I have run 94.6% through the model and it still works.
Because there are so many vaccinated people a 5.4% misclassification rate is enough to create huge anomalies.
Even if the matching criteria were perfectly fair the bias comes from the small chance of a match failure.
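A back-of-envelope sketch shows why a small failure rate matters when most people are vaccinated. This is not the model referred to above. It assumes, purely for illustration, that both groups have the same true death rate, that the population denominators are right, and that 5.4% of vaccinated deaths end up counted in the unvaccinated column; the vaccinated shares are invented.

```python
def apparent_rates(vaccinated_share, true_rate, misclassified_fraction):
    """Apparent death rates when a fraction of vaccinated deaths is counted as unvaccinated."""
    unvaccinated_share = 1 - vaccinated_share
    vaccinated_deaths = vaccinated_share * true_rate
    unvaccinated_deaths = unvaccinated_share * true_rate
    moved = misclassified_fraction * vaccinated_deaths  # deaths booked to the wrong column
    apparent_vaccinated = (vaccinated_deaths - moved) / vaccinated_share
    apparent_unvaccinated = (unvaccinated_deaths + moved) / unvaccinated_share
    return apparent_vaccinated, apparent_unvaccinated

for share in (0.80, 0.90, 0.95):
    vax, unvax = apparent_rates(share, true_rate=1.0, misclassified_fraction=0.054)
    print(f"{share:.0%} vaccinated: apparent unvaccinated/vaccinated ratio = {unvax / vax:.2f}")
```

Even with identical true rates, the apparent ratio climbs steeply as the vaccinated share grows, because the same number of misplaced deaths lands on an ever smaller unvaccinated denominator.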
They say "The population was restricted to people in England, alive on 1 April 2021 (51,786,812 people). This is 91.6% of the England population on Census Day 2021."
If they had 53,483,502 records in the 2011 census, that seems like quite a fall in a decade.
They also make no comment on the 4.7 million people with a vaccination record that they did not include!
It might be worth me sharing my thinking on potential ONS dataset biases before I saw your article here on the ghost populations.
The latest ONS report states that they covered 91.6% of the population and 90.5% of all deaths. On the surface that sounds good. It's most of the deaths and population, and a similar percentage for deaths and population. So it must be fairly representative, or at least that's what ONS want us to believe. But when I looked at the age and vax status breakdown I started to see problems.
ONS stated in the main report that 895,135 deaths occurred between 1/4/2021 and 31/12/2022 and were reported by 4/1/2023. I tried to take out the under 18 deaths and so estimated that this means they are saying about 889,000 18+ deaths occurred between 1/4/2021 and 31/12/2022 and were reported by 4/1/2023.
And table 5 18+ deaths total 883,784 (which is about 99.4% of the 889,000 figure). So I assumed that table 5 deaths were close enough (within 0.6%) to all deaths of those in the 2021 census to be the same thing. That assumption may or may not be faulty but I worked with it all the same.
Having noticed that, I thought: why not take the ratio of table 2 deaths to table 5 deaths and compare it with the ratio of the table 2 population to the 2021 census population? Because 99.4% is close to 100%, the table 2 population/2021 census population ratio should be roughly the same as the table 2 population/table 5 population ratio.
The deaths are whole April 2021 to December 2022 period figures. The monthly death figure percentages which I've looked at but not posted here fluctuate fairly closely around the whole period average. So I've used the whole period here rather than the single month April 2021 percentages. The population percentage shouldn't really change much over the period so it's sensible to use the whole period to compare.
And the populations are at census day for the 2021 census population and the April 2021 average population for table 2 (using the 365/30 method). I've ignored the 4 week date difference in population dates.
Now you start to see the mismatch in the table 2 population by vax status and age, in that a much lower % of unvaxed deaths are included than vaxed deaths especially in the younger age groups.
I started to ask myself questions such as:
If about 70% of unvaxed 2021 census deaths (matched or unmatched to NHS number) are included in table 2, along with about 92% of vaxed 2021 census deaths (matched or unmatched to NHS number), and if about 91% of the all-status 2021 population is included in the table 2 population, what is the equivalent of that 91% figure when split by vax status? If it isn't also around 70%/92%, then there's a bias. And even if it is 70%/92%, will the 30%/8% excluded give the same answers as the 70%/92% included? Hence some connection with your ghost population approach.
And more importantly how do these things compare by age band?
And what miscategorisations of vax status deaths, in table 2 and table 5 still exist?
Of course you are using the NIMS population figures (the unvaxed proportion from that is likely to be more accurate than the ONS estimates of course).
It's worth saying that if we unrealistically assume for the purposes of discussion that the ONS sample did include say 70% of unvaxed deaths and 70% of unvaxed population and 92% of vaxed deaths and 92% of vaxed population and these were entirely random samples, we would need to scale up the implied table 2 unvaccinated percentages to compare with the UKHSA weekly NIMS unvaccinated percentages.
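The scaling-up mentioned above is simple to write down. The sketch below takes the 70% and 92% coverage figures as given and, as in the text, assumes the samples are entirely random; the sample shares fed in are invented.

```python
def rescaled_unvaccinated_share(sample_unvax_share, unvax_coverage=0.70, vax_coverage=0.92):
    """Undo unequal sampling to estimate the unvaccinated share of the full population."""
    unvax = sample_unvax_share / unvax_coverage      # scale unvaccinated back up
    vax = (1 - sample_unvax_share) / vax_coverage    # scale vaccinated back up
    return unvax / (unvax + vax)

for sample_share in (0.05, 0.10, 0.20):
    print(sample_share, round(rescaled_unvaccinated_share(sample_share), 3))
```

So an unvaccinated share of 10% in the table 2 sample would correspond to nearer 13% in the full population, which is the figure to hold against the UKHSA/NIMS percentages.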
I love the idea of rebellious old folk, swerving the system for their whole life, and living longer as a result!
Well said - I second that.
Thank you for your valuable work.
Thank you. That's really helpful.
That would imply that they must have written rules for matching and there would be little flexibility in terms of matching records with slight differences. Is that a fair interpretation?
Yes, I believe that is most likely true.
I checked around to see if my guesswork above seemed reasonable. The cryptic sentence is indeed about hiding personal identity, but it works differently than I guessed and seems fairly convoluted. (Also, now I can't see how the leap is made from de-identified to identified either :)
Suffice it to say I don't have a complete picture. So it's not really feasible to conjecture on who matched the census data to NHS records, when, or how.
Still, it seems quite likely that the method - whatever it is - is automated, i.e. cast in concrete, in software code. I found several references to automated matching routines/calls to link ONS to NHS data. And the dataset to be matched is huge. So I think matching is likely to be automated, and potentially in a way that is standardized across many ONS statistics teams.
Automation means little flexibility in terms of results.
Even though there may be flexibility and complexity in the matching *method*, if it's automated it should work the same way every time (barring bugs, or AI :).
I will post some detailed notes and links in a separate comment, in case they might be of interest.
Notes on what I found, for the record. Feel free to ignore!
*** NHS Side
The Personal Demographics Service (PDS) belongs to NHS Digital. It's also used by doctors and hospitals to get patient information. It contains personal patient data (including many personal identifiers), but not medical records.
https://digital.nhs.uk/services/demographics
Here are two main PDS APIs (i.e., calls) to get patient data:
1) Get patient details - caller provides the NHS number;
https://digital.nhs.uk/developer/api-catalogue/personal-demographics-service-fhir#api-Default-get-patient
2) Search for a patient - caller provides personal data to match on.
https://digital.nhs.uk/developer/api-catalogue/personal-demographics-service-fhir#api-Default-search-patient
Neither of these would explain our cryptic sentence!
*** ONS Side
The cryptic sentence may be talking about de-identification in the result, rather than in the source. Note the context just before it:
"The Census 2021 linked dataset is based on the population in the Census 2021 ... We linked deidentified Census 2021 records to NHS numbers using the personal demographics service to obtain NHS numbers for census identification numbers."
https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/bulletins/deathsinvolvingcovid19byvaccinationstatusengland/deathsoccurringbetween1april2021and31december2022
I gather that by "Census 2021 linked dataset", they are referring to their working analysis dataset. In other words, it contains de-identified data extracted from the full Census 2021, linked to NHS number from PDS. I saw this "linked dataset" terminology in many other descriptions. Some had several sources, so it was perhaps a little more clear.
So this could simply be a case of confusing wording. For example it could be trying to describe a process like this:
1) Link each census identification number to an NHS number using NHS Personal Demographic Service (eg, "search"), and passing identifying criteria from the full Census 2021 dataset.
2) Stuff the matches away in a temporary result.
3) De-identify the Census 2021 data, join it to our temporary result with NHS number, and extract it all out as our analysis dataset.
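The three steps above could be sketched roughly as follows. This is a guess at the shape of the process, not the ONS's actual pipeline: the `search_pds` function, the field names and the sample records are all made up for the illustration.

```python
# Hypothetical stand-in for the PDS: (name, date of birth, postcode) -> NHS number.
PDS = {
    ("ann smith", "1950-02-01", "AB1 2CD"): "9000000001",
    ("bob jones", "1972-11-30", "EF3 4GH"): "9000000002",
}

def search_pds(name, dob, postcode):
    """Return an NHS number on an exact match of the identifying details, else None."""
    return PDS.get((name.lower(), dob, postcode))

# Invented "full census" records, including identifying fields.
census = [
    {"census_id": "C1", "name": "Ann Smith", "dob": "1950-02-01", "postcode": "AB1 2CD", "age_band": "70-79"},
    {"census_id": "C2", "name": "Bob Jones", "dob": "1972-11-30", "postcode": "EF3 4GH", "age_band": "40-49"},
    {"census_id": "C3", "name": "Cat Brown", "dob": "1990-05-17", "postcode": "IJ5 6KL", "age_band": "30-39"},
]

# Steps 1 and 2: search for each census record and keep the matches in a temporary result.
matches = {}
for rec in census:
    nhs_number = search_pds(rec["name"], rec["dob"], rec["postcode"])
    if nhs_number is not None:
        matches[rec["census_id"]] = nhs_number

# Step 3: drop the identifying fields, then join to the temporary result on census_id.
IDENTIFYING = {"name", "dob", "postcode"}
analysis = [
    {**{k: v for k, v in rec.items() if k not in IDENTIFYING},
     "nhs_number": matches[rec["census_id"]]}
    for rec in census
    if rec["census_id"] in matches
]
print(analysis)
```

In this toy version the third census record finds no match and simply drops out of the analysis dataset, which is the kind of loss the discussion here is concerned with.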
Or it could be describing a very complicated architecture with all kinds of components that I would rather not imagine. This kind of approach is hinted at on this older web page:
"We are supplied PDS data as an extract from the PDS system"
https://www.ons.gov.uk/aboutus/whatwedo/programmesandprojects/censusanddatacollectiontransformationprogramme/futureofpopulationandsocialstatistics/datasourceoverviews/personaldemographicsservicedata
In the first case, NHS# would likely be linked at the time the death by vaccination status dataset is created. In the second case, NHS# might have been linked back when the full 2021 Census data was loaded (and perhaps on an ongoing basis as well).
Either way, here is the general matching policy for linked datasets, which may be of interest:
Data linkage and matching policy - https://www.ons.gov.uk/aboutus/transparencyandgovernance/datastrategy/datapolicies/datalinkageandmatchingpolicy
Other relevant policies:
"Datasets used for statistical research are usually de-identified. The approach taken by the ONS for requests to include personal information in a research project is covered in the "Statistical research using personal information" section of this policy.
There is a clear separation of duties meaning that staff handling personal information for data processing (for example, ingestion, data engineering) are not actively involved in statistical research. ONS staff conducting statistical research do not have access to personal information."
Managing identifiable data - https://www.ons.gov.uk/aboutus/transparencyandgovernance/datastrategy/managingidentifiabledata
"As part of the ONS's commitment to the highest ethical standards, we are taking a leading role in safeguarding confidentiality by ensuring the efficacy and proportionality of controls applied to de-identify data."
"When working with de-identified data is feasible, a proportionate level of de-identification ensures that ONS staff and accredited researchers can access the data they require without directly or indirectly disclosing the identity of individuals (re-identification)."
De-identification policy - https://www.ons.gov.uk/aboutus/transparencyandgovernance/datastrategy/datapolicies/deidentificationpolicy
*** Other context
1) Example description of another linked dataset from the ONS website, a little clearer I think:
"Age-standardised mortality rates are calculated for vaccination status groups using the Public Health Data Asset (PHDA) dataset. The PHDA is a linked dataset combining the 2011 Census, the General Practice Extraction Service (GPES) data for pandemic planning and research, and the Hospital Episode Statistics (HES). We linked vaccination data from the National Immunisation Management Service (NIMS) to the PHDA based on NHS number, and linked data on positive coronavirus (COVID-19) Polymerase Chain Reaction (PCR) tests from Test and Trace to the PHDA, also based on NHS number."
Weekly COVID-19 age-standardised mortality rates by vaccination status, England - https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/methodologies/weeklycovid19agestandardisedmortalityratesbyvaccinationstatusenglandmethodology
(The link is actually given in the *Deaths by vaccination status Dec 2022* spreadsheet - on the Notes tab, cell [B4]. Might be a mistake though, doesn't seem to fit.)
2) Proposed ONS dataset to roll Census 2021 data forward into the future
Interesting for context, terminology, and ambition.
https://www.ons.gov.uk/peoplepopulationandcommunity/populationandmigration/internationalmigration/methodologies/thecensus2021dataassetlongitudinaldatasourceforpopulationinenglandandwalesdesignandplans
3) ONS description of the PDS:
"The PDS system is the master demographics database for the NHS in England, Wales and the Isle of Man. It is the primary source of information on a patient's NHS number, name, address and date of birth. It does not hold any clinical information. The master database contains approximately 74 million patient records. Records are created for newborns or when a patient makes contact with an NHS service, primarily by registering with a General Practitioner (GP) practice, but also through accessing A&E or attending hospital. The PDS is used by NHS organisations and enables a patient to be readily identified by a healthcare professional to quickly and accurately obtain their correct medical details."
https://www.ons.gov.uk/aboutus/whatwedo/programmesandprojects/censusanddatacollectiontransformationprogramme/futureofpopulationandsocialstatistics/datasourceoverviews/personaldemographicsservicedata
Thank you.
Your points about automated matching got me thinking about what else could account for the anomalies and I ended up writing this: https://drclarecraig.substack.com/p/the-ons-have-a-faulty-sorting-hat
That could certainly account for some of the wild deviations in rates.
It seems if they ever did give out "the raw data", it would already have all of these mismatch issues built in. In a sense there is no "raw data", because they are cobbling it together from disparate sources, and the result will already include a lot of assumptions about matching, errors, etc.
Release the raw data - not just some of it ; all of it
Thank you for all you do, and the clarity. I'm delighted that you have spent the time to detail how one can become a ghost, preferably without dying! Love it! 👏👏👏
Great review. The data is very weak by design or incompetence. Perhaps the greater question that isn't being asked is "Why are the data systems universally so bad?" Money certainly wasn't an issue. Didn't Fauci have 40 years to plan Covid data systems? Didn't Fauci have a year to put in place a state of the art vaccination total tracking system? Weren't there many, many pandemic simulations that could have developed pandemic data systems? Many were funded by a guy named Gates, who I believe had something to do with data systems. Weren't the Masters of the Data Universe (Social, Big Tech, Deep State Tech) all on board for controlling covid data? Aren't data systems their baby? Didn't Google come in and save Obamacare's website? Where were they with the Covid data?
Why are the Covid data systems so bad when they had so much time, money, technical ability, and global planning to make them right? That is the question.
Again, great review. I figure just like a ruler can't measure itself, people can't always either. One good metric of who you are is your friends. If you have great friends, you must measure up. In looking at who likes your work, you have a really good group of professional admirers. Keep up the good work.
Surely we/they should establish a baseline pre-vaccines of the general health (i.e. ER visits, disease records etc.) of the two groups, i.e. person who did vs. did NOT get at least one covid vaccine?
Matthew Crawford has written extensively about this and it is very clear from looking at flu vaccine that the healthy vaccinee effect could explain the totality of positive effects observed.
Won’t do or show that, it would reveal facts and truth if they did.
If there wasn’t anything to hide, then they wouldn’t be going to such extreme lengths to cloud the water.
Wow. This is much more complicated than I imagined.
Another couple of reasons to distrust the data. Firstly, the graphs don't seem to exhibit the normal seasonal pattern of mortality. Secondly, the death rates tail off in 2022, particularly at the younger ages and this may be because of some delays in notification. John Dee wrote about this in his cause of death analysis.
Suppose the ONS has received a steer from above to avoid producing numbers that make the vaccines look bad. One way to do that is to produce a range of draft "findings" based on a range of alternative but plausible ways to do the calculations. Then from this range, the draft findings that make the vaccines look best could be selected, and rationalisation for the way the calculations were done put in place.
Peter Watt, they’ve made it clear as mud. If there was any real benefit they would roll out those charts like in the early days of the rollout and bombard the public with the information.
Their silence speaks volumes since it was dropped in favour of Ukraine a year ago.
The people in the government who are getting protection are not just the elected politicians but the civil servants, nudge unit members and hospital officials ... the people who lied, and the people who gave the orders to lie.
Thank you for this.
1) Please keep reminding us of the definition of "vaccinated" which has a two or three week lag from the date of injection, or a few weeks if you require two shots to be defined as "vaccinated."
2) In presenting your graphs for the older cohorts where the "unvaxxed" have higher mortality rates than the "vaxxed," it is helpful to recall that those who are about to die are usually excused from the Covid injections. Otherwise the Vaxx-fans use them to "prove" that we need to inject the elderly.
Many of the "unjabbed" per THEIR criteria actually received jabs at some point.
Purebloods uber alles.
Brilliant work, thank you. Intentional or unintentional, this bias makes the data useless. Not that that will ever be noticed by the media and official channels, of course. Which makes it all very convenient. I'm not a believer in such coincidences.
I'm also wondering do we still need to be concerned regarding the definition of "unvaccinated" wrt mortality? We know that those vaccinated but in the window of (variously) 10, 21 and 30 days after inoculation were in many official data sets reported as "unvaccinated" (Like being only "a little bit pregnant"?)
There is little doubt that the ONS have been skewing their reports on deaths with regards to covid and impact of the pseudo vaccines on all cause mortality. As Clare points out it gets so absurd that if these data are to be accepted then the covid vaccines apparently stop people dying from causes other than covid.
The ONS report Clare refers to includes a Table 4 which lists mortality rates by age group and vaccination status. The month of March 2022 is the newest data. This has a section for Non-COVID 19 Deaths.
It is peculiar to say the least that for 40-49's the share of the population (based on the people-years proportions for the whole age group) is 10.4% for the unvaccinated but the share of deaths is 13.4%
A similar oddity exists for the 50-59s where the figures are 6.2% of the population with 9.3% of the deaths. For the 60-69s it's 4.1% of the population and 6.2% of the deaths.
Even Pfizer doesn't have the nerve to try to suggest that their experimental injections can stop you dying from causes other than just covid but apparently the ONS is making such claims on their behalf.
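To put the three pairs of figures quoted above on one scale, the share of deaths can be divided by the share of the population for each age group. The short sketch below does only that arithmetic on the percentages as quoted in this comment; it does not re-derive them from the ONS tables, and the variable names are my own.

```python
# (share of population %, share of non-COVID deaths %) for the unvaccinated,
# exactly as quoted in the comment above.
unvaccinated_shares = {
    "40-49": (10.4, 13.4),
    "50-59": (6.2, 9.3),
    "60-69": (4.1, 6.2),
}

# A ratio of 1.0 would mean deaths in proportion to population share.
ratios = {
    age: round(death_share / pop_share, 2)
    for age, (pop_share, death_share) in unvaccinated_shares.items()
}

for age, ratio in ratios.items():
    print(f"{age}: deaths share / population share = {ratio}")
```

On the quoted figures, the unvaccinated in each of these age groups account for roughly 1.3 to 1.5 times as many non-COVID deaths as their population share alone would suggest.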
Thank you for all of your efforts these past three years to defend children especially from the harms of this vaccine, and to explain why in layman's terms.
"The more recent data seems to have bias such that deaths in the unvaccinated are more likely to be included in the ONS sample whereas deaths in the vaccinated have the opposite bias and are more likely to be excluded from the ONS sample." (Not unlike the lack of true blinding when confirming symptomatic covid in the Pfizer trials per Brook Jackson's testimony.)
So. The layman might wonder if the dead unvaxed were matched, and the dead vaxed were de-matched, to hit a target.
Looking at that final 40-49 yo mortality rate graph,
If one did not trust one's public health authorities to report data honestly, it could look like the authorities set a goal to de-match (ghost) enough unvaxed dead, and match (de-ghost) enough vaxed dead to achieve the target of reporting a relative risk reduction for the vaxed as being HALF as likely to die from covid as unvaxed in 2021.
And that would certainly support a topline public health goal of reducing vaccine hesitancy. And that seems to have been the topline goal for those who decide what is and is not public health.
On the subject of obtaining NHS numbers for those in the census, in order to then get their vax status from NIMS and so arrive at the ONS dataset, this old ONS document from July 2021 perhaps gives a few hints on how they might do that.
https://dam.ukdataservice.ac.uk/media/622972/covidtinsleynafilyan.pdf
While that's talking about linking to the 2011 census such as was done for the earlier reports, I think the principles must be the same (?). They say
"Linked on names, date of birth, postcode, etc. Of the 53,483,502 Census records 50,019,451 were linked deterministically, 555,291 additional matches were obtained using probabilistic matching. Total linkage rate: 94.6%"
So most records seem to be linked deterministically, on the names, month and year of birth, and postcode (say) being identical, and a few are linked on the basis that, on the balance of probabilities, the records are the same. It would be interesting to know what match criteria the ONS use. Might there be scope for bias in choosing those rules? Might an FOI request be possible for the information the ONS hold on the rules used to match census records to NHS numbers through the Personal Demographics Service, both deterministically and through probabilistic matching?
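As a rough illustration of the two-stage idea, the sketch below first checks the quoted linkage rate and then tries an exact match before falling back to a fuzzy one. The ONS rules are not known here, so everything in the `link` function is an assumption: the fuzzy step uses a simple string-similarity threshold from Python's `difflib` as a crude stand-in for real probabilistic matching, and the field names, the 0.85 threshold and the sample record are invented.

```python
from difflib import SequenceMatcher

# Linkage rate from the figures quoted in the ONS document.
census_records = 53_483_502
deterministic = 50_019_451
probabilistic = 555_291
linkage_rate = (deterministic + probabilistic) / census_records
print(f"Linkage rate: {linkage_rate:.1%}")

def link(census_rec, pds_records, threshold=0.85):
    """Exact match first; otherwise accept a near-identical name with the same
    date of birth and postcode. The threshold is arbitrary, not an ONS rule."""
    key = (census_rec["name"], census_rec["dob"], census_rec["postcode"])
    for pds in pds_records:
        if key == (pds["name"], pds["dob"], pds["postcode"]):
            return pds["nhs_number"], "deterministic"
    for pds in pds_records:
        same_rest = (census_rec["dob"], census_rec["postcode"]) == (pds["dob"], pds["postcode"])
        similarity = SequenceMatcher(None, census_rec["name"], pds["name"]).ratio()
        if same_rest and similarity >= threshold:
            return pds["nhs_number"], "probabilistic"
    return None, "unmatched"

# Invented PDS record and census entries.
pds_records = [
    {"name": "catherine brown", "dob": "1990-05-17", "postcode": "IJ5 6KL", "nhs_number": "9000000003"},
]
exact = {"name": "catherine brown", "dob": "1990-05-17", "postcode": "IJ5 6KL"}
variant = {"name": "katherine brown", "dob": "1990-05-17", "postcode": "IJ5 6KL"}
moved = {"name": "catherine brown", "dob": "1990-05-17", "postcode": "ZZ9 9ZZ"}

for rec in (exact, variant, moved):
    print(rec["name"], rec["postcode"], link(rec, pds_records))
```

Even in this toy version the outcome depends on choices such as the threshold and which fields must agree exactly, which is why the actual rules matter.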
Thank you so much. That's really helpful.
I have run 94.6% through the model and it still works.
Because there are so many vaccinated people a 5.4% misclassification rate is enough to create huge anomalies.
Even if the matching criteria were perfectly fair the bias comes from the small chance of a match failure.
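A toy calculation shows why a small failure rate matters when one group is much larger than the other. The numbers below are hypothetical except for the 5.4% rate: a population of one million and a 90% vaccinated share are assumed purely for illustration, as is the simplification that every failed match moves a vaccinated person into the unvaccinated group.

```python
def observed_groups(population, vaccinated_share, misclassified_rate):
    """Group sizes as they would appear if a share of vaccinated people
    fail to link to their vaccination record and are counted as unvaccinated."""
    vaccinated = population * vaccinated_share
    unvaccinated = population - vaccinated
    moved = vaccinated * misclassified_rate
    observed_unvaccinated = unvaccinated + moved
    return {
        "observed_vaccinated": vaccinated - moved,
        "observed_unvaccinated": observed_unvaccinated,
        "share_of_unvaccinated_group_actually_vaccinated": moved / observed_unvaccinated,
    }

# Assumed: 1,000,000 people, 90% vaccinated, 5.4% of vaccinated misclassified.
result = observed_groups(1_000_000, 0.90, 0.054)
for name, value in result.items():
    print(name, round(value, 3))
```

Under these assumptions about a third of the group labelled unvaccinated would in fact be vaccinated, while the vaccinated group barely changes in size.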
They say "The population was restricted to people in England, alive on 1 April 2021 (51,786,812 people). This is 91.6% of the England population on Census Day 2021."
If they had 53,483,502 records in the 2011 census, that seems like quite a fall in a decade.
They also make no comment on the 4.7 million people with a vaccination record that they did not include!
https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/bulletins/deathsinvolvingcovid19byvaccinationstatusengland/deathsoccurringbetween1april2021and31december2022
Thanks.
It might be worth me sharing my thinking on potential ONS dataset biases before I saw your article here on the ghost populations.
The latest ONS report states that they covered 91.6% of the population and 90.5% of all deaths. On the surface that sounds good. It's most of the deaths and population, and a similar percentage for deaths and population. So it must be fairly representative, or at least that's what ONS want us to believe. But when I looked at the age and vax status breakdown I started to see problems.
ONS stated in the main report that 895,135 deaths occurred between 1/4/2021 and 31/12/2022 and were reported by 4/1/2023. I tried to take out the under 18 deaths and so estimated that this means they are saying about 889,000 18+ deaths occurred between 1/4/2021 and 31/12/2022 and were reported by 4/1/2023.
And table 5 18+ deaths total 883,784 (which is about 99.4% of the 889,000 figure). So I assumed that table 5 deaths were close enough (within 0.6%) to all deaths of those in the 2021 census to be the same thing. That assumption may or may not be faulty but I worked with it all the same.
Noticing that, I then thought: why not take the ratio of table 2 deaths to table 5 deaths and compare it with the ratio of table 2 population to Census 2021 population? Because 99.4% is close to 100%, the table 2 population/2021 census population ratio should be roughly the same as the table 2 deaths/table 5 deaths ratio.
This is the chart that results.
https://ibb.co/bBTR5gd
The deaths are whole April 2021 to December 2022 period figures. The monthly death figure percentages which I've looked at but not posted here fluctuate fairly closely around the whole period average. So I've used the whole period here rather than the single month April 2021 percentages. The population percentage shouldn't really change much over the period so it's sensible to use the whole period to compare.
And the populations are at census day for the 2021 census population and the April 2021 average population for table 2 (using the 365/30 method). I've ignored the 4 week date difference in population dates.
Now you start to see the mismatch in the table 2 population by vax status and age, in that a much lower % of unvaxed deaths are included than vaxed deaths especially in the younger age groups.
I started to ask myself questions such as:
If about 70% of unvaxed 2021 census deaths (matched or unmatched to NHS number) are included in table 2, and about 92% of vaxed 2021 census deaths (matched or unmatched to NHS number), and if about 91% of the all-status 2021 population is included in the table 2 population, what is the equivalent of the 91% all-vax-statuses population figure when split by vax status? If that isn't also around 70%/92% then there's a bias. And even if it is 70%/92%, will the 30%/8% excluded give the same answers as the 70%/92% included? Hence some connection with your ghost population approach.
And more importantly how do these things compare by age band?
And what miscategorisations of vax status deaths, in table 2 and table 5 still exist?
Of course you are using the NIMS population figures (the unvaxed proportion from that is likely to be more accurate than the ONS estimates of course).
It's worth saying that if we unrealistically assume for the purposes of discussion that the ONS sample did include say 70% of unvaxed deaths and 70% of unvaxed population and 92% of vaxed deaths and 92% of vaxed population and these were entirely random samples, we would need to scale up the implied table 2 unvaccinated percentages to compare with the UKHSA weekly NIMS unvaccinated percentages.
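The scaling-up described above can be written out as a small calculation. It rests on the same unrealistic assumption stated in the comment, namely that the sample covers 70% of the unvaccinated and 92% of the vaccinated at random; the function name and the sample counts of 70,000 and 920,000 are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def rescale_unvaccinated_share(sample_unvax, sample_vax,
                               unvax_coverage=0.70, vax_coverage=0.92):
    """Scale each sample count up by its coverage and return the implied
    unvaccinated share of the full population."""
    full_unvax = sample_unvax / unvax_coverage
    full_vax = sample_vax / vax_coverage
    return full_unvax / (full_unvax + full_vax)

# Hypothetical sample counts.
sample_unvax, sample_vax = 70_000, 920_000

sample_share = sample_unvax / (sample_unvax + sample_vax)
implied_share = rescale_unvaccinated_share(sample_unvax, sample_vax)

print(f"Unvaccinated share in the sample: {sample_share:.1%}")
print(f"Implied share after scaling up:   {implied_share:.1%}")
```

With these made-up counts the unvaccinated share rises from about 7% in the sample to about 9% once each group is scaled by its own coverage, so the raw table 2 percentage would understate the figure to compare with NIMS.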