Covid

Bland-Altman Plot

Bland-Altman analysis is used to study the agreement between two measurements. Here is how it is created.

Step 1: Collect the two measurements

Sample_Data <- data.frame(
  A = c(6, 5, 3, 5, 6, 6, 5, 4, 7, 8, 9, 10, 11, 13, 10, 4, 15, 8, 22, 5),
  B = c(5, 4, 3, 5, 5, 6, 8, 6, 4, 7, 7, 11, 13, 5, 10, 11, 14, 8, 9, 4)
)

Step 2: Calculate the mean of measurement 1 and measurement 2 for each pair

Sample_Data$average <- rowMeans(Sample_Data[, c("A", "B")])

Step 3: Calculate the difference between measurement 1 and measurement 2

Sample_Data$difference <- Sample_Data$A - Sample_Data$B

Step 4: Calculate the limits of agreement. For 95% limits, take the mean difference plus and minus 1.96 standard deviations of the differences

mean_difference <- mean(Sample_Data$difference)
lower_limit <- mean_difference - 1.96*sd( Sample_Data$difference )
upper_limit <- mean_difference + 1.96*sd( Sample_Data$difference )

Step 5: Create a scatter plot with the mean on the X-axis and the difference on the Y-axis. Mark the limits and the mean of difference.

library(ggplot2)  # provides ggplot(), geom_point() and geom_hline()

ggplot(Sample_Data, aes(x = average, y = difference)) +
  geom_point(size = 3) +
  # red line: mean difference; green lines: limits of agreement
  geom_hline(yintercept = mean_difference, color = "red", lwd = 1.5) +
  geom_hline(yintercept = lower_limit, color = "green", lwd = 1.5) +
  geom_hline(yintercept = upper_limit, color = "green", lwd = 1.5) +
  ggtitle("") +
  ylab("Difference") +
  xlab("Average")
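As a numerical companion to the plot, here is a minimal sketch that prints the limits of agreement for the same 20 pairs and counts how many differences fall inside them. The names half_width, inside and share_inside are introduced only for this illustration.

```r
# Same 20 paired measurements as in Step 1
A <- c(6, 5, 3, 5, 6, 6, 5, 4, 7, 8, 9, 10, 11, 13, 10, 4, 15, 8, 22, 5)
B <- c(5, 4, 3, 5, 5, 6, 8, 6, 4, 7, 7, 11, 13, 5, 10, 11, 14, 8, 9, 4)

difference <- A - B
mean_difference <- mean(difference)

# 95% limits of agreement: mean difference +/- 1.96 standard deviations
half_width <- 1.96 * sd(difference)
lower_limit <- mean_difference - half_width
upper_limit <- mean_difference + half_width

# Which differences lie within the limits, and what share of the total?
inside <- difference >= lower_limit & difference <= upper_limit
share_inside <- mean(inside)

print(c(lower = lower_limit, mean = mean_difference, upper = upper_limit))
print(share_inside)
```

If the differences were roughly normally distributed, about 95% of the points would be expected to lie between the two green lines.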


Accuracy and Asymmetry

Let’s develop a simple prediction technique to identify the sex of a person based on height. The data come from 1050 participants, each recorded with a sex and a height in inches.

The first step is to plot them and check their distributions.

A naive way to set up the prediction is to assign everyone with height > 64 inches as male.

y_hat <- ifelse(heights$height > 64, "Male", "Female") 
mean(heights$sex == y_hat)

The answer is an impressive 83%

But how well did it predict individually?

# accuracy within each sex: share of actual males (females) predicted correctly
mean(heights$sex[heights$sex == "Male"] == y_hat[heights$sex == "Male"])
mean(heights$sex[heights$sex == "Female"] == y_hat[heights$sex == "Female"])

For males, the accuracy is about 94% and for females, it’s only 44%. The discrepancy prompts us to look at the respective number of samples in the set.

sum(heights$sex == "Female")
sum(heights$sex == "Male")

There are 238 females and 812 males, so the overall accuracy is dominated by the much larger male group.
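To see how the asymmetry produces the headline figure, here is a minimal sketch that recombines the two per-group accuracies quoted above, weighted by group size. It uses only the rounded numbers from the text (94%, 44%, 812 and 238); the variable names are introduced for this illustration.

```r
# Group sizes and per-group accuracies quoted in the text (rounded)
n_male <- 812
n_female <- 238
acc_male <- 0.94
acc_female <- 0.44

# Overall accuracy is the size-weighted average of the group accuracies
overall <- (acc_male * n_male + acc_female * n_female) / (n_male + n_female)

# Balanced accuracy gives both groups the same weight
balanced <- (acc_male + acc_female) / 2

print(round(overall, 2))   # about 0.83, the "impressive" overall figure
print(round(balanced, 2))  # 0.69, a less flattering summary
```

The overall number looks good mainly because the group that is predicted well is more than three times larger than the other one.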


News From Huanan Market

After a brief interval, here is some Covid news. A new peer-reviewed article is now available in Nature for preview. The study summarises the RNA sequence results of samples from the Huanan Seafood Market in Wuhan, which was linked to several of the early cases of the illness. After the market’s closure on 1 January 2020, 923 environmental and 457 animal samples were collected between 1 January and 2 March 2020. Here is the high-level summary:

Location                 # Samples    # +ve by RT-PCR
Huanan Seafood Market    718          40
Warehouses               14           5
Other markets            30           1
Drainage                 110          24
Sewerage wells           51           3
Total                    923          73

Summary of environmental sample results

Notably, 35 samples from February showed positive, suggesting a pretty long persistence of the viral material in the environment.

Of the 457 samples collected from animals belonging to 18 species, none of them tested positive for the virus.

Several samples had genetic material belonging to mammals of genera such as Homo (e.g. human), Ovis (e.g. sheep), Bos (e.g. cow) and Canis (e.g. dog). That is not proof that these animals were infected; it may only mean that sample collection focused on the shops and locations where animals were sold. The same goes for raccoon dogs as carriers: the study found their genetic material, but that only shows that the two things (the virus-carrying entity and the raccoon dogs) co-existed, and nothing further.

Reference

Surveillance of SARS-CoV-2 at the Huanan Seafood Market: Nature


The behavioural immune system

It is a term introduced by the psychological scientist Mark Schaller, describing mechanisms devised by animals, including humans, to counter microbes that cause infection. A simple example is the repulsion towards rotten food.

The behavioural immune system may be considered complementary to the body’s immunological defence. The latter consumes energy and is reactive: the pathogens first enter, and then the body produces compounds (e.g. antibodies) to counter them. A repulsive smell or taste, by contrast, stops us from consuming the contaminated food in the first place.

References

Mark Schaller, Phil. Trans. R. Soc. B (2011) 366, 3418–3426

Behavioural immune system: Wiki


Why Most Published Results are Wrong

It is the title of a famous analysis paper published by Ioannidis in 2005. While the article goes a bit deeper in its commentary, we check the basic understanding behind the claim – through Bayesian thinking.

Positive predictive value, the ability of analysis to predict the positive outcome correctly, is the posterior probability of an event based on prior knowledge and the likelihood. The definition of PPV in the language of Bayes’ theorem is,

P(T|C_T) = \frac{P(C_T|T) P(T) }{P(C_T|T) P(T) + P(C_T|nT) P(nT)}

P(T|C_T) – The probability that the hypothesis is true given it is claimed to be true (in a publication)
P(C_T|T) – The probability that the hypothesis is claimed to be true given it is true (true hypothesis proven correct)
P(T) – The prior probability of a true hypothesis
P(C_T|nT) – The probability that the hypothesis is claimed to be true given it is not true (false hypothesis not rejected = 1 – false hypothesis rejected)
P(nT) – The prior probability of an incorrect hypothesis (1 – P(T))

Deluge of data

The last few years have seen an exponential growth of correlations due to a flurry of information and technology breakthroughs. For example, the US government issues data on ca. 45,000 economic statistics, and an imaginative economist can find several million correlations among those, most of which are spurious. In other words, the proportion of causal relationships among these millions of correlations declines as more data arrive. In the language of our equation, the prior, P(T), drops.

Suppose the researcher can rightly identify a true hypothesis 80% of the time (which is quite impressive) and rightly reject an incorrect one at 90% accuracy. Yet, the overall success, PPV, is only 47% if the prior probability of a true relationship is only 1 in 10.

P(T|C_T) = \frac{0.8 \times 0.1}{0.8 \times 0.1 + 0.1 \times 0.9} = 0.47
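The same calculation can be sketched as a small R function, which makes it easy to see how the PPV collapses as the prior drops. The function name ppv and its argument names are assumptions made for this illustration; the 80% and 90% figures are the ones used above.

```r
# Positive predictive value from Bayes' theorem
#   sensitivity = P(C_T | T), chance of confirming a true hypothesis
#   specificity = 1 - P(C_T | nT), chance of rejecting a false hypothesis
#   prior       = P(T), share of tested hypotheses that are actually true
ppv <- function(sensitivity, specificity, prior) {
  true_claims <- sensitivity * prior
  false_claims <- (1 - specificity) * (1 - prior)
  true_claims / (true_claims + false_claims)
}

# The example from the text: 80%, 90% and a prior of 1 in 10
print(round(ppv(0.8, 0.9, 0.1), 2))

# Same researcher, different priors: 1 in 2, 1 in 10, 1 in 100
priors <- c(0.5, 0.1, 0.01)
print(round(ppv(0.8, 0.9, priors), 2))
```

With a prior of 1 in 100, fewer than one in ten claimed findings would be true, even though the researcher's skill has not changed.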

References

Why Most Published Research Findings Are False: John P. A. Ioannidis; PLoS Medicine, 2005, 2(8)

The Signal and the Noise: Nate Silver


Life Expectancy in Covid Times

The story of Covid-19 is almost three years old, and we are still coming to grips with the magnitude of the calamity it caused. One way of studying the impact is by mapping the change in life expectancy during the pandemic. Nature Human Behaviour has just published a paper on this topic, summarising data collected from 29 countries.

Understanding life expectancy

The calculations, known as period life expectancies, are not a prediction but an estimate of how long a newborn would live if today’s death rates persisted for her entire life. So these numbers vary from 2020, when Covid-19 was severe and no vaccines were available, to 2021, when some mitigations were in place, to 2022, when the disease was relatively less deadly.

LE deficit

The researchers covered the change in LE since 2019 in countries from Europe, plus the USA and Chile, using data on all-cause mortality. They also define a term, LE deficit, which is the difference between the observed LE and the LE expected from pre-pandemic trends. Consider this: a country estimates an LE of 80 years in the last quarter of 2021. Imagine it was 79 years in 2015 and was slowly moving upward (based on the trends from the past few years), so that the expectation for Q4 2021 was 82. Then the LE deficit is 82 – 80 = 2 years.
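The LE deficit idea can be sketched in a few lines of R. The pre-pandemic series below is hypothetical, chosen only to match the worked example (79 years in 2015, rising towards 82 by 2021), and the straight-line extrapolation is an assumption for illustration; the paper's own forecasting method may well be different.

```r
# Hypothetical pre-pandemic life expectancy (years), rising 0.5 per year
pre_pandemic <- data.frame(
  year = 2015:2019,
  le = c(79, 79.5, 80, 80.5, 81)
)

# Fit a straight-line trend and extrapolate it to 2021
trend <- lm(le ~ year, data = pre_pandemic)
expected_2021 <- unname(predict(trend, newdata = data.frame(year = 2021)))

# Observed value from the worked example
observed_2021 <- 80

# LE deficit = expected (from the pre-pandemic trend) minus observed
le_deficit <- expected_2021 - observed_2021
print(round(le_deficit, 2))
```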

There are a bunch of findings worth mentioning here:
Of the countries under investigation, only Finland, Norway and Denmark did not see a decline in LE (in comparison with 2019) in 2020.
Many Western European countries bounced back in 2021, i.e. showed a positive change in LE from 2020 to 2021, whereas most of Eastern Europe, the USA and Chile continued to fall.

One impressive trend was the correlation between vaccination coverage and life expectancy deficit.

Life expectancy changes since COVID-19: Nature Human Behaviour


Covid and Smoking

A paper was published in April 2020 on the open science platform, Qeios. The topic was the potential benefit of tobacco smoking to protect against Covid-19.

The conclusions in the article were based on data from observational studies and not randomised clinical trials. We have already discussed the issues that arise from observational studies, collider bias being one of them.

Collider bias happens when two variables, e.g., risk factor and outcome, influence a third, namely, the likelihood of being sampled. In our case, the sampling occurred on or before April 2020, in the earlier part of the pandemic. As you may recall, testing was in the developing stages, and the focus was on front-line health workers and patients with severe symptoms. In technical terms, the sample was not random or representative.

Therefore, the data space narrowed down to health workers, and within those, there were smokers and non-smokers. As a consequence of the testing strategy, the survey censored out the smokers who had no symptoms. And this exaggerated the proportion of non-smokers among those with symptoms in the sample.
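The mechanism can be illustrated with a small simulation. This is a generic sketch of collider bias, not a reconstruction of the smoking data: risk_factor and outcome are hypothetical standardised scores that are independent by construction, and the selection rule (sampled when their sum exceeds 1) is an assumption chosen only to make the effect visible.

```r
set.seed(1)
n <- 10000

# Two variables that are independent in the full population
risk_factor <- rnorm(n)
outcome <- rnorm(n)

# Both raise the chance of ending up in the sample (the collider)
sampled <- (risk_factor + outcome) > 1

# Correlation in the whole population versus in the selected sample
cor_population <- cor(risk_factor, outcome)
cor_sample <- cor(risk_factor[sampled], outcome[sampled])

print(round(cor_population, 2))
print(round(cor_sample, 2))
```

The population correlation stays close to zero, while the selected sample shows a clear negative association that exists only because of how the sample was drawn. In real data, the direction and size of the distortion depend on how the selection actually worked.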

References

Low incidence of daily active tobacco smoking in patients with symptomatic COVID-19: Qeios, CC-BY 4.0 · Article, April 21, 2020

Collider bias undermines our understanding of COVID-19 disease risk and severity: Nature Communications, 2020, 11:5749

Randomised Controlled Trials: BMJ


Based on a Lancet Study …

In this post, we talk about an article that otherwise requires no special mention in this space. Yet, we discuss it today, perhaps as an illustration of 1) the diverse objectives that scientific researchers set for their work and 2) how the ever-imaginative media, and subsequently the public, could interpret the messages. Before we examine the motivation or the results, we need to understand something about the publication status of the study.

Preprints with The Lancet 

It is a non-peer-reviewed work or preprint and, therefore, is not a published article in the Lancet, at least for now. The SSRN page, the repository at which it appeared, further states that it was not even necessarily under review with a Lancet journal. So, a preprint with The Lancet is not equivalent to a publication by the Lancet.

The motivation

You may read it from the title: Randomised clinical trials of COVID-19 vaccines: do adenovirus-vector vaccines have beneficial non-specific effects? It is a review paper, and the investigators specifically wanted to understand the impact of Covid-19 vaccines on non-Covid diseases, which, I think, is a valid reason for the research. By the way, you have every right to ask why Covid-19 vaccines should impact accidents and suicides!

Motivated YouTubers

The following line from the abstract turned out to be the key attraction for the YouTuber scientist. It reads: “For overall mortality, with 74,193 participants and 61 deaths (mRNA:31; placebo:30), the relative risk (RR) for the two mRNA vaccines compared with placebo was 1.03“. Now, ignore the first three words, “For overall mortality”, add The Lancet, and you get a good title and guaranteed clicks! 

The results

First, the results from mRNA vaccines (Pfizer and Moderna):

Cause of death                  Deaths/total      Deaths/total      Relative
                                Vaccine group     Placebo group     Risk (RR)
Overall mortality               31/37110          30/37083          1.03
Covid-19 mortality              2/37110           5/37083           0.4
CVD mortality                   16/37110          11/37083          1.45
Other non-Covid-19 mortality    11/37110          12/37083          0.92
Accidents                       2/37110           2/37083           1.00
Non-accidents, Non-Covid-19     27/37110          23/37083          1.17

In my opinion, the key messages from the table are:
1) The number of deaths due to Covid-19 is too small to make any meaningful inference
2) The deaths due to other causes show no clear trends upon vaccination

Results from adenovirus-vector vaccines (several studies combined):

Cause of death                  Deaths/total      Deaths/total      Relative Risk
                                Vaccine group     Placebo group     (crude RR)
Overall mortality               16/72138          30/50026          0.37
Covid-19 mortality              2/72138           8/50026           0.17
CVD mortality                   0/72138           5/50026           0.00
Other non-Covid-19 mortality    8/72138           11/50026          0.50
Accidents                       6/72138           6/50026           0.69
Non-accidents, Non-Covid-19     8/72138           16/50026          0.35

My message is:
A chance accumulation of non-Covid-19 deaths in the placebo group (five of them cardiovascular) gives an edge to the vaccine group and, therefore, appears to “save” people immunised with adenovirus-vector vaccines from dying of other causes, including accidents, in some countries! With so few deaths, the statistical significance is dubious.
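The relative risks can be checked directly from the death counts in the two tables. Below is a minimal sketch; the helper names relative_risk and rr_ci are introduced here, the ratio is the crude one (vaccine death rate divided by placebo death rate), and the confidence interval uses the standard large-sample approximation on the log scale, which is an assumption of this sketch and not necessarily the method used in the paper.

```r
# Crude relative risk: death rate in the vaccine arm / death rate in placebo
relative_risk <- function(d_vaccine, n_vaccine, d_placebo, n_placebo) {
  (d_vaccine / n_vaccine) / (d_placebo / n_placebo)
}

# Approximate 95% confidence interval for the RR (log-scale normal approximation)
rr_ci <- function(d_vaccine, n_vaccine, d_placebo, n_placebo) {
  rr <- relative_risk(d_vaccine, n_vaccine, d_placebo, n_placebo)
  se_log <- sqrt(1 / d_vaccine - 1 / n_vaccine + 1 / d_placebo - 1 / n_placebo)
  c(lower = rr * exp(-1.96 * se_log), rr = rr, upper = rr * exp(1.96 * se_log))
}

# Overall mortality, mRNA vaccines and adenovirus-vector vaccines
rr_mrna <- relative_risk(31, 37110, 30, 37083)
rr_adeno <- relative_risk(16, 72138, 30, 50026)
print(round(c(mRNA = rr_mrna, adenovirus = rr_adeno), 2))

# Covid-19 mortality in the mRNA trials: 2 vs 5 deaths
ci_covid_mrna <- rr_ci(2, 37110, 5, 37083)
print(round(ci_covid_mrna, 2))
```

With only seven Covid-19 deaths in total, the interval around 0.4 is wide enough to include 1, which is why the first table supports no firm inference.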

Lessons learned

1) Be extremely careful before accepting commentaries about scientific work (including this post)
2) As much as possible, find out and read the original paper after being enlightened by YouTube teachers.

Randomised clinical trials of COVID-19 vaccines: do adenovirus-vector vaccines have beneficial non-specific effects?: Benn et al.


Risks vs Benefit – mRNA Against CoVid-19

You may read this post as the continuation of the one I made last year. Evaluate the risk caused by an action by comparing it with situations without that action. That is the core of the risk-benefit trade-off in decision-making. A third factor is missing in the equation, namely, the cost.

A new study published in The Lancet is the basis for this post. The report compiles the incidents of myocarditis and pericarditis, two well-known side effects linked to the mRNA vaccines against COVID-19. The data covered four health claim databases in the US and more than 15 million individuals.

The results

First, the overall summary: the data from four Data Partners (DP) indicate 411 events among the roughly 15 million vaccinated individuals studied. The details provided by each DP are:

Data Partner    Total         Observed myocarditis or    Expected events (E)    O/E
(DP)            vaccinated    pericarditis events (O)    (based on 2019)
DP1             6,245,406     154                        N/A
DP2             2,169,398     64                         24.96                  2.56
DP3             3,573,097     94                         40.08                  2.35
DP4             3,160,468     99                         44.61                  2.22

I don’t think you will demand a chi-squared test to get convinced that the two mRNA vaccines have an adverse effect on heart health. Age-wise split of the data gives further insights into the story.

Age-group    Observed    Total         Incident Rate    Expected Rate
             Events      vaccinated    (per 100,000)    (per 100,000)
18-25        153         1,972,410     7.76             0.99
26-35        62          2,587,814     2.40             0.95
36-45        63          3,226,022     1.95             1.11
46-55        62          3,597,292     1.72             1.3
56-64        71          3,764,831     1.89             1.63

The relative risk is much higher for younger – 18 to 35 – age groups. But the absolute risk of the event is still in the single digits per hundred thousand. And this is where we should look at the risk-benefit-cost trade-off of decision-making.
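Both views of the risk, relative and absolute, can be recomputed from the age-wise table. In this minimal sketch the data frame and its column names are introduced for illustration; the numbers are the ones in the table.

```r
# Age-wise data as given in the table
age <- data.frame(
  group = c("18-25", "26-35", "36-45", "46-55", "56-64"),
  observed = c(153, 62, 63, 62, 71),
  vaccinated = c(1972410, 2587814, 3226022, 3597292, 3764831),
  expected_rate = c(0.99, 0.95, 1.11, 1.3, 1.63)  # per 100,000
)

# Observed incident rate per 100,000 vaccinated
age$rate <- age$observed / age$vaccinated * 1e5

# Relative view: observed rate divided by expected rate
age$ratio <- age$rate / age$expected_rate

# Absolute view: extra events per 100,000 vaccinated
age$excess <- age$rate - age$expected_rate

print(data.frame(group = age$group,
                 rate = round(age$rate, 2),
                 ratio = round(age$ratio, 2),
                 excess = round(age$excess, 2)))
```

The ratio is largest for the youngest group, while the excess stays within a few events per 100,000 in every group.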

The risk

First and foremost, don’t assume that all those 411 individuals died from myocarditis or pericarditis; more than 99% recover. To know that, you need to read another study, published in December 2021, which reported the total number of deaths as just 8! So, there is a risk, but the absolute value is low. Awareness of the risk should alert the recipients that any discomfort after the vaccination warrants a medical checkup.

The benefit

It would be a crime to forget the unimaginable calamity that the disease has brought to the US, with more than a million people dying from it. A significant portion of those deaths happened before the introduction of the vaccines, and even after that, the casualties fell disproportionately on the unvaccinated.

The cost

At least in this case, the cost is a non-factor. The vaccine price, be it one dollar or 10 dollars, is way lower than the cost of the alternatives: buying medicines, hospitalisation or death.

Managing trade-off

Different countries manage this trade-off differently. Since the risk of complications due to COVID-19 is much lower for children and the youth, some allocate a lower priority to the younger age groups or assign a different vaccine. However, it is recognised that avoiding their vaccination altogether, due to their low-risk status, is also not an answer to the problem. It can elevate the prevalence of illness in the system and jeopardise the elders with extra exposure to the virus.

References

Risk of myocarditis and pericarditis after the COVID-19 mRNA vaccination in the USA: The Lancet

Myocarditis after COVID-19 mRNA vaccination: Nature Reviews Cardiology

How to Compare COVID Deaths for Vaccinated and Unvaccinated People: Scientific American


Covid 19 Excess Death – 2

We have already seen how the excess death rates (deaths per 100,000 population) due to Covid are distributed. The 25th percentile stands at 130 and the 75th at 423 (as of 31st December 2021). The statistics of the death rates are represented using a box plot.

The global distribution of excess death is sketched below:

Case of missed opportunity?

With all the support of hindsight, let us explore how many of these deaths could have been avoided (perhaps in the next pandemic!). Start with the top performers (the countries in green). These are true outliers, and let us not fancy replicating their model. Australia, New Zealand, China, Singapore and Brunei are countries that opted for zero-Covid policies, at least until a significant portion of their population received vaccines. They closed down countries and regions for the whole of 2020 and the majority of 2021.

Bolivia tops the list in terms of excess deaths per million population, at 1376. The numbers were bad from the beginning, and inadequate restriction measures, thanks to the chaotic political establishment after the ouster of the then-president, Evo Morales, did not help its cause. Even today, Bolivia is far behind in vaccination rates.

While the exact reasons why Bulgaria is second in the global death charts are not known, I suppose it was not a coincidence that the country was the least vaccinated in Europe, at just 27% by December 2021. For Peru, the story was poverty and a lack of medical supplies and oxygen. The Delta variant and slow vaccinations are cited as the major reasons for the death toll in Russia.

The magenta countries

Brazil may be the model case of what not to do in a pandemic. The pandemic response was lax, and most of the deaths happened in the first two waves, before the large-scale vaccination programs.

The US is an intriguing example. On the one hand, one can argue that the death rate of around 300 per 100,000 is the limit of what this disease can do with moderate barriers of disease control and a reasonable rate of vaccination. But the question will remain why the country could not do what its neighbour Canada managed (115). Spain, too, belongs to this category and is one of the countries that got battered in the first wave. The reason: no real preparedness, as it was one of the earliest countries (after China and Italy) to be hit by the virus.

A final word on the country that topped the list of total excess deaths, with about 4 million! India started with one of the most stringent Covid measures in the world (the shutdown of March-May 2020). Yet the country could not cope with the tides of the two waves, one starting in June 2020 and then the Delta wave of 2021, while decent vaccination levels were still far away.

Reference

Estimating excess mortality due to the COVID-19 pandemic: a systematic analysis of COVID-19-related mortality, 2020–21: The Lancet

Covid Response: Bolivia

Covid Response: Peru

Covid Case: Spain
