Data & Statistics

Tukey’s Method Continued

Here are the sampling results of a product from four suppliers, A, B, C and D (Data courtesy: https://statisticsbyjim.com/).

   A      B      C      D
40.0   37.9   36.0   38.0
36.9   26.2   39.4   40.8
33.4   24.9   36.3   45.9
42.3   30.3   29.5   40.4
39.1   32.6   34.9   39.9
34.7   37.5   39.8   41.4

Hypotheses

H0 – All means are equal
HA – Not all means are equal

Input the data

library(tibble)   # provides as_tibble()

PO_data <- read.csv("./Anova_Tukey.csv")
as_tibble(PO_data)

Leads to the output (first ten entries)

Material Strength
<chr>     <dbl>
B	  37.9			
C	  36.0			
D	  38.0			
A	  40.0			
A	  36.9			
C	  39.4			
A	  33.4			
B	  26.2			
B	  24.9			
B	  30.3

Plot the data

par(bg = "antiquewhite")
colors = c("red","blue","green", "yellow")
boxplot(PO_data$Strength ~ factor(PO_data$Material), xlab = "Supplier", ylab = "Material Data", col = colors)

F-test for ANOVA

str.aov <- aov(Strength ~ factor(Material), data = PO_data)
summary(str.aov)

Output:

                 Df Sum Sq Mean Sq F value Pr(>F)   
factor(Material)  3  281.7    93.9   6.018 0.0043 **
Residuals        20  312.1    15.6                  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Reject null hypothesis

We can at least reject the null hypothesis that all means are equal, because the p-value is below 0.05 (the chosen significance level). The F-value is 6.018. Another way of reaching the same conclusion is to find the critical F value for the degrees of freedom, df1 = 3 and df2 = 20.

qf(0.05, 3,20, lower.tail=FALSE)
pf(6.018, 3, 20, lower.tail = FALSE)

Lead to

3.098391      # F-critical
0.004296141   # p-value for F = 6.018

Tukey’s test for multiple comparisons of means

TukeyHSD(str.aov)
Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = Strength ~ factor(Material), data = PO_data)

$`factor(Material)`
         diff        lwr        upr     p adj
B-A -6.166667 -12.549922  0.2165887 0.0606073
C-A -1.750000  -8.133255  4.6332553 0.8681473
D-A  3.333333  -3.049922  9.7165887 0.4778932
C-B  4.416667  -1.966589 10.7999220 0.2449843
D-B  9.500000   3.116745 15.8832553 0.0024804
D-C  5.083333  -1.299922 11.4665887 0.1495298

Interpreting pair-wise differences

You can see that the D-B difference, 9.5, is statistically significant, with an adjusted p-value of 0.0025. And, as expected, the 95% confidence interval for D-B does not include 0 (no difference between D and B).

Incidentally, this message, the difference between the blue box (B) and the yellow box (D), was already apparent if you paid attention to the box plot we made at the beginning.
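Base R can also draw these pair-wise intervals straight from the Tukey output; here is a minimal sketch using the same str.aov fit.

# Plot the 95% family-wise confidence intervals for all six pairs;
# the interval that stays clear of zero (D-B) is the significant one.
par(bg = "antiquewhite", las = 1)
plot(TukeyHSD(str.aov))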

Reference

Hypothesis Testing: An Intuitive Guide: Jim Frost


Tukey’s Method: Who Made the Difference?

In the previous ANOVA exercises, we found that the data suggested rejecting the null hypothesis. To remind you of the two hypotheses,
H0 – All means are equal
HA – Not all means are equal
So, we at least rejected the proposition that all means are equal, because the p-value was lower than the chosen significance level of 0.05 (equivalently, the F value exceeded the critical F value corresponding to the 0.05 level). But we have no idea which pairs of means differ significantly.

Tukey's method creates confidence intervals for all pair-wise differences while controlling the family-wise error rate at whatever level we specify.

Family error rate

You know what an error rate is: it is the probability of rejecting the null hypothesis when it is, in fact, correct. At a significance level of 0.05, there is a 5% chance of such an outcome when the null hypothesis is correct. This situation is called a false positive.

The p-value we obtained for the material testing problem was 0.03, but the test covered the entire family of four vendor groups (each with ten samples). Since our significance level for that F-test was kept at 0.05, we can regard the experiment-wise, or family-wise, error rate as 5%.

Four groups, six comparisons

Since we had four groups (factors) of samples, each representing one vendor, we have six possible comparisons. They are:

#   Comparison
1   Vendor 2 - Vendor 1
2   Vendor 3 - Vendor 1
3   Vendor 4 - Vendor 1
4   Vendor 3 - Vendor 2
5   Vendor 4 - Vendor 2
6   Vendor 4 - Vendor 3

The family-wise error rate, 0.05, covers the union of all pair-wise error rates. If the pair-wise error rate is alpha, the family-wise error rate is 1 – (1 – alpha)^C, where C is the number of comparisons. If you substitute alpha = 0.05 and C = 6, you get a family-wise error rate of about 0.26. Obviously, 26% is too high.

Tukey's method preserves the family-wise error rate at the level we specify, say 0.05, and the corresponding pair-wise error rate then works out to about 0.0085.
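The 0.26 and the 0.0085 are just the two directions of the same formula; a quick check in R:

C <- 6                          # number of pair-wise comparisons
alpha_pair <- 0.05
1 - (1 - alpha_pair)^C          # family-wise error rate if each pair is tested at 0.05: ~0.26

alpha_family <- 0.05
1 - (1 - alpha_family)^(1/C)    # pair-wise rate that keeps the family-wise rate at 0.05: ~0.0085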

With all these points in mind, let's perform Tukey's method on our dataset using R.

TukeyHSD(res.aov)   # res.aov is the aov() fit of Strength ~ factor(Sample) on AN_data from the one-way ANOVA post

Which leads to the following output:

Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = Strength ~ factor(Sample), data = AN_data)

$`factor(Sample)`
                         diff        lwr       upr     p adj
Vendor 2-Vendor 1 -2.26479763 -4.7917948 0.2621995 0.0924842
Vendor 3-Vendor 1 -0.51997359 -3.0469707 2.0070236 0.9448076
Vendor 4-Vendor 1 -2.36456760 -4.8915647 0.1624295 0.0736423
Vendor 3-Vendor 2  1.74482404 -0.7821731 4.2718212 0.2632257
Vendor 4-Vendor 2 -0.09976996 -2.6267671 2.4272272 0.9995613
Vendor 4-Vendor 3 -1.84459400 -4.3715911 0.6824031 0.2197059

In the next post, we will do a complete exercise of ANOVA including the Post Hoc test.

Hypothesis Testing: An Intuitive Guide: Jim Frost


F – Statistics

We have seen how the F-statistic works to test the hypotheses in one-way ANOVA. We also know the definition of F as a ratio of two variances. Variance measures how spread out the data are around the mean, estimated as the sum of squared deviations divided by the degrees of freedom. In case you forgot: take the square root of the variance and you get the standard deviation.

F = Between groups variance / Within-group variance

F-tests use F-distribution

Recall how we used the t-distribution or the binomial distribution to determine the probability of the observed result when the null hypothesis is true. The F-distribution, too, has a characteristic shape and is based on two parameters, the degrees of freedom df1 and df2, used in the numerator and denominator, respectively.

In the case of the material strength problem we have been working on in the past two posts (four groups with ten samples each, leading to df1 = 4 - 1 = 3 and df2 = 4 x (10 - 1) = 36), the F-distribution appears in the following form.

One way to understand the above plot is to imagine repeating the sampling many times (keeping four vendors and taking ten samples each, so that df1 and df2 stay the same) while the null hypothesis is true. You calculate the F value each time. If you then plot the frequency of those F values, you get a curve similar to the one above.
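That thought experiment is easy to run in R. Here is a minimal sketch; the mean and standard deviation of the simulated strengths are arbitrary, since under the null hypothesis all four groups share the same distribution anyway.

set.seed(1)
f_vals <- replicate(10000, {
  strength <- rnorm(40, mean = 10, sd = 2)    # 40 observations, no group effect
  vendor <- factor(rep(1:4, each = 10))       # four groups of ten
  summary(aov(strength ~ vendor))[[1]][["F value"]][1]
})
hist(f_vals, breaks = 60, freq = FALSE, xlab = "F", main = "F values when the null is true")
curve(df(x, df1 = 3, df2 = 36), add = TRUE, lwd = 2)   # theoretical F(3, 36) density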


One-way ANOVA – by Hand

Let's do the ANOVA step by step. We use the F-statistic to decide whether to reject the null hypothesis by comparing it with the critical F value. Once you have the F-value, you can also calculate the corresponding p-value and compare it with the chosen significance level.
The definition of F-statistic is

F = Between groups variance / Within-group variance

Between groups variance

Here, you estimate how far each group statistic deviates from the global statistic. In other words, you determine the mean of each group and the global mean (the mean of all the data, or the mean of the group means when the groups are equal in size). Then you take each group's difference from the global mean, square it, weight it by the group size, add everything up and divide by the degrees of freedom, just as you would for an ordinary variance.

Recall the previous example (strength of materials by four vendors). So you have four groups, each containing ten samples. First, estimate four means and the global mean. They are:

                          Vendor 1             Vendor 2             Vendor 3              Vendor 4
Mean                      11.2                 8.94                 10.68                 8.84
Samples                   10                   10                   10                    10
Global mean               9.915
Square for factor         10*(11.2-9.915)^2    10*(8.94-9.915)^2    10*(10.68-9.915)^2    10*(8.84-9.915)^2
Sum of squares (factor)   43.62
Degrees of freedom        4 - 1 = 3

The numerator (mean square of the factor) is calculated by dividing the sum of squares for the factor by its degrees of freedom, i.e., 43.62/3 = 14.54.

Within-group variance

Here, you pool the variation inside the groups: multiply each group's variance by its degrees of freedom, add these up and then divide by the sum of the degrees of freedom of the groups.

                            Vendor 1     Vendor 2     Vendor 3     Vendor 4
Samples                     10           10           10           10
Degrees of freedom
(samples - 1)               9            9            9            9
Within-group squares
for error (variance x df)   35.81        79.93        10.94        31.78
                            (3.98 x 9)   (8.88 x 9)   (1.22 x 9)   (3.53 x 9)
Sum of within-group
squares for error           158.466
Total degrees of freedom    36

The denominator (mean square of the error) is calculated by dividing the sum of the within-group squares for error by the total degrees of freedom, i.e., 158.466/36 = 4.402.

F-statistic = 14.54 / 4.402 = 3.30
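The same arithmetic in R, using the group means and variances from the tables above (the small differences from 43.62 and 3.30 come from rounding the means):

means <- c(11.2, 8.94, 10.68, 8.84)    # vendor means from the table
vars  <- c(3.98, 8.88, 1.22, 3.53)     # vendor variances from the table
n <- 10                                # samples per vendor

ss_factor <- sum(n * (means - mean(means))^2)    # ~43.4 (43.62 with the unrounded data)
df_factor <- length(means) - 1                   # 3
ss_error  <- sum(vars * (n - 1))                 # ~158.5
df_error  <- length(means) * (n - 1)             # 36

(ss_factor / df_factor) / (ss_error / df_error)  # F-statistic, ~3.3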

This 3.30 is then compared with the critical F-value corresponding to the chosen significance level, 0.05 in the present case. You can either look it up in an F-distribution table or use the R function:

qf(0.05, 3,36, lower.tail=FALSE)

The critical value is 2.87. Since our F-statistic is larger than 2.87, we reject the null hypothesis. The p-value turns out to be 0.031.

pf(3.303, 3, 36, lower.tail = FALSE)


One-way ANOVA

We know how to use a 1-sample t-test to perform a hypothesis test on the mean of a single group and a 2-sample t-test to compare two groups. The scope of t-tests ends there; two is the limit. What happens when you have three groups of data? When the number of groups goes beyond two, you use the analysis of variance, or ANOVA. As we have done before, we will run the ANOVA in R.

Comparing four vendors

ANOVA requires an independent variable (a categorical factor) and a dependent variable (continuous). The following table shows what I mean. The data used in the analysis are taken from https://statisticsbyjim.com/. The strength of a certain material from four vendors is available, and we want to determine whether there is a statistically significant difference between the mean strengths. Here is how a few selected entries appear (out of 40).

Vendor      Strength
Vendor 1    11.71
Vendor 1    11.981
Vendor 1    8.043
Vendor 2    7.77
Vendor 2    10.74
Vendor 2    10.72
Vendor 3    9.65
Vendor 3    8.79
Vendor 3    10.86
Vendor 4    6.97
Vendor 4    9.16
Vendor 4    8.67

Note that the categorical factor, the vendor name (1, 2, 3 and 4), divides the continuous data into four groups. Before performing the ANOVA, we plot the data and check how they look.

par(bg = "antiquewhite")
boxplot(AN_data$Strength ~ factor(AN_data$Vendor), xlab = "Vendor", ylab = "Strength of Material")

The null and alternate hypotheses are:

H0 = the four mean strengths are equal
HA = not all four mean strengths are equal

Doing ANOVA is pretty easy; the following commands will do the job.

sum_aov <- aov(AN_data$Strength ~ factor(AN_data$Vendor))
summary(sum_aov)
                      Df Sum Sq Mean Sq F value Pr(>F)  
factor(AN_data$Vendor)  3  43.62  14.540   3.303 0.0311 *
Residuals              36 158.47   4.402                 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

We test statistical significance with the F-test. And here is the most important thing: the p-value is 0.031, less than the 0.05 we chose. So the null hypothesis is rejected. We will see what all these numbers mean in the next post.
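By the way, if you want the p-value as a number rather than reading it off the table, one way to pull it out of the summary object is:

p_val <- summary(sum_aov)[[1]][["Pr(>F)"]][1]   # ~0.031
p_val < 0.05                                    # TRUE, so reject the null hypothesis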

Hypothesis Testing: An Intuitive Guide: Jim Frost


Based on a Lancet Study …

In this post, we discuss an article that otherwise requires no special mention in this space. Yet, we discuss it today, perhaps as an illustration of 1) the diverse objectives that scientific researchers set for their work and 2) how the ever-imaginative media, and subsequently the public, could interpret the messages. Before we examine the motivation or the results, we need to understand something about the study’s publication status.

Preprints with The Lancet 

It is a non-peer-reviewed work or preprint and, therefore, is not a published article in the Lancet, at least for now. The SSRN page, the repository at which it appeared, further states that it was not even necessarily under review with a Lancet journal. So, a preprint with The Lancet is not equivalent to a publication by the Lancet.

The motivation

You may read it from the title: Randomised clinical trials of COVID-19 vaccines: do adenovirus-vector vaccines have beneficial non-specific effects? It is a review paper, and the investigators specifically wanted to understand the impact of COVID-19 vaccines on non-COVID diseases, which, I think, is a valid reason for the research. By the way, you have every right to ask why COVID-19 vaccines should impact accidents and suicides!

Motivated YouTubers

The following line from the abstract turned out to be the key attraction for the YouTuber scientist. It reads: “For overall mortality, with 74,193 participants and 61 deaths (mRNA: 31; placebo: 30), the relative risk (RR) for the two mRNA vaccines compared with placebo was 1.03”. Now, ignore the first three words, “For overall mortality”, add The Lancet, and you get a good title and guaranteed clicks!

The results

First, the results from mRNA vaccines (Pfizer and Moderna):

Cause of death                  Deaths/total (vaccine)   Deaths/total (placebo)   Relative risk (RR)
Overall mortality               31/37110                 30/37083                 1.03
Covid-19 mortality              2/37110                  5/37083                  0.4
CVD mortality                   16/37110                 11/37083                 1.45
Other non-Covid-19 mortality    11/37110                 12/37083                 0.92
Accidents                       2/37110                  2/37083                  1.00
Non-accident, non-Covid-19      27/37110                 23/37083                 1.17

In my opinion, the key messages from the table are:
1) The number of deaths due to Covid-19 is too small to make any meaningful inference
2) The deaths due to other causes show no clear trends upon vaccination
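For reference, the relative risk in such a table is just the ratio of the two event rates; for the overall-mortality row, for example:

rr <- (31 / 37110) / (30 / 37083)   # death rate in the vaccine arm over the placebo arm
round(rr, 2)                        # 1.03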

Results from the adenovirus-vector vaccines (several studies combined); the RR column here is the simple ratio of the two event rates shown:

Cause of death                  Deaths/total (vaccine)   Deaths/total (placebo)   Relative risk (RR)
Overall mortality               16/72138                 30/50026                 0.37
Covid-19 mortality              2/72138                  8/50026                  0.17
CVD mortality                   0/72138                  5/50026                  0.00
Other non-Covid-19 mortality    8/72138                  11/50026                 0.50
Accidents                       6/72138                  6/50026                  0.69
Non-accident, non-Covid-19      8/72138                  16/50026                 0.35

My message:
A chance accumulation of non-Covid-19 deaths in the placebo group (five of them cardiovascular) gives an edge to the vaccine group and, therefore, appears to “save” people immunised with adenovirus-vector vaccines from dying of other causes, including accidents, in some countries! The numbers of cases are too small for the differences to carry much statistical weight.

Lessons learned

1) Be extremely careful before accepting commentaries about scientific work (including this post)
2) As much as possible, find out and read the original paper after being enlightened by YouTube teachers.

Randomised clinical trials of COVID-19 vaccines: do adenovirus-vector vaccines have beneficial non-specific effects?: Benn et al.


Risks vs Benefit – mRNA Against CoVid-19

You may read this post as a continuation of the one I made last year. Evaluate the risk caused by an action by comparing it with the situation without that action: that is the core of the risk-benefit trade-off in decision-making. A third factor, often missing from the equation, is the cost.

A new study published in The Lancet is the basis for this post. The report compiles the incidence of myocarditis and pericarditis, two well-known side effects linked to the mRNA vaccines against COVID-19, from four health claim databases in the US covering more than 15 million individuals.

The results

First, the overall summary: the data from the four Data Partners (DP) indicate 411 events among the roughly 15 million vaccinated individuals studied. Details from each of the DPs are below.

Data Partner (DP)   Total vaccinated   Observed myocarditis or     Expected events (E)   O/E
                                       pericarditis events (O)     (based on 2019)
DP1                 6,245,406          154                         N/A                   N/A
DP2                 2,169,398          64                          24.96                 2.56
DP3                 3,573,097          94                          40.08                 2.35
DP4                 3,160,468          99                          44.61                 2.22

I don't think you will demand a chi-squared test to be convinced that the two mRNA vaccines have an adverse effect on heart health. An age-wise split of the data gives further insight into the story.

Age group   Observed events   Total vaccinated   Incidence rate   Expected rate
                                                 (per 100,000)    (per 100,000)
18-25       153               1,972,410          7.76             0.99
26-35       62                2,587,814          2.40             0.95
36-45       63                3,226,022          1.95             1.11
46-55       62                3,597,292          1.72             1.3
56-64       71                3,764,831          1.89             1.63

The relative risk is much higher for the younger (18 to 35) age groups. But the absolute risk of the event is still in the single digits per hundred thousand. And this is where we should look at the risk-benefit-cost trade-off in decision-making.
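A quick back-of-the-envelope check of the observed rates in the age table, using the numbers shown above:

events     <- c(153, 62, 63, 62, 71)                             # observed events per age group
vaccinated <- c(1972410, 2587814, 3226022, 3597292, 3764831)     # total vaccinated per age group
round(events / vaccinated * 1e5, 2)   # 7.76 2.40 1.95 1.72 1.89 per 100,000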

The risk

First and foremost, don't assume that all those 411 individuals died from myocarditis or pericarditis; more than 99% recover. To see that, you need to read another study, published in December 2021, which reported the total number of deaths as just 8! So there is a risk, but its absolute value is low. Awareness of the risk should alert recipients that any discomfort after vaccination warrants a medical checkup.

The benefit

It would be a crime to forget the unimaginable calamity the disease has brought to the US, with more than a million people dying from it. A significant portion of those deaths happened before the introduction of the vaccines, and even after, the casualties fell disproportionately on the unvaccinated compared with the vaccinated.

The cost

At least in this case, the cost is a non-factor. The vaccine price, be it one dollar or ten, is far lower than the cost of the alternatives: buying medicines, hospitalisation or death.

Managing trade-off

Different countries manage this trade-off differently. Since the risk of complications from COVID-19 is much lower for children and young adults, some assign a lower priority to the younger age groups or offer them a different vaccine. However, it is recognised that skipping their vaccination altogether, because of their low-risk status, is not an answer to the problem either. It can raise the prevalence of illness in the population and jeopardise the elderly through extra exposure to the virus.

References

Risk of myocarditis and pericarditis after the COVID-19 mRNA vaccination in the USA: The Lancet

Myocarditis after COVID-19 mRNA vaccination: Nature Reviews Cardiology

How to Compare COVID Deaths for Vaccinated and Unvaccinated People: Scientific American


The Ultimatum Game – The Kahneman Experiment

In yet another Kahneman experiment, the team tried to play the ultimatum game with a group of psychology and business administration students. If you forgot what the game was, here is the description.

The game

Experiment 1

In their experiment, player A was paired with player B at random, and there were several such pairs. Each duo got $10, to be divided between the two as proposed by one of them. If player B accepted player A's proposed division, the payoffs were made accordingly. If the proposed division was unacceptable to player B, neither got anything.

Much to their surprise, because it violated the standard game-theory prediction, the researchers found that the majority (75%) of the participants split the money equally. Some of the proposals were also rejected.

Experiment 2

The experiment had two parts. The first part was the ultimatum game with a few differences: the subjects had only two ways to divide $20, 18:2 or 10:10, and the receiver had no option to reject. In the second part, each participant was matched with two others. She then got the chance to split $12 evenly between herself and a person who had given away only $2 in the previous game (the unfair one), or to split $10 evenly with an even-splitter (the fair one) from the earlier part.

In the first part of the experiment, 76% of the people split evenly. In the second part, there was a clear preference (74%) for punishing the unfair allocators, even though that choice cost the punisher $1.


The Ultimatum Game

Adam Grant, in his best-selling book Give and Take, describes the behavioural characteristics of three types of humans based on their attitudes towards other people – takers, matchers and givers. According to the author, takers give away (money, service or information) when the benefits to themselves are far more than the personal costs that come with the transfer. Givers, on the other extreme, relish the value to others more than the personal cost to themselves. Naturally, the matchers are in between – strictly reciprocating.

Grant refers to a paper published by Kahneman et al. in 1986, based on a concept called the ultimatum game, a well-known idea in game theory. Today, we will look at the game; we'll discuss the study results another day.

The game

We will illustrate the concept through a 100-dollar game. Player 1 (the donor) gets 100 dollars and can offer any amount X, from 0 to 100, to player 2 (the receiver). If player 2 accepts, she gets X, and player 1 keeps the rest (100 – X). If player 2 rejects the offer, neither gets anything.

Rationality vs sense of fairness

If the receiver were rational, her actions would be governed by self-interest, as economic theory expects, and she would take whatever was offered; after all, something is better than nothing. But this doesn't always happen. There is a limit below which the receiver perceives the offer as unjust and rejects it.

Further Reading

Give and Take: Adam Grant
Ultimatum Games: William Spaniel


Gambler’s Ruin

The premise is similar to the previous two posts, although slightly different. A gambler starts with n dollars and bets 1 dollar at a time. She quits under one of two circumstances: 1) she loses it all (0 dollars), or 2) she reaches the target of N dollars.

Let's first understand the conditions. You go to a casino and play an even-money bet on a roulette wheel (payoff 1 to 1). You have 10 dollars in your purse, which is your capital. You start betting 1 dollar at a time. If you win, you add a dollar to the capital; if you lose the bet, you lose one from it. When you reach your target, say 100, or lose all your money, you quit and go home. It is easy to see that there is nothing to play for if your starting capital is 0 or 100: in the former case, you have no money to bet, and in the latter, you have already achieved the target.

Random walk

We use the random walk method to establish an analytical relationship for the probability. Imagine a random walk that starts from position j, corresponding to a starting fortune of j dollars, and let x_j be the probability of reaching the target N from there (the event A_j). Depending on a win, a loss or a tie on the next bet, the walk moves to j+1, j-1 or stays at j, with probabilities p, q and r, respectively. Conditioning on the outcome of that bet,

x_j = P(A_j|\text{win}) P(\text{win}) + P(A_j|\text{loss}) P(\text{loss}) + P(A_j|\text{tie}) P(\text{tie})

x_j = p\, x_{j+1} + q\, x_{j-1} + r\, x_j

\text{Total probability: } p + q + r = 1 \Rightarrow r = 1 - (p + q)

x_j = p\, x_{j+1} + q\, x_{j-1} + x_j - (p+q)\, x_j

p\, x_{j+1} - (p+q)\, x_j + q\, x_{j-1} = 0

This recurrence is solved by trying x_j = k^j, which gives the quadratic (characteristic) equation p k^2 - (p+q) k + q = 0, with roots k = 1 and k = q/p. Applying the boundary conditions x_0 = 0 and x_N = 1 and simplifying, you get the probability of reaching the target (quitting at N rather than going broke at 0) when starting with n dollars:

P = \frac{1-(q/p)^n}{1-(q/p)^N}; p \neq q

For even-money bets

The winning probability is just under 50% (p = 18/38, so q/p = 20/18). The chances of reaching your target of 100 from four different starting points, 10, 25, 50 and 93.5, are:

Starting amount   Probability of reaching 100
10                0.00005
25                0.0003
50                0.005
93.5              0.5
At 93.5, you have a 50:50 chance to make 100!
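The formula is easy to wrap in a short R function that reproduces the table; the function name is mine, and p = 18/38 assumes an even-money bet on an American roulette wheel.

# Probability of reaching the target N before going broke,
# starting with n dollars and betting 1 dollar at a time with win probability p.
ruin_prob <- function(n, N, p = 18/38) {
  q <- 1 - p
  (1 - (q / p)^n) / (1 - (q / p)^N)
}

ruin_prob(c(10, 25, 50, 93.5), 100)   # ~0.00005, 0.0003, 0.005 and 0.5, matching the table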

Bold vs cautious

An important takeaway from this calculation is the betting strategy that maximises your chance of reaching 100. Say you start with 10 dollars and have two choices: 1) place 10-dollar bets or 2) place 1-dollar bets. In the first case, you play in units of 10 dollars (n = 1, N = 10); in the second, in units of 1 dollar (n = 10, N = 100).

\text{The probability of reaching 100 with 10-dollar bets, starting with 10 dollars (n = 1, N = 10):} \\ \\ x_{1} = \frac{1-[(20/38)/(18/38)]^1}{1-[(20/38)/(18/38)]^{10}} = 0.06 \\ \\ \text{The probability of reaching 100 with 1-dollar bets, starting with 10 dollars (n = 10, N = 100):} \\ \\ x_{10} = \frac{1-[(20/38)/(18/38)]^{10}}{1-[(20/38)/(18/38)]^{100}} = 0.00005
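Using the ruin_prob sketch from earlier in this post, the comparison is one line each:

ruin_prob(1, 10)     # bold: 10-dollar bets, ~0.06
ruin_prob(10, 100)   # cautious: 1-dollar bets, ~0.00005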

You had better be bold and play larger sums fewer times. Well, this is nothing new; the house always wins in the long run!
