
Amy’s Job

Amy got short-listed for three job interviews. The numbers of candidates appearing for the three jobs are 5, 3 and 4. Assuming all the candidates are equally competent, what is Amy’s chance of getting at least one job?

Step 1: Assume the outcomes of the three interviews are independent.

Step 2: Estimate the probability of being rejected for each job: 1 – 1/5 = 4/5, 1 – 1/3 = 2/3, and 1 – 1/4 = 3/4.

Step 3: Calculate the joint probability of being rejected from all three jobs: (4/5) x (2/3) x (3/4) = 0.4

Step 4: Probability of getting at least one job = 1 – probability of being rejected from all jobs, i.e., 1 – 0.4 = 0.6
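As a cross-check, the four steps reduce to a couple of lines of Python (the variable names here are our own):

```python
from math import prod

# Numbers of candidates in each of the three interviews
candidates = [5, 3, 4]

# Steps 1-3: assuming independence, the chance of being rejected everywhere
# is the product of the individual rejection probabilities 1 - 1/n
p_reject_all = prod(1 - 1 / n for n in candidates)  # (4/5) * (2/3) * (3/4) = 0.4

# Step 4: at least one job = complement of all rejections
p_at_least_one = 1 - p_reject_all
print(round(p_at_least_one, 2))  # 0.6
```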


One-Sample Poisson: Car Breakdown

A car model breaks down on average 1.5 times a year. The company has developed a fix that claims to have reduced the issue. Alby randomly selects ten cars of the new model and finds eight of them break down in the first year. Did the fix work? Use a significance level (alpha) of 5%.

Since the data are counts of events (car breakdowns) that occur at random in time, we use a Poisson hypothesis test here.

The null hypothesis, H0: the average failure rate (lambda) of the new car = 1.5 (same as the old)
The alternative hypothesis, HA: the average failure rate (lambda) of the new car < 1.5 (failure rate reduced)

The R call has the following format: poisson.test(total count, time base, hypothesized rate, direction of the alternative)

poisson.test(8, 10, 1.5, alternative = "less")
	Exact Poisson test

data:  8 time base: 10
number of events = 8, time base = 10, p-value = 0.03745
alternative hypothesis: true event rate is less than 1.5
95 percent confidence interval:
 0.000000 1.443465
sample estimates:
event rate 
       0.8 

Since the p-value (0.037) is less than the significance level (0.05), we reject the null hypothesis in favour of the alternative that the failure rate has been reduced.
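The exact p-value from poisson.test can be reproduced by summing the Poisson probabilities directly; here is a minimal Python sketch (the helper `poisson_cdf` is our own, not part of any library):

```python
from math import exp

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), summing the pmf term by term."""
    term, total = exp(-lam), 0.0
    for i in range(k + 1):
        total += term          # term is currently exp(-lam) * lam**i / i!
        term *= lam / (i + 1)  # advance to the next pmf term
    return total

# Ten cars observed for one year each under the claimed rate of 1.5/year:
# the total count is Poisson(10 * 1.5) = Poisson(15) under H0,
# and the one-sided p-value is P(X <= 8)
p_value = poisson_cdf(8, 10 * 1.5)
print(round(p_value, 5))  # ~0.03745, as reported by poisson.test
```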

Reference

Hypothesis Testing with the Poisson Distribution


Two-Sample Poisson Test

Two batches of products have come from a factory with the following defect counts: 107 and 161. Find out whether one batch has a lower defect rate than the other.

Number of samples = 30 each
Occurrence rates = 107/30 = 3.57 and 161/30 = 5.37

poisson.test(c(107, 161), c(30, 30))
Comparison of Poisson rates

data:  c(107, 161) time base: c(30, 30)
count1 = 107, expected count1 = 134, p-value = 0.001166
alternative hypothesis: true rate ratio is not equal to 1
95 percent confidence interval:
 0.5155166 0.8539201
sample estimates:
rate ratio 
 0.6645963 

Since the p-value (0.0012) is below 0.05 and the 95% confidence interval for the rate ratio (0.52 to 0.85) excludes 1, there is a significant difference between the two batches: the first batch has the lower defect rate.
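One way to see where the exact p-value comes from: conditional on the total count, the first count follows a binomial distribution under the null hypothesis of equal rates. A Python sketch of that conditional test (equal exposures, so the binomial probability is 0.5):

```python
from math import comb

count1, count2 = 107, 161
total = count1 + count2  # 268 defects in all

# Under H0 (equal rates, equal 30-sample exposure), count1 | total ~ Binomial(total, 0.5).
# Two-sided exact p-value; for a symmetric binomial this is twice the smaller tail.
tail = sum(comb(total, k) for k in range(min(count1, count2) + 1)) / 2 ** total
p_value = min(1.0, 2 * tail)
print(round(p_value, 6))  # ~0.001166, matching poisson.test(c(107, 161), c(30, 30))
```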

Reference

Comparing Hypothesis Tests for Continuous, Binary, and Count Data: Statistics by Jim


One-Sample Poisson Test

The city council claims their recent road safety campaign has reduced the daily accident rate. The following are the daily data collected over 20 days. The mean rate before the campaign was 5.

4, 6, 4, 1, 1, 5, 5, 6, 3, 5, 1, 8, 3, 2, 5, 7, 5, 2, 3, 4

The first thing to realise here is that the number of accidents is entirely random, although it may revolve around a mean (rate). Therefore, hypothesis tests based on the normal distribution, such as the t-test, are not applicable here. We use the Poisson test on such occasions.

poisson.test(80, 20, 5, alternative = "less")

Here, 80 is the sum of the counts, and 20 is the total duration (days) over which the samples were collected.

	Exact Poisson test

data:  80 time base: 20
number of events = 80, time base = 20, p-value = 0.02265
alternative hypothesis: true event rate is less than 5
95 percent confidence interval:
 0.000000 4.817502
sample estimates:
event rate 
         4 

The p-value (0.023) is below the significance level of 5%, so we reject the null hypothesis, H0, that the event rate is still 5.
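Starting from the raw daily counts, the same test can be sketched in Python by summing the Poisson pmf term by term (the `poisson_cdf` helper is our own):

```python
from math import exp

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    term, total = exp(-lam), 0.0
    for i in range(k + 1):
        total += term
        term *= lam / (i + 1)
    return total

accidents = [4, 6, 4, 1, 1, 5, 5, 6, 3, 5, 1, 8, 3, 2, 5, 7, 5, 2, 3, 4]
events = sum(accidents)  # 80 accidents over 20 days

# Under H0 the 20-day total is Poisson(20 * 5) = Poisson(100);
# the one-sided p-value is P(X <= 80)
p_value = poisson_cdf(events, 20 * 5)
print(events, round(p_value, 4))
```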


Bayesian Betting

Horses 1, 2 and 3 have the following amounts on bets:

Horse 1: $10,000
Horse 2: $15,000
Horse 3: $25,000

How much should the track pay a bettor for winning on a $2 bet? Note the track will take 10% as their profit.

The first thing is to determine the odds. A Bayesian is not too worried about how one comes up with them. The simplest method is to go with gut feeling. Another way is to estimate each horse’s win probability from the bet amounts (individual/total) and then determine the odds as
(1 – probability) / probability

Horse     Bet Amount   Probability           Odds
Horse 1   $10,000      10000/50000 = 0.2     4 to 1
Horse 2   $15,000      15000/50000 = 0.3     2.33 to 1
Horse 3   $25,000      25000/50000 = 0.5     1 to 1

For a $2 bet on horse 1, the payout will be 2 x 4 – 0.1 x 2 x 4 = $7.2
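The table and the payout can be reproduced in a few lines of Python (the amounts and the 10% take come from the problem statement; the names are our own):

```python
bets = {"Horse 1": 10_000, "Horse 2": 15_000, "Horse 3": 25_000}
total = sum(bets.values())  # $50,000 in the pool
track_take = 0.10

# Odds from bet-share probabilities: (1 - p) / p
odds = {horse: (1 - amount / total) / (amount / total) for horse, amount in bets.items()}

# Payout on a winning $2 bet on Horse 1: winnings at 4 to 1, less the 10% take
stake = 2
payout = stake * odds["Horse 1"] * (1 - track_take)
print(round(odds["Horse 1"], 2), round(payout, 2))  # 4.0 7.2
```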

Introduction to Mathematical Statistics: PennState


The Happiness Formula – Money Matters

Studies on the correlation between income and happiness have a bit of history. In 2010, a study led by Daniel Kahneman found that individual happiness increases with log(income) up to about $75,000 per year and then flattens out. However, the work of Killingsworth (2021) showed contradictory results: happiness simply continued to rise linearly with log(income).

Join forces!

The original study by Kahneman and Deaton drew on survey responses about well-being from about 450,000 US residents.

Killingsworth’s work, on the other hand, had 1,725,994 reports of well-being from 33,391 employed adults (ages 18 to 65) living in the US. It found happiness advancing linearly even beyond $200,000 per year.

The conflict prompted an ‘adversarial collaboration’ with Barbara Mellers as the facilitator.

The hypothesis

They started with a hypothesis for the test:
1) There is an unhappy minority whose unhappiness diminishes with rising income up to a threshold, beyond which it shows no further progress.
2) There is a happier majority whose happiness continues to rise with income.

The ‘joint team’ stratified Killingsworth’s data into percentiles and here is what they found:

Percentile of happiness   Slope up to $100k   Slope above $100k
5% (least happy)          2.34                0.25
10%                       1.75                0.52
15%                       1.90                0.34
20%                       1.84                0.62
25%                       1.52                1.12
30%                       1.33                1.21
35%                       1.26                1.21

References

Matthew Killingsworth; Daniel Kahneman; Barbara Mellers, “Income and emotional well-being: A conflict resolved”, PNAS, 2023, 120(10).

Daniel Kahneman; Angus Deaton, “High income improves evaluation of life but not emotional well-being”, PNAS, 2010, 107(38).

Matthew Killingsworth, “Experienced well-being rises with income, even above $75,000 per year”, PNAS, 2021, 118(4).


The fallacy of Hindsight and FOMO

The fallacy of hindsight – the feeling, once the outcome is known, that it was evident all along – is a significant factor that undermines sound reasoning about probability and risk in decision-making. Occurrences of hindsight bias, or the ‘I knew it’ moments, are most prominent when the results are adverse. According to the scientists Neal Roese and Kathleen Vohs, the bias has three levels:

  1. Memory distortion (not remembering one’s earlier opinion)
  2. Inevitability (the belief that the event was bound to happen)
  3. Foreseeability (the conviction that one knew it beforehand)

While hindsight bias is about dismissing the past decision-making process after a negative result, the ‘fear of missing out’ (FOMO) is the intrinsic motivation to act driven by memories of positive outcomes in the past. Although the term FOMO was initially introduced (in 2004) to describe people’s compulsive behaviour on social networking sites, it is pervasive in several walks of life, including decision-making, investing and trading, to name a few.

Issues of hindsight bias

The biggest concern is that hindsight bias discounts the role of probabilities and trade-offs in decision-making, leading people to ridicule the original decision-makers. In terms of expected value theory, it is akin to forgetting the probability-of-failure term in the equation. And it is more dangerous than mere finger-pointing: hindsight bias breeds overconfidence and reduces rational thinking while navigating complex problems. ‘Always feeling wise’ also erodes one’s ability to learn from mistakes.

And the FOMO

FOMO is just the opposite of what a value investor wants to do. Typically, FOMO leads to chasing stocks during a bull run of the market, or is perhaps the very reason for the bull market. While a few lucky ones may cash out during the run, most who ‘buy high’ end up losing in the crash. FOMO can also create collective anxiety in organisations about missing investment opportunities, especially amid speculation about ‘things’ happening elsewhere.

References

Hindsight bias: Wiki

Fear of missing out: Wiki


The Bayesian Cars

After a break, we are back with a Bayesian problem. It is taken from the Penn State site and combines the Bayes rule with Poisson probabilities.

Amy thinks the average number of cars passing an intersection is 3 in a given period, whereas Becky thinks it’s 5. On a random day of data collection, they observed seven cars. What are the probabilities for each of their hypotheses?

P(\lambda = 3|X = 7) = \frac{P(X = 7 | \lambda = 3) * P(\lambda = 3)}{P(X = 7 | \lambda = 3) * P(\lambda = 3) + P(X = 7 | \lambda = 5) * P(\lambda = 5)}

Let’s give equal prior probabilities to both: P(lambda = 3) = P(lambda = 5) = 0.5. P(X = 7 | lambda = 3) is given by the Poisson probability mass function:

dpois(7, 3)
0.022
dpois(7, 5)
0.105

P(\lambda = 3|X = 7) = \frac{0.022 * 0.5}{0.022 * 0.5 + 0.105 * 0.5} = 0.173

P(\lambda = 5|X = 7) = \frac{0.105 * 0.5}{0.105 * 0.5 + 0.022 * 0.5} = 0.827
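The same update can be done in Python, keeping the likelihoods unrounded (so the posteriors differ from the rounded values above only in the third decimal):

```python
from math import exp, factorial

def dpois(k, lam):
    """Poisson pmf: P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam ** k / factorial(k)

prior = 0.5                  # equal priors for Amy's and Becky's hypotheses
like_amy = dpois(7, 3)       # ~0.0216
like_becky = dpois(7, 5)     # ~0.1044

# Bayes rule: posterior = likelihood * prior / evidence
evidence = like_amy * prior + like_becky * prior
post_amy = like_amy * prior / evidence      # ~0.171
post_becky = like_becky * prior / evidence  # ~0.829
print(round(post_amy, 3), round(post_becky, 3))
```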


The Power Function 

In the last post, we saw the power of a hypothesis test when the true mean is away from the null-hypothesis value. What happens if the true mean is even further away, say 110 instead of 108? Run the following code and you will see.

power_calc <- function(alt){

  Null_mu <- 100   # mean under the null hypothesis
  sigma <- 16      # population standard deviation
  N_sample <- 16   # sample size
  alpha <- 0.05    # significance level
  Alt_mu <- alt    # true (alternative) mean

  # critical z and the corresponding rejection threshold for the sample mean
  z <- qnorm(alpha, 0, 1, lower.tail = FALSE)
  X_05 <- z * sigma / sqrt(N_sample) + Null_mu

  # probability that the sample mean exceeds the threshold under the alternative
  Z_cum <- (X_05 - Alt_mu) / (sigma / sqrt(N_sample))
  pnorm(Z_cum, 0, 1, lower.tail = FALSE)

}

power_calc(110)

You see, the power increases from 0.64 to 0.80. The whole spectrum of power values, from the null-hypothesis mean (100) all the way to 120, is shown below.
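For readers without the plot at hand, a Python translation of power_calc (using the standard library's NormalDist; the function name and defaults mirror the R code above) traces the same power function across a range of true means:

```python
from math import sqrt
from statistics import NormalDist

def power_calc(alt_mu, null_mu=100, sigma=16, n=16, alpha=0.05):
    """Power of the one-sided z-test of H0: mu = null_mu when the true mean is alt_mu."""
    std = NormalDist()
    # rejection threshold for the sample mean, ~106.58 here
    x_crit = null_mu + std.inv_cdf(1 - alpha) * sigma / sqrt(n)
    # probability the sample mean exceeds the threshold under the alternative
    return 1 - std.cdf((x_crit - alt_mu) / (sigma / sqrt(n)))

for mu in range(100, 121, 4):
    print(mu, round(power_calc(mu), 3))
# power rises from alpha (0.05) at mu = 100 towards 1 near mu = 120
```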

When alpha is reduced, the power is also reduced (type II error increases).

Reference

Power Functions: PennState


IQ Power

Let’s assume the IQ of a population is normally distributed with a standard deviation of 16. A hypothesis test of the null hypothesis, mean IQ = 100, collects 16 samples at a significance level of 5%. What is the power of the test if the true population mean is 108?

Definition: The power of a test is the probability that it will correctly reject the null hypothesis when the null hypothesis is false.

Step 1: Estimate the Z-score for the alpha (significance level)

qnorm(0.05, 0, 1, lower.tail = FALSE)
1.645

Step 2: Estimate the IQ that corresponds to Z = 1.645

Z = \frac{\hat{X} - \mu}{\sigma/\sqrt{n}}

\hat{X} = \frac{Z \sigma}{\sqrt{n}} + \mu

1.645 * 16 / sqrt(16) + 100 =  106.58

If the sample mean IQ is above 106.58, the null hypothesis (that the mean = 100) will be rejected.

Step 3: Estimate Z-score at X = 106.58 for mean = 108

Z = \frac{106.58 - 108}{16/\sqrt{16}} = -0.355

The entire area above Z = -0.355 is included in the power region (the area below Z = -0.355 will be the false negative part as the null hypothesis will not be rejected).

Step 4: Estimate the cumulative probability > Z = -0.355

pnorm(-0.355, 0, 1, lower.tail = FALSE)
 0.639

The power is 0.639, or about 64%.
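The four steps can be verified with Python's standard-library NormalDist (variable names are our own):

```python
from math import sqrt
from statistics import NormalDist

null_mu, true_mu, sigma, n, alpha = 100, 108, 16, 16, 0.05
std = NormalDist()

z_crit = std.inv_cdf(1 - alpha)                  # Step 1: critical z, 1.645
x_crit = z_crit * sigma / sqrt(n) + null_mu      # Step 2: rejection threshold, 106.58
z_true = (x_crit - true_mu) / (sigma / sqrt(n))  # Step 3: z at the threshold, -0.355
power = 1 - std.cdf(z_true)                      # Step 4: power, 0.639
print(round(z_crit, 3), round(x_crit, 2), round(power, 3))  # 1.645 106.58 0.639
```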

Reference

Power Functions: PennState
