April 2022

Waiting-Time Paradox

Have you ever wondered why you always seem to wait longer at the bus stop? If you agree but feel it is more a mental thing than anything else, hold on a bit. Chances are your feeling is valid and can be explained using probability theory.

Remember the inspection paradox?

See the previous post if you don’t know what I am talking about. The waiting-time paradox is a variation of the inspection paradox, and we will see how, so brush up on your probability and expected-value basics. As a one-line summary: the expected value is the average value!

A bus every 10 minutes

You know a bus for your destination comes every 10 minutes at a bus stop; in other words, six buses every hour. You start with the premise that the average waiting time is five minutes, assuming you reach the stop at random. One day you arrive a minute before the next bus (waiting time = 1 min); another day, a minute after the last bus (waiting time = 9 min), and so on.

Do not forget that buses also come with certain uncertainties (unless your pick-up is at the starting station). Now, let’s get the arrival times of this bus at a given stop (at some distance after the starting point). I could make them up for illustration but will instead resort to the Poisson distribution and generate random time intervals between two consecutive buses.

Here they are, made using the R code rpois(6, 10): 10, 10, 11, 3, 10, 16. These are the minutes, within an hour, between consecutive buses, generated randomly. The call rpois(n = sample size, lambda = mean interval) generated six intervals with a mean value of 10.

The average waiting times inside these six slots are 10/2, 10/2, 11/2, 3/2, 10/2 and 16/2. You know what happens next.

Probabilities and expectations

Compute the probability of arriving during each of these slots within that hour. They are 10/60, 10/60, 11/60, 3/60, 10/60 and 16/60. Each numerator corresponds to a slot: the length of the interval between two consecutive buses, as generated in the previous paragraph.

The expected value (average waiting time) = Probability x average waiting time corresponding to that probability = (10/60) x (10/2) + (10/60) x (10/2) + (11/60) x (11/2) + (3/60) x (3/2) + 10/60 x (10/2) + (16/60) x (16/2) = 5.72.
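The arithmetic above can be checked with a short script (shown in Python here rather than the post's R, purely for illustration):

```python
# Intervals (minutes) between consecutive buses within one hour
intervals = [10, 10, 11, 3, 10, 16]
total = sum(intervals)  # 60 minutes

# The probability of arriving in a slot is proportional to its length,
# and the average wait within a slot is half the slot length.
expected_wait = sum((t / total) * (t / 2) for t in intervals)
print(round(expected_wait, 2))  # 5.72, longer than the naive 5 minutes
```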

Average is > 5 minutes

The following code summarises the whole exercise.

number_of_buses <- 100   # number of intervals to simulate
mean_interval <- 10      # mean time (minutes) between buses

bus_interval <- rpois(number_of_buses, mean_interval)  # random gaps between buses
prob_bus_slot <- bus_interval / sum(bus_interval)      # chance of arriving in each gap
avg_wait_slot <- bus_interval / 2                      # mean wait within each gap
expected_wait <- sum(prob_bus_slot * avg_wait_slot)    # probability-weighted average wait


Inspection Paradox

Suppose a school has three classes, A, B and C, holding 20, 60 and 100 students. What is the average number of students in a class? Well, it depends on who you ask. Let’s understand this deeper.

If you ask the school principal, she will give you 60 as the average number. It’s simple, (20 + 60 + 100) / 3 = 60. But what happens if you choose to sample 50 students and ask them about the number of students in their classes and then average it?

You wait at the school bus stop and randomly select 50 students. Since the sampling is random, you are likely to catch the following numbers from each class.

The probability of finding a student from class A = (20/180). Therefore, the number of students from class A in the total sample of 50 = 50 x (20/180) = 5.55, about 6.

Similarly, from class B, you catch 50 x (60 / 180) = 16.66 or about 17 students, and from class C, 50 x (100 / 180) = 27.77 or about 28 students.

So what are the responses from the students? About 5.55 will say there are 20 students in their class because they are from class A, 16.66 will say 60, and 27.77 will say 100. So the survey average is (5.55 x 20 + 16.66 x 60 + 27.77 x 100) / (5.55 + 16.66 + 27.77) = 3887.6 / 50 = 77.8.

So from the school, you get 60, and from the survey, 78, and no one is lying.
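The size-biased average can be verified in a few lines (Python used here for illustration):

```python
class_sizes = [20, 60, 100]
total = sum(class_sizes)  # 180 students in the school

# Principal's view: plain average of the class sizes
principal_avg = total / len(class_sizes)

# Student survey: each student reports their own class size,
# so larger classes are proportionally over-represented.
survey_avg = sum(c * (c / total) for c in class_sizes)

print(principal_avg, round(survey_avg, 1))  # 60.0 vs about 77.8
```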


Wisdom of the Crowd

The wisdom of the crowd is an intriguing phenomenon. It gained popularity in public discourse thanks to Galton, who, to his surprise, found that the public's average guess of the weight of a prize-winning ox was nearly accurate. In other words, the errors in the individuals' guesses cancelled out when averaged. An essential prerequisite for achieving the group's wisdom is that the individuals' errors are statistically independent of each other.

Vul and Pashler took this to the next level: they wanted to test it on single individuals by allowing them to make multiple guesses. They asked 428 participants (through an internet poll) eight general-knowledge questions. Half of the participants were asked to make a second guess immediately after the questionnaire, and the other half three weeks later (without being told it would come after three weeks).

The results were surprising. The mean squared error of the within-person average of the two guesses was lower than that of either guess alone for the immediate responders, and lower still for the late responders.


One-Son Policy

You may have heard about the one-child policy. Imagine a country deciding to have a one-son policy instead. The question is: will it lead to a gender-imbalanced society? Intuition suggests that there will be more girls in the land because of this policy. Let’s analyse this by estimating the expected values.

The analytical solution

Assume that the probability of either sex is equal, 1/2. We have seen earlier that if p is the probability of something, then the expected number of trials (denoted by E), or the average time we need to wait for the first success, is 1/p. In our case, if N is the number of children in a family with 1/2 as the parameter, E(N) = 1/(1/2) = 2. Let Nm be the number of male children and Nf the number of females; then N = Nm + Nf and E(N) = E(Nm) + E(Nf) = 2. E(Nm) under a one-son-policy regime has to be 1. That means E(Nf) = 2 – 1 = 1. So the expected numbers of males and females are equal! Not convinced? We will illustrate with examples.
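The claim that the average wait for the first success is 1/p can be checked with a quick simulation (a Python sketch for illustration; with p = 1/2, the mean number of children per family should approach 2):

```python
import random

random.seed(1)

def children_until_first_boy():
    """Count children until the first boy arrives (no cap on family size)."""
    n = 0
    while True:
        n += 1
        if random.random() < 0.5:  # each child is a boy with probability 1/2
            return n

trials = 100_000
mean_children = sum(children_until_first_boy() for _ in range(trials)) / trials
print(mean_children)  # close to 1/p = 2
```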

Testing the concept

1 child scenario: The simplest case is one child in a family. The child can be M (male) or F (female). p(M) = 1/2, p(F) = 1/2. Nm(M) = 1, Nm(F) = 0, Nf(M) = 0, and Nf(F) = 1.

E(Nm) = Nm(M) x p(M) + Nm(F) x p(F) = 1 x (1/2) + 0 x (1/2) = 1/2.
E(Nf) = Nf(M) x p(M) + Nf(F) x p(F) = 0 x (1/2) + 1 x (1/2) = 1/2.
So the expected values of males and females are equal.

A bit of explanation on the notation:
Nm(M) is the number of boys in an M scenario, and Nm(F) is the number of boys in an F scenario.
Nf(M) is the number of girls in an M scenario, and Nf(F) is the number of girls in an F scenario.

Maximum 2 children in a family: The combinations are M, FM, FF.
p(M) = 1/2, p(FM) = 1/4, p(FF) = 1/4
Nm(M) = 1, Nm(FM) = 1, Nm(FF) = 0
Nf(M) = 0, Nf(FM) = 1, Nf(FF) = 2

E(Nm) = Nm(M) x p(M) + Nm(FM) x p(FM) + Nm(FF) x p(FF) = 1 x (1/2) + 1 x (1/4) + 0 x (1/4) = 3/4.
E(Nf) = Nf(M) x p(M) + Nf(FM) x p(FM) + Nf(FF) x p(FF) = 0 x (1/2) + 1 x (1/4) + 2 x (1/4) = 3/4.
Again, the expected values of males and females are equal.

Maximum 3 children in a family: The combinations are M, FM, FFM, FFF.
p(M) = 1/2, p(FM) = 1/4, p(FFM) = 1/8, p(FFF) = 1/8
Nm(M) = 1, Nm(FM) = 1, Nm(FFM) = 1, Nm(FFF) = 0
Nf(M) = 0, Nf(FM) = 1, Nf(FFM) = 2, Nf(FFF) = 3

E(Nm) = Nm(M) x p(M) + Nm(FM) x p(FM) + Nm(FFM) x p(FFM) + Nm(FFF) x p(FFF)
= 1 x (1/2) + 1 x (1/4) + 1 x (1/8) + 0 x (1/8) = 7/8.
E(Nf) = Nf(M) x p(M) + Nf(FM) x p(FM) + Nf(FFM) x p(FFM) + Nf(FFF) x p(FFF)
= 0 x (1/2) + 1 x (1/4) + 2 x (1/8) + 3 x (1/8)= 7/8.
Once again!

More Proof?

We will do it one final time, where the family waits for a boy but stops at four children. The combinations are M, FM, FFM, FFFM, and FFFF.
p(M) = 1/2, p(FM) = 1/4, p(FFM) = 1/8, p(FFFM) = 1/16, p(FFFF) = 1/16.
Nm(M) = 1, Nm(FM) = 1, Nm(FFM) = 1, Nm(FFFM) = 1, Nm(FFFF) = 0
Nf(M) = 0, Nf(FM) = 1, Nf(FFM) = 2, Nf(FFFM) = 3, Nf(FFFF) = 4

E(Nm) = Nm(M) x p(M) + Nm(FM) x p(FM) + Nm(FFM) x p(FFM) + Nm(FFFM) x p(FFFM) + Nm(FFFF) x p(FFFF)
= 1 x (1/2) + 1 x (1/4) + 1 x (1/8) + 1 x (1/16) + 0 x (1/16) = 15/16.
E(Nf) = Nf(M) x p(M) + Nf(FM) x p(FM) + Nf(FFM) x p(FFM) + Nf(FFFM) x p(FFFM) + Nf(FFFF) x p(FFFF)
= 0 x (1/2) + 1 x (1/4) + 2 x (1/8) + 3 x (1/16) + 4 x (1/16) = 15/16.
Still, there is a balance between boys and girls.
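A simulation of the stop-at-four rule (again a Python sketch, for illustration) confirms the balance, with each sex averaging 15/16 per family:

```python
import random

random.seed(42)

def family():
    """One family: have children until the first boy, up to four children."""
    boys = girls = 0
    for _ in range(4):
        if random.random() < 0.5:
            boys += 1
            break  # one-son policy: stop after the first boy
        girls += 1
    return boys, girls

trials = 100_000
results = [family() for _ in range(trials)]
avg_boys = sum(b for b, g in results) / trials
avg_girls = sum(g for b, g in results) / trials
print(avg_boys, avg_girls)  # both close to 15/16 = 0.9375
```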


Hot Hands of Randomness

A lot of basketball fans believe in hot hands and streak shooting. These phrases roughly suggest that the probability of a basket is higher following a hit than following a miss. A spectator suddenly gains confidence that a player who made the previous two shots will make the next. At that moment, she is not thinking about probabilities (coin-tossing games) but only about the shooter's momentum and confidence.

It is interesting to note that this behaviour of the fans is just the opposite of the belief in “the law of small numbers”, or the gambler’s fallacy. The reason: basketball is played by individuals of flesh and blood (and lucky charms), whereas gambling is done by machines that work on chance (and expected values).

In 1985, Gilovich et al. did a study that analysed the data from 48 home games of the Philadelphia 76ers and their opponents during the 1980-81 season. The following is what they found.

The first table shows the (conditional) probability of a hit, given the player had already made the previous 3, 2 or 1 shots. P(hit) represents the overall shooting percentage of the player.

Player        P(hit)   P(hit|3H)   P(hit|2H)   P(hit|1H)
Richardson    0.50     0.48        0.50        0.49
Erving        0.52     0.48        0.52        0.53
Hollins       0.46     0.32        0.46        0.46
Cheeks        0.56     0.59        0.54        0.55
Jones         0.47     0.27        0.43        0.45
Toney         0.46     0.34        0.40        0.43
Jones         0.54     0.53        0.47        0.53
Mix           0.52     0.36        0.48        0.51
Dawkins       0.62     0.51        0.58        0.57

Similar statistics for a player hitting after missing the previous 3, 2 or 1 shots are presented below.

Player        P(hit)   P(hit|3M)   P(hit|2M)   P(hit|1M)
Richardson    0.50     0.50        0.47        0.56
Erving        0.52     0.52        0.51        0.51
Hollins       0.46     0.50        0.49        0.46
Cheeks        0.56     0.77        0.60        0.60
Jones         0.47     0.50        0.48        0.47
Toney         0.46     0.52        0.53        0.51
Jones         0.54     0.61        0.58        0.58
Mix           0.52     0.70        0.56        0.52
Dawkins       0.62     0.88        0.73        0.71

The data showed just the opposite of what people believed: hit rates following hits were lower than hit rates following misses.
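The conditional probabilities in the tables can be computed from a player's shot record along these lines (a Python sketch; the shot sequence below is made up purely for illustration):

```python
# Hypothetical shot record: 1 = hit, 0 = miss
shots = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]

def prob_hit_given(prev, shots):
    """P(hit | previous shot was `prev`), estimated from consecutive pairs."""
    nexts = [b for a, b in zip(shots, shots[1:]) if a == prev]
    return sum(nexts) / len(nexts)

p_after_hit = prob_hit_given(1, shots)
p_after_miss = prob_hit_given(0, shots)
print(p_after_hit, p_after_miss)
```

Conditioning on longer streaks (2H, 3H, and so on) works the same way, just matching longer prefixes.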

Gilovich, T., Vallone, R., & Tversky, A. (1985). The hot hand in basketball: On the misperception of random sequences. Cognitive Psychology, 17, 295-314.


Whose Fault was It?

A company gets their tools from three factories, 10% from F1, 50% from F2 and 40% from F3. The fault rates of the three factories are 2.5%, 1.5% and 1%, respectively. If the company finds one of the tools defective, from which factory is it likely?

Let’s write down things in the proper format and test the probabilities, one by one. Let P(F1|D) be the probability that the tool came from F1 given the information (that it is faulty), P(D|F1), the fault rate of F1 and P(F1) is the prior chance of F1 in the company.

P(F1|D) = \frac{P(D|F1)*P(F1)}{P(D|F1)*P(F1) + P(D|F2)*P(F2) + P(D|F3)*P(F3)}
P(F2|D) = \frac{P(D|F2)*P(F2)}{P(D|F1)*P(F1) + P(D|F2)*P(F2) + P(D|F3)*P(F3)}
P(F3|D) = \frac{P(D|F3)*P(F3)}{P(D|F1)*P(F1) + P(D|F2)*P(F2) + P(D|F3)*P(F3)}

I guess you already know that I applied Bayes’ theorem to each factory. Now, plug in the numbers and solve.

P(F1|D) = \frac{0.025*0.1}{0.025*0.1 + 0.015*0.5 + 0.01*0.4} = \frac{0.0025}{0.014} = 0.18
P(F2|D) = \frac{0.015*0.5}{0.025*0.1 + 0.015*0.5 + 0.01*0.4} = \frac{0.0075}{0.014} = 0.54
P(F3|D) = \frac{0.01*0.4}{0.025*0.1 + 0.015*0.5 + 0.01*0.4} = \frac{0.004}{0.014} = 0.29

The tool is more likely to have come from F2.
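The same posterior calculation in code (a Python sketch, for illustration):

```python
priors = {"F1": 0.10, "F2": 0.50, "F3": 0.40}         # share of tools per factory
fault_rates = {"F1": 0.025, "F2": 0.015, "F3": 0.01}  # P(D|Fi)

# Bayes' theorem: P(Fi|D) = P(D|Fi) * P(Fi) / sum_j P(D|Fj) * P(Fj)
evidence = sum(fault_rates[f] * priors[f] for f in priors)
posteriors = {f: fault_rates[f] * priors[f] / evidence for f in priors}
print({f: round(p, 2) for f, p in posteriors.items()})
```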


Coin Tossing and NBA Playoffs

In the previous post, we established the expected probabilities of the series lengths in the NBA playoffs. This time we compare them with past data and test our hypothesis.

The null hypothesis (H0) was that the game outcomes occur with equal probabilities, irrespective of the rank in the regular season. The alternative hypothesis (H1) says that the outcomes are not equally likely but depend on the teams' strength.

Matches including round 1

We gather all the playoff outcomes from 2003, the first year of the best-of-seven format for round one, until 2021. The summary of the observed results and expectations is in the following table.

Series          Observed (O)   Expected (E)           (O-E)^2/E
4-game series   52             0.125 x 285 = 35.63    7.52
5-game series   74             0.25 x 285 = 71.25     0.106
6-game series   99             0.3125 x 285 = 89.06   1.109
7-game series   60             0.3125 x 285 = 89.06   9.48
Total           285            285                    18.21
obsfreq <- c(52, 74, 99, 60)                 # observed 4-, 5-, 6- and 7-game series
nullprobs <- c(0.125, 0.25, 0.3125, 0.3125)  # coin-toss probabilities
chisq.test(obsfreq, p = nullprobs)

Not among equals

With a chi-squared statistic of 18.2 (greater than the critical value of 7.81 at the 5% significance level) and a p-value of 0.00042, we reject the null hypothesis that the playoff games were coin tosses. The result is hardly surprising because of the presence of first-round playoffs in the data. Most of the matches in the first round are not played among equals: the regular-season number one plays against number 8, 2 against 7, 3 against 6, and 4 against 5. A closer look at the table shows far more 4-0 results (52) than expected (35.63) and fewer 4-3s (60) than the coin-toss expectation (89.06).

After the first round

Once the screening of the first round is complete, the conference semi-finals begin. We now anticipate matches between equals, or at least teams of comparable strength. We repeat the test with data from 2003 to 2021, excluding the first round. The analysis is below.

Series          Observed (O)   Expected (E)           (O-E)^2/E
4-game series   19             0.125 x 132 = 16.5     0.38
5-game series   30             0.25 x 132 = 33        0.27
6-game series   52             0.3125 x 132 = 41.25   2.80
7-game series   31             0.3125 x 132 = 41.25   2.55
Total           132            132                    6.00

The chi-squared statistic is below the critical value (7.81) corresponding to a 5% significance level. Yes, the null hypothesis stays.
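A direct recomputation of the statistic with the exact probabilities (a Python sketch, for illustration):

```python
observed = [19, 30, 52, 31]              # 4-, 5-, 6- and 7-game series after round one
probs = [0.125, 0.25, 0.3125, 0.3125]    # coin-toss probabilities
expected = [p * sum(observed) for p in probs]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi2, 2))  # well below the 5% critical value of 7.81 (df = 3)
```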

The final

We will now take it one level further and cover only the NBA finals of the last 75 years. We expect the outcomes to match those of equal-probability events, and the chi-squared value could be the lowest we have seen so far. Keep your fingers crossed.

Series          Observed (O)   Expected (E)           (O-E)^2/E
4-game series   9              0.125 x 75 = 9.37      0.015
5-game series   18             0.25 x 75 = 18.75      0.03
6-game series   29             0.3125 x 75 = 23.44    1.3
7-game series   19             0.3125 x 75 = 23.44    0.83
Total           75             75                     2.17

Yes, it is: a chi-squared statistic of 2.17 with a p-value of 0.54.


Are NBA Playoffs Random?

NBA playoff matches are progressing these days, and it is time we look at the winning probabilities. How significant is the role of chance in these games? Are the outcomes independent and random with equal probabilities for wins and losses, like coin-tossing?

Equal probability scenario

Although the title speaks about randomness, what we examine here is equal-probability outcomes. We will formulate hypotheses, calculate expected probabilities and test them using historical data. Playoff matches follow the best-of-seven format: the team that first reaches four wins takes the series and advances to the next round.

Scenario 1: 4 games

There are only two ways the series can end in four games: A wins all four, or B does the same. P(AAAA) = (1/2)^4 = 0.0625 (if the games are independent), and P(BBBB) = (1/2)^4 = 0.0625. The probability that the series ends in four games is 0.0625 + 0.0625 = 0.125.

Scenario 2: 5 games

The possibilities for A winning in five games are BAAAA, ABAAA, AABAA and AAABA. The option AAAAB doesn’t exist, as team A gets the necessary four wins before a potential fifth game. Each of these outcomes has probability (1/2)^5, and there are eight of them (four each for A and B). The probability that the series ends in five games is 4 x (1/2)^5 + 4 x (1/2)^5 = 0.25.

Scenario 3: 6 games

The 6-game winning options for A are BBAAAA, BABAAA, BAABAA, BAAABA, ABBAAA, ABABAA, ABAABA, AABBAA, AABABA and AAABBA. As before, sequences such as AAABAB and AAAABB are irrelevant. The probability that the series ends in six games is 10 x (1/2)^6 + 10 x (1/2)^6 = 0.3125.

Scenario 4: 7 games

Estimating the probability of a seven-game series is easy: subtract all the previous ones from 1, i.e., 1 – (0.125 + 0.25 + 0.3125) = 0.3125.
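These four probabilities follow a simple pattern: the winner must take the last game, and the loser's k wins can fall anywhere among the first 3 + k games. A quick check (a Python sketch, for illustration):

```python
from math import comb

# P(series lasts 4 + k games) = 2 * C(3 + k, k) * (1/2)^(4 + k)
# (either team can win, and the winner takes the final game)
probs = [2 * comb(3 + k, k) * 0.5 ** (4 + k) for k in range(4)]
print(probs)  # [0.125, 0.25, 0.3125, 0.3125]
```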

Chi2 to the rescue

The null hypothesis, H0: the outcomes happen with equal probabilities, irrespective of the rank in the regular season. In other words, it is anybody’s game on a given day. The alternative hypothesis, H1: the outcomes are not due to luck but depend on the team's level. We will test this with chi-squared statistics in the next post.


Birthday Problem Reloaded

We have seen earlier that the probability of two people sharing a birthday in a group of 40 is about 90%. And that is surprising to many of you. Part of the reason is that you start imagining someone sharing their birthday with yours. Well, that is quite another problem.

Probability of someone sharing my birthday

The probability of someone sharing “my” birthday is a different problem. It is not as random as the previous one, since at least one person (i.e., me) is fixed! Let’s start with the inverse problem.

What is the probability that no one else has your birthday in a group of n people? In a group of 2, the chance is (364/365); for 3, it is (364/365) x (364/365), and so on. In general, for a group of n people, the chance is (364/365)^(n-1). Since sharing a birthday with someone and sharing it with no one are mutually exclusive events (both cannot happen at the same time), you subtract one probability from 1 to get the other: the answer is 1 – (364/365)^(n-1).

For n = 10, the probability is 2.4%; for n = 100, it becomes 24%. To get a 50% chance that someone shares your birthday, you need 253 other people in the room!
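A short check of these numbers (a Python sketch, for illustration):

```python
def p_shared_with_me(n):
    """Probability that someone in a group of n (including me) shares my birthday."""
    return 1 - (364 / 365) ** (n - 1)

print(round(p_shared_with_me(10), 3))   # about 2.4%
print(round(p_shared_with_me(100), 2))  # about 24%

# Smallest number of *other* people needed for a 50% chance
k = 1
while 1 - (364 / 365) ** k < 0.5:
    k += 1
print(k)  # 253
```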


Hypothesis Testing – Chi-Squared

The following table lists the weights of 50 boys (6-year-olds) sampled randomly. Can you test the hypothesis that the weights of 6-year-old boys follow a normal distribution with mean = 25 and standard deviation = 2? We will run a chi-squared test to find out.

28 24 27 24 27
26 25 29 22 24
23 25 21 22 25
26 27 27 26 29
28 27 22 23 21
29 24 23 23 22
25 22 29 28 30
24 28 26 25 25
28 29 26 27 30
22 31 25 24 27

The hypotheses

The null hypothesis, H0, in this case: there is no difference between the observed frequencies and the expected frequencies of a normal distribution with mean = 25 and standard deviation = 2.

The alternative hypothesis, HA: there is a difference between the observed frequencies and the expected frequencies of a normal distribution with mean = 25 and standard deviation = 2.

Estimation of chi2

Let us divide the data in the previous table into six groups of equal range and count the frequency in each. The expected frequency for each range is estimated from the cumulative distribution function of the normal distribution using the formula

Ei = n x [F(Ui) – F(Li)]

where n is the number of samples, F(Ui) is the cumulative distribution function evaluated at the upper limit Ui of a range, and F(Li) is its value at the lower limit Li.

Range     Observed Frequency (O)   Expected Frequency (E)   (O-E)^2/E
20 – 21   2                        0.83                     1.65
22 – 23   10                       6.65                     1.69
24 – 25   13                       16.45                    0.72
26 – 27   12                       16.07                    1.03
28 – 29   10                       6.21                     2.31
30 – 31   3                        0.94                     4.51
Total                                                       11.92

The critical value at the 5% significance level and the p-value are estimated using the following R code.

qchisq(0.05, 5, lower.tail = FALSE)         # critical value at the 5% level, df = 5
pchisq(11.92, df = 5, lower.tail = FALSE)   # p-value of the estimated statistic

The critical value is 11.07, and the p-value is 0.036. Since the estimated chi-squared statistic (11.92) exceeds the critical value, we reject the null hypothesis that the data follow a normal distribution with mean = 25 and standard deviation = 2.
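The statistic in the table can be reproduced in a few lines (a Python sketch, for illustration):

```python
# Observed and expected frequencies from the table above
observed = [2, 10, 13, 12, 10, 3]
expected = [0.83, 6.65, 16.45, 16.07, 6.21, 0.94]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi2, 2))  # 11.92, beyond the 5% critical value of 11.07 (df = 5)
```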
