Data & Statistics

The Arizona DNA Problem

If there is a 7.5% chance that two people share one spike (locus) of DNA, what is the chance that two people share nine loci? Assuming the loci match independently, it is \((7.5/100)^9 \approx 7.5 \times 10^{-11}\), or about 1 in 13 billion! So a decent case for a DNA match as forensic evidence!

Now the twist: an Arizona laboratory reported about 100 nine-locus matches in a database of just over 60,000 samples. How is that possible? The first figure (1 in 13 billion) was an estimate, and this is data. So the estimate must be wrong by a zillion miles, right?

If you recall the birthday problem, you may realise this can’t be dismissed without further enquiry. Let’s start.

Suppose there are 60,000 samples. How many distinct pairs can be formed from 60,000? It is \(\binom{60000}{2} = 60000 \times 59999 / 2 = 1{,}799{,}970{,}000\). For each pair, how many ways are there to match 9 out of 13 loci? It is \(\binom{13}{9} = 13!/(4! \times 9!) = 715\). So the total number of opportunities for a 9-locus match is \(1{,}799{,}970{,}000 \times 715 = 1.286979 \times 10^{12}\).

If the chance of a 9-locus match for one pair is 1 in 13 billion, then the expected number of matches among these \(1.286979 \times 10^{12}\) opportunities is \(1.286979 \times 10^{12} / (13 \times 10^{9}) \approx 99\).
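The arithmetic above can be checked with a few lines of R. This is only a sketch: the variable names are illustrative, and it assumes, as the estimate itself does, that the loci match independently.

```r
# Expected number of 9-locus matches in a database of 60,000 samples.
# Assumption: each locus matches independently with probability 7.5%.
p_locus   <- 0.075
p_nine    <- p_locus^9            # chance that nine given loci all match
n_samples <- 60000

n_pairs <- choose(n_samples, 2)   # distinct pairs of samples
n_ways  <- choose(13, 9)          # ways to pick 9 loci out of 13

expected_rounded <- n_pairs * n_ways / 13e9    # with the rounded "1 in 13 billion"
expected_exact   <- n_pairs * n_ways * p_nine  # with 0.075^9 used directly

print(c(rounded = expected_rounded, exact = expected_exact))
```

Both versions land in the high nineties, in line with the roughly 100 matches the laboratory reported.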


Coherent Arbitrariness

What determines the price of an object? If you are buying an asset, it could be the present value of all the cash flows from it. It could also be the meeting point between supply and demand curves (or the willingness to pay and marginal cost). Well, there is another factor – human irrationality.

Ariely et al. call it coherent arbitrariness induced by the anchoring effect. In one of their studies, the experimenters selected 55 students of the Sloan School MBA program and tried a bidding game for six products. The experimental design was as follows.

The researchers described six products – wines, chocolates, books and computer accessories. The students were asked to do the following:
1) Write down the last two digits of their social security (SS) number at the top of the paper.
2) Write down the same number against each item as a price in dollars, and indicate whether they would accept or reject buying the item at that price.
3) Write down their maximum willingness to pay for each item.

The results are in the following table. The values with the dollar sign represent the average willingness to pay mentioned by the subjects.

Last 2 digits of SS –>   00-19    20-39    40-59    60-79    80-99
Cordless trackball       $ 8.64   $11.82   $13.45   $21.18   $26.18
Cordless keyboard        $16.09   $26.82   $29.27   $34.55   $55.64
Average wine             $ 8.64   $14.45   $12.55   $15.45   $27.91
Rare wine                $11.73   $22.45   $18.09   $24.55   $37.55
Design book              $12.82   $16.18   $15.82   $19.27   $30.00
Belgian chocolates       $ 9.55   $10.64   $12.45   $13.27   $20.64

Look at how the average willingness to pay changed with the anchor (person’s social security number)!
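One rough way to put a number on the effect is to compare the highest-anchor group with the lowest-anchor group for each product. The sketch below simply re-types the table values; the ratio is my own illustrative summary, not a statistic taken from the paper.

```r
# Average willingness to pay (in dollars) by anchor group,
# re-typed from the table above.
wtp <- rbind(
  "Cordless trackball" = c( 8.64, 11.82, 13.45, 21.18, 26.18),
  "Cordless keyboard"  = c(16.09, 26.82, 29.27, 34.55, 55.64),
  "Average wine"       = c( 8.64, 14.45, 12.55, 15.45, 27.91),
  "Rare wine"          = c(11.73, 22.45, 18.09, 24.55, 37.55),
  "Design book"        = c(12.82, 16.18, 15.82, 19.27, 30.00),
  "Belgian chocolates" = c( 9.55, 10.64, 12.45, 13.27, 20.64)
)
colnames(wtp) <- c("00-19", "20-39", "40-59", "60-79", "80-99")

# Ratio of the highest-anchor group to the lowest-anchor group, per product
ratio <- wtp[, "80-99"] / wtp[, "00-19"]
print(round(ratio, 2))
```

For every product, the top group offered more than twice what the bottom group did.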

Dan Ariely, George Loewenstein, Drazen Prelec, The Quarterly Journal of Economics, February 2003


Missing Jimmy Stewart and SVB’s Crisis

We have seen coordination failure and its consequences in bank runs, and what might have happened at Silicon Valley Bank last week. Two videos on YouTube (A and B) have prompted me to write this post. Video A is about a CEO who just managed to pull her money out of the bank before the collapse, an event partly precipitated by actions such as hers. The second video shows the pivotal moment from the movie “It’s A Wonderful Life” (1946), where the hero, played by James Stewart, single-handedly prevents a bank from collapsing. Real heroism!

Bank’s decision making

So what happened at the 2023 bank scene? SVB held large quantities (in the order of $200B) of deposits from start-up companies. Under fractional reserve banking, a bank keeps only a required minimum of cash, typically about 10%, in its vaults; the rest is put to work to make a profit (the earnings from investing minus the interest paid to the depositors). SVB had invested ca. $90B of its cash in what are known as held-to-maturity securities (bonds). There is nothing wrong so far, as these instruments are usually pretty risk-free, but not this time! In 2021, the bank invested its money at ca. 2% return for about four years, and the Federal Reserve raised the interest rate a year later, making a heavy dent in the current market value of the 2021 investment.

Meanwhile the investors

Two things happened at the depositors’ end. First, the depositors (the technology companies) wanted to take more money out of the bank as funding for the firms started declining. Second, the news of the declining fair value of the $90 billion in bonds became public with the annual report. The second piece of news made the depositors and their seed investors nervous; they wanted to withdraw all their money.

Perfect storm

The end result was a perfect bank run. On March 10, the bank announced they had failed to raise capital and were looking for a buyer. A few hours later, the bank was shut down by the regulator.

The math behind the trouble

Imagine the bank had bought treasury bonds worth $100 in 2021 for four years at a rate of return of 2%, and the Fed raised the interest rate to 5% immediately after that. If the bank waits for four years, it will get \(100 \times (1.02)^4 = 108.2\) at 2% returns. If the bank wants to cash out earlier, it must sell on the secondary market. A buyer on the secondary market, who can now get 5% returns on a new bond, will therefore value the bank’s bonds at about $89 (\(108.2/(1.05)^4\)).
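The same calculation as a short R sketch. The variable names are illustrative, and the setup is the simplified one from the example (a single payoff at maturity, annual compounding), not SVB's actual portfolio.

```r
# Assumptions from the example: $100 invested at 2% for four years,
# then market rates jump to 5% right after the purchase.
principal   <- 100
bond_rate   <- 0.02
market_rate <- 0.05
years       <- 4

# What the bank receives if it holds the bond to maturity
value_at_maturity <- principal * (1 + bond_rate)^years

# A buyer who can earn 5% elsewhere discounts that payoff at 5%
value_today <- value_at_maturity / (1 + market_rate)^years

print(round(c(maturity = value_at_maturity, today = value_today), 2))
```

The discounted value comes out near $89, a paper loss of roughly 11% on the original $100 even though nothing has defaulted.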

The psychology behind the trouble

But the math is just a catalyst for the trouble. The broader issues are the decision-making by the bank, which invested significant cash in long-term bonds (duration risk), and the behaviour of the depositors, who, triggered by their investors, wanted to withdraw their money all at once (irrationality). And alas, the Jimmy Stewarts, who could talk the depositors out of extreme actions, exist only in movies and textbooks.

Further Watch

A) CEO describes pulling money from bank hours before collapse: CNN
B) Bank Run Scene from “It’s A Wonderful Life” (1946): Ian Broff
Why Banks Are Collapsing: Graham Stephan


Bank Runs and Pareto Efficiency

Let’s play a new game. Imagine there is a group of people. The players have two choices, with the following payoffs: 1) invest nothing and get nothing; 2) invest $100, with two possible outcomes: if more than 90% of the group invests, there is a net profit of $50; otherwise, the investor loses the money (-$100).

There are two Nash equilibria possible here. In the ‘good’ scenario, everyone invests and makes a profit. In the other, no one invests; therefore, nobody loses. If the game is played for the first time, two things can happen: more than 90% invest and make a profit, or the group fails to meet the 90% mark and the investors lose money.

If the game is played many times, and if the players are rational, they will soon realise the basic mentality of the others and converge to one of the two outcomes – nobody invests, or everybody invests.
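To see why both outcomes are equilibria, it helps to check that no single player gains by deviating. Here is a minimal sketch; the group size of 100 and the function name `payoff` are assumptions made for illustration.

```r
# Payoff to one player, given her choice and how many OTHER players invest.
# n_players = 100 is a hypothetical group size chosen for illustration.
payoff <- function(invest, others_investing, n_players = 100) {
  if (!invest) return(0)
  share <- (others_investing + 1) / n_players  # share investing, including her
  if (share > 0.9) 50 else -100
}

# Everyone else invests: investing (50) beats staying out (0).
all_in  <- c(invest = payoff(TRUE, 99), stay_out = payoff(FALSE, 99))
# Nobody else invests: staying out (0) beats investing (-100).
all_out <- c(invest = payoff(TRUE, 0), stay_out = payoff(FALSE, 0))

print(all_in)
print(all_out)
```

In both situations the player's best reply is to do what everyone else is doing, which is exactly what makes each outcome self-sustaining.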

Bank runs and irrationality

A well-known case of such coordination failure is a bank run. As I write, we are on the cusp of a crisis at SVB (Silicon Valley Bank) in California, a significant start-up lender. So, why do bank runs happen? A bank run occurs when depositors lose confidence in the bank and start to withdraw their deposits. The bank cannot honour every withdrawal at once, as banks do not hold all the money in their vaults but lend or invest most of it to make a profit.

Rational customers with a good memory (of several previous incidents) may decide not to panic and stay invested. But what happens more often is that people rush to withdraw their money, only to push the bank into a potentially avoidable total failure.

Nash equilibrium: bad fashion and bank runs: YaleCourses


The Girl House

There are two houses. The first house has a boy and another child. In the second house, there is a 10-year-old boy and an infant. Which place has a girl?

Well, it is impossible to predict the house with a girl with 100% certainty. But one of the houses has a higher probability of having a girl than the other.

The first house can have three equally likely combinations, written as (older, younger) – (boy, boy), (boy, girl) and (girl, boy). So the probability of finding a girl there is 2/3. Since we know the first (older) child is a boy in the second house, there are only two possible combinations – (boy, girl) or (boy, boy). So the probability of finding a girl in that house is 1/2.

So choosing house 1 gives a higher probability for a girl.
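The counting can be reproduced by listing all the cases. This is a small sketch with illustrative names, assuming boys and girls are equally likely and the two children are independent.

```r
# All four (older, younger) combinations, assumed equally likely.
kids <- expand.grid(older = c("boy", "girl"), younger = c("boy", "girl"),
                    stringsAsFactors = FALSE)
has_girl <- kids$older == "girl" | kids$younger == "girl"

# House 1: we only know that at least one child is a boy.
house1   <- kids$older == "boy" | kids$younger == "boy"
p_house1 <- sum(has_girl & house1) / sum(house1)

# House 2: we know the older child is a boy.
house2   <- kids$older == "boy"
p_house2 <- sum(has_girl & house2) / sum(house2)

print(c(house1 = p_house1, house2 = p_house2))
```

The only difference between the two houses is which cases the known information rules out: one case for house 1, two for house 2.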


Parrondo’s Paradox

Let’s play this game: game A) you lose a dollar every time you play; game B) you lose five dollars if the money at hand is odd and gain three dollars if it’s even. You have 100 bucks at the start of each game; play each game 100 times.

Following are the first few results from game A, followed by the plot of the results.

#        Money at hand
start    100
1        99
2        98
3        97
...      ...
100      0

So, you lose everything in 100 plays. Now, the second game. The Excel formula is =IF(ISODD(B1), B1-5, B1+3), assuming the starting 100 is in cell B1.

#        Money at hand
start    100
1        103
2        98
3        101
4        96
...      ...
100      0

Again, you lose everything in 100 plays.

Play two losing games!

We have played two losing games. Now play game B and game A alternately and see what happens. The Excel formula is =IF(ISODD(A2), IF(ISODD(B1), B1-5, B1+3), B1-1). The game number is in column A, starting from A2, and the money is in column B, starting from B1.

#        Money at hand
start    100
1        103
2        102
3        105
4        104
...      ...
100      200
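For readers who prefer R to a spreadsheet, the three runs can be sketched as follows. The function names are illustrative; the rules are the ones stated above.

```r
# Game A: lose a dollar. Game B: lose 5 if the money is odd, gain 3 if even.
game_a <- function(money) money - 1
game_b <- function(money) if (money %% 2 == 1) money - 5 else money + 3

# Play n rounds; `pick` maps the round number to the game played in that round.
play_rounds <- function(pick, n = 100, start = 100) {
  money <- start
  for (i in seq_len(n)) money <- pick(i)(money)
  money
}

only_a      <- play_rounds(function(i) game_a)
only_b      <- play_rounds(function(i) game_b)
# Odd rounds play B, even rounds play A (the BABA... sequence)
alternating <- play_rounds(function(i) if (i %% 2 == 1) game_b else game_a)

print(c(A = only_a, B = only_b, BABA = alternating))
```

Played on their own, both games end at 0, while the alternating sequence ends at 200.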

Putting the outcomes of all three games in one place (black represents game A, red represents game B, and green represents the alternating BABA game):

Where is the paradox?

An important thing to notice here is that game A influences game B: it turns the money at hand from odd to even before game B starts, so game B always pays out. The end result is counterintuitive, but not a paradox in the strictest sense.

Parrondo’s paradox: Wiki
The Game You Win By Losing (Parrondo’s Paradox): Vsauce2


Salience Bias

It is a cognitive bias where you focus on certain striking items or information that catch your attention and ignore things that don’t grab the same attention. Salience originates from a contrast between the event and its surroundings. An example is the news of a shark attack on a human – a rare occurrence – that scares people away from going to the seaside.

Salience bias is critical to be aware of and get under control. In finance, a person’s aspirations to create wealth through long-term investments can be derailed by her reactions to daily market stories about bulls and bears.

Salience Bias: The decision lab


Principle of indifference

The principle of indifference is a rule that helps to assign prior probabilities in Bayesian-type estimations. It says that if there are several alternative possibilities for an event, and there is no particular reason to favour one, the prior – the degree of belief – should be divided equally among all the possibilities. This degree of belief is known as credence.

In the case of coin flipping, the probability that a coin (we don’t know if it is fair or not) lands on heads takes the value of one out of two possibilities, 1/2. Another example is American roulette and the probability of the ball landing on green (0 or 00). Again, we treat all 38 pockets as equally likely, so the two green pockets get 2/38 or 1/19.

But if the possibilities can be partitioned in different ways, the principle of indifference lands in strange situations. See the ‘light switch and ball problem’. There are three balls in an urn – red, blue and green. If I pick a ball at random and it’s red, the light is turned on. If it’s blue or green, the light is off. What is the probability the light is ON?

Well, one can say 1/3 – one in three chances that the ball is red.
One can also say 1/2 because there are two possibilities – the light is ON; the light is OFF!
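The two answers come from applying the same rule to two different partitions of the possibilities. A tiny sketch, with illustrative names, makes the clash explicit:

```r
# Partition 1: indifference over the three balls.
balls <- c("red", "blue", "green")
prior_ball <- rep(1 / length(balls), length(balls))
names(prior_ball) <- balls
p_on_by_ball <- prior_ball[["red"]]     # the light is ON only for red

# Partition 2: indifference over the two light states.
states <- c("ON", "OFF")
prior_state <- rep(1 / length(states), length(states))
names(prior_state) <- states
p_on_by_state <- prior_state[["ON"]]

print(c(by_ball = p_on_by_ball, by_state = p_on_by_state))
```

Nothing in the principle itself says which partition to use, which is where the trouble comes from.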

References

Principle of indifference: Wiki
Principle of Indifference / Insufficient Reason: Statistics How To
The Principle of Indifference: jonathanweisberg.org


Who Knocks Them All? – The Solution

Last time we did match-ups between five individuals and found there is a 31% probability that one player can win all matches. This time, we evaluate it using statistical principles.

What is the probability that a specific individual (e.g., player 1) wins all four of her matches? It is \(P(1) = (1/2)(1/2)(1/2)(1/2) = 1/16\). It is not difficult to recognise that at most one player can do this: if one player wins all her matches, no one else can (the events of winning every match are mutually exclusive).

So, the probability of one player winning all is \(P(1) + P(2) + P(3) + P(4) + P(5) = 5/16 = 0.3125\).
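The 5/16 figure can also be checked by brute force, since ten matches have only 2^10 = 1024 equally likely outcomes. The sketch below is my own check, not part of the referenced solution; it uses base R's `combn` to list the matches, and the variable names are illustrative.

```r
# The ten matches between five players, one row per match
matches <- t(utils::combn(5, 2))

# Every possible tournament: for each match, 1 = first player wins,
# 2 = second player wins (1024 rows, 10 columns)
outcomes <- as.matrix(expand.grid(rep(list(1:2), 10)))

# For each tournament, does some player win all four of her matches?
has_sweep <- apply(outcomes, 1, function(o) {
  winners <- matches[cbind(1:10, o)]          # winner of each match
  any(tabulate(winners, nbins = 5) == 4)
})

print(c(sweeps = sum(has_sweep), total = length(has_sweep),
        probability = mean(has_sweep)))
```

Each player sweeps in 2^6 = 64 of the tournaments (her four results are fixed, the other six matches are free), which gives 5 x 64 = 320 out of 1024.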


Who Knocks Them All?

Five players are in a tournament, where each player plays one game against every other player. Every game must have a winner (no draws). What is the probability that some player wins all her matches? The analytical solution is available in the reference below; let’s estimate it numerically.

While the problem is straightforward, I found the coding a bit tricky. Let me explain my logic here step-by-step.

Getting the list of all the matches is perhaps the easiest. Use the function, ‘combinations’ (make sure you have installed the library, ‘gtools’).

library(gtools)
play <- combinations(5,2)
play
        [,1] [,2]
 [1,]    1    2
 [2,]    1    3
 [3,]    1    4
 [4,]    1    5
 [5,]    2    3
 [6,]    2    4
 [7,]    2    5
 [8,]    3    4
 [9,]    3    5
[10,]    4    5

Next, take each of these ten combinations and pick a winner with a 50% probability (using the ‘sample’ function).

score <- integer(10)   # winner of each of the ten matches
for (x in 1:10) {
  play1 <- play[x, ]   # the two players in match x
  score[x] <- sample(play1, 1, prob = c(1/2, 1/2))   # pick the winner
}

One such realisation is below:

1 1 1 5 2 2 2 4 5 5

The rest is easy: check for a number that repeats four times (four wins). The ‘table’ function gives the count for each number.

table(score)
score
1 2 4 5 
3 3 1 3

The output means (read top to bottom): 1 repeated 3 times, 2 repeated 3 times, 4 repeated 1 time and 5 repeated 3 times.

But we only need to know whether any number appears exactly four times. For example, in the previous case, if we want to know which number is repeated exactly once, the following code gives the output.

table(score)[table(score) == 1]
4 
1
length(table(score)[table(score) == 1])
1

The rest is smooth: make a counter for four repetitions, repeat the process a million times and find the average.

itr <- 1000000

win <- replicate(itr, {
  score <- integer(10)
  # pick a winner for each of the ten matches with equal probability
  for (x in 1:10) {
    score[x] <- sample(play[x, ], 1)
  }
  # 1 if some player has four wins, 0 otherwise
  as.integer(any(table(score) == 4))
})

mean(win)

You get 0.312. The analytical solution is 5/16. We will discuss the derivation in the next post.

A probability technique worth knowing: MindYourDecisions
