November 2021

The Central Limit Theorem

The Central Limit Theorem (CLT) has intrigued me for a long time. The theorem concerns independent random variables, but it is not about the distribution of the variables themselves. A plot of independent random draws can be scattered all over the place, with whatever shape their own distribution happens to have. The central limit theorem is about the distribution of their sums. Remember this.

Let us take banks and defaulters to illustrate this point. Suppose a bank gives out 2000 loans. The bank knows that about 2.5% of the borrowers (about 50 of them) could default but does not know who those individuals are! That means the defaults are random. We also assume they are independent. These are two highly debatable notions; once in a blue moon, these assumptions will prove to be the bank’s end. But we’ll deal with it later.

So, what is the distribution of losses to this bank due to defaults? Before that, why is it a distribution and not a fixed number, say, 50 times the loss per foreclosure? If the loss per foreclosure is 100,000, that would make the total loss 50 x 100,000 = 5 million, a fixed number. The reason is that a 2.5% default rate is a probability of defaulting, not a certainty. The number of defaults, and hence the total loss to the bank, is therefore not a fixed amount but a random quantity with a distribution of its own.

Let’s disburse 2000 loans to people and collect data from 10,000 banks worldwide! How do we do it? By Monte Carlo simulations. The outcome is given below as a plot.

This is the Central Limit Theorem! To put it in words, if we repeatedly take large samples from a population, and the observations are drawn independently of each other, then the distribution of the sample sums (or the sample averages) is approximately normal, whatever the shape of the population distribution (provided its variance is finite).
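The experiment described above can be sketched in a few lines. This is a minimal illustration, not the code behind the original plot: the function names, the seed and the default arguments are my own assumptions, while the 2000 loans, the 2.5% default rate and the loss of 100,000 per default come from the text.

```python
import random
import statistics

def simulate_bank_losses(n_banks=10_000, n_loans=2_000, p_default=0.025,
                         loss_per_default=100_000, seed=1):
    """Return one total-loss figure per simulated bank (hypothetical helper)."""
    rng = random.Random(seed)
    losses = []
    for _ in range(n_banks):
        # Each loan defaults independently with probability p_default.
        defaults = sum(rng.random() < p_default for _ in range(n_loans))
        losses.append(defaults * loss_per_default)
    return losses

def share_within_one_sd(values):
    """Fraction of values lying within one standard deviation of their mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(abs(v - mean) <= sd for v in values) / len(values)

# Fewer banks than in the text, so that the sketch runs quickly.
losses = simulate_bank_losses(n_banks=1_000)
print("mean loss:", statistics.fmean(losses))              # close to 5 million
print("share within one sd:", share_within_one_sd(losses))  # roughly 0.7, as for a bell curve
```

Each individual loan is a yes/no outcome, nothing like a bell curve; it is only the sum over 2000 loans that takes the bell shape.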


The drama of Breaking Equilibrium

We all know O. Henry’s timeless classic, The Gift of the Magi. It is about a young husband (Jim) and wife (Della) who are too poor to buy a decent Christmas gift for each other. Finally, Della decides to sell her beautiful hair for 20 dollars and buys a platinum watch chain for Jim. When Jim comes home for dinner, Della tells the story and shows the gift, only to find a puzzled Jim: he has sold his watch to buy jewelled combs as a surprise gift for his beloved’s hair!

What would be a rational analysis of the decisions made by the couple in the story? First, draw the payoff matrix. There are four options: 1) Della and Jim keep what they have, 2) Della sells her hair, Jim keeps his watch, 3) Della keeps her hair, Jim sells his watch and 4) they both sell their belongings. The payoffs (D for Della, J for Jim) are:

                       Della: Not Sell Hair    Della: Sell Hair
Jim: Not Sell Watch    D = 0, J = 0            D = 5, J = 10
Jim: Sell Watch        D = 10, J = 5           D = -10, J = -10

When both decide to have no gifts for Christmas, they maintain the status quo with zero payoffs. When only one of them sells a belonging to buy a gift that the other dearly wished for, it brings happiness to the receiver (+10) and satisfaction to the giver (+5). When both sell their belongings, the gifts become useless, resulting in no material gain for either of them, and they both end up with negative payoffs. Their sacrifice was in vain!

The author chose an ending called a coordination failure in the language of game theory. It is an outcome that is out-of-equilibrium. For a short-story writer, this brings drama to his readers and conveys the value of sacrifice. In the eyes of millions of readers, the couple’s payoffs were infinite.
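To see why the ending is out of equilibrium, one can check every cell of the payoff table for stability: a cell is a (pure-strategy) Nash equilibrium if neither player gains by changing only their own choice. The sketch below is my own illustration; the dictionary layout and the function name are assumptions, and the numbers are the payoffs from the table above.

```python
# Payoffs from the table above, stored as (Della, Jim) for each pair of choices.
PAYOFFS = {
    ("keep hair", "keep watch"): (0, 0),
    ("sell hair", "keep watch"): (5, 10),
    ("keep hair", "sell watch"): (10, 5),
    ("sell hair", "sell watch"): (-10, -10),
}
DELLA = ("keep hair", "sell hair")
JIM = ("keep watch", "sell watch")

def pure_nash_equilibria(payoffs, della_moves, jim_moves):
    """Cells where neither player can do better by changing only their own move."""
    stable = []
    for d in della_moves:
        for j in jim_moves:
            d_pay, j_pay = payoffs[(d, j)]
            della_ok = all(payoffs[(other, j)][0] <= d_pay for other in della_moves)
            jim_ok = all(payoffs[(d, other)][1] <= j_pay for other in jim_moves)
            if della_ok and jim_ok:
                stable.append((d, j))
    return stable

print(pure_nash_equilibria(PAYOFFS, DELLA, JIM))
# [('keep hair', 'sell watch'), ('sell hair', 'keep watch')]
```

Only the two cells in which exactly one of them makes the sacrifice are stable; the cell the story ends in, where both sell, is not.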

The Gift of the Magi by O. Henry


Colorectal Cancer and Meat-Eaters

In 2015, the World Health Organisation (WHO) added processed meat to group 1 carcinogens (carcinogenic to humans). It was based on a report published in October 2015 by the International Agency for Research on Cancer (IARC). The list contains, among others, tobacco, gamma rays, benzene, and asbestos!

IARC report scanned through scientific literature and concluded that there was evidence that processed meat could cause cancer in humans. The experts concluded that each 50-gram portion of processed meat eaten daily increases the risk of colorectal cancer by 18%.

Lost in Statistics?

As usual, the media went overboard with the news, some of them to the extent:

“BACON, HAM, SAUSAGES HAVE SAME CANCER RISK AS CIGARETTES: WHO REPORT” (First Post)

Is it true that processed meat is as dangerous as smoking cigarettes? Do all items on the list carry the same risk? Does an 18% cancer risk mean that about 1 in 6 of the meat-eaters will get colorectal cancer?

Absolute and Relative Risks

The above was a classic case of people confusing relative and absolute risks. The 18% in the present case is a relative risk: an increase over the risk of getting colorectal cancer among people who do not eat processed meat. To make sense of a relative risk, you first need to know the base, or the absolute risk, on which it was calculated. And if the base is low, expect a high percentage for every unit change, like the GDP growth rates of smaller developing economies vs big, well-developed ones.

So, what is the absolute risk or the proportion of patients in the population? As per the American Cancer Society, the lifetime risk of developing colorectal cancer is about 1 in 23 (4.3%) for men and 1 in 25 (4.0%) for women. For simplicity, let’s round it up to 5%; 18% of 5% is 0.9 percentage points (0.18 x 0.05 = 0.009), or about 1 in 100. The bottom line is:

About 5 in 100 people in the US can expect to get colorectal cancer in their lifetime, and if all 100 start eating 50 g of processed meat every day, that number rises by about one additional person, to roughly 6 in 100!
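The arithmetic is short enough to check by hand, but here it is as a small sketch. The function name is my own; the 5% baseline is the rounded figure used above, and 0.18 is the relative increase quoted from the IARC report.

```python
def absolute_risk_after(baseline_risk, relative_increase):
    """Apply a relative increase (0.18 for 18%) to a baseline absolute risk."""
    return baseline_risk * (1 + relative_increase)

baseline = 0.05   # rounded lifetime risk used in the text
increase = 0.18   # relative increase per daily 50 g portion of processed meat
new_risk = absolute_risk_after(baseline, increase)
extra_per_100 = (new_risk - baseline) * 100

print(f"{baseline:.1%} -> {new_risk:.1%}, about {extra_per_100:.1f} extra cases per 100 people")
# 5.0% -> 5.9%, about 0.9 extra cases per 100 people
```

The same 18% applied to a much larger baseline would mean far more additional cases, which is why the relative number alone says so little.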

This also answers the remaining doubt on the group 1 list: Not all carcinogens in the WHO list have the same risk.

IARC Report on Processed Meat

Known Carcinogens: Cancer.org

Carcinogenicity of Processed Meat: The Lancet Oncology

How common is colorectal cancer: cancer.org


Game of Chickens

Two teenagers are driving towards each other on a straight road, with an apparent show of courage to prove who can stay longer before turning off. Each teen’s expectation is to stay straight for the longest time and force the other person to swerve. The winner shines as the rebel, and the loser becomes the chicken!

Assume A and B are playing, and you can imagine four possible outcomes: 1) A chickens out, 2) B chickens out, 3) both chicken out and 4) they collide head-on (and possibly die!). What are their payoffs? Let’s write down some, based on assumed reasons why they play this game in the first place – teen energy, naivety, happiness, pride, girls (the stereotypes, you see).

                     Player A Stays          Player A Chickens
Player B Stays       A = -INF; B = -INF      A = -100; B = +100
Player B Chickens    A = +100; B = -100      A = 0; B = 0

If player A stays and player B chickens, A gets +100, mainly in happiness, pride, etc., whereas B gets -100 (in shame!). The exact opposite happens when the fortunes are reversed.

Let’s understand the choices from player A’s point of view. If player B stays, A can either stay (-INF) or turn away (-100), so turning away gives the better payoff. If B turns away, A can stay (+100) or turn away (0), so staying is better. Unlike in the prisoner’s dilemma, A has no single choice that is best regardless of what B does.
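The same reasoning can be written down as a best-response lookup. This is only a sketch with names of my own choosing; the payoffs are player A’s entries from the table above, with -INF represented by Python’s negative infinity.

```python
NEG_INF = float("-inf")

# Payoff to player A, indexed by (A's move, B's move), taken from the table above.
PAYOFF_A = {
    ("stay", "stay"): NEG_INF,
    ("chicken", "stay"): -100,
    ("stay", "chicken"): 100,
    ("chicken", "chicken"): 0,
}
MOVES = ("stay", "chicken")

def best_response_of_a(b_move):
    """A's highest-payoff move, given what B does."""
    return max(MOVES, key=lambda a_move: PAYOFF_A[(a_move, b_move)])

responses = {b_move: best_response_of_a(b_move) for b_move in MOVES}
has_dominant_move = len(set(responses.values())) == 1

print(responses)          # {'stay': 'chicken', 'chicken': 'stay'}
print(has_dominant_move)  # False
```

A’s best move flips with B’s move, so the game offers no dominant strategy to fall back on.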

Given all the possibilities, what is an optimum strategy for both players? Both are courageous and stubborn. Assume player A knows B and also knows that player B knows player A. It means they both try for maximum returns, but continuing the status quo will be fatal. So, there must be an exit strategy that each of them holds: to swerve away from the other, but at the last possible moment.

In my opinion, the best option that minimises the shame and, at the same time, prevents death is when both players turn off a second before the crash!


Predict the Number – Thaler Experiment

In 1997, American behavioural economist Richard Thaler asked the readers of the Financial Times to submit a number between 0 and 100 so that the person whose number was the closest to 2/3 of the average of all numbers would be the winner. What will be your answer to this question as a rational decision-maker?

Your first step is to eliminate the obvious. The highest possible average from 0 to 100 is 100. It happens when everybody submits the number 100. It would mean the answer to the problem is (2/3) of 100, or about 67. So, any number above 67 as a submission is not a rational choice.

You can’t stop there. Once you find that the highest rational submission is 67, this number becomes the new highest possible average, and (2/3) of it is about 45! This iterative reasoning continues until you reach zero!

What could be an intuitive answer to this problem? Here, you assume people can randomly guess between 0 and 100, and the average is 50. (2/3) of 50 is 33. If you stop after stage 1, you submit 33 as the answer. If you continue for another round, based on the understanding that the average choice of the crowd is 33, the winner choice is 22. The number becomes 15 in the next stage and ends with 0.
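The stage-by-stage reasoning is just repeated multiplication by 2/3. A minimal sketch, assuming (as in the paragraph above) that the crowd starts from an average of 50; the function name and the number of stages are my own choices.

```python
def guesses_by_depth(start=50.0, factor=2 / 3, levels=10):
    """Best guess after each further round of 'what will the others think?'."""
    guesses = []
    value = start
    for _ in range(levels):
        value *= factor
        guesses.append(value)
    return guesses

print([round(g) for g in guesses_by_depth(levels=4)])  # [33, 22, 15, 10]
```

The guesses shrink by a third at every stage and get arbitrarily close to zero, which is where the iteration ends.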

So, the rational answer is 0. However, the average obtained in the actual experiment in Financial Times turned out to be 18; therefore, the winner was the one who submitted 12. The leading choices of the readers were 1, 22 and 33! When he repeated the game later, the average was 17.3, and the leaders were 1, 0 and 22.

Thaler Experiment: Financial Times


Rational Thinking and Prisoner’s dilemma

The prisoner’s dilemma is a much-discussed subject in game theory. Police arrested two individuals for their involvement in some criminal activities and put them in prison. They have adequate evidence to frame charges that carry two years of imprisonment, but not enough for the maximum of ten years.

Police approach each prisoner and offer a deal in return for testifying against the other person. If she betrays the other and the other person remains silent, she can go free. If she keeps quiet and the other person gives evidence against her, she gets the maximum punishment of 10 years. If they both remain silent, the existing term of two years continues. If they both testify against each other, they both get five years.

Imagine A is a rational decision-maker, and she assumes that a similar offer may also have gone to prisoner B. She starts from the point of view of the other person before deciding on her own. Person B has two options: remain silent or betray person A. If B remains silent, A can remain silent (2 years) or betray B (0 years). Betraying B is the better of the two. If B testifies, A can remain silent (10 years) or betray B (5 years). Betraying B is the better one here again. In other words, A has no option but to give evidence against B.
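A’s line of thought amounts to looking for a dominant choice: one that is at least as good as the alternative whatever B does. Here is a rough sketch of that check, with the prison terms from the story and names that are purely my own.

```python
# Years in prison for A, indexed by (A's choice, B's choice); fewer is better.
YEARS_FOR_A = {
    ("silent", "silent"): 2,
    ("betray", "silent"): 0,
    ("silent", "betray"): 10,
    ("betray", "betray"): 5,
}
CHOICES = ("silent", "betray")

def dominant_choice(years):
    """Return A's choice that is never worse than the alternatives, or None."""
    for mine in CHOICES:
        others = [c for c in CHOICES if c != mine]
        never_worse = all(years[(mine, b)] <= years[(other, b)]
                          for other in others for b in CHOICES)
        if never_worse:
            return mine
    return None

print(dominant_choice(YEARS_FOR_A))  # betray
```

The game is symmetric, so B reaches the same conclusion, and both end up with five years instead of two.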

Cooperation vs Competition

Decision-making such as this starts with knowing the potential strategies of the other. Once sorted out, the player will opt for the option that protects her, irrespective of the other’s choice. 

A rational decision may not be the decision that gives the maximum payoff. In the present story, cooperation might appear as that option, where each serves two years in jail. But it was not a cooperative game, where both the parties trust each other and form a joint strategy – to remain silent. Therefore, it is not the optimal option in cases where the players compete against each other.

Cold War and Nuclear Build-Up

The Nuclear build-up between the USSR and the USA during the Cold War period is an example of a prisoner’s dilemma in real life. From the viewpoint of the USA or the USSR, the rational (strategic) option was to pile up more nuclear warheads instead of reducing them, although one has every right to argue that the latter could have been the better choice for humankind.


More Casino Games

In one of my previous posts, we have seen theoretical considerations of gambling with the roulette wheel. Today, we play the game not by going to a casino but by sitting behind a computer, using a simulation technique known as the Monte Carlo method.

Here, we let 1000 people each play 100 games of red/black betting. Let’s start with the players:

Approximately 250 people (25%) have won some money. Also, you don’t see the perfectly smooth binomial distribution that you would expect from a series of Bernoulli trials. This is because it is not a theoretical calculation but a simulated game using random numbers.

The Game is for Casino

The rest of the post is about the casino owner. We know that the probability of a gambler winning a bet on colour is 18/38, and the probability of losing is 20/38, which leaves about five cents per dollar (2/38) going the casino’s way on average. You will see from the picture that as the number of games increases, the certainty of getting 5 cents per dollar increases. This is known as the law of large numbers: the observed average converges to the expected value as the number of events increases.

The next plot summarises the net money the casino earned in the day. Unless only a few people turn up and play a really small number of games, the company would make what is expected: about 5 cents for every dollar bet.
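For those who want to try it, the casino’s side can be simulated as follows. This is a sketch under my own assumptions (one-dollar bets, a fixed seed, made-up function and variable names), not the script that produced the plots; the 18/38 probability is the one quoted above.

```python
import random

P_WIN = 18 / 38  # probability that a red/black bet wins, as quoted above

def casino_take_per_dollar(n_bets, seed=0):
    """Simulate n_bets one-dollar colour bets; return the casino's average gain per bet."""
    rng = random.Random(seed)
    casino_gain = 0
    for _ in range(n_bets):
        # The player wins a dollar with probability P_WIN, otherwise loses a dollar.
        casino_gain += -1 if rng.random() < P_WIN else 1
    return casino_gain / n_bets

expected_take = 1 - 2 * P_WIN  # equals 2/38, a little over 5 cents per dollar

for n_bets in (100, 10_000, 100_000):
    print(n_bets, round(casino_take_per_dollar(n_bets), 4), "expected", round(expected_take, 4))
```

With only 100 bets the simulated take can land well away from the expected value, even on the wrong side of zero; with 100,000 bets it sits close to it.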

Now, let 2000 people play fewer (50) games each.

About 600 people out of 2000 (30%) may have won some money. As far as the Casino owner is concerned, he is pretty happy!
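The simulated shares of winners can be cross-checked against the exact binomial probability of finishing ahead, that is, of winning more than half of the bets. A small sketch, assuming equal even-money stakes on every game; the function name is hypothetical.

```python
from math import comb

def prob_player_ahead(n_games, p_win=18 / 38):
    """Exact probability of winning more than half of n_games even-money bets."""
    min_wins = n_games // 2 + 1
    return sum(comb(n_games, k) * p_win**k * (1 - p_win)**(n_games - k)
               for k in range(min_wins, n_games + 1))

for n_games in (50, 100):
    print(n_games, "games:", round(prob_player_ahead(n_games), 3))
```

The values should come out in the neighbourhood of the simulated shares (about 30% for 50 games and about 25% for 100 games), and they keep falling as the number of games grows, which is the law of large numbers seen from the player’s side.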


Cancer Happens

This one is going to disappoint some of you. What causes cancer? The answer is – life. Cancer happens; well, most of the time!

Primary reasons for cancer in humans are classified into three categories: environmental (E), hereditary (H), and mistakes during DNA replications (R).

Researchers at Johns Hopkins University evaluated cancer incidence in 69 countries and found correlations between cancer risks and these factors. Before going into details, please see the picture that I copied from Tomasetti’s paper (Tomasetti et al., Science. 2017 March 24; 355(6331): 1330–1334. doi:10.1126/science.aaf9011.).

First, a primer on what I meant by the replication factor, R. Approximately three mutations occur every time a stem cell divides. Most of these are inconsequential to us, but occasionally, they cause trouble. What is so special about stem cells? Stem cells are the body’s prime cells that give birth to cells with specialized functions – the blood cells, brain cells, heart muscle cells or bone cells.

Leading environmental factors known to cause cancer in humans include UV from sunlight, tobacco, soot, asbestos, carcinogenic chemicals, and ionising radiation.

Randomness, Again

These results also partly explain the observed stochastic nature of the disease. Remember, “my granny had cancer without smoking, and my uncle is still smoking and healthy”, all that stuff! Now you know the reasons for the deadly outcome are many – some you know already, some you don’t, and perhaps never will.

Not an Either Or

Results from the study also point to the human tendency to rush to wrong conclusions, similar to a deductive fallacy. Environmental reasons are responsible for some cancer types, but it does not mean all cancers are due to the environment. To be precise, two in three are not! Does it mean you ignore environmental factors, smoke, chew tobacco, and give up sunscreen? Quite the opposite. One must continue avoiding exposure to carcinogens as they are the levers to manage those individual probabilities that are within your control, which eventually leads to a reduction in the combined chances of getting the disease (remember the AND rule of probability?). You thus avoid the disease, at least for a while!

The last takeaway of the study, which showed Pearson’s linear correlation of 0.804 between total stem cell divisions and lifetime cancer risk, leads to an unwanted prize for achieving higher life expectancies – the longer you live, the higher your chance of dying of cancer!

Science. 2015 January 2; 347(6217): PubMed Link

Science. 2017 March 24; 355(6331): PubMed Link

Stem Cells: Mayo Clinic


Appreciating Stochastic Processes

We all understand deterministic processes, where the outcome of an action is definite and predictable. You touch a hotplate, and it hurts, possibly a blister by the next day! Press the pedal, and the car goes faster. Certainty is nice and visible, a cause and an effect; decision-making is easy. Personal stories reinforce our appreciation for deterministic processes. The brain is wired for determinism.


On the other hand, stochastic processes are not straightforward and require deliberate training to understand. Doctors say smoking causes cancer, yet we don’t see all smokers dying of cancer. To make matters worse, some non-smokers suffer lung cancer!

When Reasons are Many, Output is a Chance

It is the randomness of input that governs stochastic processes. The output becomes a set of probabilities, be it weather predictions or movements in the stock market. Atmospheric scientists use the best of their physics and thermodynamics to forecast the weather using the available data on wind speed, temperature, humidity, and pressure. Even small uncertainties in those variables can result in large ranges in predictions. Some of those uncertainties are random; others come from things we may never understand.

Extreme cases are the black swan events. Here, an event has a tiny chance of occurring but creates unimaginable consequences. Is there a better example than the COVID-19 pandemic and its impact on the global economy?


Survivors of Russian Roulette

This post is inspired by the famous book Fooled by Randomness by Nassim Nicholas Taleb.

Imagine a person who wants to play Russian Roulette. It is a game in which the referee (or the executioner?) takes a revolver containing one bullet in one of its six chambers, spins the cylinder, points to the head and pulls the trigger. If you survive, you get a prize – 4 million dollars.

Alive at Age 50

As seen in my previous posts, the person’s chance of surviving one game is 5 in 6 (83%). This person decides to play this game once a year, starting at age 25. What is the probability that she will become a 100-million-dollar net-worth individual (NWI) by 50? Use the Bernoulli trial formula that we discussed in the previous post, and we get a survival chance of about 1% after 25 games: $\binom{25}{25}(5/6)^{25}(1/6)^{0} \approx 0.0105$. The odds of earning 100 million this way are, indeed, small; if she makes it, there is no option but to acknowledge her exceptional luck!

Let me complicate the plot: imagine 1000 individuals started playing this game in different parts of the world (different venues, referees, different TV sponsors!). There is a definite possibility of about ten winners (give or take a few) after the 25th season of this deadly game.
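Both numbers, the 1% chance for one player and the ten or so survivors among 1000, can be reproduced with a short sketch. The function names, the seed and the defaults are my own assumptions; the 5/6 survival chance, the 25 games and the 1000 players come from the text.

```python
import random

def survival_probability(n_games, p_survive=5 / 6):
    """Chance of surviving n_games independent rounds."""
    return p_survive ** n_games

def simulate_survivors(n_players=1_000, n_games=25, p_survive=5 / 6, seed=0):
    """Count the players who survive every one of their n_games rounds."""
    rng = random.Random(seed)
    survivors = 0
    for _ in range(n_players):
        # all() stops at the first failed round, just as the game would.
        if all(rng.random() < p_survive for _ in range(n_games)):
            survivors += 1
    return survivors

p25 = survival_probability(25)
print(f"chance of surviving 25 games: {p25:.4f}")            # about 0.0105
print(f"expected survivors among 1000: {1000 * p25:.1f}")    # about 10.5
print("survivors in one simulated world:", simulate_survivors())
```

Change the seed and the count of survivors moves around, but a handful of "superstars" turns up almost every time.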

A Superhero is Born

Suddenly, these superstars are on the covers of Fortune and in popular TV shows, and parents of young children start pressuring their kids to learn this game. Spiritual gurus proclaim their remarkable moral habits; TV anchors interview their grandmothers; data analysts flood YouTube, fitting their BMI, eating habits and academic records to their achievements. Ladies and gentlemen, I am presenting you this evening: the superhero of all fallacies, the Survivorship Bias. It is a selection bias in which reasoning is made by considering only the survivors’ data and not those that have already ceased to exist.

Survivorship bias exists everywhere, far more than what you think. The superstars of the stock market were Taleb’s favourite example. Consistent longer-term performances of fund managers have been the subject of many studies. More often than not, they were no better than Roulette gamers. Then comes the band of ultra-rich business leaders – risk-takers, college dropouts, lonely, full of grand ideas …

Another example is our obsession with the past. You must have heard about extraordinary claims of how prosperous, healthy and long-living our ancestors used to be when living ‘close to nature’ – all these when the average life expectancy was just in the 20s! Of course, the author who wrote the stories included only those who survived their adolescence AND showed some amazing ‘acts of valour’. Try calculating the joint probability of the following: chance of surviving adolescence x having some remarkable skills x being found by the author x getting the king’s approval to include in the book.

Dice Probability Calculator

Human Life Expectancy PNAS

Mistakes due to Survivorship Bias: BBC
