Data & Statistics

NBA Draft – Probabilities

April 20, 2023

Now that you know the probabilities given to the fourteen teams and how the lottery system works, what are the chances that team number 1 gets the lottery?

Getting first

What is the probability of team number 1 (the team with the worst performance in the regular season) getting the lottery in the first draw?

$P_1 = 140/1000 = 0.14$

Getting second

What is the probability that Team 1 gets lucky in the second draw? Well, it is the joint probability that another team (i) obtains the first draw AND team 1 gets the second.

$P_1(2) = P_i(1) * \frac{140}{1000-X} = \frac{X}{1000} * \frac{140}{1000-X}$

Notice X, the number of combinations allocated to team ‘i’ that won the first, will not be available for the second lot, and, therefore, you subtract from 1000. And remember, ‘i’ varies from 2 to 14 (all squads other than Team 1). So, you estimate the joint probability with each of them and add them up. Rearrange terms and sum it over ‘i’,

$P_1(2) =\sum\limits_{i=2}^{14} \frac{140}{1000} * \frac{X}{1000-X} = \sum\limits_{i=2}^{14} P_1 \frac{P_i}{1-P_i} = P_1 \sum\limits_{i=2}^{14} \frac{P_i}{1-P_i}$

Let’s estimate the value using the R code:

prob_value <- c(0.14, 0.14, 0.14, 0.125, 0.105, 0.09, 0.075, 0.06, 0.045, 0.03, 0.02, 0.015, 0.01, 0.005)
prob.sum = 0
for(i in 2:14){
    current = prob_value[i]/(1-prob_value[i])
    prob.sum = prob.sum + current
}
prob.sum*prob_value[1]

The answer is 0.1341732. So, the probability of Team 1 winning one of the first two draws = 0.14 + 0.1341732 ≈ 27.4%
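The same sum can be written without an explicit loop. This is only a sketch under the same assumptions as above; the short names `p`, `others` and `p_second` are introduced here for illustration.

```r
# Lottery probabilities for teams 1 to 14 (same values as prob_value above)
p <- c(0.14, 0.14, 0.14, 0.125, 0.105, 0.09, 0.075, 0.06,
       0.045, 0.03, 0.02, 0.015, 0.01, 0.005)

# P_i / (1 - P_i) for every team except Team 1, summed and scaled by P_1
others <- p[-1]
p_second <- p[1] * sum(others / (1 - others))
p_second
```

R applies the division element by element, so `sum()` replaces the `for` loop.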

Getting third

Extending the same logic, the probability for Team 1 to get the third lot is

$\\ P_1(3) = P_i(1) * P_j(2) * \frac{140}{1000-X-Y} = \frac{X}{1000} * \frac{Y}{1000-X} * \frac{140}{1000-X-Y} \\ \\ P_1(3) = \mathop{\sum\sum} \frac{140}{1000} * \frac{X}{1000-X} * \frac{Y}{1000-X-Y} \\ \\ P_1(3) = \mathop{\sum\sum}\limits_{i\neq j \neq 1} P_1 * \frac{P_i}{1-P_i} * \frac{P_j}{1-P_i-P_j}$

prob_value <- c(0.14, 0.14, 0.14, 0.125, 0.105, 0.09, 0.075, 0.06, 0.045, 0.03, 0.02, 0.015, 0.01, 0.005)
prob.sum = 0
for(i in 2:14){
  for(j in 2:14){
    if(i != j){
      prob.sum = prob.sum + prob_value[i]*prob_value[j] / ((1-prob_value[i]) * (1-prob_value[i]-prob_value[j]))
    }
  }
}
prob.sum*prob_value[1]

The answer is 0.1274865. So, the probability of Team 1 winning one of the first three draws = 0.14 + 0.1341732 + 0.1274865 ≈ 40%
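The double sum can also be evaluated as a matrix of pairwise terms. The sketch below is one possible way to do it, assuming the same probabilities; the names `p`, `q`, `term` and `p_third` are illustrative.

```r
p <- c(0.14, 0.14, 0.14, 0.125, 0.105, 0.09, 0.075, 0.06,
       0.045, 0.03, 0.02, 0.015, 0.01, 0.005)
q <- p[-1]  # teams 2 to 14

# term[i, j] = P_i / (1 - P_i) * P_j / (1 - P_i - P_j)
term <- outer(q, q, function(a, b) (a / (1 - a)) * (b / (1 - a - b)))

# The same team cannot win both of the first two draws
diag(term) <- 0

p_third <- p[1] * sum(term)
p_third
```

Setting the diagonal to zero plays the role of the `i != j` check in the loop.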

Getting fourth (final)

$\\ P_1(4) = P_i(1) * P_j(2) * P_k(3) \frac{140}{1000-X-Y-Z} = \frac{X}{1000} * \frac{Y}{1000-X} * \frac{Z}{1000-X-Y} * \frac{140}{1000-X-Y-Z}\\ \\ P_1(4) = \mathop{\sum\sum\sum}\limits_{i \neq j \neq k \neq 1} P_1 * \frac{P_i}{1-P_i} * \frac{P_j}{1-P_i-P_j} \frac{P_k}{1-P_i-P_j-P_k}$

prob_value <- c(0.14, 0.14, 0.14, 0.125, 0.105, 0.09, 0.075, 0.06, 0.045, 0.03, 0.02, 0.015, 0.01, 0.005)
prob.sum = 0
for(i in 2:14){
  for(j in 2:14){
    for(k in 2:14){
      if(i != j && i != k && j != k){
        current = (prob_value[i]/(1-prob_value[i])) * (prob_value[j]/(1-prob_value[i]-prob_value[j])) * (prob_value[k]/(1-prob_value[i]-prob_value[j]-prob_value[k]))
        prob.sum = prob.sum + current
      }
    }
  }
}
prob.sum*prob_value[1]

The answer is 0.1197205. So, the probability of Team 1 winning one of the four lottery picks = 0.14 + 0.1341732 + 0.1274865 + 0.1197205 = 52.1%
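A quick way to sanity-check the 52.1% figure is a Monte Carlo simulation. The sketch below assumes that drawing four teams without replacement, with the remaining probabilities rescaled after each draw, is a fair model of the lottery with redraws; the seed and the number of simulations are arbitrary choices.

```r
p <- c(0.14, 0.14, 0.14, 0.125, 0.105, 0.09, 0.075, 0.06,
       0.045, 0.03, 0.02, 0.015, 0.01, 0.005)

set.seed(2023)   # arbitrary seed, only for reproducibility
n_sim <- 50000   # number of simulated lotteries (arbitrary)

# sample() without replacement picks one team at a time, with probability
# proportional to the weights of the teams that are still left
top4 <- replicate(n_sim, sample(14, 4, prob = p))

# Share of simulated lotteries in which Team 1 lands in the top four
p_top4 <- mean(colSums(top4 == 1) > 0)
p_top4
```

The estimate should land close to the analytical value, with some simulation noise that shrinks as `n_sim` grows.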


NBA Draft – The Lottery

April 19, 2023

Now, the lottery. Fourteen ping pong balls – numbered 1 through 14 – are placed in a glass drum and well mixed for 20 seconds. Four balls are then collected from the mix at ten-second intervals. So why 14 balls? The clue: it is not related to the 14 teams that are participating in the draw! It has to do with statistics – four balls, picked randomly from fourteen without regard to order, give 1001 possible combinations, just over the 1000 needed to express the probabilities in steps of 0.1%.

$_{14}C_4 = \frac{14!}{4! 10!} = \frac{14 * 13 * 12 * 11}{4*3*2} = 77 * 13 = 1001$

Each team gets a list of four-ball combinations (the look-up table) based on their assigned probability, as seen in the earlier post. E.g., Pistons, Rockets and Spurs get 140 numbers, the Hornets receive 125, Pelicans 5 etc.

Here is the list of the first 140 numbers generated by the R command, combinations(14,4)[1:140,] (combinations() comes from the gtools package). It is just an illustration, and I am not sure if the NBA assigns numbers in this order.
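For readers without the extra package, base R can produce a similar list. The sketch below uses `choose()` and `combn()`; the names `n_comb`, `all_comb` and `first_140` are illustrative, and the row order is assumed (not verified against the NBA's table) to be lexicographic.

```r
# Number of four-ball combinations from 14 balls
n_comb <- choose(14, 4)
n_comb

# combn() returns one combination per column, so transpose to one per row
all_comb <- t(combn(14, 4))

# First 140 combinations, i.e. as many as one 14% team would receive
first_140 <- all_comb[1:140, ]
head(first_140)
```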

Once the team that wins the first pick is determined, the lottery is repeated for the next pick. If a combination belonging to an already selected team comes up in the draw again, the machine is reset for another try.

References

2022 NBA Draft Lottery Presented By State Farm: NBA
The room where it happens: Behind the scenes at the NBA Draft lottery: The Athletic
How NBA Draft Lottery Probabilities Are Constructed: Squared Statistics
Nervous energy, phone withdrawal and a waiting period: Inside the 2022 NBA Draft Lottery drawing room


NBA Draft

April 18, 2023

Each year, the NBA teams recruit the best available talents – from colleges in the US or from overseas – through a process known as the draft. A draft order determines which team can choose first, second etc. Let’s divide the recruitment process into four stages.

Order teams

The 30 teams in the NBA are ordered in the reverse order of regular season record – the worst takes place 1, and the best gets place 30.

Draft Lottery

The top 14 of the previous list (remember: the worst 14 of the regular season) are eligible for the draft lottery. These 14 teams are the ones who miss out on the playoffs, i.e., 30 total – 16 playoffs = 14 remaining. The lottery is only to determine the top 4 picks. The 14 teams get probabilities of winning the lotteries based on where in the list they are.

Team	Probability
1 (the last)	14.0%
2	14.0%
3	14.0%
4	12.5%
5	10.5%
6	9.0%
7	7.5%
8	6.0%
9	4.5%
10	3.0%
11	2.0%
12	1.5%
13	1.0%
14	0.5%

For example, here is the list for 2023 with the names and their respective probabilities.

Team	Win%	Probability
Pistons	.207	14.0%
Rockets	.268	14.0%
Spurs	.268	14.0%
Hornets	.329	12.5%
Trail Blazers	.402	10.5%
Magic	.415	9.0%
Pacers	.427	6.8%
Wizards	.427	6.7%
Jazz	.451	4.5%
Mavs	.463	3.0%
Bulls	.488	1.8%
OKC	.488	1.7%
Raptors	.500	1.0%
Pelicans	.512	0.5%

You may notice a slight variation in the chances. That happens whenever two or more teams tie (same win%): their probabilities are added and divided equally (if the combined number of combinations is odd, the split hands a slight advantage to one of the teams).
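As a worked example, take the Pacers and the Wizards, who tied for the 7th and 8th places. The sketch below only reproduces the arithmetic of the split; how the extra combination is awarded is not covered here, and the variable names are illustrative.

```r
# Combinations (out of 1000) attached to the 7th and 8th places: 7.5% and 6.0%
slots <- c(75, 60)
total <- sum(slots)   # 135 combinations shared by the two tied teams

# Equal split; with an odd total, one team receives the extra combination
shares <- c(ceiling(total / 2), floor(total / 2))
shares / 1000         # 0.068 and 0.067, i.e. 6.8% and 6.7%
```

The same arithmetic applied to the 11th and 12th places (20 + 15 = 35 combinations) gives the 1.8% and 1.7% seen for the Bulls and OKC.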

Picks 5-14 and 15-30

The 10 teams that missed out on the four lottery picks get to pick players as per their order in the list. This means that the No. 1 team on the list, if it misses all four lottery picks, is guaranteed the No. 5 spot. The same goes for the 16 playoff teams – they choose the next 16 candidates (No. 15-30). This concludes round 1.

Picks 31-60

The entire second round (No. 31-60) is determined by the reverse order of the regular season record.

The statistics of the lottery are in the next post.

References

How NBA Draft Lottery Probabilities Are Constructed: Squared Statistics
NBA draft lottery: Wiki
NBA Draft Lottery: Odds, history and how it works: NBA
Tanking Won’t Die in the New NBA Draft Lottery System. It Will Only Evolve: The Ringer


Republican Bayes

April 17, 2023

Let’s answer this question. In the Pew Research Center poll results published in 2010, 53% of Republicans, 14% of Democrats and 31% of Independents answered NO to the question, “Is there solid evidence that the earth is warming?”
If a respondent answered NO, what is the probability that she is a Republican? Note that in this survey, conducted on Oct 13-18, 2010, 25% of the participants were Republicans, 31% were Democrats, and 40% were Independents.

Let’s use the general formula of Bayes’ theorem here:

$\\ P(j|N) = \frac{P(N|j)*P(j)}{\sum\limits_{i = 1}^{n} P(N|i)*P(i)}$

Here, j represents Republican, and i runs over the three groups: Republican, Democrat and Independent. So the required probability that a person is a Republican, given that she answered NO, is:

$P(R|N) = \frac{P(N|R)*P(R)}{P(N|R)*P(R) + P(N|D)*P(D) + P(N|I)*P(I)} \\\\ = \frac{0.53*0.25}{0.53*0.25 + 0.14*0.31 + 0.31*0.4} = 0.44$

So, there is a 44% chance that the random person is a Republican: no better than flipping a coin!
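The calculation is easy to reproduce in R. The sketch below uses the poll numbers quoted above; the vector names are illustrative.

```r
# P(NO | party) and P(party), as quoted from the poll
p_no_given <- c(R = 0.53, D = 0.14, I = 0.31)
p_party    <- c(R = 0.25, D = 0.31, I = 0.40)

# Bayes' theorem: each joint probability divided by the total probability of NO
posterior <- p_no_given * p_party / sum(p_no_given * p_party)
round(posterior, 2)
```

The same vector also gives the posterior probabilities for Democrats and Independents.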

Increasing Partisan Divide on Energy Policies: Pew Research


Post-Season Begins – Part II

April 15, 2023

Before we conclude and are ready for the post-season, here are a few more statistics on how they played in the regular season.


Post Season Begins

April 14, 2023

So, the NBA postseason 2023 starts in a couple of days. Let’s look at how the teams performed in the regular season.

library(tibble)  # provides as_tibble(), used below
n_data <- read.csv("./nba23.csv")

Most and least wins

win_data <- n_data[order(-n_data$W),c(1,3)]
as_tibble(win_data)

Here are the top ten and bottom ten.

Most and least points per game

win_data_top <- n_data[order(-n_data$PTS),c(1,7)]
as_tibble(win_data_top)

win_data_bot <- n_data[order(n_data$PTS),c(1,7)]
as_tibble(win_data_bot)


Accuracy and Asymmetry

April 13, 2023

Let’s develop a simple prediction technique to identify the sex of a person based on height. The data comes from 1050 participants and has the following form.

The first step is to plot them and check their distributions.

A naive way to set up the prediction is to assign everyone with height > 64 inches as male.

y_hat <- ifelse(heights$height > 64, "Male", "Female") 
mean(heights$sex == y_hat)

The answer is an impressive 83%

But how well did it predict individually?

mean(heights$sex[heights$sex == "Male"] == y_hat[heights$sex == "Male"])
mean(heights$sex[heights$sex == "Female"] == y_hat[heights$sex == "Female"])

For males, the accuracy is about 94% and for females, it’s only 44%. The discrepancy prompts us to look at the respective number of samples in the set.

length(heights$sex[heights$sex == "Female"])
length(heights$sex[heights$sex == "Male"])

Females are 238, and males are 812.
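The imbalance explains why the overall accuracy looks so good. A small sketch, using only the rounded figures quoted in this post (the variable names are illustrative):

```r
# Group sizes and per-group accuracies quoted above (rounded values)
n_male   <- 812; acc_male   <- 0.94
n_female <- 238; acc_female <- 0.44

# Overall accuracy is the sample-size-weighted average of the group accuracies
overall <- (n_male * acc_male + n_female * acc_female) / (n_male + n_female)
overall
```

Because males make up roughly three-quarters of the sample, their high accuracy dominates the average and masks the poor result for females.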


Singing Competition

April 10, 2023

Ana, Becky and Claire are three singers entering a contest. Ana has won 4% of past competitions, Becky has 5%, and Claire has 6%. If Ana has submitted 7 albums, Becky 2 and Claire 3, what is the probability that Ana will win this time?

The general formula of Bayes’ theorem is:

$\\ P(j|W) = \frac{P(W|j)*P(j)}{\sum\limits_{i = 1}^{n} P(W|i)*P(i)} \\ \\ = \frac{P(W|j)*P(j)}{P(W|1)*P(1) + P(W|2)*P(2) + P(W|3)*P(3)}$

In the present case, for Ana, it is:

$\\ P(Ana|W) = \frac{P(W|Ana)*P(Ana)}{P(W|Ana)*P(Ana) + P(W|Becky)*P(Becky) + P(W|Claire)*P(Claire)} \\\\ = \frac{0.04 *(7/12)}{0.04 *(7/12) + 0.05 *(2/12) + 0.06 *(3/12)} = 0.5$

50% chance!
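Here is a short R sketch of the same calculation, assuming (as the formula above does) that the prior for each singer is her share of the 12 submitted albums. The vector names are illustrative.

```r
# Past win rates and number of albums submitted by each singer
win_rate <- c(Ana = 0.04, Becky = 0.05, Claire = 0.06)
albums   <- c(Ana = 7, Becky = 2, Claire = 3)

# Prior: share of the submitted albums that belongs to each singer
prior <- albums / sum(albums)

# Bayes' theorem: win rate times prior, normalised over the three singers
posterior <- win_rate * prior / sum(win_rate * prior)
posterior
```

Ana has the lowest win rate of the three, but her larger share of the submissions lifts her to the highest posterior probability.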


The Dropout Fallacy

April 9, 2023

Why do people go to college? To some, it is to learn. To academics and philosophers, it is more than just learning: it enriches the intellectual and social capital of individuals. But to many, a college education is preparation for getting a job.

And there is nothing wrong with that thought – there is a strong positive correlation between jobs and education. Here is data from the U.S. Bureau of Labor Statistics:

Degree	Median Weekly Earnings (USD)	Unemployment Rate (%)
Doctoral	1909	1.5
Professional	1924	1.8
Master’s	1574	2.6
Bachelor’s	1334	3.5
Associate’s	963	4.6
College, no degree	899	5.5
High School	809	6.2
Less than High School	626	8.3

Unemployment rates and earnings by educational attainment, 2021
Note: Data are for persons aged 25 and over. Earnings are for full-time wage and salary workers. Source: Current Population Survey, U.S. Department of Labor, U.S. Bureau of Labor Statistics

But that holds only until you encounter the superheroes – the Gates, the Dells and the Jobs – the college dropouts! Countless stories and speeches reinforce the theme that dropouts compensate for their shortcomings in education through determination, superior intelligence and perseverance.

There can be a lot of factors behind the observation of successful dropouts. Foremost among them is randomness: out of the millions that have a chance but fail to complete their college, a negligible few happen to become millionaires. And they get more airtime in public. In that respect, the dropout fallacy is a survivorship bias.

The second invisible factor is related to the confounding effect of the social and cultural capital of the prosperous – such as access to a network of successful people, easy access to financing their ventures etc.

References

Unemployment Rates and Earnings by educational attainment, 2021: U.S. BLS
The Myth of the Successful College Dropout: The Atlantic


Free and Unlimited

April 8, 2023

The concept of FREE! is one of the most compelling forces behind human irrationality. Many examples suggest that the market power of FREE! is not simply an extrapolation of a discount.

In a famous experiment by Shampanier et al., the researchers offered participants a choice between Hershey’s (low-value chocolate) and a Lindt truffle (high-value) under three different price offers – (0&14), (1&15) and (0&10). The first number inside the brackets is the price of Hershey’s in cents, and the second is that of Lindt. The results showed that, in both FREE! options, the demand for Lindt dropped from 36% to below 20%, while that for Hershey’s went up from 14% to 40%. Note that 40-50% of the participants opted for nothing.

In the real world, the appeal of free and unlimited has been hailed as the driver behind the blockbuster success of India’s Jio telecom company. When it was launched to the public in September 2016, Jio SIM cards were available for free, along with 4GB of data a day, for three months. And the results? The Indian telecom industry, which had six players at that time, was reduced to four, and Jio has about 350 million subscribers today.

Dan Ariely, Predictably Irrational

Shampanier et al., Zero as a Special Price: The True Value of Free Products, Marketing Science, 26 (6), 2007, 742
