PCA of NBA Players

Let’s now move to the NBA. The following is the PCA biplot of ESPN’s top 40 NBA players of the 2022-23 regular season.

We can see a few things:
1) Damian Lillard and Steph Curry form a cluster that lies close to the 3PM (three-pointers made) vector.
2) A few centres sit close to each other, near the BLKPG (blocks per game) vector.
3) Jokic and Giannis are placed far away from the rest.
4) APG (assists per game) and TOPG (turnovers per game) make similar (negative) contributions to principal component 2. The assist leaders, Harden, Haliburton and Young, are closer to the APG vector.
5) Centres and power forwards dominate the right side of principal component 1, whereas the guards take the left.

We also see the 3PM and FG% (field goal percentage) vectors pointing in diametrically opposite directions, suggesting that the two are negatively correlated.
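To see why opposite arrows hint at a negative correlation, here is a minimal sketch on made-up data (not the ESPN numbers). The variables `three_pm` and `fg_pct` are hypothetical stand-ins, simulated so that one falls as the other rises; the point is only that their loadings on the first principal component end up with opposite signs.

```r
set.seed(1)
n <- 40

# Simulated, illustrative values only: fg_pct is built to fall as three_pm rises
three_pm <- rnorm(n, mean = 2, sd = 1)
fg_pct   <- 0.50 - 0.03 * three_pm + rnorm(n, sd = 0.01)
toy <- data.frame(three_pm, fg_pct)

# Sample correlation between the two simulated variables
r <- cor(toy$three_pm, toy$fg_pct)

# PCA on standardised variables; `rotation` holds the loadings,
# i.e. the directions of the arrows in a biplot
pca <- prcomp(toy, scale. = TRUE)
loadings_pc1 <- pca$rotation[, "PC1"]

print(r)
print(loadings_pc1)

# biplot(pca) would draw the two arrows pointing in opposite directions
```

The reverse reading is only approximate on real data: a biplot shows just the first two components, so opposite arrows suggest, rather than prove, a negative correlation.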

And, if you are wondering who they are:

The data are taken from the ESPN site using the following R code:

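library(rvest)  # also makes the %>% pipe available
# Download the season-leaders page and parse the HTML tables on it
nba_23 <- read_html("https://www.espn.com/nba/seasonleaders")
nba_23 <- nba_23 %>% html_table(fill = TRUE)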

This is followed by a few clean-up steps:

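library(dplyr)  # provides mutate_at() and vars()
nba_data <- as.data.frame(nba_23)
# The second row holds the real column names; promote it to the header
names(nba_data) <- unlist(nba_data[2, ])
nba_data <- nba_data[-1:-2, ]
# Drop the header rows repeated inside the table (safe even if there are none)
nba_data <- nba_data[!(nba_data$PLAYER %in% "PLAYER"), ]
# Convert the statistic columns from character to numeric
nba_data <- nba_data %>% mutate_at(vars(GP, MPG, `FG%`, `FT%`, `3PM`, RPG, APG, STPG, BLKPG, TOPG, PTS), as.numeric)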
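The post does not show how the biplot itself was produced. A minimal sketch, assuming base R's `prcomp()` and `biplot()` (other packages would work just as well), could look like the following. The helper name `pca_biplot` is hypothetical, and `toy_stats` holds made-up numbers standing in for the cleaned `nba_data` so that the snippet runs without scraping anything.

```r
# Hypothetical helper: run a scaled PCA on the chosen columns and draw a biplot
pca_biplot <- function(df, stat_cols, label_col) {
  stats <- df[, stat_cols]
  rownames(stats) <- df[[label_col]]   # player names label the points
  # Centre and scale so that stats in different units are comparable
  pca <- prcomp(stats, center = TRUE, scale. = TRUE)
  biplot(pca, cex = 0.7)
  invisible(pca)
}

# Made-up stand-in for the cleaned table (illustrative values only)
set.seed(42)
toy_stats <- data.frame(
  PLAYER = paste("Player", 1:10),
  PTS    = round(runif(10, 15, 33), 1),
  RPG    = round(runif(10, 3, 12), 1),
  APG    = round(runif(10, 2, 10), 1),
  BLKPG  = round(runif(10, 0.2, 2.5), 1)
)

pca <- pca_biplot(toy_stats, c("PTS", "RPG", "APG", "BLKPG"), "PLAYER")
summary(pca)   # proportion of variance explained by each component
```

On the scraped table, the same helper could be called with `nba_data`, the statistic columns converted above, and `"PLAYER"` as the label column, after restricting the rows to the 40 players of interest. Scaling matters here because percentages, per-game counts and totals live on very different ranges.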

References

2022-2023 NBA Season Leaders: ESPN