ROC and AUC

We’ll demonstrate the concepts of ROC (Receiver Operating Characteristic) and AUC (Area Under the Curve) using simulated weight data and R code. Here are the first ten rows of the data.

weight       obese
86.48505     0
88.04764     0
111.50064    0
112.69730    0
121.53974    0
122.34533    0
126.53330    0
129.34565    0
129.46268    1
130.17398    1

Here ‘obese’ is the outcome variable, which takes one of two values: 0 (not obese) or 1 (obese). ‘weight’ is the independent variable, also known as the predictor.
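The code that generated the data is not shown here. As a rough sketch only, data with this shape could be simulated as follows. The seed, sample size, mean and standard deviation are all assumptions chosen for illustration, so the numbers will not match the rows listed above.

```r
# Hypothetical simulation: seed, n, mean and sd are illustrative assumptions,
# not the values behind the table shown above.
set.seed(1)
n <- 100

# Sorted weights, so that a fitted curve can later be drawn with lines()
weight <- sort(rnorm(n, mean = 172, sd = 29))

# Heavier individuals are more likely to be labelled obese:
# rank(weight) / n rises from 1/n to 1, and runif() supplies the randomness.
obese <- ifelse(runif(n) < rank(weight) / n, 1, 0)

# First ten rows, in the same layout as the listing above
head(data.frame(weight, obese), 10)
```

Sorting the weights up front is a convenience: it lets the fitted probabilities be drawn as a single smooth curve without reordering.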

Now we’ll fit a logistic regression to the data using the generalised linear model function (‘glm’), store the result in a variable, and plot the fitted probabilities over the observed data.

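# Scatter plot of the observed outcome (0 or 1) against weight
plot(weight, obese, col = "blue", cex = 1.5, cex.axis = 1.5, cex.lab = 1.6)
# Logistic regression of obese on weight
glm.fit <- glm(obese ~ weight, family = "binomial")
# Overlay the fitted probabilities (assumes weight is sorted, as in the rows shown above)
lines(weight, glm.fit$fitted.values, lwd = 3)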
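The fitted values are probabilities, so turning them into a yes/no prediction needs a cut-off. The sketch below classifies at a threshold of 0.5 and computes sensitivity and specificity from the resulting confusion table. An ROC curve is what you get by repeating this for every possible threshold. The data are re-simulated under assumed parameters so the block runs on its own, and the names `threshold`, `predicted` and `confusion` are illustrative.

```r
# Illustrative data (assumed parameters, not the data listed above)
set.seed(1)
n <- 100
weight <- sort(rnorm(n, mean = 172, sd = 29))
obese <- ifelse(runif(n) < rank(weight) / n, 1, 0)
glm.fit <- glm(obese ~ weight, family = "binomial")

# Classify as obese when the fitted probability reaches the threshold
threshold <- 0.5
predicted <- ifelse(glm.fit$fitted.values >= threshold, 1, 0)

# Confusion table; fixed factor levels keep it 2 x 2 even if a class is never predicted
confusion <- table(predicted = factor(predicted, levels = c(0, 1)),
                   actual    = factor(obese, levels = c(0, 1)))

tp <- confusion["1", "1"]  # obese, predicted obese
fn <- confusion["0", "1"]  # obese, predicted not obese
tn <- confusion["0", "0"]  # not obese, predicted not obese
fp <- confusion["1", "0"]  # not obese, predicted obese

sensitivity <- tp / (tp + fn)  # true positive rate
specificity <- tn / (tn + fp)  # true negative rate
print(confusion)
print(c(sensitivity = sensitivity, specificity = specificity))
```

Lowering the threshold catches more obese individuals (higher sensitivity) at the cost of more false alarms (lower specificity), and that trade-off is what the ROC curve traces out.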

Estimating the ROC curve and the AUC requires the ‘pROC’ package.

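library(pROC)  # provides roc()
# Set the background colour and a square plotting region before drawing
par(bg = "antiquewhite1", pty = "s")
roc(obese, glm.fit$fitted.values, plot = TRUE, legacy.axes = TRUE, col = "brown", lwd = 3, print.auc = TRUE, auc.polygon = TRUE)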

We used the following options to get the final plot.
par(pty = "s"); a graphical parameter, set before calling roc(), for plotting the graph as a square
plot = TRUE; for plotting the graph
legacy.axes = TRUE; for plotting 1 - specificity (the false positive rate) on the x-axis instead of the default specificity
print.auc = TRUE; to print the value of AUC on the graph
auc.polygon = TRUE; to present AUC as a shaded area.
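To see what the printed AUC means, it can help to compute it by hand. The AUC equals the probability that a randomly chosen obese individual receives a higher fitted probability than a randomly chosen non-obese one, with ties counting as one half. The sketch below computes this in base R in two ways, by direct pairwise comparison and by the equivalent rank-sum formula. The data are re-simulated under assumed parameters so the block is self-contained, and the names `score`, `auc.pairs` and `auc.rank` are illustrative.

```r
# Illustrative data (assumed parameters, not the data listed above)
set.seed(1)
n <- 100
weight <- sort(rnorm(n, mean = 172, sd = 29))
obese <- ifelse(runif(n) < rank(weight) / n, 1, 0)
glm.fit <- glm(obese ~ weight, family = "binomial")

score <- glm.fit$fitted.values
pos <- score[obese == 1]  # fitted probabilities of the obese group
neg <- score[obese == 0]  # fitted probabilities of the non-obese group

# 1. Pairwise definition: share of (obese, non-obese) pairs ranked correctly,
#    with ties counted as half
auc.pairs <- mean(outer(pos, neg, ">") + 0.5 * outer(pos, neg, "=="))

# 2. Rank-sum (Mann-Whitney) formula; rank() averages ties by default
n1 <- length(pos)
n0 <- length(neg)
auc.rank <- (sum(rank(score)[obese == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)

print(c(auc.pairs = auc.pairs, auc.rank = auc.rank))
```

Both values are the same, and on the same data they should agree with the figure that print.auc = TRUE writes on the plot, up to rounding. An AUC of 0.5 corresponds to random guessing and 1 to perfect separation.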