We have seen and used these before, but let us refresh a few basic statistical parameters. The times between failures of an instrument (in weeks), from which the mean time between failures (MTBF) is calculated, are given in the following table. Calculate the key parameters that summarise its performance.
22 | 32 | 44 | 2 | 90 | 56 | 20 |
3 | 93 | 29 | 32 | 28 | 12 | 38 |
75 | 69 | 37 | 61 | 54 | 38 | 79 |
15 | 45 | 12 | 15 | 62 | 49 | 50 |
88 | 74 | 44 | 38 | 69 | 57 | 51 |
15 | 69 | 16 | 48 | 44 | 72 | 52 |
72 | 26 | 9 | 19 | 73 | 54 | 50 |
There are 49 data points in total. We will estimate the mean, median and mode, and the times by which 10% (P10) and 90% (P90) of the failures have occurred.
Central Tendency
The mean, median and mode describe the central tendency of the data. The mean is the average of the data: add up all the values and divide by the number of observations (49).
# R code: store the 49 observations in a vector and compute the mean
machine <- c(22, 32, 44, 2, 90, 56, 20, 3, 93, 29, 32, 28, 12, 38, 75, 69, 37, 61, 54, 38, 79, 15, 45, 12, 15, 62, 49, 50, 88, 74, 44, 38, 69, 57, 51, 15, 69, 16, 48, 44, 72, 52, 72, 26, 9, 19, 73, 54, 50)
mean(machine)
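As a quick check on the definition, the same figure can be reproduced by hand: add up the observations and divide by their count. The sketch below re-declares the `machine` vector so that it runs on its own; `total` and `n` are names introduced here purely for illustration.

```r
# The 49 observations, one table row per line
machine <- c(22, 32, 44,  2, 90, 56, 20,
              3, 93, 29, 32, 28, 12, 38,
             75, 69, 37, 61, 54, 38, 79,
             15, 45, 12, 15, 62, 49, 50,
             88, 74, 44, 38, 69, 57, 51,
             15, 69, 16, 48, 44, 72, 52,
             72, 26,  9, 19, 73, 54, 50)

total <- sum(machine)     # sum of all observations: 2202
n     <- length(machine)  # number of observations: 49
total / n                 # about 44.94 weeks, the same value as mean(machine)
```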
The median is the middle value of the data, i.e. 50% of the observations lie below it and 50% above. Let us rewrite the table in ascending order (read down each column, from left to right). If n is odd, the median is the value at position (n+1)/2; if n is even, it is the average of the values at positions n/2 and (n+2)/2. Since the number of observations is 49 (odd), the median is the 25th element, 45, found in the fourth row of the fourth column.
2 | 15 | 29 | 44 | 50 | 61 | 73 |
3 | 16 | 32 | 44 | 51 | 62 | 74 |
9 | 19 | 32 | 44 | 52 | 69 | 75 |
12 | 20 | 37 | 45 | 54 | 69 | 79 |
12 | 22 | 38 | 48 | 54 | 69 | 88 |
15 | 26 | 38 | 49 | 56 | 72 | 90 |
15 | 28 | 38 | 50 | 57 | 72 | 93 |
median(machine)
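The positional rule described above can also be sketched directly, which makes it easy to see where the value 45 comes from. The vector is re-declared so that the snippet is self-contained; `sorted`, `n` and `med` are illustrative names, not part of the original code.

```r
machine <- c(22, 32, 44,  2, 90, 56, 20,
              3, 93, 29, 32, 28, 12, 38,
             75, 69, 37, 61, 54, 38, 79,
             15, 45, 12, 15, 62, 49, 50,
             88, 74, 44, 38, 69, 57, 51,
             15, 69, 16, 48, 44, 72, 52,
             72, 26,  9, 19, 73, 54, 50)

sorted <- sort(machine)   # ascending order, as in the rewritten table
n      <- length(sorted)  # 49, which is odd

if (n %% 2 == 1) {
  # odd n: take the single middle element at position (n + 1) / 2
  med <- sorted[(n + 1) / 2]
} else {
  # even n: average the two middle elements at positions n / 2 and n / 2 + 1
  med <- (sorted[n / 2] + sorted[n / 2 + 1]) / 2
}
med   # 45, the 25th element
```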
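The mode is the most frequently occurring value (or values) in the set. In our case, 15, 38, 44 and 69 each occur three times, more often than any other value. Since R has no built-in function for the statistical mode (the built-in mode() reports an object's storage type instead), we create one.
stat_mode <- function(x) {
  ux <- unique(x)                # distinct values, in order of first appearance
  tab <- tabulate(match(x, ux))  # number of occurrences of each distinct value
  ux[tab == max(tab)]            # all values that reach the highest count
}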
stat_mode(machine)
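One way to cross-check the result is to count the occurrences with `table()` and keep the values that reach the highest count. This is only a sketch of an alternative check, with the vector re-declared so that it runs on its own; `counts` and `top` are illustrative names.

```r
machine <- c(22, 32, 44,  2, 90, 56, 20,
              3, 93, 29, 32, 28, 12, 38,
             75, 69, 37, 61, 54, 38, 79,
             15, 45, 12, 15, 62, 49, 50,
             88, 74, 44, 38, 69, 57, 51,
             15, 69, 16, 48, 44, 72, 52,
             72, 26,  9, 19, 73, 54, 50)

counts <- table(machine)               # frequency of each distinct value
top    <- counts[counts == max(counts)] # keep only the most frequent values
top                                    # 15, 38, 44 and 69, each with a count of 3
as.numeric(names(top))                 # the modes as plain numbers
```

Note that `table()` lists the values in sorted order (15, 38, 44, 69), whereas `stat_mode()` returns them in the order in which they first appear in the data (44, 38, 69, 15). The set of modes is the same.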