Durbin–Watson Test

We have seen examples of regression where the basic assumption of uncorrelated residuals is violated. The Durbin–Watson test is one way to diagnose this: it measures the first-order (lag-1) autocorrelation of the residuals. Here, we work through the calculation step by step.

Step 1: Plot the data

plot(Nif_data$Year, Nif_data$Index, xlab = "Year", ylab = "Index")

Step 2: Develop a regression model

fit <- lm(Index ~ Year, data=Nif_data)
summary(fit)
Call:
lm(formula = Index ~ Year, data = Nif_data)

Residuals:
    Min      1Q  Median      3Q     Max 
-3410.3  -544.5   -96.5   507.6  5603.0 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -438.128     27.801  -15.76   <2e-16 ***
Year         566.726      2.243  252.65   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1020 on 5352 degrees of freedom
Multiple R-squared:  0.9226,	Adjusted R-squared:  0.9226 
F-statistic: 6.383e+04 on 1 and 5352 DF,  p-value: < 2.2e-16

Step 3: Extract the residuals

Nif_data$resid <- resid(fit)

Step 4: Compute the Durbin–Watson (D-W) statistic

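The D-W statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. It lies between 0 and 4: a value near 2 indicates no first-order autocorrelation, a value near 0 indicates strong positive autocorrelation, and a value near 4 indicates strong negative autocorrelation. The residuals must be in time order for the statistic to be meaningful.

$$ d = \frac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2} $$

where e_i is the i-th residual and n is the number of observations.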
sum(diff(Nif_data$resid)^2) / sum(Nif_data$resid^2)
0.006301032
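
To get a feel for the scale of the statistic, the sketch below wraps the same formula in a small helper and applies it to two simulated series. The helper name dw_stat, the seed, and both simulated series are assumptions made for illustration; none of them come from the Nif_data analysis.

```r
# Hypothetical helper: Durbin-Watson statistic for a residual vector
# that is already sorted in time order.
dw_stat <- function(e) {
  sum(diff(e)^2) / sum(e^2)
}

set.seed(1)                       # simulated data, for illustration only
n <- 500
white <- rnorm(n)                 # uncorrelated "residuals"
ar1   <- as.numeric(arima.sim(list(ar = 0.95), n = n))  # strongly autocorrelated

dw_white <- dw_stat(white)        # should land near 2
dw_ar1   <- dw_stat(ar1)          # should land well below 2
print(c(white = dw_white, ar1 = dw_ar1))
```

With the model above, dw_stat(Nif_data$resid) should reproduce the manual calculation, provided the rows of Nif_data are in time order.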

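R can do this directly: the ‘durbinWatsonTest’ function from the ‘car’ package returns the same statistic, together with the lag-1 autocorrelation of the residuals and a p-value for the null hypothesis of no autocorrelation.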

library(car)
fit <- lm(Index ~ Year, data=Nif_data)
durbinWatsonTest(fit)
lag Autocorrelation D-W Statistic p-value
   1       0.9936623   0.006301032       0
 Alternative hypothesis: rho != 0
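
The Autocorrelation and D-W Statistic columns are linked. Expanding the numerator of the D-W formula gives d = 2(1 - r) - (e_1^2 + e_n^2) / sum(e_i^2), where r is the sample lag-1 autocorrelation of the residuals, so d is approximately 2(1 - r) whenever the end-point term is small. A minimal sketch that checks this identity on simulated data follows; the data frame sim, its coefficients, and the seed are hypothetical stand-ins, not the Nif_data values.

```r
set.seed(42)                                    # simulated data, illustration only
n <- 300
sim <- data.frame(Year = seq(0, 20, length.out = n))
sim$Index <- 100 + 50 * sim$Year +
  as.numeric(arima.sim(list(ar = 0.9), n = n, sd = 20))  # trend + AR(1) noise

e <- as.numeric(resid(lm(Index ~ Year, data = sim)))     # residuals in time order

d  <- sum(diff(e)^2) / sum(e^2)                 # Durbin-Watson statistic
r1 <- acf(e, lag.max = 1, plot = FALSE)$acf[2]  # sample lag-1 autocorrelation

# Exact identity: d = 2 * (1 - r1) - (e_1^2 + e_n^2) / sum(e^2)
endpoint <- (e[1]^2 + e[n]^2) / sum(e^2)
print(c(d = d, approx = 2 * (1 - r1), exact = 2 * (1 - r1) - endpoint))
```

For the output above, 2 × (1 - 0.9936623) is about 0.0127, roughly twice the reported 0.0063. Assuming the Autocorrelation column is the usual sample lag-1 autocorrelation, the difference would come from the end-point term, which matters when d itself is this small. Either way, a statistic this close to 0 together with a p-value of 0 points to strong positive autocorrelation in the residuals.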