Welcome to this new issue of e-Tutorial. We now focus on time series models, with special emphasis on tests of unit roots and cointegration. The theoretical background given in class is essential for the computational exercise below, so I recommend that you study Prof. Koenker’s Lectures 8 and 9 as you go through the tutorial.
The first thing you need to do is download the updated Thurman and Fisher (1988) data, called eggs.csv, from the Econ 536 web site. Save it in your preferred directory (I will save mine as “C:/eggs.csv”). The next step is to load the data into R:
Thurman<-read.table("C:/eggs.csv", header=T, sep=",")
An alternative is to call your data from the web:
Thurman<-read.table("http://www.econ.uiuc.edu/~econ536/Data/eggs.csv", header=T, sep=",")
The next step is to declare chickens and eggs as time series:
year<-ts(Thurman$year)
chic<-ts(Thurman$chic)
egg<-ts(Thurman$egg)
First, it is important that you sketch the ADF test, explaining the NULL and the ALTERNATIVE hypotheses.
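As a starting point for that sketch, the regression that the adf function estimates can be written as below. The symbols (\alpha, \delta, \rho, \gamma_j) are notation assumed here for illustration and may differ from the lecture notes:

```latex
\Delta x_t = \alpha + \delta\, t + \rho\, x_{t-1} + \sum_{j=1}^{k} \gamma_j\, \Delta x_{t-j} + \varepsilon_t,
\qquad
H_0:\ \rho = 0 \ \text{(unit root)}
\quad \text{vs.} \quad
H_1:\ \rho < 0 \ \text{(no unit root)}
```

In this notation, int=F drops \alpha, trend=F drops \delta t, and the ADF statistic is the t-ratio on \rho, which corresponds to the L(x) row of the regression output.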
ADF Test in R: I suggest that you use the R code adf.R, available at http://www.econ.illinois.edu/~econ536/Routines/:
"adf" <- function(x, k = 0, int = TRUE, trend = FALSE){
    # NB: returns conventional lm summary so p-values for adf test are wrong!
    require(dynlm)
    dx <- diff(x)
    formula <- paste("dx ~ L(x)")
    if(k > 0)
        formula <- paste(formula," + L(dx,1:k)")
    if(trend){
        s <- time(x)
        t <- ts(s - s[1],start = s[1],freq = frequency(x))
        formula <- paste(formula," + t")
    }
    if(!int) formula <- paste(formula," - 1")
    summary(dynlm(as.formula(formula)))
}
Your job is to copy the R code above and paste it into the R console. This will create an R function called “adf”, which runs the unit root test for each case. You should apply the ADF test to each individual series (chickens and eggs), controlling for the number of lags and the inclusion of constants and trends.
If you don’t feel like downloading it and copying and pasting, you can source it directly from the web page:
source("http://www.econ.uiuc.edu/~econ536/Routines/adf.R")
Examples DF for Chickens
adf(chic, k=1, int=T, trend=T)
Loading required package: dynlm
Loading required package: zoo
Attaching package: 'zoo'
The following objects are masked from 'package:base':
as.Date, as.Date.numeric
Time series regression with "ts" data:
Start = 3, End = 75
Call:
dynlm(formula = as.formula(formula))
Residuals:
Min 1Q Median 3Q Max
-55584 -10044 1244 8846 77813
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.711e+04 3.097e+04 1.521 0.1328
L(x) -1.143e-01 6.814e-02 -1.678 0.0979 .
L(dx, 1:k) -9.758e-02 1.223e-01 -0.798 0.4275
t -2.080e-01 1.385e+02 -0.002 0.9988
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 21970 on 69 degrees of freedom
Multiple R-squared: 0.07153, Adjusted R-squared: 0.03116
F-statistic: 1.772 on 3 and 69 DF, p-value: 0.1606
Then you can test the significance of the coefficient on L(x) by using the appropriate Dickey-Fuller critical values (Table B.6 in Hamilton, 1994).
Here the null hypothesis is the presence of a unit root. The augmented Dickey-Fuller statistic is -1.678, which lies inside the acceptance region at the 1%, 5%, and 10% levels, as you can see from the tables. Therefore, we cannot reject the presence of a unit root.
If you don’t want to use the tables, there is an R package called fUnitRoots that provides the Dickey-Fuller table:
install.packages("fUnitRoots")
require(fUnitRoots)
The qadf function will give you the quantiles for the ADF test:
qadf(0.01, N=75, trend="ct")
From this starting point, you can add lags by changing k=1 to k=2, k=3, k=4, and so on. If you wish to exclude the intercept, just replace int=T with int=F. (As usual, T means true, i.e., inclusion, and F means false, i.e., exclusion.) The same applies to the inclusion or exclusion of the trend.
My suggestion is that you run 3 different types of ADF test, each of them with 1, 2, 3, and 4 lags:
(i) Models with intercept and trend (int=T, trend=T)
(ii) Models with intercept but without trend (int=T, trend=F)
(iii) Models without intercept and without trend (int=F, trend=F)
adf(chic, k=1, int=T, trend=F)
Time series regression with "ts" data:
Start = 3, End = 75
Call:
dynlm(formula = as.formula(formula))
Residuals:
Min 1Q Median 3Q Max
-55583 -10040 1240 8838 77817
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.708e+04 2.480e+04 1.898 0.0618 .
L(x) -1.143e-01 5.991e-02 -1.908 0.0605 .
L(dx, 1:k) -9.763e-02 1.181e-01 -0.826 0.4114
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 21810 on 70 degrees of freedom
Multiple R-squared: 0.07153, Adjusted R-squared: 0.045
F-statistic: 2.696 on 2 and 70 DF, p-value: 0.07445
adf(chic, k=1, int=F, trend=F)
Time series regression with "ts" data:
Start = 3, End = 75
Call:
dynlm(formula = as.formula(formula))
Residuals:
Min 1Q Median 3Q Max
-63272 -8100 1536 9986 73866
Coefficients:
Estimate Std. Error t value Pr(>|t|)
L(x) -0.001164 0.006278 -0.185 0.853
L(dx, 1:k) -0.151569 0.116748 -1.298 0.198
Residual standard error: 22210 on 71 degrees of freedom
Multiple R-squared: 0.02374, Adjusted R-squared: -0.003759
F-statistic: 0.8633 on 2 and 71 DF, p-value: 0.4261
Do that for each individual series. This will generate 12 regressions for chickens and 12 for eggs. Very likely, some of them will indicate the presence of a unit root, while others will not. The choice of the best model can be made by calculating AIC, SIC, or any other reasonable criterion. At the end, please provide a table with the summary of your results, and draw your conclusions.
After performing the test on the three models, what can you conclude?
The adf function gave you the tests for the annual chickens series, using 1 lag. We recommend that you repeat these 3 regressions for lags 2, 3, and 4 as well. After you complete this cycle for chickens, you need to do the same cycle for eggs. At the end of both cycles, you will have 24 regression outputs. You don’t need to report all output details; concentrate instead on the ADF test statistic of each equation. Think of it as writing an academic paper: don’t spend too much space on intermediate results, and concentrate on your final conclusions, which can be paradoxical as you go through the different testing steps. In the end, you are expected to summarize your main results in a table, and then to write a paragraph commenting on the different results you obtain when you include or exclude trends, constants, and lags for both the chickens and eggs series.
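The cycle described above can be automated. The following is only a sketch under stated assumptions: adf_stat is a hypothetical helper (it is not part of adf.R) that estimates the same regression with base lm() and returns just the t-ratio on the lagged level, and x_sim is a simulated random walk standing in for the chickens or eggs series.

```r
# Hypothetical helper (not part of adf.R): the same ADF regression via base lm()
adf_stat <- function(x, k = 1, int = TRUE, trend = FALSE) {
  x  <- as.numeric(x)
  dx <- diff(x)
  E  <- embed(dx, k + 1)               # col 1: dx_t; cols 2..k+1: dx_{t-1}, ..., dx_{t-k}
  dat <- data.frame(dy = E[, 1], xlag = x[(k + 1):(length(x) - 1)])
  if (k > 0) dat <- data.frame(dat, dlag = E[, -1, drop = FALSE])
  if (trend) dat$tt <- seq_len(nrow(dat))
  f <- if (int) dy ~ . else dy ~ . - 1
  coef(summary(lm(f, data = dat)))["xlag", "t value"]
}

set.seed(536)
x_sim <- cumsum(rnorm(75))             # simulated random walk, NOT the eggs data

# Three specifications times four lag lengths = 12 regressions per series
grid <- expand.grid(k = 1:4, spec = c("int+trend", "int", "none"),
                    stringsAsFactors = FALSE)
grid$adf <- mapply(function(k, spec)
  adf_stat(x_sim, k, int = spec != "none", trend = spec == "int+trend"),
  grid$k, grid$spec)

# Rows = specification, columns = number of lags
adf_table <- xtabs(adf ~ spec + k, data = grid)
print(round(adf_table, 3))
```

To use it on the tutorial data, replace x_sim with chic or egg. Each entry must still be compared with the Dickey-Fuller critical values for the matching specification, not with the usual t tables.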
The first thing you should always do is sketch the Engle-Granger test, explaining the NULL and the ALTERNATIVE hypotheses.
Engle-Granger in R: The test can be done in 3 steps, as follows:
Pre-test the variables for the presence of unit roots (done above) and check if they are integrated of the same order
Regress the long run equilibrium model of chickens vs. eggs
Engle<-lm(chic~egg)
summary(Engle)
Call:
lm(formula = chic ~ egg)
Residuals:
Min 1Q Median 3Q Max
-57843 -30963 -15177 18559 169232
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 446387.634 27575.931 16.188 <2e-16 ***
egg -6.229 5.055 -1.232 0.222
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 44200 on 73 degrees of freedom
Multiple R-squared: 0.02037, Adjusted R-squared: 0.006955
F-statistic: 1.518 on 1 and 73 DF, p-value: 0.2218
Obtain the residuals.
residual<-resid(Engle)
Plot the residuals along time.
ts.plot(year,residual, gpars=list(main="Chickens vs. eggs: Is there cointegration?", xlab="year", ylab="residuals"))
Also plot the residuals versus the lagged residuals. Draw your conclusions.
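Beyond the plots, the residual step is usually formalized as a Dickey-Fuller regression on the residuals themselves. The sketch below uses two simulated series that share a stochastic trend (y1 and y2 are illustrative stand-ins, not chickens and eggs) and computes the t-ratio on the lagged residual. Because the residuals come from an estimated regression, that statistic should be compared with Engle-Granger critical values, not with the standard Dickey-Fuller tables.

```r
set.seed(536)
n  <- 200
z  <- cumsum(rnorm(n))            # common stochastic trend (simulated)
y1 <- 1 + 2 * z + rnorm(n)        # two simulated I(1) series sharing that trend
y2 <- z + rnorm(n)

e    <- resid(lm(y1 ~ y2))        # long-run regression and its residuals
de   <- diff(e)                   # e_t - e_{t-1}
elag <- e[-n]                     # e_{t-1}, aligned with de

# Dickey-Fuller regression on the residuals, without constant or trend
eg_fit  <- lm(de ~ elag - 1)
eg_stat <- coef(summary(eg_fit))["elag", "t value"]
eg_stat
```

With the tutorial objects, the same idea amounts to replacing e with the residual vector obtained above; lagged differences of the residuals can be added as extra regressors if the residuals look autocorrelated.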
Some authors (e.g., Enders, 1995) consider a fourth step, consisting of the estimation of error-correction models and checks of model adequacy. However, you are not required to do that for the purposes of problem set 3.
At the end of the test, please provide a table summarizing your results. Comment on your findings.
Again, we recommend that you sketch the Johansen test, explaining the NULL and the ALTERNATIVE hypotheses. Then we suggest that you use the R code johansen.R, provided by Prof. Koenker, and available at http://www.econ.uiuc.edu/~econ472/routines.html:
"johansen"<- function(x, L = 2){
    #Johansen Test of cointegration for multivariate time series x
    #Returns vector of eigenvalues after that you are on your own.
    #This is a modified version for R, in which rts is substituted by ts.
    x <- ts(x)
    n <- nrow(x)
    p <- ncol(x)
    Ly <- lag(x[, 1], -1)
    D <- diff(x[, 1])
    for(i in 1:p) {
        if(i > 1) {
            D <- ts.intersect(D, diff(x[, i]))
            Ly <- ts.intersect(Ly, lag(x[, i], -1))
        }
        if(L > 0)
            for(j in 1:L)
                D <- ts.intersect(D, lag(diff(x[, i]), - j))
    }
    iys <- 1 + (L + 1) * (0:(p - 1))
    Y <- D[, iys]
    X <- D[, - iys]
    Ly <- ts.intersect(Ly, D)[, 1:p]
    ZD <- lm(Y ~ X)$resid
    ZL <- lm(Ly ~ X)$resid
    df <- nrow(X) - ncol(X) - 1
    S00 <- crossprod(ZD)/df
    S11 <- crossprod(ZL)/df
    S01 <- crossprod(ZD, ZL)/df
    M <- solve(S11) %*% t(S01) %*% solve(S00) %*% S01
    eigen(M)$values
}
Your job is to copy the code above and paste it into the R console. This will create an R function called “johansen” that calculates the eigenvalues.
Once again, if you don’t feel like downloading it and copying and pasting, you can source it directly from the web page:
source("http://www.econ.uiuc.edu/~econ536/Routines/johansen.R")
The command to obtain the eigenvalues is:
johansen(cbind(egg,chic), L=1)
[1] 0.1414344 0.0116857
The code above refers to the case including trend and intercept, and the appropriate critical values should be used. Note that the theoretical background here is essential, given that you need to interpret the eigenvalues and calculate the test statistic yourself before drawing your conclusions.
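As a rough guide to that calculation, the usual trace and maximum-eigenvalue statistics can be computed from the eigenvalues as sketched below. The effective sample size T_eff = 73 is an assumption (75 observations, minus one for differencing, minus L = 1 lag); check it against your own data before relying on the numbers.

```r
lambda <- c(0.1414344, 0.0116857)   # eigenvalues reported above
T_eff  <- 73                        # ASSUMPTION: 75 obs - 1 difference - 1 lag
p      <- length(lambda)

# Trace statistic for H0: rank <= r, for r = 0, ..., p - 1
trace_stat <- sapply(seq_len(p), function(i) -T_eff * sum(log(1 - lambda[i:p])))

# Maximum-eigenvalue statistic for H0: rank = r against rank = r + 1
max_stat <- -T_eff * log(1 - lambda)

data.frame(r = seq_len(p) - 1, trace = trace_stat, max_eigen = max_stat)
```

Each row is then compared with the Johansen critical values for the matching deterministic specification; no critical values are hard-coded here because they depend on that choice.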
Please send comments to hrtdmrt2@illinois.edu or srmntbr2@illinois.edu
Comments on Unit Root Tests:
Unit root tests are very sensitive to the number of included lags and to the inclusion of a constant and trend. That is why we are asking you to show all ADF statistics in the table above. Very likely, some of the results will indicate the presence of a unit root while others will not.
How can you draw a general conclusion from the test results with so many models available? Johnston & DiNardo (1997, p. 226), for example, mention that one of the objectives of including lags is to achieve white noise residuals. Other authors recommend the use of AIC or SIC in model selection.
It is quite simple to calculate information criteria in ADF tests. Each output of adf corresponds to a linear regression on the lags, constant, and/or trend of the series. From the OLS regression, you recover the sample size, the RSS, and the number of parameters required to calculate SIC or AIC, plus the original ADF statistic. But remember to use the Dickey-Fuller critical values.
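A minimal sketch of that calculation follows, under two assumptions that are not in the text above: the criteria are written in the common form n log(RSS/n) plus a penalty, and the estimation sample is held fixed across lag lengths so that the criteria are comparable. The series x is a simulated random walk, not the eggs data.

```r
set.seed(536)
x    <- cumsum(rnorm(100))          # simulated random walk, NOT the eggs data
kmax <- 4
dx   <- diff(x)
E    <- embed(dx, kmax + 1)         # common estimation sample for every lag length
dy   <- E[, 1]                      # dx_t
xlag <- x[(kmax + 1):(length(x) - 1)]   # x_{t-1}

ic <- t(sapply(1:kmax, function(k) {
  dlags <- E[, 2:(k + 1), drop = FALSE]  # dx_{t-1}, ..., dx_{t-k}
  fit <- lm(dy ~ xlag + dlags)           # intercept, no trend
  n   <- length(dy)
  p   <- length(coef(fit))               # number of estimated parameters
  rss <- sum(resid(fit)^2)
  c(k   = k,
    adf = coef(summary(fit))["xlag", "t value"],
    AIC = n * log(rss / n) + 2 * p,
    SIC = n * log(rss / n) + p * log(n))
}))
ic
ic[which.min(ic[, "SIC"]), "k"]     # lag length preferred by SIC
```

The information criteria only select the lag length; the adf column of the selected row is still judged against the Dickey-Fuller critical values.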