Welcome to e-Tutorial, your online help for Econ 536. This issue provides an introduction to dynamic models in econometrics and draws on Prof. Koenker’s Lecture Note 3. The adopted philosophy is “learn by doing”: the material is intended to help you solve Problem Set 2 and to enhance your understanding of the topics.[1]
The data set used in this tutorial was borrowed from Johnston and DiNardo’s Econometric Methods (1997, 4th ed), but slightly adjusted for your needs. It is called AUTO2. You can download the data by visiting the Econ 536 web site (Data). As you will see, this adapted data set contains five series.
auto<-read.table("http://www.econ.uiuc.edu/~econ536/Data/AUTO2.txt",header=TRUE)
head(auto)
quarter gas price income miles
1 1959.1 -8.015248 4.675750 -4.505240 2.647592
2 1959.2 -8.011060 4.691292 -4.492739 2.647592
3 1959.3 -8.019878 4.689134 -4.498873 2.647592
4 1959.4 -8.012581 4.722338 -4.491904 2.647592
5 1960.1 -8.016769 4.707470 -4.490103 2.647415
6 1960.2 -7.976376 4.699136 -4.489107 2.647238
A useful recommendation for practitioners of econometrics is to summarize the data set you are going to work with. This provides a “big picture” of the variables and what you can expect from them. You can do that as follows:
summary(auto)
quarter gas price income
Min. :1959 Min. :-8.020 Min. :4.488 Min. :-4.505
1st Qu.:1967 1st Qu.:-7.836 1st Qu.:4.621 1st Qu.:-4.282
Median :1975 Median :-7.719 Median :4.676 Median :-4.158
Mean :1975 Mean :-7.763 Mean :4.724 Mean :-4.195
3rd Qu.:1983 3rd Qu.:-7.672 3rd Qu.:4.767 3rd Qu.:-4.106
Max. :1990 Max. :-7.606 Max. :5.176 Max. :-3.970
miles
Min. :2.584
1st Qu.:2.614
Median :2.647
Mean :2.713
3rd Qu.:2.811
Max. :3.036
Since you will most likely write your reports in LaTeX, it is a good idea to have R produce its output in LaTeX format. Several packages do this, for example xtable, texreg, and outreg. At the moment, the “new hit” is stargazer.
install.packages("stargazer") #Use this to install it, you only need to do it once
library(stargazer)
stargazer(auto, out="descriptive.tex")
This saves a file named descriptive.tex in your working directory. Among other options, you can also save the table as plain text (.txt) or HTML (.html). It is a good idea to read the help file, ?stargazer. The resulting LaTeX table of descriptive statistics is not reproduced here.
To estimate a time series model in R, we first need to convert the data to time series objects. To do so we need two packages:
install.packages("zoo")
install.packages("dyn") #you need to install the dyn package first. Remember to do it only once.
library(dyn)
Loading required package: zoo
Attaching package: 'zoo'
The following objects are masked from 'package:base':
as.Date, as.Date.numeric
Notice that we installed the package zoo because dyn depends on it. The next step is to create the variables you will need to run dynamic models:
gas<-ts(auto$gas,start=1959,frequency=4)
price<-ts(auto$price,start=1959,frequency=4)
income<-ts(auto$income,start=1959,frequency=4)
miles<-ts(auto$miles,start=1959,frequency=4)
Next we will try to replicate some of Johnston and DiNardo’s (1997, p. 267) graphical results:
plot(gas)
plot(price)
The ACF plots are easily obtained with the acf() function:
acf(gas)
acf(price)
Next let’s run a typical dynamic model.
f<-dyn$lm(gas~lag(gas,-1)+price)
summary(f)
Call:
lm(formula = dyn(gas ~ lag(gas, -1) + price))
Residuals:
Min 1Q Median 3Q Max
-0.107029 -0.007123 0.000328 0.010566 0.035961
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.129960 0.115034 -1.130 0.2608
lag(gas, -1) 0.970998 0.013543 71.697 <2e-16 ***
price -0.019652 0.009811 -2.003 0.0473 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.01833 on 124 degrees of freedom
(2 observations deleted due to missingness)
Multiple R-squared: 0.9765, Adjusted R-squared: 0.9761
F-statistic: 2572 on 2 and 124 DF, p-value: < 2.2e-16
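The sign convention in lag(gas,-1) is easy to get backwards. As a small sketch on a hypothetical toy series (not the AUTO2 data), the code below shows that stats::lag() with k = -1 moves the time base forward by one quarter, so that at date t the shifted series holds the value from t-1. Lining the two series up also leaves one incomplete row at each end, which is presumably what the “observations deleted due to missingness” note in the output refers to.

```r
# Toy quarterly series (hypothetical values, not the AUTO2 data)
x <- ts(c(10, 20, 30, 40), start = c(1959, 1), frequency = 4)

# k = -1 shifts the time base forward: the value dated 1959 Q1 is now dated 1959 Q2
x_lag1 <- stats::lag(x, -1)

# cbind() on ts objects aligns by date and pads with NA where a series has no value
aligned <- cbind(x, x_lag1)
print(aligned)

# Rows usable in a regression of x on its first lag
n_complete <- sum(complete.cases(aligned))
```

Calling stats::lag() explicitly avoids surprises if another loaded package (for instance dplyr) masks lag().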
To include differenced terms, wrap the series in diff():
f2<-dyn$lm(gas~lag(gas,-1)+price+lag(diff(gas),-1))
summary(f2)
Call:
lm(formula = dyn(gas ~ lag(gas, -1) + price + lag(diff(gas),
-1)))
Residuals:
Min 1Q Median 3Q Max
-0.10810 -0.00752 0.00061 0.01132 0.03990
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.12594 0.11686 -1.078 0.2833
lag(gas, -1) 0.96955 0.01379 70.289 <2e-16 ***
price -0.02281 0.01007 -2.265 0.0253 *
lag(diff(gas), -1) -0.12166 0.08928 -1.363 0.1755
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.01833 on 122 degrees of freedom
(3 observations deleted due to missingness)
Multiple R-squared: 0.976, Adjusted R-squared: 0.9754
F-statistic: 1652 on 3 and 122 DF, p-value: < 2.2e-16
If you want to reproduce the dynamic model given by Johnston and DiNardo (1997, p. 269, Table 8.5), just type:
f3<-dyn$lm(gas~price+lag(price,-1)+lag(price,-2)+lag(price,-3)+lag(price,-4)+lag(price,-5)
  +income+lag(income,-1)+lag(income,-2)+lag(income,-3)+lag(income,-4)+lag(income,-5)
  +lag(gas,-1)+lag(gas,-2)+lag(gas,-3)+lag(gas,-4)+lag(gas,-5))
summary(f3)
Call:
lm(formula = dyn(gas ~ price + lag(price, -1) + lag(price, -2) +
lag(price, -3) + lag(price, -4) + lag(price, -5) + income +
lag(income, -1) + lag(income, -2) + lag(income, -3) + lag(income,
-4) + lag(income, -5) + lag(gas, -1) + lag(gas, -2) + lag(gas,
-3) + lag(gas, -4) + lag(gas, -5)))
Residuals:
Min 1Q Median 3Q Max
-0.075543 -0.007502 -0.000365 0.007071 0.041426
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.005543 0.126043 0.044 0.965003
price -0.267645 0.037874 -7.067 1.81e-10 ***
lag(price, -1) 0.262869 0.069291 3.794 0.000248 ***
lag(price, -2) -0.017408 0.075414 -0.231 0.817895
lag(price, -3) -0.072092 0.077389 -0.932 0.353707
lag(price, -4) 0.014397 0.077210 0.186 0.852445
lag(price, -5) 0.058208 0.046340 1.256 0.211866
income 0.292773 0.158823 1.843 0.068092 .
lag(income, -1) -0.162201 0.220227 -0.737 0.463060
lag(income, -2) -0.049248 0.214371 -0.230 0.818748
lag(income, -3) 0.010408 0.213131 0.049 0.961146
lag(income, -4) 0.084907 0.210131 0.404 0.686984
lag(income, -5) -0.198963 0.153118 -1.299 0.196647
lag(gas, -1) 0.660575 0.096063 6.876 4.55e-10 ***
lag(gas, -2) 0.067011 0.114535 0.585 0.559754
lag(gas, -3) -0.023575 0.117094 -0.201 0.840825
lag(gas, -4) 0.132193 0.119014 1.111 0.269222
lag(gas, -5) 0.163127 0.101384 1.609 0.110620
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.01506 on 105 degrees of freedom
(10 observations deleted due to missingness)
Multiple R-squared: 0.9842, Adjusted R-squared: 0.9816
F-statistic: 383.7 on 17 and 105 DF, p-value: < 2.2e-16
As with everything in R, there are multiple ways to do this. We could run the same analysis using the dynlm package, which is less verbose.
install.packages("dynlm")
library(dynlm)
f_a<-dynlm(gas~L(gas,1)+price)
summary(f_a)
Time series regression with "ts" data:
Start = 1959(2), End = 1990(4)
Call:
dynlm(formula = gas ~ L(gas, 1) + price)
Residuals:
Min 1Q Median 3Q Max
-0.107029 -0.007123 0.000328 0.010566 0.035961
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.129960 0.115034 -1.130 0.2608
L(gas, 1) 0.970998 0.013543 71.697 <2e-16 ***
price -0.019652 0.009811 -2.003 0.0473 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.01833 on 124 degrees of freedom
Multiple R-squared: 0.9765, Adjusted R-squared: 0.9761
F-statistic: 2572 on 2 and 124 DF, p-value: < 2.2e-16
This is the same result as before. To see that this package is less verbose, let’s run model f3 again:
f3_a<-dynlm(gas~L(price,0:5)+L(income,0:5)+L(gas,1:5))
summary(f3_a)
Time series regression with "ts" data:
Start = 1960(2), End = 1990(4)
Call:
dynlm(formula = gas ~ L(price, 0:5) + L(income, 0:5) + L(gas,
1:5))
Residuals:
Min 1Q Median 3Q Max
-0.075543 -0.007502 -0.000365 0.007071 0.041426
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.005543 0.126043 0.044 0.965003
L(price, 0:5)0 -0.267645 0.037874 -7.067 1.81e-10 ***
L(price, 0:5)1 0.262869 0.069291 3.794 0.000248 ***
L(price, 0:5)2 -0.017408 0.075414 -0.231 0.817895
L(price, 0:5)3 -0.072092 0.077389 -0.932 0.353707
L(price, 0:5)4 0.014397 0.077210 0.186 0.852445
L(price, 0:5)5 0.058208 0.046340 1.256 0.211866
L(income, 0:5)0 0.292773 0.158823 1.843 0.068092 .
L(income, 0:5)1 -0.162201 0.220227 -0.737 0.463060
L(income, 0:5)2 -0.049248 0.214371 -0.230 0.818748
L(income, 0:5)3 0.010408 0.213131 0.049 0.961146
L(income, 0:5)4 0.084907 0.210131 0.404 0.686984
L(income, 0:5)5 -0.198963 0.153118 -1.299 0.196647
L(gas, 1:5)1 0.660575 0.096063 6.876 4.55e-10 ***
L(gas, 1:5)2 0.067011 0.114535 0.585 0.559754
L(gas, 1:5)3 -0.023575 0.117094 -0.201 0.840825
L(gas, 1:5)4 0.132193 0.119014 1.111 0.269222
L(gas, 1:5)5 0.163127 0.101384 1.609 0.110620
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.01506 on 105 degrees of freedom
Multiple R-squared: 0.9842, Adjusted R-squared: 0.9816
F-statistic: 383.7 on 17 and 105 DF, p-value: < 2.2e-16
Passing this model to the stargazer package produces a formatted regression table (not reproduced here).
Suppose you wish to compute the long-run income and price elasticities. Assuming that in the long run the variables converge to their respective steady-state values (represented by “e”):
\[gas_{t}=gas_{t-1}=gas_{t-2}=\dots=gas_{e}\]
\[price_{t}=price_{t-1}=price_{t-2}=\dots=price_{e}\]
\[income_{t}=income_{t-1}=income_{t-2}=\dots=income_{e},\]
and recalling that all variables are already in logs, you then just have to apply this steady state condition, reparameterize the model, and calculate the elasticities:
\[ \frac{d\, gas_{e}}{d\, income_{e}} = \text{long-run income elasticity} \]
\[ \frac{d\, gas_{e}}{d\, price_{e}} = \text{long-run price elasticity} \]
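To make the steady-state step concrete, suppose (as notation assumed here, not taken from the lecture notes) that the coefficients on the lags of gas are \(\alpha_{1},\dots,\alpha_{p}\) and the coefficients on the current and lagged values of one regressor are \(\beta_{0},\dots,\beta_{q}\). Imposing the steady-state condition and solving for \(gas_{e}\) gives

\[ \text{long-run elasticity} = \frac{\beta_{0}+\beta_{1}+\dots+\beta_{q}}{1-\alpha_{1}-\dots-\alpha_{p}} \]

A minimal sketch of this calculation follows. The helper name long_run_elasticity and the numbers in the example are hypothetical; with a fitted model you would pass the relevant entries of coef() instead.

```r
# Long-run elasticity implied by a dynamic log-log model:
# (sum of the regressor's coefficients) / (1 - sum of the lagged-dependent coefficients).
# 'beta'  : coefficients on x_t, x_{t-1}, ..., x_{t-q}
# 'alpha' : coefficients on y_{t-1}, ..., y_{t-p}
long_run_elasticity <- function(beta, alpha) {
  denom <- 1 - sum(alpha)
  if (abs(denom) < 1e-8) {
    stop("sum(alpha) is (almost) 1: the long-run effect is not defined")
  }
  sum(beta) / denom
}

# Hypothetical coefficients, for illustration only
beta_price <- c(-0.30, 0.10)   # current price and one lag
alpha_gas  <- c(0.50, 0.10)    # two lags of the dependent variable

lr_price <- long_run_elasticity(beta_price, alpha_gas)
print(lr_price)
```

When the lagged-dependent coefficients sum to something very close to one, the denominator is nearly zero and the implied long-run elasticity becomes extremely sensitive to small changes in the estimates, so it is worth checking that sum before interpreting the ratio.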
After that you can compare the long-run elasticities of the (reparameterized) dynamic model with the elasticities provided by the static version of this log-linear regression:
summary(lm(gas~price+income))
Call:
lm(formula = gas ~ price + income)
Residuals:
Min 1Q Median 3Q Max
-0.11798 -0.04292 -0.01194 0.05919 0.10653
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -4.09334 0.22476 -18.212 < 2e-16 ***
price -0.15028 0.03123 -4.812 4.23e-06 ***
income 0.70561 0.03373 20.916 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.0571 on 125 degrees of freedom
Multiple R-squared: 0.7779, Adjusted R-squared: 0.7743
F-statistic: 218.9 on 2 and 125 DF, p-value: < 2.2e-16
I suggest that you provide a small table comparing those results, and comment on how the elasticities differ between the static and the dynamic model.
In light of the problem set, I suggest that you compute not only the point estimates of the elasticities, but also their confidence intervals. In the static model, confidence intervals are obtained directly from the regression output. In the dynamic model, however, the elasticities are non-linear functions of the parameters. In that case, you need to find confidence intervals for the elasticities using the Delta method or bootstrap techniques, which you will see in Prof. Koenker’s Lecture Note 5 and which we will address in a future e-TA.
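As a rough preview of the Delta-method idea, here is a minimal sketch under stated assumptions: the data are simulated (not AUTO2), the model has a single lag of the dependent variable and a single regressor, and the variable names are made up for the illustration. The long-run effect is \(\theta=\beta/(1-\alpha)\), and its approximate variance is the gradient of \(\theta\) with respect to the coefficients, sandwiched around the estimated coefficient covariance matrix.

```r
set.seed(536)

# Simulated data: y_t = 0.5 + 0.6*y_{t-1} - 0.2*x_t + e_t, so the true theta is -0.2/(1-0.6) = -0.5
n <- 200
x <- as.numeric(arima.sim(list(ar = 0.7), n = n))
y <- numeric(n)
for (t in 2:n) {
  y[t] <- 0.5 + 0.6 * y[t - 1] - 0.2 * x[t] + rnorm(1, sd = 0.05)
}
d <- data.frame(y = y[-1], ylag = y[-n], x = x[-1])

fit <- lm(y ~ ylag + x, data = d)
b <- coef(fit)   # order: (Intercept), ylag, x
V <- vcov(fit)

alpha <- unname(b["ylag"])
beta  <- unname(b["x"])
theta <- beta / (1 - alpha)   # long-run effect

# Gradient of theta with respect to (intercept, alpha, beta)
grad <- c(0, beta / (1 - alpha)^2, 1 / (1 - alpha))

# Delta-method standard error and 95% confidence interval (normal approximation)
se <- sqrt(drop(t(grad) %*% V %*% grad))
ci <- theta + c(-1, 1) * qnorm(0.975) * se
print(c(estimate = theta, se = se, lower = ci[1], upper = ci[2]))
```

With several lags the same recipe applies: the gradient simply has one entry per coefficient that enters the numerator or the denominator.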
Here I recommend using the best dynamic model (selected by the Schwarz Information Criterion, which you will see in e-Tutorial 4) and computing impulse response functions with the formula in Prof. Koenker’s Lecture Note 3.
P.S.: Some authors propose alternative ways to calculate impulse response functions. One of them is to use partial derivatives (e.g., Enders, 1995, p. 24), but this method has a drawback: it is quite easy to make a mistake when the model has many lags and differences.
You are usually expected to trace the response over a reasonable number of periods, depending on the structure of your data set; for quarterly data we suggest a minimum of 40 (forty) response periods. You then plot those responses against the respective time scale (t=0,1,2,3,…,40). This generates the non-cumulative impulse response function. If you want a cumulative impulse response function, at each new period t+i (i=1,2,3,…) you add the new effect to the sum of the previous ones.
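One way to trace out these responses is a simple recursion; this is a sketch under assumed notation and not necessarily the formula in Lecture Note 3. Write the model with coefficients \(\alpha_{j}\) on \(y_{t-j}\) and \(\beta_{i}\) on \(x_{t-i}\), and give x a one-time unit impulse at t=0. The response at horizon h is then \(\beta_{h}\) (zero beyond the last lag of x) plus the \(\alpha\)-weighted responses of earlier periods. The function name irf_adl and the coefficients below are hypothetical.

```r
# Response of y to a one-time unit impulse in x at t = 0.
# 'alpha' : coefficients on y_{t-1}, ..., y_{t-p}
# 'beta'  : coefficients on x_t, x_{t-1}, ..., x_{t-q}  (beta[1] is the impact effect)
irf_adl <- function(alpha, beta, horizon = 40) {
  psi <- numeric(horizon + 1)   # psi[h + 1] holds the response at period h
  for (h in 0:horizon) {
    direct <- if (h < length(beta)) beta[h + 1] else 0
    indirect <- 0
    for (j in seq_along(alpha)) {
      if (h - j >= 0) indirect <- indirect + alpha[j] * psi[h - j + 1]
    }
    psi[h + 1] <- direct + indirect
  }
  psi
}

# Hypothetical coefficients, for illustration only
resp     <- irf_adl(alpha = 0.5, beta = 1, horizon = 40)
cum_resp <- cumsum(resp)   # cumulative impulse response

plot(0:40, resp, type = "h", xlab = "periods after the shock", ylab = "response")
lines(0:40, cum_resp, lty = 2)
```

In a stable model the cumulative response should level off at the long-run effect, here 1/(1-0.5) = 2, which gives a quick consistency check against the long-run elasticity.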
There is no mystery in rewriting the model in error-correction form. You just have to add and subtract \(y_{t-1}\) and \(\beta_{0} x_{t-1}\) in the model, reparameterize it, and obtain a dynamic equation in which the difference of the response (dependent) variable is decomposed into two parts: a) direct effects from changes in the explanatory variables, and b) indirect effects from changes in the response variable during previous periods, while it was out of equilibrium. (See, for example, Prof. Koenker’s Lecture Note 3.)
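For the simplest case with one lag of each variable, \(y_{t}=\alpha_{0}+\alpha_{1}y_{t-1}+\beta_{0}x_{t}+\beta_{1}x_{t-1}+u_{t}\) (the symbols \(\alpha_{0}\), \(\alpha_{1}\) and \(\beta_{1}\) are assumed here for illustration), the add-and-subtract step gives

\[ \Delta y_{t}=\alpha_{0}+\beta_{0}\Delta x_{t}-(1-\alpha_{1})\left[y_{t-1}-\frac{\beta_{0}+\beta_{1}}{1-\alpha_{1}}x_{t-1}\right]+u_{t} \]

The sketch below checks numerically, on simulated data rather than AUTO2, that the two forms are the same regression written differently: the residuals coincide and the coefficients map into each other as the algebra predicts.

```r
set.seed(3)

# Simulated data from a model with one lag of y and one lag of x
n <- 150
x <- as.numeric(arima.sim(list(ar = 0.7), n = n))
y <- numeric(n)
for (t in 2:n) {
  y[t] <- 0.2 + 0.6 * y[t - 1] + 0.5 * x[t] - 0.3 * x[t - 1] + rnorm(1, sd = 0.1)
}
d <- data.frame(y = y[-1], ylag = y[-n], x = x[-1], xlag = x[-n])
d$dy <- d$y - d$ylag   # change in y
d$dx <- d$x - d$xlag   # change in x

adl <- lm(y ~ ylag + x + xlag, data = d)     # model in levels
ecm <- lm(dy ~ dx + ylag + xlag, data = d)   # same model, reparameterized

a <- coef(adl)
e <- coef(ecm)

# Long-run effect recovered from either parameterization
theta_adl <- unname((a["x"] + a["xlag"]) / (1 - a["ylag"]))
theta_ecm <- unname(-e["xlag"] / e["ylag"])
print(c(theta_adl = theta_adl, theta_ecm = theta_ecm))
```

In the reparameterized regression the coefficient on dx is the impact effect \(\beta_{0}\), while the coefficient on the lagged level of y equals \(\alpha_{1}-1\), the (negative) speed of adjustment toward equilibrium.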
[1] Please send comments to hrtdmrt2@illinois.edu or srmntbr2@illinois.edu.