Applied econometrics: Econ 508

## TA: Nicolas Bottan

Welcome to e-Tutorial, your on-line help to Econ508. This issue provides an introduction to dynamic models in Econometrics, and draws on Prof. Koenker’s Lecture Note 3. The adopted philosophy is “learn by doing”: the material is intended to help you solve Problem Set 2 and to enhance your understanding of the topics.

# Data Set

The data set used in this tutorial was borrowed from Johnston and DiNardo’s Econometric Methods (1997, 4th ed), but slightly adjusted for your needs. It is called AUTO2. You can download the data by visiting the Econ 508 web site (Data). As you will see, this adapted data set contains five series.

    auto<-read.table("http://www.econ.uiuc.edu/~econ508/data/AUTO2.txt",header=TRUE)
    head(auto)
      quarter    gas price income miles
    1    1959 -8.015 4.676 -4.505 2.648
    2    1959 -8.011 4.691 -4.493 2.648
    3    1959 -8.020 4.689 -4.499 2.648
    4    1959 -8.013 4.722 -4.492 2.648
    5    1960 -8.017 4.707 -4.490 2.647
    6    1960 -7.976 4.699 -4.489 2.647

# Summarizing the data

A useful recommendation for practitioners of Econometrics is to summarize the data set you are going to work with. This provides a “big picture” of the variables and what you can expect from them. You can do that as follows:

    summary(auto)
        quarter          gas            price          income
     Min.   :1959   Min.   :-8.02   Min.   :4.49   Min.   :-4.50
     1st Qu.:1967   1st Qu.:-7.84   1st Qu.:4.62   1st Qu.:-4.28
     Median :1975   Median :-7.72   Median :4.68   Median :-4.16
     Mean   :1975   Mean   :-7.76   Mean   :4.72   Mean   :-4.20
     3rd Qu.:1983   3rd Qu.:-7.67   3rd Qu.:4.77   3rd Qu.:-4.11
     Max.   :1990   Max.   :-7.61   Max.   :5.18   Max.   :-3.97
         miles
     Min.   :2.58
     1st Qu.:2.61
     Median :2.65
     Mean   :2.71
     3rd Qu.:2.81
     Max.   :3.04

Since you will most likely write your reports in LaTeX, it is a good idea to have R produce its output in LaTeX format. Several packages do this, among them xtable, texreg, and outreg. At the moment, the “new hit” is stargazer.

    install.packages("stargazer") #Use this to install it, you only need to do it once
    library(stargazer)
    stargazer(auto, out="descriptive.tex")

This saves a descriptive.tex file in your working directory. Among other options, you can produce plain-text or HTML output instead (see the type argument). A good idea would be to read the help file: ?stargazer. The LaTeX output is a table of summary statistics for the variables in the data set.

# Working with Time Series in R

In order to estimate a time series model in R we first need to convert the data into “time series” objects. To do so we need two packages:

    install.packages("zoo")
    install.packages("dyn") #you need to install the dyn package first. Remember to do it only once.
    library(dyn)
    Loading required package: zoo

    Attaching package: 'zoo'

    The following objects are masked from 'package:base':

        as.Date, as.Date.numeric

Notice that we installed the package zoo because dyn depends on it. The next step is to create the variables you will need to run dynamic models:

    gas<-ts(auto$gas,start=1959,frequency=4)
    price<-ts(auto$price,start=1959,frequency=4)
    income<-ts(auto$income,start=1959,frequency=4)
    miles<-ts(auto$miles,start=1959,frequency=4)
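
As a quick check on how the time index works, the sketch below builds a quarterly `ts` object from a short made-up vector (the values are placeholders, not the full AUTO2 series) and inspects it with the base R functions `start()`, `frequency()`, `time()` and `window()`. The object names are hypothetical.

```r
# Hypothetical stand-in for one column of the data: 8 quarterly values.
x_raw <- c(-8.015, -8.011, -8.020, -8.013, -8.017, -7.976, -7.990, -7.985)

# start = c(1959, 1) means "first quarter of 1959"; start = 1959 is equivalent.
x <- ts(x_raw, start = c(1959, 1), frequency = 4)

ts_start <- start(x)      # year and quarter of the first observation
ts_freq  <- frequency(x)  # observations per year
ts_times <- time(x)       # 1959.00, 1959.25, 1959.50, ...

# Extract the four quarters of 1960 only.
x_1960 <- window(x, start = c(1960, 1), end = c(1960, 4))
print(x_1960)
```

Printing a quarterly `ts` shows the observations laid out by year and quarter, which is an easy way to confirm that `start` and `frequency` were set as intended.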

# Graphing Time Series

Next we will try to replicate some of Johnston and DiNardo’s (1997, p. 267) graphical results:

    plot(gas)

    plot(price)

The ACF plots are obtained easily with the acf() function.

    acf(gas)

    acf(price)
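
To see what `acf()` reports, the following sketch computes the first few sample autocorrelations by hand and compares them with the built-in function. It uses a simulated AR(1) series as a stand-in for a persistent quarterly series, so the numbers are illustrative only; `manual_acf` is a hypothetical helper, not part of any package.

```r
set.seed(508)  # arbitrary seed, for reproducibility only

# Simulated AR(1) series (an assumption; not the AUTO2 data).
n <- 128
y <- ts(as.numeric(arima.sim(model = list(ar = 0.9), n = n)),
        start = c(1959, 1), frequency = 4)

# Sample autocorrelation at lag k: cross-products of deviations from the
# overall mean, divided by the total sum of squared deviations.
manual_acf <- function(x, k) {
  x <- as.numeric(x)
  d <- x - mean(x)
  m <- length(d)
  sum(d[(k + 1):m] * d[1:(m - k)]) / sum(d^2)
}

r_manual  <- sapply(1:4, function(k) manual_acf(y, k))
r_builtin <- acf(y, lag.max = 4, plot = FALSE)$acf[2:5]  # element 1 is lag 0

print(cbind(lag = 1:4, manual = r_manual, builtin = r_builtin))
```

Note that for a quarterly `ts` the horizontal axis of the ACF plot is measured in years, so a lag of 1.0 on the plot corresponds to four quarters.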

# Running Dynamic Models

Next let’s run a typical dynamic model.

    f<-dyn$lm(gas~lag(gas,-1)+price)
    summary(f)

    Call:
    lm(formula = dyn(gas ~ lag(gas, -1) + price))

    Residuals:
         Min       1Q   Median       3Q      Max
    -0.10703 -0.00712  0.00033  0.01057  0.03596

    Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
    (Intercept)  -0.12996    0.11503   -1.13    0.261
    lag(gas, -1)  0.97100    0.01354   71.70   <2e-16 ***
    price        -0.01965    0.00981   -2.00    0.047 *
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.0183 on 124 degrees of freedom
      (2 observations deleted due to missingness)
    Multiple R-squared:  0.976, Adjusted R-squared:  0.976
    F-statistic: 2.57e+03 on 2 and 124 DF,  p-value: <2e-16

In order to do differences, just type

    f2<-dyn$lm(gas~lag(gas,-1)+price+lag(diff(gas),-1))
    summary(f2)

    Call:
    lm(formula = dyn(gas ~ lag(gas, -1) + price + lag(diff(gas),
        -1)))

    Residuals:
         Min       1Q   Median       3Q      Max
    -0.10810 -0.00752  0.00061  0.01132  0.03990

    Coefficients:
                       Estimate Std. Error t value Pr(>|t|)
    (Intercept)         -0.1259     0.1169   -1.08    0.283
    lag(gas, -1)         0.9695     0.0138   70.29   <2e-16 ***
    price               -0.0228     0.0101   -2.26    0.025 *
    lag(diff(gas), -1)  -0.1217     0.0893   -1.36    0.175
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.0183 on 122 degrees of freedom
      (3 observations deleted due to missingness)
    Multiple R-squared:  0.976, Adjusted R-squared:  0.975
    F-statistic: 1.65e+03 on 3 and 122 DF,  p-value: <2e-16
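
The negative sign in terms such as `lag(gas,-1)` can be confusing. For `ts` objects, `lag(x, -1)` keeps the values and moves the time stamps one period later, so at date t the shifted series holds the value from t-1. The sketch below illustrates this with simulated data (not AUTO2) and base R only: it aligns the series with `ts.intersect()` and runs the regression with plain `lm()`. All object names are hypothetical.

```r
set.seed(1)  # arbitrary seed

# Simulated stand-ins for two quarterly series.
n <- 40
x <- ts(cumsum(rnorm(n)), start = c(1959, 1), frequency = 4)
z <- ts(rnorm(n),         start = c(1959, 1), frequency = 4)

# ts.intersect() keeps only the dates on which every series is observed.
# The lagged difference needs two earlier observations, so the aligned
# sample starts in 1959 Q3.
d <- ts.intersect(y = x, y_lag1 = lag(x, -1),
                  dy_lag1 = lag(diff(x), -1), z = z)
print(d[1:4, ])

# The same kind of regression as above, as a plain lm() on the aligned data.
fit <- lm(y ~ y_lag1 + z + dy_lag1, data = as.data.frame(d))
print(coef(fit))
```

In the printed rows, each value of `y_lag1` equals the value of `y` on the previous row, which confirms the direction of the shift.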

If you want to reproduce the dynamic model given by Johnston and DiNardo, 1997, p. 269, Table 8.5, just type

    f3<-dyn$lm(gas~price+lag(price,-1)+lag(price,-2)+lag(price,-3)+lag(price,-4)+lag(price,-5)+
        income+lag(income,-1)+lag(income,-2)+lag(income,-3)+lag(income,-4)+lag(income,-5)+
        lag(gas,-1)+lag(gas,-2)+lag(gas,-3)+lag(gas,-4)+lag(gas,-5))
    summary(f3)

    Call:
    lm(formula = dyn(gas ~ price + lag(price, -1) + lag(price, -2) +
        lag(price, -3) + lag(price, -4) + lag(price, -5) + income +
        lag(income, -1) + lag(income, -2) + lag(income, -3) + lag(income,
        -4) + lag(income, -5) + lag(gas, -1) + lag(gas, -2) + lag(gas,
        -3) + lag(gas, -4) + lag(gas, -5)))

    Residuals:
         Min       1Q   Median       3Q      Max
    -0.07554 -0.00750 -0.00037  0.00707  0.04143

    Coefficients:
                    Estimate Std. Error t value Pr(>|t|)
    (Intercept)      0.00554    0.12604    0.04  0.96500
    price           -0.26765    0.03787   -7.07  1.8e-10 ***
    lag(price, -1)   0.26287    0.06929    3.79  0.00025 ***
    lag(price, -2)  -0.01741    0.07541   -0.23  0.81789
    lag(price, -3)  -0.07209    0.07739   -0.93  0.35371
    lag(price, -4)   0.01440    0.07721    0.19  0.85244
    lag(price, -5)   0.05821    0.04634    1.26  0.21187
    income           0.29277    0.15882    1.84  0.06809 .
    lag(income, -1) -0.16220    0.22023   -0.74  0.46306
    lag(income, -2) -0.04925    0.21437   -0.23  0.81875
    lag(income, -3)  0.01041    0.21313    0.05  0.96115
    lag(income, -4)  0.08491    0.21013    0.40  0.68698
    lag(income, -5) -0.19896    0.15312   -1.30  0.19665
    lag(gas, -1)     0.66057    0.09606    6.88  4.6e-10 ***
    lag(gas, -2)     0.06701    0.11453    0.59  0.55975
    lag(gas, -3)    -0.02358    0.11709   -0.20  0.84082
    lag(gas, -4)     0.13219    0.11901    1.11  0.26922
    lag(gas, -5)     0.16313    0.10138    1.61  0.11062
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.0151 on 105 degrees of freedom
      (10 observations deleted due to missingness)
    Multiple R-squared:  0.984, Adjusted R-squared:  0.982
    F-statistic:  384 on 17 and 105 DF,  p-value: <2e-16

### The dynlm package

As with everything in R, there are multiple ways to do it. We could do the same analysis using the dynlm package, which is less verbose.

    install.packages("dynlm")
    library(dynlm)

    f_a<-dynlm(gas~L(gas,1)+price)
    summary(f_a)

    Time series regression with "ts" data:
    Start = 1959(2), End = 1990(4)

    Call:
    dynlm(formula = gas ~ L(gas, 1) + price)

    Residuals:
         Min       1Q   Median       3Q      Max
    -0.10703 -0.00712  0.00033  0.01057  0.03596

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) -0.12996    0.11503   -1.13    0.261
    L(gas, 1)    0.97100    0.01354   71.70   <2e-16 ***
    price       -0.01965    0.00981   -2.00    0.047 *
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.0183 on 124 degrees of freedom
    Multiple R-squared:  0.976, Adjusted R-squared:  0.976
    F-statistic: 2.57e+03 on 2 and 124 DF,  p-value: <2e-16

which is the same as before. To see that this package is less verbose, let’s run model f3 again:

    f3_a<-dynlm(gas~L(price,0:5)+L(income,0:5)+L(gas,1:5))
    summary(f3_a)

    Time series regression with "ts" data:
    Start = 1960(2), End = 1990(4)

    Call:
    dynlm(formula = gas ~ L(price, 0:5) + L(income, 0:5) + L(gas,
        1:5))

    Residuals:
         Min       1Q   Median       3Q      Max
    -0.07554 -0.00750 -0.00037  0.00707  0.04143

    Coefficients:
                    Estimate Std. Error t value Pr(>|t|)
    (Intercept)      0.00554    0.12604    0.04  0.96500
    L(price, 0:5)0  -0.26765    0.03787   -7.07  1.8e-10 ***
    L(price, 0:5)1   0.26287    0.06929    3.79  0.00025 ***
    L(price, 0:5)2  -0.01741    0.07541   -0.23  0.81789
    L(price, 0:5)3  -0.07209    0.07739   -0.93  0.35371
    L(price, 0:5)4   0.01440    0.07721    0.19  0.85244
    L(price, 0:5)5   0.05821    0.04634    1.26  0.21187
    L(income, 0:5)0  0.29277    0.15882    1.84  0.06809 .
    L(income, 0:5)1 -0.16220    0.22023   -0.74  0.46306
    L(income, 0:5)2 -0.04925    0.21437   -0.23  0.81875
    L(income, 0:5)3  0.01041    0.21313    0.05  0.96115
    L(income, 0:5)4  0.08491    0.21013    0.40  0.68698
    L(income, 0:5)5 -0.19896    0.15312   -1.30  0.19665
    L(gas, 1:5)1     0.66057    0.09606    6.88  4.6e-10 ***
    L(gas, 1:5)2     0.06701    0.11453    0.59  0.55975
    L(gas, 1:5)3    -0.02358    0.11709   -0.20  0.84082
    L(gas, 1:5)4     0.13219    0.11901    1.11  0.26922
    L(gas, 1:5)5     0.16313    0.10138    1.61  0.11062
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.0151 on 105 degrees of freedom
    Multiple R-squared:  0.984, Adjusted R-squared:  0.982
    F-statistic:  384 on 17 and 105 DF,  p-value: <2e-16

Using the stargazer package, these results can also be exported as a formatted regression table.

# Long-run Elasticities

Suppose you wish to compute the long-run income and price elasticities. Assuming that in the long run the variables converge to their respective steady-state values (represented by “e”):

$gas_{t}=gas_{t-1}=gas_{t-2}=\dots=gas_{e}$
$price_{t}=price_{t-1}=price_{t-2}=\dots=price_{e}$
$income_{t}=income_{t-1}=income_{t-2}=\dots=income_{e},$

and recalling that all variables are already in logs, you then just have to apply this steady state condition, reparameterize the model, and calculate the elasticities:

$\frac{d\,gas_{e}}{d\,income_{e}} = \text{long-run income elasticity}$
$\frac{d\,gas_{e}}{d\,price_{e}} = \text{long-run price elasticity}$
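
As a sketch of that reparameterization, write the dynamic model with five lags in generic notation. The symbols $\alpha$, $\beta_{i}$, $\gamma_{i}$, $\phi_{i}$ and $u_{t}$ are labels assumed here for the intercept, the price, income and lagged-gas coefficients, and the error term; they are not taken from the lecture note. Setting every lag equal to its steady-state value and the error to zero gives:

```latex
\begin{aligned}
gas_{t} &= \alpha + \sum_{i=0}^{5}\beta_{i}\,price_{t-i}
          + \sum_{i=0}^{5}\gamma_{i}\,income_{t-i}
          + \sum_{i=1}^{5}\phi_{i}\,gas_{t-i} + u_{t} \\
gas_{e} &= \frac{\alpha + \left(\sum_{i=0}^{5}\beta_{i}\right) price_{e}
          + \left(\sum_{i=0}^{5}\gamma_{i}\right) income_{e}}
          {1-\sum_{i=1}^{5}\phi_{i}} \\
\frac{d\,gas_{e}}{d\,price_{e}} &= \frac{\sum_{i=0}^{5}\beta_{i}}{1-\sum_{i=1}^{5}\phi_{i}},
\qquad
\frac{d\,gas_{e}}{d\,income_{e}} = \frac{\sum_{i=0}^{5}\gamma_{i}}{1-\sum_{i=1}^{5}\phi_{i}}
\end{aligned}
```

The expressions are only well defined when $\sum_{i}\phi_{i} \neq 1$, and they become very sensitive to estimation error as that sum approaches one.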

After that you can compare the long-run elasticities of the (reparameterized) dynamic model with the elasticities provided by the static version of this log-linear regression:

    summary(lm(gas~price+income))

    Call:
    lm(formula = gas ~ price + income)

    Residuals:
        Min      1Q  Median      3Q     Max
    -0.1180 -0.0429 -0.0119  0.0592  0.1065

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)  -4.0933     0.2248  -18.21  < 2e-16 ***
    price        -0.1503     0.0312   -4.81  4.2e-06 ***
    income        0.7056     0.0337   20.92  < 2e-16 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.0571 on 125 degrees of freedom
    Multiple R-squared:  0.778, Adjusted R-squared:  0.774
    F-statistic:  219 on 2 and 125 DF,  p-value: <2e-16

I suggest you provide a small table comparing those results, and comment on how the elasticities differ between the static and the dynamic model.

In light of the problem set, I suggest you compute not only the point estimates of the elasticities, but also their confidence intervals. In the static model, confidence intervals are obtained directly from the regression output. In the dynamic model, however, the elasticities are non-linear functions of the parameters. In that case, you need to find confidence intervals for the elasticities using the delta method or the bootstrap, which you will see in Prof. Koenker’s Lecture Note 5 and which we will address in a future e-TA.
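
As a preview of the delta-method idea, here is a minimal sketch on simulated data. It assumes a simple model with one lag of the response and one lag of a single regressor, which is much smaller than the models above. The "true" coefficient values, the variable names, and the normal-approximation interval are all assumptions made for illustration, and the treatment in Lecture Note 5 may differ.

```r
set.seed(42)  # arbitrary seed

# Simulate y_t = a + phi*y_{t-1} + b0*x_t + b1*x_{t-1} + u_t (assumed values).
n <- 400
a <- 0.5; phi <- 0.6; b0 <- -0.3; b1 <- 0.1
x <- as.numeric(arima.sim(model = list(ar = 0.5), n = n))
y <- numeric(n)
y[1] <- a / (1 - phi)
for (i in 2:n) {
  y[i] <- a + phi * y[i - 1] + b0 * x[i] + b1 * x[i - 1] + rnorm(1, sd = 0.1)
}

# Build the lagged regressors by hand and estimate by OLS.
dat <- data.frame(y = y[-1], y_l1 = y[-n], x0 = x[-1], x_l1 = x[-n])
fit <- lm(y ~ y_l1 + x0 + x_l1, data = dat)

# Long-run effect: theta = (b0 + b1) / (1 - phi).
keep <- c("x0", "x_l1", "y_l1")
b <- unname(coef(fit)[keep])
V <- vcov(fit)[keep, keep]
denom <- 1 - b[3]
theta_hat <- (b[1] + b[2]) / denom

# Delta method: gradient of theta with respect to (b0, b1, phi).
grad <- c(1 / denom, 1 / denom, (b[1] + b[2]) / denom^2)
se_theta <- sqrt(drop(grad %*% V %*% grad))

# Approximate 95% interval based on the normal distribution.
ci_95 <- theta_hat + c(-1, 1) * qnorm(0.975) * se_theta
print(c(estimate = theta_hat, se = se_theta, lower = ci_95[1], upper = ci_95[2]))
```

With the assumed values, the long-run effect being estimated is (-0.3 + 0.1) / (1 - 0.6) = -0.5, so the printed estimate can be compared against that number.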

# Impulse Response Functions:

Here I recommend using the best dynamic model (following the Schwarz Information Criterion that you will see in e-Tutorial 4), and computing impulse response functions using the formula in Prof. Koenker’s Lecture Note 3.

P.S.: Some authors propose alternative ways to calculate impulse response functions. One of them is to use partial derivatives (e.g., Enders, 1995, p. 24), but this method has a drawback: it is quite easy to make a mistake when you have models with many lags and differences.

You are expected to report a reasonable number of response periods, depending on the structure of your data set. We usually suggest a minimum of 40 (forty) response periods for quarterly data. Then you plot those responses along the respective time scale (t=0,1,2,3,…,40). This will generate the non-cumulative impulse response function. If you want a cumulative impulse response function, then at each new period t+i (i=1,2,3,…) add that period’s response to the sum of the responses from the preceding periods.
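
One way to organize the computation is a simple recursion, sketched below for a one-time unit impulse in a single regressor. The coefficients are hypothetical placeholders (not estimates from AUTO2), and the lecture note's formula is not reproduced here, so treat this as an illustration of the mechanics rather than the prescribed method.

```r
# Hypothetical coefficients for
# y_t = phi_1*y_{t-1} + phi_2*y_{t-2} + beta_0*x_t + beta_1*x_{t-1} + u_t
phi  <- c(0.6, 0.2)    # coefficients on lagged y
beta <- c(-0.3, 0.1)   # coefficients on x_t and x_{t-1}
horizon <- 40          # response periods t = 0, 1, ..., 40

# Response of y to a unit impulse in x at t = 0:
# h_j = beta_j (zero beyond the last lag of x) + sum_i phi_i * h_{j-i}
irf <- numeric(horizon + 1)
for (j in 0:horizon) {
  direct <- if (j < length(beta)) beta[j + 1] else 0
  feedback <- 0
  for (i in seq_along(phi)) {
    if (j - i >= 0) feedback <- feedback + phi[i] * irf[j - i + 1]
  }
  irf[j + 1] <- direct + feedback
}

cum_irf  <- cumsum(irf)                     # cumulative impulse response
long_run <- sum(beta) / (1 - sum(phi))      # limit of the cumulative response

print(round(cbind(period = 0:5, irf = irf[1:6], cumulative = cum_irf[1:6]), 4))
```

A call such as `plot(0:horizon, irf, type = "h")` then draws the non-cumulative responses, and replacing `irf` with `cum_irf` draws the cumulative ones. For a stable model the cumulative response settles near `long_run`, which ties the impulse responses back to the long-run effects discussed earlier.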

# Error Correction Model (ECM):

There is no mystery in doing that. You just have to add and subtract $y_{t-1}$ and $\beta_{0} x_{t-1}$ in the model, reparameterize it, and obtain a dynamic equation in which the difference of the response (dependent) variable is decomposed into two parts: a) direct effects from changes in the explanatory variables, and b) indirect effects from changes in the response variable during previous periods, while it was out of equilibrium. (See, for example, Prof. Koenker’s Lecture Note 3.)
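
For concreteness, here is the algebra for the simplest case, a model with one lag of $y$ and one lag of $x$. The symbols $\alpha$, $\phi$, $\beta_{1}$, $\theta$ and $u_{t}$ are notation assumed for this sketch; only $y_{t-1}$ and $\beta_{0} x_{t-1}$ come from the text above.

```latex
\begin{aligned}
y_{t} &= \alpha + \phi\, y_{t-1} + \beta_{0} x_{t} + \beta_{1} x_{t-1} + u_{t} \\
y_{t} - y_{t-1} &= \alpha + \beta_{0}\,(x_{t} - x_{t-1}) - (1-\phi)\, y_{t-1}
                   + (\beta_{0}+\beta_{1})\, x_{t-1} + u_{t} \\
\Delta y_{t} &= \alpha + \beta_{0}\,\Delta x_{t}
                - (1-\phi)\left[\, y_{t-1} - \theta\, x_{t-1} \right] + u_{t},
\qquad \theta = \frac{\beta_{0}+\beta_{1}}{1-\phi}
\end{aligned}
```

The term $\beta_{0}\,\Delta x_{t}$ is the direct effect, and the bracketed term is last period's deviation from the long-run relation, which is corrected at rate $1-\phi$. The coefficient $\theta$ is the same long-run effect obtained from the steady-state condition.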