## TA: Nicolas Bottan

Welcome to e-Tutorial, your on-line help to Econ508. This issue provides an introduction to the practical work on the Delta-method and the bootstrap in Stata. We hope it will help your further understanding of Prof. Koenker’s Lecture 5.1

# Data Set

The data set used in this tutorial was borrowed from Johnston and DiNardo’s Econometric Methods (1997, 4th ed), but slightly adjusted for your needs. It is called AUTO2. You can download the data by visiting the Econ 508 web site (Data). As you will see, this adapted data set contains five series.

    use AUTO2, clear
    describe

      obs:           128                          AUTO2 adapted from Johnston and DiNard
     vars:             5                          11 Sep 2002 12:22
     size:         2,560
    ------------------------------------------------------------------------------------
                  storage  display     value
    variable name   type   format      label      variable label
    ------------------------------------------------------------------------------------
    quarter         float  %9.0g                  Quarter of the observation
    gas             float  %9.0g                  Log of per capita real expenditure on
    price           float  %9.0g                  Log of the real price of gasoline and
    income          float  %9.0g                  Log of per capita real disposable inco
    miles           float  %9.0g                  Log of miles per gallon
    ------------------------------------------------------------------------------------
    Sorted by:

As before, we first need to declare the data as a time series:

    gen t = _n
    label variable t "Integer time period"
    tsset t

            time variable:  t, 1 to 128
                    delta:  1 unit

# Running a Dynamic Model with Quadratic and Multiplicative Terms

In problem set 2, question 4, you are asked to run a linear regression model with non-linear transformations of the variables. Suppose that you have a data set like the one used here (auto2.dta) and would like to estimate an equation similar to the one in the problem set:

$gas_{t} = b_{0} + b_{1} income_{t} + b_{2} price_{t} + b_{3} price_{t} ^2 + b_{4} (price_{t}*income_{t}) + u_{t}$

First, generate the quadratic and interaction terms as follows:

    gen price2= price^2
    gen priceinc= price*income

Then run the regression:

    regress gas income price price2 priceinc

          Source |       SS       df       MS              Number of obs =     128
    -------------+------------------------------           F(  4,   123) =  117.59
           Model |  1.45421455     4  .363553638           Prob > F      =  0.0000
        Residual |   .38027632   123  .003091677           R-squared     =  0.7927
    -------------+------------------------------           Adj R-squared =  0.7860
           Total |  1.83449087   127   .01444481           Root MSE      =   .0556

    ------------------------------------------------------------------------------
             gas |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          income |  -6.109449   2.750048    -2.22   0.028      -11.553   -.6658978
           price |   3.090076   2.894747     1.07   0.288    -2.639898     8.82005
          price2 |   .2839261   .1693137     1.68   0.096      -.05122    .6190723
        priceinc |   1.454257   .5879186     2.47   0.015      .290508    2.618006
           _cons |  -25.52231    11.8968    -2.15   0.034     -49.0713   -1.973312
    ------------------------------------------------------------------------------

Now, suppose you are asked to calculate the price elasticity of demand at different points of the sample. To do that you will need to:

1. Obtain the regression coefficients:

        matrix B=get(_b)
        matrix list B

        B[1,5]
                income       price      price2    priceinc       _cons
        y1  -6.1094494   3.0900762   .28392614   1.4542568  -25.522308

2. Extract the coefficients to scalars:

        scalar b1=_coef[income]
        scalar b2=_coef[price]
        scalar b3=_coef[price2]
        scalar b4=_coef[priceinc]
        scalar b0=_coef[_cons]
        scalar list b1 b2 b3 b4 b0

    Alternatively, you can read them from the matrix, e.g. scalar b1=B[1,1], or use scalar b1=_b[income].

3. Calculate the elasticity at different points of the sample:

        gen elastpt=b2+2*b3*price+b4*income
        summarize elastpt, detail

                                   elastpt
        -------------------------------------------------------------
              Percentiles      Smallest
         1%    -.7897052      -.8065645
         5%    -.7665387      -.7897052
        10%    -.7234885      -.7795596       Obs                 128
        25%    -.4959842      -.7698234       Sum of Wgt.         128

        50%    -.2754352                      Mean          -.3273864
                                Largest       Std. Dev.      .2598182
        75%    -.0774807       .0347935
        90%    -.0073602       .0391868       Variance       .0675055
        95%     .0219321       .0460309       Skewness      -.2600589
        99%     .0460309       .0555678       Kurtosis       1.835793

4. You have now created your elasticity series. You can study it at different points of the sample and plot it against time (to check for inter-temporal structural breaks), against gas expenditure, and against price. A plot against income can be produced in the same way.

        scatter elastpt quarter, xlabel(1960.1(5)1990.1) t1title(Price Elasticity vs. Time)

        scatter elastpt gas, t1title(Price Elasticity vs. Gas Expenditure)

        scatter elastpt price, t1title(Price Elasticity vs. Price)
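The expression used to generate elastpt comes from differentiating the estimated equation with respect to price. As a short sketch, assuming (as the variable labels indicate) that gas, price, and income are all measured in logs, the derivative can be read as an elasticity:

```latex
\[
\frac{\partial\, gas_{t}}{\partial\, price_{t}}
  = b_{2} + 2\, b_{3}\, price_{t} + b_{4}\, income_{t}
\]
```

The elasticity therefore changes from observation to observation through both price and income, which is why it is worth summarizing and plotting.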

## Tip for problem set 2, question 5.

For this question you should:

1. Interpret the implications of the model.

2. Calculate the price such that $$[b_{2}+2*b_{3}*price+b_{4}*income] = -1$$. Given the formula for the elasticity, and assuming $$income=x_{0}$$, solve for the optimal price. Call it $$p^{*}$$.

3. Examine the partial residual plots.

4. Finally, in the last part of question 5 you are asked to estimate model 2 and to compute the revenue-maximizing price level assuming $$income=\$30{,}000$$ per year. This is a straightforward computation. The interesting part comes with the application of the delta method and the bootstrap to obtain reasonable confidence intervals for the optimal price.
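For item 2, the algebra can be sketched as follows, with $$x_{0}$$ denoting the assumed (log) income level. Setting the elasticity equal to $$-1$$ and solving for the price gives:

```latex
\[
b_{2} + 2\, b_{3}\, p^{*} + b_{4}\, x_{0} = -1
\quad\Longrightarrow\quad
p^{*} = -\,\frac{1 + b_{2} + b_{4}\, x_{0}}{2\, b_{3}}
\]
```

The solution requires $$b_{3} \neq 0$$. When the estimate of $$b_{3}$$ is close to zero, $$p^{*}$$ becomes very sensitive to sampling variation, which matters for the confidence intervals discussed next.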

# Delta-method and Bootstrap

Note that so far we have obtained only point estimates. To compute confidence intervals, you will need the Delta-method and/or the bootstrap. In the problem set you are asked to assume that $$income=\$30{,}000$$ per year. For this e-TA, we will instead assume $$income=\log(15) \approx 2.708050$$.

## Delta-method

For the problem set you are expected to sketch the Delta-method and calculate the derivatives by hand along with the computational routine below.
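As a reminder of what the routine computes, here is a brief sketch of the Delta-method for $$p^{*}$$. The notation is an assumption made for this sketch: $$\hat{b}$$ stacks the coefficients in the order Stata reports them (income, price, price2, priceinc, constant), $$\hat{V}$$ is their estimated covariance matrix, and $$x_{0}$$ is the assumed log income.

```latex
\[
p^{*} = g(\hat{b}) = -\,\frac{1 + \hat{b}_{2} + \hat{b}_{4}\, x_{0}}{2\, \hat{b}_{3}},
\qquad
\widehat{\mathrm{Var}}(p^{*}) \approx G'\, \hat{V}\, G,
\qquad
G = \frac{\partial g}{\partial b}\bigg|_{b = \hat{b}}
\]
\[
G' = \left(
  0,\;
  -\frac{1}{2\,\hat{b}_{3}},\;
  \frac{1 + \hat{b}_{2} + \hat{b}_{4}\, x_{0}}{2\, \hat{b}_{3}^{2}},\;
  -\frac{x_{0}}{2\,\hat{b}_{3}},\;
  0
\right)
\]
```

An approximate 95% confidence interval is then $$p^{*} \pm 1.96\,\sqrt{G'\hat{V}G}$$. The first and last entries of $$G$$ are zero because $$p^{*}$$ does not depend on the income coefficient or on the constant.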

1. Start by running the full model again:

        regress gas income price price2 priceinc

              Source |       SS       df       MS              Number of obs =     128
        -------------+------------------------------           F(  4,   123) =  117.59
               Model |  1.45421455     4  .363553638           Prob > F      =  0.0000
            Residual |   .38027632   123  .003091677           R-squared     =  0.7927
        -------------+------------------------------           Adj R-squared =  0.7860
               Total |  1.83449087   127   .01444481           Root MSE      =   .0556

        ------------------------------------------------------------------------------
                 gas |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
              income |  -6.109449   2.750048    -2.22   0.028      -11.553   -.6658978
               price |   3.090076   2.894747     1.07   0.288    -2.639898     8.82005
              price2 |   .2839261   .1693137     1.68   0.096      -.05122    .6190723
            priceinc |   1.454257   .5879186     2.47   0.015      .290508    2.618006
               _cons |  -25.52231    11.8968    -2.15   0.034     -49.0713   -1.973312
        ------------------------------------------------------------------------------

2. Recall that you have stored those coefficients under the names b0, b1, b2, b3, b4. Now you need to get their covariance matrix V:

        matrix V=get(VCE)
        matrix list V

        symmetric V[5,5]
                      income       price      price2    priceinc       _cons
          income   7.5627648
           price  -6.5629988   8.3795596
          price2  -.00833388  -.27004512   .02866714
        priceinc  -1.6166822    1.405926   .00149168   .34564825
           _cons    30.88608  -33.229023   .62976256  -6.6089597   141.53396

3. Next you need to compute pstar:

        scalar pstar = -(1+b2+b4*ln(15))/(2*b3)
        scalar list pstar

             pstar = -14.137967

4. Then you need to obtain the gradient vector with the partial derivatives of pstar (the optimal price) with respect to each coefficient, following the order of the covariance matrix (income, price, price2, priceinc, constant):

$G'= (\frac{\partial{p^{*}}}{\partial{income}}, \frac{\partial{p^{*}}}{\partial{price}}, \frac{\partial{p^{*}}}{\partial{price2}}, \frac{\partial{p^{*}}}{\partial{price\_income}}, \frac{\partial{p^{*}}}{\partial{cons}})$

In Stata the vector will be as follows (note that the backslashes separate each row of the vector):

    matrix G=( 0  \  -1/(2*b3)  \  (1+b2+b4*ln(15))/(2*b3*b3)  \  -ln(15)/(2*b3)  \  0)
    matrix list G

    G[5,1]
                c1
    r1           0
    r2  -1.7610213
    r3   49.794522
    r4  -4.7689342
    r5           0

5. The next step is to obtain the estimated variance of the optimal price, $$G'VG$$:

        matrix GVG=G'*V*G
        matrix list GVG

        symmetric GVG[1,1]
                   c1
        c1  175.19377

6. Then you can obtain the standard error by taking the square root of this scalar value:

        scalar sepstar=sqrt(el(GVG,1,1))
        scalar list sepstar

           sepstar =  13.236079

That’s it. The scalar sepstar is the standard error of pstar.

7. The last step is to construct your confidence interval, as follows:

        scalar CIupper=pstar + 1.96*sepstar
        scalar CIlower=pstar - 1.96*sepstar
        scalar list CIlower pstar CIupper

           CIlower =  -40.08068
             pstar = -14.137967
           CIupper =  11.804747

This gives you the confidence interval for the optimal price. Observe that the results you obtained are for prices in logs. You can try to get the corresponding results in levels as well, but it is worth thinking about whether you can do that directly or whether you need an additional step because the point estimate is a non-linear function of the coefficients.
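Stata's nlcom command applies the Delta-method to non-linear combinations of estimated coefficients, so it can serve as a cross-check on the hand computation above. A minimal sketch, assuming auto2.dta is in memory with price2 and priceinc already generated; the label pstar inside nlcom is only a display name chosen for this illustration:

```stata
* Re-estimate the model so that the coefficients and their covariance
* matrix are the active estimation results
quietly regress gas income price price2 priceinc

* Delta-method estimate and standard error of the optimal (log) price;
* ln(15) is the log income value assumed in this tutorial
nlcom (pstar: -(1 + _b[price] + _b[priceinc]*ln(15)) / (2*_b[price2]))
```

The reported estimate and standard error should agree with pstar and sepstar up to rounding.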

## Bootstrap

You are expected to explain the various bootstrap techniques in words (as if you were explaining them to a non-econometrician), and to show a complete understanding of the results provided by Stata.

In Stata, you can bootstrap the optimal price as follows. The output below reports the normal-based (parametric) confidence interval; percentile and bias-corrected intervals can be requested after estimation with estat bootstrap:

    bs pr=(-1*(1+_b[price]+_b[priceinc]*ln(15))/(2*_b[price2])), reps(1000): reg gas income price price2 priceinc

    Bootstrap replications (1000)
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
    ..................................................    50
    ..................................................   100
    ..................................................   150
    ..................................................   200
    ..................................................   250
    ..................................................   300
    ..................................................   350
    ..................................................   400
    ..................................................   450
    ..................................................   500
    ..................................................   550
    ..................................................   600
    ..................................................   650
    ..................................................   700
    ..................................................   750
    ..................................................   800
    ..................................................   850
    ..................................................   900
    ..................................................   950
    ..................................................  1000

    Linear regression                               Number of obs      =       128
                                                    Replications       =      1000

          command:  regress gas income price price2 priceinc
               pr:  -1*(1+_b[price]+_b[priceinc]*ln(15))/(2*_b[price2])

    ------------------------------------------------------------------------------
                 |   Observed   Bootstrap                         Normal-based
                 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
              pr |  -14.13797   270.9505    -0.05   0.958    -545.1913    516.9153
    ------------------------------------------------------------------------------

In the command above, the expression in parentheses before the comma is the statistic you are interested in (named pr), and the command after the colon is the estimator. reps(1000) means that Stata will repeat the resampling and estimation 1000 times to build the bootstrap distribution. To ensure that your results are reproducible, run set seed 1 before the bootstrap command. Instead of starting from a different initial random number every time you run the bootstrap, Stata will then use the same seed (1 in this case) for the resampling process.
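A minimal sketch of a reproducible version of the bootstrap, assuming the same data and generated variables are in memory. The seed value and the call to estat bootstrap are choices made for this illustration and were not part of the run shown above:

```stata
* Fix the random-number seed so the replications can be reproduced
set seed 1

* Bootstrap the optimal (log) price; ln(15) is the assumed log income.
* The /// continuation requires running this from a do-file.
bootstrap pr = (-(1 + _b[price] + _b[priceinc]*ln(15)) / (2*_b[price2])), ///
    reps(1000): regress gas income price price2 priceinc

* Report normal-based, percentile, and bias-corrected intervals
estat bootstrap, all
```

Because the replications depend on the seed, the numbers will generally not match the output shown above exactly.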

Recall the generated statistic is in log form. A good question is whether you should present your Delta-method and Bootstrap confidence interval in logs or in levels... Think about it.

# Appendix A: Partial Residual Plot Revisited

It is important to mention that all results presented here are based on a different data set (auto2.dta) from the one used in problem set 2 (gasnew.dat).

1. Use the partial residual plot to check on the effect of the quadratic term. To obtain the partial residual plot with respect to the quadratic term you should:

1.1) Estimate model (2) without the regressor price2, and call this model (2.1)

    quietly: regress gas income price priceinc

1.2) Obtain the residuals of the model (2.1):

    predict gasres, resid 

1.3) Estimate model (2) using price2 instead of gas as the dependent variable; call it model (2.2):

    quietly: regress price2 income price priceinc 

1.4) Obtain the residuals of the model (2.2):

    predict pric2res, resid 

1.5) Run the Gauss-Frisch-Waugh “regression”, and check if the slope coefficient is the same as in the original model (2):

    regress gasres pric2res

          Source |       SS       df       MS              Number of obs =     128
    -------------+------------------------------           F(  1,   126) =    2.88
           Model |  .008694019     1  .008694019           Prob > F      =  0.0921
        Residual |  .380276321   126  .003018066           R-squared     =  0.0224
    -------------+------------------------------           Adj R-squared =  0.0146
           Total |   .38897034   127  .003062759           Root MSE      =  .05494

    ------------------------------------------------------------------------------
          gasres |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
        pric2res |   .2839261   .1672859     1.70   0.092    -.0471278    .6149801
           _cons |   2.57e-10   .0048558     0.00   1.000    -.0096095    .0096095
    ------------------------------------------------------------------------------


1.6) Plot the partial residuals, together with a fitted line of predicted values:

    twoway (scatter gasres pric2res) (lfit gasres pric2res), title(Partial Residuals)

2. Check whether there is a linear relationship between the residuals of model (2.1) and the residuals of model (2.2). Draw your conclusion from what you see in the graph, and try to justify your answer in the light of the basic assumptions of linear regression.
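Stata also offers a built-in added-variable plot, avplot, which plots the same two sets of residuals against each other after a regression. A minimal sketch, offered as a cross-check on steps 1.1 to 1.6 rather than as part of the original routine, and assuming price2 and priceinc are already in memory:

```stata
* Fit the full model (2), then draw the added-variable plot for price2:
* residuals of gas on the other regressors against residuals of price2
* on the other regressors, with the fitted line
quietly regress gas income price price2 priceinc
avplot price2
```

The slope of the fitted line equals the coefficient on price2 in the full model. The standard error from the two-step regression (.1672859) differs slightly from the one in the full model (.1693137) because the two-step regression uses 126 residual degrees of freedom rather than 123.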

Please send comments to bottan2@illinois.edu or srmntbr2@illinois.edu.
