Welcome to a new issue of e-Tutorial. This e-TA will focus on Censored Regression Models, with special emphasis on helping answer question 4 of PS5. ¹

Data

You can download the data set, weco14.csv, from the Econ 508 web site. Save it in your preferred directory.

See the first section of e-TA 13 on Cubic B-Splines and Quantile Regression for a description of how to prepare the data and save it in Stata format.
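For convenience, that preparation step might look roughly like the sketch below. It assumes Stata 13 or later and that weco14.csv sits in the current working directory; it is not necessarily the exact code used in e-TA 13. In older versions of Stata, insheet using weco14.csv, clear plays the same role as import delimited.

   * read the comma-separated file (assumes Stata 13 or later)
   import delimited using weco14.csv, clear
   * save it in Stata format so that it can be loaded with use
   save weco14.dta, replace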

   use weco14.dta, clear

Heckman two-step procedure

We want to estimate the productivity equation using only non-quitters. To do so, we use the Heckman two-step procedure described in Lecture 21. First, we need to create a dummy variable that identifies non-quitters (along with the square of lex) and then run a probit regression:

   gen lex2 = lex^2
   gen nonkwit = (kwit == 0)
   list in 1/5

     +---------------------------------------------------------------------------------+
     |     y   sex   dex   lex   kwit   job_te~e   status   treatm~t   ypost   nonkwit |
     |---------------------------------------------------------------------------------|
  1. | 13.73     0    38    10      0        277        1          1   14.35         1 |
  2. | 17.15     1    55    11      1        173        1          .       .         0 |
  3. | 13.63     1    45    12      0        410        1          1   15.75         1 |
  4. | 13.04     1    41    11      0        247        1          0   18.33         1 |
  5. |  13.2     1    42    10      0        340        1          0   13.96         1 |
     +---------------------------------------------------------------------------------+

Once we have all the variables, we follow the “recipe” in Lecture 21:

Estimate the binary choice model by probit:

   probit nonkwit sex dex lex lex2


Iteration 0:   log likelihood = -372.98741  
Iteration 1:   log likelihood = -339.87696  
Iteration 2:   log likelihood = -339.69113  
Iteration 3:   log likelihood = -339.69113  

Probit regression                                 Number of obs   =        683
                                                  LR chi2(4)      =      66.59
                                                  Prob > chi2     =     0.0000
Log likelihood = -339.69113                       Pseudo R2       =     0.0893

------------------------------------------------------------------------------
     nonkwit |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         sex |  -.2715531    .113097    -2.40   0.016    -.4932191   -.0498871
         dex |   .0580076   .0082217     7.06   0.000     .0418934    .0741219
         lex |   1.155319   .3942607     2.93   0.003     .3825825    1.928056
        lex2 |  -.0470172   .0157425    -2.99   0.003     -.077872   -.0161624
       _cons |  -8.698287   2.499655    -3.48   0.001    -13.59752   -3.799054
------------------------------------------------------------------------------

Construct \(\hat{\lambda}_i = \frac{\phi(x'_i\hat{\gamma})}{\Phi(x'_i\hat{\gamma})}\), where \(\hat{\gamma}\) is the vector of probit coefficients:

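   * linear index x'gamma from the probit fit
   predict xb, xb
   gen smallphi = normalden(xb)
   * normal() is the standard normal CDF (normprob() in older versions of Stata)
   gen largephi = normal(xb)
   gen lambda = smallphi/largephi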
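To see why this ratio is the right correction term, here is a sketch in standard sample-selection notation. The symbols \(d_i\), \(u_i\), \(\varepsilon_i\) and \(\sigma_{\varepsilon u}\) are assumptions made here and may differ from those in Lecture 21. Suppose \(d_i = 1\{x'_i\gamma + u_i > 0\}\) indicates a non-quitter, \(y_i = x'_i\beta + \varepsilon_i\) is used only when \(d_i = 1\), and \((u_i, \varepsilon_i)\) are mean-zero, jointly normal and independent of \(x_i\), with \(\mathrm{Var}(u_i) = 1\) and \(\mathrm{Cov}(u_i, \varepsilon_i) = \sigma_{\varepsilon u}\). Then

\[ E[y_i \mid x_i, d_i = 1] = x'_i\beta + \sigma_{\varepsilon u}\, E[u_i \mid u_i > -x'_i\gamma] = x'_i\beta + \sigma_{\varepsilon u}\, \frac{\phi(x'_i\gamma)}{\Phi(x'_i\gamma)} \]

so OLS on the selected sample that leaves out the ratio is biased whenever \(\sigma_{\varepsilon u} \neq 0\), and the coefficient on the ratio is an estimate of \(\sigma_{\varepsilon u}\).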

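Re-estimate the original model using only the selected observations (here the non-quitters, nonkwit == 1, which play the role of the \(y_i > 0\) observations in the lecture), including \(\hat{\lambda}_i\) as an additional explanatory variable: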

   reg y sex dex lex lex2 lambda if nonkwit==1


      Source |       SS       df       MS              Number of obs =     522
-------------+------------------------------           F(  5,   516) =   55.86
       Model |  371.793456     5  74.3586912           Prob > F      =  0.0000
    Residual |   686.84747   516  1.33109975           R-squared     =  0.3512
-------------+------------------------------           Adj R-squared =  0.3449
       Total |  1058.64093   521  2.03194036           Root MSE      =  1.1537

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         sex |  -.6917229    .215062    -3.22   0.001    -1.114228    -.269218
         dex |   .0742407   .0392891     1.89   0.059    -.0029455    .1514269
         lex |  -.0929801   .9821782    -0.09   0.925     -2.02254     1.83658
        lex2 |   .0038033   .0400168     0.10   0.924    -.0748125    .0824192
      lambda |  -1.622262   1.652698    -0.98   0.327    -4.869106    1.624581
       _cons |    12.9142    8.05954     1.60   0.110    -2.919347    28.74774
------------------------------------------------------------------------------

You can then test for sample selectivity by checking the significance of the coefficient on \(\hat{\lambda}_i\) in this second-step regression, as remarked in Lecture 21. Based on this test, please indicate which model you should use.
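One caveat about the manual procedure is that the OLS standard errors in the second step treat lambda as if it were observed rather than estimated. As a cross-check, Stata's built-in heckman command with the twostep option runs both steps in one call and reports standard errors that account for the first-step estimation. The sketch below assumes the same variables as above; the point estimates should agree with the manual ones up to rounding, while the standard errors may differ.

   * selection equation in select(); outcome equation uses only nonkwit == 1
   heckman y sex dex lex lex2, select(nonkwit = sex dex lex lex2) twostep

Note that the same regressors appear in both equations here, so the correction term is identified only through the nonlinearity of the ratio \(\phi/\Phi\), which helps explain the large standard errors in the second step.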

Powell’s estimator

As pointed out in Lecture 21, a problem with the Gaussian MLE is that it can perform poorly when the errors are non-Gaussian and/or heteroskedastic. In that case we can use Powell's estimator, which is implemented in Stata by the clad command. To download it, type findit clad, select sg153 and click install. The syntax is

   clad depvar indepvars, reps(#) [ll(#) or ul(#)]

where reps(#) sets the number of bootstrap replications, and you specify ll(#) if the data are censored from below (with # the value at which censoring occurs) or ul(#) if they are censored from above.
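As an illustration only, the sketch below simulates a small data set that is censored from below at zero and has heteroskedastic errors, and then fits both the Gaussian tobit and Powell's estimator. The variable names (x, ystar, ycens), the seed and the parameter values are hypothetical and have nothing to do with the PS5 data. It assumes clad has already been installed as described above. Since clear drops the data in memory, run it in a separate session.

   clear
   set seed 508
   set obs 500
   gen x = rnormal()
   * latent outcome with error variance that depends on x
   gen ystar = 1 + 2*x + exp(0.5*x)*rnormal()
   * observed outcome, censored from below at zero
   gen ycens = max(ystar, 0)
   * Gaussian MLE, which assumes homoskedastic normal errors
   tobit ycens x, ll(0)
   * Powell's estimator with 50 bootstrap replications
   clad ycens x, reps(50) ll(0)

The errors here are symmetric around zero given x, so the zero conditional median condition holds and the clad slope can be compared with the true value of 2.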

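Note that, under certain conditions (essentially that the conditional median of the error is zero), this estimator is consistent for any error distribution \(F\), even in the presence of heteroskedasticity.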
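The reason can be sketched as follows, in notation that is an assumption here and may differ from Lecture 21. Take censoring from below at zero, \(y_i = \max\{0, x'_i\beta + u_i\}\), with the conditional median of \(u_i\) given \(x_i\) equal to zero. Because medians, unlike means, pass through monotone transformations such as censoring, the conditional median of \(y_i\) given \(x_i\) is \(\max\{0, x'_i\beta\}\), whatever the shape or scale of the distribution of \(u_i\). Powell's estimator is the corresponding least absolute deviations fit:

\[ \hat{\beta} = \arg\min_{b} \sum_{i=1}^{n} \left| y_i - \max\{0, x'_i b\} \right| \]

Only the zero-median condition is used, which is why neither normality nor homoskedasticity is required.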

¹ Please send comments to bottan2@illinois.edu or srmntbr2@illinois.edu

Contact	Office Hours	E-mail
Prof. Roger Koenker	M. & W. 2:30-3:30 or by appointment (126 DKH)	rkoenker@illinois.edu
TA Nicolas Bottan	TBA	bottan2@illinois.edu

Applied Econometrics
Econ 508 - Fall 2014

Professor: Roger Koenker

TA: Nicolas Bottan

e-TA 15: Censored Regression Models
