Welcome.
This time we focus on censored regression models. The idea is to apply
the main concepts of Tobit models to Problem Set 5. As usual, the main
reference is Prof. Koenker's Lecture Notes.
 Data
 You can download the data from the Econ 508 web page
and save the file in your preferred directory (I'll save mine as "C:\weco.dat").
Then open STATA and type:
 infile y sex dex lex kwit tenure censored using "C:\weco.dat"
 Drop the first observation of the data set, which contains only missing
values (the variable labels in the first line of the file cannot be read as numbers).
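 A minimal sketch of that step, assuming the header row was read in as observation 1 and therefore shows up as a row of missing values. It lists the first few rows to confirm this before dropping anything:

```stata
* Inspect the first few observations; the first should be all missing (.)
list in 1/3
* Drop the first observation only
drop in 1
```

 If the first row turns out to hold valid numbers, skip the drop.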
 Next you generate the variable lex squared:
 gen lex2=lex^2
 Then save the file in STATA
format (I'll save mine as "C:\weco.dta").
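 One way to do this, shown as a sketch that reuses the path mentioned above. The replace option is an addition here and overwrites any existing file with the same name:

```stata
* Save the data in Stata format (overwrites an existing C:\weco.dta)
save "C:\weco.dta", replace
* In a later session, reload it without repeating the infile step
use "C:\weco.dta", clear
```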
 I will use a subsample of the data set to demonstrate how to obtain the
main results. My subsample contains only 257 observations, obtained by
dropping the observations with lex==12, so my results may differ from
those for the original data set in PS5.
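 For reference, a subsample like this one could presumably be built as follows. This is a sketch only, and it should not be applied to the data you use for PS5 itself:

```stata
* Remove the observations with lex equal to 12
drop if lex==12
* Check how many observations remain (the text above reports 257)
count
```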
 Question 4:
 Part (a):
(For my subsample) 
 You need to use the Heckman two-step procedure to estimate the
productivity equation using only non-quitters. For my subsample, the
results are as follows:
 gen nonkwit=0
 replace nonkwit=1 if kwit==0
 probit nonkwit sex dex lex lex2
Iteration 0:   log likelihood = -150.50058
Iteration 1:   log likelihood = -140.37588
Iteration 2:   log likelihood = -140.30471
Iteration 3:   log likelihood =  -140.3047

Probit estimates                                  Number of obs   =        257
                                                  LR chi2(4)      =      20.39
                                                  Prob > chi2     =     0.0004
Log likelihood =  -140.3047                       Pseudo R2       =     0.0677

------------------------------------------------------------------------------
 nonkwit |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
     sex |  -.4195276   .1770855     -2.369   0.018      -.7666088   -.0724464
     dex |   .0447744   .0118068      3.792   0.000       .0216336    .0679153
     lex |   .3840883   .4376635      0.878   0.380      -.4737163    1.241893
    lex2 |  -.0149982   .0176316     -0.851   0.395      -.0495555     .019559
   _cons |  -3.528815   2.739516     -1.288   0.198      -8.898168    1.840538
------------------------------------------------------------------------------
 predict xb, xb
 gen smallphi=normd(xb)
 gen largephi=normprob(xb)
 gen lambda=smallphi/largephi
 reg y sex dex lex lex2 lambda if nonkwit==1
  
  Source |       SS       df       MS                  Number of obs =     187
---------+------------------------------               F(  5,   181) =   27.21
   Model |  171.375875     5  34.2751749               Prob > F      =  0.0000
Residual |  228.009406   181  1.25972047               R-squared     =  0.4291
---------+------------------------------               Adj R-squared =  0.4133
   Total |   399.38528   186  2.14723269               Root MSE      =  1.1224

------------------------------------------------------------------------------
       y |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
     sex |  -.7407524   .7180634     -1.032   0.304      -2.157604    .6760995
     dex |   .0886988   .0750953      1.181   0.239      -.0594761    .2368737
     lex |   .2145177   .7554096      0.284   0.777      -1.276024    1.705059
    lex2 |  -.0123412   .0298319     -0.414   0.680      -.0712041    .0465217
  lambda |  -1.702168     3.7428     -0.455   0.650      -9.087299    5.682964
   _cons |   11.01176   8.921278      1.234   0.219       -6.59132    28.61484
------------------------------------------------------------------------------
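 For reference, the two regressions above correspond to the two steps of the procedure, which can be summarized as follows. The notation is assumed here for illustration and may differ from the lecture notes: s_i stands for nonkwit, z_i and x_i for the regressors of the probit and of the productivity equation (the same variables in this exercise), and (u_i, ε_i) for jointly normal errors with Var(u_i) = 1, correlation ρ, and sd(ε_i) = σ_ε.

```latex
% Notation assumed for illustration; see the lead-in paragraph.
\begin{align*}
\text{Step 1 (probit):}\quad
  & P(s_i = 1 \mid z_i) = \Phi(z_i^\top \gamma),
  \qquad
  \hat\lambda_i = \frac{\phi(z_i^\top \hat\gamma)}{\Phi(z_i^\top \hat\gamma)} \\
\text{Step 2 (OLS on } s_i = 1\text{):}\quad
  & E[\, y_i \mid x_i, z_i, s_i = 1 \,]
  = x_i^\top \beta + \rho\,\sigma_\varepsilon\,\lambda(z_i^\top \gamma),
  \qquad \lambda(c) = \frac{\phi(c)}{\Phi(c)}
\end{align*}
```

 In the commands above, smallphi and largephi play the roles of φ and Φ evaluated at the fitted index xb, and the coefficient reported for lambda estimates ρσ_ε. In recent versions of Stata the functions normd() and normprob() are called normalden() and normal().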
 Then you can test for sample selectivity by checking the significance
of lambda, as remarked in Lecture 17. Please indicate which model you
should ultimately use, based on the result of this test.
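 The test can be run directly after the second-step regression, and the hand-built procedure can be cross-checked against Stata's built-in heckman command. The following is a sketch under stated assumptions: ynq is a hypothetical helper variable introduced here (y for non-quitters, missing for quitters), so that heckman infers selection from whether the outcome is observed.

```stata
* Wald test of H0: coefficient on lambda = 0
* (run right after the reg command that includes lambda)
test lambda

* Cross-check with the built-in two-step estimator.
* ynq is a hypothetical helper: y for non-quitters, missing for quitters.
gen ynq = y if nonkwit==1
heckman ynq sex dex lex lex2, select(sex dex lex lex2) twostep
```

 The coefficient estimates should agree with the manual two-step results, but the standard errors will generally differ: plain OLS in the second step ignores that lambda is itself estimated, whereas heckman adjusts for it. Under the null hypothesis of no selectivity, the usual t-test on lambda remains valid.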
 Part (b):
(For my subsample)
 The results from applying
OLS to the full subsample are, as in question 1:
 reg y sex dex lex lex2

  Source |       SS       df       MS                  Number of obs =     257
---------+------------------------------               F(  4,   252) =   62.24
   Model |  286.582591     4  71.6456477               Prob > F      =  0.0000
Residual |  290.064441   252  1.15104937               R-squared     =  0.4970
---------+------------------------------               Adj R-squared =  0.4890
   Total |  576.647032   256  2.25252747               Root MSE      =  1.0729

------------------------------------------------------------------------------
       y |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
     sex |  -1.039604   .1357226     -7.660   0.000        -1.3069   -.7723093
     dex |   .1248959   .0088647     14.089   0.000       .1074376    .1423542
     lex |   .6572255   .3479552      1.889   0.060      -.0280453    1.342496
    lex2 |  -.0292285    .013988     -2.090   0.038      -.0567767   -.0016802
   _cons |   5.952681   2.172383      2.740   0.007       1.674341    10.23102
------------------------------------------------------------------------------
 And the results from applying
the naive OLS to the restricted subsample of non-quitters are:
 reg y sex dex lex lex2 if nonkwit==1

  Source |       SS       df       MS                  Number of obs =     187
---------+------------------------------               F(  4,   182) =   34.11
   Model |  171.115328     4   42.778832               Prob > F      =  0.0000
Residual |  228.269953   182  1.25423051               R-squared     =  0.4284
---------+------------------------------               Adj R-squared =  0.4159
   Total |   399.38528   186  2.14723269               Root MSE      =  1.1199

------------------------------------------------------------------------------
       y |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
     sex |  -1.058262   .1675522     -6.316   0.000      -1.388856   -.7276673
     dex |   .1224521   .0114198     10.723   0.000       .0999198    .1449843
     lex |   .4962791    .431264      1.151   0.251       -.354641    1.347199
    lex2 |   -.023397   .0172528     -1.356   0.177      -.0574383    .0106442
   _cons |   7.149452   2.726205      2.622   0.009       1.770421    12.52848
------------------------------------------------------------------------------
 What can you conclude from them? Is there a sample selectivity problem? Why?
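 To compare the estimates more easily, the three regressions can be placed side by side. This is a sketch assuming a Stata version that provides estimates store and estimates table (the output above comes from an older release), and assuming lambda from part (a) is still in memory. The stored names ols_full, ols_nonkwit and heckman2 are arbitrary labels chosen here.

```stata
* Re-run each model quietly and store its results under a label
quietly reg y sex dex lex lex2
estimates store ols_full
quietly reg y sex dex lex lex2 if nonkwit==1
estimates store ols_nonkwit
quietly reg y sex dex lex lex2 lambda if nonkwit==1
estimates store heckman2
* Coefficients with standard errors, plus sample size and R-squared
estimates table ols_full ols_nonkwit heckman2, se stats(N r2)
```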