Welcome.
This time we focus on censored regression models. The idea is to apply the
main concepts of Tobit models to Problem Set 5. As usual, the main reference
is Prof. Koenker's Lecture Notes.
Data
You can download your data from the Econ 508 web page (here) and save the
file in your preferred directory (I'll save mine as "C:\weco.dat"). Then
open STATA and type:

infile y sex dex lex kwit tenure censored using "C:\weco.dat"
Drop the first line of the data set, which contains missing values (due to
the variable labels).
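Assuming the labels ended up as the first observation after the infile command above, one way to do this is:

drop in 1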
Next you generate the variable lex squared:

gen lex2=lex^2
Then save the file in STATA format (I'll save mine as "C:\weco.dta").
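The corresponding command would be something like:

save "C:\weco.dta", replace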
I will use a subsample of the data set to demonstrate how to obtain the main
results. My subsample contains only 257 observations, obtained by dropping
the observations with lex==12, so my results may differ from those based on
the original data set in PS5.
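In Stata terms, this subsample amounts to something like:

drop if lex==12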
Question 4:
Part (a):
(For my subsample)
You need to use the Heckman two-step procedure to estimate the productivity
equation using only non-quitters. For my subsample, the results are as
follows:
gen nonkwit=0
replace nonkwit=1 if kwit==0
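As an aside, the same indicator can be generated in a single line, since a logical expression in Stata evaluates to 0 or 1:

gen nonkwit=(kwit==0)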
probit nonkwit sex dex lex lex2
Iteration 0:   log likelihood = -150.50058
Iteration 1:   log likelihood = -140.37588
Iteration 2:   log likelihood = -140.30471
Iteration 3:   log likelihood =  -140.3047

Probit estimates                                        Number of obs =    257
                                                        LR chi2(4)    =  20.39
                                                        Prob > chi2   = 0.0004
Log likelihood = -140.3047                              Pseudo R2     = 0.0677

------------------------------------------------------------------------------
 nonkwit |      Coef.   Std. Err.       z     P>|z|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
     sex |  -.4195276   .1770855    -2.369   0.018      -.7666088   -.0724464
     dex |   .0447744   .0118068     3.792   0.000       .0216336    .0679153
     lex |   .3840883   .4376635     0.878   0.380      -.4737163    1.241893
    lex2 |  -.0149982   .0176316    -0.851   0.395      -.0495555     .019559
   _cons |  -3.528815   2.739516    -1.288   0.198      -8.898168    1.840538
------------------------------------------------------------------------------
predict xb, xb
gen smallphi=normd(xb)
gen largephi=normprob(xb)
gen lambda=smallphi/largephi
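Here lambda is the inverse Mills ratio, i.e. the standard normal density divided by the standard normal cdf, both evaluated at the fitted probit index xb. In more recent versions of Stata the functions normd() and normprob() have been renamed, so an equivalent construction would be:

gen smallphi=normalden(xb)
gen largephi=normal(xb)
gen lambda=smallphi/largephi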
reg y sex dex lex lex2 lambda if nonkwit==1
  Source |       SS       df       MS              Number of obs =     187
---------+------------------------------           F(  5,   181) =   27.21
   Model |  171.375875     5  34.2751749           Prob > F      =  0.0000
Residual |  228.009406   181  1.25972047           R-squared     =  0.4291
---------+------------------------------           Adj R-squared =  0.4133
   Total |   399.38528   186  2.14723269           Root MSE      =  1.1224

------------------------------------------------------------------------------
       y |      Coef.   Std. Err.       t     P>|t|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
     sex |  -.7407524   .7180634    -1.032   0.304      -2.157604    .6760995
     dex |   .0886988   .0750953     1.181   0.239      -.0594761    .2368737
     lex |   .2145177   .7554096     0.284   0.777      -1.276024    1.705059
    lex2 |  -.0123412   .0298319    -0.414   0.680      -.0712041    .0465217
  lambda |  -1.702168     3.7428    -0.455   0.650      -9.087299    5.682964
   _cons |   11.01176   8.921278     1.234   0.219       -6.59132    28.61484
------------------------------------------------------------------------------
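As a cross-check, Stata's built-in heckman command with the twostep option should reproduce these second-stage coefficients in one step (its reported standard errors are the Heckman-corrected ones, so they will differ from the naive OLS standard errors above):

heckman y sex dex lex lex2, select(nonkwit = sex dex lex lex2) twostep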
Then you can test for sample selectivity problems by checking the
significance of lambda, as remarked in Lecture 17. Please indicate which
model you should use in the end, based on the sample selectivity test.
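For instance, you can read off the t-statistic and p-value on lambda in the output above, or ask Stata for the corresponding Wald test right after the regression:

test lambda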
Part (b):
(For my subsample)
The results from applying OLS to the full subsample are, as in Question 1:

reg y sex dex lex lex2
  Source |       SS       df       MS              Number of obs =     257
---------+------------------------------           F(  4,   252) =   62.24
   Model |  286.582591     4  71.6456477           Prob > F      =  0.0000
Residual |  290.064441   252  1.15104937           R-squared     =  0.4970
---------+------------------------------           Adj R-squared =  0.4890
   Total |  576.647032   256  2.25252747           Root MSE      =  1.0729

------------------------------------------------------------------------------
       y |      Coef.   Std. Err.       t     P>|t|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
     sex |  -1.039604   .1357226    -7.660   0.000        -1.3069   -.7723093
     dex |   .1248959   .0088647    14.089   0.000       .1074376    .1423542
     lex |   .6572255   .3479552     1.889   0.060      -.0280453    1.342496
    lex2 |  -.0292285    .013988    -2.090   0.038      -.0567767   -.0016802
   _cons |   5.952681   2.172383     2.740   0.007       1.674341    10.23102
------------------------------------------------------------------------------
And the results from applying the naive OLS to the restricted subsample of
non-quitters are:

reg y sex dex lex lex2 if nonkwit==1
  Source |       SS       df       MS              Number of obs =     187
---------+------------------------------           F(  4,   182) =   34.11
   Model |  171.115328     4   42.778832           Prob > F      =  0.0000
Residual |  228.269953   182  1.25423051           R-squared     =  0.4284
---------+------------------------------           Adj R-squared =  0.4159
   Total |   399.38528   186  2.14723269           Root MSE      =  1.1199

------------------------------------------------------------------------------
       y |      Coef.   Std. Err.       t     P>|t|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
     sex |  -1.058262   .1675522    -6.316   0.000      -1.388856   -.7276673
     dex |   .1224521   .0114198    10.723   0.000       .0999198    .1449843
     lex |   .4962791    .431264     1.151   0.251       -.354641    1.347199
    lex2 |   -.023397   .0172528    -1.356   0.177      -.0574383    .0106442
   _cons |   7.149452   2.726205     2.622   0.009       1.770421    12.52848
------------------------------------------------------------------------------
What can you conclude from these results? Is there a sample selectivity
problem? Why?
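One convenient way to compare the two sets of OLS estimates side by side is to store them under names of your choice (here full and nonq) and tabulate them:

reg y sex dex lex lex2
estimates store full
reg y sex dex lex lex2 if nonkwit==1
estimates store nonq
estimates table full nonq, se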