--------------------------------------------------------------------------------
      name:
       log:  H:\Documents and Settings\Lee\My Documents\Document\stata\E6.2.log
  log type:  text
 opened on:  26 Oct 2009, 11:45:14

Part a.

.
. reg ed dist, vce(hc3)
. scalar r2_a1 = e(r2_a)

Linear regression                                      Number of obs =    3796
                                                       F(  1,  3794) =   29.58
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.0074
                                                       Root MSE      =  1.8074

------------------------------------------------------------------------------
             |             Robust HC3
          ed |      Coef.  Std. Err.       t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        dist |  -.0733727   .0134906    -5.44   0.000    -.0998222   -.0469232
       _cons |   13.95586   .0378502   368.71   0.000     13.88165    14.03006
------------------------------------------------------------------------------

The estimated slope is -.07337. Because dist is measured in tens of miles
(parts g and h use 2*_b[dist] for a 20-mile difference), this indicates that
living 10 more miles from campus reduces predicted years of schooling by
.07337 years.

Part b.

.
. reg ed dist bytest female black hispanic incomehi ownhome dadcoll cue80 stwmf
> g80, vce(hc3)
. scalar r2_a2= e(r2_a)

Linear regression                                      Number of obs =    3796
                                                       F( 10,  3785) =  197.05
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.2788
                                                       Root MSE      =  1.5425

------------------------------------------------------------------------------
             |             Robust HC3
          ed |      Coef.  Std. Err.       t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        dist |  -.0315387   .0117149    -2.69   0.007    -.0545069   -.0085706
      bytest |   .0938201   .0029854    31.43   0.000      .087967    .0996732
      female |    .145408   .0504617     2.88   0.004     .0464732    .2443427
       black |    .367971   .0676552     5.44   0.000     .2353268    .5006152
    hispanic |   .3985196   .0740377     5.38   0.000     .2533619    .5436773
    incomehi |   .3951984   .0620214     6.37   0.000     .2735999     .516797
     ownhome |   .1521313   .0650556     2.34   0.019     .0245838    .2796788
     dadcoll |   .6961324   .0708905     9.82   0.000     .5571451    .8351196
       cue80 |   .0232052   .0093368     2.49   0.013     .0048996    .0415108
    stwmfg80 |  -.0517777   .0197081    -2.63   0.009    -.0904172   -.0131382
       _cons |   8.827518   .2417556    36.51   0.000     8.353534    9.301502
------------------------------------------------------------------------------

The estimated slope is -.03154. This indicates that living 10 more miles from
campus reduces predicted years of schooling by .03154 years, holding bytest,
female, black, hispanic, incomehi, ownhome, dadcoll, cue80, and stwmfg80
constant.

Part c.

The multiple-regression estimate (-.03154) is less than half the size of the
simple-regression estimate (-.07337), which indicates possible omitted
variable bias in part a.

Part d.

The R2 increases substantially (from .0074 to .2788) when the additional
regressors are added. This is to be expected, since adding regressors that
have any correlation with ed makes the sum of squared residuals smaller and
the R2 larger. Since I used robust standard errors, the adjusted R2 is not
printed. Stata still computes it, and I saved both values. They are printed
below:

. scalar list r2_a1 r2_a2
     r2_a1 =  .00718796
     r2_a2 =  .27693235

.

The R2 and adjusted R2 in part b are very similar because the sample size is
very large (3796). With k = 11 estimated coefficients (including the
constant), (n-1)/(n-k) = 3795/3785 is very close to 1 (1.002642). See the
computation below.

. di (e(N)-1)/e(df_r)
1.002642

Part e.

The coefficient on dadcoll (.696) measures the effect of the father's college
attendance on the child's years of schooling, holding all other variables
constant.
Since dadcoll is a dummy variable that enters without interactions, this
difference is the same no matter what the distance is. The positive estimate
indicates that if the father went to college, the child will tend to get more
schooling.

Part f.

Cue80 controls for local unemployment. High unemployment reduces the
opportunity cost of college attendance, so we expect its coefficient to be
positive. Higher local wages increase the opportunity cost of college, so we
expect the sign on stwmfg80 to be negative. Both coefficients have the
expected sign and are significantly different from zero, but they are small
in magnitude. A 1 percentage point increase in unemployment raises schooling
by about .02 years. Another dollar per hour in wages reduces average schooling
by about .05 years. However, these are roughly the size of the distance
variable's coefficient.

.
. set obs 3797
obs was 3796, now 3797

. replace female = 0 in 3797
(1 real change made)

. replace black = 1 in 3797
(1 real change made)

. replace hispanic = 0 in 3797
(1 real change made)

. replace dadcoll = 0 in 3797
(1 real change made)

. replace momcoll = 1 in 3797
(1 real change made)

. replace cue80 = 7.5 in 3797
(1 real change made)

. replace stwmfg80 = 9.75 in 3797
(1 real change made)

. replace bytest = 58 in 3797
(1 real change made)

. replace dist = 2 in 3797
(1 real change made)

. replace incomehi = 1 in 3797
(1 real change made)

. replace ownhome = 1 in 3797
(1 real change made)

.
. predict yhat, xb

.
. scalar bob = yhat[3797]

. scalar jim = bob + 2*_b[dist]

Parts g and h.

The easiest way to do this is to add an observation to the dataset with Bob's
characteristics and predict schooling for it. Hence the set obs command and
the series of replace commands above.

. scalar list bob jim
       bob =  14.790515
       jim =  14.727437

Jim lives 20 miles farther away than Bob (dist is higher by 2), so his
predicted schooling differs from Bob's by 2 * (-.0315387) = -.0630774 years.

. log close
      name:
       log:  H:\Documents and Settings\Lee\My Documents\Document\stata\E6.2.log
  log type:  text
 closed on:  26 Oct 2009, 11:45:14
--------------------------------------------------------------------------------
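
The degrees-of-freedom adjustment in part d and the prediction arithmetic in
parts g and h can be checked by hand from the numbers printed in the log. The
sketch below is not part of the original Stata session. It uses Python only as
a calculator, the helper names (adjusted_r2, shift_prediction) are made up for
illustration, and the inputs are copied from the output above, so the results
can match Stata only up to the rounding of those printed values.

```python
# Inputs copied from the part b output and the scalar list in the log.
n = 3796              # Number of obs
df_resid = 3785       # residual degrees of freedom, from F(10, 3785)
r2 = 0.2788           # R-squared as printed (rounded to four decimals)
b_dist = -0.0315387   # coefficient on dist in part b
bob = 14.790515       # predicted schooling for Bob (dist = 2)


def adjusted_r2(r2, n, df_resid):
    """Adjusted R2 = 1 - (1 - R2) * (n - 1) / (residual degrees of freedom)."""
    return 1.0 - (1.0 - r2) * (n - 1) / df_resid


def shift_prediction(yhat, coef, delta_x):
    """Move a linear prediction by changing one regressor by delta_x."""
    return yhat + coef * delta_x


# The factor the log computes with: di (e(N)-1)/e(df_r)
ratio = (n - 1) / df_resid

# Adjusted R2 for part b, rebuilt from the printed (rounded) R-squared.
adj = adjusted_r2(r2, n, df_resid)

# Jim lives 20 miles farther away, i.e. 2 more units of dist
# (assumption taken from the log's own 2*_b[dist] calculation).
jim = shift_prediction(bob, b_dist, 2)

print(f"(n-1)/df_r  = {ratio:.6f}")
print(f"adjusted R2 = {adj:.4f}")
print(f"jim         = {jim:.6f}")
```

The ratio should reproduce the 1.002642 shown in part d. The adjusted R2 is
rebuilt from an R-squared rounded to four decimals, so it should agree with
the stored r2_a2 only to about four decimals, and Jim's prediction should
agree with the scalar list up to the last printed digit.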