# The Estimated Coefficient on Career Runs and Games Played Question

Take Home Assignment
There are 40 multiple choice questions, worth 2.5 points each.
Please fill out the scantron that has been uploaded separately in this assignment.
 Note that #1 on the scantron begins midway down the first column
 Just put your name on the scantron; ID not needed
You must show work for any problem that requires it.
 You can do this next to or under a problem in this document or all together on a separate sheet.
as one file.
You may work together. However, you must turn in your own scantron and your own
written work.
Use the following output for questions 1-4.
An econometrician performs the following regression:
$\text{lsalary} = \beta_0 + \beta_1\,\text{games} + \beta_2\,\text{runs} + u$
where
lsalary = natural log of salary of a major league baseball player
games = career games played
runs = career runs scored

| Source | SS | df | MS |
|---|---|---|---|
| Model | 220.933197 | 2 | 110.466598 |
| Residual | 271.242338 | 350 | .77497811 |
| Total | 492.175535 | 352 | 1.39822595 |

| Statistic | Value |
|---|---|
| Number of obs | 353 |
| F(2, 350) | 142.54 |
| Prob > F | 0.0000 |
| R-squared | 0.4489 |
| Adj R-squared | 0.4457 |
| Root MSE | .88033 |

| lsalary | Coef. | Std. Err. | t | P>\|t\| | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|
| games | .0005747 | .0002504 | 2.29 | 0.022 | .0000822 | .0010672 |
| runs | .0016433 | .0004482 | 3.67 | 0.000 | .0007619 | .0025247 |
| _cons | 12.64234 | .0766381 | 164.96 | 0.000 | 12.49161 | 12.79307 |

1. All else equal, an increase of 100 games played in a career has what effect on salary?
A. Increases salary by about 0.057%.
B. Increases salary by about \$5,700.
C. Has no effect on salary.
D. Increases salary by about 5.7%.
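
As a rough illustration of how a coefficient in a log-level model is read, the sketch below converts the reported coefficient on games into a percent change in salary for 100 additional games. The function name `pct_change_log_level` is hypothetical and not part of the assignment; it reports both the usual rule of thumb (100 times the coefficient times the change in x) and the exact exponential version.

```python
import math

def pct_change_log_level(coef, delta_x):
    """Percent change in y implied by a log-level model, ln(y) = ... + coef * x.

    Returns (approximate, exact): the rule of thumb 100 * coef * delta_x
    and the exact value 100 * (exp(coef * delta_x) - 1).
    """
    approx = 100.0 * coef * delta_x
    exact = 100.0 * (math.exp(coef * delta_x) - 1.0)
    return approx, exact

# Coefficient on games from the regression output; 100 extra career games.
approx_pct, exact_pct = pct_change_log_level(0.0005747, 100)
print(f"approximate: {approx_pct:.2f}%, exact: {exact_pct:.2f}%")
```

The two figures are close here because the implied change in log salary is small; they drift apart as the change grows.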
2. Since career games played is included in the regression, the estimated coefficient on career runs is really measuring
A. how an increase in games played increases salary
B. how an increase in average runs per game played increases salary
C. how an increase in games played and runs per game increases salary
D. how an increase in the correlation between games played and runs per game increases salary
3. If games played had been left out of the regression, then the coefficient on runs would likely be _________
A. biased upward
B. biased downward
C. unbiased
D. statistically insignificant
4. Based on the p-value associated with games, we know that
A. $\beta_2$ is NOT significantly different from 0 at the 1% level
B. $\beta_1$ is significantly different from 0 at the 1% level
C. $\beta_1$ is NOT significantly different from 0 at the 5% level
D. $\beta_1$ is significantly different from 0 at the 5% level
5. Manager Mark wants to find baseball players who are undervalued by the market (not being paid their true worth). He believes that walks (walks) and extra bases (xtrabases) determine wins. Which of the following strategies will help him find the players he wants after estimating
$\text{salary} = \beta_0 + \beta_1\,\text{walks} + \beta_2\,\text{xtrabases} + u$?
A. Use the estimated model to predict each player's residual, $\hat{u}$. Players with negative $\hat{u}$ are undervalued.
B. Use the estimated model to predict each player's residual, $\hat{u}$. Players with positive $\hat{u}$ are undervalued.
C. Use the estimated model to predict each player's salary, $\hat{y}$. Players with small $\hat{y}$ are undervalued.
D. Use the estimated model to predict each player's salary, $\hat{y}$. Players with large $\hat{y}$ are undervalued.
Use the following model for questions 6-7. We estimate the following model for average standardized test scores of 5th graders in California school districts:
$\text{testscr} = \beta_0 + \beta_1\,\text{avginc} + \beta_2\,\text{avgincsq} + u$
where avginc is average household income in the district and avgincsq is $\text{avginc}^2$.
. reg testscr avginc avgincsq

| Source | SS | df | MS |
|---|---|---|---|
| Model | 84599.2786 | 2 | 42299.6393 |
| Residual | 67510.3151 | 417 | 161.89524 |
| Total | 152109.594 | 419 | 363.030056 |

| Statistic | Value |
|---|---|
| Number of obs | 420 |
| F(2, 417) | 261.28 |
| Prob > F | 0.0000 |
| R-squared | 0.5562 |
| Adj R-squared | 0.5540 |
| Root MSE | 12.724 |

| testscr | Coef. | Std. Err. | t | P>\|t\| | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|
| avginc | 3.850995 | .3042617 | 12.66 | 0.000 | 3.252917 | 4.449073 |
| avgincsq | -.0423085 | .0062601 | -6.76 | 0.000 | -.0546137 | -.0300033 |
| _cons | 607.3017 | 3.046219 | 199.36 | 0.000 | 601.3139 | 613.2896 |

6. What is the marginal effect of average income on test scores?
A. 3.85
B. 3.85 – .0423avginc
C. 0
D. 3.85 – .0846avginc
7. At approximately what value of average income are test scores maximized?
A. 36
B. 45.5
C. 64
D. 85.5
E. none of the above
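
For a quadratic specification such as the test score model in questions 6-7, the slope with respect to income is the derivative of the fitted equation, and the turning point is where that slope equals zero. The sketch below works through that arithmetic with the reported coefficients; the helper names `marginal_effect` and `turning_point` are illustrative assumptions, not part of the assignment.

```python
def marginal_effect(b1, b2, x):
    """Slope of b1*x + b2*x**2 with respect to x, evaluated at x."""
    return b1 + 2.0 * b2 * x

def turning_point(b1, b2):
    """Value of x at which the slope b1 + 2*b2*x equals zero."""
    return -b1 / (2.0 * b2)

# Coefficients on avginc and avgincsq from the regression output.
b_avginc, b_avgincsq = 3.850995, -0.0423085

x_star = turning_point(b_avginc, b_avgincsq)
print(f"slope at avginc = 10: {marginal_effect(b_avginc, b_avgincsq, 10):.3f}")
print(f"turning point: avginc = {x_star:.1f}")
```

Because the coefficient on the squared term is negative, the turning point is a maximum rather than a minimum.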
8. A researcher interested in studying whether women trade off between time at work and
time sleeping gets the following estimated equation:
predicted weekly hours of sleep = $\hat{\beta}_0 - \hat{\beta}_1\,(\text{weekly hours of work}) - \hat{\beta}_2\,(\text{education})$
If education were not included in the model, the size of the change in the estimated coefficient on hours depends upon
A. the amount of heteroskedasticity in the model
B. the amount of correlation between hours of work and education
C. the statistical significance of education
D. the sample size
9. Using data on 4,137 college students at a midsize research university, the following equation was estimated using OLS:
where colgpa is measured on a four-point scale, hsperc is the percentile in the high school
graduating class (so that, for example, hsperc = 5 corresponds to the top 5% of the class), and
sat is the combined math and verbal scores on the student achievement test.
Suppose that two high school graduates, A and B, graduated in the same percentile from high school, but student A's SAT score was 180 points higher. What is the predicted difference in college GPA for these two students?
A. Student A's GPA will be .572 points higher
B. Student A's GPA will be .2664 points higher
C. Student B's GPA will be .424 points higher
D. Student B's GPA will be .3522 points higher
E. none of the above
10. Based on the following estimated wage equation, how would you interpret the coefficient on college? The possible educational categories are: less than high school, high school grad, some college, college, and graduate.
ln(w age) = 1.1 + . 09highscho ol + .24somecollege + . 54college + .72graduate
+ .07exp – . 004exp 2 + .10male
A. Those with a college degree earn about 54% more than those who did not graduate high school.
B. Those with a college degree earn about 54% more than high school graduates.
C. Those with a college degree earn about \$0.54 per hour more than those who did not graduate high school.
D. Those with a college degree earn about \$54,000 more than those who did not graduate high school.
Use the following output for questions 11-13.
Suppose a researcher estimates a wage model, and gets the following STATA results, where:
lwage = log wages
educ = years of education
exper = years of experience
female = 1 if individual is female, 0 otherwise.

| Source | SS | df | MS |
|---|---|---|---|
| Model | 52.29391 | 3 | 17.4313033 |
| Residual | 96.0358517 | 522 | .183976727 |
| Total | 148.329762 | 525 | .28253288 |

| Statistic | Value |
|---|---|
| Number of obs | 526 |
| F(3, 522) | 94.75 |
| Prob > F | 0.0000 |
| R-squared | |
| Adj R-squared | |
| Root MSE | .42893 |

| lwage | Coef. | Std. Err. | t | P>\|t\| | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|
| educ | .0912897 | | 12.82 | | .0772962 | .1052833 |
| exper | .0094139 | .0014493 | | 0.000 | .0065667 | .012261 |
| female | -.3435967 | .0376668 | | 0.000 | -.4175939 | -.2695995 |
| _cons | .4808356 | .1050163 | 4.58 | | .2745292 | .6871421 |

11. How much of the variation in the log of wages is explained by the variables in the model?
A. 0.2488
B. 0.3526
C. 0.4554
D. 0.5446
E. 0.6475
12. Is the coefficient (true beta) on female significantly different from 0 at the 5% level?
A. yes
B. no
C. not enough information to tell
13. What is the standard error for the estimated coefficient on educ?
A. 0.0002
B. 0.0071
C. 0.0145
D. 0.273
E. none of the above
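
Several entries in the output for questions 11-13 are left blank, but they can be recovered from the numbers that remain. The sketch below shows the relationships involved, assuming the standard definitions: R-squared is the model sum of squares over the total sum of squares, and a t statistic is the coefficient divided by its standard error. Variable names are illustrative only.

```python
# Sums of squares from the ANOVA part of the output.
ss_model, ss_total = 52.29391, 148.329762
r_squared = ss_model / ss_total  # share of variation in lwage explained

# t = coef / se, so a missing standard error is coef / t.
coef_educ, t_educ = 0.0912897, 12.82
se_educ = coef_educ / t_educ

# A missing t statistic follows from the same identity.
coef_female, se_female = -0.3435967, 0.0376668
t_female = coef_female / se_female

print(f"R-squared: {r_squared:.4f}")
print(f"se(educ): {se_educ:.4f}")
print(f"t(female): {t_female:.2f}")
```

The significance of female can also be read from its confidence interval, which excludes zero when the coefficient is significant at the 5% level.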
14. All else constant, which of the following factors would decrease the estimated variance of
a beta-hat?
A. multicollinearity
B. a decrease in sample size
C. an increase in sample size
D. removing relevant variables from the regression
Use the following scenario for questions 15-16.
A researcher is interested in the determinants of sentencing, and using a random sample of
people convicted of homicide obtains the following, where priyears is prison sentence in years,
prviolnu is the number of prior violent offenses, stranger = 1 if the victim was a stranger,
black=1 if the offender is black, and vblack=1 if the victim was black.

| priyears | Coef. | Std. Err. | t | P>\|t\| | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|
| prviolnu | .5323433 | .1477347 | 3.60 | 0.000 | .2426575 | .822029 |
| stranger | 1.801506 | .5583621 | 3.23 | 0.001 | .7066412 | 2.896372 |
| black | 1.70609 | .7505269 | 2.27 | 0.023 | .234418 | 3.177762 |
| vblack | -3.392937 | .7427198 | -4.57 | 0.000 | -4.8493 | -1.936573 |
| _cons | 9.536065 | .4526042 | 21.07 | 0.000 | 8.648575 | 10.42355 |

15. Which factor decreases the predicted length of a prison sentence?
A. more prior violent offenses
B. the victim being a stranger
C. the offender being black
D. the victim being black
16. All else equal, how is the sentence of a black person who kills a white stranger (criminal A)
expected to compare to that of a white person who kills a black stranger (criminal B)?
A. criminal A is expected to be sentenced to 5.1 years more than criminal B
B. criminal A is expected to be sentenced to 1.7 years less than criminal B
C. criminal A is expected to be sentenced to 3.4 years more than criminal B
D. criminal A is expected to be sentenced to 1.7 years more than criminal B
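
Comparing two hypothetical offenders in a model with dummy variables comes down to evaluating the fitted equation twice and taking the difference; anything held equal cancels. The sketch below does this for the two cases in question 16. The `predict_sentence` helper is an illustrative assumption, and the number of prior violent offenses is set to zero for both offenders only because it cancels in the difference.

```python
# Coefficients from the sentencing regression output.
COEFS = {
    "prviolnu": 0.5323433,
    "stranger": 1.801506,
    "black": 1.70609,
    "vblack": -3.392937,
    "_cons": 9.536065,
}

def predict_sentence(prviolnu, stranger, black, vblack):
    """Fitted prison sentence in years for the given characteristics."""
    return (COEFS["_cons"]
            + COEFS["prviolnu"] * prviolnu
            + COEFS["stranger"] * stranger
            + COEFS["black"] * black
            + COEFS["vblack"] * vblack)

# Criminal A: black offender, white stranger victim.
sentence_a = predict_sentence(prviolnu=0, stranger=1, black=1, vblack=0)
# Criminal B: white offender, black stranger victim.
sentence_b = predict_sentence(prviolnu=0, stranger=1, black=0, vblack=1)

difference = sentence_a - sentence_b
print(f"A minus B: {difference:.2f} years")
```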
17. Suppose we are interested in measuring the direct impact of productivity (measured as
wage) on the amount of exercise done in a week. However, we also suspect that exercise affects productivity. We would be worried that the following problem is occurring:
A. Intervening variable
B. No correlation
C. Common response
D. Reverse causation
18. You are considering trying to estimate the effect of IQ on income using cross-sectional data on individuals. However, you realize that the variance in income might differ by IQ level.
Thus, unless you take action, your model is likely to suffer from
A. multicollinearity
B. bias
C. serial correlation
D. heteroskedasticity
19. All else equal, which model is more likely to be in need of robust standard errors?
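Model A: $\text{price} = \beta_0 + \beta_1\,\text{lotsize} + \beta_2\,\text{sqrft} + \beta_3\,\text{bdrms} + u$
Model B: $\ln(\text{price}) = \beta_0 + \beta_1\,\text{lotsize} + \beta_2\,\text{sqrft} + \beta_3\,\text{bdrms} + u$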
A. Model A.
B. Model B.
C. They are equally likely to need robust standard errors.
D. There is not enough information to make an educated guess.
20. Which of the following can cause OLS estimators to be biased?
A. Heteroskedasticity
B. Omitting an important variable that is correlated with an included independent variable
C. A correlation of .80 between two independent variables included in the model
D. Including an interaction term in the model
E. None of these will cause OLS to be biased.
21. Based on the following estimated model of touchdowns as a function of pass attempts and
completions, how will estimated touchdowns change if a quarterback makes 100 more pass
attempts, 40 of which are also completions?
TD s = .025 ? .031attempts + .116completions
A. Estimated touchdowns will not change
B. Estimated touchdowns will decrease by about 7.5
C. Estimated touchdowns will increase by about 8.32
D. Estimated touchdowns will increase by about 1.54
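
When two regressors change together, the change in the prediction is the sum of each coefficient times its own change. The sketch below applies that to question 21, taking the coefficient on attempts to be negative (the sign is damaged in the source, and a negative value is the reading consistent with the listed options). The function name `td_change` is hypothetical.

```python
def td_change(d_attempts, d_completions,
              b_attempts=-0.031, b_completions=0.116):
    """Change in predicted touchdowns for given changes in both regressors.

    The intercept drops out because it is the same before and after.
    The negative sign on b_attempts is an assumption about the garbled source.
    """
    return b_attempts * d_attempts + b_completions * d_completions

# 100 more attempts, 40 of which are completions.
change = td_change(100, 40)
print(f"change in predicted touchdowns: {change:.2f}")
```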
Use the following output for questions 22-24.
Suppose you are interested in estimating the probability that an individual smokes as a function of:
educ = years of schooling
age = age in years
restaurn = 1 if state restaurant smoking restrictions exist
. reg smoke educ age restaurn

| Source | SS | df | MS |
|---|---|---|---|
| Model | 7.2089102 | 3 | 2.40297007 |
| Residual | 183.708066 | 803 | .228777168 |
| Total | 190.916976 | 806 | .236869698 |

| Statistic | Value |
|---|---|
| Number of obs | 807 |
| F(3, 803) | 10.50 |
| Prob > F | 0.0000 |
| R-squared | 0.0378 |
| Adj R-squared | 0.0342 |
| Root MSE | .47831 |

| smoke | Coef. | Std. Err. | t | P>\|t\| | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|
| educ | -.0223266 | .0056113 | -3.98 | 0.000 | -.0333411 | -.0113121 |
| age | -.0036405 | .0010064 | -3.62 | 0.000 | -.005616 | -.001665 |
| restaurn | -.0990255 | .0391505 | -2.53 | 0.012 | -.1758749 | -.022176 |
| _cons | .8371168 | .0893487 | 9.37 | 0.000 | .6617322 | 1.012501 |

22. Based on the linear probability model, what is the estimated probability of smoking for a 30-year-old with 12 years of education in a state that DOES have smoking restrictions?
A. 0.2245
B. 0.3612
C. 0.4013
D. 0.5589
E. 0.6104
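
In a linear probability model the fitted value is read directly as a probability, so the prediction in question 22 is the intercept plus each coefficient times the corresponding value. The sketch below carries out that sum with the reported coefficients; `lpm_predict` is an illustrative helper name, not something from the assignment. The result may differ from the listed options in the last digit or two because of rounding.

```python
def lpm_predict(coefs, values):
    """Fitted probability from a linear probability model.

    coefs maps variable names (plus "_cons") to estimates;
    values maps variable names to the values to predict at.
    """
    return coefs["_cons"] + sum(coefs[name] * x for name, x in values.items())

# Coefficients from the smoking regression output.
smoke_coefs = {
    "educ": -0.0223266,
    "age": -0.0036405,
    "restaurn": -0.0990255,
    "_cons": 0.8371168,
}

# 30 years old, 12 years of education, restrictions in place.
p_hat = lpm_predict(smoke_coefs, {"educ": 12, "age": 30, "restaurn": 1})
print(f"predicted probability of smoking: {p_hat:.3f}")
```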
23. Based on the linear probability model, how would you interpret the estimated coefficient on
educ?
A. An additional year of education decreases smoking by 2.2%
B. A 1% increase in years of education increases the probability of smoking by 0.022.
C. A 1% increase in years of education decreases the probability of smoking by 0.22.
D. An additional year of education decreases the probability of smoking by 0.022.
24. Based on the linear probability model, how would you interpret the estimated coefficient on restaurn?
A. Living in a state with restaurant smoking restrictions decreases the probability of smoking by .099.
B. Living in a state with restaurant smoking restrictions decreases the probability of smoking by 9.9%.
C. An additional restaurant in a state decreases the probability of smoking by .099
D. An additional restaurant in a state decreases the probability of smoking by .099%
25. The process of Ordinary Least Squares (OLS) estimates the parameters of a linear regression
by
A. simultaneously minimizing the residual for each observation
B. maximizing the likelihood of observing the data we have given our model
C. minimizing the sum of squared residuals
D. minimizing the square root of the product of the residuals
26. When deciding whether or not to drop variables from a model,
A. the economic significance of the coefficients does not matter
B. the change in the R-squared should be considered
C. its important to know whether heteroskedasticity is present, even if robust standard errors are used
D. an F-test should be conducted if the variables are individually insignificant and possibly
correlated
E. all of the above
Use the following output for questions 27-29.
The STATA output below estimates the following model, with northeast being the base (excluded) region category.
$\log(\text{wage}) = \beta_0 + \beta_1\,\text{educ} + \beta_2\,\text{female} + \beta_3\,\text{exper} + \beta_4\,\text{female}\cdot\text{exper} + \beta_5\,\text{married} + \beta_6\,\text{northcen} + \beta_7\,\text{south} + \beta_8\,\text{west} + u$
. reg lwage educ female exper femexper married northcen south west

| Source | SS | df | MS |
|---|---|---|---|
| Model | 58.4412743 | 8 | 7.30515929 |
| Residual | 89.8884874 | 517 | .173865546 |
| Total | 148.329762 | 525 | .28253288 |

| Statistic | Value |
|---|---|
| Number of obs | 526 |
| F(8, 517) | 42.02 |
| Prob > F | 0.0000 |
| R-squared | 0.3940 |
| Adj R-squared | 0.3846 |
| Root MSE | .41697 |

| lwage | Coef. | Std. Err. | t | P>\|t\| | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|
| educ | .0888772 | .0071271 | 12.47 | 0.000 | .0748756 | .1028788 |
| female | -.1622576 | .0592933 | -2.74 | 0.006 | -.2787431 | -.0457721 |
| exper | .0127103 | .0020453 | 6.21 | 0.000 | .0086922 | .0167283 |
| femexper | -.0098447 | .0027135 | -3.63 | 0.000 | -.0151755 | -.0045139 |
| married | .1384326 | .0405266 | 3.42 | 0.001 | .0588155 | .2180496 |
| northcen | -.0917406 | .0530091 | -1.73 | 0.084 | -.1958804 | .0123992 |
| south | -.0948155 | .0494229 | -1.92 | 0.056 | -.1919099 | .0022788 |
| west | .0648169 | .058641 | 1.11 | 0.270 | -.050387 | .1800208 |
| _cons | .4072016 | .1137366 | 3.58 | 0.000 | .1837588 | .6306444 |

27. Choose the correct interpretation of the variable west.
A. Someone living in the west makes 64% more than everyone else.
B. Someone living in the west makes 6.5% more than someone living in the northeast.
C. Someone living in the west makes \$6,400 more than someone living in the northeast.
D. Someone living in the west makes \$0.64 more than everyone else.
28. An expression for the return to experience (effect of experience on log(wages)) is:
A. .0127 – .0098female
B. .0127
C. .0127 – .0098exper
D. .003exper
A. The return to experience is higher for women, compared to men.
B. The return to experience is lower for women, compared to men.
30. Suppose we would like to estimate the impact of a cigarette tax on lung cancer incidence. Our
hypothesis is that cigarette taxes will reduce consumption of cigarettes, thereby reducing
secondhand smoke and eventually lung cancer. In a model with lung cancer incidence as the dependent variable, should we include the level of cigarette consumption in order to more accurately
estimate the effect of the tax on lung cancer incidence?
A. Yes
B. No
C. Only if it increases the R-squared.
D. Its impossible to say without conducting an F-test.
Suppose an econometrician wants to see how gender and marital status affect wages. Everyone
can be categorized into one of four groups (married male, single male, married female, single female).
$\text{wages} = \beta_0 + \beta_1\,\text{femmar} + \beta_2\,\text{malemar} + \beta_3\,\text{malesing} + u$
where femmar = 1 if female and married, 0 otherwise
malemar = 1 if male and married, 0 otherwise
malesing = 1 if male and not married, 0 otherwise
31. $\beta_1$ measures:
A. how much a married female earns compared to everyone else in the sample.
B. how much a married female earns compared to a single female.
C. how much a married female earns compared to a single male.
D. how much a married female earns compared to a married male.
Use the following information to answer questions 32-34. As an alternative to the above model, I estimate the following model.
$\text{wages} = \beta_0 + \beta_1\,\text{female} + \beta_2\,\text{married} + \beta_3\,\text{femmar} + u$
where femmar = female * married
female = 1 if female, 0 otherwise
married = 1 if married, 0 otherwise
32. Which of the following represents the expected wages for a married female?
A. $\beta_0 + \beta_1$
B. $\beta_0 + \beta_1 + \beta_2$
C. $\beta_0 + \beta_2 + \beta_3$
D. $\beta_0 + \beta_1 + \beta_2 + \beta_3$
33. Which of the following, written in terms of this model's coefficients, is equivalent to $\beta_1$ in the earlier model (#31)?
A. $\beta_1$
B. $\beta_1 + \beta_2$
C. $\beta_2 + \beta_3$
D. $\beta_1 + \beta_2 + \beta_3$
34. The null hypothesis to test whether or not the effect of being married varies by gender is
A. $H_0$: $\beta_1 = 0$, $\beta_3 = 0$
B. $H_0$: $\beta_3 = 0$
C. $H_0$: $\beta_1 = \beta_2$
D. $H_0$: $\beta_2 = 0$, $\beta_3 = 0$
Use the following information for questions 35-37. We estimate a model of the natural logarithm of
salary for baseball players as a function of years played in the league, average games played per
year, batting average, home runs per year, and runs batted in per year.
$\ln(\text{salary}) = \beta_0 + \beta_1\,\text{years} + \beta_2\,\text{gamesyr} + \beta_3\,\text{bavg} + \beta_4\,\text{hrunsyr} + \beta_5\,\text{rbisyr} + u$
. reg lsalary years gamesyr bavg hrunsyr rbisyr

| Source | SS | df | MS |
|---|---|---|---|
| Model | 308.989208 | 5 | 61.7978416 |
| Residual | 183.186327 | 347 | .527914487 |
| Total | 492.175535 | 352 | 1.39822595 |

| Statistic | Value |
|---|---|
| Number of obs | 353 |
| F(5, 347) | 117.06 |
| Prob > F | 0.0000 |
| R-squared | 0.6278 |
| Adj R-squared | 0.6224 |
| Root MSE | .72658 |

| lsalary | Coef. | Std. Err. | t | P>\|t\| | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|
| years | .0688626 | .0121145 | 5.68 | 0.000 | .0450355 | .0926898 |
| gamesyr | .0125521 | .0026468 | 4.74 | 0.000 | .0073464 | .0177578 |
| bavg | .0009786 | .0011035 | 0.89 | 0.376 | -.0011918 | .003149 |
| hrunsyr | .0144295 | .016057 | 0.90 | 0.369 | -.0171518 | .0460107 |
| rbisyr | .0107657 | .007175 | 1.50 | 0.134 | -.0033462 | .0248776 |
| _cons | 11.19242 | .2888229 | 38.75 | 0.000 | 10.62435 | 11.76048 |

. test bavg hrunsyr rbisyr

( 1) bavg = 0
( 2) hrunsyr = 0
( 3) rbisyr = 0

F(3, 347) = 9.55
Prob > F = 0.0000

35. Consider the following null and alternative hypothesis:
$H_0: \beta_5 = 0$
$H_1: \beta_5 \neq 0$
Would you reject the null at the 20% level of significance?_________
Would you reject the null at the 5% level of significance?___________
Would you reject the null at the 1% level of significance?__________
A. yes; yes; yes
B. no; no; no
C. yes; no; no
D. no; yes; yes
36. What is the estimated variance of $\hat{\beta}_5$?
A. 0.007175
B. 0.0000515
C. 0.0121
D. 0.00265
E. none of the above
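
Questions 35 and 36 both rest on reading one row of the coefficient table. The sketch below assumes the usual conventions: a two-sided null is rejected when the reported p-value is below the chosen significance level, and the estimated variance of a coefficient is the square of its standard error. Variable names are illustrative.

```python
# Row for rbisyr from the regression output.
se_rbisyr = 0.007175
p_rbisyr = 0.134

# Estimated variance of the coefficient is the squared standard error.
var_rbisyr = se_rbisyr ** 2

# Reject H0 at level alpha when the p-value is below alpha.
decisions = {alpha: p_rbisyr < alpha for alpha in (0.20, 0.05, 0.01)}

print(f"estimated variance: {var_rbisyr:.7f}")
for alpha, reject in decisions.items():
    print(f"reject at {alpha:.0%} level: {reject}")
```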
37. What do you conclude when you test the following null hypothesis?
$H_0: \beta_3 = 0,\ \beta_4 = 0,\ \beta_5 = 0$
A. The parameters are jointly significant at the 10% level, but not at a lower significance level.
B. The parameters are individually significant at less than the 1% level.
C. The parameters are not jointly significant at any reasonable level.
D. The parameters are jointly significant at less than the 1% level.
38. A researcher is planning on estimating a simple model of per-pupil school spending as a function of the district's per capita income. He obtains the following, where lnexppup is ln(per-pupil spending) and lnpcy is ln(per capita income).
$\widehat{\text{lnexppup}} = 5.41 + 2.14\,\text{lnpcy}$
These results imply that
A. increasing per capita income by \$100 increases spending by \$21.40
B. increasing per capita income by \$100 increases spending by 21%
C. increasing per capita income by 100% increases spending by \$21.40
D. increasing per capita income by 1% increases spending by 2.14%
Use this information for questions 39-40. An econometrician estimates the following model:
$\text{colgpa} = \beta_0 + \beta_1\,\text{sat} + \beta_2\,\text{tothrs} + \beta_3\,\text{sathours} + \beta_4\,\text{female} + u$
where colgpa is college GPA, sat is SAT score, tothrs is the number of credit hours accumulated prior to the semester, sathours is an interaction of sat and tothrs, and female is a dummy variable = 1 if the student is female.
. reg colgpa sat tothrs sathours female

| Source | SS | df | MS |
|---|---|---|---|
| Model | 398.539145 | 4 | 99.6347864 |
| Residual | 1395.65653 | 4132 | .337767795 |
| Total | 1794.19567 | 4136 | .433799728 |

| Statistic | Value |
|---|---|
| Number of obs | 4137 |
| F(4, 4132) | 294.98 |
| Prob > F | 0.0000 |
| R-squared | 0.2221 |
| Adj R-squared | 0.2214 |
| Root MSE | .58118 |

| colgpa | Coef. | Std. Err. | t | P>\|t\| | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|
| sat | .0027439 | .0001184 | 23.17 | 0.000 | .0025117 | .002976 |
| tothrs | .0155283 | .0018969 | 8.19 | 0.000 | .0118095 | .0192472 |
| sathours | -.0000128 | 1.83e-06 | -7.00 | 0.000 | -.0000164 | -9.24e-06 |
| female | .2248177 | .0184042 | 12.22 | 0.000 | .1887355 | .2608998 |
| _cons | -.3971054 | .1236172 | -3.21 | 0.001 | -.6394617 | -.1547492 |

39. What is true about the effect of SAT score on college GPA?
A. An increase in SAT of one point increases college GPA by .0027439
B. SAT does not have a statistically significant impact on college GPA
C. For those who have more accumulated credit hours, the impact of SAT on GPA is smaller
D. For those who have more accumulated credit hours, the impact of SAT on GPA is larger
E. None of the above
40. If 90 hours are accumulated, the effect of SAT on GPA is approximately:
A. 0.0027
B. 0.0016
C. -0.0000128
D. 0.016
E. none of the above
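
With an interaction term in the model, the effect of SAT on GPA is not a single number: it is the coefficient on sat plus the interaction coefficient times the number of accumulated hours. The sketch below evaluates that expression with the reported estimates; `sat_effect` is a hypothetical helper name used only for illustration.

```python
def sat_effect(tothrs, b_sat=0.0027439, b_sathours=-0.0000128):
    """Partial effect of one SAT point on college GPA at a given tothrs.

    Derived from colgpa = ... + b_sat*sat + b_sathours*(sat*tothrs) + ...
    """
    return b_sat + b_sathours * tothrs

effect_at_0 = sat_effect(0)
effect_at_90 = sat_effect(90)
print(f"effect of SAT at 0 hours:  {effect_at_0:.4f}")
print(f"effect of SAT at 90 hours: {effect_at_90:.4f}")
```

Because the interaction coefficient is negative, the effect shrinks as accumulated credit hours rise.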