# AF 6208 University of Pittsburgh Econometrics Problems

Description


Econometric Methods (AF6208)
Dr. Steven X. Wei
Problem Set II [Due time and date: 11:00 pm on 17 April 2022]
Note: I design the problem set for three purposes: (1) your understanding of the lecture
notes; (2) reading research papers and working on your research projects/dissertation; and
(3) preparing for your take-home exam (which will include some similar types of
questions to those in the problem set). It is strongly suggested to work on the problem set
independently first, and then get together to discuss your solutions within your group.
Please stay in the same group as in Problem Set I. Each group submits ONE write-up,
Chapter 6: Q3
Chapter 7: Q9
Chapter 8: Q4
Chapter 15: Q1, Q4 and Q7.
Q1. Understand some Important Issues
(1) Suppose the true model is given by
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$$
while you estimate (a wrong model below)
$$y = \beta_0 + \beta_1 x_1 + \varepsilon$$
a. Demonstrate that the OLS estimate of $\beta_1$ is inconsistent in general. Explain in
which special situations it can still be consistent.
b. What is the implication of the issue on your future research?
(2) Explain p-value when you conduct a t-test and F-test.
[Note that a p-value is closely related to your null hypothesis! If you change your
null hypothesis, then the p-value will be different too! In addition, for any
hypothesis testing, we can calculate its p-value! Think of why empirical analysts
prefer to report p-values in their reports or papers.]
(3) In your own research, if you have the issue of (conditional) heteroskedasticity in
your regression, what would you worry about  the estimated parameters and/or
Q2. A Case Study (Imagine you are working on a project/thesis)
You are asked to conduct an economic analysis of the HK labor market. The main question
is whether females are discriminated against (in terms of salary) in the labor market. A
student randomly collected 1000 observations, 500 males and 500 females. She
simply calculated the average monthly salary: \$25000 for females and
\$30000 for males. She then concludes that females are discriminated against in the labor market
(in terms of salary). [Hint: Use a dummy variable in your research design!]
student. [Note that this is quite a general problem in the real world!]
(2) If you were the student, how could you design your test to make a better inference?
(3) Would you like to include control variables in your analysis? Why?
(4) Are there any possible econometric (or economics) issues on which you have
concerns in your analysis? Explain your problem(s), if any, and propose your
solutions to the problem(s).
[This question gives you a rough and real situation on how the research in economics is
conducted, and how the econometric inferences play an important role in the process!
Hopefully, this helps you structure your independent studies and thesis.]
Q3. Understand Endogeneity:
(1) What is an endogeneity issue? Why should we care about it?
(2) Provide a few (three or more) economic situations in which an endogeneity issue
arises.
(3) What is an instrumental variable for an endogenous variable?
(4) What is the two-stage least squares estimate of a parameter?
(5) Explain intuitively why using instrumental variables can (at least partially) solve
the endogeneity problem.
(6) What are under-identified, exactly-identified and over-identified cases?
(7) Discuss and comment on the case of weak instrumental variables [Read the
textbook and explain it intuitively].
(8) Search the internet or any other sources, and then discuss the key ideas of
other ways (i.e., non-IV methods) of dealing with the endogeneity issue.
Q4. Suppose that we have two estimators, $\hat{\theta}_n$ and $\hat{\gamma}_n$,
of parameters $\theta$ and $\gamma$, respectively, where n represents the sample size. Answer the
following questions:
If both of the estimators are consistent,
(1) would $\hat{\theta}_n + \hat{\gamma}_n$ be a consistent estimator of $\theta + \gamma$?
(2) would $\hat{\theta}_n - \hat{\gamma}_n$ be a consistent estimator of $\theta - \gamma$?
(3) would $2\hat{\theta}_n$ be a consistent estimator of $2\theta$?
(4) would $\hat{\theta}_n \cdot \hat{\gamma}_n$ be a consistent estimator of $\theta \cdot \gamma$?
(5) would $\hat{\theta}_n / \hat{\gamma}_n$ be a consistent estimator of $\theta / \gamma$ (of course, we assume that $\gamma \neq 0$)?
Can you summarize the law or rule here for your future reference?
This is the end of questions.
AF6208: Econometrics Methods
Dr. Steven X. Wei
Lecture Note 03
Linear Regression Model:
Inference
You may read Greene's book (Ch. 5 and Ch. 6) if you wish to learn more about this part.
This section considers conducting statistical inference with a multiple
linear regression model.
[Diagram: Economics/any other field -> ask a good question (forming a hypothesis) -> econometric model. The model links data (observable) to parameters (non-observable). The parameters split into parameters of interest, on which the hypothesis places constraints, and nuisance parameters.]
Example 3.1. (Used as the Motivation)
Suppose you ask a sensitive question: Are females discriminated against in
the labour market (in terms of salary)?
We frame it into an Econometric Model:
$$wage = \beta_0 + \beta_1\, female + \beta_2\, educ + \varepsilon$$
Data = {wage, female, educ} (observable)
Parameters $\theta = \{\beta_0, \beta_1, \beta_2, \sigma^2\}$ (non-observable)
Parameter of interest: $\beta_1$
Constraint on the parameter of interest: $\beta_1 < 0$.

How can we properly assess whether $\beta_1 < 0$, given that the parameter is unobservable? The main idea is roughly as follows:
1. Given the hypothesis (what you would like to assess), you need to find a vehicle (i.e., a test statistic) to assess whether the hypothesis is rejected.
2. The assessment is a probability assessment, rather than 100% yes or no. Never say 100% yes or no in statistics! To do so, we use the term "level of significance" to frame the statistical language. [I assume that you have learnt the basic terms before, or you should read the part of Lecture Note 01 on hypothesis testing!]
3. For the vehicle, i.e., the test statistic, we need to know its exact distribution (figuring out the exact distribution is usually the statisticians' job!). For us (to conduct applied work), we only need to know it and to apply it!
Next, we come to the details.

Review (from the last lecture):
Under the CLR assumptions, OLS estimates (or more exactly, estimators) are BLUE!
In order to conduct classical hypothesis testing, we need to add another assumption: $\varepsilon$ [the disturbance] is independent of $x_1, x_2, \ldots, x_K$ and normally distributed with mean 0 and variance $\sigma^2$, i.e., $\varepsilon \sim N(0, \sigma^2)$.

What is a normal distribution? The standard normal distribution's probability density function is graphed below:
[Figure: standard normal density $f(\varepsilon)$, centered at 0.]

Under the above assumption, we can derive the following result:
$$t = \frac{\hat{\beta}_j - \beta_j}{se(\hat{\beta}_j)} \sim t_{n-(K+1)}.$$
Note:
1. This is a Student t-distribution (to be derived in Appendix A of this note)! The t-statistic reported everywhere (including in your future thesis) comes from the ratio $t = \hat{\beta}_j / se(\hat{\beta}_j)$. Where can you find the values of $\hat{\beta}_j$ and $se(\hat{\beta}_j)$? They are in the computer output from running the regression model.
2. The degrees of freedom is $n - (K+1)$.

How to conduct a t-test
A simple case first:
1. Start with a null hypothesis (say, for some $0 \le j \le K$): $H_0: \beta_j = 0$ vs $H_1: \beta_j \neq 0$.
2. If we reject the null hypothesis, it means that $x_j$ significantly affects y, after controlling for the other x's. (In most cases, this is what we wish to see!)
3. We use the t-test:
$$t = \frac{\hat{\beta}_j - 0}{se(\hat{\beta}_j)} = \frac{\hat{\beta}_j}{se(\hat{\beta}_j)}.$$
4. We need to choose a significance level, $\alpha$; the conventional values of $\alpha$ are 1%, 5% and 10%. (What is the meaning of $\alpha$?)
5. Find the critical value C from the t-table (see my class distribution). Note that the rejection region is |t| > C.
6. Now we calculate the t-value from $t = \hat{\beta}_j / se(\hat{\beta}_j)$.
7. If the t-value (often called the t-statistic) is in the rejection region, i.e.,
|t| > C, then we reject the null hypothesis H0. Otherwise, we fail to
reject H0.
[Figure: t distribution with n - K - 1 degrees of freedom. The rejection regions are the two tails beyond the critical values -C and C, each with probability $\alpha/2$; we fail to reject H0 between -C and C.]
One-sided test:
The above test is a two-sided test since the rejection region is in both
sides (or tails) of the t-distribution.
In fact, we could have a one-sided test,
$H_0: \beta_j \le 0$ vs $H_1: \beta_j > 0$.
Everything is the same as for a two-sided test, except that the rejection region is now t > C, while the t-test statistic is computed in
the same way as before:
$$t = \frac{\hat{\beta}_j}{se(\hat{\beta}_j)}.$$
[Figure: t distribution with n - K - 1 degrees of freedom. The rejection region is the right tail beyond the critical value C, with probability $\alpha$; we fail to reject H0 to the left of C.]
Figure out the case with the following one-sided test:
$H_0: \beta_j \ge 0$ vs $H_1: \beta_j < 0$. Think of Example 3.1.

Example 3.2
Consider the estimated equation (by using OLS):
$$\log(wage) = \underset{(0.104)}{0.284} + \underset{(0.007)}{0.092}\,educ + \underset{(0.0017)}{0.0041}\,exper + \underset{(0.003)}{0.022}\,tenure$$
$n = 526$, $R^2 = .316$
Questions:
1. What do the reported statistics in the above equation mean?
2. Is the return (in terms of wage) to experience zero? Conduct a formal test on this issue at the 5% level of significance.
Solution:
We form our hypothesis as: $H_0: \beta_{exper} = 0$ vs. $H_1: \beta_{exper} > 0$.
[Note: In applications, indexing a parameter by its associated
variable name is a nice way to label parameters!]
We use the t-test, whose degrees of freedom is __________________.
Given the level of significance $\alpha$ = 5%, the critical value C is _____.
Now we calculate the t-statistic, which is t = _________________.
Since_________, we ____________ the null hypothesis. What does
this mean in terms of economics? It means ___________________
______________________________________________________.
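The mechanics of this test can be sketched in a few lines of Python. The coefficient, standard error, and sample size come from the estimated equation in Example 3.2; the critical value 1.645 is an assumption here, namely the one-sided 5% value read from a t-table for a very large number of degrees of freedom. The variable names are illustrative only.

```python
# Numbers taken from the estimated equation in Example 3.2.
n = 526               # sample size
k = 3                 # slope coefficients: educ, exper, tenure
beta_exper = 0.0041   # estimated coefficient on exper
se_exper = 0.0017     # its reported standard error

df = n - (k + 1)                 # degrees of freedom: n - (K + 1)
t_stat = beta_exper / se_exper   # t = estimate / standard error under H0: beta = 0

# Assumption: one-sided 5% critical value for large df, read from a t-table.
critical_value = 1.645

reject_h0 = t_stat > critical_value   # one-sided rejection region: t > C
print(f"df = {df}, t = {t_stat:.2f}, reject H0 at the 5% level: {reject_h0}")
```

The economic interpretation of the decision is still left to the reader, as in the blanks above.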
Summary for $H_0: \beta_j = 0$ (as this is the most commonly used case)
• Unless otherwise stated, the alternative hypothesis is always
assumed to be two-sided;
• If we reject the null, we typically say $x_j$ is statistically significant
at the $\alpha$ (= 1%, 5%, 10%) level;
• If we fail to reject the null, we typically say $x_j$ is statistically
insignificant at the $\alpha$ (= 1%, 5%, 10%) level.
Example 3.3 (A real case in a published paper)
A JAE paper (2014), "Accounting earnings and gross domestic
product," by Konchitchki and Patatoukas.
The abstract of the paper:
We document that aggregate accounting earnings growth is an
incrementally significant leading indicator of growth in nominal
Gross Domestic Product (GDP). Professional macro forecasters,
however, do not fully incorporate the predictive content
embedded in publicly available accounting earnings data. As a
result, future nominal GDP growth forecast errors are predictable
based on accounting earnings data that are available to
professional macro forecasters in real time.
Testing other hypotheses
A slightly more general case of hypothesis testing is:
$H_0: \beta_j = a$ vs $H_1: \beta_j \neq a$
where a is a known constant. In this case, everything is the same as
before, except that the t-statistic is now:
$$t = \frac{\hat{\beta}_j - a}{se(\hat{\beta}_j)}.$$
Confidence Intervals:
What is the confidence interval for a parameter $\beta_j$?
$$\hat{\beta}_j \pm C \cdot se(\hat{\beta}_j),$$
where C is the $(1 - \alpha/2)$ quantile of a $t_{n-(K+1)}$ distribution.
Remark: The confidence interval can be viewed in two different
ways. First, it is understood as an interval estimate (of $\beta_j$). Second, it
is viewed as an alternative to the t-test for the hypothesis $H_0: \beta_j = a$: we reject H0 at level $\alpha$ exactly when a lies outside the interval.
Example 3.4
Model of R&D Expenditures
$$\log(rd) = \underset{(.47)}{-4.38} + \underset{(.060)}{1.084}\,\log(sales) + \underset{(.0218)}{.217}\,profmarg$$
$n = 32$, $R^2 = .918$
Questions:
1. What is the interpretation of the estimate, 1.084?
2. Construct a 95% confidence interval for the sales elasticity.
Solution:
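For question 2, a minimal sketch of the interval computation follows. The estimate, standard error, and sample size are taken from Example 3.4; the critical value 2.045 is an assumption, read from a standard t-table as the 97.5th percentile with 29 degrees of freedom. Variable names are illustrative.

```python
# Inputs from Example 3.4.
beta_sales = 1.084   # estimated coefficient on log(sales), i.e. the sales elasticity
se_sales = 0.060     # its reported standard error
n, k = 32, 2         # observations and number of slope coefficients

df = n - (k + 1)     # degrees of freedom: 32 - 3 = 29

# Assumption: 97.5th percentile of the t distribution with 29 df, from a t-table.
c = 2.045

lower = beta_sales - c * se_sales
upper = beta_sales + c * se_sales
print(f"95% CI for the sales elasticity: ({lower:.3f}, {upper:.3f})")

# Viewed as a test: H0 "elasticity = 1" is not rejected at 5% if 1 lies inside.
contains_one = lower <= 1.0 <= upper
print("Interval contains 1:", contains_one)
```

Read as a test, the last line illustrates the second interpretation of a confidence interval given in the remark above.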
p-value (for t-test, at this stage)
• An alternative to the classical approach is to ask: what is the
smallest significance level at which the null would be rejected?
• So, compute the t statistic, and then look up what percentile it is
in the appropriate t distribution; this gives the p-value!
• The p-value is the probability that we would observe a t statistic at least
as extreme as the one obtained, if the null hypothesis were true.
For example: $H_0: \beta_j = 0$ vs $H_1: \beta_j \neq 0$
We first calculate the t-value, $t = \hat{\beta}_j / se(\hat{\beta}_j)$; let's call it $t_0$.
Then you can think of it as located in the following distribution:
[Figure: t distribution with n - K - 1 degrees of freedom; the tail probability beyond $t_0$ is marked.]
Let's go back and insert this into our original idea:
[Figure: the same t distribution with the critical values -C and C (tail probability $\alpha/2$ each, rejection regions in the tails) and the observed values $-t_0$ and $t_0$ (tail probability p/2 each); we fail to reject H0 between -C and C.]
Note:
• Most computer packages will compute the p-value for you,
assuming a two-sided test;
• If you really want a one-sided alternative, just divide the two-sided p-value by 2, provided the estimate has the sign specified by the alternative [Originally, two-sided test: P(|t| > t0) = p. Now,
one-sided test: P(t > t0) = (1/2) P(|t| > t0) = p/2];
• Stata/SAS, for example, provides the t statistic, p-value, and 95%
confidence interval for $H_0: \beta_j = 0$ for you, in columns labeled "t",
"P > |t|" and "[95% Conf. Interval]", respectively.
• This means that you need to understand the p-value, rather than
how exactly to compute it in practice!
Importantly, you need to know:
p-value < $\alpha$ (level of significance) means that we reject the null hypothesis at the $\alpha$ level of significance. (Why?)
Of course, p-value > $\alpha$ (level of significance) means that we cannot
reject the null hypothesis at the $\alpha$ level of significance.
This is what you need to clearly understand when you read research
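The decision rule "reject when the p-value is below the significance level" can be illustrated with a short sketch. It assumes the degrees of freedom are large enough that the standard normal distribution is a good stand-in for the t distribution (an approximation, not the exact t-based p-value), and it reuses the exper estimate and standard error from Example 3.2. The function name is hypothetical.

```python
from statistics import NormalDist


def two_sided_p_value_normal_approx(t0: float) -> float:
    """P(|Z| > |t0|) for a standard normal Z.

    Assumption: with large degrees of freedom this approximates the
    two-sided p-value based on the t distribution.
    """
    return 2.0 * (1.0 - NormalDist().cdf(abs(t0)))


# t statistic for exper in Example 3.2 (df = 522, so the approximation is mild).
t0 = 0.0041 / 0.0017
p = two_sided_p_value_normal_approx(t0)

for alpha in (0.01, 0.05, 0.10):
    decision = "reject H0" if p < alpha else "fail to reject H0"
    print(f"alpha = {alpha:.2f}: p-value = {p:.4f} -> {decision}")
```

Reporting the p-value lets each reader apply their own significance level, which is one reason empirical papers report it.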
Testing a linear combination:
• Suppose instead of testing whether $\beta_1$ is equal to a constant, you
want to test if it is equal to another parameter, that is, $H_0: \beta_1 = \beta_2$;
[Note that in practice, we often write the hypothesis as $H_0: \beta_1 + (-\beta_2) = 0$, i.e., a linear combination of the two parameters!]
• Use the same basic procedure for forming a t statistic:
$$t = \frac{\hat{\beta}_1 - \hat{\beta}_2}{se(\hat{\beta}_1 - \hat{\beta}_2)}.$$
The key issue here is to derive $se(\hat{\beta}_1 - \hat{\beta}_2)$.
Since $se(\hat{\beta}_1 - \hat{\beta}_2) = \sqrt{\widehat{Var}(\hat{\beta}_1 - \hat{\beta}_2)}$, we then consider
$$Var(\hat{\beta}_1 - \hat{\beta}_2) = Var(\hat{\beta}_1) + Var(\hat{\beta}_2) - 2\,Cov(\hat{\beta}_1, \hat{\beta}_2),$$
so that
$$se(\hat{\beta}_1 - \hat{\beta}_2) = \sqrt{[se(\hat{\beta}_1)]^2 + [se(\hat{\beta}_2)]^2 - 2 s_{12}},$$
where $s_{12}$ is an estimate of $Cov(\hat{\beta}_1, \hat{\beta}_2)$.
Note: In practice, you could obtain the estimate of $Cov(\hat{\beta}_1, \hat{\beta}_2)$ from any
econometric software used to run the regression.
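As a small sketch of this formula, the function below combines two standard errors and an estimated covariance into the standard error of the difference and the resulting t statistic. The function names and all numeric inputs are hypothetical and chosen only for illustration; in practice the inputs would come from the regression output.

```python
import math


def se_of_difference(se1: float, se2: float, cov12: float) -> float:
    """Standard error of (b1 - b2): sqrt(se1^2 + se2^2 - 2*cov12)."""
    var_diff = se1 ** 2 + se2 ** 2 - 2.0 * cov12
    if var_diff < 0:
        raise ValueError("inputs imply a negative variance; check se1, se2, cov12")
    return math.sqrt(var_diff)


def t_stat_equal_coefficients(b1: float, b2: float,
                              se1: float, se2: float, cov12: float) -> float:
    """t statistic for H0: beta1 = beta2."""
    return (b1 - b2) / se_of_difference(se1, se2, cov12)


# Hypothetical regression output, for illustration only.
t = t_stat_equal_coefficients(b1=0.50, b2=0.30, se1=0.10, se2=0.08, cov12=0.002)
print(f"t statistic for H0: beta1 = beta2 is {t:.3f}")
```

Ignoring the covariance term (setting it to zero) would generally give the wrong standard error, which is why the estimated covariance is needed.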
Re-parameterization (An alternative approach!):
Set $\theta_1 = \beta_1 - \beta_2$. Then, solving for $\beta_1$, we have $\beta_1 = \theta_1 + \beta_2$.
Substituting it into the original regression model:
$$y = \beta_0 + (\theta_1 + \beta_2) x_1 + \beta_2 x_2 + \ldots + \beta_K x_K + \varepsilon.$$
Rearranging it yields
$$y = \beta_0 + \theta_1 x_1 + \beta_2 (x_1 + x_2) + \beta_3 x_3 + \ldots + \beta_K x_K + \varepsilon.$$
Now rerun the regression.
Testing $H_0: \beta_1 = \beta_2$ is equivalent to testing $H_0: \theta_1 = 0$.
That is, we re-parameterize first, and then do the test in an
easy way!
Multiple linear restrictions
• Everything we've done so far has involved testing a single linear
restriction (e.g., $\beta_1 = 0$ or $\beta_1 = \beta_2$);
• However, we may want to jointly test multiple hypotheses about
our parameters;
• A typical example is to test "exclusion restrictions": we want to
know if a group of parameters are all equal to zero;
• Now the null hypothesis might be something like $H_0: \beta_{k-q+1} = 0, \ldots, \beta_k = 0$;
• The alternative is just H1: H0 is not true;
• We cannot just check each t statistic separately, because we want
to know if the q parameters are jointly significant at a given
significance level, and it is possible for none to be
individually significant at that level.
Exclusion Restrictions
• To do the test we need to estimate the "restricted model" without
$x_{k-q+1}, \ldots, x_k$ included, as well as the "unrestricted model" with
all x's included;
• Intuitively, we want to know if the change in SSR is big enough
to warrant inclusion of $x_{k-q+1}, \ldots, x_k$.
$$F \equiv \frac{(SSR_r - SSR_{ur})/q}{SSR_{ur}/(n-k-1)},$$
where r is restricted and ur is unrestricted.
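A minimal sketch of the SSR form of the F statistic is given below. The function name and the numeric inputs are hypothetical; the SSR values would normally be read off the output of the restricted and unrestricted regressions.

```python
def f_stat_from_ssr(ssr_r: float, ssr_ur: float, q: int, n: int, k: int) -> float:
    """F = [(SSR_r - SSR_ur) / q] / [SSR_ur / (n - k - 1)].

    ssr_r  : sum of squared residuals of the restricted model
    ssr_ur : sum of squared residuals of the unrestricted model
    q      : number of restrictions
    n, k   : sample size and number of slope coefficients (unrestricted model)
    """
    if ssr_r < ssr_ur:
        raise ValueError("restricted SSR cannot be smaller than unrestricted SSR")
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))


# Hypothetical values, for illustration only.
f = f_stat_from_ssr(ssr_r=120.0, ssr_ur=100.0, q=2, n=104, k=3)
print(f"F = {f:.2f} with ({2}, {104 - 3 - 1}) degrees of freedom")
```

The resulting value is then compared with a critical value from the F distribution with q and n - k - 1 degrees of freedom.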
The F statistic
• The F statistic is always nonnegative, since the SSR from the
restricted model can't be less than the SSR from the unrestricted model;
• Essentially the F statistic is measuring the relative increase in
SSR when moving from the unrestricted to the restricted model;
• q = number of restrictions, or $df_r - df_{ur}$; $n - k - 1 = df_{ur}$.
• To decide if the increase in SSR when we move to a restricted
model is "big enough" to reject the exclusions, we need to know
about the sampling distribution of our F stat.
• Not surprisingly, $F \sim F_{q,\,n-k-1}$, where q is referred to as the
numerator degrees of freedom and n - k - 1 as the denominator
degrees of freedom.
The $R^2$ form of the F statistic
Because the SSRs may be large and unwieldy, an alternative form of
the formula is useful;
We use the fact that $SSR = SST(1 - R^2)$ for any regression, so we can
substitute in for $SSR_r$ and $SSR_{ur}$:
$$F = \frac{(R^2_{ur} - R^2_r)/q}{(1 - R^2_{ur})/(n-k-1)}.$$
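The equivalence of the two forms can be checked numerically. The sketch below uses hypothetical SST and SSR values, converts them to $R^2$ via $R^2 = 1 - SSR/SST$, and compares the two statistics; both function names are illustrative.

```python
def f_stat_from_ssr(ssr_r, ssr_ur, q, n, k):
    """SSR form: [(SSR_r - SSR_ur) / q] / [SSR_ur / (n - k - 1)]."""
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))


def f_stat_from_r2(r2_ur, r2_r, q, n, k):
    """R-squared form: [(R2_ur - R2_r) / q] / [(1 - R2_ur) / (n - k - 1)]."""
    return ((r2_ur - r2_r) / q) / ((1.0 - r2_ur) / (n - k - 1))


# Hypothetical values; both models share the same dependent variable, hence the same SST.
sst, ssr_r, ssr_ur = 500.0, 120.0, 100.0
q, n, k = 2, 104, 3

r2_r = 1.0 - ssr_r / sst     # R-squared of the restricted model
r2_ur = 1.0 - ssr_ur / sst   # R-squared of the unrestricted model

f_ssr = f_stat_from_ssr(ssr_r, ssr_ur, q, n, k)
f_r2 = f_stat_from_r2(r2_ur, r2_r, q, n, k)
print(f"SSR form: {f_ssr:.4f}, R-squared form: {f_r2:.4f}")
```

The substitution only works when the restricted and unrestricted regressions have the same dependent variable, since SST must cancel.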
A special case of exclusion restrictions is to test $H_0: \beta_1 = \beta_2 = \ldots = \beta_k = 0$
(most software reports the F-value for this null
hypothesis!);
Since the $R^2$ from a model with only an intercept will be zero, the F
statistic is simply:
$$F = \frac{R^2/k}{(1 - R^2)/(n-k-1)}.$$
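As an illustration, the overall F statistic can be computed from the $R^2$, sample size, and number of regressors reported in Example 3.2 ($R^2 = .316$, n = 526, k = 3). The function name is hypothetical, and the sketch stops at the statistic itself rather than looking up a critical value.

```python
def overall_f_stat(r2: float, n: int, k: int) -> float:
    """F = (R2 / k) / [(1 - R2) / (n - k - 1)] for H0: all slope coefficients are zero."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must lie in [0, 1)")
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# Values reported for the wage equation in Example 3.2.
f_overall = overall_f_stat(r2=0.316, n=526, k=3)
print(f"Overall F statistic: {f_overall:.1f} with (3, 522) degrees of freedom")
```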
General Linear Restrictions
• The basic form of the F statistic will work for any set of linear
restrictions;
• First estimate the unrestricted model and then estimate the
restricted model;
• In each case, make note of the SSR;
• Imposing the restrictions can be tricky; you will likely have to redefine
variables again.
F Statistic Summary
• Just as with t statistics, p-values can be calculated by looking up
the percentile in the appropriate F distribution;
• Stata (package), for example, will do this by entering: display
fprob(q, n - k - 1, F), where the appropriate values of F, q, and
n - k - 1 are used;
• If only one exclusion is being tested, then $F = t^2$, and the p-values
will be the same.
Appendix 1
Note that $\varepsilon$ following a normal distribution $N(0, \sigma^2)$ implies that
the expected value of $\varepsilon$ is 0, i.e., $E(\varepsilon) = 0$, and the variance of $\varepsilon$ is
$\sigma^2$, i.e., $Var(\varepsilon) = \sigma^2$.
Under the assumption, we can write
$$y \mid x_1, x_2, \ldots, x_K \sim N(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_K x_K,\ \sigma^2).$$
Remark: You may ask why we make the normality assumption
on $\varepsilon$. Good question! Keep it in mind and we will come back to
the issue again soon.
Normal Sampling Distribution
Under the CLR assumptions and the above normality assumption on $\varepsilon$,
we have the important result for the estimator of $\beta_j$:
$$\hat{\beta}_j \sim N\left(\beta_j,\ Var(\hat{\beta}_j)\right).$$
Standardized,
$$\frac{\hat{\beta}_j - \beta_j}{sd(\hat{\beta}_j)} \sim N(0, 1).$$
Understanding:
$\hat{\beta}_j$ is distributed normally, because it is a linear combination of
the errors.
$$sd(\hat{\beta}_j) = \sqrt{Var(\hat{\beta}_j)} = \frac{\sigma}{\sqrt{SST_j (1 - R_j^2)}}.$$
If $\sigma^2$ is known, then $sd(\hat{\beta}_j)$ is known. Then we can calculate
$$Z = \frac{\hat{\beta}_j - \beta_j}{sd(\hat{\beta}_j)},$$
which we can use to do a normal test.
However, $\sigma^2$ is usually unknown! We have to replace it by its
estimate $\hat{\sigma}^2$:
$$\widehat{Var}(\hat{\beta}_j) = \frac{\hat{\sigma}^2}{SST_j (1 - R_j^2)},$$
where it is emphasized that the above is the estimate of $Var(\hat{\beta}_j)$, rather
than $Var(\hat{\beta}_j)$ itself! Usually, let us repeat, we use its square root,
$$se(\hat{\beta}_j) = \sqrt{\widehat{Var}(\hat{\beta}_j)} = \frac{\hat{\sigma}}{\sqrt{SST_j (1 - R_j^2)}}.$$
We obtain:
$$t = \frac{\hat{\beta}_j - \beta_j}{se(\hat{\beta}_j)} \sim t_{n-(K+1)}.$$
Note: 1. This is a (Student) t-distribution (vs. normal distribution)
because we have to estimate $\sigma^2$ by $\hat{\sigma}^2$. You should always remember
and be familiar with this t-test statistic! You need this value
in regression analysis.
2. The degrees of freedom is n - (K+1).
Lecture Note 04
Linear Regression Model:
Asymptotics, Further Issues,
and Heteroscedasticity
I. Asymptotic Properties [A bit technical!]
Recall: How do we judge the performance of an estimator in finite
samples?
Two criteria (in finite samples):
(1) Unbiasedness
(2) Efficiency
In this section, we upgrade the two criteria into two new ones
for large samples!
Two criteria (in large samples):
(1) Consistency
(2) Asymptotic Efficiency
Under the Gauss-Markov assumptions, OLS estimators are
BLUE, but in other cases it won't always be possible to find
unbiased estimators (we will come to some examples in the
next topic!)
In those cases, we settle for estimators that are consistent,
meaning that as $n \to \infty$ (i.e., the sample size becomes large), the
distribution of the estimator shrinks to the (true) parameter
value.
Look at the intuition on the next page!
[Figure: graphical illustration of consistency.]
Convergence in Probability
$$\operatorname{plim}\ \theta_n = \theta.$$
Description (See the mathematical details in Appendix A):
Random variables $\theta_n$ (think of $\theta_n$ as an estimator of $\theta$ with sample size
n) converge in probability to a constant $\theta$ if the distributions of
$\theta_n$ become more and more concentrated on $\theta$ as the sample size n
gets larger and larger, and eventually collapse to the constant $\theta$.
For convenience and by convention, we write it mathematically as
$$\operatorname*{plim}_{n \to \infty}\ \theta_n = \theta.$$
Definition: An estimator $\theta_n$ of $\theta$ is consistent if
$\operatorname{plim}\ \theta_n = \theta$.
Example 4.1
The sample mean, $\theta_n\ (\equiv \bar{X})$, is a consistent estimator of the
population mean $\mu$. (Why?)
This is also known as the Law of Large Numbers!
In fact, a general result is that sample moments approach
population moments under mild conditions. This is important!
If $X_i$ and $Z_i$, $i = 1, 2, \ldots, n$, are the samples of X and Z, respectively, then
we have:
• Sample variance is a consistent estimator of population variance:
$$\operatorname*{plim}_{n \to \infty}\ \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2 = Var(X).$$
Usually, we write $s_x^2 \xrightarrow{p} \sigma_x^2$.
• Sample covariance $\xrightarrow{p}$ population covariance:
$$\operatorname*{plim}_{n \to \infty}\ \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})(Z_i - \bar{Z}) = Cov(X, Z).$$
Usually, we write $s_{xz} \xrightarrow{p} \sigma_{xz}$.
• Sample moments $\xrightarrow{p}$ population moments, under mild conditions:
$$\operatorname*{plim}_{n \to \infty}\ \frac{1}{n} \sum_{i=1}^{n} X_i^r = E(X^r).$$
One important property of consistency (or convergence in
probability) is that it is preserved under continuous functions!
What does it mean? Suppose $h(\cdot)$ is a continuous function (no
discontinuity at any point!). If $\operatorname{plim}\ \theta_n = \theta$, then $\operatorname{plim}\ h(\theta_n) = h(\theta)$.
Similarly, if $\operatorname{plim}\ \theta_n = \theta$ and $\operatorname{plim}\ \gamma_n = \gamma$, then $\operatorname{plim}\ h(\theta_n + \gamma_n) = h(\theta + \gamma)$.
This is a very useful and general result when you work on some
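The property that continuous functions preserve probability limits can be illustrated by simulation. In the sketch below, two sample means are consistent for their population means, and their ratio (a continuous function wherever the denominator is nonzero) is seen to settle near the ratio of the population means. The distributions, seed, and parameter values are assumptions for illustration.

```python
import random
from statistics import fmean


def ratio_of_means(n: int, mu_x: float, mu_z: float, rng: random.Random) -> float:
    """Ratio of two sample means, each computed from n independent normal draws."""
    xbar = fmean(rng.gauss(mu_x, 1.0) for _ in range(n))
    zbar = fmean(rng.gauss(mu_z, 1.0) for _ in range(n))
    return xbar / zbar   # continuous in (xbar, zbar) as long as zbar != 0


rng = random.Random(0)
mu_x, mu_z = 3.0, 2.0    # hypothetical population means; the target ratio is 1.5

for n in (10, 1_000, 100_000):
    print(f"n = {n:>7}: ratio of sample means = {ratio_of_means(n, mu_x, mu_z, rng):.4f}")
```

The same reasoning covers sums, differences, products, and scalar multiples of consistent estimators.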
Consistency of OLS
• Under the Gauss-Markov assumptions, the OLS estimator is
consistent (and of course it is unbiased too, as we have shown
before);
• Consistency can be proved for the simple regression case in a
manner similar to the proof of unbiasedness;
• We will need to take the probability limit (plim) to establish consistency.
Note: You may not fully understand the mathematics, but you need to
understand the idea and major results of consistency!
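Proving the consistency of $\hat{\beta}_1$ in the simple linear regression
model $y = \beta_0 + \beta_1 x + \varepsilon$:
$$
\begin{aligned}
\hat{\beta}_1 &= \frac{\sum (x_i - \bar{x})\, y_i}{\sum (x_i - \bar{x})^2} && \text{(Remember this!)} \\
&= \frac{\sum (x_i - \bar{x})(\beta_0 + \beta_1 x_i + \varepsilon_i)}{\sum (x_i - \bar{x})^2} && \text{(Substitute } y_i = \beta_0 + \beta_1 x_i + \varepsilon_i\text{)} \\
&= \frac{\beta_0 \sum (x_i - \bar{x})}{\sum (x_i - \bar{x})^2} + \frac{\beta_1 \sum (x_i - \bar{x})\, x_i}{\sum (x_i - \bar{x})^2} + \frac{\sum (x_i - \bar{x})\, \varepsilon_i}{\sum (x_i - \bar{x})^2} \\
&= 0 + \beta_1 + \frac{\sum (x_i - \bar{x})\, \varepsilon_i}{\sum (x_i - \bar{x})^2}.
\end{aligned}
$$
Dividing the numerator and the denominator by n and taking the probability limit,
$$\hat{\beta}_1 = \beta_1 + \frac{\frac{1}{n} \sum (x_i - \bar{x})\, \varepsilon_i}{\frac{1}{n} \sum (x_i - \bar{x})^2} \xrightarrow{p} \beta_1 + \frac{Cov(x, \varepsilon)}{Var(x)} = \beta_1,$$
where the last equality holds because $Cov(x, \varepsilon) = 0$.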
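The consistency result for the simple regression slope can be checked by simulation. The sketch below uses the slope formula $\sum (x_i - \bar{x}) y_i / \sum (x_i - \bar{x})^2$ on simulated data in which the error is independent of x; the data-generating values, the normal distributions, and the seed are assumptions for illustration.

```python
import random


def ols_slope(xs, ys):
    """Simple-regression slope: sum((x - xbar) * y) / sum((x - xbar)^2)."""
    xbar = sum(xs) / len(xs)
    numerator = sum((x - xbar) * y for x, y in zip(xs, ys))
    denominator = sum((x - xbar) ** 2 for x in xs)
    return numerator / denominator


def simulate_slope(n, beta0, beta1, rng):
    """Draw n observations from y = beta0 + beta1 * x + e with e independent of x."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [beta0 + beta1 * x + rng.gauss(0.0, 1.0) for x in xs]
    return ols_slope(xs, ys)


rng = random.Random(0)
for n in (20, 2_000, 100_000):
    print(f"n = {n:>7}: estimated slope = {simulate_slope(n, 1.0, 2.0, rng):.4f} (true value 2.0)")
```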
Deriving the inconsistency (similar to bias)
Just as we could derive the omitted variable bias earlier, now we
want to think about the inconsistency, or asymptotic bias, in this
case.
True model: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + v$
You think: $y = \beta_0 + \beta_1 x_1 + u$, so that
$u = \beta_2 x_2 + v$ and $\operatorname{plim}\ \tilde{\beta}_1 = \beta_1 + \beta_2 \delta$,
where $\delta = Cov(x_1, x_2) / Var(x_1)$.
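The asymptotic bias formula can be illustrated with a simulation of the short regression. In the sketch below, $x_2$ is generated as $\rho x_1$ plus independent noise with $Var(x_1) = 1$, so that $\delta = Cov(x_1, x_2)/Var(x_1) = \rho$ by construction; all parameter values, distributions, and names are assumptions for illustration.

```python
import random


def ols_slope(xs, ys):
    """Simple-regression slope: sum((x - xbar) * y) / sum((x - xbar)^2)."""
    xbar = sum(xs) / len(xs)
    return (sum((x - xbar) * y for x, y in zip(xs, ys))
            / sum((x - xbar) ** 2 for x in xs))


def short_regression_slope(n, beta1, beta2, rho, rng):
    """Slope from regressing y on x1 only, when the true model also contains x2."""
    x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x2 = [rho * a + rng.gauss(0.0, 1.0) for a in x1]   # Cov(x1, x2) = rho, Var(x1) = 1
    y = [1.0 + beta1 * a + beta2 * b + rng.gauss(0.0, 1.0) for a, b in zip(x1, x2)]
    return ols_slope(x1, y)


rng = random.Random(0)
beta1, beta2 = 1.0, 2.0
for rho in (0.0, 0.5):
    slope = short_regression_slope(100_000, beta1, beta2, rho, rng)
    print(f"rho = {rho}: short-regression slope = {slope:.3f}, "
          f"beta1 + beta2 * delta = {beta1 + beta2 * rho:.3f}")
```

With $\rho = 0$ the omitted variable is uncorrelated with $x_1$ and the short regression remains consistent; with $\rho \neq 0$ and $\beta_2 \neq 0$ it does not, no matter how large n is.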
Asymptotic Bias (i.e., Inconsistency!)
• So, thinking about the direction of the asymptotic bias is just like
thinking about the direction of bias for an omitted variable;
• The main difference is that asymptotic bias uses the population
variance and covariance, while bias uses the sample counterparts;
• Remember, inconsistency is a large-sample problem: it doesn't
go away as you increase the sample size!
Large Sample Inference (Main Idea)
• Recall that under the CLM assumptions, the sampling
distributions (of the coefficient estimators) are normal, so we
could derive t and F distributions for testing;
• This exact normality was due to assuming the population error
distribution was normal;
• This assumption of normal errors implied that the distribution of
y, given the x's, was normal as well.
Large Sample Inference
• It is easy to come up with examples for which this exact normality
assumption will fail;
• Any clearly skewed variable, like wages, savings, etc., can't be
normal, since a normal distribution is symmetric;
• The normality assumption is not needed to conclude that OLS is BLUE,
only for inference.
Central Limit Theorem
• Based on the central limit theorem, we can show that OLS
estimators are asymptotically normal;
• Asymptotic normality implies that $P(Z_n < z) \to \Phi(z)$ as $n \to \infty$.
Example 4.2: Dummy Variable
Example for Dummy Variable
Compare it (on the last page) with the following estimated
model (p. 221):
$$wage = \underset{(.21)}{7.10} - \underset{(.30)}{2.51}\,female$$
$n = 526$, $R^2 = .116$
What can you learn from the different results of the two models?
Dummies for Multiple Categories
• We can use dummy variables to control for something with
multiple categories;
• Suppose everyone in your data is either a HS dropout, a HS
graduate only, or a college graduate;
• To compare HS and college grads to HS dropouts, include 2
dummy variables;
Multiple Categories (cont)
• Any categorical variable can be turned into a set of dummy
variables;
• Because the base group is represented by the intercept, if
there are n categories there should be n - 1 dummy
variables;
• If there are a lot of categories, it may make sense to group
some together (practically);
• Example: top 10 ranking, 11 - 25, etc.
Example 4.3: Multiple Categories
• Test (monthly) seasonality in stock returns
Each monthly dummy variable has a value of 1 when the
month occurs and a value of 0 for the other months.
$$Returns_t = \underset{(.0166)}{.0301} + \underset{(.0176)}{.0003}\,Jan_t - \underset{(.0164)}{.0111}\,Feb_t + \ldots - \underset{(.0164)}{.0059}\,Nov_t$$
$n = 288$, $R^2 = .0574$.
Interactions among Dummies
• Interacting dummy variables is like subdividing the group;
• Example: have dummies for male, as well as hsgrad and colgrad;
• Add male*hsgrad and male*colgrad, for a total of 5 dummy
variables -> 6 categories;
• Base group is female HS dropouts;
More on Dummy Interactions
• Formally, the model is $y = \beta_0 + \delta_1 male + \delta_2 hsgrad + \delta_3 colgrad + \delta_4 male \cdot hsgrad + \delta_5 male \cdot colgrad + \beta_1 x + \varepsilon$
• If male = 0 and hsgrad = 0 and colgrad = 0
$y = \beta_0 + \beta_1 x + \varepsilon$
• If male = 0 and hsgrad = 1 and colgrad = 0
$y = \beta_0 + \delta_2 hsgrad + \beta_1 x + \varepsilon$
• If male = 1 and hsgrad = 0 and colgrad = 1
$y = \beta_0 + \delta_1 male + \delta_3 colgrad + \delta_5 male \cdot colgrad + \beta_1 x + \varepsilon$
Other Interactions with Dummies
• Can also consider interacting a dummy variable, d, with a
continuous variable, x:
$y = \beta_0 + \delta_0 d + \beta_1 x + \delta_1 d \cdot x + \varepsilon$
• If d = 0, then $y = \beta_0 + \beta_1 x + \varepsilon$
• If d = 1, then $y = (\beta_0 + \delta_0) + (\beta_1 + \delta_1) x + \varepsilon$
This is interpreted as a change in the slope if $\delta_1 \neq 0$
(as well as a change in the intercept, if $\delta_0 \neq 0$)
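The two cases above can be summarized in a small helper that returns the intercept and slope implied by the interaction model for each group. The function name and the coefficient values are hypothetical and serve only to illustrate the algebra.

```python
def group_line(beta0: float, delta0: float, beta1: float, delta1: float, d: int):
    """Intercept and slope implied by y = beta0 + delta0*d + beta1*x + delta1*d*x + error."""
    if d not in (0, 1):
        raise ValueError("d must be a 0/1 dummy")
    intercept = beta0 + delta0 * d   # the intercept shifts by delta0 when d = 1
    slope = beta1 + delta1 * d       # the slope shifts by delta1 when d = 1
    return intercept, slope


# Hypothetical coefficients with delta0 > 0 and delta1 < 0.
for d in (0, 1):
    intercept, slope = group_line(beta0=1.0, delta0=0.5, beta1=2.0, delta1=-0.3, d=d)
    print(f"d = {d}: intercept = {intercept:.2f}, slope = {slope:.2f}")
```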
[Figure: example of $\delta_0 > 0$ and $\delta_1 < 0$.]

Testing for Differences across Groups
• Testing whether a regression function is different for one group versus another can be thought of as simply testing for the joint significance of the dummy and its interactions with all other x variables.
• So, you can estimate the model with all the interactions and without, and form an F statistic, but this could be unwieldy.

The Chow Test
• It turns out you can compute the proper F statistic without running the unrestricted model with interactions with all k continuous variables;
• If you run the restricted model for group one and get $SSR_1$, then for group two and get $SSR_2$;
• Run the restricted model for all to get SSR, then:
$$F = \frac{SSR - (SSR_1 + SSR_2)}{SSR_1 + SSR_2} \cdot \frac{n - 2(k+1)}{k+1}.$$

The Chow Test (continued)
• The Chow test is really just a simple F test for exclusion restrictions, but we have realized that $SSR_{ur} = SSR_1 + SSR_2$;
• Note, we have k + 1 restrictions (each of the slope coefficients and the intercept);
• Note the unrestricted model would estimate 2 different intercepts and 2 different sets of slope coefficients, so the df is n - 2k - 2.

IV. Heteroskedasticity
• Recall the assumption of homoskedasticity implied that, conditional on the explanatory variables, the variance of the unobserved error, u, was constant;
• If this is not true, that is, if the variance of u is different for different values of the x's, then the errors are heteroskedastic;
• Example: Estimating returns to education when ability is unobservable, and we think the variance in ability differs by educational attainment.

[Figure: Example of Homoskedasticity.]
[Figure: Regressions with Homoskedasticity.]
[Figure: Example of Heteroskedasticity.]
[Figure: Regressions with Heteroskedasticity.]
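Returning to the Chow test described above, the statistic can be sketched as a short function. The function name and the SSR values are hypothetical; in practice the three SSRs come from the pooled regression and the two group-specific regressions.

```python
def chow_f_stat(ssr_pooled: float, ssr1: float, ssr2: float, n: int, k: int) -> float:
    """Chow F = {[SSR - (SSR1 + SSR2)] / (SSR1 + SSR2)} * {[n - 2(k + 1)] / (k + 1)}.

    ssr_pooled : SSR from the regression on all observations
    ssr1, ssr2 : SSRs from the separate regressions for group one and group two
    n, k       : total sample size and number of slope coefficients
    """
    ssr_ur = ssr1 + ssr2   # unrestricted SSR: each group has its own coefficients
    if ssr_pooled < ssr_ur:
        raise ValueError("pooled SSR cannot be smaller than SSR1 + SSR2")
    return ((ssr_pooled - ssr_ur) / ssr_ur) * ((n - 2 * (k + 1)) / (k + 1))


# Hypothetical SSR values, for illustration only.
f_chow = chow_f_stat(ssr_pooled=150.0, ssr1=60.0, ssr2=60.0, n=106, k=2)
print(f"Chow F = {f_chow:.3f} with ({2 + 1}, {106 - 2 * (2 + 1)}) degrees of freedom")
```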