Econometric Methods (AF6208)

Dr. Steven X. Wei

Problem Set II [Due time and date: 11:00 pm on 17 April 2022]

Note: I designed this problem set for three purposes: (1) to check your understanding of the lecture notes; (2) to help you read research papers and work on your research projects or dissertation; and (3) to prepare you for the take-home exam, which will include questions similar to those in this problem set. You are strongly advised to work on the problem set independently first, and then to discuss your solutions within your group. Please stay in the same group as in Problem Set I. Each group submits ONE write-up; please type as much of your solution as possible.

Questions from your textbook:

Chapter 6: Q3

Chapter 7: Q9

Chapter 8: Q4

Chapter 15: Q1, Q4 and Q7.

Additional questions:

Q1. Understand some Important Issues

(1) Suppose the true model is given by

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$

while you estimate the following (wrong) model:

$y = \beta_0 + \beta_1 x_1 + \varepsilon$

Answer the following questions:

a. Demonstrate that the OLS estimate of $\beta_1$ is inconsistent in general. Explain in which special situations it can still be consistent.

b. What are the implications of this issue for your future research?

(2) Explain p-value when you conduct a t-test and F-test.

[Note that a p-value is closely related to your null hypothesis! If you change your

null hypothesis, then the p-value will be different too! In addition, for any

hypothesis testing, we can calculate its p-value! Think of why empirical analysts

prefer to report p-values in their reports or papers.]

(3) In your own research, if your regression suffers from (conditional) heteroskedasticity, what concerns would you have about the estimated parameters and/or your inferences? Explain your answer in detail.

Q2. A Case Study (Imagine you are working on a project/thesis)

You are asked to conduct an economic analysis of the HK labor market. The main question is whether females are discriminated against (in terms of salary) in the labor market. A student randomly collected 1000 observations, 500 males and 500 females. She simply calculated the average monthly salary for females, $25000, and for males, $30000, and concluded that females are discriminated against in the labor market (in terms of salary). [Hint: Use a dummy variable in your research design!]

(1) Comment on the possible problem(s) with the inference made by the student. [Note that this is quite a general problem in the real world!]

(2) If you were the student, how would you design your test to make a better inference?

(3) Would you include control variables in your analysis? Why?

(4) Are there any econometric (or economic) issues that concern you in your analysis? Explain the problem(s), if any, and propose your solutions.

[This question gives you a rough but realistic picture of how research in economics is conducted, and of how econometric inference plays an important role in the process! Hopefully, this helps you structure your independent studies and thesis.]

Q3. Understand Endogeneity:

(1) What is an endogeneity issue? Why should we care about it?

(2) Provide a few (three or more) economic situations in which an endogeneity issue

arises.

(3) What is an instrumental variable for an endogenous variable?

(4) What is the two-stage least squares estimate of a parameter?

(5) Explain intuitively why using instrumental variables can (at least partially) solve the endogeneity problem.

(6) What are the under-identified, exactly identified and over-identified cases?

(7) Discuss and comment on the case of weak instrumental variables [read the textbook and explain it intuitively].

(8) Search the internet or other sources, and then discuss the key ideas of other ways (i.e., non-IV methods) of dealing with the endogeneity issue.

Q4. Suppose that we have two estimators,

$\hat{\theta}_n$ and $\hat{\gamma}_n$,

of parameters $\theta$ and $\gamma$, respectively, where n represents the sample size. Answer the following questions.

If both of the estimators are consistent,

(1) would $\hat{\theta}_n + \hat{\gamma}_n$ be a consistent estimator of $\theta + \gamma$?

(2) would $\hat{\theta}_n - \hat{\gamma}_n$ be a consistent estimator of $\theta - \gamma$?

(3) would $2\hat{\theta}_n$ be a consistent estimator of $2\theta$?

(4) would $\hat{\theta}_n \cdot \hat{\gamma}_n$ be a consistent estimator of $\theta \cdot \gamma$?

(5) would $\hat{\theta}_n / \hat{\gamma}_n$ be a consistent estimator of $\theta / \gamma$ (of course, we assume that $\gamma \neq 0$)?

Can you summarize the law or rule here for your future reference?

This is the end of the questions.

AF6208: Econometrics Methods

Dr. Steven X. Wei

Lecture Note 03

Linear Regression Model:

Inference

Read the textbook: Wooldridge's book, Ch. 4.

You may also read Greene's book, Ch. 5 and Ch. 6, if you wish to learn this part more academically.

This section considers conducting statistical inference with a multiple linear regression model.

[Diagram: from a research question to an econometric model]

Economics/any other field: ask a good question, i.e., form a hypothesis (a constraint on the parameters of interest).

Econometric model: data (observable) and parameters (non-observable).

Parameters: parameters of interest and nuisance parameters.

Example 3.1. (Used as the Motivation)

Suppose you ask a sensitive question: Are females discriminated against in the labour market (in terms of salary)?

We frame it as an econometric model:

$wage = \beta_0 + \beta_1 female + \beta_2 educ + \varepsilon$

Data = {wage, female, educ} (observable)

Parameters $\theta = \{\beta_0, \beta_1, \beta_2, \sigma^2\}$ (non-observable)

The parameter of interest is $\beta_1$.

Your hypothesis is actually a constraint on the parameter of interest: $\beta_1 < 0$.
How can we properly assess whether a parameter $\beta_1 < 0$, given that it is unobservable?
The main idea is roughly as follows:
1. Given the hypothesis (what you would like to assess), you need to find a vehicle (i.e., a test statistic) to assess whether the hypothesis is rejected.
2. The assessment is a probability assessment, rather than a 100% yes or no. Never say 100% yes or no in statistics! To do so, we use the term "level of significance" to frame the statement in statistical language. [I assume that you have learnt the basic terms before; otherwise, you should read the part of Lecture Note 01 on hypothesis testing!]
3. For the vehicle, i.e., the test statistic, we need to know its exact distribution (figuring out the exact distribution is usually the statisticians' job!). For applied work, we only need to know the distribution and apply it!
Next, we come to the details.
Review (from the last lecture): Under the CLR assumptions, the OLS estimates (or, more exactly, estimators) are BLUE!
In order to conduct classical hypothesis testing, we need to add another assumption:
$\varepsilon$ [the disturbance] is independent of $x_1, x_2, \ldots, x_K$ and normally distributed with mean 0 and variance $\sigma^2$, i.e.,
$\varepsilon \sim N(0, \sigma^2)$.
What is a normal distribution? The standard normal distribution's probability density function is graphed below:
[Figure: bell-shaped standard normal density, centered at 0.]
Under the above assumption, we can derive the following result:
$t = \frac{\hat{\beta}_j - \beta_j}{se(\hat{\beta}_j)} \sim t_{n-(K+1)}$.
Note: 1. This is a Student t-distribution (derived in the appendix to this note)! The t-statistic reported everywhere (including in your future thesis) comes from the ratio
$t = \hat{\beta}_j / se(\hat{\beta}_j)$.
Where can you find the values of $\hat{\beta}_j$ and $se(\hat{\beta}_j)$? They are in your computer output from running the regression model.
2. The degrees of freedom are $n - (K+1)$.
How to conduct a t-test
A simple case first:
1. Start with a null hypothesis (say, for some $0 \le j \le K$):
$H_0: \beta_j = 0$ vs $H_1: \beta_j \neq 0$
2. If we reject the null hypothesis, it means that $x_j$ significantly affects y, after controlling for the other x's. (In most cases, this is what we wish to see!)
3. We use the t-test:
$t = \frac{\hat{\beta}_j - 0}{se(\hat{\beta}_j)} = \frac{\hat{\beta}_j}{se(\hat{\beta}_j)}$.
4. We need to choose a significance level, $\alpha$; the conventional values of $\alpha$ are 1%, 5% and 10%. (What is the meaning of $\alpha$?)
5. Find the critical value C from the t-table (see my class distribution). Note that the rejection region is |t| > C.
6. Now we calculate the t-value from $t = \frac{\hat{\beta}_j - 0}{se(\hat{\beta}_j)} = \frac{\hat{\beta}_j}{se(\hat{\beta}_j)}$.
7. If the t-value (often called the t-statistic) is in the rejection region, i.e., |t| > C, then we reject the null hypothesis $H_0$. Otherwise, we fail to reject $H_0$.

[Figure: t distribution with $n - K - 1$ degrees of freedom, two-sided test. Rejection regions of area $\alpha/2$ lie in each tail, beyond the critical values -C and C; between -C and C we fail to reject $H_0$.]

One-sided test:

The above test is a two-sided test, since the rejection region is in both sides (or tails) of the t-distribution.

In fact, we could have a one-sided test,

$H_0: \beta_j \le 0$ vs $H_1: \beta_j > 0$

Everything is the same as for a two-sided test, except that the rejection region is now t > C; the t-test statistic is computed in the same way as before:

$t = \frac{\hat{\beta}_j}{se(\hat{\beta}_j)}$.

[Figure: t distribution with $n - K - 1$ degrees of freedom, one-sided test. The rejection region of area $\alpha$ lies in the right tail, beyond the critical value C; to the left of C we fail to reject $H_0$.]

Figure out the case with the following one-sided test:

$H_0: \beta_j \ge 0$ vs $H_1: \beta_j < 0$
Think of Example 3.1.
Example 3.2
Consider the estimated equation (using OLS):
log(wage) = 0.284 + 0.092 educ + 0.0041 exper + 0.022 tenure
            (0.104)  (0.007)      (0.0017)       (0.003)
$n = 526$, $R^2 = .316$
Questions:
1. What do the reported statistics in the above equation mean?
2. Is the return (i.e., wage) on experience zero? Conduct a formal test of this issue at the 5% level of significance.
Solution:
We form our hypothesis as: $H_0: \beta_{exper} = 0$ vs. $H_1: \beta_{exper} > 0$.

[Note: In applications, indexing a parameter by its associated variable name is a nice way to label parameters!]

We use the t-test, whose degrees of freedom are __________________.

Given the level of significance $\alpha$ = 5%, the critical value C is _____.

Now we calculate the t-statistic, which is t = _________________.

Since _________, we ____________ the null hypothesis. What does this mean in terms of economics? It means ___________________

______________________________________________________.
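
As a hedged check on the steps above (a sketch, not the instructor's official solution), the numbers reported in Example 3.2 can be plugged into the t-test procedure. The critical value 1.645 is an assumption: it is the one-sided 5% value from the standard normal table, used as an approximation because the degrees of freedom are large.

```latex
df = n - (K+1) = 526 - 4 = 522, \qquad
t = \frac{\hat{\beta}_{exper}}{se(\hat{\beta}_{exper})}
  = \frac{0.0041}{0.0017} \approx 2.41 > 1.645 \approx C
```

Under these assumptions the t-value falls in the rejection region t > C, so the null hypothesis of a zero return to experience would be rejected at the 5% level.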

Summary for $H_0: \beta_j = 0$ (as this is the most commonly used case)

Unless otherwise stated, the alternative hypothesis is always assumed to be two-sided;

If we reject the null, we typically say that $x_j$ is statistically significant at the $\alpha$ (= 1%, 5%, 10%) level;

If we fail to reject the null, we typically say that $x_j$ is statistically insignificant at the $\alpha$ (= 1%, 5%, 10%) level.

Example 3.3 (A real case in a published paper)

A JAE paper (2014), "Accounting earnings and gross domestic product", by Konchitchki and Patatoukas.

The abstract of the paper:

We document that aggregate accounting earnings growth is an

incrementally significant leading indicator of growth in nominal

Gross Domestic Product (GDP). Professional macro forecasters,

however, do not fully incorporate the predictive content

embedded in publicly available accounting earnings data. As a

result, future nominal GDP growth forecast errors are predictable

based on accounting earnings data that are available to

professional macro forecasters in real time.

Testing other hypotheses

A slightly more general case of hypothesis testing is:

$H_0: \beta_j = a$ vs $H_1: \beta_j \neq a$

where a is a known constant. In this case, everything is the same as before, except that the t-statistic is now:

$t = \frac{\hat{\beta}_j - a}{se(\hat{\beta}_j)}$.

Confidence Intervals:

What is the confidence interval for a parameter $\beta_j$?

$\hat{\beta}_j \pm C \cdot se(\hat{\beta}_j)$,

where C is the $(1 - \alpha/2)$ percentile of a $t_{n-(K+1)}$ distribution.

Remark: The confidence interval can be viewed in two different ways. First, it is an interval estimate (of $\beta_j$). Second, it is an alternative to the t-test for the hypothesis $H_0: \beta_j = a$: against a two-sided alternative, we reject $H_0$ at level $\alpha$ if a lies outside the interval.

Example 3.4

Model of R&D Expenditures

log(rd) = -4.38 + 1.084 log(sales) + .217 profmarg
          (.47)   (.060)             (.0218)

$n = 32$, $R^2 = .918$

Questions:

1. What is the interpretation of the estimate, 1.084?

2. Construct a 95% confidence interval for the sales elasticity.

Solution:
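
For question 2, one possible worked calculation (a sketch under stated assumptions, not the instructor's official solution) applies the confidence-interval formula to the coefficient on log(sales). The critical value 2.045 is taken from a standard t-table for 29 degrees of freedom at the 97.5th percentile; it does not appear in the lecture itself.

```latex
df = n - (K+1) = 32 - 3 = 29, \qquad C = t_{29,\,0.975} \approx 2.045
\hat{\beta}_{sales} \pm C \cdot se(\hat{\beta}_{sales})
  = 1.084 \pm 2.045 \times 0.060
  \approx [0.961,\; 1.207]
```

Because the interval contains 1, a unit sales elasticity would not be rejected at the 5% level against a two-sided alternative.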

p-value (for the t-test, at this stage)

An alternative to the classical approach is to ask: what is the smallest significance level at which the null would be rejected?

So, compute the t statistic, and then look up what percentile it is in the appropriate t distribution; the corresponding tail probability is the p-value!

The p-value is the probability that we would observe a t statistic at least as extreme as the one obtained, if the null hypothesis were true.

For example: $H_0: \beta_j = 0$ vs $H_1: \beta_j \neq 0$

We first calculate the t-value, $t = \frac{\hat{\beta}_j}{se(\hat{\beta}_j)}$; let's call it $t_0$.

Then you can think of it as located in the following diagram (to help you understand):

[Figure: t distribution with $n - K - 1$ degrees of freedom; the tail probability is the area to the right of $t_0$.]

Let's go back and insert this into our original idea:

[Figure: two-sided test on the t distribution. The rejection regions of area $\alpha/2$ lie in each tail, beyond the critical value C; the observed values $-t_0$ and $t_0$ each cut off a tail area of p/2.]

Note:

Most computer packages will compute the p-value for you, assuming a two-sided test;

If you really want a one-sided alternative, just divide the two-sided p-value by 2, provided the estimate has the sign specified by the alternative [originally, for the two-sided test, P(|t| > t0) = p; now, for the one-sided test, P(t > t0) = (1/2) P(|t| > t0) = p/2];

Stata/SAS, for example, provides the t statistic, p-value, and 95% confidence interval for $H_0: \beta_j = 0$, in columns labeled "t", "P > |t|" and "[95% Conf. Interval]", respectively.

This means that, in practice, you need to understand the p-value rather than know exactly how to compute it!
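
For readers who do want to see the computation once, the following is a minimal sketch using only the Python standard library. It integrates the Student t density numerically (Simpson's rule) instead of calling a statistics package; the function names `t_pdf`, `upper_tail` and `p_values` are hypothetical helpers introduced for illustration, and the inputs are the exper coefficient and standard error from Example 3.2.

```python
import math

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t0, df, steps=20000):
    """P(T > |t0|): Simpson's rule on [0, |t0|] plus symmetry of the density."""
    t0 = abs(t0)
    if t0 == 0:
        return 0.5
    h = t0 / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t0, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def p_values(beta_hat, se, n, k):
    """Return (t statistic, two-sided p-value, one-sided p-value) for H0: beta_j = 0.

    The one-sided value assumes the estimate has the sign stated in H1.
    """
    t0 = beta_hat / se
    one_sided = upper_tail(t0, n - (k + 1))
    return t0, 2 * one_sided, one_sided

# Example 3.2: coefficient on exper, n = 526 observations, K = 3 regressors.
t0, p_two, p_one = p_values(0.0041, 0.0017, 526, 3)
print(f"t0 = {t0:.3f}, two-sided p = {p_two:.4f}, one-sided p = {p_one:.4f}")
```

The one-sided p-value is exactly half of the two-sided one, which is the rule stated in the note above.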

Importantly, you need to know:

p-value < $\alpha$ (level of significance) means that we reject the null hypothesis at the $\alpha$ level of significance. (Why?)
Of course, p-value > $\alpha$ means that we cannot reject the null hypothesis at the $\alpha$ level of significance.

This is what you need to understand clearly when you read research papers and write your dissertation!

Testing a linear combination:

Suppose that, instead of testing whether $\beta_1$ is equal to a constant, you want to test whether it is equal to another parameter, that is, $H_0: \beta_1 = \beta_2$;

[Note that in practice, we often write the hypothesis as $H_0: \beta_1 + (-\beta_2) = 0$, i.e., a linear combination of the two parameters!]

Use the same basic procedure for forming a t statistic:

$t = \frac{\hat{\beta}_1 - \hat{\beta}_2}{se(\hat{\beta}_1 - \hat{\beta}_2)}$.

The key issue here is to derive $se(\hat{\beta}_1 - \hat{\beta}_2)$.

Since $se(\hat{\beta}_1 - \hat{\beta}_2) = \sqrt{Var(\hat{\beta}_1 - \hat{\beta}_2)}$, we then consider

$Var(\hat{\beta}_1 - \hat{\beta}_2) = Var(\hat{\beta}_1) + Var(\hat{\beta}_2) - 2\,Cov(\hat{\beta}_1, \hat{\beta}_2)$,

so that

$se(\hat{\beta}_1 - \hat{\beta}_2) = \sqrt{[se(\hat{\beta}_1)]^2 + [se(\hat{\beta}_2)]^2 - 2 s_{12}}$,

where $s_{12}$ is an estimate of $Cov(\hat{\beta}_1, \hat{\beta}_2)$.

Note: In practice, you can obtain the estimate of $Cov(\hat{\beta}_1, \hat{\beta}_2)$ from any econometric software used to run the regression.
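
The formula above can be sketched in a few lines of Python. The function name `t_stat_difference` and all numerical inputs are hypothetical (they are not from the lecture); in practice the two standard errors and the covariance would be read from the regression output.

```python
import math

def t_stat_difference(b1, b2, se1, se2, s12):
    """t statistic and standard error for H0: beta_1 = beta_2.

    s12 is the estimated covariance between the two coefficient estimates.
    """
    var_diff = se1 ** 2 + se2 ** 2 - 2 * s12  # estimated Var(b1 - b2)
    se_diff = math.sqrt(var_diff)
    return (b1 - b2) / se_diff, se_diff

# Hypothetical regression output, for illustration only.
t, se_diff = t_stat_difference(b1=0.50, b2=0.30, se1=0.10, se2=0.08, s12=0.002)
print(f"se(b1 - b2) = {se_diff:.4f}, t = {t:.3f}")
```

Note that ignoring the covariance term (setting `s12` to zero) would generally give the wrong standard error, which is why the covariance has to be requested from the software.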


Re-parameterization (An alternative approach!):

Set $\theta_1 = \beta_1 - \beta_2$. Solving for $\beta_1$, we have $\beta_1 = \theta_1 + \beta_2$.

Substituting this into the original regression model:

$y = \beta_0 + (\theta_1 + \beta_2) x_1 + \beta_2 x_2 + \ldots + \beta_K x_K + \varepsilon$.

Rearranging yields

$y = \beta_0 + \theta_1 x_1 + \beta_2 (x_1 + x_2) + \beta_3 x_3 + \ldots + \beta_K x_K + \varepsilon$.

Now rerun the regression.

Testing $H_0: \beta_1 = \beta_2$ is equivalent to testing $H_0: \theta_1 = 0$.

That is, re-parameterize first, and then do the test in an easy way!

Multiple linear restrictions

Everything we've done so far has involved testing a single linear restriction (e.g., $\beta_1 = 0$ or $\beta_1 = \beta_2$);

However, we may want to jointly test multiple hypotheses about our parameters;

A typical example is testing exclusion restrictions: we want to know whether a group of parameters are all equal to zero;

Now the null hypothesis might be something like $H_0: \beta_{k-q+1} = 0, \ldots, \beta_k = 0$;

The alternative is just $H_1$: $H_0$ is not true;

We cannot just check each t statistic separately, because we want to know whether the q parameters are jointly significant at a given significance level; it is possible for none of them to be individually significant at that level.

Exclusion Restrictions

To do the test, we need to estimate the restricted model, without $x_{k-q+1}, \ldots, x_k$ included, as well as the unrestricted model, with all x's included;

Intuitively, we want to know whether the change in SSR is big enough to warrant inclusion of $x_{k-q+1}, \ldots, x_k$.

$F \equiv \frac{(SSR_r - SSR_{ur}) / q}{SSR_{ur} / (n - k - 1)}$,

where r is restricted and ur is unrestricted.

The F statistic

The F statistic is never negative, since the SSR from the restricted model can't be less than the SSR from the unrestricted model;

Essentially, the F statistic measures the relative increase in SSR when moving from the unrestricted to the restricted model;

q = number of restrictions, or $df_r - df_{ur}$; $n - k - 1 = df_{ur}$.

To decide whether the increase in SSR when we move to a restricted model is big enough to reject the exclusions, we need to know the sampling distribution of our F statistic.

Not surprisingly, under $H_0$, $F \sim F_{q, n-k-1}$, where q is referred to as the numerator degrees of freedom and $n - k - 1$ as the denominator degrees of freedom.

The $R^2$ form of the F statistic

Because the SSRs may be large and unwieldy, an alternative form of the formula is useful;

We use the fact that $SSR = SST(1 - R^2)$ for any regression, so we can substitute for $SSR_r$ and $SSR_{ur}$:

$F = \frac{(R^2_{ur} - R^2_r) / q}{(1 - R^2_{ur}) / (n - k - 1)}$.
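
To see that the SSR form and the $R^2$ form give the same number, here is a small Python sketch. The function names and every input value (SSRs, SST, n, k, q) are hypothetical and chosen only for illustration; the equivalence relies on both models having the same dependent variable, and hence the same SST.

```python
def f_from_ssr(ssr_r, ssr_ur, n, k, q):
    """F statistic from restricted and unrestricted sums of squared residuals."""
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))

def f_from_r2(r2_r, r2_ur, n, k, q):
    """The same F statistic written in terms of the two R-squared values."""
    return ((r2_ur - r2_r) / q) / ((1 - r2_ur) / (n - k - 1))

# Hypothetical inputs: k regressors in the unrestricted model, q restrictions.
sst = 500.0
ssr_r, ssr_ur = 198.311, 183.186
n, k, q = 353, 5, 3

f_ssr = f_from_ssr(ssr_r, ssr_ur, n, k, q)
# SSR = SST * (1 - R^2), so R^2 = 1 - SSR / SST for each model.
f_r2 = f_from_r2(1 - ssr_r / sst, 1 - ssr_ur / sst, n, k, q)
print(f"F (SSR form) = {f_ssr:.3f}, F (R^2 form) = {f_r2:.3f}")
```

The resulting F value would then be compared with the critical value of the $F_{q, n-k-1}$ distribution.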

A special case of exclusion restrictions is to test $H_0: \beta_1 = \beta_2 = \ldots = \beta_k = 0$ (most software reports the F-value for this null hypothesis!);

Since the $R^2$ from a model with only an intercept will be zero, the F statistic is simply:

$F = \frac{R^2 / k}{(1 - R^2) / (n - k - 1)}$.

General Linear Restrictions

The basic form of the F statistic will work for any set of linear restrictions;

First estimate the unrestricted model and then estimate the restricted model;

In each case, make note of the SSR;

Imposing the restrictions can be tricky; you will likely have to redefine variables again.

F Statistic Summary

Just as with t statistics, p-values can be calculated by looking up the percentile in the appropriate F distribution;

Stata (package), for example, will do this by entering: display fprob(q, n - k - 1, F), where the appropriate values of F, q, and n - k - 1 are used;

If only one exclusion is being tested, then $F = t^2$, and the p-values will be the same.

Appendix 1

Note that $\varepsilon$ following a normal distribution $N(0, \sigma^2)$ implies that the expected value of $\varepsilon$ is 0, i.e., $E(\varepsilon) = 0$, and that the variance of $\varepsilon$ is $\sigma^2$, i.e., $Var(\varepsilon) = \sigma^2$.

Under the assumption, we can write

$y \mid x_1, x_2, \ldots, x_K \sim N(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_K x_K, \sigma^2)$

Remark: You may ask why we make the normality assumption about $\varepsilon$. Good question! Keep it in mind and we will come back to the issue again soon.

Normal Sampling Distribution

Under the CLR assumptions and the above normality assumption about $\varepsilon$, we have the important result for the estimator of $\beta_j$:

$\hat{\beta}_j \sim N(\beta_j, Var(\hat{\beta}_j))$

Standardized,

$\frac{\hat{\beta}_j - \beta_j}{sd(\hat{\beta}_j)} \sim N(0, 1)$.

Understanding:

$\hat{\beta}_j$ is distributed normally because it is a linear combination of the errors.

$sd(\hat{\beta}_j) = \sqrt{Var(\hat{\beta}_j)} = \frac{\sigma}{\sqrt{SST_j (1 - R_j^2)}}$.

If $\sigma^2$ is known, then $sd(\hat{\beta}_j)$ is known. Then we can calculate

$Z = \frac{\hat{\beta}_j - \beta_j}{sd(\hat{\beta}_j)}$,

which we can use to do a normal test.

However, $\sigma^2$ is usually unknown! We have to replace it by its estimate $\hat{\sigma}^2$ instead:

$\widehat{Var}(\hat{\beta}_j) = \frac{\hat{\sigma}^2}{SST_j (1 - R_j^2)}$,

where it is emphasized that the above is the estimate of $Var(\hat{\beta}_j)$, rather than $Var(\hat{\beta}_j)$ itself! Usually, let us repeat, we use its square root,

$se(\hat{\beta}_j) = \sqrt{\widehat{Var}(\hat{\beta}_j)} = \frac{\hat{\sigma}}{\sqrt{SST_j (1 - R_j^2)}}$.

We obtain:

$t = \frac{\hat{\beta}_j - \beta_j}{se(\hat{\beta}_j)} \sim t_{n-(K+1)}$.

Note: 1. This is a (Student) t-distribution (vs. a normal distribution) because we have to estimate $\sigma^2$ by $\hat{\sigma}^2$. You should always remember and be familiar with this t-test statistic! You will need it everywhere in your future studies and in reading published papers that use regression analysis.

2. The degrees of freedom are $n - (K+1)$.

Lecture Note 04

Linear Regression Model:

Asymptotics, Further Issues,

and Heteroscedasticity

I. Asymptotic Properties [A bit technical!]

Recall: How do we judge the performance of an estimator in a finite sample?

Two criteria (in finite samples):

(1) Unbiasedness

(2) Efficiency

In this section, we upgrade the two criteria into two new ones for large samples!

Two criteria (in large samples):

(1) Consistency

(2) Asymptotic efficiency

We start with the first criterion:

Under the Gauss-Markov assumptions, OLS estimators are BLUE, but in other cases it won't always be possible to find unbiased estimators (we will come to some examples in the next topic!)

In those cases, we settle for estimators that are consistent, meaning that as $n \to \infty$ (i.e., as the sample size grows large), the distribution of the estimator collapses to the (true) parameter value.

Look at the intuition below!

[Figure: graphical understanding of consistency.]

Convergence in Probability

$\operatorname{plim} \hat{\theta}_n = \theta$.

Description (see the mathematical details in Appendix A):

Random variables $\hat{\theta}_n$ (think of $\hat{\theta}_n$ as an estimator of $\theta$ with sample size n) converge in probability to a constant $\theta$ if the distributions of $\hat{\theta}_n$ become more and more concentrated on $\theta$ as the sample size n gets larger and larger, and eventually collapse to the constant $\theta$.

For convenience and by convention, we write it mathematically as

$\underset{n \to \infty}{\operatorname{plim}}\, \hat{\theta}_n = \theta$.

Definition: An estimator $\hat{\theta}_n$ of $\theta$ is consistent if

$\operatorname{plim} \hat{\theta}_n = \theta$.

Example 4.1

The sample mean, $\hat{\mu}_n (\equiv \bar{X})$, is a consistent estimator of the population mean $\mu$. (Why?)

This is also known as the Law of Large Numbers!

In fact, a general result is that sample moments approach population moments under mild conditions. This is important!
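
A quick simulation can make Example 4.1 concrete. The sketch below is only an illustration: the normal population with mean 1 and standard deviation 2, the seed, and the function name `sample_mean` are all assumptions introduced here, not part of the lecture.

```python
import random

def sample_mean(n, mu=1.0, sigma=2.0, seed=0):
    """Mean of n independent draws from a normal population with mean mu."""
    rng = random.Random(seed)
    return sum(rng.gauss(mu, sigma) for _ in range(n)) / n

# As n grows, the sample mean should settle closer and closer to mu = 1.0.
for n in (10, 1000, 100000):
    print(n, round(sample_mean(n), 4))
```

Running it with larger n shows the estimates concentrating around the population mean, which is what consistency describes.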

If $X_i$ and $Z_i$, $i = 1, 2, \ldots, n$, are the samples of X and Z, respectively, then we have:

Sample variance is a consistent estimator of the population variance:

$\underset{n \to \infty}{\operatorname{plim}}\, \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2 = Var(X)$.

Usually, we write $s_x^2 \xrightarrow{p} \sigma_x^2$.

Sample covariance $\xrightarrow{p}$ population covariance:

$\underset{n \to \infty}{\operatorname{plim}}\, \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})(Z_i - \bar{Z}) = Cov(X, Z)$.

Usually, we write $s_{xz} \xrightarrow{p} \sigma_{xz}$.

Sample moments $\xrightarrow{p}$ population moments, under mild conditions:

$\underset{n \to \infty}{\operatorname{plim}}\, \frac{1}{n} \sum_{i=1}^{n} X_i^r = E(X^r)$.

One important property of consistency (or convergence in probability) is that it is preserved under continuous functions!

What does it mean? Suppose $h(\cdot)$ is a continuous function (no discontinuity at any point!). If $\operatorname{plim} \hat{\theta}_n = \theta$, then $\operatorname{plim} h(\hat{\theta}_n) = h(\theta)$.

Similarly, if $\operatorname{plim} \hat{\theta}_n = \theta$ and $\operatorname{plim} \hat{\gamma}_n = \gamma$, then $\operatorname{plim} h(\hat{\theta}_n + \hat{\gamma}_n) = h(\theta + \gamma)$.

This is a very useful and general result when you work on some fancy problems in your dissertation or in your readings.
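
The continuity property can be illustrated with a ratio of two sample means, since division is continuous wherever the denominator is nonzero. This is a sketch under assumed values: the two normal populations with means 2 and 4, the seed, and the function name `ratio_of_means` are hypothetical.

```python
import random

def ratio_of_means(n, theta=2.0, gamma=4.0, seed=1):
    """Ratio of two sample means; each mean is consistent for its population mean."""
    rng = random.Random(seed)
    xs = [rng.gauss(theta, 1.0) for _ in range(n)]
    zs = [rng.gauss(gamma, 1.0) for _ in range(n)]
    theta_hat = sum(xs) / n
    gamma_hat = sum(zs) / n
    return theta_hat / gamma_hat  # continuous in both arguments because gamma != 0

# The ratio should approach theta / gamma = 0.5 as n grows.
for n in (10, 1000, 100000):
    print(n, round(ratio_of_means(n), 4))
```

The same reasoning covers sums, differences, constant multiples and products of consistent estimators.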

Consistency of OLS

Under the Gauss-Markov assumptions, the OLS estimator is consistent (and of course it is unbiased too, as we have shown before);

Consistency can be proved for the simple regression case in a manner similar to the proof of unbiasedness;

We will need to take the probability limit (plim) to establish consistency.

Note: You may not fully understand the mathematics, but you need to understand the idea and the major results of consistency!

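Proving the consistency of $\hat{\beta}_1$ in the simple linear regression model $y = \beta_0 + \beta_1 x + \varepsilon$:

$\hat{\beta}_1 = \frac{\sum (x_i - \bar{x}) y_i}{\sum (x_i - \bar{x})^2}$   (Remember this!)

$= \frac{\sum (x_i - \bar{x})(\beta_0 + \beta_1 x_i + \varepsilon_i)}{\sum (x_i - \bar{x})^2}$   (Substitute $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$)

$= \frac{\sum (x_i - \bar{x}) \beta_0}{\sum (x_i - \bar{x})^2} + \frac{\sum (x_i - \bar{x}) \beta_1 x_i}{\sum (x_i - \bar{x})^2} + \frac{\sum (x_i - \bar{x}) \varepsilon_i}{\sum (x_i - \bar{x})^2}$

$= 0 + \beta_1 + \frac{\sum (x_i - \bar{x}) \varepsilon_i}{\sum (x_i - \bar{x})^2}$.

Dividing the numerator and the denominator by n and taking the probability limit,

$\hat{\beta}_1 = \beta_1 + \frac{\sum (x_i - \bar{x}) \varepsilon_i}{\sum (x_i - \bar{x})^2} = \beta_1 + \frac{\frac{1}{n} \sum (x_i - \bar{x}) \varepsilon_i}{\frac{1}{n} \sum (x_i - \bar{x})^2} \xrightarrow{p} \beta_1 + \frac{Cov(x, \varepsilon)}{Var(x)} = \beta_1$,

where the last equality uses $Cov(x, \varepsilon) = 0$.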

Deriving the inconsistency (similar to bias)

Just as we could derive the omitted variable bias earlier, now we want to think about the inconsistency, or asymptotic bias, in this case.

True model: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + v$

You think: $y = \beta_0 + \beta_1 x_1 + u$, so that

$u = \beta_2 x_2 + v$ and $\operatorname{plim} \hat{\beta}_1 = \beta_1 + \beta_2 \delta$,

where $\delta = Cov(x_1, x_2) / Var(x_1)$.
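
The asymptotic bias formula can be checked with a small simulation. The sketch below is illustrative only: the data-generating process ($\beta_1 = 1$, $\beta_2 = 2$, and $x_2 = 0.5 x_1 + \text{noise}$, so that $\delta = 0.5$), the seed, and the function name `short_regression_slope` are assumptions made here, not values from the lecture.

```python
import random

def short_regression_slope(n, beta1=1.0, beta2=2.0, rho=0.5, seed=2):
    """OLS slope from regressing y on x1 only, when x2 is wrongly omitted."""
    rng = random.Random(seed)
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    # x2 is correlated with x1: Cov(x1, x2) / Var(x1) = rho.
    x2 = [rho * a + rng.gauss(0, 1) for a in x1]
    y = [beta1 * a + beta2 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]
    xbar = sum(x1) / n
    num = sum((a - xbar) * c for a, c in zip(x1, y))
    den = sum((a - xbar) ** 2 for a in x1)
    return num / den

# The slope should approach beta1 + beta2 * delta = 1 + 2 * 0.5 = 2, not beta1 = 1.
for n in (100, 5000, 50000):
    print(n, round(short_regression_slope(n), 3))
```

Setting `rho=0.0` (so that $x_1$ and $x_2$ are uncorrelated) or `beta2=0.0` removes the asymptotic bias, which matches the special cases in which the short regression stays consistent.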

Asymptotic Bias (i.e., Inconsistency!)

So, thinking about the direction of the asymptotic bias is just like thinking about the direction of the bias for an omitted variable;

The main difference is that asymptotic bias uses the population variance and covariance, while bias uses the sample counterparts;

Remember, inconsistency is a large-sample problem: it doesn't go away as you increase the sample size!

Large Sample Inference (Main Idea)

Recall that under the CLM assumptions, the sampling distributions (of the coefficient estimators) are normal, so we could derive t and F distributions for testing;

This exact normality was due to assuming that the population error distribution was normal;

This assumption of normal errors implied that the distribution of y, given the x's, was normal as well.

Large Sample Inference

It is easy to come up with examples for which this exact normality assumption will fail;

Any clearly skewed variable, like wages, savings, etc., can't be normal, since a normal distribution is symmetric;

The normality assumption is not needed to conclude that OLS is BLUE, only for inference.

Central Limit Theorem

Based on the central limit theorem, we can show that OLS

estimators are asymptotically normal;

Asymptotic normality implies that P(Zn 0

Example 4.2: Dummy Variable

Example for Dummy Variable

Compare it (on the last page) with the following estimated model (p. 221):

wage = 7.10 - 2.51 female
       (.21)  (.30)

$n = 526$, $R^2 = .116$

What can you learn from the different results of the two models?

Dummies for Multiple Categories

We can use dummy variables to control for something with

multiple categories;

Suppose everyone in your data is either a HS dropout, HS

grad only, or college grad;

To compare HS and college grads to HS dropouts, include 2

dummy variables;

hsgrad = 1 if HS grad only, 0 otherwise; and colgrad = 1 if

college grad, 0 otherwise.
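
The coding of the two dummies can be sketched as follows. The function name `education_dummies` and the category labels ("dropout", "hsgrad", "colgrad") are hypothetical choices for illustration; HS dropouts are the base group and therefore get no dummy of their own.

```python
def education_dummies(levels):
    """Map each education level to (hsgrad, colgrad); 'dropout' is the base group."""
    rows = []
    for level in levels:
        hsgrad = 1 if level == "hsgrad" else 0
        colgrad = 1 if level == "colgrad" else 0
        rows.append((hsgrad, colgrad))
    return rows

# Hypothetical sample of four people.
sample = ["dropout", "hsgrad", "colgrad", "hsgrad"]
print(education_dummies(sample))
```

Adding a third dummy for dropouts alongside an intercept would make the regressors perfectly collinear, which is why only two dummies are used for three categories.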

Multiple Categories (cont.)

Any categorical variable can be turned into a set of dummy variables;

Because the base group is represented by the intercept, if there are n categories there should be n - 1 dummy variables;

If there are a lot of categories, it may make sense to group some together (practically);

Example: top 10 ranking, 11-25, etc.

Example 4.3: Multiple Categories

Test (monthly) seasonality in stock returns

Each monthly dummy variable has a value of 1 when the month occurs and a value of 0 for the other months.

$Returns_t = .0301 + .0003\,Jan_t - .0111\,Feb_t + \ldots - .0059\,Nov_t$

(.0166) (.0176) (.0164) (.0164)

$n = 288$, $R^2 = .0574$.

Interactions among Dummies

Interacting dummy variables is like subdividing the group;

Example: have dummies for male, as well as hsgrad and colgrad;

Add male*hsgrad and male*colgrad, for a total of 5 dummy variables and hence 6 categories;

Base group is female HS dropouts;

hsgrad is for female HS grads, colgrad is for female college grads;

The interactions reflect male HS grads and male college grads.

More on Dummy Interactions

Formally, the model is $y = \beta_0 + \delta_1 male + \delta_2 hsgrad + \delta_3 colgrad + \delta_4 male \cdot hsgrad + \delta_5 male \cdot colgrad + \beta_1 x + \varepsilon$.

If male = 0 and hsgrad = 0 and colgrad = 0:

$y = \beta_0 + \beta_1 x + \varepsilon$

If male = 0 and hsgrad = 1 and colgrad = 0:

$y = \beta_0 + \delta_2 hsgrad + \beta_1 x + \varepsilon$

If male = 1 and hsgrad = 0 and colgrad = 1:

$y = \beta_0 + \delta_1 male + \delta_3 colgrad + \delta_5 male \cdot colgrad + \beta_1 x + \varepsilon$

Other Interactions with Dummies

We can also consider interacting a dummy variable, d, with a continuous variable, x:

$y = \beta_0 + \delta_0 d + \beta_1 x + \delta_1 d \cdot x + \varepsilon$

If d = 0, then $y = \beta_0 + \beta_1 x + \varepsilon$

If d = 1, then $y = (\beta_0 + \delta_0) + (\beta_1 + \delta_1) x + \varepsilon$

This is interpreted as a change in the slope if $\delta_1 \neq 0$ (as well as a change in the intercept, if $\delta_0 \neq 0$)
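
The two cases above amount to a different intercept and slope for each group, which the following sketch makes explicit. The function name `group_line` and the coefficient values are hypothetical, chosen so that the intercept shift is positive and the slope shift is negative.

```python
def group_line(beta0, beta1, delta0, delta1, d):
    """Intercept and slope of the regression line for the group with dummy value d."""
    intercept = beta0 + delta0 * d
    slope = beta1 + delta1 * d
    return intercept, slope

# Hypothetical coefficients: intercept shift 0.3, slope shift -0.2.
for d in (0, 1):
    intercept, slope = group_line(1.0, 0.5, 0.3, -0.2, d)
    print(f"d = {d}: intercept = {intercept:.2f}, slope = {slope:.2f}")
```

With these values the d = 1 group starts higher but has a flatter line, so the two lines eventually cross.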

[Figure: example with $\delta_0 > 0$ and $\delta_1 < 0$.]
Testing for Differences across Groups
Testing whether a regression function is different for one
group versus another can be thought of as simply testing for
the joint significance of the dummy and its interactions with
all other x variables.
So, you can estimate the model with all the interactions and
without and form an F statistic, but this could be unwieldy.
The Chow Test
It turns out that you can compute the proper F statistic without running the unrestricted model with interactions with all k continuous variables;
If you run the restricted model for group one and get $SSR_1$, then for group two and get $SSR_2$;
Run the restricted model for all to get SSR, then:
$F = \frac{SSR - (SSR_1 + SSR_2)}{SSR_1 + SSR_2} \cdot \frac{n - 2(k+1)}{k+1}$.
The Chow Test (continued)
The Chow test is really just a simple F test for exclusion restrictions, but we have realized that $SSR_{ur} = SSR_1 + SSR_2$;
Note that we have k + 1 restrictions (each of the slope coefficients and the intercept);
Note that the unrestricted model would estimate 2 different intercepts and 2 different sets of slope coefficients, so the df is n - 2k - 2.
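The Chow statistic is simple to compute once the three SSRs are available. The sketch below uses a hypothetical function name `chow_f` and made-up SSR values; in practice the SSRs come from the pooled regression and the two group-specific regressions.

```python
def chow_f(ssr_pooled, ssr1, ssr2, n, k):
    """Chow F statistic with k + 1 restrictions and n - 2(k + 1) denominator df."""
    ssr_ur = ssr1 + ssr2  # unrestricted SSR is the sum over the two groups
    return ((ssr_pooled - ssr_ur) / ssr_ur) * ((n - 2 * (k + 1)) / (k + 1))

# Hypothetical values: 200 observations, k = 3 explanatory variables.
f = chow_f(ssr_pooled=120.0, ssr1=50.0, ssr2=55.0, n=200, k=3)
print(f"Chow F = {f:.3f}")
```

The result would be compared with the critical value of the F distribution with k + 1 and n - 2(k + 1) degrees of freedom.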
IV. Heteroskedasticity
Recall that the assumption of homoskedasticity implied that, conditional on the explanatory variables, the variance of the unobserved error, u, was constant;
If this is not true, that is, if the variance of u is different for different values of the x's, then the errors are heteroskedastic;
Example: when estimating returns to education, ability is unobservable, and we may think that the variance in ability differs by educational attainment.
[Figure: Example of Homoskedasticity]
[Figure: Regressions with Homoskedasticity]
[Figure: Example of Heteroskedasticity]
[Figure: Regressions with Heteroskedasticity]