ECON 120C UCLA Econometrics C Problems


Problem Set 2, Econ 120C
Yixiao Sun
Question I

Read the following Stata program:
clear
cap postclose tempid
* collect beta and the two rejection indicators from each replication
postfile tempid beta reject1 reject2 ///
    using mydata.dta, replace
forvalues i = 1(1)1000 {
    drop _all
    quietly set obs 1000
    gen x = rnormal()
    gen e = rnormal()
    gen u = x + e
    gen y = 3*x + u
    quietly reg y x, r
    scalar beta = _b[x]
    qui test x = 3
    sca reject1 = (r(p) < 0.10)
    qui test x = 4
    sca reject2 = (r(p) < 0.10)
    post tempid (beta) (reject1) (reject2)
}
postclose tempid
use mydata.dta, clear
sum
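Before turning to the questions, note the key feature of this design (a worked step added here for clarity): because u = x + e, the regressor is correlated with the error, with cov(x, u) = var(x) = 1. The linear predictive slope of y on x is therefore

    cov(x, y)/var(x) = 3 + cov(x, u)/var(x) = 3 + 1/1 = 4,

which differs from the causal coefficient 3 in the data-generating process.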
Part of the output is given in the table below:

    Variable    Mean    Std. Dev.
    beta        m       #
    reject1     r1      s1
    reject2     r2      s2

Answer the following questions with detailed arguments.

(a) What would you expect the value of m to be?
(b) What would you expect the value of r1 to be?
(c) What would you expect the value of r2 to be?
(d) What would you expect the value of s1 to be?
(e) What would you expect the value of s2 to be?

Question II

Consider the linear regression model

    Y_i = α + X_i β + W_i γ + u_i.

Given the sample (X_i, W_i, Y_i), i = 1, ..., n, we run the following regressions:

(i) Regress Y on X and a constant/intercept. Denote the OLS estimator of the slope on X as β̂_short.
(ii) Regress W on X and a constant/intercept. Denote the OLS estimator of the slope on X as δ̂.
(iii) Regress Y on X and W and a constant/intercept. Denote the OLS estimators of α, β, γ by α̂_long, β̂_long, γ̂_long, and define

    û_i = Y_i − (α̂_long + X_i β̂_long + W_i γ̂_long).

By construction, we know that the sample mean of û is zero and its sample covariances with X and W are zero: Ê(û) = 0, ĉov(X, û) = 0, and ĉov(W, û) = 0.

(a) Write down the formulae for β̂_short and δ̂.
(b) Show that β̂_short = β̂_long + δ̂ γ̂_long. Hint: Plug the definition Y_i = α̂_long + X_i β̂_long + W_i γ̂_long + û_i into the formula for β̂_short and then use the operating rules for sample covariances to simplify.

Question III

In a 1994 research paper, two economists examined the impact of looks on earnings using interviewers' ratings of respondents' physical appearance. They used data in which respondents reported their wages and the interviewers rated the respondents' appearance on five categories: (1) homely, (2) quite plain, (3) average, (4) good looking, and (5) strikingly handsome or beautiful. Download the file beauty.xls from the course web page. The file contains the following variables: hourly earnings, looks, female (equal to 1 if female, 0 otherwise), and years of education. Assume that the population consists of all individuals in the data set.

(a) Estimate the regression

    ernhr = α + β·looks + γ·yrseduc + u    (1)

using the population data (i.e., use all the data). What is the value of the estimated β? Denote this value as β₀.

(b) (i) Draw a simple random sample of size 100. (ii) Estimate (1) using the sample you obtained. (iii) Test the null hypothesis H₀: β = β₀ against H₁: β ≠ β₀ using 10% as the size of the test. Let reject be a dummy variable indicating whether H₀ is rejected (reject = 1 if H₀ is rejected). (iv) Save β̂_OLS, its robust standard error, and the dummy variable reject to a Stata data set, say mydata.dta. The data set contains three variables, say beta_hat, se_hat, and reject (you can use your own favorite names for the data set and the variables).

(c) Repeat (b) 1000 times.

(d) Now load the Stata data set mydata.dta into Stata and graph the histogram of beta_hat. Is it close to normal?

(e) Summarize all variables. Is the mean of reject close to 10%? Is your answer expected? Can you explain why or why not?

(f) Is the standard deviation of beta_hat close to the mean of se_hat? Is your answer expected? Can you explain why or why not?

Note: A sample Stata program is posted on the course home page. You are encouraged to write your own program. The sample program focuses on a different coefficient, so you have to modify it to accommodate this and other differences. This question is designed to help you understand the sampling distribution of the OLS estimator. It is also very good practice for Stata programming.
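The following is a minimal sketch of the kind of program parts (a)-(f) call for. It is not the posted sample program; it assumes beauty.xls has already been imported into Stata and saved as beauty.dta with variables named ernhr, looks, and yrseduc, and the seed is illustrative.

set seed 12345
use beauty.dta, clear
* (a) "population" regression using all the data
quietly reg ernhr looks yrseduc
scalar beta0 = _b[looks]

cap postclose sim
postfile sim beta_hat se_hat reject using mydata.dta, replace
forvalues i = 1(1)1000 {
    use beauty.dta, clear
    sample 100, count                     // (b)(i) simple random sample of size 100
    quietly reg ernhr looks yrseduc, r    // (b)(ii) with robust standard errors
    scalar b  = _b[looks]
    scalar se = _se[looks]
    local b0 = scalar(beta0)
    qui test looks = `b0'                 // (b)(iii) H0: beta = beta0
    scalar rej = (r(p) < 0.10)
    post sim (b) (se) (rej)               // (b)(iv) save the results
}
postclose sim

use mydata.dta, clear
histogram beta_hat, normal                // (d)
summarize                                 // (e) and (f)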
[beauty.xls, excerpt of the first rows; the full file is on the course web page]

ERNHR      LOOKS   FEMALE   YRSEDUC
5.725481   4       1        14
4.284616   3       1        12
7.963461   4       1        12
11.57033   3       0        16
11.41827   3       0        16
3.90625    2       1        12

Causality and Causal Model: Bivariate Case
©Yixiao Sun

Causality and Causal Model: Definition
- There are many ways to define causality and causal models; I present one of many possibilities here.
- Let us first consider the simple case with only two scalar variables, y and x. Both variables are deterministic.
- We want to define the notion "x causes y".
- Either there are no other variables, or other variables do not interfere with the relationship between x and y.

SET x, FORCE x, INTERVENE x
Our definition involves the following conceptual steps:
1. We (as the experimenter) set x to each of its possible values (or force x to take each of its possible values, or intervene and let x take each of its possible values).
2. We let y respond freely, without any further intervention, direct or indirect.
3. We observe that y takes a unique value for each setting of x (a setting is a particular value that x takes).
4. We ask: does y take different values for different x? If yes, then x causes y. Otherwise, x does not cause y.

Causality and Causal Model: Definition
- Let c(x) be the unique value of y for each x. We call c(x) the response function.
- Mathematically, if c(x) is not a constant function, then we say that x causes y. In this case, we call c(x) the causal function.
- When x causes y, we write y ← c(x). We use an arrow "←" instead of the equal sign "=" to indicate the causality and the causal direction; that is, x is set and y is free to respond (x is the cause, y is the effect).

No Intervention, No Causality
- We call a variable a settable variable if we can intervene at will to set it to any desired value. Some variables are not settable: race, ...
- We call a variable a free variable if we do not intervene to set its value. Here x is a settable variable and y is a free variable.
- The importance of the formal role played here by intervention/setting cannot be over-emphasized. The notion of cause and effect we adopt here has meaning only in the context of intervention, whether actual or merely hypothetical.

Causality and Causal Model: Notation
- A main source of confusion in causal inference is bad notation. Instead of using "y ← c(x)", we almost always use "y = c(x)".
- The notation "y ← c(x)" signifies that the left-hand side and the right-hand side are not exchangeable: causes are different from effects.
- The mathematical equality "y = c(x)" encodes the exchangeability of the two sides: "y = c(x)" iff "c(x) = y".
- We use the new and better notation only in this and a few future lectures, as the conventional notation is deeply ingrained in the literature.

Example 1
- y: earnings/wages; x: years of schooling. Assume that only years of schooling matter for earnings and nothing else matters, so that we have a two-variable system. This, of course, is an abstraction of reality.
- To see whether x causes y, we (the experimenter) force an individual to have 8, 9, 10, ... years of schooling and observe the corresponding earnings.
- We ask: do different years of schooling lead to different earnings? The answer is most likely "yes", so x causes y.

Example 2
- y: 1 or 0, indicating whether it will rain or not; x: the percentage of individuals carrying an umbrella.
- To see whether x causes y, we (the experimenter) set the percentage at different levels by forcing individuals to carry or not carry an umbrella, and we examine whether the weather changes in response.
- Key: individuals cannot make their own decisions on whether to carry an umbrella or not; we set their x's.
- In this example, clearly y will not change, so x does not cause y.

Example 2 (continued)
- If we do not set the value of x and instead let individuals make their own decisions (based on whatever information they may have) to carry an umbrella or not, so that we become passive observers instead of active experimenters, then x can be useful for predicting y.
- Here x does not cause y, but in an observational study (instead of an experimental study) x can be very useful as a predictor of y.

Causality vs Prediction
Causality and prediction are fundamentally different.
- Causality: how things actually work. We care about the physical, chemical, biological, and economic laws. We want to understand the "causal structure"; we often refer to a causal model as a "structural model".
- Prediction: how things are related, associated, or move together. We do not care about the physical, chemical, biological, or economic laws. As long as two variables move together, we can use one variable to predict the other.
- The measure of association or co-movement is the correlation coefficient (at least in the linear case). The correlation coefficient is a purely statistical object; it does not have to carry any physical, chemical, biological, or economic meaning.

Causality and Causal Model: Multivariate Case

Causality and Causal Model: a definition
- Our definition of causality and causal models can be extended to the multivariate case.
- Consider the special case x = (x_f, x_o), where x_f is a scalar variable and the focus of interest, and x_o consists of all other variables.
- In our returns-to-education example, we can let x_f be the years of schooling and x_o be innate ability.
- We want to assess whether x causes y.

SET x, FORCE x, INTERVENE x
We repeat the same definition:
1. We (as the experimenter) set x to each of its possible values (or force x to take each of its possible values, or intervene and let x take each of its possible values).
2. We let y respond freely, without any further intervention, direct or indirect.
3. We observe that for each setting of x, y takes a unique value c(x) = c(x_f, x_o).
4. We ask: is c(x) a constant function? If yes, then x does not cause y. Otherwise, x causes y.

Question: does x_f cause y?
- If there is a setting, say x̄_o, such that the function defined by c̄_o(x_f) = c(x_f, x̄_o) is not a constant function, then we say that x_f causes y under the setting x_o = x̄_o. In this case, we write y ← x_f under the setting x̄_o.
- If the above function is a constant function for all values of x_o, then we say that x_f does not cause y.

Example: does x_f cause y?
In our returns-to-education example, x_f is the years of schooling and x_o is innate ability. We ask: do earnings change in response to years of schooling for some level of innate ability? If earnings do not change in response to years of schooling for any level of innate ability, then years of schooling does not cause earnings. If earnings DO change in response to years of schooling for some level of innate ability, then years of schooling causes earnings for this level of innate ability.

Question: does x_f cause y?
- To address this question, we have to keep all other variables at some level, say x̄_o, and examine whether y changes in response to different settings of x_f.
- Sometimes we refer to x_o as the background variables. We keep x_o at a given level so that it does not confound the causal relationship between x_f and y.
- However, the strength of the causal link between x_f and y may depend on the value of x_o: for one setting of x_o, x_f causes y; for another setting of x_o, x_f does not cause y. Causality and its magnitude established for one sample may not be applicable to the population (the problem of external validity).

Ceteris paribus effect
- Given this definition, we can define the notion of a "ceteris paribus" effect, that is, the effect of one variable holding all others equal. Ceteris paribus is a Latin phrase meaning "other things equal".
- When x_f is a continuous variable and c is differentiable, this effect is defined as

    θ(x_f, x_o) = ∂c(x_f, x_o)/∂x_f.

- If x_f is discrete, we define

    θ(x_f, x_o) = c(x_f + 1, x_o) − c(x_f, x_o).

Example
- Demand curve: q ← c(p, o), where p is the price, o consists of all other factors, and q is the quantity demanded.
- The familiar demand curve describes an economic law. It traces out the quantities demanded at all possible prices. Some of the prices are actually observed in the market; others may not be. It is a structural model that describes the behavior of consumers.
- If the price of beef increases, ceteris paribus, the quantity of beef demanded by buyers will decrease.
- What is in "o"? Prices of substitute goods (pork, lamb, ...); consumers' preferences (e.g., a societal shift toward vegetarianism).

Interpretation
- If the causal relationship is linear (a big assumption), then we have

    y ← α* + x_f β + x_o γ.

- The linear causal model says that if we change x_f by 1 unit, then y will change by β units, all else being equal.
- I cannot stress enough how important the "all else being equal" condition is. Under this condition, we can think of the change in x_f as induced or set by us; that is, we intervene and force x_f to change by 1 unit but keep all else constant.

Causality and Causal Model: Bring Models to Data and Linear Causal Model

Bring Causal Models to Data
- We need to connect the deterministic causal model with our data, which are often observational.
- Suppose that the values of x = (x_f, x_o) we want to set are iid draws from a certain distribution; that is, the settings of x are X_i = (X_fi, X_oi). Either we (as the experimenter) pick these values from a distribution, or we draw these values from a population (sometimes Nature draws these values for us):

    X_1 = (X_f1, X_o1), X_2 = (X_f2, X_o2), ..., X_n = (X_fn, X_on).

- Let Y_i be the realized value of y in the absence of any intervention for y. We have Y_i ← c(X_fi, X_oi), or mathematically Y_i = c(X_fi, X_oi):

    X_i = (X_fi, X_oi) → Y_i for i = 1, ..., n.

- The causal relation is deterministic, but the settings of the causal factors are stochastic, and hence the outcome of interest is stochastic.

Linear Causal Model
- In the case that the causal function is linear, we have

    Y_i ← α* + X_fi β + X_oi γ.

- Now suppose X_oi is not observed. We have

    Y_i = [α* + E(X_oi)γ] + X_fi β + [X_oi − E(X_oi)]γ
        = α + X_fi β + u_i,

  where α = α* + E(X_oi)γ and u_i = [X_oi − E(X_oi)]γ. Here u_i captures the effect of all centered and unobserved causal factors.
- We have gone a long way in order to obtain a linear causal model, which may look familiar.
- We should not take the simple linear causal model for granted: there are important assumptions underlying the model, for example, the linear and separable form of the causal relationship.
  - Linearity: the effect X_fi β is linear in X_fi.
  - Separability: X_fi β and u_i are additively separable.
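To make the derivation above concrete, here is a small simulation sketch (illustrative, not from the slides): a linear causal model with β = 2 and γ = 1 in which the omitted factor X_o is correlated with X_f. Regressing Y on X_f alone then estimates β + γ·cov(X_f, X_o)/var(X_f) = 2.5 rather than the causal β = 2, which is the same mechanism at work in Question I of the problem set. All numbers are chosen for illustration.

clear
set seed 12345
set obs 10000
gen xf = rnormal()
gen xo = 0.5*xf + rnormal()   // omitted factor with cov(xf, xo) = 0.5
gen y  = 1 + 2*xf + xo        // linear causal model: beta = 2, gamma = 1
reg y xf                      // slope centers on 2 + 0.5 = 2.5, not 2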
Nonlinear and Non-separability
- To understand linearity and separability, we consider two counterexamples.
- Example 1, a nonlinear but separable relationship:

    Y_i = α + X_fi β + X_fi² δ + u_i.

- Example 2, a nonlinear and non-separable relationship:

    Y_i = α + X_fi β + X_fi² u_i δ + u_i.

- What is the ceteris paribus causal effect in each case?
- Example 1: the effect depends on x_f but not on x_o (i.e., u):

    θ(x_f, x_o) = ∂c(x_f, x_o)/∂x_f = ∂(α + x_f β + x_f² δ + u)/∂x_f = β + 2 x_f δ.

- Example 2: the effect depends on both x_f and x_o (i.e., u):

    θ(x_f, x_o) = ∂c(x_f, x_o)/∂x_f = ∂(α + x_f β + x_f² u δ + u)/∂x_f = β + 2 x_f u δ.

Linear Causal Model

    Y_i ← α* + X_fi β + X_oi γ,
    θ(x_f, x_o) = ∂c(x_f, x_o)/∂x_f = ∂(α* + x_f β + x_o γ)/∂x_f = β.

Linearity and separability imply that the causal effect is a constant: it is the same for all individuals. This is an important restriction.

Prediction Analysis versus Causal/Structural Inference
Yixiao Sun

Predictive Model: Review
- Given two random variables (X, Y), suppose we want to predict Y based on X.
- The starting point of a linear predictive model is to define

    β₁ = cov(X, Y)/var(X) and β₀ = E(Y) − E(X)β₁.

- These are well defined as long as var(X) < ∞ and var(Y) < ∞.
- Note that these are purely statistical objects. They need not carry any physical, chemical, biological, or economic meaning.
- With these definitions, we define e to be the difference between Y and the linear function β₀ + Xβ₁:

    e = Y − (β₀ + Xβ₁).

- I want to emphasize that this is just a mathematical definition. The equation can be rewritten as

    Y = (β₀ + Xβ₁) + e.

- We add whatever is needed to bring β₀ + Xβ₁ up to Y; the added amount may not represent any real effect.
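The comparison slide below notes that cov(X, e) = 0 holds by construction; a one-line calculation (added here for clarity) makes this explicit:

    cov(X, e) = cov(X, Y) − β₁ var(X) = cov(X, Y) − [cov(X, Y)/var(X)] var(X) = 0,

and similarly E(e) = E(Y) − β₀ − E(X)β₁ = 0 by the definition of β₀.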
I Conceptually, if we discover a distinctive patterns among a set of variables in a given dataset, we would expect that this pattern will emerge again when the same variables are collected without interference. 6 Passive Prediction vs Pattern Discovery I For example, we may discover the following pattern: whether a student graduates from high school is related to his mother’s level of education. I Then we can use the observed level of education for a student’s mother to predict whether the student will graduate from high school. I The above problem is fundamentally di¤erent from predicting the outcome when we exogenously change the level of education the mother attained. 7 Passive Prediction vs Pattern Discovery I For example, we may discover the following pattern: whether a student graduates from high school is related to his mother’s level of education. I Then we can use the observed level of education for a student’s mother to predict whether the student will graduate from high school. I The above problem is fundamentally di¤erent from predicting the outcome when we exogenously change the level of education the mother attained. 8 Passive prediction: Econ120C performance I I collected some data from my past Econ120C students I I I Xi : Econ120B Professor, Econ120B Grade, Hours of study per week, % of class attended live, Time spent on class webpage Yi : weighted score Passive prediction: I I I Draw a student randomly from the class. Reveal only his/her X Predict his/her Y 9 Causal Model I For a linear causal model Y ?+X?+u where u stands for other and possibly unobserved causal factors. I Interpretation of ? : If we intervene and set X to change by 1 unit while keeping all elseconstant, then Y will change by ? units. I The di¤erence between ? and ? lies in whether all else has been kept as equal. 10 Causal Model: Active Prediction I If we want to predict the consequence of some action on an outcome of interest, we need to establish a causal or structural relationship. 11 Causal Model: Active Prediction Example 1 I You may want to predict the impact on your …nal grade in Econ120C if you increase your study time by one hour per week. I Implicitly, you assume that all else have been kept as equal. I So you are making an active prediction. 12 Causal Model: Active Prediction Example 2 I Suppose you care about your body weight two months from now (call this Y ). I Currently, you do not eat whole grains but are considering switching to a whole-grain-only diet (call this change in diet X ). I Then you may want to make an active prediction of Y based on X . 13 Causal Model: Active Prediction 3 I Do you notice music playing in retail stores? I Studies show that music can signi…cantly in‡uence sales. I Actively vary music tempo played in stores from very slow to quick and observe the results on sales. I Slower music causes shoppers to shop more slowly, resulting in higher sales. 14 A comparison Show that the two models coincide if cov (X , u ) = 0 15 A comparison Model Correlation Interpretation of the slope Model Correlation Interpretation of the slope Predictive Analysis Y = ? +X? +e By construction cov (X , e ) = 0 other variables run their own course all else may not be equal Causal Inference Y = ?+X?+u Cov (X , u ) may not be zero. all other variables kept constant 16 Example 1: No causality does not imply no predictability Suppose that we have the following simple casual relations: y z a x z b for b 6= 0. Graphically, z x . 
& y Let z be generated as a sequence of iid random variables Zi so that in the absence of intervention for x and y we observe Xi = Zi b, Yi = Zi a= a Xi . b 17 Example 1: No causality does not imply no predictability 18 Example 1: No causality does not imply no predictability I For the purpose of this example, we assume that we do not observe Zi0 s. I Our observations consist of (Xi , Yi ) lying on the line y = (a/b ) x. I Given any Xi , the best prediction of Yi is m (Xi ) = (a/b ) Xi . I Thus Xi is useful for predicting Yi , even though there is no causal relation between Xi and Yi . Furthermore, the regression coe¢ cient a/b de…nitely does not measure the e¤ect on y caused by a change in x. I Intervening to change x (while keeping z constant) has no e¤ect on y . Instead, the regression coe¢ cient a/b works together with Xi to give an optimal prediction of Yi . 19 Example 1: No causality does not imply no predictability I I I In any equation system, like the one above, if we intervene a variable (say x ), then the equation that determines this variable has to be crossed out. The equation does not describe how x is determined any more. In this example, the system becomes x x0 (set ) y az Graphically, x0 # x z & . y x and y are not connected in any way: the causal e¤ect is zero. The cause e¤ect is (when x is set at two di¤erent values x0 , x00 ): y (x00 ) y (x0 ) = 0 20 Example 2: Causality does not imply predictability Consider the following causal system y ax + u, x by + v , or graphically v # x $ u # . y Example: x : crime rate; y : police spending. (The two di¤erent causal directions may not happen at exactly the same time, but if we observe the variables not very frequently, then y ax + u and x by + v can be regarded as happening simultaneously over each observation interval). 21 Example 2: Causality does not imply predictability I Suppose that the values of (u, v ) are generated as an iid sequence of pairs (Ui , Vi ) such that (Ui , Vi ) s N 0, ?uu ?uv ?uv ?vv . I We do not observe f(Ui , Vi )g , and we observe f(Xi , Yi )g only. I The reduced form (the equilibrium solution in terms of Ui and Vi ) is given by Xi = Yi = 1 1 ab 1 1 ab (bUi + Vi ) , (Ui + aVi ) . 22 Example 2: Causality does not imply predictability I It is now easy to show that ? = cov (Xi , Yi ) b?uu + (1 + ab ) ?uv + a?vv . = var (Xi ) b 2 ?uu + 2b?uv + ?vv I Su¢ cient freedom exists to deliver a wide range of possible value for ? . I For example, when ?uu = 0, we have ? = a?vv =a ?vv whereas if ?vv = 0, we have ? = 1/b. I Picking ?uv = b?uu + a?vv 1 + ab gives ? = 0 so that Xi is useless as a predictor of Yi 23 Remarks The optimal linear prediction interpretation of ? holds regardless of whether we have each of the following (a) x is the cause of y (?uu = 0) (b ) y is the cause of x (?vv = 0) (c ) x and y are mutually non-causal, although both have a common cause (y za and x zb ) (d ) x and y mutually cause each other in the presence of additional causal variables (?uu 6= 0, ?vv 6= 0) In the …rst three cases, the predictions are in fact perfect, while in the last case we can have Xi and Yi useless as predictors of one another despite their causal relationships. 24 Remarks I While in case (a): x ! y , the conditional mean coincides with the causal function, this is not true in any of other cases. I This conditional expectation cannot by itself tell us what we should expect to happen when we intervene to set Xi to a particular value. 
I Rather, it predicts: it tells us what we can expect Yi to be given Xi when Yi and Xi are generated by whatever process is operational for observation i. 25 Purchase answer to see full attachment Tags: Econometrics Linear Prediction Prediction Analysis User generated content is uploaded by users for the purposes of learning and should be used following Studypool's honor code & terms of service.