Econometrics PPT-5.1
If corr(x2, x1) and β2 have the same sign, the bias will be positive. If corr(x2, x1) and β2 have opposite signs, the bias will be negative.
The more general case: technically, we can only sign the bias in the more general case (more than one included regressor) if all of the included x's are uncorrelated.
Multiple Regression Analysis
Omission of Relevant Variables
Suppose we omit a variable that actually belongs in the true (population) model.
Example: Suppose the true model is
wage = β0 + β1 educ + β2 ability + u    (1)
Suppose we know that the model is
y = β0 + β1 x1 + β2 x2 + u, where E(u | x1, x2) = 0,
but we do not include x2 in the regression. As shown before, the OLS slope from regressing y on x1 alone is
$$\tilde{\beta}_1 = \frac{\sum_{i=1}^{n} (x_{i1} - \bar{x}_1)\, y_i}{\sum_{i=1}^{n} (x_{i1} - \bar{x}_1)^2}.$$
Then, conditional on the regressors,
$$E(\tilde{\beta}_1) = \beta_1 + \beta_2 \frac{S_{x_1,x_2}}{S^2_{x_1}}.$$
For the model that includes an irrelevant variable x2 (true β2 = 0):
1. The OLS estimator is still unbiased: E(β̂j) = βj, j = 0, 1, 2.
2. However, the OLS estimators β̂j are no longer efficient, and thus no longer BLUE. Confidence intervals tend to be wider, so the coefficients are not estimated as precisely as they would be with the correct model.
In the expression for E(β̃1), $S_{x_1,x_2}$ is the sample covariance between x1 and x2 and $S^2_{x_1}$ is the sample variance of x1.
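The omitted variable bias expression E(β̃1) = β1 + β2 S_{x1,x2} / S²_{x1} has an exact in-sample counterpart: the slope from the short regression equals β̂1 + β̂2 S_{x1,x2} / S²_{x1}, where β̂1 and β̂2 come from the regression that includes x2. The sketch below checks this on simulated data. The data-generating process, the parameter values, the seed, and the helper `cov` are all illustrative assumptions, not part of the slides.

```python
import random

random.seed(0)  # arbitrary seed, for reproducibility only
n = 500

# Hypothetical data-generating process; every number here is an assumption.
beta0, beta1, beta2 = 1.0, 2.0, 3.0
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [0.5 * a + random.gauss(0, 1) for a in x1]  # x2 is correlated with x1
y = [beta0 + beta1 * a + beta2 * b + random.gauss(0, 1)
     for a, b in zip(x1, x2)]


def cov(a, b):
    """Sample covariance with the n - 1 divisor (hypothetical helper)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)


s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
s1y, s2y = cov(x1, y), cov(x2, y)

# Short regression: y on x1 only (x2 omitted).
b1_tilde = s1y / s11

# Long regression: y on x1 and x2 (closed form for two regressors).
det = s11 * s22 - s12 ** 2
b1_hat = (s22 * s1y - s12 * s2y) / det
b2_hat = (s11 * s2y - s12 * s1y) / det

# In-sample counterpart of the bias formula.
implied = b1_hat + b2_hat * s12 / s11
print(b1_tilde, implied)
```

The two printed numbers agree up to floating-point error. Because β2 and the covariance are both positive in this assumed setup, the short-regression slope lies above the long-regression slope.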
The Inclusion of Irrelevant Variables
Suppose we include in the regression model an irrelevant variable, that is, an independent variable that has no partial effect on the dependent variable in the population. Its population (true) coefficient is zero. Including an irrelevant variable means that we overfit, or overspecify, the model.
For example, suppose we estimate the model
y = β0 + β1 x1 + β2 x2 + u,
where x2 has no effect on y after controlling for x1, i.e. β2 = 0. Since x2 is an irrelevant variable, the true (population) model is
y = β0 + β1 x1 + u.
The variable x2 may or may not be correlated with x1.
$$E(\tilde{\beta}_1) = \beta_1 + \beta_2 \frac{S_{x_1,x_2}}{S^2_{x_1}}.$$
Summary of Direction of Bias

            corr(x2, x1) > 0    corr(x2, x1) < 0
β2 > 0      Positive bias       Negative bias
β2 < 0      Negative bias       Positive bias
The Inclusion of Irrelevant Variables
What is the effect of including the irrelevant variable x2 in the model when its coefficient in the population model is actually zero?
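One way to explore this question is a small Monte Carlo experiment, sketched below under assumed values. The true model contains only x1, the irrelevant x2 is built to be correlated with x1, and the regressors are held fixed across replications. The sample size, number of replications, parameter values, seed, and helper functions are all hypothetical choices.

```python
import random

random.seed(1)  # arbitrary seed, for reproducibility only
n, reps = 50, 1000
beta0, beta1 = 1.0, 2.0  # hypothetical true values; beta2 = 0

# Regressors are drawn once and held fixed across replications.
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [0.8 * a + 0.6 * random.gauss(0, 1) for a in x1]  # correlated with x1


def cov(a, b):
    """Sample covariance with the n - 1 divisor (hypothetical helper)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)


def mean(v):
    return sum(v) / len(v)


def var(v):
    m = mean(v)
    return sum((z - m) ** 2 for z in v) / (len(v) - 1)


s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
det = s11 * s22 - s12 ** 2

short, long_ = [], []
for _ in range(reps):
    # x2 plays no role in generating y.
    y = [beta0 + beta1 * a + random.gauss(0, 1) for a in x1]
    s1y, s2y = cov(x1, y), cov(x2, y)
    short.append(s1y / s11)                      # correct model: y on x1
    long_.append((s22 * s1y - s12 * s2y) / det)  # overspecified: y on x1, x2

print("mean of slope on x1:", mean(short), mean(long_))
print("variance of slope on x1:", var(short), var(long_))
```

In this assumed setup both estimators of β1 average close to the true value, while the slope from the overspecified regression varies more across replications. The variance penalty depends on how strongly x2 is correlated with x1.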
In the wage equation wage = β0 + β1 educ + β2 ability + u, ability is not observed because it is difficult to measure (we typically do not know the IQ of each individual), so we omit it. What is the effect of omitting ability on the OLS estimator? If the omission distorts the estimate of β1, we might overestimate or underestimate the impact of education on wages.
Recall that we needed the "correct specification" assumption to obtain unbiased OLS estimators. If this assumption is violated, the OLS estimators are very likely to be biased (lower or higher on average than they should be). The type of distortion that arises when we omit an important variable (e.g. ability) depends on how the included variable(s) and the omitted variable are related.
Omitted Variable Bias Summary
Two cases where the bias is equal to zero:
1. β2 = 0, that is, x2 doesn't really belong in the model.
2. x1 and x2 are uncorrelated in the sample.
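The second case can be illustrated numerically. In the sketch below, x2 is constructed so that its sample covariance with x1 is zero, and the slope on x1 is then the same whether or not x2 is included. The construction of x2, the parameter values, the seed, and the helper `cov` are illustrative assumptions.

```python
import random

random.seed(3)  # arbitrary seed, for reproducibility only
n = 200
beta0, beta1, beta2 = 1.0, 2.0, 3.0  # hypothetical values


def cov(a, b):
    """Sample covariance with the n - 1 divisor (hypothetical helper)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)


x1 = [random.gauss(0, 1) for _ in range(n)]
z = [random.gauss(0, 1) for _ in range(n)]

# Subtract from z its sample projection on x1, so that the sample
# covariance between x1 and x2 is zero by construction.
slope = cov(x1, z) / cov(x1, x1)
x2 = [b - slope * a for a, b in zip(x1, z)]

y = [beta0 + beta1 * a + beta2 * b + random.gauss(0, 1)
     for a, b in zip(x1, x2)]

s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
s1y, s2y = cov(x1, y), cov(x2, y)

b1_tilde = s1y / s11                                      # x2 omitted
b1_hat = (s22 * s1y - s12 * s2y) / (s11 * s22 - s12 ** 2)  # x2 included
print(s12, b1_tilde - b1_hat)
```

Both printed values are zero up to floating-point error, even though β2 is far from zero in this assumed setup.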
Omitted Variable Bias
Example: An economist is interested in the determinants of elementary school performance on a standardised exam. Suppose performance is determined by
avgscore = β0 + β1 expend + β2 povrate + u,
where avgscore is the average score in a district, expend is expenditure per student in a district, and povrate is the poverty rate in a district. Suppose that we only have observations on student performance and per-student expenditure; we have no information on poverty rates. Thus, the economist estimates
avgscore = γ0 + γ1 expend + u.
What is the likely bias in the OLS estimator?
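A possible way to reason about this example is to assume signs and simulate. The sketch below assumes that a higher poverty rate lowers scores (β2 < 0) and that poorer districts spend less per student (negative correlation between expend and povrate). Under those assumptions the product of the two signs is positive, so the estimated coefficient on expend would tend to be too large. Every number, the standardised units, the seed, and the helper `cov` are hypothetical, and no real data are used.

```python
import random

random.seed(2)  # arbitrary seed, for reproducibility only
n = 5000

# Hypothetical parameters, chosen only to match the assumed signs.
beta0, beta1, beta2 = 50.0, 1.0, -2.0  # beta2 < 0: poverty lowers scores
expend = [random.gauss(0, 1) for _ in range(n)]  # standardised units
# Assumed negative relationship between spending and poverty.
povrate = [-0.5 * e + random.gauss(0, 1) for e in expend]
avgscore = [beta0 + beta1 * e + beta2 * p + random.gauss(0, 1)
            for e, p in zip(expend, povrate)]


def cov(a, b):
    """Sample covariance with the n - 1 divisor (hypothetical helper)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)


# Regression of avgscore on expend only (povrate omitted).
gamma1_hat = cov(expend, avgscore) / cov(expend, expend)

# Bias term from the formula: beta2 * S_{x1,x2} / S^2_{x1}.
bias_term = beta2 * cov(expend, povrate) / cov(expend, expend)
print(gamma1_hat, beta1 + bias_term)
```

With these assumed signs the bias term is positive, and the short-regression slope lands well above the assumed β1. If either sign assumption were reversed, the bias would change direction.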
Too Many or Too Few Variables
What happens if we include variables in our specification that don’t belong? What if we exclude a variable from our specification that does belong?