Consider estimation of the slope $b$ in the simple regression
$$y_i = a + b x_i + e_i$$
assuming that the regressor $x$ is stochastic. The framework is the same as in the second (large-sample) approach to stochastic regressors: the sample size goes to infinity. We suppose that
and there is an instrument $z$ for $x$ that satisfies the usual conditions:
(2) $\operatorname{cov}(z,x)\neq 0$ (IV existence condition)
(3) $\operatorname{cov}(z,e)=0$ (IV consistency condition).
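To see where the names of conditions (2) and (3) come from, it may help to write out the two slope estimators. The following is only a sketch in assumed notation: the model is taken to be $y = a + bx + e$ with error $e$ and instrument $z$, and hats denote sample variances and covariances.

```latex
\hat b_{OLS} = \frac{\widehat{\operatorname{cov}}(x,y)}{\widehat{\operatorname{var}}(x)}
             = b + \frac{\widehat{\operatorname{cov}}(x,e)}{\widehat{\operatorname{var}}(x)},
\qquad
\hat b_{IV} = \frac{\widehat{\operatorname{cov}}(z,y)}{\widehat{\operatorname{cov}}(z,x)}
            = b + \frac{\widehat{\operatorname{cov}}(z,e)}{\widehat{\operatorname{cov}}(z,x)}.
```

If sample covariances approach their population counterparts as the sample size grows, the IV ratio has a well-defined limit when $\operatorname{cov}(z,x)\neq 0$ (existence) and that limit equals $b$ when $\operatorname{cov}(z,e)=0$ (consistency). By the same argument, OLS converges to $b$ only if $\operatorname{cov}(x,e)=0$.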
Table 1. What we know about OLS and IV
|OLS estimator consistency condition||Consequences|
|Valid: $\operatorname{cov}(x,e)=0$||Both OLS and IV are consistent, but OLS is more efficient by the Gauss-Markov theorem|
|Not valid: $\operatorname{cov}(x,e)\neq 0$ (endogeneity problem)||OLS is inconsistent and IV is consistent|
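The two rows of the table can be illustrated with a small simulation. This is a minimal sketch, not part of the original argument: the data-generating process, the function names (`cov`, `ols_slope`, `iv_slope`, `simulate`) and the parameter values are all hypothetical choices made for illustration. The instrument satisfies the existence and consistency conditions by construction, and the parameter `rho` controls the covariance between the regressor and the error.

```python
import random


def cov(u, v):
    """Sample covariance (divisor n) of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v)) / n


def ols_slope(x, y):
    """OLS slope: sample cov(x, y) divided by sample var(x)."""
    return cov(x, y) / cov(x, x)


def iv_slope(z, x, y):
    """IV slope: sample cov(z, y) divided by sample cov(z, x)."""
    return cov(z, y) / cov(z, x)


def simulate(n, rho, a=1.0, b=2.0, seed=0):
    """Hypothetical data with cov(z, x) = 1, cov(z, e) = 0, cov(x, e) = rho."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi, ei, vi = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        xi = zi + vi + rho * ei  # rho != 0 makes x endogenous
        z.append(zi)
        x.append(xi)
        y.append(a + b * xi + ei)
    return z, x, y


# rho = 0: no endogeneity; rho = 1: endogeneity problem (true slope is 2).
for rho in (0.0, 1.0):
    z, x, y = simulate(20000, rho)
    print(f"rho={rho}: OLS={ols_slope(x, y):.3f}, IV={iv_slope(z, x, y):.3f}")
```

Under this assumed design, the OLS slope should stay near the true value when `rho` is zero and move toward `b + rho / (2 + rho**2)` otherwise, while the IV slope should stay near the true value in both cases.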
Two formulations of the null and alternative hypotheses
The next two formulations are based on Table 1.
Simple formulation. Null hypothesis: there is no endogeneity problem (OLS can be used; using IV is not advisable). Alternative hypothesis: there is an endogeneity problem (OLS cannot be used and IV can).
General formulation. We have two competing estimators: a main estimator (think OLS) and an alternative estimator (think IV). Null hypothesis: both are consistent, but the main estimator is more efficient. Alternative hypothesis: the main estimator is inconsistent and the alternative estimator is consistent.
The form of the test statistic requires knowledge of matrix algebra and is skipped here; in statistical packages, you only need to find the p-value of the Durbin-Wu-Hausman statistic (under the null hypothesis it is asymptotically distributed as chi-square). The general formulation also allows a broader use of the Durbin-Wu-Hausman test: comparing an IV estimator with a smaller set of instruments to an IV estimator with a wider set of instruments.
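In the special case of one regressor and one instrument, no matrix algebra is needed: the statistic reduces to the squared difference between the two slope estimates divided by the difference of their estimated variances, compared with a chi-square distribution with one degree of freedom. The sketch below illustrates that scalar case under assumptions that are not in the text above: homoskedastic errors, a single error-variance estimate taken from OLS residuals (one of several variants in use), and hypothetical function names and simulated data. Statistical packages may compute the statistic differently.

```python
import math
import random


def cov(u, v):
    """Sample covariance (divisor n) of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v)) / n


def hausman_scalar(z, x, y):
    """Return (H, p-value) comparing OLS and IV slopes in simple regression."""
    n = len(y)
    b_ols = cov(x, y) / cov(x, x)
    b_iv = cov(z, y) / cov(z, x)
    a_ols = sum(y) / n - b_ols * sum(x) / n
    # Assumption: one common error-variance estimate, from OLS residuals.
    s2 = sum((yi - a_ols - b_ols * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    v_ols = s2 / (n * cov(x, x))                  # estimated Var of OLS slope
    v_iv = s2 * cov(z, z) / (n * cov(z, x) ** 2)  # estimated Var of IV slope
    # v_iv >= v_ols by the Cauchy-Schwarz inequality; the difference is zero
    # only if z and x are perfectly correlated in the sample.
    h = (b_iv - b_ols) ** 2 / (v_iv - v_ols)
    # Upper tail of chi-square with 1 degree of freedom: P(|N(0,1)| > sqrt(h)).
    p_value = math.erfc(math.sqrt(h / 2))
    return h, p_value


def simulate(n, rho, a=1.0, b=2.0, seed=0):
    """Hypothetical data with cov(z, x) = 1, cov(z, e) = 0, cov(x, e) = rho."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi, ei, vi = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        xi = zi + vi + rho * ei  # rho != 0 makes x endogenous
        z.append(zi)
        x.append(xi)
        y.append(a + b * xi + ei)
    return z, x, y


for rho in (0.0, 1.0):
    h, p = hausman_scalar(*simulate(5000, rho))
    print(f"rho={rho}: H={h:.2f}, p-value={p:.4f}")
```

A small p-value is evidence against the null hypothesis that both estimators are consistent, which in the simple formulation means evidence of an endogeneity problem.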
The test is also called a Hausman specification test, because the endogeneity problem may be a consequence of a wrong model specification (the cause may be, for example, omission of relevant variables). If the null of no endogeneity is rejected, the researcher might want to modify the model, instead of using the IV estimator.