Simple regression: a useful comparison of what we have before and after estimation
Initially, we have only observations

$(x_1, y_1), \dots, (x_n, y_n). \qquad (1)$

Then we assume dependence between the $y$'s and $x$'s of the form

$y_i = \beta_0 + \beta_1 x_i + e_i, \quad i = 1, \dots, n. \qquad (2)$
Here $\beta_0, \beta_1$ are unknown parameters to be estimated and $e_i$ are random errors which satisfy the basic assumption

$Ee_i = 0, \quad i = 1, \dots, n. \qquad (3)$
It is convenient to call $\beta_0 + \beta_1 x_i$ the linear part of model (2).
The OLS estimators of $\beta_1, \beta_0$ are, respectively,

$\hat\beta_1 = \dfrac{\sum_{i=1}^n (x_i - \bar x)(y_i - \bar y)}{\sum_{i=1}^n (x_i - \bar x)^2}, \qquad \hat\beta_0 = \bar y - \hat\beta_1 \bar x.$
Using these estimators, we define the fitted value $\hat y_i = \hat\beta_0 + \hat\beta_1 x_i$, which mimics the linear part. To mimic the errors, we define the residuals $\hat e_i = y_i - \hat y_i$. These definitions give a sample analog of (2):

$y_i = \hat\beta_0 + \hat\beta_1 x_i + \hat e_i, \quad i = 1, \dots, n. \qquad (2')$
The residuals also possess the property

$\dfrac{1}{n}\sum_{i=1}^n \hat e_i = 0, \qquad (3')$
which is a sample analog of (3).
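A short numerical sketch of the definitions above, on simulated data (the sample size, true parameter values, and error distribution are my own illustrative choices, not anything from the text): the OLS estimators are computed from the formulas, the fitted values and residuals are built from them, and then (2') and (3') are checked directly.

```python
import numpy as np

# Hypothetical data in the form (1); true parameters chosen for illustration
rng = np.random.default_rng(0)
n = 50
x = np.linspace(0.0, 10.0, n)          # deterministic regressors
e = rng.normal(0.0, 1.0, n)            # errors satisfying the basic assumption (3)
y = 2.0 + 0.5 * x + e                  # model (2) with beta0 = 2, beta1 = 0.5

# OLS estimators
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

y_hat = b0 + b1 * x                    # fitted values mimic the linear part
resid = y - y_hat                      # residuals mimic the errors

# (2'): y_i = b0 + b1*x_i + resid_i holds by construction
print(np.allclose(y, y_hat + resid))   # True
# (3'): the residuals average to zero exactly (up to floating-point error)
print(abs(resid.mean()) < 1e-10)       # True
```

Note that both checks pass for *any* data set, not just one generated from the model: (2') and (3') are identities of the estimation procedure, which is exactly the point of the table below.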
| Before estimation | After estimation |
|---|---|
| $\beta_0, \beta_1$ are unknown, the errors $e_i$ are unobservable | The estimators, fitted values and residuals are known functions of the observations (1) |
| (2) is just a product of our imagination | Its analog (2') holds by construction |
| Whether (3) is true or not, we don't know | Its analog (3') is always true |
Tricky question. Put $\bar e = \frac{1}{n}\sum_{i=1}^n e_i$. Ask your students to show that if the $x_i$ are deterministic, then under condition (3) one has $E\bar e = 0$, whereas by (3') the sample mean of the residuals equals zero identically.
This will reveal whether they know the difference between sample means and population means.
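The contrast in the tricky question can be seen by simulation (again on made-up data; the sample size, number of replications, and parameter values are arbitrary): across repeated samples, the residual mean is zero in every single sample, while the error mean $\bar e$ is a nonzero random variable that is only zero on average.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 30, 2000
x = np.linspace(0.0, 1.0, n)           # deterministic regressors, fixed across samples

err_bars, resid_bars = [], []
for _ in range(reps):
    e = rng.normal(0.0, 1.0, n)        # fresh errors each replication, E(e_i) = 0
    y = 1.0 + 3.0 * x + e              # model (2) with beta0 = 1, beta1 = 3
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    resid = y - (b0 + b1 * x)
    err_bars.append(e.mean())          # sample mean of the errors: random, != 0
    resid_bars.append(resid.mean())    # sample mean of the residuals: exactly 0

# (3') holds in every sample; (3) only says the *population* mean of e is zero
print(max(abs(r) for r in resid_bars) < 1e-10)   # True in all replications
print(abs(np.mean(err_bars)) < 0.05)             # True: E(e-bar) = 0 on average
```

The first check is an algebraic identity; the second is a statistical statement that holds only approximately in finite samples, which is precisely the sample-mean versus population-mean distinction the question targets.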