Binary choice models: theoretical obstacles (problems with the linear probability model and binary choice models)

### What's wrong with the linear probability model

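Recall the problem statement: the dependent variable $y_i$ can take only two values, 0 or 1, and the independent variables are joined into the index $Index_i = \beta_0 + \beta_1 x_{1i} + \dots + \beta_k x_{ki}$. The linear probability model

(1) $P(y_i = 1 \mid x_i) = Index_i$

is equivalently written in linear regression form as

(2) $y_i = Index_i + u_i$ with $E(u_i \mid x_i) = 0$.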

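Let's study the error term $u_i = y_i - Index_i$. If $y_i = 1$, from (2) the value of $u_i$ is $1 - Index_i$. From (1) we know the probability of this event: it is $Index_i$. If $y_i = 0$, then the value of $u_i$ is $-Index_i$ and by (1) the probability of this event is $1 - Index_i$. We can summarize this information in a table:

| Values of $u_i$ | Corresponding probabilities |
|---|---|
| $1 - Index_i$ | $Index_i$ |
| $-Index_i$ | $1 - Index_i$ |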

For each observation, *the error is a binary variable*. In particular, it's not continuous, much less normal. Since the index changes with the observation, *the errors are not identically distributed*.

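It's easy to find the mean and variance of $u_i$. The mean is

$E u_i = (1 - Index_i)\,Index_i + (-Index_i)(1 - Index_i) = 0$

(this is good). The variance is

$Var(u_i) = E u_i^2 = (1 - Index_i)^2\,Index_i + Index_i^2\,(1 - Index_i) = Index_i(1 - Index_i),$

which is bad (*heteroscedasticity*), because it changes with the observation. Besides, for this variance to be positive, the index should stay between 0 and 1.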
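The mean and variance of the error are easy to check numerically. The snippet below is a minimal sketch, not part of the original derivation: the function name `lpm_error_moments` and the sample index values are made up for illustration. It computes the two moments directly from the two-point distribution of the error and compares the variance with the index times one minus the index.

```python
def lpm_error_moments(index):
    """Exact mean and variance of the LPM error u = y - index, where P(y = 1) = index."""
    # u equals 1 - index with probability index,
    # and -index with probability 1 - index.
    values = (1.0 - index, -index)
    probs = (index, 1.0 - index)
    mean = sum(v * p for v, p in zip(values, probs))
    variance = sum((v - mean) ** 2 * p for v, p in zip(values, probs))
    return mean, variance


# Illustrative index values; the last one lies outside [0, 1] on purpose.
for index in (0.1, 0.5, 0.9, 1.2):
    mean, variance = lpm_error_moments(index)
    closed_form = index * (1.0 - index)
    print(f"index={index:.1f} mean={mean:+.3f} "
          f"variance={variance:+.3f} index*(1-index)={closed_form:+.3f}")
```

The mean is zero for every index, while the variance changes with it. For the last value, which lies outside the interval from 0 to 1, the "probabilities" are no longer valid and the formula returns a negative number.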

### Why there is no error term in binary choice models

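We know the general specification of a binary choice model:

$P(y_i = 1 \mid x_i) = F(Index_i).$

Here $F$ is a distribution function of some variable, say $X$. Let's see what happens if we include the error term, as in

(3) $P(y_i = 1 \mid x_i) = P(X \le Index_i + u_i).$

It is natural, as a first approximation, to consider identically distributed errors. By definition,

(4) $F(Index_i) = P(X \le Index_i)$.

The variables $X - u_i$ are distributed identically. Denoting their common distribution function by $G$, from (3) and (4) we have

$P(y_i = 1 \mid x_i) = P(X - u_i \le Index_i) = G(Index_i)$.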

Thus, including the error term in (3) changes the distribution function in the model specification. In probit and logit, we fix good distribution functions from the very beginning and don't want to change them by introducing (possibly bad) errors.
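A small simulation illustrates the change of distribution function. This is a sketch under assumptions that are not in the text above: the variable behind the model is standard normal (as in probit), the error is normal with standard deviation `sigma` and independent of that variable, and the values `index = 1.0` and `sigma = 1.0` are arbitrary. Under these assumptions the difference of the two variables is normal with variance `1 + sigma**2`, so the new distribution function is known in closed form.

```python
import math
import random
from statistics import NormalDist


def simulated_prob(index, sigma, n=200_000, seed=0):
    """Monte Carlo estimate of P(X <= index + u) for independent
    X ~ N(0, 1) and u ~ N(0, sigma^2) (both are assumptions)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)    # variable behind the probit link
        u = rng.gauss(0.0, sigma)  # added error term
        if x <= index + u:
            hits += 1
    return hits / n


index, sigma = 1.0, 1.0  # illustrative values
# Probit probability without an error term.
F_value = NormalDist().cdf(index)
# Distribution function of X - u, which is N(0, 1 + sigma^2), at the index.
G_value = NormalDist(0.0, math.sqrt(1.0 + sigma ** 2)).cdf(index)
estimate = simulated_prob(index, sigma)

print(f"F(index)  = {F_value:.4f}")
print(f"G(index)  = {G_value:.4f}")
print(f"simulated = {estimate:.4f}")
```

The simulated frequency should land close to the value of the new distribution function and visibly away from the probit value, which is the point of the argument: with an error term inside, the model is no longer probit.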

[…] index, it has to be made up for by the error term. Thus the error term will certainly be bad. (A detailed analysis shows that it will be heteroscedastic, but this fact is less important than the problem with range […]