
716 PART IV ✦ Cross Sections, Panel Data, and Microeconometrics
17.4 BINARY CHOICE MODELS FOR PANEL DATA
Qualitative response models have been a growth industry in econometrics. The recent
literature, particularly in the area of panel data analysis, has produced a number of new
techniques. The availability of high-quality panel data sets on microeconomic behavior
has maintained an interest in extending the models of Chapter 11 to binary (and other
discrete choice) models. In this section, we will survey a few results from this rapidly
growing literature.
The structural model for a possibly unbalanced panel of data would be written
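$$
\begin{aligned}
y_{it}^{*} &= \mathbf{x}_{it}'\boldsymbol{\beta} + \varepsilon_{it}, \quad i = 1,\ldots,n,\; t = 1,\ldots,T_i,\\
y_{it} &= 1 \text{ if } y_{it}^{*} > 0, \text{ and 0 otherwise.}
\end{aligned}
\tag{17-38}
$$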
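As an illustration of (17-38), the following is a minimal sketch that simulates the latent regression and the observed binary outcome for a small unbalanced panel. The function name `simulate_binary_panel`, the regressors, the coefficient values, and the assumption that $\varepsilon_{it}$ is i.i.d. standard normal (a probit specification) are choices made here for illustration only; the model itself leaves the correlation of $\varepsilon_{it}$ within a group unspecified.

```python
import numpy as np

def simulate_binary_panel(T_i, beta, rng):
    """Simulate y*_it = x_it' beta + eps_it and y_it = 1(y*_it > 0).

    T_i  : array of group sizes (the panel may be unbalanced)
    beta : coefficient vector (hypothetical values)
    """
    T_i = np.asarray(T_i)
    n = T_i.size
    groups = np.repeat(np.arange(n), T_i)           # group index i for every row (i, t)
    X = rng.normal(size=(groups.size, len(beta)))   # hypothetical regressors x_it
    eps = rng.normal(size=groups.size)              # ASSUMPTION: i.i.d. N(0, 1) disturbances
    y_star = X @ beta + eps                         # latent variable y*_it
    y = (y_star > 0).astype(int)                    # observed indicator y_it
    return groups, X, y_star, y

rng = np.random.default_rng(0)
T_i = np.array([3, 5, 2, 4])                        # four groups with different T_i
beta = np.array([0.5, -1.0])
groups, X, y_star, y = simulate_binary_panel(T_i, beta, rng)
```

Only `y` and `X` would be available to the analyst; `y_star` is returned here purely so that the mapping from the latent variable to the observed outcome can be inspected.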
The second line of this definition is often written
$$
y_{it} = \mathbf{1}(\mathbf{x}_{it}'\boldsymbol{\beta} + \varepsilon_{it} > 0)
$$
to indicate a variable that equals one when the condition in parentheses is true and zero when it is not. Ideally, we would like to specify that $\varepsilon_{it}$ and $\varepsilon_{is}$ are freely correlated within a group, but uncorrelated across groups. But doing so will involve computing joint probabilities from a $T_i$ variate distribution, which is generally problematic.$^{27}$ (We will return to this issue later.) A more promising approach is an effects model,
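$$
\begin{aligned}
y_{it}^{*} &= \mathbf{x}_{it}'\boldsymbol{\beta} + v_{it} + u_i, \quad i = 1,\ldots,n,\; t = 1,\ldots,T_i,\\
y_{it} &= 1 \text{ if } y_{it}^{*} > 0, \text{ and 0 otherwise,}
\end{aligned}
\tag{17-39}
$$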
where, as before (see Sections 11.4 and 11.5), $u_i$ is the unobserved, individual-specific heterogeneity. Once again, we distinguish between “random” and “fixed” effects models by the relationship between $u_i$ and $\mathbf{x}_{it}$. The assumption that $u_i$ is unrelated to $\mathbf{x}_{it}$, so that the conditional distribution $f(u_i \mid \mathbf{x}_{it})$ is not dependent on $\mathbf{x}_{it}$, produces the random effects model. Note that this places a restriction on the distribution of the heterogeneity.
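To make the joint probability calculation mentioned earlier concrete, here is a minimal sketch for one group under a random effects specification. It assumes, for illustration only, that $v_{it} \sim N(0,1)$ and $u_i \sim N(0,\sigma_u^2)$ are independent of each other and of $\mathbf{x}_{it}$, so that the composite disturbances $v_{it} + u_i$ are equicorrelated with correlation $\sigma_u^2/(1+\sigma_u^2)$. The function name `re_probit_joint_prob` and the numerical values are hypothetical. The probability of the observed sequence is then a $T_i$ variate normal probability, evaluated here with SciPy's `multivariate_normal.cdf`.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

def re_probit_joint_prob(y, xb, sigma_u):
    """P(y_i1, ..., y_iT | x_i) for one group.

    ASSUMPTION: v_it ~ N(0, 1), u_i ~ N(0, sigma_u^2), mutually independent.
    y  : 0/1 outcomes for the group
    xb : index values x_it' beta for the group
    """
    y = np.asarray(y)
    xb = np.asarray(xb, dtype=float)
    T = y.size
    q = 2 * y - 1                                    # +1 when y_it = 1, -1 when y_it = 0
    # Covariance of the composite disturbance v_it + u_i within the group.
    cov = np.eye(T) + sigma_u**2 * np.ones((T, T))
    # The event {y_i1, ..., y_iT} is {z_t <= q_t * xb_t for all t},
    # where z has mean zero and covariance diag(q) cov diag(q).
    cov_q = cov * np.outer(q, q)
    return multivariate_normal.cdf(q * xb, mean=np.zeros(T), cov=cov_q)

xb = np.array([0.3, -0.2, 0.8])                      # hypothetical index values
p_joint = re_probit_joint_prob([1, 0, 1], xb, sigma_u=1.0)

# With sigma_u = 0 the observations are independent and the joint
# probability reduces to a product of univariate probit terms.
p_indep = re_probit_joint_prob([1, 0, 1], xb, sigma_u=0.0)
p_product = np.prod(norm.cdf(np.array([1, -1, 1]) * xb))
```

The cost of this calculation grows with $T_i$, since each group requires a $T_i$-dimensional integral. With a freely correlated disturbance, the covariance matrix `cov` would also contain unknown parameters rather than the single $\sigma_u$ used here.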
If the distribution of the heterogeneity is unrestricted, so that $u_i$ and $\mathbf{x}_{it}$ may be correlated, then we have what is called the fixed effects model. The distinction does not relate to any intrinsic characteristic of the effect itself.
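The distinction between the two cases can be illustrated with a small simulation of the effects model. This is a sketch under assumptions made here, not part of the model: a balanced panel with a single regressor, standard normal $v_{it}$, and heterogeneity generated as $u_i = \gamma \bar{x}_i + e_i$, where $\gamma$ and the function name `simulate_effects_panel` are hypothetical. Setting $\gamma = 0$ gives heterogeneity unrelated to the regressor (the random effects case); $\gamma \neq 0$ is one example of the correlation that the fixed effects model allows.

```python
import numpy as np

def simulate_effects_panel(n, T, beta, gamma, rng):
    """Simulate y_it = 1(beta * x_it + v_it + u_i > 0) for a balanced panel.

    ASSUMPTION: u_i = gamma * mean_t(x_it) + e_i with e_i ~ N(0, 1).
    gamma = 0  -> u_i is independent of x_it (random effects case)
    gamma != 0 -> u_i is correlated with x_it (allowed under fixed effects)
    """
    x = rng.normal(size=(n, T))                      # hypothetical scalar regressor
    x_bar = x.mean(axis=1)                           # group means of x
    u = gamma * x_bar + rng.normal(size=n)           # individual heterogeneity u_i
    v = rng.normal(size=(n, T))                      # idiosyncratic disturbance v_it
    y = (beta * x + v + u[:, None] > 0).astype(int)
    return x, u, y

rng = np.random.default_rng(1)
x_re, u_re, y_re = simulate_effects_panel(5000, 4, beta=1.0, gamma=0.0, rng=rng)
x_fe, u_fe, y_fe = simulate_effects_panel(5000, 4, beta=1.0, gamma=2.0, rng=rng)

# Sample correlation between the heterogeneity and the group mean of x.
corr_re = np.corrcoef(u_re, x_re.mean(axis=1))[0, 1]
corr_fe = np.corrcoef(u_fe, x_fe.mean(axis=1))[0, 1]
```

In both simulated data sets the effect $u_i$ is a random draw. What differs is only whether its distribution depends on the regressor, which is the sense in which the distinction does not concern the nature of the effect itself.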
As we shall see shortly, this is a modeling framework that is fraught with difficulties
and unconventional estimation problems. Among them are the following: Estimation
of the random effects model requires very strong assumptions about the heterogeneity;
27 A “limited information” approach based on the GMM estimation method has been suggested by Avery, Hansen, and Hotz (1983). With recent advances in simulation-based computation of multinormal integrals (see Section 15.6.2.b), some work on such a panel data estimator has appeared in the literature. See, for example, Geweke, Keane, and Runkle (1994, 1997). The GEE estimator of Diggle, Liang, and Zeger (1994) [see also Liang and Zeger (1986) and Stata (2006)] seems to be another possibility. However, in all these cases, it must be remembered that the procedure specifies estimation of a correlation matrix for a $T_i$ vector of unobserved variables based on a dependent variable that takes only two values. We should not be too optimistic about this if $T_i$ is even moderately large.