If we interpret the estimated equation causally, it implies that an increase in the unem-
ployment rate lowers the crime rate. This is certainly not what we expect. The coefficient
on unem is not statistically significant at standard significance levels: at best, we have
found no link between crime and unemployment rates.
As we have emphasized throughout this text, this simple regression equation likely
suffers from omitted variable problems. One possible solution is to try to control for more
factors, such as age distribution, gender distribution, education levels, law enforcement
efforts, and so on, in a multiple regression analysis. But many factors might be hard to con-
trol for. In Chapter 9, we showed how including the crmrte from a previous year—in this
case, 1982—can help to control for the fact that different cities have historically different
crime rates. This is one way to use two years of data for estimating a causal effect.
An alternative way to use panel data is to view the unobserved factors affecting the
dependent variable as consisting of two types: those that are constant and those that vary
over time. Letting i denote the cross-sectional unit and t the time period, we can write a
model with a single observed explanatory variable as
$$y_{it} = \beta_0 + \delta_0 \mathit{d2}_t + \beta_1 x_{it} + a_i + u_{it}, \quad t = 1, 2. \tag{13.13}$$
In the notation $y_{it}$, i denotes the person, firm, city, and so on, and t denotes the time period.
The variable $\mathit{d2}_t$ is a dummy variable that equals zero when t = 1 and one when t = 2;
it does not change across i, which is why it has no i subscript. Therefore, the intercept for
t = 1 is $\beta_0$, and the intercept for t = 2 is $\beta_0 + \delta_0$. Just as in using independently pooled
cross sections, allowing the intercept to change over time is important in most applica-
tions. In the crime example, secular trends in the United States will cause crime rates in
all U.S. cities to change, perhaps markedly, over a five-year period.
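To make the role of the period dummy concrete, the model in (13.13) can be written out separately for each period. Here $\beta_0$ is the first-period intercept, $\delta_0$ is the coefficient on the period dummy, and $\beta_1$ is the slope on $x_{it}$. This is only a restatement of (13.13) using the definition of the dummy, not an additional assumption:

```latex
\begin{align*}
t = 1:&\quad y_{i1} = \beta_0 + \beta_1 x_{i1} + a_i + u_{i1}, \\
t = 2:&\quad y_{i2} = (\beta_0 + \delta_0) + \beta_1 x_{i2} + a_i + u_{i2}.
\end{align*}
```

The slope is the same in both periods; only the intercept shifts, by $\delta_0$.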
The variable $a_i$ captures all unobserved, time-constant factors that affect $y_{it}$. (The fact
that $a_i$ has no t subscript tells us that it does not change over time.) Generically, $a_i$ is called
an unobserved effect. It is also common in applied work to find $a_i$ referred to as a fixed
effect, which helps us to remember that $a_i$ is fixed over time. The model in (13.13) is
called an unobserved effects model or a fixed effects model. In applications, you might
see $a_i$ referred to as unobserved heterogeneity as well (or individual heterogeneity, firm
heterogeneity, city heterogeneity, and so on).
The error $u_{it}$ is often called the idiosyncratic error or time-varying error, because it
represents unobserved factors that change over time and affect $y_{it}$. These are very much
like the errors in a straight time series regression equation.
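The pieces of the unobserved effects model can be illustrated with a small simulation. The sketch below is hypothetical throughout: the sample size, the parameter values, and the way the explanatory variable is made to depend on the unobserved effect are all assumptions chosen for illustration, not quantities from the crime data. It generates two periods of data from a model of the form (13.13), with intercept `beta0`, dummy coefficient `delta0`, and slope `beta1`, and then runs pooled OLS that leaves the unobserved effect in the error term.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 5000                               # number of cross-sectional units (hypothetical)
beta0, delta0, beta1 = 1.0, 0.5, 2.0   # illustrative parameter values, not estimates

a = rng.normal(size=n)                 # unobserved effect a_i: one draw per unit, fixed over time

# Stack the two periods: rows 0..n-1 are t = 1, rows n..2n-1 are t = 2.
d2 = np.repeat([0.0, 1.0], n)            # period dummy, identical for every i
a_it = np.tile(a, 2)                     # a_i repeated unchanged in both periods
x = 0.8 * a_it + rng.normal(size=2 * n)  # x_it correlated with a_i by construction (assumption)
u = rng.normal(size=2 * n)               # idiosyncratic error u_it, varies over i and t

y = beta0 + delta0 * d2 + beta1 * x + a_it + u

# Pooled OLS of y on a constant, d2, and x; a_i is omitted and ends up in the error.
X = np.column_stack([np.ones(2 * n), d2, x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
b0_hat, d0_hat, b1_hat = coef

print(f"true beta1 = {beta1}, pooled OLS slope = {b1_hat:.3f}")
print(f"true delta0 = {delta0}, pooled OLS dummy coefficient = {d0_hat:.3f}")
```

Because the simulated explanatory variable is correlated with the unobserved effect, the pooled OLS slope should land noticeably above the true value of `beta1`, which is the omitted variable problem in miniature. The dummy coefficient is not affected in the same way here, since the period dummy is unrelated to the unobserved effect in this setup.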
A simple unobserved effects model for city crime rates for 1982 and 1987 is
$$\mathit{crmrte}_{it} = \beta_0 + \delta_0 \mathit{d87}_t + \beta_1 \mathit{unem}_{it} + a_i + u_{it}, \tag{13.14}$$
where d87 is a dummy variable for 1987. Since i denotes different cities, we call $a_i$ an
unobserved city effect or a city fixed effect: it represents all factors affecting city crime
rates that do not change over time. Geographical features, such as the city’s location in
the United States, are included in $a_i$. Many other factors may not be exactly constant, but
they might be roughly constant over a five-year period. These might include certain demo-
graphic features of the population (age, race, and education). Different cities may have
Chapter 13 Pooling Cross Sections across Time: Simple Panel Data Methods 461