# Appendix D


Statistical Issues in the Evaluation of the Effects of Right-to-Carry Laws

Joel L. Horowitz

always missing. To estimate the law’s effect, one must have a way of “filling

in” the missing observation.

The discussion of this problem can be streamlined considerably by

using mathematical notation. Let i index locations (possibly counties) and t

index time periods (possibly years). Let $Y_{it}^{+}$ denote the crime rate that county i would have in year t with a right-to-carry law in effect. Let $Y_{it}^{-}$ denote the crime rate that county i would have in year t without such a law. Then the effect of the law on the crime rate is defined as $\Delta_{it} = Y_{it}^{+} - Y_{it}^{-}$ under the assumption that all other factors affecting crime are the same with or without the law. The fundamental measurement problem is that one can observe either $Y_{it}^{+}$ (if the law is in effect in county i and year t) or $Y_{it}^{-}$ (if the law is not in effect in county i and year t) but not both. Therefore, $\Delta_{it}$ can never be observed.

One possible solution to this problem consists of replacing the unobservable $\Delta_{it}$ by the difference between the crime rates after and before adoption of a right-to-carry law (in other words, carrying out a before-and-after study). For example, suppose that county i (or county i’s state) adopts a right-to-carry law in year s. Then one can observe $Y_{it}^{-}$ whenever t < s and $Y_{it}^{+}$ whenever t > s. Thus, one might consider measuring the effect of the law by (for example) $Y_{i,s+1}^{+} - Y_{i,s-1}^{-}$ (the crime rate a year after adoption

minus the crime rate a year before adoption). However, this approach has

several serious difficulties.

First, factors that affect crime other than adoption of a right-to-carry

law may change between years s – 1 and s + 1. For example, economic

conditions, levels of police activity, or conditions in drug markets may

change. If this happens, then $Y_{i,s+1}^{+} - Y_{i,s-1}^{-}$ measures the combined effect of

all of the changes that took place, not the effect of the right-to-carry law

alone. Second, $Y_{i,s+1}^{+} - Y_{i,s-1}^{-}$ can give a misleading indication of the effect of

the law’s adoption even if no other relevant factors change. For example,

suppose that crime increases each year before the law’s adoption and decreases

at the same rate each year after adoption (Figure D-1). Then $Y_{i,s+1}^{+} - Y_{i,s-1}^{-} = 0$, indicating no change in crime levels, even though the

trend in crime reversed in the year of adoption of the right-to-carry law.

Taking the difference between multiyear averages of crime levels after and

before adoption of the law would give a similarly misleading indication.

This has been pointed out by Lott (2000:135) in his response to Black and

Nagin (1998). As a third example, right-to-carry laws might be enacted in

response to crime waves that would peak and decrease even without the

laws. If this happens, then $Y_{i,s+1}^{+} - Y_{i,s-1}^{-}$ might reflect mainly the dynamics

of crime waves rather than the effects of right-to-carry laws.
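The trend-reversal case can be checked with a few lines of arithmetic. The following Python sketch uses made-up crime rates shaped like the hypothetical series in Figure D-1 (the peak value, the slope, and the adoption year `s` are illustrative assumptions, not data from any study). It suggests that both the one-year difference and the difference of multiyear averages can be zero even though the trend changes sign in the adoption year.

```python
# Hypothetical crime rates: rising through year 5, then falling at the same
# rate. All numbers here are illustrative assumptions, not real data.
def crime_rate(year, peak_year=5, peak=2.5, slope=0.25):
    return peak - slope * abs(year - peak_year)

s = 5  # assumed year of adoption
rates = {year: crime_rate(year) for year in range(1, 10)}

# One-year before-and-after comparison: rate in s + 1 minus rate in s - 1.
one_year_diff = rates[s + 1] - rates[s - 1]

# Multiyear averages, matching the note to Figure D-1 (years 1-5 vs. 5-9).
avg_before = sum(rates[y] for y in range(1, 6)) / 5
avg_after = sum(rates[y] for y in range(5, 10)) / 5

# Year-over-year changes just before and just after adoption.
change_before = rates[s] - rates[s - 1]
change_after = rates[s + 1] - rates[s]

print("one-year difference:", one_year_diff)
print("average after minus average before:", avg_after - avg_before)
print("annual change before and after:", change_before, change_after)
```

Both level comparisons come out at zero here, while the annual change switches from positive to negative, which is the pattern the text describes.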

Finally, the states that have right-to-carry laws in effect in a given year

may be systematically different from the states that do not have these laws

in effect. Indeed, Lott (2000:119) found that in his data, “states adopting


[right-to-carry] laws are relatively Republican with large National Rifle

Association memberships and low but rising rates of violent crime and

property crime.” Non-time-varying systematic differences among states are

accounted for by the fixed effects (the terms indexed by i) in Models 6.1 and 6.2 in Chapter 6.

However, if there are time-varying factors that differ systematically among

states with and without right-to-carry laws and that influence the laws’

effects on crime, then the effects of enacting these laws in states that do not

have them cannot be predicted from the experience of states that do have

them, even if the other problems just described are not present.

The foregoing problems would not arise if the counties that have right-to-carry laws could be selected randomly. Of course, this is not possible,

but consideration of the hypothetical situation in which it is possible provides

insight into the methods that are used to estimate the effects of real-world right-to-carry laws. If the counties that have right-to-carry laws in

year t are selected randomly, then there can be no systematic differences

between counties with and without these laws in year t. Consequently, the

average value of $Y_{it}^{+}$ is the same across counties in year t regardless of whether a right-to-carry law is in effect. Similarly, the average value of $Y_{it}^{-}$ is the same across counties. It follows that the average effect on crime of the right-to-carry law is the average value of $Y_{it}^{+}$ in counties with the law minus the average value of $Y_{it}^{-}$ in counties that do not have the law. In other words, the average effect is the average value of the observed crime rate in counties with the law minus the average value of the observed crime rate in counties that do not have the law.1

FIGURE D-1 Hypothetical crime rates by year. [Plot omitted: horizontal axis Year, 0 to 10; vertical axis Crime Rate, 0.5 to 2.5.]

NOTE: An increasing trend reverses in year 5, but the crime rate is the same in years 4 and 6. The average crime rate over years 5-9 is the same as it is over years 1-5.
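A small simulation may help make the role of randomization concrete. The sketch below is not based on any real data: the number of counties, the crime-rate distribution, the constant effect `TRUE_EFFECT`, and the self-selection rule are all assumptions chosen for illustration. It compares the difference in observed average crime rates when the law is assigned at random with the same difference when low-crime counties select themselves into the law.

```python
import random

random.seed(0)
N = 20000            # hypothetical number of counties
TRUE_EFFECT = -1.0   # assumed constant effect of the law on the crime rate

# Potential outcomes: crime rate without the law and with the law.
y_minus = [random.gauss(10.0, 2.0) for _ in range(N)]
y_plus = [y + TRUE_EFFECT for y in y_minus]

def diff_in_means(has_law):
    """Observed mean crime with the law minus observed mean crime without it."""
    treated = [yp for yp, d in zip(y_plus, has_law) if d]
    untreated = [ym for ym, d in zip(y_minus, has_law) if not d]
    return sum(treated) / len(treated) - sum(untreated) / len(untreated)

# Case 1: the law is assigned by a coin flip.
random_assignment = [random.random() < 0.5 for _ in range(N)]

# Case 2 (assumed selection rule): counties whose crime rate without the
# law is below 10 adopt the law; the others do not.
self_selected = [ym < 10.0 for ym in y_minus]

est_random = diff_in_means(random_assignment)
est_selected = diff_in_means(self_selected)

print("true effect:", TRUE_EFFECT)
print("estimate under random assignment:", round(est_random, 2))
print("estimate under self-selection:", round(est_selected, 2))
```

Under random assignment the simple difference in averages should land close to the assumed effect; under the assumed selection rule it mixes the effect with the preexisting difference between adopters and nonadopters.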

In the real world, the counties that have right-to-carry laws cannot be

selected randomly, but one might hope that the benefits of randomization

can be achieved by “controlling” the variables that are responsible for

“relevant” systematic differences between counties that do and do not have

right-to-carry laws. Specifically, suppose that the relevant variables are

denoted by X. Suppose further that the average value of $Y_{it}^{+}$ is the same across counties that have the same value of X, regardless of whether a right-to-carry law is in effect. Similarly, suppose that the average value of $Y_{it}^{-}$ is the same across counties that have the same value of X. If these

conditions are satisfied, then the average effect on crime of adoption of a

right-to-carry law in counties with a specified value of X is the average of

the observed crime rates in counties with the specified value of X that have

the law in place minus the average of the observed crime rates in counties

with the specified value of X that do not have the law. This is the idea on

which all of the models of Lott and his critics are based.
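The idea of comparing counties that share a value of X can be sketched with simulated data. Everything below is hypothetical: a single binary control variable `x`, the adoption probabilities, the crime-rate equation, and the constant effect `TRUE_EFFECT` are assumptions made only to illustrate the logic. In this setup `x` raises both crime and the chance of adoption, so the unadjusted comparison is distorted, while the comparison within each value of `x` is not.

```python
import random

random.seed(1)
N = 40000            # hypothetical number of counties
TRUE_EFFECT = -1.0   # assumed constant effect of the law

rows = []
for _ in range(N):
    x = random.choice([0, 1])            # hypothetical binary control variable
    p_law = 0.8 if x == 1 else 0.2       # adoption is more likely when x = 1
    law = random.random() < p_law
    y_without = 10.0 + 3.0 * x + random.gauss(0.0, 1.0)
    y = y_without + TRUE_EFFECT if law else y_without
    rows.append((x, law, y))

def mean(values):
    return sum(values) / len(values)

def diff(sample):
    """Mean observed crime with the law minus mean observed crime without it."""
    with_law = [y for _, law, y in sample if law]
    without_law = [y for _, law, y in sample if not law]
    return mean(with_law) - mean(without_law)

naive = diff(rows)
by_x = {x: diff([r for r in rows if r[0] == x]) for x in (0, 1)}

print("unadjusted difference:", round(naive, 2))
print("difference among counties with x = 0:", round(by_x[0], 2))
print("difference among counties with x = 1:", round(by_x[1], 2))
```

The within-group differences recover the assumed effect only because, by construction, `x` is the one variable that matters here; with real data there is no way to confirm that a chosen X has this property, which is the difficulty taken up next.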

The problem with this idea is that the variables that should be included

in X are unknown, and it is not possible to carry out an empirical test of

whether a proposed set of X variables is the correct one. This is because the

answer to the question whether X is a proper set of control variables depends

on the relation of X to the unobservable counterfactual outcomes

($Y_{it}^{+}$ in counties that do not have right-to-carry laws in year t and $Y_{it}^{-}$ in counties that do have the laws in year t). Thus, it is largely a matter of

opinion which set to use. A set that seems credible to one investigator may

lack credibility to another. This problem is the source of the disagreement

between Lott and his critics over Lott’s use of the arrest rate as an explanatory

variable in his models. It is also the source of other claims that Lott

may not have accounted for all relevant influences on crime. See, for example,

Ayres and Donohue (1999:464-465) and Lott’s response (Lott,

2000:213-215).2

1This conclusion—but with measures of health status in place of crime rates—forms the

justification for using randomized clinical trials to evaluate new drugs, medical devices, and

medical procedures.

2Lott and his critics use panel data in which each county is observed in each of many years.

Panel data provide a form of “automatic” control over unobserved factors that differ among

counties but are constant within each county over time. There can, however, be no assurance

that all unobserved factors that are relevant to the effectiveness of right-to-carry laws are

constant over time within counties. Nor is there any assurance that the models used by Lott

and his critics correctly represent the effects of such factors.


Lott is aware of this problem. In response, he argues that his study used

“the most comprehensive set of control variables yet used in a study of

crime, let alone any previous study on gun control” (Lott, 2000:153). There

are two problems with this argument. First, although it is true that Lott uses

a large set of control variables (his data contain over 100 variables, though

not all are used in each of his models), he is limited by the availability of

data. There is (and can be) no assurance that his data contain all relevant

variables. Second, it is possible to control for too many variables. Specifically,

suppose that there are two sets of potential explanatory variables, X and Z. Then it is possible for the average value of $Y_{it}^{+}$ to be the same among counties with the same value of X, regardless of whether a right-to-carry law is in place, whereas the average value of $Y_{it}^{+}$ among counties with the same values of X and Z depends on whether a right-to-carry law has been adopted. The same possibility applies to $Y_{it}^{-}$. In summary, it is not enough

to use a very large set of control or explanatory variables. Rather, one must

use a set that consists of just the right variables and, in general, no extra

ones.3
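One way in which controlling for an extra variable can create a spurious effect is sketched below. The setup is entirely hypothetical: the law is assigned at random and has no effect on crime, and `z` is an assumed binary variable that becomes more likely both when the law is in place and when crime is high. Comparing counties without regard to `z` should show no effect, whereas comparing counties that share the same value of `z` should show an apparent effect that is not real.

```python
import random

random.seed(2)
N = 40000  # hypothetical number of counties

rows = []
for _ in range(N):
    law = random.random() < 0.5      # random assignment; true effect is zero
    y = random.gauss(10.0, 1.0)      # crime rate, identical with or without law
    # Assumed extra variable: switched on by the law and by high crime.
    z = (y + (1.0 if law else 0.0) + random.gauss(0.0, 0.5)) > 10.5
    rows.append((law, z, y))

def mean(values):
    return sum(values) / len(values)

def diff(sample):
    """Mean crime with the law minus mean crime without it."""
    with_law = [y for law, _, y in sample if law]
    without_law = [y for law, _, y in sample if not law]
    return mean(with_law) - mean(without_law)

overall = diff(rows)
within_z = {z: diff([r for r in rows if r[1] == z]) for z in (False, True)}

print("difference ignoring z:", round(overall, 2))
print("difference among counties with z = False:", round(within_z[False], 2))
print("difference among counties with z = True:", round(within_z[True], 2))
```

In this construction the extra control is the source of the bias, which is the sense in which a larger set of control variables is not automatically a better one.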

In fact, there is evidence of uncontrolled (or, possibly, overcontrolled)

systematic differences among counties with and without right-to-carry laws

in effect. Donohue (2002: Tables 5-6) estimated models in which future

adoption of a right-to-carry law is used as an explanatory variable of crime

levels prior to the law’s adoption. He found a statistically significant relation

between crime levels and future adoption of a right-to-carry law, even

after controlling for what he calls “an array of explanatory variables.” This

result implies that there are systematic differences between adopting and

nonadopting states that are not accounted for by the explanatory variables.

In other words, there are variables that affect crime rates but are not in the

model, and it is possible that the omitted variables are the causes of any

apparent effects of adoption of right-to-carry laws.4
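The logic of using future adoption as an explanatory variable can be illustrated with a stripped-down simulation. This is not a reproduction of Donohue's models: the omitted factor `u`, the adoption rule, and the crime equation are assumptions, and no control variables are included. Crime is measured before any law is in effect, so a gap between future adopters and nonadopters cannot be an effect of the law and can only reflect the omitted factor.

```python
import random

random.seed(4)
N = 20000  # hypothetical number of counties

adopter_crime, nonadopter_crime = [], []
for _ in range(N):
    u = random.gauss(0.0, 1.0)   # omitted factor, not seen by the analyst
    # Assumed adoption rule: counties with high u are more likely to adopt later.
    future_adopter = (u + random.gauss(0.0, 1.0)) > 0.0
    # Pre-adoption crime rate: depends on u, and no law is in effect yet.
    pre_crime = 10.0 + 1.0 * u + random.gauss(0.0, 1.0)
    (adopter_crime if future_adopter else nonadopter_crime).append(pre_crime)

gap = (sum(adopter_crime) / len(adopter_crime)
       - sum(nonadopter_crime) / len(nonadopter_crime))

print("pre-adoption crime gap, future adopters minus nonadopters:", round(gap, 2))
```

A clearly nonzero gap in data like these signals that something related to adoption and to crime has been left out of the comparison.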

3Bronars and Lott (1998) and Lott (2000) have attempted to control for confounding

variables by comparing changes in crime rates in neighboring counties such that some counties

are in a state that adopted a right-to-carry law and others are in a state that did not adopt

the law. Bronars and Lott (1998) and Lott (2000) found that crime rates tend to decrease in

counties where the law was adopted and increase in neighboring counties where the law was

not adopted. The issues raised by this finding (and by any conclusion that differential changes

in crime levels in neighboring counties are caused by adoption or nonadoption of right-to-carry laws) are identical to the issues raised by the results of Lott’s main models, Models 6.1

and 6.2 in Chapter 6.

4If the explanatory variables accounted for all systematic differences in crime rates, then the average crime rate conditional on the explanatory variables would be independent of the adoption variable. Thus, future adoption of a right-to-carry law would not have any explanatory power.

Lott and Mustard (1997, Table 11) and Lott (2000:118) attempted to control for omitted variables affecting crime by carrying out a procedure called “two-stage least squares” (2SLS). However, the 2SLS estimates of the effects of right-to-carry laws on the incidence of violent crimes differ by factors of 15 to 42, depending on the crime, from the estimates in Lott’s Table 4.1 and are implausibly large. For example, according to the 2SLS estimates reported by Lott and Mustard (1997, Table 11), adoption of right-to-carry laws reduces all violent crimes by 72 percent, murders by 67 percent, and aggravated assaults by 73 percent. 2SLS works by using explanatory variables called instruments to control the effects of any missing variables. A valid instrument must be correlated with the variable indicating the presence or absence of a right-to-carry law but otherwise unrelated to fluctuations in crime that are not explained by the covariates of the model. In Lott and Mustard (1997) and Lott (2000), the instruments include levels and changes in levels of crime rates and are, by definition, correlated with the dependent variables of the models. Thus, they are unlikely to be valid instruments. It is likely, therefore, that Lott’s and Mustard’s 2SLS estimates are artifacts of the use of invalid instruments and other forms of specification errors.

There is also evidence that estimates of the effects of these laws are sensitive to the choice of explanatory variables. See, for example, the discussion of Table 6-5 in Chapter 6. Thus, the choice of explanatory variables matters. As has already been explained, there is and can be no empirical test for whether a proposed set of explanatory variables is correct. There is little prospect for achieving an empirically supportable agreement on the right set of variables. For this reason, in addition to the goodness-of-fit problems that are discussed next, it is unlikely that there can be an empirically based resolution of the question of whether Lott has reached the correct conclusions about the effects of right-to-carry laws on crime.5

5The problem of not knowing the correct set of explanatory variables is pervasive in evaluation of the effects of public policy measures. The sensitivity of estimated results to the choice of variables and the inability to resolve controversies over which variables should be used has led to the use of randomized experiments to evaluate social programs, such as job training and income maintenance.

ESTIMATING THE RELATION AMONG CRIME RATES, THE EXPLANATORY VARIABLES, AND ADOPTION OF RIGHT-TO-CARRY LAWS

This section discusses the problem of estimating the average crime rate in counties that have the same values of a set of explanatory variables X and that have (or do not have) right-to-carry laws in effect. Specifically, let $Z_{it} = 1$ if county i has a right-to-carry law in effect in year t, and let $Z_{it} = 0$ if county i does not have such a law in year t. Let $Y_{it}$ denote the crime rate (or its logarithm) in county i and year t, regardless of whether a right-to-carry law is in effect. The objective in this section is to estimate the average values of $Y_{it}$ conditional on $Z_{it} = 1$ and $Y_{it}$ conditional on $Z_{it} = 0$ for counties in which the explanatory variables X have the same values, say $X = X_0$. Denote these averages by $E(Y_{it} \mid Z_{it} = 1, X_0)$ and $E(Y_{it} \mid Z_{it} = 0, X_0)$, respectively. $E(Y_{it} \mid Z_{it} = 1, X_0)$ is the average crime rate in year t in counties that have right-to-carry laws and whose explanatory variables have the values $X_0$. $E(Y_{it} \mid Z_{it} = 0, X_0)$ is the average crime rate in year t in counties that do not have right-to-carry laws and whose explanatory variables have the values $X_0$. If the explanatory variables control for all other factors that are relevant to the crime rate, then $D_t(X_0) = E(Y_{it} \mid Z_{it} = 1, X_0) - E(Y_{it} \mid Z_{it} = 0, X_0)$ is the average change in the crime rate caused by the law in year t in counties where the values of the explanatory variables are $X_0$.

The models of Lott and his critics are all aimed at estimating $D_t(X_0)$ for some set of explanatory variables X. This section discusses the statistical issues that are involved in estimating $D_t(X_0)$. The discussion focuses on the problem of estimating the function $D_t$ for a given set of explanatory

variables. This issue is distinct from and independent of the problem of

choosing the explanatory variables that was discussed in the previous section.

Thus, the discussion in this section does not depend on whether there

is agreement on a “correct” set of explanatory variables.

Estimating $D_t(X_0)$ is relatively simple if in year t there are many counties with right-to-carry laws and the same values $X_0$ of the explanatory variables and many counties without right-to-carry laws and identical values $X_0$ of the explanatory variables. $D_t(X_0)$ would then be the average of

the observed crime rate in the counties that do have right-to-carry laws

minus the average crime rate in counties that do not have such laws. However,

there are not many counties with the same values of the explanatory

variables. Indeed, in the data used by Lott and his critics, each county has

unique values of the explanatory variables. Therefore, the simple averaging

procedure cannot be used. Instead, $D_t(X_0)$ must be inferred from observations of crime rates among counties with a range of values of X. In other

words, it is necessary to estimate the relation between average crime rates

and the values of the explanatory variables.

In principle, the relations between average crime rates and the explanatory

variables with and without a right-to-carry law in effect can be

estimated without making any assumptions about their shapes. This is

called nonparametric estimation. Härdle (1990) provides a detailed discussion

of nonparametric estimation methods. Nonparametric estimation is

highly flexible and largely eliminates the possibility that the estimated

model may not fit the data, but it has the serious drawback that the size of

the data set needed to obtain estimates that are sufficiently precise to be

useful increases very rapidly as the number of explanatory variables increases.

This is called the curse of dimensionality. Because of it, nonparametric

estimation is a practical option only in situations in which there are

few explanatory variables. It is not a practical option in situations like

estimation of the effects of right-to-carry laws, where there can be 50 or

more explanatory variables.
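A rough calculation gives a feel for how quickly the data requirement grows. The sketch below rests on simplifying assumptions that are not part of the studies discussed here: each explanatory variable is taken to be uniformly distributed on the unit interval and independent of the others, "nearby" observations are those falling in a cube whose side covers 20 percent of each variable's range, and the sample size `N_OBS` is a hypothetical round number.

```python
def expected_neighbors(n_obs, n_vars, side=0.2):
    """Expected number of observations inside a cube of the given side length
    when n_obs points are spread uniformly over the unit cube in n_vars dimensions."""
    return n_obs * side ** n_vars

def sample_size_needed(n_vars, target_neighbors=30, side=0.2):
    """Sample size at which the expected number of nearby observations
    reaches target_neighbors."""
    return target_neighbors / side ** n_vars

N_OBS = 50_000  # hypothetical number of county-year observations

for n_vars in (1, 2, 5, 10, 50):
    print(
        n_vars,
        "variables: expected neighbors =",
        expected_neighbors(N_OBS, n_vars),
        "| sample needed for 30 neighbors =",
        sample_size_needed(n_vars),
    )
```

Under these assumptions the local neighborhood that a nonparametric estimator averages over is well populated with one or two variables and effectively empty with 50.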

Because of the problems posed by the curse of dimensionality, the most

frequently used methods for estimation with a large number of explanatory

variables assume that the relation to be estimated belongs to a relatively

small class of “shapes.”6 For example, Models 6.1 and 6.2 assume that the

average of the logarithm of the crime rate is a linear function of the variables

comprising X. Lott and his critics all restrict the shapes of the relations

they estimate. Doing this greatly increases estimation precision, but it

creates the possibility that the true relation of interest does not have the

assumed shape. That is, the estimated model may not fit the data. This is

called misspecification. Moreover, because the set of possible shapes increases

as the number of variables in X increases, the opportunities for

misspecification also increase. This is another form of the curse of dimensionality.

Its practical consequence is that one should not be surprised if a

simple class of models (or shapes) such as linear models fails to fit the data.

Lack of fit is a serious concern because it can cause estimation results to

be seriously misleading. An example based on an article that was published

in the National Review (Tucker 1987) illustrates this problem. The example

consists of estimating the relation between the fraction of a city’s

population who are homeless, the vacancy rate in the city, an indicator of

whether the city has rent control, and several other explanatory variables.

Two models are estimated:

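(D.1) $\mathrm{FRAC} = \beta_0 + \beta_1 \mathrm{RENT} + \beta_2 \mathrm{VAC} + \gamma X + \varepsilon$

and

(D.2) $\mathrm{FRAC} = \beta_0 + \beta_1 \mathrm{RENT} + \beta_2 (1/\mathrm{VAC}) + \gamma X + \varepsilon$,

where FRAC denotes the number of homeless per 1,000 population in a city, RENT is an indicator of whether a city has rent control (RENT = 1 if a city has rent control and RENT = 0 otherwise), VAC denotes the vacancy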

rate, and X denotes the other explanatory variables. The data are taken

from Tucker (1987). The estimation results are summarized in Table D-1.

6More precisely, the problem is to estimate a conditional mean function (e.g., the mean of the logarithm of the crime rate conditional on the explanatory variables and the indicator of whether a right-to-carry law is in effect). Nonparametric estimation places no restrictions on the specification or “shape” of this function but suffers from the curse of dimensionality. The estimation methods in common use, including those used by Lott and his critics, assume that the conditional mean function belongs to a relatively small class of functions, such as linear functions of the variables or functions that are linear in the original variables and products of pairs of the original variables.

According to Model D.1, there is a statistically significant relation between the fraction of homeless and the indicator of rent control (p < 0.05) but not between homelessness and the vacancy rate (p > 0.10). Moreover, according to Model D.1, the fraction of homeless is higher in cities that have rent control than it is in cities that do not have rent control. This result is consistent with the hypothesis that rent control is a cause of

homelessness (possibly because it creates a shortage of rental units) and that

the vacancy rate is unrelated to homelessness. However, Model D.2 gives

the opposite conclusion. According to this model, there is a statistically

significant relation between the fraction of homeless and the vacancy rate

(p < 0.05) but not between homelessness and rent control (p > 0.10).

Moreover, according to Model D.2, the fraction of homeless decreases as

the vacancy rate increases. Thus, the results of estimation in Model D.2 are

consistent with the hypothesis that a low vacancy rate contributes to

homelessness but rent control does not. In other words, Model D.1 and

Model D.2 yield opposite conclusions about the effects of rent control and

the vacancy rate on homelessness. In addition, it is not possible for both of

the models to fit the data, although it is possible for neither to fit. Therefore,

misspecification or lack of fit is causing at least one of the models to

give a misleading indication of the effect of rent control and the vacancy

rate on homelessness.
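The significance statements for the two models can be checked against the coefficients and standard errors reported in Table D-1. The sketch below divides each coefficient by its standard error and compares the ratio with the two-sided normal critical values 1.96 (5 percent) and 1.645 (10 percent). Using the normal approximation rather than the exact reference distribution is an assumption made here for simplicity.

```python
# (coefficient, standard error) pairs as reported in Table D-1.
table_d1 = {
    "D.1": {"RENT": (3.17, 1.51), "VAC": (-0.26, 0.16)},
    "D.2": {"RENT": (-1.65, 3.11), "1/VAC": (18.89, 8.15)},
}

CRIT_5_PERCENT = 1.96    # two-sided normal critical value, 5 percent level
CRIT_10_PERCENT = 1.645  # two-sided normal critical value, 10 percent level

ratios = {
    model: {name: coef / se for name, (coef, se) in terms.items()}
    for model, terms in table_d1.items()
}

for model, terms in ratios.items():
    for name, ratio in terms.items():
        if abs(ratio) > CRIT_5_PERCENT:
            verdict = "significant at the 5 percent level"
        elif abs(ratio) < CRIT_10_PERCENT:
            verdict = "not significant at the 10 percent level"
        else:
            verdict = "between the 5 and 10 percent thresholds"
        print(model, name, round(ratio, 2), verdict)
```

The ratios for RENT in Model D.1 and for 1/VAC in Model D.2 exceed 1.96, while the other two fall below 1.645, which agrees with the pattern of p-values described in the text. The positive coefficient on 1/VAC also matches the statement that homelessness falls as the vacancy rate rises.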

It is possible to carry out statistical tests for lack of fit. None of the

models examined by the committee passes a simple specification test called

RESET (Ramsey, 1969). That is, none of the models fits the data. This

raises the question whether a model that fits the data can be found. For

example, by estimating and testing a large number of models, it might be

possible to find one that passes the RESET test. This is called a specification

search. However, a specification search cannot circumvent the curse of

dimensionality. If the search is carried out informally (that is, without a

statistically valid search procedure and stopping rule), as is usually the case

in applications, then it invalidates the statistical theory on which estimation

and inference are based. The results of the search may be misleading, but

because the relevant statistical theory no longer applies, it is not possible to

test for a misleading result. Alternatively, one can carry out a statistically

valid search that is guaranteed to find the correct model in a sufficiently

large sample. However, this is a form of nonparametric regression, and

therefore it suffers the lack of precision that is an unavoidable consequence

of the curse of dimensionality. Therefore, there is little likelihood of identifying a well-fitting model with existing data and statistical methods.7 In summary, the problems posed by high-dimensional estimation, misspecified models, and lack of knowledge of the correct set of explanatory variables seem insurmountable with observational data.

TABLE D-1 Results of Estimating a Model of the Fraction of Homeless in a City (quantities in parentheses are standard errors)

| Model | Coefficient of RENT | Coefficient of VAC or 1/VAC |
| --- | --- | --- |
| (D.1) | 3.17 (1.51) | –0.26 (0.16) |
| (D.2) | –1.65 (3.11) | 18.89 (8.15) |
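The idea behind the RESET test mentioned above can be sketched in a few dozen lines. This is a simplified illustration written with the Python standard library, not the procedure the committee applied: the simulated data, the sample size, and the choice to add the second and third powers of the fitted values are assumptions. The fitted values are standardized before being raised to powers purely for numerical stability, which leaves the test statistic unchanged because the model already contains an intercept and the fitted values themselves. The true relation in the simulated data is linear in 1/x, so a model that is linear in x is misspecified.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def ols(columns, y):
    """Least squares of y on the regressor columns; returns fitted values and RSS."""
    k = len(columns)
    xtx = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(p * q for p, q in zip(columns[i], y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(beta[j] * columns[j][i] for j in range(k)) for i in range(len(y))]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return fitted, rss

def reset_f(columns, y):
    """F statistic for adding squared and cubed fitted values to the model."""
    n, k = len(y), len(columns)
    fitted, rss_restricted = ols(columns, y)
    center = sum(fitted) / n
    scale = (sum((f - center) ** 2 for f in fitted) / n) ** 0.5
    z = [(f - center) / scale for f in fitted]   # standardized fitted values
    extra = [[v ** 2 for v in z], [v ** 3 for v in z]]
    _, rss_full = ols(columns + extra, y)
    q = len(extra)
    return ((rss_restricted - rss_full) / q) / (rss_full / (n - k - q))

random.seed(3)
n = 300
x = [random.uniform(1.0, 10.0) for _ in range(n)]
y = [5.0 + 10.0 / xi + random.gauss(0.0, 0.5) for xi in x]  # truth uses 1/x
ones = [1.0] * n

f_linear = reset_f([ones, x], y)                          # misspecified shape
f_inverse = reset_f([ones, [1.0 / xi for xi in x]], y)    # correct shape

print("RESET F statistic, model linear in x:", round(f_linear, 1))
print("RESET F statistic, model linear in 1/x:", round(f_inverse, 1))
```

With two added terms and a sample of this size, the 5 percent critical value of the F statistic is roughly 3.0, so a value far above that for the model that is linear in x indicates lack of fit. Passing the test in one simulated data set does not address the specification-search problem described above, since searching over many models changes the properties of the test.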