disease is denoted by $S^c$). The terminology introduced in the previous section is that the
probability $P(T \mid S)$ (the test is positive if the subject has the disease) is the test sensitivity,
whereas the probability $P(T^c \mid S^c)$ (the test is negative in a subject without the disease) is the
test specificity.
Specificity and sensitivity may be interesting properties of the test, but what matters to
the individual is how likely it is that he or she has the disease when the test is positive (and, in fact, also
when the test is negative). This probability, $P(S \mid T)$, is called the predictive value of the test. The
test benefits the subject if it correctly predicts that he or she has the disease (provided that this
information is useful to have). The test harms the subject when it wrongly says that he or she has
the disease, because he or she may then undergo some risky surgery or treatment unnecessarily.
From the perspective of the individual, a large predictive value is the key ingredient of a
screening test.
However, the predictive value of a screening test is not computable from knowledge of the
sensitivity and specificity alone. It also requires knowledge of the disease prevalence, that is, the
proportion of the population with the disease. Let us continue the example with the serum GT
test using a cut-off limit of 2.1 μkat/L as a screening test for alcoholism. Recall that the test had
a sensitivity of 0.45, meaning that 45% of all alcoholics have a positive test, and a specificity
of 0.90, meaning that only 10% of non-alcoholics end up with a positive test. Now assume that
we have 10% alcoholics in the population. To compute the predictive value, consider a sample
of 1000 people from the population. We expect 100 of them to be alcoholics, contributing 45
positive tests. At the same time we have 900 non-alcoholics, contributing 90 positive tests. It
follows that only 45/(45 + 90), or one third, of the positive tests come from alcoholics. From the perspective
of the screened individual, there is a fair risk that the wrong conclusion is drawn.
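The counting argument above can be reproduced in a few lines of Python. This is only a minimal sketch: the function name `expected_counts` and its parameter names are illustrative assumptions, not notation from the text, and the counts are expected values rather than observed data.

```python
def expected_counts(n, prevalence, sensitivity, specificity):
    """Expected outcome of screening n people (counts may be fractional)."""
    diseased = n * prevalence            # e.g. alcoholics in the sample
    healthy = n - diseased               # e.g. non-alcoholics in the sample
    true_pos = sensitivity * diseased    # diseased subjects with a positive test
    false_neg = diseased - true_pos      # diseased subjects missed by the test
    true_neg = specificity * healthy     # healthy subjects with a negative test
    false_pos = healthy - true_neg       # healthy subjects with a positive test
    return {
        "true_pos": true_pos,
        "false_neg": false_neg,
        "true_neg": true_neg,
        "false_pos": false_pos,
    }


# Serum GT example: 1000 people, 10% prevalence, sensitivity 0.45, specificity 0.90
counts = expected_counts(1000, 0.10, 0.45, 0.90)

# Predictive value: share of all positive tests that come from alcoholics
ppv = counts["true_pos"] / (counts["true_pos"] + counts["false_pos"])
print(f"positive tests from alcoholics: {counts['true_pos']:.0f}")
print(f"positive tests from non-alcoholics: {counts['false_pos']:.0f}")
print(f"predictive value: {ppv:.2f}")
```

Changing the `prevalence` argument while keeping the other two fixed shows directly why sensitivity and specificity alone do not determine the predictive value.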
Under the same assumption, we can compute the probability that the subject is not an
alcoholic if the test is negative. Arguing as above, our 1000 subjects produce 55 negative tests
among the alcoholics and 0.9 · 900 = 810 among the non-alcoholics, so the probability that a
negative test comes from someone who is not an alcoholic is 810/(810 + 55) = 0.94. In other
words, only 6% of the people with a negative test are alcoholics, namely those who were
missed by the test.
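The same calculation for a negative test can be written in terms of proportions instead of counts. The sketch below assumes the hypothetical helper name `negative_predictive_value`; the term "negative predictive value" is standard usage but is not introduced in the text above.

```python
def negative_predictive_value(prevalence, sensitivity, specificity):
    """Probability that a subject with a negative test does not have the disease."""
    # Proportion of the population that is disease-free and tests negative
    true_neg = specificity * (1 - prevalence)
    # Proportion of the population that has the disease but tests negative
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


# Serum GT example: 10% prevalence, sensitivity 0.45, specificity 0.90
npv = negative_predictive_value(0.10, 0.45, 0.90)
print(f"P(not alcoholic | negative test) = {npv:.2f}")
print(f"P(alcoholic | negative test)     = {1 - npv:.2f}")
```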
For these reasons a test used in a screening program, especially for a disease with low
prevalence, must have good specificity in addition to acceptable sensitivity. But even that may
not be sufficient. Suppose that we have a test for a particular drug, say a narcotic, that has
both a specificity and a sensitivity of 99%. If a big company starts to test all its employees
for this drug, a drug that is actually used by only 1% of them, we see that for a subject with
a positive test, there is only a 50–50 chance that he or she actually is a drug user. Is that an acceptable
proportion of false positives? The calculations performed above are formalized in mathematical terms
in what is called Bayes’ theorem (see Box 4.2), which forms the foundation of Bayesian
statistics (to be discussed later in this chapter).
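As an illustration of the drug-testing example, the sketch below evaluates the predictive value with the standard form of Bayes' theorem, P(S|T) = P(T|S)P(S) / [P(T|S)P(S) + P(T|S^c)P(S^c)]. Box 4.2 is not reproduced here, so this form is assumed rather than quoted. The function name `positive_predictive_value` and the prevalence values other than 1% are illustrative assumptions.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """P(S|T) from Bayes' theorem: probability of disease given a positive test."""
    # P(T|S) * P(S): positive test and disease present
    true_pos = sensitivity * prevalence
    # P(T|S^c) * P(S^c): positive test although disease is absent
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Drug test with sensitivity and specificity both 99%, at several prevalences.
# Only the 1% case comes from the text; the others are added for comparison.
for prevalence in (0.001, 0.01, 0.10, 0.50):
    ppv = positive_predictive_value(prevalence, 0.99, 0.99)
    print(f"prevalence {prevalence:.3f}: P(S|T) = {ppv:.3f}")
```

At a prevalence of 1% the result is one half, matching the 50–50 chance described above, and it falls further as the prevalence decreases.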
4.4 The law: a non-quantitative analogue
What is the counterpart of all this for the legal problem? In the legal setting, the test diagnostic is
some measure of ‘appearance of guilt’. If this could be quantified, it would have a distribution
in the population. Innocent people may appear guilty to some degree; they may just happen
to be in the wrong place at the wrong time, or they can have a history of similar crimes. The
amount of guilt appearance therefore varies between individuals and defines a distribution