9.5.D Presentation of Results
Now that we know how to calculate the errors associated with parameters from the errors
in individual measurements, we should discuss how to present our final results. We
saw earlier that there are basically two classes of errors: systematic and random.
Though it is common practice to combine the two errors in the final result, a better
approach, adopted by many careful experimenters, is to state them explicitly and
separately. For example, the result of an experiment might be represented
at 1σ confidence as
ξ = 205.43 ± 6.13 (syst) ± 14.36 (rand),
where the labels syst and rand stand for systematic and random errors, respectively.
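As a sketch of this reporting convention, the two error components can be carried and quoted separately in code. The helper below is purely illustrative (its name and formatting are not from the text); it also allows the random error to be rescaled to a higher σ-level, a point taken up later in this section.

```python
def format_result(value, syst, rand, sigma_level=1):
    """Format a measured value with the systematic and random errors
    quoted separately, e.g. 205.43 +/- 6.13 (syst) +/- 14.36 (rand).

    Only the random error scales with the chosen sigma level; the
    systematic error, not being statistical, stays fixed.
    """
    return (f"{value:.2f} +/- {syst:.2f} (syst) "
            f"+/- {sigma_level * rand:.2f} (rand)")

print(format_result(205.43, 6.13, 14.36))     # the 1-sigma quote above
print(format_result(205.43, 6.13, 14.36, 3))  # a 3-sigma quote
```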
A word of caution here. Looking at the above numbers, one might naively
conclude that all the values would lie between 205.43 − 6.13 − 14.36 and
205.43 + 6.13 + 14.36. This is not really true. Earlier in the chapter we discussed
confidence intervals and saw that, for normally distributed data, a 1σ uncertainty
guarantees with only about 68% confidence that the result lies within the given
values (that is, between ξ̄ − σ and ξ̄ + σ). For higher confidence, one must increase
the σ-level. For example, for 99% confidence, the above result would have to be
written as
ξ = 205.43 ± 6.13 (syst) ± 43.08 (rand),
where we have multiplied the 1σ random error by a factor of 3. Note that, since the
systematic uncertainty does not arise from statistical fluctuations, there is no need
to multiply it by any factor. Now we can say with 99% confidence that the value of
the parameter lies between 205.43 − 6.13 − 43.08 and 205.43 + 6.13 + 43.08.
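The coverage probabilities and σ multipliers quoted here can be checked directly from the standard normal distribution, for example with Python's `statistics.NormalDist`. (Strictly, a 3σ interval corresponds to about 99.7% coverage, while exactly 99% requires about 2.58σ; the factor of 3 used above is a convenient round figure.)

```python
from statistics import NormalDist

nd = NormalDist()  # standard normal distribution

def coverage(n_sigma):
    """Probability that a normally distributed result lies within
    +/- n_sigma standard deviations of the mean."""
    return nd.cdf(n_sigma) - nd.cdf(-n_sigma)

def sigma_for(confidence):
    """Number of standard deviations giving a two-sided interval
    with the requested confidence level."""
    return nd.inv_cdf(0.5 + confidence / 2)

print(f"1 sigma covers {coverage(1):.4f}")        # ~0.6827
print(f"3 sigma covers {coverage(3):.4f}")        # ~0.9973
print(f"99% needs {sigma_for(0.99):.3f} sigma")   # ~2.576
```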
9.6 Confidence Tests
Computing various quantities from a data set obtained from an experiment is
helpful in understanding the characteristics of the system, but if we have a certain
bias about the behavior of the system we might also want to judge the data against
our hypothesis. This judgment can be qualitative, such as a visual impression of how
the data look with respect to the expectation, or quantitative, which is the
subject of the discussion here.
To judge a data sample quantitatively against a hypothesis, we perform a so-called
confidence or goodness-of-fit test. For this we first define a goodness-of-fit
statistic that takes into account both the data and the hypothesis. The idea is
to have a quantity whose probability of occurrence tells us about the level of
agreement between the data and the hypothesis. The choice of this statistic is, of
course, arbitrary, but several standard functions have been developed that can be
applied in most cases. Before we look at some of these functions, let us first see
how the general procedure works.
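The general procedure can be sketched with a toy Monte Carlo: simulate the statistic t many times under the hypothesis h, and estimate p as the fraction of simulated values at least as large as the observed t0. The specific statistic below (a sum of squared standardized residuals for five measurements, which under h follows a chi-square distribution with five degrees of freedom) is a hypothetical choice for illustration only.

```python
import random

def p_value(t_obs, simulate_t, n_trials=100_000, seed=1):
    """Monte Carlo estimate of p = P(t >= t_obs | h): the fraction of
    statistics simulated under the hypothesis h that are at least as
    large as the experimentally observed value."""
    rng = random.Random(seed)
    exceed = sum(simulate_t(rng) >= t_obs for _ in range(n_trials))
    return exceed / n_trials

def simulate_t(rng, n_points=5):
    """Hypothetical goodness-of-fit statistic: sum of squared
    standardized residuals of n_points independent measurements.
    Under h, each residual is standard normal."""
    return sum(rng.gauss(0, 1) ** 2 for _ in range(n_points))

t0 = 11.07  # illustrative observed value of the statistic
print(f"p = {p_value(t0, simulate_t):.3f}")  # ~0.05 for chi-square(5)
```

A small p indicates that a value of t as large as the one observed would rarely occur if h were true, i.e. poor agreement between data and hypothesis.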
Let us represent the goodness-of-fit statistic by t, such that large values of t
correspond to poor agreement with the hypothesis h. Then the p.d.f. g(t|h) can be
used to determine the probability p of finding t in a region starting from the
experimentally obtained value t0 up to the maximum. This is equivalent to evaluating the