
variance, which is often abbreviated as anova. As an example, suppose that we
want to investigate whether fungicides have an impact on the density of fungal
spores on plants. To answer this question, three experiments with fungicides
A, B, and C and a control experiment with no treatment are performed. Then,
these experiments involve what is called a factor x which has the factor levels
‘‘Fungicide A’’, ‘‘Fungicide B’’, ‘‘Fungicide C’’, and ‘‘No Fungicide’’. At each of
these factor levels, the experiment must be repeated a number of times such
that the expected values of the respective fungal spore densities are sufficiently
characterized.
The results of such an experiment can be found in the file fungicide.csv
in the book software (see Appendix A). Note that the ‘‘Factor’’ column of this
file corresponds to x, while the ‘‘Value’’ column corresponds to y (it reports the
result of the measurement, i.e. the density of the fungal spores on the plants in
an appropriate unit that we do not need to discuss here). Let $X_1$, $X_2$, $X_3$, and $X_4$
denote the random variables that have generated these data, and let $\mu_1$, $\mu_2$, $\mu_3$, and $\mu_4$
denote the expected values of these random variables. Then, we can set up a
hypothesis test as follows:
• $H_0$: $\mu_1 = \mu_2 = \mu_3 = \mu_4$
• $H_1$: There are $i, j \in \{1, 2, 3, 4\}$ such that $\mu_i \neq \mu_j$
• $\alpha = 0.05$
Basically, $H_0$ says that the factor x does not have any impact on the fungal spore
density y, while the alternative hypothesis $H_1$ is the negation of $H_0$. An appropriate
p value for this test can now be computed using the R program Anova.r in the
book software, which is based on R’s anova command. If you run this program
as described in Appendix B, it will produce a few lines of text in which you read
‘‘Pr(>F) = 0.000376’’, which means p = 0.000376. Again, the test is significant
since we have $p < \alpha$. Hence, $H_0$ can be rejected and we can say that the factor
‘‘fungicide’’ has a statistically significant impact on the fungal spore density (again,
at the 5% level).
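The test described above can be sketched in a few lines of R. The following is only an illustration, not the Anova.r program from the book software: the data are synthetic (generated with made-up means, a made-up standard deviation, and a made-up number of repetitions), and only the column names ‘‘Factor’’ and ‘‘Value’’ are taken from the description of fungicide.csv. The variable names lev, mu, n_rep, and dat are assumptions introduced for this sketch.

```r
# Synthetic stand-in for fungicide.csv: made-up numbers, NOT the book's data.
# Only the column names "Factor" and "Value" are taken from the text.
set.seed(1)
lev   <- c("Fungicide A", "Fungicide B", "Fungicide C", "No Fungicide")
mu    <- c(10, 12, 11, 15)  # hypothetical expected spore densities
n_rep <- 6                  # hypothetical number of repetitions per level
dat <- data.frame(
  Factor = factor(rep(lev, each = n_rep), levels = lev),
  Value  = rnorm(length(lev) * n_rep, mean = rep(mu, each = n_rep), sd = 1.5)
)

# Sample means per factor level, i.e. estimates of mu_1, ..., mu_4.
group_means <- tapply(dat$Value, dat$Factor, mean)
print(group_means)

# One-way analysis of variance: fit Value ~ Factor, then build the ANOVA table.
fit <- lm(Value ~ Factor, data = dat)
tab <- anova(fit)
print(tab)

# Extract the p value from the "Pr(>F)" column and compare it with alpha.
p_value   <- tab["Factor", "Pr(>F)"]
alpha     <- 0.05
reject_H0 <- p_value < alpha
cat("p =", p_value, " reject H0:", reject_H0, "\n")
```

With the real data, the synthetic data frame would be replaced by the contents of fungicide.csv, for instance via R's read.csv command, provided the file layout matches what that command expects. The printed table contains the ‘‘Pr(>F)’’ entry referred to in the text.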
Note that this analysis assumes random variables that are normally distributed
and which have homogeneous variances (i.e. squared standard deviations);
see [19, 37] for more details. The above example is called a one-way
analysis of variance or single-factor analysis of variance since it involves one
factor x only. R’s anova command and the Anova.r code can also be applied
to situations with several factors $x_1, \ldots, x_n$, which is called a multiway
analysis of variance or multifactor analysis of variance. Note that when you perform
a multiway analysis of variance using Anova.r, you will have to use a
data file which provides one column for each of the factors, and one more
column for the measurement value. What we have described so far is also
known as the fixed-effects model of the analysis of variance. Within the general
scope of the analysis of variance, a great number of different modeling
approaches can be used, for example, random effects models which assume a
hierarchy of different populations whose differences are constrained by the hierarchy [39].
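To make the multiway data layout and the assumptions mentioned above more concrete, here is a minimal sketch of a two-factor analysis in R. Everything specific in it is an assumption made for illustration: the second factor ‘‘Irrigation’’, the effect sizes, and the number of repetitions are invented, the model is purely additive (no interaction term), and it is not known whether Anova.r proceeds in the same way. The checks at the end use R's shapiro.test and bartlett.test commands as one possible way to look at normality and homogeneity of variances.

```r
# Synthetic two-factor data: made-up numbers, NOT from the book software.
set.seed(2)
dat <- expand.grid(
  Fungicide  = c("A", "B", "None"),  # hypothetical factor x1
  Irrigation = c("Low", "High"),     # hypothetical factor x2
  Replicate  = 1:5                   # hypothetical repetitions per combination
)
# Hypothetical effects: untreated plants +3, high irrigation +1.
dat$Value <- rnorm(nrow(dat), mean = 10, sd = 1) +
  ifelse(dat$Fungicide == "None", 3, 0) +
  ifelse(dat$Irrigation == "High", 1, 0)

# Data layout: one column per factor plus one column for the measurement.
print(head(dat[, c("Fungicide", "Irrigation", "Value")]))

# Two-way (additive) analysis of variance; one "Pr(>F)" entry per factor.
fit <- lm(Value ~ Fungicide + Irrigation, data = dat)
tab <- anova(fit)
print(tab)

# Rough checks of the assumptions named in the text:
# normality of the residuals ...
print(shapiro.test(residuals(fit)))
# ... and homogeneous variances across the factor-level combinations.
dat$Group <- interaction(dat$Fungicide, dat$Irrigation)
print(bartlett.test(Value ~ Group, data = dat))
```

Small p values in the two checks would indicate that the normality or homogeneity assumption is doubtful for the data at hand, in which case the p values in the ANOVA table should be read with caution.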