
mean = 535    median = 500    mode = 500
sd = 96    minimum = 220    maximum = 925
5th percentile = 400    10th percentile = 430
90th percentile = 640    95th percentile = 720
What can you conclude about the shape of a his-
togram of this data? Explain your reasoning.
79. The sample data $x_1, x_2, \ldots, x_n$ sometimes
represents a time series, where $x_t$ = the observed
value of a response variable x at time t. Often the
observed series shows a great deal of random
variation, which makes it difficult to study
longer-term behavior. In such situations, it is
desirable to produce a smoothed version of the
series. One technique for doing so involves
exponential smoothing. The value of a smoothing
constant $\alpha$ is chosen ($0 < \alpha < 1$). Then with
$\bar{x}_t$ = smoothed value at time t, we set
$\bar{x}_1 = x_1$, and for $t = 2, 3, \ldots, n$,
$\bar{x}_t = \alpha x_t + (1 - \alpha)\bar{x}_{t-1}$.
a. Consider the following time series in which
$x_t$ = temperature (°F) of effluent at a sewage
treatment plant on day t: 47, 54, 53, 50, 46, 46,
47, 50, 51, 50, 46, 52, 50, 50. Plot each $x_t$
against t on a two-dimensional coordinate system
(a time-series plot). Does there appear to
be any pattern?
b. Calculate the $\bar{x}_t$'s using $\alpha = .1$. Repeat
using $\alpha = .5$. Which value of $\alpha$ gives a
smoother $\bar{x}_t$ series?
c. Substitute
$\bar{x}_{t-1} = \alpha x_{t-1} + (1 - \alpha)\bar{x}_{t-2}$
on the right-hand side of the expression for
$\bar{x}_t$, then substitute $\bar{x}_{t-2}$ in terms of
$x_{t-2}$ and $\bar{x}_{t-3}$, and so on. On how many of
the values $x_t, x_{t-1}, \ldots, x_1$ does $\bar{x}_t$
depend? What happens to the coefficient on
$x_{t-k}$ as k increases?
d. Refer to part (c). If t is large, how sensitive is
$\bar{x}_t$ to the initialization $\bar{x}_1 = x_1$? Explain.
[Note: A relevant reference is the article “Simple
Statistics for Interpreting Environmental Data,”
Water Pollution Contr. Fed. J., 1981: 167–175.]
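The smoothing recursion described in Exercise 79 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `exponential_smooth` and the parameter name `alpha` (for the smoothing constant) are assumptions made here, not notation from the text. The effluent temperatures are the ones listed in part (a).

```python
def exponential_smooth(x, alpha):
    """Return the exponentially smoothed series for the observations x.

    The first smoothed value is set equal to the first observation;
    each later value is alpha * (current observation)
    + (1 - alpha) * (previous smoothed value).
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    if len(x) == 0:
        return []
    smoothed = [x[0]]  # initialization: first smoothed value = first observation
    for value in x[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


# Effluent temperatures from part (a), one value per day.
temps = [47, 54, 53, 50, 46, 46, 47, 50, 51, 50, 46, 52, 50, 50]

for alpha in (0.1, 0.5):
    series = exponential_smooth(temps, alpha)
    print(alpha, [round(v, 2) for v in series])
```

Printing the two smoothed series next to each other, or plotting them against t, is one way to compare how strongly each choice of the smoothing constant damps the day-to-day variation.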
80. Consider numerical observations $x_1, \ldots, x_n$.
It is frequently of interest to know whether the
$x_i$'s are (at least approximately) symmetrically
distributed about some value. If n is at least
moderately large, the extent of symmetry can be
assessed from a stem-and-leaf display or histogram.
However, if n is not very large, such
pictures are not particularly informative. Consider
the following alternative. Let $y_1$ denote the
smallest $x_i$, $y_2$ the second smallest $x_i$, and so on.
Then plot the following pairs as points on a
two-dimensional coordinate system:
$(y_n - \tilde{x},\ \tilde{x} - y_1)$,
$(y_{n-1} - \tilde{x},\ \tilde{x} - y_2)$,
$(y_{n-2} - \tilde{x},\ \tilde{x} - y_3)$, ... . There
are $n/2$ points when n is even and $(n - 1)/2$ when
n is odd.
a. What does this plot look like when there is
perfect symmetry in the data? What does it
look like when observations stretch out more
above the median than below it (a long upper
tail)?
b. The accompanying data on rainfall (acre-feet)
from 26 seeded clouds is taken from the
article “A Bayesian Analysis of a Multiplica-
tive Treatment Effect in Weather Modi-
fication” (Technometrics, 1975: 161–166).
Construct the plot and comment on the extent
of symmetry or nature of departure from
symmetry.
4.1 7.7 17.5 31.4 32.7 40.6 92.4
115.3 118.3 119.0 129.6 198.6 200.7 242.5
255.0 274.7 274.7 302.8 334.1 430.0 489.1
703.4 978.0 1656.0 1697.8 2745.6
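The pairs described in Exercise 80 can be computed with a short Python sketch. The function name `symmetry_pairs` is hypothetical, and the sketch assumes the centering value in each pair is the sample median, as the reference to the median in part (a) suggests. It only builds the points; plotting and interpretation are left to the reader.

```python
from statistics import median


def symmetry_pairs(data):
    """Return the points (y_(n-i+1) - median, median - y_(i)), i = 1, 2, ...

    y_(1) <= y_(2) <= ... <= y_(n) are the ordered observations.
    There are n/2 points when n is even and (n - 1)/2 when n is odd.
    """
    y = sorted(data)
    n = len(y)
    m = median(y)
    # Pair the i-th largest observation with the i-th smallest one.
    return [(y[n - 1 - i] - m, m - y[i]) for i in range(n // 2)]


# Rainfall (acre-feet) from the 26 seeded clouds in part (b).
rainfall = [
    4.1, 7.7, 17.5, 31.4, 32.7, 40.6, 92.4,
    115.3, 118.3, 119.0, 129.6, 198.6, 200.7, 242.5,
    255.0, 274.7, 274.7, 302.8, 334.1, 430.0, 489.1,
    703.4, 978.0, 1656.0, 1697.8, 2745.6,
]

for upper, lower in symmetry_pairs(rainfall):
    print(f"{upper:10.2f} {lower:10.2f}")
```

Each printed row is one point, with the distance above the median first and the distance below it second; the points can then be passed to any plotting tool.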
Bibliography
Chambers, John, William Cleveland, Beat Kleiner,
and Paul Tukey, Graphical Methods for Data Anal-
ysis, Brooks/Cole, Pacific Grove, CA, 1983.
A highly recommended presentation of both older
and more recent graphical and pictorial methodol-
ogy in statistics.
Freedman, David, Robert Pisani, and Roger Purves,
Statistics (4th ed.), Norton, New York, 2007. An
excellent, very nonmathematical survey of basic
statistical reasoning and methodology.
Hoaglin, David, Frederick Mosteller, and John Tukey,
Understanding Robust and Exploratory Data Anal-
ysis, Wiley, New York, 1983. Discusses why, as
well as how, exploratory methods should be
employed; it is good on details of stem-and-leaf
displays and boxplots.
Hoaglin, David and Paul Velleman, Applications,
Basics, and Computing of Exploratory Data Anal-
ysis, Duxbury Press, Boston, 1980. A good discus-
sion of some basic exploratory methods.
CHAPTER 1 Overview and Descriptive Statistics