
122 Chapter 3
The variance and standard deviation measure
how values are dispersed by looking at how far
values are from the mean.
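The variance is calculated using
$\sigma^2 = \frac{\sum (x - \mu)^2}{n}$
An alternate form is
$\sigma^2 = \frac{\sum x^2}{n} - \mu^2$
Here $\mu$ is the mean and $n$ is the number of values.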
The standard deviation is equal to the square root
of the variance, and the variance is the standard
deviation squared.
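If you want to check that the two forms of the variance really agree, here is a minimal sketch in Python. The function names and the sample values are made up for illustration, and the code assumes you are working with a whole population (dividing by n, as in the formulas above).

```python
import math

def variance(values):
    """Variance as the mean of the squared distances from the mean."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n

def variance_alt(values):
    """Alternate form: the mean of the squares minus the square of the mean."""
    n = len(values)
    mu = sum(values) / n
    return sum(x ** 2 for x in values) / n - mu ** 2

def std_dev(values):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(variance(values))

scores = [1, 2, 3, 4, 5]  # made-up data, mean is 3

print(variance(scores))      # 2.0
print(variance_alt(scores))  # 2.0
print(std_dev(scores))       # 1.4142135623730951
```

The alternate form is usually quicker by hand, because you only need the sum of the squares and the mean.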
Standard scores, or z-scores, are a way of
comparing values across different sets of data
where the means and standard deviations are
different. To find the standard score of a value x,
use: $z = \frac{x - \mu}{\sigma}$, where $\mu$ is the mean and $\sigma$ is the standard deviation.
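To see how a standard score lets you compare values from two different data sets, here is a small Python sketch. The function name `standard_score` and all of the numbers are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def standard_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

# Hypothetical example: two values from data sets with different
# means and standard deviations.
z_a = standard_score(75, mu=70, sigma=20)
z_b = standard_score(55, mu=40, sigma=10)

print(z_a)  # 0.25
print(z_b)  # 1.5
```

Even though 75 is the larger raw value, 55 has the higher standard score, so it stands out more relative to its own data set.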
Q: So variance and standard deviation both measure the
spread of your data. How are they different from the range?
A: The range is quite a simplistic measure of the spread of your
data. It tells you the difference between the highest and lowest
values, but that’s it. You have no way of knowing how the data is
clustered within it.
The variance and standard deviation are a much better way
of measuring the variability of your data and how your data is
dispersed, as they take into account how the data is clustered.
They look at how far values typically are from the center of
your data.
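Here is a quick way to see the difference for yourself, sketched in Python with two made-up data sets. The helper `data_range` is a name invented for this example; `statistics.pstdev` is the standard library's population standard deviation.

```python
import statistics

def data_range(values):
    """Range: the difference between the highest and lowest values."""
    return max(values) - min(values)

# Two made-up data sets with the same lowest and highest values.
clustered = [1, 5, 5, 5, 9]   # most values sit at the center
spread_out = [1, 1, 5, 9, 9]  # most values sit at the extremes

print(data_range(clustered), data_range(spread_out))  # 8 8

# The standard deviation tells the two data sets apart.
print(statistics.pstdev(clustered))   # about 2.53
print(statistics.pstdev(spread_out))  # about 3.58
```

The range is identical for both, but the standard deviation is larger for the data set whose values sit further from the center.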
Q: And what’s the difference between variance and
standard deviation? Which one should I use?
A: The standard deviation is the square root of the variance,
which means you can find one from the other.
The standard deviation is probably the more intuitive of the two, as it is
in the same units as your data and tells you roughly how far your values
typically are from the mean.
Q: How do standard scores fit into all this?
A: Standard scores use the mean and standard deviation to
convert values in a data set to a generic distribution with a mean
of 0 and a standard deviation of 1, while making sure your data
keeps the same basic shape. They’re a way of measuring relative
standing, so you can compare values across different data sets even
when the data sets have different means and standard deviations.
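As a rough illustration of what "keeps the same basic shape" means, the sketch below converts every value in a made-up data set to its standard score. The function name `standardize` is an assumption for this example, and the code treats the data as a whole population.

```python
import statistics

def standardize(values):
    """Convert every value to its standard score (population mean and std)."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return [(x - mu) / sigma for x in values]

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data: mean 5, standard deviation 2
z = standardize(data)

print(z)  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]

# The standard scores have mean 0 and standard deviation 1,
# and the values stay in the same order as before.
print(statistics.mean(z), statistics.pstdev(z))
```

The values are shifted and rescaled, but their order and relative spacing are unchanged, which is why the shape of the data survives.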
Q: Do standard scores have anything to do with detecting
outliers?
A: Good question! Determining outliers can be subjective, but
sometimes outliers are defined as values that are more than 3 standard
deviations from the mean, in other words, values with a standard score
above 3 or below -3. Statisticians have different opinions about
this, though, so be warned.
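If you want to try the 3-standard-deviation rule of thumb, here is a minimal Python sketch. The function name `find_outliers`, the default threshold, and the readings are all made up for illustration, and the cutoff of 3 is just one convention, not a fixed rule.

```python
import statistics

def find_outliers(values, threshold=3.0):
    """Return values whose standard score is beyond +/- threshold."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    if sigma == 0:
        return []  # every value is identical, so nothing stands out
    return [x for x in values if abs(x - mu) / sigma > threshold]

# Made-up readings: nineteen values between 12 and 16, plus one of 60.
readings = [12, 14, 15, 13, 14, 16, 15, 13, 14, 15,
            14, 13, 15, 14, 16, 12, 14, 15, 13, 60]

print(find_outliers(readings))  # [60]
```

Keep in mind that an extreme value also pulls up the mean and standard deviation it is measured against, so in a small data set no value may ever reach a standard score of 3.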
no dumb questions