
$$
\begin{aligned}
\sum (x - \bar{x})^2 &= (1.9 - 4.3375)^2 + (2.5 - 4.3375)^2 + (3.2 - 4.3375)^2 + (3.8 - 4.3375)^2 \\
&\quad + (4.7 - 4.3375)^2 + (5.5 - 4.3375)^2 + (5.9 - 4.3375)^2 + (7.2 - 4.3375)^2 \\
&= (-2.4375)^2 + (-1.8375)^2 + (-1.1375)^2 + (-0.5375)^2 \\
&\quad + (0.3625)^2 + (1.1625)^2 + (1.5625)^2 + (2.8625)^2 \\
&= 23.02 \text{ (to 2 decimal places)}
\end{aligned}
$$
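If you'd like to check this arithmetic without a calculator, here's a minimal Python sketch that adds up the squared distances from the mean. The variable names (`sunshine`, `x_bar`, `sum_sq_dev`) are ours, chosen just for illustration.

```python
# Predicted hours of sunshine (the x values from the concert data).
sunshine = [1.9, 2.5, 3.2, 3.8, 4.7, 5.5, 5.9, 7.2]

# The sample mean of the x values.
x_bar = sum(sunshine) / len(sunshine)

# Sum of squared distances from the mean.
sum_sq_dev = sum((x - x_bar) ** 2 for x in sunshine)

print(round(x_bar, 4))       # 4.3375
print(round(sum_sq_dev, 2))  # 23.02
```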
We find the value of b by dividing $\sum (x - \bar{x})(y - \bar{y})$ by $\sum (x - \bar{x})^2$. This gives us

$$b = \frac{122.53}{23.02} = 5.32$$
Finding the slope for the line of best fit, part ii
Here’s a reminder of the data for concert attendance and predicted hours of sunshine:

x (sunshine)     1.9  2.5  3.2  3.8  4.7  5.5  5.9  7.2
y (attendance)    22   33   30   42   38   49   42   55

We’re part of the way through calculating the value of b, where y = a + bx. We’ve found that $\bar{x} = 4.3375$, $\bar{y} = 38.875$, and $\sum (x - \bar{x})(y - \bar{y}) = 122.53$. The final thing we have left to find is $\sum (x - \bar{x})^2$. Let’s give it a go.
We find $\sum (x - \bar{x})^2$ using the x values. It’s a bit like finding the variance of a sample, but without dividing by n - 1.
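Here's a rough sketch of that link to the sample variance, using Python's standard `statistics` module. It assumes nothing beyond the x values from the concert data; the variable names are ours.

```python
import statistics

sunshine = [1.9, 2.5, 3.2, 3.8, 4.7, 5.5, 5.9, 7.2]
n = len(sunshine)
x_bar = statistics.mean(sunshine)

# Sum of squared distances from the mean, with no division.
sum_sq_dev = sum((x - x_bar) ** 2 for x in sunshine)

# statistics.variance() gives the sample variance, which divides by n - 1.
sample_variance = statistics.variance(sunshine)

# Multiplying the sample variance by n - 1 undoes that division.
print(round(sum_sq_dev, 2))                 # 23.02
print(round(sample_variance * (n - 1), 2))  # 23.02
```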
In other words, the line of best fit for the data is y = a + 5.32x. But what’s a?
Q: It looks like the formulas you’ve given are for samples rather than populations. Is that right?
A: That’s right. We’ve used samples rather than populations because the data we’ve been given is a sample. There’s nothing to stop you using a population if you have the data; just use μ instead of $\bar{x}$.
Q: Is the value of b always positive?
A: No, it isn’t. Whether b is positive or negative actually depends on the type of linear correlation. For positive linear correlation, b is positive. For negative linear correlation, b is negative.
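One way to see why: the bottom of the formula for b is a sum of squares, so it can never be negative, which means the sign of b comes entirely from the top, $\sum (x - \bar{x})(y - \bar{y})$. The sketch below checks the sign of that sum. The helper `sum_of_products` is a name we've made up, and the reversed attendance list is hypothetical data invented purely to show a negative correlation.

```python
def sum_of_products(xs, ys):
    """Top of the formula for b: the sum of (x - x_bar) * (y - y_bar)."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

sunshine = [1.9, 2.5, 3.2, 3.8, 4.7, 5.5, 5.9, 7.2]
attendance = [22, 33, 30, 42, 38, 49, 42, 55]

# Hypothetical data: the same attendance figures in reverse order, so
# attendance falls as sunshine rises (negative linear correlation).
made_up_attendance = list(reversed(attendance))

print(sum_of_products(sunshine, attendance) > 0)          # True
print(sum_of_products(sunshine, made_up_attendance) < 0)  # True
```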
Q: I’ve heard of the term gradient. What’s that?
A: Gradient is another term for the slope of the line, b.
Q: What about if there’s no correlation? Can I still work out b?
A: If there’s no correlation, you can still technically find a line of best fit, but it won’t be an effective model of the data, and you won’t be able to make accurate predictions using it.
Q: Is there an easy way of calculating b?
A: Calculating b is tricky if you have lots of observations, but you can get software packages to calculate this for you.
Here’s a reminder of the formula:

$$b = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}$$
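The formula translates almost line for line into code. Here's a minimal sketch, assuming the x and y values arrive as two lists of equal length; the function name `slope` is ours, not something from a library. Run on the concert data, the result may not agree with the hand-worked figure to the last decimal place.

```python
def slope(xs, ys):
    """Slope b of the line of best fit y = a + bx."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)

    # Top of the formula: sum of (x - x_bar) * (y - y_bar).
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    # Bottom of the formula: sum of (x - x_bar) squared. Only x is used here.
    denominator = sum((x - x_bar) ** 2 for x in xs)

    return numerator / denominator

sunshine = [1.9, 2.5, 3.2, 3.8, 4.7, 5.5, 5.9, 7.2]
attendance = [22, 33, 30, 42, 38, 49, 42, 55]

b = slope(sunshine, attendance)
print(round(b, 2))
```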
We’ve found b. This gives the slope
for the line of best fit.
$(x - \bar{x})^2$: Note, we don’t use $y$ or $\bar{y}$ in this part of the equation.