The variance of a set of values is usually expressed in terms of squared differences between those values and the mean of those values.

However, the sum of squared differences between the values and the mean can also be expressed in terms of the sum of squared pairwise differences among the values themselves, without reference to the mean $\bar{x}$.

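In particular, for values $x_1, \dots, x_n$ with mean $\bar{x}$, we want to show that

$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{2n} \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - x_j)^2.$$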

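To get an expression involving $\bar{x}$, we rewrite the squared difference in the right-hand sum as $(x_i - x_j)^2 = [(x_i - \bar{x}) - (x_j - \bar{x})]^2$ and then expand the result:

$$\begin{aligned}
\sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - x_j)^2 &= \sum_{i=1}^{n} \sum_{j=1}^{n} \left[ (x_i - \bar{x}) - (x_j - \bar{x}) \right]^2 \\
&= \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - \bar{x})^2 - 2 \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - \bar{x})(x_j - \bar{x}) + \sum_{i=1}^{n} \sum_{j=1}^{n} (x_j - \bar{x})^2.
\end{aligned}$$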

Since the squared difference in the first term does not depend on $j$, the first term can be rewritten as

$$\sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - \bar{x})^2 = n \sum_{i=1}^{n} (x_i - \bar{x})^2.$$

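Since the squared difference in the third term does not depend on $i$, the third term can be rewritten as

$$\sum_{i=1}^{n} \sum_{j=1}^{n} (x_j - \bar{x})^2 = n \sum_{j=1}^{n} (x_j - \bar{x})^2 = n \sum_{i=1}^{n} (x_i - \bar{x})^2,$$

where in the last step we replaced $j$ as an index with $i$. So the third term is identical to the first term.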

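We now turn to the second term, $-2 \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - \bar{x})(x_j - \bar{x})$. We can bring the difference $x_i - \bar{x}$ out of the inner sum, since it does not depend on the index $j$. This gives us

$$-2 \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - \bar{x})(x_j - \bar{x}) = -2 \sum_{i=1}^{n} (x_i - \bar{x}) \sum_{j=1}^{n} (x_j - \bar{x}).$$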

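The sum $\sum_{j=1}^{n} (x_j - \bar{x})$ can then be rewritten as

$$\sum_{j=1}^{n} (x_j - \bar{x}) = \sum_{j=1}^{n} x_j - n \bar{x}.$$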

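But we have $\bar{x} = \frac{1}{n} \sum_{j=1}^{n} x_j$ by definition, so we then have

$$\sum_{j=1}^{n} (x_j - \bar{x}) = n \bar{x} - n \bar{x} = 0.$$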

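We can then substitute this result into the second term as follows:

$$-2 \sum_{i=1}^{n} (x_i - \bar{x}) \sum_{j=1}^{n} (x_j - \bar{x}) = -2 \sum_{i=1}^{n} (x_i - \bar{x}) \cdot 0 = 0.$$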
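The vanishing of the second (cross) term can be checked numerically. The following is a minimal sketch, assuming a small hypothetical data set `xs` that does not come from the original text; it uses exact rational arithmetic so that the zeros are exact rather than subject to floating-point round-off.

```python
from fractions import Fraction

# Hypothetical sample values (an assumption for illustration only).
# Fractions keep the arithmetic exact.
xs = [Fraction(v) for v in (2, 3, 5, 7, 11)]
n = len(xs)
mean = sum(xs) / n

# Inner sum: the deviations from the mean add up to zero.
deviation_sum = sum(x - mean for x in xs)

# Second term of the expansion: -2 times the double sum of
# products of deviations, which factors into two such sums.
cross_term = -2 * sum((xi - mean) * (xj - mean) for xi in xs for xj in xs)

print(deviation_sum, cross_term)
```

With floating-point inputs the same quantities would typically come out as very small nonzero numbers, which is why the sketch uses `Fraction`.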

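Now that we know the value of all three terms, we have

$$\sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - x_j)^2 = n \sum_{i=1}^{n} (x_i - \bar{x})^2 - 0 + n \sum_{i=1}^{n} (x_i - \bar{x})^2 = 2n \sum_{i=1}^{n} (x_i - \bar{x})^2,$$

so that

$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{2n} \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - x_j)^2,$$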

which is what we set out to prove.
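As a sanity check on the identity just proved, the two sides can be computed independently and compared. This is only a sketch: the function names `sum_sq_from_mean` and `sum_sq_pairwise` and the sample values are hypothetical, chosen for illustration.

```python
from fractions import Fraction


def sum_sq_from_mean(xs):
    """Sum of squared differences between each value and the mean."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)


def sum_sq_pairwise(xs):
    """1/(2n) times the sum of squared differences over all ordered pairs (i, j)."""
    n = len(xs)
    return sum((xi - xj) ** 2 for xi in xs for xj in xs) / (2 * n)


# Hypothetical sample values; Fractions make the comparison exact.
xs = [Fraction(v) for v in (2, 3, 5, 7, 11)]
lhs = sum_sq_from_mean(xs)
rhs = sum_sq_pairwise(xs)
print(lhs, rhs)  # both sides equal 256/5
```

Note that the pairwise form never computes the mean, at the cost of looping over all $n^2$ ordered pairs instead of $n$ values.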

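However, we can further simplify this identity. Since $(x_i - x_j)^2 = 0$ when $i = j$ and $(x_i - x_j)^2 = (x_j - x_i)^2$, we can consider only differences when $i < j$ (i.e., elements above the diagonal, if we consider the pairwise comparisons to form a matrix), each of which appears twice in the double sum:

$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{n} \sum_{i < j} (x_i - x_j)^2.$$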

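From the definition of the variance, taken here to be the population variance $\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$, we then have

$$\sigma^2 = \frac{1}{n^2} \sum_{i < j} (x_i - x_j)^2.$$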
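The pairwise form of the variance can be sketched in a few lines of Python. This assumes the variance is the population variance (dividing by $n$, not $n - 1$); the function name `pairwise_pvariance` and the sample values are hypothetical. The result is compared against `statistics.pvariance` from the standard library.

```python
from fractions import Fraction
from itertools import combinations
from statistics import pvariance


def pairwise_pvariance(xs):
    """Population variance from squared differences over pairs with i < j.

    Assumes the 1/n definition of variance and at least one value.
    """
    n = len(xs)
    # combinations(xs, 2) yields each unordered pair exactly once,
    # i.e. the entries above the diagonal of the pairwise matrix.
    return sum((a - b) ** 2 for a, b in combinations(xs, 2)) / n**2


# Hypothetical sample values; Fractions keep the comparison exact.
xs = [Fraction(v) for v in (2, 3, 5, 7, 11)]
print(pairwise_pvariance(xs), pvariance(xs))  # both 256/25
```

Restricting to pairs with $i < j$ halves the work relative to the full double sum, since only $n(n-1)/2$ differences are evaluated.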
