z-Scores

Our final measure of position, the z-score (also known as the standard score or the standardized score), incorporates both the mean and the standard deviation. The z-score measures directed distance in standard deviations from the mean. In symbols:

$${z=(x-\mu)/\sigma}$$

where \(x\) is the variable's value, \(\mu\) ("mu") is the mean and \(\sigma\) ("sigma") is the standard deviation. Basically, \(x-\mu\) is the directed distance from the mean and sets \(z=0\) for the mean. Dividing through by \(\sigma\) "normalizes" the measure (i.e., makes it scale-invariant).

It's useful to rewrite this equation to solve for x:

$${x=\mu+z\sigma}$$

Written this way, we can see that the raw score x is the mean plus z "sigmas".

Example: IQ scores

One common way to measure intelligence is using IQ (intelligence quotient). The mean IQ is 100 and the standard deviation is 15.

Professor Tim Roberts of Australia, with an IQ of 178, is one of the smartest people on Earth. To compute the z-score for his IQ, we use the formula above:

$${z=(178-100)/15=5.2}$$

Roberts' IQ is 5.2 sigmas greater than the mean. He's one smart cookie!

Strengths of the z-score

Easy to understand. Though perhaps not quite as easy to understand as simple ranking, the z-score is still pretty easy to understand if you know what the mean and standard deviation are.

Scale-invariant. z-scores allow us to ignore the scale of the dataset. 0.1 sigmas is close to the mean no matter what the actual numerical data values are. 5.8 is far from the mean again no matter what the numerical data values are. This means we can use it to compare positions across different datasets.

Applies to any numerical dataset. It doesn't matter how the data are distributed—the z-score is still meaningful.

Commonly used. Nothing else to say about that.

Weaknesses of the z-score

Loses the meaning of the raw scores. This is the flip side of scale invariance: we can't use the z-score alone to talk about positions in absolute terms. We need to know the actual mean and standard deviation of the dataset too.

Exercises

Exercise 1. Housing prices in a certain city have a mean of $325,300 with a standard deviation of $42,600. What's the z-score for a house whose price is $282,700?

Exercise 2. For the dataset in Exercise 1, what would be the price of a house with a z-score of -0.65?