# z-Scores

Our final measure of position, the *z-score* (also known as the
*standard score* or the *standardized score*), incorporates both the
mean and the standard deviation.
The `z`-score measures the directed distance of a value from the mean, in units of standard deviations. In symbols:

$${z=(x-\mu)/\sigma}$$

where \(x\) is the variable's value, \(\mu\) ("mu") is the mean and \(\sigma\) ("sigma") is the standard deviation. The numerator \(x-\mu\) is the directed distance from the mean: positive for values above the mean, negative for values below it, and zero at the mean itself, so \(z=0\) for the mean. Dividing by \(\sigma\) "normalizes" the measure (i.e., makes it scale-invariant); this requires \(\sigma>0\).

It's useful to rewrite this equation to solve for `x`:

$${x=\mu+z\sigma}$$

Written this way, the equation shows that the raw score `x` is the mean plus `z` "sigmas".
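The two equations above can be sketched as a pair of small Python functions. This is a minimal illustration rather than a library implementation: the names `z_score` and `raw_score` are made up for this example, and the mean of 50 and standard deviation of 10 are made-up values.

```python
def z_score(x, mu, sigma):
    """Directed distance of x from the mean mu, in units of sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


def raw_score(z, mu, sigma):
    """Invert the z-score: the mean plus z sigmas."""
    return mu + z * sigma


# Illustrative (made-up) dataset parameters: mean 50, standard deviation 10.
z = z_score(65, 50, 10)
print(z)                     # 1.5
print(raw_score(z, 50, 10))  # 65.0
```

Feeding the output of `z_score` back into `raw_score` recovers the original value, which is the sense in which the second equation is the first one solved for `x`.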

## Example: IQ scores

One common way to measure intelligence is using IQ (intelligence quotient). The mean IQ is 100 and the standard deviation is 15.

Professor Tim Roberts of Australia, with an IQ of 178, is one of the smartest people on Earth. To compute the z-score for his IQ, we use the formula above:

$${z=(178-100)/15=5.2}$$

Roberts' IQ is 5.2 sigmas greater than the mean. He's one smart cookie!

## Strengths of the z-score

**Easy to understand.** Though perhaps not
*quite* as intuitive as a simple ranking, the `z`-score is straightforward to
interpret once you know what the mean and
standard deviation are.

**Scale-invariant.** `z`-scores allow us
to ignore the scale of the dataset. A value 0.1 sigmas from the mean is close to the mean no matter what the actual numerical data
values are, and a value 5.8 sigmas from the mean is far from it, again no matter what the data values are. This means we can use
`z`-scores to compare positions across different datasets.
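As a rough check of scale invariance, the sketch below computes `z`-scores for the same measurements in two different units. The temperatures are hypothetical, the helper name `z_scores` is made up for this example, and the population standard deviation (`statistics.pstdev`) is assumed to be the intended \(\sigma\).

```python
import math
from statistics import mean, pstdev


def z_scores(data):
    """Return the z-score of every value, using the population standard deviation."""
    mu = mean(data)
    sigma = pstdev(data)
    return [(x - mu) / sigma for x in data]


# Hypothetical temperatures in Celsius, then converted to Fahrenheit.
celsius = [12.0, 15.0, 18.0, 21.0, 29.0]
fahrenheit = [c * 9 / 5 + 32 for c in celsius]

z_c = z_scores(celsius)
z_f = z_scores(fahrenheit)

# Compare with a tolerance, since floating-point rounding can differ slightly.
same = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(z_c, z_f))
print(same)  # True
```

Changing units shifts and rescales every value, but the mean and standard deviation shift and rescale in the same way, so each value's `z`-score is unchanged.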

**Applies to any numerical dataset.** It doesn't
matter how the data are distributed: as long as the standard deviation is not zero, the `z`-score is still meaningful.

**Commonly used.** `z`-scores are a standard tool, so
other people are likely to understand one when you report it.

## Weaknesses of the z-score

**Loses the meaning of the raw scores.** This is
the flip side of scale invariance: we can't use the `z`-score alone to talk about positions in
absolute terms. We need to know the actual mean and standard deviation of the dataset too.
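To make this concrete, the sketch below converts the same `z`-score back to a raw score for two datasets: the IQ figures from the example above and the housing-price figures used in the exercises. The function name `raw_score` and the choice of \(z=2\) are illustrative assumptions.

```python
def raw_score(z, mu, sigma):
    """Recover the raw score from a z-score: the mean plus z sigmas."""
    return mu + z * sigma


z = 2.0  # illustrative choice: two sigmas above the mean

# IQ: mean 100, standard deviation 15.
print(raw_score(z, 100, 15))          # 130.0

# Housing prices: mean $325,300, standard deviation $42,600.
print(raw_score(z, 325_300, 42_600))  # 410500.0
```

The `z`-score is identical in both cases, yet it stands for an IQ of 130 in one dataset and a price of $410,500 in the other; without the mean and standard deviation there is no way to tell which.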

## Exercises

**Exercise 1.** Housing prices in a certain city have a mean of $325,300 with a standard
deviation of $42,600. What's the `z`-score for a house whose price is $282,700?

**Exercise 2.** For the dataset in Exercise 1, what would be the price of a house with a
`z`-score of -0.65?