Average Deviation

Another way to measure spread is the average deviation.

Definition. The average deviation is the mean distance from the mean:

$$AD = \frac{1}{n}\sum_{i=1}^{n}\left| x_{i}-\bar{x} \right|$$

where \(\bar{x}\) is the mean and \(n\) is the number of values in the dataset.
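If you'd like to see the definition as code, here is a minimal Python sketch. The function name `average_deviation` and the small sample list are my own choices for illustration, not anything standard.

```python
def average_deviation(values):
    """Return the mean absolute distance of the values from their mean."""
    values = list(values)
    n = len(values)
    if n == 0:
        raise ValueError("average_deviation requires at least one value")
    mean = sum(values) / n
    # Average the absolute deviations |x_i - mean| over all n values.
    return sum(abs(x - mean) for x in values) / n


# Illustrative data: the mean is 5, so the deviations are 3, 1, 1, 3.
print(average_deviation([2, 4, 6, 8]))  # 2.0
```

The sketch follows the formula term by term: compute the mean once, take the absolute difference for each value, then divide the sum by \(n\).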

The average deviation, like many other statistical measures, has multiple names. Other names for the average deviation are average absolute deviation, mean deviation, and mean absolute deviation. They're all the same thing.

Here are some examples.

Example: Test scores for a reasonably challenging chemistry test

Let's continue working with the test scores from the previous sections. Recall the test scores for the chemistry test:

83, 87, 61, 92, 38, 78, 73, 55, 98, 74, 86, 69, 40, 83

To calculate the average deviation, first we need to calculate the mean. For that we get \(\bar{x} = \frac{1017}{14} \approx 72.64\).

Once we have the mean, we need to sum up each absolute difference (or "absolute deviation") between a value and the mean. Here's a table that makes this explicit:

\(x_{i}\) \(\bar{x}\) \(\left| x_{i}-\bar{x} \right|\)
83 72.64 10.36
87 72.64 14.36
61 72.64 11.64
92 72.64 19.36
38 72.64 34.64
78 72.64 5.36
73 72.64 0.36
55 72.64 17.64
98 72.64 25.36
74 72.64 1.36
86 72.64 13.36
69 72.64 3.64
40 72.64 32.64
83 72.64 10.36
Total: 200.43

Now we have the sum of the absolute deviations, which is approximately 200.43. We divide that by \(n = 14\), which yields \(AD \approx 14.32\). This means that on average, the scores in the dataset are about 14.32 points away from the dataset mean.
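To double-check the arithmetic, here is a short Python sketch that rebuilds the rows of the table and the final result from the chemistry scores. The variable names are just illustrative.

```python
scores = [83, 87, 61, 92, 38, 78, 73, 55, 98, 74, 86, 69, 40, 83]

n = len(scores)
mean = sum(scores) / n                       # 1017 / 14
deviations = [abs(x - mean) for x in scores]
total = sum(deviations)
ad = total / n

# One row per score: value, mean, absolute deviation.
for x, d in zip(scores, deviations):
    print(f"{x:>3}  {mean:.2f}  {d:>5.2f}")

print(f"Total: {total:.2f}")  # Total: 200.43
print(f"AD:    {ad:.2f}")     # AD:    14.32
```

Note that the total is computed from the unrounded deviations. If you add up the rounded table entries by hand, you get 200.44 instead, which is just rounding error.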

Example: Test scores for an easy math test

Recall the test scores for the math test from the previous section:

100, 100, 93, 92, 95, 98, 100, 100, 100, 95, 94, 88, 92

Here the dataset mean is \(\bar{x} = \frac{1247}{13} \approx 95.92\). I'll leave it as an exercise for you to calculate the average deviation. The end result is \(AD \approx 3.46\). As expected, this is much smaller than the 14.32 we got for the chemistry test in the previous example, since the math scores are clustered tightly near the top of the scale.

The average deviation, like any measure, has strengths and weaknesses worth understanding.

Strengths of the average deviation as a measure of spread

Easy to understand. The idea of finding the mean distance from the mean is simple and makes intuitive sense. It's arguably the most intuitive way to measure the spread.

Directly incorporates all values in the dataset. Unlike the range, which incorporates only the highest and lowest values, the average deviation incorporates all values. This generally makes it a better reflection of the dataset than the range.

Matches our intuitions for many datasets. As with the range, for many datasets, the value matches our pre-formal notions of which sets ought to have higher vs. lower spreads.

Weaknesses of the average deviation as a measure of spread

The average deviation is highly scale-dependent. As with the range, the average deviation depends on the scale (the units) of the values in the dataset: multiplying every value by a constant \(c\) multiplies the average deviation by \(|c|\). Again, this means that it's hard to compare average deviations for datasets having different scales.
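Here's a small Python sketch of what scale dependence looks like in practice, using the chemistry scores from earlier. The rescaled and shifted versions of the data are hypothetical, made up only to illustrate the point, and `average_deviation` is my own helper name.

```python
def average_deviation(values):
    """Mean absolute distance of the values from their mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n


chemistry = [83, 87, 61, 92, 38, 78, 73, 55, 98, 74, 86, 69, 40, 83]

# Hypothetical rescaling: the same scores expressed as fractions of 100.
as_fractions = [x / 100 for x in chemistry]

# Hypothetical shift: add 1000 to every score without changing the scale.
shifted = [x + 1000 for x in chemistry]

print(round(average_deviation(chemistry), 2))     # 14.32
print(round(average_deviation(as_fractions), 4))  # 0.1432
print(round(average_deviation(shifted), 2))       # 14.32
```

The data has the same shape in all three cases. Rescaling changes the average deviation by the same factor as the data, while shifting every value by a constant leaves it unchanged. So it's the units that matter here, not how large the values are.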

The average deviation is sensitive to outliers (i.e., it isn't robust). The average deviation actually gets a double-whammy when it comes to lacking robustness. First, the per-value deviations all involve the mean, which isn't robust. Then at the end, we take the mean of all the deviations, which again isn't robust. The upshot is that outliers in the data can dramatically impact the average deviation.
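To get a feel for how much a single outlier can matter, here's a minimal Python sketch using the math test scores from earlier. The extra score of 0 (say, a student who missed the test) is hypothetical and isn't part of the original dataset. `average_deviation` is again my own helper name.

```python
def average_deviation(values):
    """Mean absolute distance of the values from their mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n


math_scores = [100, 100, 93, 92, 95, 98, 100, 100, 100, 95, 94, 88, 92]

# Hypothetical outlier: one additional score of 0.
with_outlier = math_scores + [0]

print(round(average_deviation(math_scores), 2))   # 3.46
print(round(average_deviation(with_outlier), 2))  # 12.88
```

One unusual value out of fourteen nearly quadruples the result. The outlier drags the mean down from about 95.92 to about 89.07, which pushes every other score further from the mean, and its own large deviation then gets averaged in as well.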

Exercises

Exercise 1. Compute the average deviation of the following data:

202, 102, 285, 98

Does it seem to be a reasonable measurement of the spread for this dataset?

Exercise 2. Compute the average deviation of the following data:

185, 245, 205, 215, 3829, 190

Is it a reasonable measurement of the spread for this dataset?