Ethics and Statistics
Statistics is an important field because it helps us to understand the world so we can make decisions. Some of these decisions are micro-level decisions, like those surrounding the creditworthiness or the insurability of individual applicants. Others are larger decisions, like the US Federal Reserve assessing macroeconomic trends and making adjustments to monetary policy.
Given the importance of data and statistics in everyday life, it's no surprise that ethical issues arise. Decisions backed by statistics and machine learning (ML) impact the entire world population on a daily basis. It's important for both consumers and producers of statistical and ML methods to understand the associated ethical issues to avoid problems ranging from invalid conclusions to immoral and even illegal actions.
This is a large topic, and I can't hope to do it justice here. But I'll touch upon a couple of key topics: misleading uses of statistics and biased datasets.
Misleading uses of statistics
Statistics is complicated. The data are hard to see and the analyses and conclusions are unintuitive. Yet the appetite for statistical analyses is such that nearly everybody consumes them on a regular basis, and many people—both laypeople and experts—produce analyses too. This creates a situation where we need to be on guard against misleading uses of statistics.
More often than not, the misuse is unintentional: beginners, being beginners, make rookie mistakes, such as confusing correlation with causation. But even experienced practitioners make mistakes. Statistics is a very human endeavor.
But in some cases, people produce intentionally deceptive statistics. Some examples:
- a student excluding the summer class she failed at the local community college when reporting her grade point average on a scholarship application
- a researcher repeating an experiment a large number of times, with possibly minor variations, and then reporting on the one time it succeeded (also known as "data dredging" or "p-hacking")
- a survey author asking questions in such a way as to nudge respondents in a certain direction (e.g., "Would you agree that...")
- a media organization using misleading charts to promote a political opinion
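To see why the "p-hacking" pattern above is so effective, consider a researcher running several independent tests at significance level α = 0.05 when no real effect exists. A minimal sketch of the arithmetic (the test counts are arbitrary, chosen for illustration):

```python
# Sketch: why repeating experiments and reporting the one "success" works.
# If a researcher runs n independent tests at significance level alpha
# when no real effect exists, the probability of at least one false
# positive grows quickly with n.
alpha = 0.05

for n in (1, 5, 20, 100):
    p_false_positive = 1 - (1 - alpha) ** n
    print(f"{n:3d} tests -> P(at least one 'significant' result) = {p_false_positive:.3f}")
```

With just 20 tests, there is roughly a 64% chance of at least one spuriously "significant" result, even when nothing real is going on.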
Let's start with a classic example from xkcd.
Here are some hypothetical examples of misleading charts. They're similar to real world examples, but I'm creating my own since the real ones are often political in nature, and I want to avoid that here.
Example: Misleading bar charts
Here's a bar chart showing the popularity of two hypothetical political candidates with an electorate:
Candidate A is much more popular than candidate B, right? But do you see anything odd?
Look at the y-axis. Notice how it starts at 46.5%. What happens if we change the axis so that it starts at 0%?
This presents a totally different picture. Candidate A is still more popular, but just barely.
But even with this improvement, this second chart is still somewhat misleading, because it looks like both candidates are pretty popular. That's because the top value on the y-axis is 60%. When we change it to 100%, a clearer picture emerges:
Now we can readily see that each candidate is popular with roughly half the electorate, with candidate A enjoying a small advantage.
I've seen misleading bar charts like the above many times, even in the "official" news media. It isn't always intentional. In fact, Excel's default auto-scaling produced the first of the three charts. But sometimes it's absolutely intentional. Either way, it's important to pay attention to bar chart scales to avoid being duped.
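The distortion from a truncated axis is easy to quantify. The sketch below uses hypothetical popularity values (52% and 47%, chosen for illustration; the charts above needn't use these exact numbers) and compares the ratio of the bar heights as drawn:

```python
# How a truncated y-axis exaggerates differences in a bar chart.
# The popularity values below are hypothetical, chosen for illustration.
def apparent_height_ratio(a, b, ymin):
    """Ratio of the drawn bar heights when the y-axis starts at ymin."""
    return (a - ymin) / (b - ymin)

a, b = 52.0, 47.0                                # candidates A and B (%)
print(apparent_height_ratio(a, b, ymin=0.0))    # honest axis:    ~1.11
print(apparent_height_ratio(a, b, ymin=46.5))   # truncated axis: 11.0
```

With the axis starting at 46.5%, candidate A's bar is drawn eleven times taller than candidate B's, even though A's actual support is only about 11% higher.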
Next, let's look at a different kind of misleading chart: the pictograph.
Example: A misleading pictograph
In 2016, the year-end incarceration rate per 100,000 adults for Texas was 1,050, and for Washington state it was 530. (Source) Here's a pictograph that presents these data:
See any issues?
Well, one major issue is the following: even though the count for Texas is essentially twice that of Washington state, the Texas prisoner icon is four times larger! The reason is that doubling both the width and the height of an icon multiplies its area by four, not two. (The height of the Texas icon is doubled because the count is twice that of Washington state, and the width is doubled to keep the icon's proportions.) This compares Texas and Washington in a highly misleading way.
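The arithmetic behind the distortion: scaling an icon's width and height both by a factor k multiplies its area by k², so an honest pictograph should scale each linear dimension by the square root of the data ratio. A quick sketch using the counts from the example above:

```python
import math

# Pictograph scaling: doubling both width and height quadruples area.
texas, washington = 1050, 530       # incarceration rate per 100,000 adults (2016)

data_ratio = texas / washington     # ~1.98: Texas is about 2x Washington
naive_area_ratio = data_ratio ** 2  # ~3.92: scaling both dimensions draws
                                    #        the icon almost 4x larger

# Honest approach: scale each linear dimension by sqrt(data_ratio),
# so the icon's *area* matches the data ratio.
honest_linear_scale = math.sqrt(data_ratio)   # ~1.41

print(round(data_ratio, 2), round(naive_area_ratio, 2), round(honest_linear_scale, 2))
```

Scaling width and height each by about 1.41 would give the Texas icon twice the area of the Washington icon, matching the actual ratio in the data.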
These examples give you a sense for how we can mislead one another with statistics, whether intentionally or not. This happens all the time, and it's key to be vigilant when producing and consuming statistics.
Biased datasets
A second area of concern at the intersection of ethics and statistics (and machine learning) is biased datasets. This is where a dataset includes, or fails to include, information in a way that leads to problematic decisions: misleading, unfair, or even illegal ones. Such bias can also create public relations nightmares for the organizations involved.
One common case is a dataset that doesn't contain sufficiently diverse data. Statistical and machine learning models depend crucially on the data they consume, and if certain types of data are limited in number, missing, or include dubious correlations, the model's performance will generally suffer.
Here are a few examples of the phenomenon:
- In 2015, the Google Photos app misclassified two black people as gorillas.
- In 2016, Microsoft's Tay AI chatbot was designed to learn to converse by chatting with Twitter users. But then Tay started learning hate speech from users.
- In 2020, Google's Vision AI interpreted a white person holding a hand-held thermometer as holding a monocular and a black person holding the same object as holding a gun. Neither classification is accurate, owing to the limited number of hand-held thermometers in the training set. But the differing interpretation according to skin tone is even more problematic.
Obviously, the errors above were unintentional and deeply embarrassing to the companies in question. Misbehavior of the sort described above is hurtful and can perpetuate negative stereotypes. In other cases, incorrect model performance can lead to wrongful and potentially illegal rejection of lending or insurance applications, improper assessment of risk of criminal recidivism, and so forth.
Users of statistical and machine learning methods should be aware of the issues surrounding biased datasets, and be on guard against such bias.