
Course Overview

The course is organized as follows.

Descriptive statistics. We'll begin with descriptive statistics. There are lots of different ways we can approach the problem of describing a dataset, and two key ways are in terms of central tendency and spread: that is, what's the center of the dataset, and how spread out are the data? We'll consider both numeric and graphical techniques for understanding central tendency and spread.
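To make central tendency and spread concrete, here is a minimal sketch using Python's standard `statistics` module. The exam scores are made-up values chosen for illustration, and the variable names are ours; nothing here is specific to the course materials.

```python
import statistics

# Hypothetical sample: exam scores for a small class (made-up values).
scores = [62, 70, 71, 75, 78, 80, 84, 90]

# Central tendency: where is the "center" of the data?
mean_score = statistics.mean(scores)
median_score = statistics.median(scores)

# Spread: how far do the values stray from that center?
score_range = max(scores) - min(scores)
sample_sd = statistics.stdev(scores)  # sample standard deviation (n - 1 denominator)

print(f"mean = {mean_score}, median = {median_score}")
print(f"range = {score_range}, sample sd = {sample_sd:.2f}")
```

The mean and median answer the "center" question in two different ways, while the range and standard deviation answer the "spread" question. We'll see later when one measure is preferable to another.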

Besides central tendency and spread, position, shape and relationships are useful ways to think about and characterize a dataset. Position involves looking at the way data are ordered. Shape has to do with looking at the distribution of values across a dataset. Relationships concern analyzing multiple variables in a dataset at once. We'll investigate all of these concepts as well.
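As a rough preview of position and relationships, the sketch below computes quartiles and a Pearson correlation coefficient in Python. The paired data (hours studied and exam score) are made up for illustration, and `pearson` is a hypothetical helper written out by hand so the arithmetic is visible. Note that `statistics.quantiles` uses its default "exclusive" method here; other tools may use slightly different quartile conventions.

```python
import statistics

# Hypothetical paired data (made-up values): hours studied and exam score.
hours = [1, 2, 3, 4, 5, 6, 7, 8]
scores = [62, 70, 71, 75, 78, 80, 84, 90]

# Position: quartiles cut the ordered data into four equal-sized groups.
# The middle cut point (q2) is the median.
q1, q2, q3 = statistics.quantiles(scores, n=4)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    # Sum of products of deviations from the means.
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations.
    dev_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    dev_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return co_dev / (dev_x * dev_y)


# Relationship: how closely do the two variables move together?
r = pearson(hours, scores)

print(f"quartiles: {q1}, {q2}, {q3}")
print(f"correlation between hours and scores: {r:.3f}")
```

A correlation near 1 indicates that the two variables tend to rise together in a nearly linear way; it does not by itself say that one causes the other.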

Probability. After descriptive statistics, we want to get to inferential statistics, but we need to lay some groundwork first. This groundwork is elementary probability theory. We'll cover the key concepts and rules that you'll need.

Inferential statistics. Finally, we'll tackle inferential statistics. Here we'll learn how to make judgments about larger populations based on more limited data samples, and how to place bounds on our confidence in those judgments. Part of this will involve building mathematical models that allow us to explain and predict what's going on with populations. The other part will be hypothesis testing, which we use to support or reject claims about the population.

As you might guess, there are many software packages available for statistical analysis. This course isn't tied to a particular package, since we're focusing more on the concepts and the mathematics. Still, software packages are useful, and indeed indispensable, for real work. So the course includes tutorials for some of the major packages and tools that people use for statistical work, including Excel, Python and R.

Before we dive into the mathematical material, let's pause for a moment to consider some important ethical issues arising in the study and use of statistics.