Calculate the Mean in R

Your goal

You need to calculate the mean of a numeric dataset in R.

Step-by-step tutorial

The approach depends on how your data is stored: as a vector, a data frame, or a tibble.

Vector data

For a numeric vector, use the mean function from the base package, which is available in every R session.

> data <- c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5)
> mean(data)
[1] 4
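
Real datasets often contain missing values, and mean returns NA if any element of the vector is NA. The sketch below uses a made-up vector (data_with_na is an illustrative name, not part of the example above) to show how the na.rm argument drops missing values before averaging.

```r
# Hypothetical vector with one missing value (for illustration only)
data_with_na <- c(3, 1, 4, NA, 5)

# By default, a single NA makes the result NA
mean_default <- mean(data_with_na)

# na.rm = TRUE removes NA values first: (3 + 1 + 4 + 5) / 4 = 3.25
mean_no_na <- mean(data_with_na, na.rm = TRUE)

print(mean_default)  # NA
print(mean_no_na)    # 3.25
```

Whether dropping missing values is appropriate depends on why they are missing, so treat na.rm = TRUE as a deliberate choice rather than a default.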

Data frame

For a standard data frame whose columns are all numeric, colMeans computes the mean of every column at once:

> precip <- read.csv("precip-central-park.csv")
> head(precip)
  YEAR  JAN  FEB  MAR  APR  MAY  JUN  JUL  AUG  SEP  OCT  NOV  DEC ANNUAL
1 1869 2.53 6.87 4.61 1.39 4.15 4.40 3.20 1.76 2.81 6.48 2.03 5.02  45.25
2 1870 4.41 2.83 3.33 5.11 1.83 2.82 3.76 3.07 2.52 4.97 2.42 2.18  39.25
3 1871 2.07 2.72 5.54 3.03 4.04 7.05 5.57 5.60 2.34 7.50 3.56 2.24  51.26
4 1872 1.88 1.29 3.74 2.29 2.68 2.93 7.83 6.29 2.95 3.35 4.08 3.18  42.49
5 1873 5.34 3.80 2.09 4.16 3.69 1.28 4.61 9.56 3.14 2.73 4.63 2.96  47.99
6 1874 5.33 2.04 2.12 8.77 2.24 2.78 5.06 2.43 8.24 1.70 2.30 2.82  45.83
> colMeans(precip)
       YEAR         JAN         FEB         MAR         APR         MAY
1944.000000    3.500861    3.339470    3.999735    3.716358    3.717947
        JUN         JUL         AUG         SEP         OCT         NOV
   3.684040    4.328212    4.394834    3.827682    3.717815    3.554106
        DEC      ANNUAL
   3.661523   45.445894
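
colMeans only accepts numeric input, so it stops with an error if the data frame has a character or factor column. As a minimal sketch, assuming a small made-up data frame (df, station, jan and feb are hypothetical names, not columns of the precipitation file), one way to handle this is to keep only the numeric columns first:

```r
# Hypothetical data frame mixing a text column with numeric columns
df <- data.frame(
  station = c("A", "B", "C"),
  jan = c(2.5, 4.5, 2.0),
  feb = c(6.0, 3.0, 3.0)
)

# Logical vector marking which columns are numeric
numeric_cols <- sapply(df, is.numeric)

# drop = FALSE keeps the result a data frame even if only one column remains
col_means <- colMeans(df[, numeric_cols, drop = FALSE])

print(col_means)  # jan = 3, feb = 4
```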

For a single column, extract it as a vector and pass it to mean:

> mean(precip[, "ANNUAL"])
[1] 45.44589

Tibble

A tibble, here read with read_csv from the readr package, works with colMeans just like a data frame:

> library(readr)
> precip <- read_csv("precip-central-park.csv")
Parsed with column specification:
cols(
  YEAR = col_double(),
  JAN = col_double(),
  FEB = col_double(),
  MAR = col_double(),
  APR = col_double(),
  MAY = col_double(),
  JUN = col_double(),
  JUL = col_double(),
  AUG = col_double(),
  SEP = col_double(),
  OCT = col_double(),
  NOV = col_double(),
  DEC = col_double(),
  ANNUAL = col_double()
)
> head(precip)
# A tibble: 6 x 14
   YEAR   JAN   FEB   MAR   APR   MAY   JUN   JUL   AUG   SEP   OCT   NOV   DEC
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1  1869  2.53  6.87  4.61  1.39  4.15  4.4   3.2   1.76  2.81  6.48  2.03  5.02
2  1870  4.41  2.83  3.33  5.11  1.83  2.82  3.76  3.07  2.52  4.97  2.42  2.18
3  1871  2.07  2.72  5.54  3.03  4.04  7.05  5.57  5.6   2.34  7.5   3.56  2.24
4  1872  1.88  1.29  3.74  2.29  2.68  2.93  7.83  6.29  2.95  3.35  4.08  3.18
5  1873  5.34  3.8   2.09  4.16  3.69  1.28  4.61  9.56  3.14  2.73  4.63  2.96
6  1874  5.33  2.04  2.12  8.77  2.24  2.78  5.06  2.43  8.24  1.7   2.3   2.82
# … with 1 more variable: ANNUAL <dbl>
> colMeans(precip)
       YEAR         JAN         FEB         MAR         APR         MAY
1944.000000    3.500861    3.339470    3.999735    3.716358    3.717947
        JUN         JUL         AUG         SEP         OCT         NOV
   3.684040    4.328212    4.394834    3.827682    3.717815    3.554106
        DEC      ANNUAL
   3.661523   45.445894

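To get the mean of a single column, use double-bracket notation (precip$ANNUAL also works). Unlike a data frame, a tibble indexed as precip[, "ANNUAL"] returns a one-column tibble rather than a vector, and mean cannot average that: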

> mean(precip[["ANNUAL"]])
[1] 45.44589
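
The difference between single and double brackets can be checked on a small example. This is a sketch that assumes the tibble package is installed (it comes along with readr); tbl and its columns are made-up names for illustration.

```r
library(tibble)

# Hypothetical three-row tibble (illustrative values only)
tbl <- tibble(year = c(2001, 2002, 2003), annual = c(40, 45, 50))

# Single brackets keep the tibble structure: this is a 3 x 1 tibble
single <- tbl[, "annual"]
still_tibble <- is.data.frame(single)

# Double brackets and $ both extract the column as a plain numeric vector
mean_brackets <- mean(tbl[["annual"]])
mean_dollar <- mean(tbl$annual)

print(still_tibble)   # TRUE
print(mean_brackets)  # 45
print(mean_dollar)    # 45
```

Calling mean on the one-column tibble itself returns NA with a warning, which is why the extraction step matters.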