Examine a Data Frame in R

Your goal

You need to gain a quick understanding of the structure and contents of a data frame in R.

Step-by-step tutorial

R has many functions you can use to investigate a data frame. Here are several of them:

Function Description
class Returns an object's class. For a data frame this is data.frame.
typeof Returns an object's (R internal) type or storage mode. For a data frame, this is list.
summary Summarizes an object, based on its class.
str Displays the internal structure of an R object.
dim Returns an integer vector containing the size of each of the object's dimensions. For a data frame, this is the number of rows and the number of columns.
length Returns the number of elements in a vector or list. For a data frame, this is the number of columns, the same as ncol.
ncol Returns the number of columns in the data frame.
nrow Returns the number of rows in the data frame.
dimnames Returns the names along each dimension. For a data frame, this is a list holding the row names and the column names.
names Returns the names of vector elements. For a data frame, this is the same as colnames.
colnames Returns the column names.
rownames Returns the row names.
attributes Returns an object's attributes.
head Returns the first n rows (n = 6 by default).
tail Returns the last n rows (n = 6 by default).

We'll use the precipitation dataset as our source of examples:

> precip.df <- read.csv('precip-central-park.csv')
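
If the CSV file is not at hand, a small stand-in data frame can be built by hand to try the functions on. The sketch below is only an illustration: the name precip.small is hypothetical, and its values are the first three rows of the head output in this tutorial, restricted to four of the fourteen columns.

```r
# Hypothetical stand-in for precip.df: three years, four columns.
# Values are copied from the head() output of the real dataset.
precip.small <- data.frame(
  YEAR   = c(1869L, 1870L, 1871L),
  JAN    = c(2.53, 4.41, 2.07),
  FEB    = c(6.87, 2.83, 2.72),
  ANNUAL = c(45.25, 39.25, 51.26)
)

class(precip.small)  # "data.frame"
dim(precip.small)    # 3 rows, 4 columns
```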

class, typeof

> class(precip.df)
[1] "data.frame"
> typeof(precip.df)
[1] "list"
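
The two results differ because class reports how R treats the object, while typeof reports how it is stored: a data frame is stored as a list of columns. A minimal sketch, using a small hypothetical data frame df rather than precip.df:

```r
# Hypothetical two-column data frame for illustration.
df <- data.frame(YEAR = c(1869L, 1870L), JAN = c(2.53, 4.41))

is.data.frame(df)           # TRUE: the class attribute is "data.frame"
is.list(df)                 # TRUE: underneath, it is a list of columns
inherits(df, "data.frame")  # TRUE

# Because it is a list, list-style access returns a single column,
# and each column has its own storage type.
typeof(df[["YEAR"]])        # "integer"
typeof(df[["JAN"]])         # "double"
```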

summary

> summary(precip.df)
      YEAR           JAN              FEB             MAR
 Min.   :1869   Min.   : 0.580   Min.   :0.460   Min.   : 0.800
 1st Qu.:1906   1st Qu.: 2.295   1st Qu.:2.390   1st Qu.: 2.805
 Median :1944   Median : 3.200   Median :3.050   Median : 3.710
 Mean   :1944   Mean   : 3.501   Mean   :3.339   Mean   : 4.000
 3rd Qu.:1982   3rd Qu.: 4.630   3rd Qu.:4.415   3rd Qu.: 5.035
 Max.   :2019   Max.   :10.520   Max.   :6.870   Max.   :10.690
      APR              MAY              JUN              JUL
 Min.   : 0.950   Min.   : 0.300   Min.   : 0.020   Min.   : 0.440
 1st Qu.: 2.385   1st Qu.: 2.245   1st Qu.: 2.485   1st Qu.: 2.805
 Median : 3.310   Median : 3.450   Median : 3.210   Median : 4.210
 Mean   : 3.716   Mean   : 3.718   Mean   : 3.684   Mean   : 4.328
 3rd Qu.: 4.720   3rd Qu.: 4.695   3rd Qu.: 4.615   3rd Qu.: 5.645
 Max.   :14.010   Max.   :10.240   Max.   :10.260   Max.   :11.890
      AUG              SEP              OCT              NOV
 Min.   : 0.180   Min.   : 0.210   Min.   : 0.140   Min.   : 0.340
 1st Qu.: 2.535   1st Qu.: 1.925   1st Qu.: 1.975   1st Qu.: 2.125
 Median : 3.920   Median : 3.140   Median : 3.350   Median : 3.310
 Mean   : 4.395   Mean   : 3.828   Mean   : 3.718   Mean   : 3.554
 3rd Qu.: 5.885   3rd Qu.: 4.825   3rd Qu.: 4.850   3rd Qu.: 4.390
 Max.   :18.950   Max.   :16.850   Max.   :16.730   Max.   :12.410
      DEC            ANNUAL
 Min.   :0.250   Min.   :26.09
 1st Qu.:2.325   1st Qu.:39.47
 Median :3.370   Median :44.55
 Mean   :3.662   Mean   :45.45
 3rd Qu.:4.645   3rd Qu.:49.24
 Max.   :9.980   Max.   :80.56
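
Since summary chooses its output based on the class of its argument, it can also be applied to a single column, and it behaves differently for non-numeric data. The following sketch assumes a small hypothetical data frame df and a hypothetical factor f; neither is part of the precipitation dataset.

```r
# Hypothetical data for illustration.
df <- data.frame(YEAR = c(1869L, 1870L, 1871L), JAN = c(2.53, 4.41, 2.07))

# Numeric vector: summary returns the same six statistics shown per column above.
s <- summary(df$JAN)
names(s)        # "Min." "1st Qu." "Median" "Mean" "3rd Qu." "Max."
s[["Median"]]   # 2.53

# Factor: summary returns a count per level instead.
f <- factor(c("wet", "dry", "wet"))
counts <- summary(f)
counts[["wet"]] # 2
```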

str

> str(precip.df)
'data.frame':	151 obs. of  14 variables:
 $ YEAR  : int  1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 ...
 $ JAN   : num  2.53 4.41 2.07 1.88 5.34 5.33 3.17 0.94 2.62 4.46 ...
 $ FEB   : num  6.87 2.83 2.72 1.29 3.8 2.04 2.62 4.81 1.24 3.75 ...
 $ MAR   : num  4.61 3.33 5.54 3.74 2.09 2.12 3.48 8.79 5.56 3.27 ...
 $ APR   : num  1.39 5.11 3.03 2.29 4.16 8.77 3.08 3.06 2.73 1.97 ...
 $ MAY   : num  4.15 1.83 4.04 2.68 3.69 2.24 1.33 3.03 0.95 3.19 ...
 $ JUN   : num  4.4 2.82 7.05 2.93 1.28 2.78 2.72 2.66 2.8 3.08 ...
 $ JUL   : num  3.2 3.76 5.57 7.83 4.61 5.06 4.89 3.65 5.73 4.62 ...
 $ AUG   : num  1.76 3.07 5.6 6.29 9.56 2.43 8.97 2.28 2.77 7.97 ...
 $ SEP   : num  2.81 2.52 2.34 2.95 3.14 8.24 1.89 5.28 1.33 4.05 ...
 $ OCT   : num  6.48 4.97 7.5 3.35 2.73 1.7 2.85 1.42 8.14 2.43 ...
 $ NOV   : num  2.03 2.42 3.56 4.08 4.63 2.3 3.78 3.31 5.62 4.73 ...
 $ DEC   : num  5.02 2.18 2.24 3.18 2.96 2.82 2.12 2.54 0.68 5.14 ...
 $ ANNUAL: num  45.2 39.2 51.3 42.5 48 ...
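
The amount of detail str prints can be adjusted, and str also works on a single column. A brief sketch, again assuming a small hypothetical data frame df; vec.len is a documented argument of str that controls roughly how many leading values are shown per column.

```r
# Hypothetical data for illustration.
df <- data.frame(YEAR = c(1869L, 1870L, 1871L), JAN = c(2.53, 4.41, 2.07))

str(df)               # one header line, then one line per column
str(df, vec.len = 1)  # show fewer leading values per column
str(df$JAN)           # structure of a single column

# str prints its result rather than returning it; capture.output
# collects the printed lines as a character vector if they are needed.
out <- capture.output(str(df))
length(out)           # 3: the header line plus one line per column
```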

dim, length, ncol, nrow

> dim(precip.df)
[1] 151  14
> length(precip.df)
[1] 14
> ncol(precip.df)
[1] 14
> nrow(precip.df)
[1] 151
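
The relationships among these four functions can be checked directly. A minimal sketch with a hypothetical three-row data frame df:

```r
# Hypothetical data for illustration.
df <- data.frame(YEAR = c(1869L, 1870L, 1871L), JAN = c(2.53, 4.41, 2.07))

dim(df)                                    # 3 2
identical(dim(df), c(nrow(df), ncol(df)))  # TRUE: rows first, then columns
length(df) == ncol(df)                     # TRUE: length counts columns
length(df$JAN) == nrow(df)                 # TRUE: each column has one value per row
```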

dimnames, names, colnames, rownames

> dimnames(precip.df)
[[1]]
  [1] "1"   "2"   "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"
 [13] "13"  "14"  "15"  "16"  "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"
 [25] "25"  "26"  "27"  "28"  "29"  "30"  "31"  "32"  "33"  "34"  "35"  "36"
 [37] "37"  "38"  "39"  "40"  "41"  "42"  "43"  "44"  "45"  "46"  "47"  "48"
 [49] "49"  "50"  "51"  "52"  "53"  "54"  "55"  "56"  "57"  "58"  "59"  "60"
 [61] "61"  "62"  "63"  "64"  "65"  "66"  "67"  "68"  "69"  "70"  "71"  "72"
 [73] "73"  "74"  "75"  "76"  "77"  "78"  "79"  "80"  "81"  "82"  "83"  "84"
 [85] "85"  "86"  "87"  "88"  "89"  "90"  "91"  "92"  "93"  "94"  "95"  "96"
 [97] "97"  "98"  "99"  "100" "101" "102" "103" "104" "105" "106" "107" "108"
[109] "109" "110" "111" "112" "113" "114" "115" "116" "117" "118" "119" "120"
[121] "121" "122" "123" "124" "125" "126" "127" "128" "129" "130" "131" "132"
[133] "133" "134" "135" "136" "137" "138" "139" "140" "141" "142" "143" "144"
[145] "145" "146" "147" "148" "149" "150" "151"

[[2]]
 [1] "YEAR"   "JAN"    "FEB"    "MAR"    "APR"    "MAY"    "JUN"    "JUL"
 [9] "AUG"    "SEP"    "OCT"    "NOV"    "DEC"    "ANNUAL"

> names(precip.df)
 [1] "YEAR"   "JAN"    "FEB"    "MAR"    "APR"    "MAY"    "JUN"    "JUL"
 [9] "AUG"    "SEP"    "OCT"    "NOV"    "DEC"    "ANNUAL"
> colnames(precip.df)
 [1] "YEAR"   "JAN"    "FEB"    "MAR"    "APR"    "MAY"    "JUN"    "JUL"
 [9] "AUG"    "SEP"    "OCT"    "NOV"    "DEC"    "ANNUAL"
> rownames(precip.df)
  [1] "1"   "2"   "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"
 [13] "13"  "14"  "15"  "16"  "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"
 [25] "25"  "26"  "27"  "28"  "29"  "30"  "31"  "32"  "33"  "34"  "35"  "36"
 [37] "37"  "38"  "39"  "40"  "41"  "42"  "43"  "44"  "45"  "46"  "47"  "48"
 [49] "49"  "50"  "51"  "52"  "53"  "54"  "55"  "56"  "57"  "58"  "59"  "60"
 [61] "61"  "62"  "63"  "64"  "65"  "66"  "67"  "68"  "69"  "70"  "71"  "72"
 [73] "73"  "74"  "75"  "76"  "77"  "78"  "79"  "80"  "81"  "82"  "83"  "84"
 [85] "85"  "86"  "87"  "88"  "89"  "90"  "91"  "92"  "93"  "94"  "95"  "96"
 [97] "97"  "98"  "99"  "100" "101" "102" "103" "104" "105" "106" "107" "108"
[109] "109" "110" "111" "112" "113" "114" "115" "116" "117" "118" "119" "120"
[121] "121" "122" "123" "124" "125" "126" "127" "128" "129" "130" "131" "132"
[133] "133" "134" "135" "136" "137" "138" "139" "140" "141" "142" "143" "144"
[145] "145" "146" "147" "148" "149" "150" "151"
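
As the output suggests, dimnames simply bundles the row names and column names into a two-element list, and names and colnames agree for a data frame. A short sketch that verifies this on a hypothetical data frame df:

```r
# Hypothetical data for illustration.
df <- data.frame(YEAR = c(1869L, 1870L, 1871L), JAN = c(2.53, 4.41, 2.07))

dn <- dimnames(df)
identical(dn[[1]], rownames(df))    # TRUE: first element is the row names
identical(dn[[2]], colnames(df))    # TRUE: second element is the column names
identical(names(df), colnames(df))  # TRUE for a data frame

rownames(df)                        # "1" "2" "3": automatic row names, as strings
```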

attributes

> attributes(precip.df)
$names
 [1] "YEAR"   "JAN"    "FEB"    "MAR"    "APR"    "MAY"    "JUN"    "JUL"
 [9] "AUG"    "SEP"    "OCT"    "NOV"    "DEC"    "ANNUAL"

$class
[1] "data.frame"

$row.names
  [1]   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
 [19]  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36
 [37]  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54
 [55]  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71  72
 [73]  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90
 [91]  91  92  93  94  95  96  97  98  99 100 101 102 103 104 105 106 107 108
[109] 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126
[127] 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144
[145] 145 146 147 148 149 150 151
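
Note that the row.names attribute above holds integers, whereas rownames returned character strings. The sketch below, using a hypothetical data frame df with automatic row names, shows the same difference and how to read one attribute at a time with attr.

```r
# Hypothetical data for illustration.
df <- data.frame(YEAR = c(1869L, 1870L, 1871L), JAN = c(2.53, 4.41, 2.07))

a <- attributes(df)
names(a)                    # the attribute names: names, class, row.names

is.integer(a$row.names)     # TRUE: automatic row names are stored as integers
is.character(rownames(df))  # TRUE: rownames() converts them to strings

attr(df, "class")           # "data.frame": a single attribute by name
```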

head, tail

> head(precip.df)
  YEAR  JAN  FEB  MAR  APR  MAY  JUN  JUL  AUG  SEP  OCT  NOV  DEC ANNUAL
1 1869 2.53 6.87 4.61 1.39 4.15 4.40 3.20 1.76 2.81 6.48 2.03 5.02  45.25
2 1870 4.41 2.83 3.33 5.11 1.83 2.82 3.76 3.07 2.52 4.97 2.42 2.18  39.25
3 1871 2.07 2.72 5.54 3.03 4.04 7.05 5.57 5.60 2.34 7.50 3.56 2.24  51.26
4 1872 1.88 1.29 3.74 2.29 2.68 2.93 7.83 6.29 2.95 3.35 4.08 3.18  42.49
5 1873 5.34 3.80 2.09 4.16 3.69 1.28 4.61 9.56 3.14 2.73 4.63 2.96  47.99
6 1874 5.33 2.04 2.12 8.77 2.24 2.78 5.06 2.43 8.24 1.70 2.30 2.82  45.83
> tail(precip.df)
    YEAR  JAN  FEB  MAR  APR  MAY  JUN  JUL  AUG  SEP  OCT  NOV  DEC ANNUAL
146 2014 2.79 5.48 3.67 7.85 4.37 4.26 5.59 2.25 1.21 5.77 4.51 6.04  53.79
147 2015 5.23 2.04 4.72 2.08 1.86 4.79 3.98 2.35 3.28 3.91 2.01 4.72  40.97
148 2016 4.41 4.40 1.17 1.61 3.75 2.60 7.02 1.97 2.79 4.15 5.41 2.89  42.17
149 2017 4.83 2.48 5.25 3.84 6.38 4.76 4.19 3.34 2.00 4.18 1.58 2.21  45.04
150 2018 2.18 5.83 5.17 5.78 3.53 3.11 7.45 8.59 6.19 3.59 7.62 6.51  65.55
151 2019 3.58 3.14 3.87 4.55 6.82 5.46 5.77 3.70 0.95 6.15 1.95 7.09  53.03
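
Both functions accept an n argument to change how many rows are returned, and a negative n means "all but" that many rows. A minimal sketch on a hypothetical ten-row data frame df:

```r
# Hypothetical data for illustration: ten consecutive years.
df <- data.frame(YEAR = 1869:1878)

head(df, n = 3)$YEAR      # 1869 1870 1871
tail(df, n = 2)$YEAR      # 1877 1878

# Negative n drops rows from the opposite end.
head(df, n = -8)$YEAR     # all but the last 8 rows: 1869 1870
tail(df, n = -8)$YEAR     # all but the first 8 rows: 1877 1878

# tail keeps the original row names, as in the output above.
rownames(tail(df, n = 2)) # "9" "10"
```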