Calculate the Standard Deviation in Python

Your goal

You want to calculate the standard deviation of a dataset in Python, treating the data either as a sample or as a whole population.

Step-by-step tutorial

The approach depends on how your data is stored.

Approach 1: List data

Calculate the sample standard deviation using the stdev function from the statistics module:

>>> import statistics
>>> data = [3.2, 4.8, 6.1, 2.4, 5.5]
>>> statistics.stdev(data)
1.5572411502397436

Calculate the population standard deviation using the pstdev function (same dataset as above):

>>> statistics.pstdev(data)
1.3928388277184118
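
The two functions differ only in the divisor applied to the sum of squared deviations: n - 1 for a sample, n for a population. The following is a minimal sketch that reproduces both results by hand for the same dataset; the variable names are illustrative and not part of the statistics module.

```python
import math
import statistics

data = [3.2, 4.8, 6.1, 2.4, 5.5]

n = len(data)
mean = sum(data) / n
# Sum of squared deviations from the mean, shared by both formulas.
squared_deviations = sum((x - mean) ** 2 for x in data)

# Sample standard deviation: divide by n - 1.
sample_sd = math.sqrt(squared_deviations / (n - 1))
# Population standard deviation: divide by n.
population_sd = math.sqrt(squared_deviations / n)

print(math.isclose(sample_sd, statistics.stdev(data)))       # True
print(math.isclose(population_sd, statistics.pstdev(data)))  # True
```

Because n - 1 is smaller than n, the sample value is always the larger of the two for the same data.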

Approach 2: Pandas DataFrame

Call DataFrame.std() to calculate the sample standard deviation of every column. The ddof parameter defaults to 1, which gives the sample statistic, so you don't need to pass it explicitly:

>>> import pandas as pd
>>> data = pd.read_csv("precip-central-park.csv")
>>> data
     YEAR   JAN   FEB   MAR   APR   MAY   JUN   JUL   AUG   SEP   OCT   NOV   DEC  ANNUAL
0    1869  2.53  6.87  4.61  1.39  4.15  4.40  3.20  1.76  2.81  6.48  2.03  5.02   45.25
1    1870  4.41  2.83  3.33  5.11  1.83  2.82  3.76  3.07  2.52  4.97  2.42  2.18   39.25
2    1871  2.07  2.72  5.54  3.03  4.04  7.05  5.57  5.60  2.34  7.50  3.56  2.24   51.26
3    1872  1.88  1.29  3.74  2.29  2.68  2.93  7.83  6.29  2.95  3.35  4.08  3.18   42.49
4    1873  5.34  3.80  2.09  4.16  3.69  1.28  4.61  9.56  3.14  2.73  4.63  2.96   47.99
..    ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...     ...
146  2015  5.23  2.04  4.72  2.08  1.86  4.79  3.98  2.35  3.28  3.91  2.01  4.72   40.97
147  2016  4.41  4.40  1.17  1.61  3.75  2.60  7.02  1.97  2.79  4.15  5.41  2.89   42.17
148  2017  4.83  2.48  5.25  3.84  6.38  4.76  4.19  3.34  2.00  4.18  1.58  2.21   45.04
149  2018  2.18  5.83  5.17  5.78  3.53  3.11  7.45  8.59  6.19  3.59  7.62  6.51   65.55
150  2019  3.58  3.14  3.87  4.55  6.82  5.46  5.77  3.70  0.95  6.15  1.95  7.09   53.03

[151 rows x 14 columns]
>>> data.std()
YEAR      43.734045
JAN        1.632221
FEB        1.451392
MAR        1.855721
APR        1.947661
MAY        1.984377
JUN        2.003749
JUL        2.246024
AUG        2.698207
SEP        2.644979
OCT        2.527183
NOV        2.093915
DEC        1.762683
ANNUAL     8.317330
dtype: float64

For the population standard deviation, set ddof to 0:

>>> data.std(ddof=0)
YEAR      43.588989
JAN        1.626808
FEB        1.446578
MAR        1.849566
APR        1.941201
MAY        1.977795
JUN        1.997103
JUL        2.238575
AUG        2.689258
SEP        2.636206
OCT        2.518801
NOV        2.086970
DEC        1.756837
ANNUAL     8.289744
dtype: float64
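
The ddof argument ("delta degrees of freedom") is the amount subtracted from the number of values to get the divisor, so the divisor is N - ddof. As a rough illustration that needs neither pandas nor the CSV file, the sketch below uses a hypothetical helper, std_with_ddof, and assumes that the YEAR column contains every year from 1869 through 2019, which is consistent with the 151 rows shown.

```python
import math


def std_with_ddof(values, ddof=1):
    """Hypothetical helper: standard deviation with divisor len(values) - ddof."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - ddof))


# Assumption: YEAR holds each year from 1869 through 2019 (151 values).
years = list(range(1869, 2020))

print(round(std_with_ddof(years, ddof=1), 6))  # 43.734045 (sample)
print(round(std_with_ddof(years, ddof=0), 6))  # 43.588989 (population)
```

Both numbers agree with the YEAR entries in the pandas output, which suggests that pandas applies the same N - ddof divisor.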

You can do the same for a single column with Series.std():

>>> data['ANNUAL'].std()  # sample standard deviation
8.31733012222257
>>> data['ANNUAL'].std(ddof=0)  # population standard deviation
8.289743544992968