Create a Correlation Matrix in Python

Your goal

You need to calculate a correlation matrix for a set of numerical variables in Python. The variables are stored in a Pandas DataFrame.

Step-by-step tutorial

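In Python, you can generate a correlation matrix with either Pandas or NumPy. We'll use Pandas, since the data is already in a Pandas DataFrame: calling the DataFrame's corr() method returns the correlation between every pair of columns, using the Pearson coefficient by default.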

>>> import pandas as pd
>>> precip = pd.read_csv("precip-central-park.csv")
>>> precip
     YEAR   JAN   FEB   MAR   APR   MAY   JUN   JUL   AUG   SEP   OCT   NOV   DEC  ANNUAL
0    1869  2.53  6.87  4.61  1.39  4.15  4.40  3.20  1.76  2.81  6.48  2.03  5.02   45.25
1    1870  4.41  2.83  3.33  5.11  1.83  2.82  3.76  3.07  2.52  4.97  2.42  2.18   39.25
2    1871  2.07  2.72  5.54  3.03  4.04  7.05  5.57  5.60  2.34  7.50  3.56  2.24   51.26
3    1872  1.88  1.29  3.74  2.29  2.68  2.93  7.83  6.29  2.95  3.35  4.08  3.18   42.49
4    1873  5.34  3.80  2.09  4.16  3.69  1.28  4.61  9.56  3.14  2.73  4.63  2.96   47.99
..    ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...     ...
146  2015  5.23  2.04  4.72  2.08  1.86  4.79  3.98  2.35  3.28  3.91  2.01  4.72   40.97
147  2016  4.41  4.40  1.17  1.61  3.75  2.60  7.02  1.97  2.79  4.15  5.41  2.89   42.17
148  2017  4.83  2.48  5.25  3.84  6.38  4.76  4.19  3.34  2.00  4.18  1.58  2.21   45.04
149  2018  2.18  5.83  5.17  5.78  3.53  3.11  7.45  8.59  6.19  3.59  7.62  6.51   65.55
150  2019  3.58  3.14  3.87  4.55  6.82  5.46  5.77  3.70  0.95  6.15  1.95  7.09   53.03

[151 rows x 14 columns]
>>> precip.corr()
            YEAR       JAN       FEB       MAR       APR  ...       SEP       OCT       NOV       DEC    ANNUAL
YEAR    1.000000  0.030511 -0.101125  0.111149  0.194854  ...  0.079725  0.069104  0.143469  0.214377  0.294144
JAN     0.030511  1.000000 -0.068217 -0.102626  0.111274  ... -0.011665  0.035299 -0.118907  0.034769  0.137708
FEB    -0.101125 -0.068217  1.000000 -0.025060 -0.082795  ... -0.038986  0.027231 -0.055307  0.046860  0.124756
MAR     0.111149 -0.102626 -0.025060  1.000000  0.216825  ...  0.018001  0.147579 -0.026707  0.132642  0.321145
APR     0.194854  0.111274 -0.082795  0.216825  1.000000  ... -0.002176  0.175281 -0.012373  0.257741  0.467535
MAY     0.232004  0.030127  0.021829 -0.021762  0.085648  ...  0.050845 -0.080315  0.009115  0.143836  0.331190
JUN     0.197471 -0.085672  0.058147  0.026989  0.055814  ... -0.031940  0.134995  0.018000  0.141974  0.376922
JUL     0.007274 -0.146235 -0.067612 -0.056091  0.050942  ...  0.030097 -0.042856  0.121552  0.085505  0.304247
AUG     0.008248  0.057705 -0.012762  0.041751  0.079984  ...  0.068532  0.120382 -0.023199  0.045182  0.426935
SEP     0.079725 -0.011665 -0.038986  0.018001 -0.002176  ...  1.000000 -0.071484 -0.055561 -0.033720  0.305505
OCT     0.069104  0.035299  0.027231  0.147579  0.175281  ... -0.071484  1.000000  0.048813  0.081066  0.436943
NOV     0.143469 -0.118907 -0.055307 -0.026707 -0.012373  ... -0.055561  0.048813  1.000000 -0.029956  0.232270
DEC     0.214377  0.034769  0.046860  0.132642  0.257741  ... -0.033720  0.081066 -0.029956  1.000000  0.429293
ANNUAL  0.294144  0.137708  0.124756  0.321145  0.467535  ...  0.305505  0.436943  0.232270  0.429293  1.000000

[14 rows x 14 columns]
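
The tutorial mentions that NumPy can produce the same matrix. As a minimal sketch of that route, the script below compares corr() with numpy.corrcoef() on a small DataFrame. The name sample and the choice of columns are assumptions made for illustration. The values are the JAN, FEB and MAR figures from the first five rows printed above, so the resulting numbers will not match the full 151-row output.

```python
import numpy as np
import pandas as pd

# Illustrative subset: JAN, FEB, MAR for 1869-1873, copied from the
# first five rows of the printed DataFrame (not the full dataset).
sample = pd.DataFrame({
    "JAN": [2.53, 4.41, 2.07, 1.88, 5.34],
    "FEB": [6.87, 2.83, 2.72, 1.29, 3.80],
    "MAR": [4.61, 3.33, 5.54, 3.74, 2.09],
})

# Pairwise Pearson correlation between columns (the Pandas default).
corr_pandas = sample.corr()

# NumPy route: rowvar=False tells corrcoef that each column,
# rather than each row, is a variable.
corr_numpy = np.corrcoef(sample.to_numpy(), rowvar=False)

# The two results should agree up to floating-point error.
same = np.allclose(corr_pandas.to_numpy(), corr_numpy)

print(corr_pandas.round(3))
print("Pandas and NumPy agree:", same)
```

The Pandas result keeps the column names as row and column labels, which is usually easier to read than the bare array NumPy returns. In both cases the diagonal is 1 because every variable is perfectly correlated with itself, and the matrix is symmetric.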