Work with Discrete Time Markov Chains (DTMC) in R

Your goal

You need to work with discrete time Markov chains (DTMC) in R, including:

  • create a Markov chain
  • visualize a Markov chain
  • solve a Markov chain
  • generate a sequence of states from a Markov chain
  • estimate a transition matrix from an observed sequence of states

Step-by-step tutorial

For this tutorial we'll use an example from chapter 10 of the book Performance by Design: Computer Capacity Planning by Example by Daniel A. Menascé et al. (Prentice Hall). In this example, a lad spends each day pursuing one of four different activities. The Markov chain is discrete in both state and time.

The workflow we'll cover here is the following:

[Figure: A Markov chain workflow]

However, we're going to go slightly out of order. We'll start by assuming that we already have the state transition matrix (i.e., we don't need to estimate it), and show how to use it to calculate the equilibrium (steady-state) distribution and how to generate a random sequence of states according to the chain's transition probabilities. After that we'll circle back to fitting a Markov model to an observed sequence of state transitions.

Create a Markov chain

To create a Markov chain, define the states, the state transition matrix, and the Markov chain itself.

First we create the states and the state transition matrix. Note that the probabilities are supplied column by column (byrow=FALSE), so each column of the resulting matrix sums to 1:

> lad.states <- c('Drinking', 'Sightseeing', 'Kayaking', 'Hiking')
> lad.transitions <- matrix(
+ c(0.4, 0.6, 0.0, 0.0,
+   0.2, 0.0, 0.0, 0.8,
+   0.1, 0.0, 0.7, 0.2,
+   0.3, 0.0, 0.2, 0.5),
+ nrow=4, byrow=FALSE, dimnames=list(lad.states, lad.states))
> lad.transitions
            Drinking Sightseeing Kayaking Hiking
Drinking         0.4         0.2      0.1    0.3
Sightseeing      0.6         0.0      0.0    0.0
Kayaking         0.0         0.0      0.7    0.2
Hiking           0.0         0.8      0.2    0.5

Build the Markov chain itself using the markovchain package. To install the package:

> install.packages('markovchain')

Then create the object as follows:

> library(markovchain)
Package:  markovchain
Version:  0.8.5-2
Date:     2020-09-07
BugReport: https://github.com/spedygiorgio/markovchain/issues

> lad.mc <- new('markovchain', states=lad.states, byrow=FALSE,
+ transitionMatrix=lad.transitions, name='A Lad')
> lad.mc
A Lad
 A  4 - dimensional discrete Markov Chain defined by the following states:
 Drinking, Sightseeing, Kayaking, Hiking
 The transition matrix  (by cols)  is defined as follows:
            Drinking Sightseeing Kayaking Hiking
Drinking         0.4         0.2      0.1    0.3
Sightseeing      0.6         0.0      0.0    0.0
Kayaking         0.0         0.0      0.7    0.2
Hiking           0.0         0.8      0.2    0.5
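Once the chain object exists, you can also ask multi-step questions. As a sketch (building on the lad.mc object defined above; both the ^ operator and the transitionProbability function are provided by the markovchain package):

```r
# Assumes lad.mc from above. The markovchain package overloads ^ to
# raise a chain to a power, giving n-step transition probabilities.
library(markovchain)

lad.mc.2days <- lad.mc ^ 2   # probabilities of each activity two days out
lad.mc.2days

# Single-step probability of moving from Drinking to Sightseeing
transitionProbability(lad.mc, 'Drinking', 'Sightseeing')
```

This is handy for questions like "if the lad is drinking today, what is he likely to be doing the day after tomorrow?"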

Visualize a Markov chain

Visualize the Markov chain (or more precisely, its state transition matrix) using the diagram package. To install the package:

> install.packages('diagram')

Then use the plotmat function to plot the matrix:

> library(diagram)
Loading required package: shape
> plotmat(lad.transitions, pos=c(2, 2), self.cex=0.5, shadow.size=0)
[Figure: Markov chain for a lad's daily activities]

Solve a Markov chain

The markovchain package has a steadyStates function that calculates the chain's equilibrium distribution (i.e., that "solves" the chain):

> steadyStates(lad.mc)
                 [,1]
Drinking    0.2644231
Sightseeing 0.1586538
Kayaking    0.2307692
Hiking      0.3461538
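As a quick sanity check (a sketch, assuming the lad.mc and lad.transitions objects defined earlier), the equilibrium vector should be invariant under the transition matrix. Because our matrix is stored by columns, the condition is P %*% pi = pi:

```r
# Assumes lad.transitions and lad.mc from above. For a column-stochastic
# matrix, the steady-state vector pi satisfies P %*% pi = pi.
pi.ss <- steadyStates(lad.mc)
all.equal(as.numeric(lad.transitions %*% pi.ss), as.numeric(pi.ss))
# should be TRUE (up to floating-point tolerance)
```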

Generate a random sequence of states from a Markov chain

Use the markovchainSequence function from the markovchain package to generate a random sequence of states from the chain according to its transition probabilities (over a long run, the state frequencies approach the chain's stationary distribution):

> markovchainSequence(100, lad.mc, t0='Drinking', include.t0=TRUE)
  [1] "Drinking"    "Sightseeing" "Hiking"      "Hiking"      "Hiking"
  [6] "Hiking"      "Hiking"      "Hiking"      "Hiking"      "Drinking"
 [11] "Drinking"    "Sightseeing" "Hiking"      "Kayaking"    "Kayaking"
 [16] "Kayaking"    "Drinking"    "Sightseeing" "Hiking"      "Drinking"
 [21] "Sightseeing" "Hiking"      "Hiking"      "Kayaking"    "Kayaking"
 [26] "Drinking"    "Sightseeing" "Hiking"      "Hiking"      "Drinking"
 [31] "Drinking"    "Drinking"    "Sightseeing" "Drinking"    "Sightseeing"
 [36] "Drinking"    "Sightseeing" "Hiking"      "Hiking"      "Hiking"
 [41] "Drinking"    "Drinking"    "Sightseeing" "Hiking"      "Kayaking"
 [46] "Drinking"    "Drinking"    "Sightseeing" "Hiking"      "Drinking"
 [51] "Drinking"    "Sightseeing" "Hiking"      "Drinking"    "Sightseeing"
 [56] "Hiking"      "Hiking"      "Drinking"    "Drinking"    "Drinking"
 [61] "Sightseeing" "Drinking"    "Sightseeing" "Drinking"    "Sightseeing"
 [66] "Hiking"      "Hiking"      "Hiking"      "Hiking"      "Drinking"
 [71] "Sightseeing" "Drinking"    "Sightseeing" "Hiking"      "Kayaking"
 [76] "Kayaking"    "Hiking"      "Hiking"      "Hiking"      "Hiking"
 [81] "Drinking"    "Sightseeing" "Hiking"      "Hiking"      "Hiking"
 [86] "Hiking"      "Drinking"    "Sightseeing" "Hiking"      "Hiking"
 [91] "Hiking"      "Drinking"    "Sightseeing" "Hiking"      "Hiking"
 [96] "Hiking"      "Drinking"    "Sightseeing" "Hiking"      "Drinking"
[101] "Drinking"
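The long-run fraction of days the simulated lad spends in each activity should approach the equilibrium distribution we computed above. A sketch (the seed and sequence length here are arbitrary choices):

```r
# Assumes lad.mc from above. Longer sequences track the steady state
# more closely.
set.seed(42)   # arbitrary seed, for reproducibility
lad.seq <- markovchainSequence(10000, lad.mc, t0='Drinking')
round(table(lad.seq) / length(lad.seq), 3)
# compare against steadyStates(lad.mc)
```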

Estimate a transition matrix from a sequence of states

Use markovchainFit to estimate a transition matrix from a sequence of states:

> lad.seq.df <- read.csv('lad-states.dat')
> lad.transitions.fit <- markovchainFit(t(lad.seq.df))
> lad.transitions.fit
$estimate
Unnamed Markov chain
 A  4 - dimensional discrete Markov Chain defined by the following states:
 Drinking, Hiking, Kayaking, Sightseeing
 The transition matrix  (by rows)  is defined as follows:
             Drinking    Hiking   Kayaking Sightseeing
Drinking    0.3103448 0.0000000 0.00000000   0.6896552
Hiking      0.3170732 0.5853659 0.09756098   0.0000000
Kayaking    0.3750000 0.1250000 0.50000000   0.0000000
Sightseeing 0.2380952 0.7619048 0.00000000   0.0000000


$standardError
              Drinking    Hiking   Kayaking Sightseeing
Drinking    0.10344828 0.0000000 0.00000000   0.1542116
Hiking      0.08794028 0.1194873 0.04878049   0.0000000
Kayaking    0.21650635 0.1250000 0.25000000   0.0000000
Sightseeing 0.10647943 0.1904762 0.00000000   0.0000000

$confidenceLevel
[1] 0.95

$lowerEndpointMatrix
              Drinking    Hiking    Kayaking Sightseeing
Drinking    0.10758989 0.0000000 0.000000000    0.387406
Hiking      0.14471336 0.3511750 0.001952956    0.000000
Kayaking    0.00000000 0.0000000 0.010008902    0.000000
Sightseeing 0.02939935 0.3885782 0.000000000    0.000000

$upperEndpointMatrix
             Drinking    Hiking  Kayaking Sightseeing
Drinking    0.5130998 0.0000000 0.0000000   0.9919044
Hiking      0.4894330 0.8195567 0.1931690   0.0000000
Kayaking    0.7993447 0.3699955 0.9899911   0.0000000
Sightseeing 0.4467911 1.0000000 0.0000000   0.0000000
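If you don't have an observed data file handy, you can close the loop with a round trip: simulate a long sequence from the known chain and re-estimate the transition matrix from it. markovchainFit accepts a plain character vector, so no file I/O is needed (a sketch; the seed and sequence length are arbitrary):

```r
# Assumes lad.mc from above. markovchainFit reports its estimate by rows,
# so compare it against the transpose of our column-oriented matrix.
set.seed(1)   # arbitrary seed, for reproducibility
lad.sim.fit <- markovchainFit(markovchainSequence(5000, lad.mc, t0='Drinking'))
lad.sim.fit$estimate
```

For a sequence this long, the fitted probabilities should land close to the true values, with correspondingly small standard errors.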