Markov Chain Transition Matrices

Estimating Markov Chain Transition Matrices in Limited Data Samples: A Monte Carlo Experiment

Authors: Wang B, Tsang K, Garrison L
Conference: International Society for Pharmacoeconomics and Outcomes Research
Location: Washington, DC, USA
Year: 2012

Objective: Markov models are often used in Health Economics to represent disease progression in Cost-Utility models. The transition probabilities, however, may be difficult to populate when the data are limited. This note applies the Markov matrix approximation method using vector autoregression (VAR) to estimate the transition matrix when the sample size is small.

Methods: We compare the performance of the standard (count) method versus the VAR method to estimate transition probabilities in small samples. For the count method, one counts the transitions from each state to every other state in the data and then divides the counts by the number of occurrences of each origin state. The VAR method follows Tauchen (1986) and Terry and Knotek (2011). We compare the two methods using Monte Carlo simulations by generating small samples from different data generating processes (DGPs) and comparing the mean squared error of each method relative to the true transition matrix. We employ two DGPs to populate the entries of the underlying transition probability matrices in our study: 1) a normal distribution with large variance (DGP1), and 2) a uniform distribution with small variance (DGP2). We then normalize each row so that it sums to 1.
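The count method and the Monte Carlo error comparison described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, the sample is assumed to be a single observed state sequence of length N, the true matrix is drawn from a uniform distribution on [0, 1] (only loosely in the spirit of DGP2, whose parameters the abstract does not give), and rows for states never observed as an origin are left as zeros. The VAR method is not sketched here.

```python
import numpy as np

def count_transition_matrix(states, n_states):
    """Count method: tally observed transitions, then divide each row
    by the number of times its origin state was observed."""
    counts = np.zeros((n_states, n_states))
    for current, nxt in zip(states[:-1], states[1:]):
        counts[current, nxt] += 1
    row_totals = counts.sum(axis=1, keepdims=True)
    # Assumption: rows with no observed origin state stay all-zero.
    return np.divide(counts, row_totals,
                     out=np.zeros_like(counts), where=row_totals > 0)

def random_transition_matrix(n_states, rng):
    """Assumed DGP: uniform(0, 1) entries, each row normalized to sum to 1."""
    raw = rng.uniform(size=(n_states, n_states))
    return raw / raw.sum(axis=1, keepdims=True)

def simulate_chain(P, n_steps, rng, start=0):
    """Simulate one state sequence of length n_steps from matrix P."""
    states = [start]
    for _ in range(n_steps - 1):
        states.append(int(rng.choice(len(P), p=P[states[-1]])))
    return np.array(states)

def mse(P_hat, P_true):
    """Mean squared error over all entries of the matrix."""
    return float(np.mean((P_hat - P_true) ** 2))

# Monte Carlo loop: average error of the count method for N = 30.
rng = np.random.default_rng(0)
P_true = random_transition_matrix(3, rng)
errors = [
    mse(count_transition_matrix(simulate_chain(P_true, 30, rng), 3), P_true)
    for _ in range(200)
]
print("mean MSE, count method, N=30:", np.mean(errors))
```

Repeating the loop for N = 10 and N = 50, and for a second estimator, would give the kind of comparison the Methods paragraph describes.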

Results: For DGP1, the VAR method outperforms the count method in small samples (N = 10 or 30), and the count method marginally outperforms the VAR method in the large sample (N = 50). For DGP2, the VAR method outperforms the count method in small samples, and both methods perform similarly in the large sample. We propose combining the two methods, increasing the weight on the count method as the sample size increases relative to the size of the matrix.
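One way to picture the proposed combination is a convex weighting of the two estimates. The sketch below is only an illustration: the abstract does not specify the weighting rule, so the formula w = N / (N + number of matrix entries) and the function name combine_estimates are hypothetical assumptions. The VAR-based estimate is taken as a given input rather than computed.

```python
import numpy as np

def combine_estimates(P_count, P_var, n_obs):
    """Blend two transition-matrix estimates.

    Hypothetical weighting rule (not from the source): the weight on the
    count estimate grows as the sample size n_obs increases relative to
    the number of entries in the matrix.
    """
    P_count = np.asarray(P_count, dtype=float)
    P_var = np.asarray(P_var, dtype=float)
    n_cells = P_count.size
    w_count = n_obs / (n_obs + n_cells)  # assumed form, lies in [0, 1)
    # A convex combination of row-stochastic matrices stays row-stochastic.
    return w_count * P_count + (1.0 - w_count) * P_var

P_count = [[1.0, 0.0], [0.0, 1.0]]   # illustrative inputs only
P_var = [[0.5, 0.5], [0.5, 0.5]]
P_mix = combine_estimates(P_count, P_var, n_obs=4)
```

Because both inputs have rows summing to 1, the blended matrix does too, so no renormalization is needed under this assumed rule.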

Conclusions: Applying this methodology in Health Economics modeling allows researchers to use Markov models in situations where they were previously infeasible due to a paucity of data.