Authors: Christopher Dabrowski, Fern Hunt
We describe how a discrete-time Markov chain simulation and graph-theoretic concepts can be used together to analyze the behavior of complex distributed systems efficiently. Specifically, the paper shows how minimal s-t cut set analysis can be used to identify state transitions in the directed graph of a time-inhomogeneous Markov chain which, when suitably perturbed, lead to performance degradation in the system being modeled. These state transitions can then be related to failure scenarios in which the performance of the modeled system declines catastrophically. Using a large-scale simulation of a grid system, we provide examples of how this approach identifies failure scenarios. Preliminary experiments show that the approach can be applied to problems of significant size. The approach combines techniques whose joint use to analyze dynamic system behavior has not previously been reported.
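As a rough illustration of the s-t cut idea, the sketch below treats one time step of a small Markov chain as a directed graph and finds a minimum s-t cut between an initial state and a goal state. It is a minimal sketch under stated assumptions, not the procedure used in the paper: the state names, the transition probabilities, and the choice to use probabilities as edge capacities are all hypothetical, and the cut is found with a plain Edmonds-Karp max-flow computation rather than any method the paper specifies.

```python
from collections import deque


def min_st_cut(capacity, source, sink):
    """Return (cut_weight, cut_edges) for a directed graph.

    capacity maps (u, v) edges to non-negative weights. Uses Edmonds-Karp
    max-flow; by the max-flow/min-cut theorem the resulting flow value
    equals the weight of a minimum s-t cut.
    """
    # Build residual capacities, adding zero-capacity reverse edges.
    residual = {}
    for (u, v), c in capacity.items():
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        residual[u][v] = residual[u].get(v, 0.0) + c
        residual[v].setdefault(u, 0.0)

    flow = 0.0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break  # no augmenting path left; parent holds the source side

        # Find the bottleneck capacity along the path, then push flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck

    # Cut edges go from states still reachable from the source to the rest.
    reachable = set(parent)
    cut = sorted((u, v) for (u, v) in capacity
                 if u in reachable and v not in reachable)
    return flow, cut


# Hypothetical one-step transition probabilities (self-loops omitted).
# These states and values are illustrative assumptions only.
transitions = {
    ("initial", "discovering"): 1.0,
    ("discovering", "negotiating"): 0.9,
    ("discovering", "failed"): 0.1,
    ("negotiating", "running"): 0.8,
    ("negotiating", "discovering"): 0.2,
    ("running", "completed"): 0.95,
    ("running", "failed"): 0.05,
}

flow, cut = min_st_cut(transitions, "initial", "completed")
print("cut weight:", flow)
print("cut transitions:", cut)
```

In this toy chain the cut consists of the single transition from `negotiating` to `running`, which is the kind of transition one would then perturb in the simulation to see whether the modeled system degrades. For a time-inhomogeneous chain, the same computation would presumably be repeated for the transition matrix of each time period.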