In: IEEE Transactions on Software Engineering, Vol. 16, No. 4, pages 444-457. April 1990.
Abstract: Fault tolerance in hierarchically distributed systems is studied using stochastic Petri nets, examining the different fault-tolerant schemes required at different levels of the hierarchy. Parameterized subnet primitives are used with stochastic Petri nets to model both centralized and distributed fault-tolerant schemes. Distributed fault tolerance is studied under an arbitrary checkpointing strategy and a planned strategy, and the effect of integration on fault-tolerance strategies at various hierarchy levels is examined.
Keywords: hierarchical distributed system; fault tolerance; stochastic Petri net; checkpointing strategy.