© 2002 by British Computer Society
Fault-Tolerant Hierarchical Networks for Shared Memory Multiprocessors and their Bandwidth Analysis
1 Department of Electrical and Computer Engineering, Wayne State University, Detroit, MI 48202, USA. Email: smahmud@ece.eng.wayne.edu
2 A short version of this paper was presented at the IEEE International Conference on Algorithms and Architectures for Parallel Processing, Brisbane, Australia, April 19-21, 1995.
Cluster-based systems have received significant attention from researchers because they need far less expensive networks than non-cluster-based systems. A number of hierarchical interconnection networks (HINs) have been proposed in the literature for building large cluster-based systems, but most existing HINs are not fault tolerant. Fault tolerance is highly desirable in a HIN, because even a single fault in the network can completely disconnect a large number of processors and/or memory modules from the rest of the system, significantly degrading system performance. In this paper, we propose two types of fault-tolerant HINs that can be used to build large cluster-based multiprocessor systems. We develop analytical models to determine the performance of the proposed HINs under both fault-free and faulty conditions, and simulation models to verify the accuracy of the analytical models. The results obtained from the analytical models are very close to those obtained from the simulation models. The modeling technique used in this paper can also be applied to other hierarchical systems.