© 2004 by British Computer Society
Comparing the Optimal Performance of Parallel Architectures


Department of Software Engineering and Computer Science, Blekinge Institute of Technology, Soft Center, S-372 25 Ronneby, Sweden
Consider a parallel program with n processes and a synchronization granularity z. Consider also two parallel architectures: an SMP with q processors and run-time reallocation of processes to processors, and a distributed system (or cluster) with k processors and no run-time reallocation. There is an inter-processor communication delay of t time units for the system with no run-time reallocation. In this paper, we define a function H(n,k,q,t,z) such that the minimum completion time for all programs with n processes and a granularity z is at most H(n,k,q,t,z) times longer using the system with no reallocation and k processors compared to using the system with q processors and run-time reallocation. We assume optimal allocation and scheduling of processes to processors. The function H(n,k,q,t,z) is optimal in the sense that there is at least one program, with n processes and a granularity z, such that the ratio is exactly H(n,k,q,t,z). We also validate our results using measurements on distributed and multiprocessor Sun/Solaris environments. The function H(n,k,q,t,z) provides important insights regarding the performance implications of the fundamental design decision of whether to allow run-time reallocation of processes or not. These insights can be used when doing the proper cost/benefit trade-offs when designing parallel execution platforms.
Received 5 May 2003. Revised 20 February 2004.
* Email: Kamilla.Klonowska{at}bth.se
Email: Lars.Lundberg{at}bth.se
¶ Email: Hakan.Lennerstad{at}bth.se
Email: