Skip Navigation

The Computer Journal 1998 41(1):26-44; doi:10.1093/comjnl/41.1.26
© 1998 by British Computer Society
This Article
Right arrow Full Text (PDF)
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in ISI Web of Science
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrowRequest Permissions
Google Scholar
Right arrow Articles by Moon, S.-M.
Right arrow Articles by Ebcioglu, K.
Right arrow Search for Related Content
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?

The Performance Impact of Exploiting Branch ILP with Tree Representation of ILP Code

S.-M. Moon1 and K. Ebcioglu2

1 School of Electrical Engineering, Seoul National University, Seoul, Korea Email: smoon{at}altair.snu.ac.kr, 2 IBM T. J. Watson Research Center, Yorktown Heights, NY 10598, USA

Modern single-CPU microprocessors exploit instruction-level parallelism (ILP) by deriving their performance advantage mainly from parallel execution of ALU and memory instructions within a single clock cycle. This performance advantage obtained by exploiting data ILP is severely offset by sequential execution of conditional branches, especially in branch-intensive non-numerical code. Consequently, branch ILP must also be exploited by executing branches and data instructions in parallel. This requires compilation support for scheduling branches as well as architectural support for executing branches and data instructions in the same cycle. This paper performs a comprehensive empirical study aimed at evaluating the performance impact of exploiting branch ILP using a representation of ILP code called tree representation, which has been proposed by Nicolau [A. Nicolau (1985), Technical Report TR-85-678, Cornell University, Ithaca, NY] and Ebcioglu to exploit branch ILP in the most generalized form. Our results indicate that exploiting branch ILP can enhance performance substantially (i.e., as much as a geometric mean of speedup 4.5 in the 16-ALU machine, compared to the base speedup 3.0) and that the performance benefit comes not only from the intended parallel execution but from the decrease of useless speculative execution due to earlier scheduling of branches.


Received August 2, 1996. revised November 10, 1997.


Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us    What's this?




Disclaimer:
Please note that abstracts for content published before 1996 were created through digital scanning and may therefore not exactly replicate the text of the original print issues. All efforts have been made to ensure accuracy, but the Publisher will not be held responsible for any remaining inaccuracies. If you require any further clarification, please contact our Customer Services Department.