Generalized overlap regions for communication optimization in data-parallel programs

Document Type

Conference Proceeding

Publication Date

1-1-1997

Abstract

Data-parallel languages such as High Performance Fortran, Vienna Fortran and Fortran D include directives for alignment and distribution that describe how data and computation are mapped onto the processors in a distributed-memory multiprocessor. A compiler for these languages that generates code for each processor has to compute the sequence of local memory addresses accessed by each processor and the sequence of sends and receives needed for a given processor to access non-local data. While the address generation problem has received much attention, issues in communication have not been dealt with extensively. A novel approach for the management of communication sets and strategies for local storage of remote references is presented. Algorithms for deriving communication patterns are discussed first. Then, two schemes are presented that extend the notion of a local array by providing storage for non-local elements (called overlap regions) interspersed throughout the storage for the local portion. The two schemes, namely course padding and column padding, enhance locality of reference significantly at the cost of a small overhead due to unpacking of messages. The performance of these schemes is compared to the traditional buffer-based approach, and improvements of up to 30% in total time are demonstrated. Several message optimizations such as offset communication, message aggregation and coalescing are also discussed.
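
To make the idea of overlap regions interspersed through local storage concrete, the following is a minimal sketch in Python, assuming a one-dimensional cyclic(k) distribution over p processors and a nearest-neighbour stencil. It is not the paper's course padding or column padding scheme: the layout (one overlap cell on each side of every local block), the function names, and the `halo` parameter are all assumptions made for illustration, and the message exchange is simulated by copying from the global array.

```python
# Illustrative sketch only: a hypothetical padded layout, not the paper's schemes.
# Assumes len(b) is a multiple of k * p.

def owner(i, k, p):
    """Processor that owns global index i under cyclic(k) over p processors."""
    return (i // k) % p


def local_index(i, k, p):
    """Position of global index i in its owner's packed (unpadded) local array."""
    return (i // (k * p)) * k + i % k


def padded_index(i, k, p, halo=1):
    """Position of global index i when every local block of k elements
    carries `halo` overlap cells on each side (assumed layout)."""
    block = i // (k * p)  # which local block holds i
    return block * (k + 2 * halo) + halo + i % k


def build_padded_local(b, q, k, p, halo=1):
    """Build processor q's padded local array. Overlap cells are filled by
    copying from the global array, standing in for message unpacking."""
    n = len(b)
    blocks = n // (k * p)
    local = [0] * (blocks * (k + 2 * halo))
    for blk in range(blocks):
        g0 = (blk * p + q) * k               # global index of the block's first element
        base = blk * (k + 2 * halo) + halo   # local position of that element
        for off in range(-halo, k + halo):   # overlap cells and owned cells
            g = g0 + off
            if 0 <= g < n:
                local[base + off] = b[g]
    return local


def stencil(b, k, p):
    """Compute a[i] = b[i-1] + b[i+1] for interior i, reading only from
    the owner's padded local array."""
    n = len(b)
    locals_ = [build_padded_local(b, q, k, p) for q in range(p)]
    a = [0] * n
    for i in range(1, n - 1):
        loc = locals_[owner(i, k, p)]
        pos = padded_index(i, k, p)
        a[i] = loc[pos - 1] + loc[pos + 1]   # neighbours sit next to the owned cell
    return a
```

In this sketch the stencil reads its neighbours at adjacent local addresses rather than from a separate receive buffer, which is the locality benefit the abstract attributes to overlap regions; the cost is the extra copy that places received elements into the padded array.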

Publication Source (Journal or Book title)

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

First Page

404

Last Page

419
