Memory-constrained communication minimization for a class of array computations
Document Type
Conference Proceeding
Publication Date
12-1-2005
Abstract
The accurate modeling of the electronic structure of atoms and molecules involves computationally intensive tensor contractions involving large multidimensional arrays. The efficient computation of complex tensor contractions usually requires the generation of temporary intermediate arrays. These intermediates could be extremely large, but they can often be generated and used in batches through appropriate loop fusion transformations. To optimize the performance of such computations on parallel computers, the total amount of inter-processor communication must be minimized, subject to the available memory on each processor. In this paper, we address the memory-constrained communication minimization problem in the context of this class of computations. Based on a framework that models the relationship between loop fusion and memory usage, we develop an approach to identify the best combination of loop fusion and data partitioning that minimizes inter-processor communication cost without exceeding the per-processor memory limit. The effectiveness of the developed optimization approach is demonstrated on a computation representative of a component used in quantum chemistry suites. © Springer-Verlag Berlin Heidelberg 2005.
Publication Source (Journal or Book title)
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
First Page
1
Last Page
15
Recommended Citation
Cociorva, D., Baumgartner, G., Lam, C., Sadayappan, P., & Ramanujam, J. (2005). Memory-constrained communication minimization for a class of array computations. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2481 LNCS, 1-15. https://doi.org/10.1007/11596110_1