Performance modeling and optimization of parallel out-of-core tensor contractions
Document Type
Conference Proceeding
Publication Date
12-1-2005
Abstract
The Tensor Contraction Engine (TCE) is a domain-specific compiler for implementing complex tensor contraction expressions arising in quantum chemistry applications that model electronic structure. This paper develops a performance model for tensor contractions that accounts for both disk I/O and inter-processor communication costs, to facilitate performance-model-driven loop optimization for this domain. Experimental results are provided that demonstrate the accuracy and effectiveness of the model. Copyright 2005 ACM.
Publication Source (Journal or Book title)
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP
First Page
266
Last Page
276
Recommended Citation
Gao, X., Sahoo, S., Lam, C., Ramanujam, J., Lu, Q., Baumgartner, G., & Sadayappan, P. (2005). Performance modeling and optimization of parallel out-of-core tensor contractions. Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP, 266-276. https://doi.org/10.1145/1065944.1065980