Document Type
Conference Proceeding
Publication Date
1-1-2010
Abstract
Today's multi-core era places significant demands on an optimizing compiler, which must parallelize programs, exploit the memory hierarchy, and leverage the ever-increasing SIMD capabilities of modern processors. Existing model-based heuristics for performance optimization used in compilers are limited in their ability to identify profitable parallelism/locality trade-offs and usually lead to sub-optimal performance. To address this problem, we distinguish optimizations for which effective model-based heuristics and profitability estimates exist from optimizations that require empirical search to achieve good performance in a portable fashion. We have developed a completely automatic framework in which we focus the empirical search on the set of valid possibilities to perform fusion/code motion, and rely on model-based mechanisms to perform tiling, vectorization and parallelization on the transformed program. We demonstrate the effectiveness of this approach through strong performance improvements on a single target as well as performance portability across different target architectures. © 2010 IEEE.
Publication Source (Journal or Book title)
2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2010
Recommended Citation
Pouchet, L., Bondhugula, U., Bastoul, C., Cohen, A., Ramanujam, J., & Sadayappan, P. (2010). Combined iterative and model-driven optimization in an automatic parallelization framework. 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2010. https://doi.org/10.1109/SC.2010.14