Adaptive parallel tiled code generation and accelerated auto-tuning

Document Type

Conference Proceeding

Publication Date

11-1-2013

Abstract

Tiling is an important program transformation that is often used to enhance cache locality and to obtain coarse-grained parallelism. In this paper, we address the problem of generating adaptive parametric tiled code for parallel execution contexts; in other words, generating parallel tiled code in which tile sizes can be changed on the fly during execution. Changing tile sizes during pipelined parallel execution of tiles presents a fundamental code-generation challenge: the unscanned iteration space may become non-convex. We develop novel solutions for the adaptive parallel tiled code generation problem. Adaptive tiling accelerates auto-tuning for tile size selection: several tile sizes can be tested for performance in a single run of the tiled code. Adaptive tiling is also useful in scenarios where tile sizes must be altered dynamically to adapt to changing execution environments, such as caches that are dynamically resized for power savings. Experimental evaluation on a number of benchmarks demonstrates the effectiveness of the developed approach. © The Author(s) 2013.
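
The idea of testing several tile sizes within one run can be illustrated with a minimal sketch. It is not the code generator described in the paper: it is sequential, one-dimensional, and switches tile sizes only at tile boundaries, so the unscanned iteration space stays a simple interval and the non-convexity problem of the parallel case never arises. The function name `adaptive_tiled_sum` and the parameters `candidate_sizes` and `tiles_per_trial` are hypothetical and chosen for illustration only.

```python
import time


def adaptive_tiled_sum(data, candidate_sizes, tiles_per_trial=4):
    """Sum `data` tile by tile, rotating through candidate tile sizes.

    Hypothetical illustration, not the paper's method: the tile size
    changes only between tiles, and execution is sequential.
    """
    n = len(data)
    total = 0
    # tile size -> [accumulated seconds, elements processed]
    cost = {size: [0.0, 0] for size in candidate_sizes}
    pos = 0
    trial = 0
    while pos < n:
        # Pick the next candidate tile size for this trial.
        size = candidate_sizes[trial % len(candidate_sizes)]
        begin = pos
        start = time.perf_counter()
        for _ in range(tiles_per_trial):
            if pos >= n:
                break
            end = min(pos + size, n)      # clip the last, partial tile
            for k in range(pos, end):     # intra-tile loop
                total += data[k]
            pos = end
        cost[size][0] += time.perf_counter() - start
        cost[size][1] += pos - begin
        trial += 1
    # Average time per element for every tile size that was sampled.
    per_element = {s: t / c for s, (t, c) in cost.items() if c > 0}
    best = min(per_element, key=per_element.get)
    return total, best, per_element
```

Calling `adaptive_tiled_sum(list(range(1000)), [8, 16, 32])` returns the sum together with the candidate size that had the lowest measured time per element in that run. In a realistic setting the loop body would be a cache-sensitive kernel, and the timings would be noisy enough to need repeated sampling.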

Publication Source (Journal or Book title)

International Journal of High Performance Computing Applications

First Page

412

Last Page

425

