ATT: A Fault-Tolerant ReRAM Accelerator for Attention-based Neural Networks

Document Type

Conference Proceeding

Publication Date

10-1-2020

Abstract

Crossbar-based resistive RAM (ReRAM) has been widely used in deep learning accelerator designs because it largely eliminates weight movement between memory and processing units. Its high-density storage and low leakage power make it a good fit for edge/IoT devices. However, existing ReRAM designs for traditional neural networks cannot support attention-based neural networks, which are stacked with encoders and decoders instead of convolutional or fully connected layers. Besides the matrix-matrix multiplications found in traditional neural networks, an encoder or a decoder also includes the attention mechanism, layer normalization, and the Gaussian Error Linear Unit (GELU). These new characteristics make the data flow far more complicated than that of a convolutional layer. Faulty ReRAM cells pose an additional obstacle when mapping weights, since hard faults severely degrade computation accuracy, and existing hardware redundancy strategies that ignore application characteristics usually yield inefficient designs. In this work, we analyze the data flow of attention-based neural networks and propose a ReRAM-based accelerator with a dedicated pipeline design for them. To tolerate cells with hard faults in the crossbars, we further propose NuXG, a non-uniform redundancy strategy that meets accuracy requirements while saving energy by lowering the redundancy ratio. Our evaluation shows that the proposed design achieves more than a 2x improvement over existing redundancy schemes in both power efficiency and throughput on attention-based neural networks, and also significantly outperforms an NVIDIA GPU.
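To illustrate the data flow the abstract refers to, the following is a minimal single-head encoder block sketched in NumPy. It is not the paper's implementation; all names (encoder_block, Wq, W1, etc.) are illustrative. It shows why an encoder is harder to map onto crossbars than a convolutional layer: besides static-weight matmuls, attention multiplies two runtime-generated activation matrices, and layer normalization and GELU are nonlinear steps interleaved in the pipeline.

```python
import numpy as np

def gelu(x):
    # Tanh approximation of the Gaussian Error Linear Unit (GELU).
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def layer_norm(x, eps=1e-5):
    # Normalize each token vector to zero mean and unit variance.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def attention(x, Wq, Wk, Wv):
    # Scaled dot-product self-attention. Q, K, V are produced at run time,
    # so Q @ K.T and weights @ V multiply two dynamic matrices, unlike the
    # static-weight matmuls a ReRAM crossbar maps naturally.
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def encoder_block(x, Wq, Wk, Wv, Wo, W1, W2):
    # Attention sub-layer with residual connection and layer normalization.
    x = layer_norm(x + attention(x, Wq, Wk, Wv) @ Wo)
    # Feed-forward sub-layer: two matmuls with GELU, plus residual and norm.
    return layer_norm(x + gelu(x @ W1) @ W2)
```

Interleaving these dynamic matmuls and nonlinear stages is what motivates the dedicated pipeline design described in the paper.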

Publication Source (Journal or Book title)

Proceedings - IEEE International Conference on Computer Design: VLSI in Computers and Processors

First Page

213

Last Page

221
