TY - GEN
T1 - PCM
AU - Kim, Boyeal
AU - Lee, Sang Hyun
AU - Kim, Hyun
AU - Nguyen, Duy Thanh
AU - Le, Minh Son
AU - Chang, Ik Joon
AU - Kwon, Dohun
AU - Yoo, Jin Hyeok
AU - Choi, Jun Won
AU - Lee, Hyuk Jae
N1 - Publisher Copyright: © 2020 EDAA.
PY - 2020/3
Y1 - 2020/3
N2 - Deep neural network (DNN) training suffers from significant energy consumption in the memory system, and most existing energy reduction techniques for the memory system have focused on introducing low precision that is compatible with the computing units (e.g., FP16, FP8). These studies have shown that even when training networks with FP16 data precision, it is possible to achieve training accuracy as good as with FP32, the de facto standard for DNN training. However, our extensive experiments show that the data precision can be reduced further while maintaining the training accuracy of DNNs by truncating some least significant bits (LSBs) of FP16, a technique we call hard approximation. Nevertheless, existing hardware structures for DNN training cannot efficiently support such low precision. In this work, we propose a novel memory system architecture for GPUs, named the precision-controlled memory system (PCM), which allows flexible control over the level of hard approximation. PCM provides high DRAM bandwidth by distributing each precision to different channels with a transposed data mapping on DRAM. In addition, PCM supports fine-grained hard approximation in the L1 data cache using software-controlled registers, which reduces data movement and thereby improves energy savings and system performance. Furthermore, PCM reduces data maintenance energy, which accounts for a considerable portion of memory energy consumption, by controlling the DRAM refresh period. The experimental results show that when training ResNet-20 on the CIFAR-100 dataset with precision tuning, PCM achieves an energy saving of 66% and a performance improvement of 20%, without loss of accuracy.
AB - Deep neural network (DNN) training suffers from significant energy consumption in the memory system, and most existing energy reduction techniques for the memory system have focused on introducing low precision that is compatible with the computing units (e.g., FP16, FP8). These studies have shown that even when training networks with FP16 data precision, it is possible to achieve training accuracy as good as with FP32, the de facto standard for DNN training. However, our extensive experiments show that the data precision can be reduced further while maintaining the training accuracy of DNNs by truncating some least significant bits (LSBs) of FP16, a technique we call hard approximation. Nevertheless, existing hardware structures for DNN training cannot efficiently support such low precision. In this work, we propose a novel memory system architecture for GPUs, named the precision-controlled memory system (PCM), which allows flexible control over the level of hard approximation. PCM provides high DRAM bandwidth by distributing each precision to different channels with a transposed data mapping on DRAM. In addition, PCM supports fine-grained hard approximation in the L1 data cache using software-controlled registers, which reduces data movement and thereby improves energy savings and system performance. Furthermore, PCM reduces data maintenance energy, which accounts for a considerable portion of memory energy consumption, by controlling the DRAM refresh period. The experimental results show that when training ResNet-20 on the CIFAR-100 dataset with precision tuning, PCM achieves an energy saving of 66% and a performance improvement of 20%, without loss of accuracy.
KW - Approximate Computing
KW - Deep Neural Network
KW - General Purpose Graphics Processing Unit
KW - High Bandwidth Memory
KW - Precision Control
KW - Refresh Period Control
UR - http://www.scopus.com/inward/record.url?scp=85087382861&partnerID=8YFLogxK
U2 - 10.23919/DATE48585.2020.9116530
DO - 10.23919/DATE48585.2020.9116530
M3 - Conference contribution
AN - SCOPUS:85087382861
T3 - Proceedings of the 2020 Design, Automation and Test in Europe Conference and Exhibition, DATE 2020
SP - 1199
EP - 1204
BT - Proceedings of the 2020 Design, Automation and Test in Europe Conference and Exhibition, DATE 2020
A2 - Di Natale, Giorgio
A2 - Bolchini, Cristiana
A2 - Vatajelu, Elena-Ioana
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 9 March 2020 through 13 March 2020
ER -