Efficient AI System Design with Cross-Layer Approximate Computing

Swagath Venkataramani, Xiao Sun, Naigang Wang, Chia Yu Chen, Jungwook Choi, Mingu Kang, Ankur Agarwal, Jinwook Oh, Shubham Jain, Tina Babinsky, Nianzheng Cao, Thomas Fox, Bruce Fleischer, George Gristede, Michael Guillorn, Howard Haynie, Hiroshi Inoue, Kazuaki Ishizaki, Michael Klaiber, Shih Hsien Lo, Gary Maier, Silvia Mueller, Michael Scheuermann, Eri Ogawa, Marcel Schaal, Mauricio Serrano, Joel Silberman, Christos Vezyrtzis, Wei Wang, Fanchieh Yee, Jintao Zhang, Matthew Ziegler, Ching Zhou, Moriyoshi Ohara, Pong Fei Lu, Brian Curran, Sunil Shukla, Vijayalakshmi Srinivasan, Leland Chang, Kailash Gopalakrishnan

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

Advances in deep neural networks (DNNs) and the availability of massive real-world data have enabled superhuman levels of accuracy on many AI tasks and ushered in the explosive growth of AI workloads across the spectrum of computing devices. However, their superior accuracy comes at a high computational cost, which necessitates approaches beyond traditional computing paradigms to improve their operational efficiency. Leveraging the application-level insight of error resilience, we demonstrate how approximate computing (AxC) can significantly boost the efficiency of AI platforms and play a pivotal role in the broader adoption of AI-based applications and services. To this end, we present RaPiD, a multi-tera operations per second (TOPS) AI hardware accelerator core (fabricated at 14-nm technology) that we built from the ground up using AxC techniques across the stack, including algorithms, architecture, programmability, and hardware. We highlight the workload-guided systematic explorations of AxC techniques for AI, including custom number representations, quantization/pruning methodologies, mixed-precision architecture design, instruction sets, and compiler technologies with quality programmability, employed in the RaPiD accelerator.

Original language: English
Article number: 9253640
Pages (from-to): 2232-2250
Number of pages: 19
Journal: Proceedings of the IEEE
Volume: 108
Issue number: 12
DOIs
State: Published - 2020 Dec
Externally published: Yes

Keywords

  • Approximate computing (AxC)
  • artificial intelligence (AI)
  • deep neural networks (DNNs)
  • hardware acceleration
