A 7nm 4-Core AI Chip with 25.6TFLOPS Hybrid FP8 Training, 102.4TOPS INT4 Inference and Workload-Aware Throttling

Ankur Agrawal, Sae Kyu Lee, Joel Silberman, Matthew Ziegler, Mingu Kang, Swagath Venkataramani, Nianzheng Cao, Bruce Fleischer, Michael Guillorn, Matthew Cohen, Silvia Mueller, Jinwook Oh, Martin Lutz, Jinwook Jung, Siyu Koswatta, Ching Zhou, Vidhi Zalani, James Bonanno, Robert Casatuta, Chia Yu ChenJungwook Choi, Howard Haynie, Alyssa Herbert, Radhika Jain, Monodeep Kar, Kyu Hyoun Kim, Yulong Li, Zhibin Ren, Scot Rider, Marcel Schaal, Kerstin Schelm, Michael Scheuermann, Xiao Sun, Hung Tran, Naigang Wang, Wei Wang, Xin Zhang, Vinay Shah, Brian Curran, Vijayalakshmi Srinivasan, Pong Fei Lu, Sunil Shukla, Leland Chang, Kailash Gopalakrishnan

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Low-precision computation is the key enabling factor to achieve high compute densities (T0PS/W and T0PS/mm2) in AI hardware accelerators across cloud and edge platforms. However, robust deep learning (DL) model accuracy equivalent to high-precision computation must be maintained. Improvements in bandwidth, architecture, and power management are also required to harness the benefit of reduced precision by feeding and supporting more parallel engines to achieve high sustained utilization and optimize performance within a given product power envelope. In this work, we present a 4-core AI chip in 7nm EUV technology that exploits cutting-edge algorithmic advances for iso-accurate models in low-precision training and inference [1, 2] and aggressive circuit/architecture optimization to achieve leading-edge power-performance. The chip supports fp16 (DLFIoat16 [8]) and hybrid-fp8(hfp8) [1] formats for training and inference of DL models, as well as int4 and int2 formats for highly scaled inference.

Original languageEnglish
Title of host publication2021 IEEE International Solid-State Circuits Conference, ISSCC 2021 - Digest of Technical Papers
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages144-146
Number of pages3
ISBN (Electronic)9781728195490
DOIs
StatePublished - 2021 Feb 13
Event2021 IEEE International Solid-State Circuits Conference, ISSCC 2021 - San Francisco, United States
Duration: 2021 Feb 132021 Feb 22

Publication series

NameDigest of Technical Papers - IEEE International Solid-State Circuits Conference
Volume64
ISSN (Print)0193-6530

Conference

Conference2021 IEEE International Solid-State Circuits Conference, ISSCC 2021
Country/TerritoryUnited States
CitySan Francisco
Period21/02/1321/02/22

Fingerprint

Dive into the research topics of 'A 7nm 4-Core AI Chip with 25.6TFLOPS Hybrid FP8 Training, 102.4TOPS INT4 Inference and Workload-Aware Throttling'. Together they form a unique fingerprint.

Cite this