- Abstract
- As transformer-based neural networks grow in size, lightweight algorithms and efficient AI accelerators have been developed to train these huge networks within a practical time frame. In this article, we survey state-of-the-art research on low-precision computational algorithms, especially those for floating-point formats, and on the hardware accelerators that support them. We describe the trends by focusing on the work of two leading research groups, IBM and Seoul National University, both of which have deep knowledge of AI algorithms and hardware architecture. On the algorithm side, we summarize two efficient floating-point formats (hybrid FP8 and radix-4 FP4), along with the accuracy-preserving training algorithms from the main research stream. We also describe AI processor architectures that support low-bit mixed-precision computing units, including an integer engine.
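Ⅰ. Introduction
Ⅱ. Background of Low-Precision Training
Ⅲ. Low-Precision Data Type Techniques
Ⅳ. AI Semiconductor Technology for Low-Precision Computation
Ⅴ. Conclusion and Closing Remarks
Abbreviations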
For details, please refer to the attached file.
