This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

A Survey of Quantization Methods for Efficient Neural Network Inference

2022 · 1,000 citations · Open Access

Abstract

This chapter provides approaches to the problem of quantizing the numerical values in deep Neural Network computations, covering the advantages/disadvantages of current methods. Over the past decade, people have observed significant improvements in the accuracy of Neural Networks (NNs) for a wide range of problems, often achieved by highly over-parameterized models. Achieving efficient, real-time NNs with optimal accuracy requires rethinking the design, training, and deployment of NN models. Model distillation involves training a large model and then using it as a teacher to train a more compact model. Loosely related to NN quantization is work in neuroscience that suggests that the human brain stores information in a discrete/quantized form, rather than in a continuous form. Gray and Neuhoff have written a very nice survey of the history of quantization up to 1998.

Institutions

University of California, Berkeley (US)

Topics

Advanced Neural Network Applications · Neural Networks and Applications · Medical Image Segmentation Techniques
