OpenAlex · Updated hourly · Last updated: 13.03.2026, 01:52

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

A Survey of Quantization Methods for Efficient Neural Network Inference

2022 · 971 citations · Open Access
Open full text at the publisher

Citations: 971 · Authors: 6 · Year: 2022

Abstract

This chapter provides approaches to the problem of quantizing the numerical values in deep Neural Network computations, covering the advantages/disadvantages of current methods. Over the past decade, people have observed significant improvements in the accuracy of Neural Networks (NNs) for a wide range of problems, often achieved by highly over-parameterized models. Achieving efficient, real-time NNs with optimal accuracy requires rethinking the design, training, and deployment of NN models. Model distillation involves training a large model and then using it as a teacher to train a more compact model. Loosely related to NN quantization is work in neuroscience that suggests that the human brain stores information in a discrete/quantized form, rather than in a continuous form. Gray and Neuhoff have written a very nice survey of the history of quantization up to 1998.

Topics

Advanced Neural Network Applications · Neural Networks and Applications · Medical Image Segmentation Techniques