A downloadable research

Layers are chained together in a pipeline: each layer knows how to decode the information passed to it by the previous layer and how to process it into something of value that ultimately leads to a prediction. We therefore hypothesise that, on the one hand, it may be beneficial to copy consecutive layers from the teacher to the student, as they can already decode each other's output. On the other hand, copying layers that are far apart may transfer knowledge of different processing steps, while the connections between them can be learnt more easily.
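The two selection strategies contrasted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method from the PDF: the function names (`consecutive_indices`, `spaced_indices`, `build_student`), the 12-layer teacher, the 4-layer student, and the dictionary stand-ins for layers are all hypothetical. In a real framework the list entries would be network layers, and the student would still need to be trained after copying.

```python
import copy


def consecutive_indices(n_teacher, n_student, start=0):
    """Pick a block of adjacent teacher layers beginning at `start`."""
    if start < 0 or not 0 < n_student <= n_teacher - start:
        raise ValueError("the requested block does not fit inside the teacher")
    return list(range(start, start + n_student))


def spaced_indices(n_teacher, n_student):
    """Pick teacher layers spread evenly from the first to the last layer."""
    if not 0 < n_student <= n_teacher:
        raise ValueError("the student cannot have more layers than the teacher")
    if n_student == 1:
        return [0]
    step = (n_teacher - 1) / (n_student - 1)
    return [round(i * step) for i in range(n_student)]


def build_student(teacher_layers, indices):
    """Duplicate the selected teacher layers into a new, independent list."""
    return [copy.deepcopy(teacher_layers[i]) for i in indices]


# Hypothetical 12-layer teacher; each "layer" is a placeholder dictionary.
teacher = [{"name": f"layer_{i}", "weights": [float(i)]} for i in range(12)]

consecutive_student = build_student(teacher, consecutive_indices(12, 4))
spaced_student = build_student(teacher, spaced_indices(12, 4))

print([layer["name"] for layer in consecutive_student])
print([layer["name"] for layer in spaced_student])
```

With consecutive selection, every copied layer already receives input in the format it was trained on, except the first if the block does not start at layer 0. With spaced selection, each copied layer sits after a gap, so the student has to learn to bridge those gaps during training.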

More information

Status	Released
Category	Other
Author	roksanagow

Download

Neural_Network_Knowledge_Distillation_Importance_of_Layer_Selection (2).pdf 278 kB

Distillation by duplication: The importance of layer selection

