A downloadable research

Layers are chained together in a pipeline: each layer knows how to decode the information passed to it by the previous layer and how to process it into something that ultimately contributes to a prediction. We therefore hypothesise that, on the one hand, it may be beneficial to copy consecutive layers from the teacher to the student, as they can already decode each other's output. On the other hand, copying layers that are far apart may transfer knowledge about different processing steps, while the connections between them can be learnt more easily.
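To make the two strategies concrete, here is a minimal sketch of how the teacher layer indices might be chosen in each case. The function names, the 12-layer teacher, and the 4-layer student are illustrative assumptions, not taken from the paper, and the even spacing rule is only one possible way to pick well-separated layers.

```python
def consecutive_layers(num_teacher_layers, num_student_layers, start=0):
    """Pick a contiguous block of teacher layer indices (hypothetical helper).

    These layers were trained next to each other, so each one can already
    decode the output of the one before it.
    """
    if start < 0 or start + num_student_layers > num_teacher_layers:
        raise ValueError("block does not fit inside the teacher")
    return list(range(start, start + num_student_layers))


def spread_layers(num_teacher_layers, num_student_layers):
    """Pick teacher layer indices spaced roughly evenly (hypothetical helper).

    These layers cover different processing steps, but the student has to
    learn the connections between them during distillation.
    """
    if not 1 <= num_student_layers <= num_teacher_layers:
        raise ValueError("student must have between 1 and num_teacher_layers layers")
    if num_student_layers == 1:
        return [0]
    # Step between selected layers, from the first teacher layer to the last.
    step = (num_teacher_layers - 1) / (num_student_layers - 1)
    return [round(i * step) for i in range(num_student_layers)]


# Assumed sizes for illustration: 12 teacher layers, 4 student layers.
print(consecutive_layers(12, 4))  # [0, 1, 2, 3]
print(spread_layers(12, 4))       # [0, 4, 7, 11]
```

The returned indices would then tell a training script which teacher layers to copy into the student before fine-tuning; how the copy itself is done depends on the framework and is not shown here.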

Download

Neural_Network_Knowledge_Distillation_Importance_of_Layer_Selection (2).pdf 278 kB