Search : [ keyword: Model Compression ] (2)

Conditional Knowledge Distillation for Model Specialization

Hakbin Kim, Dong-Wan Choi

http://doi.org/10.5626/JOK.2021.48.4.369

Many recent works on model compression in neural networks are based on knowledge distillation (KD). However, since the basic goal of KD is to transfer the entire knowledge of a teacher model to a student model, standard KD may not make the best use of the student's capacity when a user wishes to classify only a small subset of classes. In addition, KD requires access to the teacher's original training dataset, which may not be fully available for practical reasons such as privacy concerns. Thus, this paper proposes conditional knowledge distillation (CKD), which distills only the specialized knowledge corresponding to a given subset of classes, as well as data-free CKD (DF-CKD), which does not require the original data. As a major extension, we devise Joint-CKD, which jointly performs DF-CKD and CKD using only a small additional dataset collected by the client. Our experimental results show that CKD and DF-CKD are superior to standard KD, and also confirm that their joint use further improves the overall accuracy of the specialized model.
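The core idea above is to distill only the teacher's knowledge for the classes a client actually needs. As a rough illustration (not the authors' implementation), the following minimal PyTorch sketch restricts the teacher's soft targets to a chosen class subset before computing the usual distillation loss; the function name, argument layout, and temperature value are assumptions made for illustration only.

```python
# Minimal sketch (illustrative, not the paper's code): distill only the
# teacher logits that correspond to a client-specified subset of classes.
import torch
import torch.nn.functional as F

def ckd_loss(student_logits, teacher_logits, class_subset, temperature=4.0):
    """Hypothetical conditional distillation loss.

    student_logits: (batch, |subset|) outputs of the specialized student
    teacher_logits: (batch, num_classes) outputs of the full teacher
    class_subset:   list of class indices the client cares about
    """
    # Restrict the teacher's soft targets to the subset of interest.
    t = teacher_logits[:, class_subset] / temperature
    s = student_logits / temperature
    soft_targets = F.softmax(t, dim=1)
    log_probs = F.log_softmax(s, dim=1)
    # KL divergence between softened teacher and student distributions,
    # scaled by T^2 as in standard knowledge distillation.
    return F.kl_div(log_probs, soft_targets, reduction="batchmean") * temperature ** 2
```

In this sketch the student's output layer is assumed to already cover only the chosen subset, so its logits align index-by-index with the sliced teacher logits.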

Compression of Korean Phrase Structure Parsing Model using Knowledge Distillation

Hyunsun Hwang, Changki Lee

http://doi.org/10.5626/JOK.2018.45.5.451

A sequence-to-sequence model is an end-to-end model that transforms an input sequence into an output sequence of a different length. However, because it relies on techniques such as the attention mechanism and input-feeding to achieve high performance, it is difficult to deploy in an actual service. In this paper, we apply sequence-level knowledge distillation, an effective model compression technique from natural language processing, to Korean phrase structure parsing. Experimental results show that when the size of the hidden layer is decreased from 500 to 50, the F1 score improves by 0.56% and the parsing speed is 60.71 times faster than that of the baseline model.
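Sequence-level knowledge distillation, as applied above, trains a small student on the teacher's decoded outputs rather than the gold targets. The sketch below illustrates that general recipe under stated assumptions; it is not the paper's code, and `beam_search` is a hypothetical decoding method standing in for whatever decoder the teacher model provides.

```python
# Minimal sketch (illustrative): sequence-level knowledge distillation,
# where the student learns from the teacher's beam-search outputs.
import torch
import torch.nn.functional as F

def build_distillation_corpus(teacher, dataloader, beam_size=5):
    """Re-label training inputs with the teacher's best decoded output."""
    teacher.eval()
    distilled = []
    with torch.no_grad():
        for src in dataloader:                         # src: a batch of input sequences
            hyp = teacher.beam_search(src, beam_size)  # hypothetical decoding API
            distilled.append((src, hyp))               # teacher output becomes the target
    return distilled

def student_step(student, optimizer, src, teacher_target):
    """One training step of the compressed student on a teacher-generated target."""
    # Teacher forcing on the distilled target sequence (shifted by one position).
    logits = student(src, teacher_target[:, :-1])
    loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        teacher_target[:, 1:].reshape(-1),
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Decoding the training set once and reusing the results as targets keeps the student's training loop identical to ordinary supervised training, which is what makes this form of distillation practical for a smaller model.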

