    Please use this identifier to cite or link to this item: http://ccur.lib.ccu.edu.tw/handle/A095B0000Q/58

    Title: 利用層級的資訊來作知識轉移的學習;Layer Leveled Knowledge Distillation for Deep Neural Network Learning
    Authors: 林仕杰;LIN, SHIH-CHIEH
    Contributors: Graduate Institute of Computer Science and Information Engineering
    Keywords: Knowledge Distillation; Deep Learning; Convolutional Neural Network (CNN)
    Date: 2018
    Issue Date: 2019-05-23 10:30:19 (UTC+8)
    Publisher: Graduate Institute of Computer Science and Information Engineering
    Abstract: With the popularity of deep learning and the growth of computing power, neural networks have become deeper and larger. Despite this model complexity, two challenges remain for deep model training. One is the expensive computational cost; the other is that there may be insufficient or unlabeled data to train or adapt a deep architecture to new tasks. This work is also motivated by the recently proposed knowledge distillation approach, which aims at obtaining small, fast-to-execute models. We propose an auxiliary structure for deep model learning with insufficient data, which uses additional alignment layers to migrate the weights of the auxiliary model to a small model. We then use the pre-trained parameters to train the small model. To decide which convolutional or fully connected layer carries more useful information, we combine two evaluations: the layered inter-class matrix and the inter-layered Gram matrix. The layered inter-class matrix indicates the degree of correlation between features of different classes, while the inter-layered Gram matrix represents the knowledge distilled from the complex network and consists of the inner products between features from any two layers of the model. The experimental results show that the layers selected by these two matrices carry the most valuable information and that model training with the proposed auxiliary structure achieves superior performance.
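    The two layer-evaluation matrices named in the abstract can be illustrated with a minimal NumPy sketch. This is not the thesis's implementation: the abstract gives no formulas, so the function names, the array shapes, the averaging over spatial positions, and the use of Pearson correlation between class-mean features are all assumptions made for illustration.

```python
import numpy as np


def inter_layer_gram(feat_a, feat_b):
    """Inner products between the channels of two layers' feature maps.

    Assumption (not from the source): both maps share the same spatial
    size, with shapes (H, W, C1) and (H, W, C2). Returns a (C1, C2) matrix.
    """
    h, w, c1 = feat_a.shape
    if feat_b.shape[:2] != (h, w):
        raise ValueError("feature maps must share spatial dimensions")
    a = feat_a.reshape(h * w, c1)
    b = feat_b.reshape(h * w, -1)
    # Entry (i, j) is the inner product of channel i of layer A with
    # channel j of layer B, averaged over spatial positions.
    return a.T @ b / (h * w)


def inter_class_correlation(features, labels):
    """Correlation between the mean feature vectors of each class.

    Assumption (not from the source): `features` has shape (N, D) for one
    layer, `labels` has shape (N,), and "degree of correlation" is taken
    to be the Pearson correlation. Returns a (K, K) matrix for K classes.
    """
    classes = np.unique(labels)
    means = np.stack([features[labels == k].mean(axis=0) for k in classes])
    return np.corrcoef(means)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    layer_a = rng.normal(size=(8, 8, 16))   # hypothetical activations
    layer_b = rng.normal(size=(8, 8, 32))
    print(inter_layer_gram(layer_a, layer_b).shape)

    feats = rng.normal(size=(60, 10))
    labels = np.repeat(np.arange(3), 20)
    print(inter_class_correlation(feats, labels).shape)
```

    Under these assumptions, a layer whose inter-class matrix has small off-diagonal entries separates classes more cleanly, which is one plausible reading of a layer having "more useful information"; the thesis itself should be consulted for the actual selection criterion.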
    Appears in Collections: [Department of Computer Science and Information Engineering] Theses and Dissertations

    All items in CCUR are protected by copyright, with all rights reserved.
