    Please use this identifier to cite or link to this item: http://ccur.lib.ccu.edu.tw/handle/A095B0000Q/337

    Title: 考量既定資源限制之卷積神經網路硬體設計取捨;Trade-offs and Optimization Strategies for Resource-Limited Convolutional Neural Network Hardware Design
    Authors: 陳昱丞;CHEN, YU-CHENG
    Contributors: 電機工程研究所 (Graduate Institute of Electrical Engineering)
    Keywords: 加速器;Roofline模型;卷積神經網路;FPGA;設計權衡;Accelerator;Roofline Model;Convolutional Neural Network;FPGA;Design Trade-offs
    Date: 2016
    Issue Date: 2019-07-17
    Publisher: 電機工程研究所 (Graduate Institute of Electrical Engineering)
    Abstract: In recent years, driven by advances in deep learning, convolutional neural networks (CNNs) have been widely applied to image recognition. Because their computations are complex, hardware implementation is often required to meet the performance demands of certain applications. Since FPGAs offer high performance, reconfigurability, and rapid development, a variety of FPGA-based hardware accelerators have been proposed. However, the design space of such accelerators is large: without careful analysis, an accelerator may fail to fully utilize either the logic resources or the memory bandwidth, and thus fall short of its best attainable performance. Furthermore, every FPGA platform has different resource constraints, so porting an existing design to another FPGA may run into insufficient resources or ineffective resource usage. We therefore optimize the CNN design with techniques such as loop tiling, and quantify the computational throughput, the required memory bandwidth, and the resource utilization. Then, with the help of the Roofline model [11], we identify the design point with the best performance and the lowest FPGA resource requirements. As a result, before porting to a new FPGA platform, the design can be adjusted in advance according to the new platform's resource constraints; and if the FPGA hosts other hardware besides the CNN, the designer can also decide how many resources to allocate to that other hardware and adapt the CNN design accordingly.
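The loop-tiling optimization mentioned in the abstract can be sketched as a software model of a convolution layer. This is only an illustrative sketch, not the thesis's actual hardware design; the tile-size names Tm/Tn/Tr/Tc follow common CNN-accelerator notation and are assumptions, not parameters taken from the thesis.

```python
import numpy as np

def conv_naive(inp, w):
    # inp: (N, R+K-1, C+K-1) input feature maps; w: (M, N, K, K) kernels.
    M, N, K, _ = w.shape
    R = inp.shape[1] - K + 1
    C = inp.shape[2] - K + 1
    out = np.zeros((M, R, C))
    for m in range(M):                  # output feature maps
        for n in range(N):              # input feature maps
            for r in range(R):          # output rows
                for c in range(C):      # output columns
                    for i in range(K):
                        for j in range(K):
                            out[m, r, c] += w[m, n, i, j] * inp[n, r + i, c + j]
    return out

def conv_tiled(inp, w, Tm=2, Tn=2, Tr=4, Tc=4):
    # Same computation, but with the m/n/r/c loops tiled so that only a
    # Tm x Tn x Tr x Tc working set must reside in on-chip buffers at once.
    # The tile sizes determine both resource usage and off-chip traffic,
    # which is exactly the trade-off the abstract quantifies.
    M, N, K, _ = w.shape
    R = inp.shape[1] - K + 1
    C = inp.shape[2] - K + 1
    out = np.zeros((M, R, C))
    for m0 in range(0, M, Tm):
        for n0 in range(0, N, Tn):
            for r0 in range(0, R, Tr):
                for c0 in range(0, C, Tc):
                    # Inner loops walk one tile; in hardware these are the
                    # loops unrolled into parallel multiply-accumulate units.
                    for m in range(m0, min(m0 + Tm, M)):
                        for n in range(n0, min(n0 + Tn, N)):
                            for r in range(r0, min(r0 + Tr, R)):
                                for c in range(c0, min(c0 + Tc, C)):
                                    for i in range(K):
                                        for j in range(K):
                                            out[m, r, c] += w[m, n, i, j] * inp[n, r + i, c + j]
    return out
```

Both functions compute the same result; tiling only reorders the iteration space, which is why it can be applied freely to match a given platform's buffer and bandwidth limits.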
    Recently, the convolutional neural network (CNN) has been widely used in deep learning for many challenging tasks, such as image recognition. Because of its complicated calculations, a CNN often needs to be implemented on an FPGA, GPU, or ASIC to meet the performance requirement. Among these realization alternatives, the FPGA is credited with high performance, reconfigurability, and short development time. Consequently, FPGA-based CNN accelerators deserve good optimization strategies in order to achieve high performance under logic, memory, and I/O bandwidth constraints. In this regard, we propose to use loop tiling and subsequently to calculate the throughput, memory bandwidth, and resource usage, all under the Roofline model. As such, we can easily find trade-offs among various design parameters. Moreover, the proposed methodology can be quickly adapted to other platforms for the same purpose of prototyping CNN accelerators on FPGAs.
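The Roofline model used above can be summarized in one line: attainable performance is the minimum of the platform's computational roof and the product of a design's computation-to-communication ratio with the bandwidth roof. A minimal sketch follows; the platform numbers are hypothetical, chosen only to illustrate how tiling choices move a design between the bandwidth-bound and compute-bound regions.

```python
def attainable_gflops(ctc_ratio, peak_gflops, peak_bandwidth_gbs):
    """Roofline model: performance is capped by compute or by memory traffic.

    ctc_ratio: computation-to-communication ratio (FLOPs per byte moved
    off-chip), which the loop-tiling parameters directly determine.
    """
    return min(peak_gflops, ctc_ratio * peak_bandwidth_gbs)

# Hypothetical platform roofs, for illustration only:
peak = 100.0  # GFLOP/s computational roof
bw = 4.0      # GB/s off-chip bandwidth roof

# A tiling with few FLOPs per byte is bandwidth-bound (5 * 4 = 20 GFLOP/s)...
low = attainable_gflops(5.0, peak, bw)
# ...while a higher-CTC tiling reaches the computational roof (100 GFLOP/s).
high = attainable_gflops(40.0, peak, bw)
```

Plotting attainable GFLOP/s against the CTC ratio for every legal tiling gives the design-space picture from which the best-performing, lowest-resource point is selected.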
    Appears in Collections: [Graduate Institute of Electrical Engineering] Theses and Dissertations

    All items in CCUR are protected by copyright, with all rights reserved.
