    Please use this identifier to cite or link to this item: http://ccur.lib.ccu.edu.tw/handle/A095B0000Q/82

    Title: Alignment-Free Error Correction for Third-Generation Sequencing by Adaptive Seed Identification
    Authors: LEE, KUAN-WEI (李冠緯)
    Contributors: Graduate Institute of Computer Science and Information Engineering
    Keywords: third-generation sequencing; alignment-free error correction
    Date: 2018
    Issue Date: 2019-05-23 10:30:24 (UTC+8)
    Publisher: Graduate Institute of Computer Science and Information Engineering
    Abstract: Third-generation sequencing technology is now commonly used for de novo assembly projects because it offers longer reads, less sequencing bias, and more uniform coverage. However, these advantages come at the cost of a higher error rate, so reads must be error-corrected before assembly. Correction algorithms are divided into alignment-based methods, such as Canu, and alignment-free methods; both face a tradeoff between accuracy and speed. We previously developed FILEC, an alignment-free algorithm based on the FM-index, but its assembly contiguity is unsatisfactory for moderate and large genomes. In this thesis, we propose a new method to improve the seeding accuracy of FILEC in repeat (high-similarity) regions. The proposed method distinguishes unique regions from repetitive ones and adaptively applies a different seeding strategy to each. Remaining erroneous seeds are then trimmed until the errors are removed. Experimental results show that our method runs much faster than Canu while maintaining the contiguity and concordance of the assembly. On the large-genome dataset, the assembly becomes fragmented, but our method is still faster than Canu.
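    The idea of choosing a seeding strategy per region can be illustrated with a minimal sketch. This is not the thesis's implementation: it swaps the FM-index for a plain k-mer count table, and every name and parameter below (count_kmers, adaptive_seeds, k_short, k_long, min_support, repeat_threshold) is hypothetical. It assumes that a k-mer seen unusually often lies in a repetitive region and should be extended to a longer seed, and that a k-mer seen too rarely is likely a sequencing error.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer across all reads.

    Assumption: this table stands in for an FM-index frequency query.
    """
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def adaptive_seeds(read, short_counts, long_counts, k_short, k_long,
                   min_support=2, repeat_threshold=4):
    """Pick seeds along a read, switching seed length by region type.

    Hypothetical thresholds: a short k-mer seen more than
    repeat_threshold times is treated as repetitive and replaced by the
    longer k-mer at the same position; a k-mer seen fewer than
    min_support times is treated as erroneous and skipped.
    """
    seeds = []
    i = 0
    while i + k_short <= len(read):
        kmer = read[i:i + k_short]
        freq = short_counts[kmer]
        # Repetitive region: try a longer, more specific seed if it fits.
        if freq > repeat_threshold and i + k_long <= len(read):
            kmer = read[i:i + k_long]
            freq = long_counts[kmer]
        if freq >= min_support:
            seeds.append((i, kmer))
            i += len(kmer)  # continue after the accepted seed
        else:
            i += 1  # likely error: slide one base and retry
    return seeds


if __name__ == "__main__":
    reads = ["ACGTACGTAA"] * 3
    short_counts = count_kmers(reads, 4)
    long_counts = count_kmers(reads, 8)
    print(adaptive_seeds(reads[0], short_counts, long_counts, 4, 8))
```

    In this toy input the 4-mer ACGT occurs six times, so the sketch treats the region as repetitive and falls back to the 8-mer seed. A real corrector would derive the thresholds from read coverage rather than fix them by hand.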
    Appears in Collections: [Department of Computer Science and Information Engineering] Theses and Dissertations

    All items in CCUR are protected by copyright, with all rights reserved.
