Parallel Implementation of String-Based Clustering for HT-SELEX Data

Kato, Shintaro and Ono, Takayoshi and Ito, Masaki and Ito, Koichi and Minagawa, Hirotaka and Horii, Katsunori and Shiratori, Ikuo and Waga, Iwao and Aoki, Takafumi (2021) Parallel Implementation of String-Based Clustering for HT-SELEX Data. EAI Endorsed Transactions on Bioengineering and Bioinformatics, 1 (1). e3. ISSN 2709-4111

Available under License Creative Commons Attribution No Derivatives.



INTRODUCTION: Clustering HT-SELEX data is crucial for selecting different types of aptamer candidates. We have developed FSBC, a clustering method for HT-SELEX data implemented in R. FSBC achieved higher sequence-clustering accuracy than conventional methods, but its processing time is longer than that of AptaCluster.

OBJECTIVES: This study aims to reduce the processing time of FSBC.

METHODS: We propose pFSBC, which reduces the processing time of the ORS estimation step in FSBC by parallelizing it.
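The abstract does not describe how the ORS estimation is parallelized. As a rough illustration only, the sketch below shows one common pattern for this kind of work: counting substrings of each length as an independent task and collecting the per-length tables. The function names (`count_substrings`, `count_by_length`), the toy reads, and the choice to split work by substring length are all assumptions made for this example; this is not the pFSBC algorithm, and pFSBC itself builds on an R implementation.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_substrings(sequences, k):
    """Count every length-k substring across all sequences (hypothetical helper)."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return k, counts


def count_by_length(sequences, lengths, max_workers=4):
    """Submit one independent counting task per substring length.

    Assumption: tasks for different lengths share no state, so they can be
    dispatched to a pool and their results collected in any order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(count_substrings, sequences, k) for k in lengths]
        return dict(f.result() for f in futures)


# Toy reads, invented for illustration; real HT-SELEX data would be far larger.
reads = ["ACGTACGT", "TTACGTAA", "GGGACGTC"]
tables = count_by_length(reads, lengths=[4, 5])
print(tables[4].most_common(1))
```

A thread pool keeps the sketch simple and portable. For CPU-bound counting in CPython, a process pool (`concurrent.futures.ProcessPoolExecutor`) would typically be needed to obtain a real speedup.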

RESULTS: Processing time and clustering accuracy were evaluated on the last-round HT-SELEX data from BioProject PRJNA315881 (NCBI SRA accession SRR3279661) and compared with those of conventional clustering methods. Among the methods compared, pFSBC exhibited the highest clustering accuracy and the shortest processing time.

CONCLUSION: We expect that pFSBC will shorten the time-consuming clustering task while providing accurate clustering results for HT-SELEX data.

Item Type: Article
Uncontrolled Keywords: sequence analysis, clustering, SELEX, next-generation sequencing, aptamer, parallel implementation
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Depositing User: EAI Editor II.
Date Deposited: 03 Mar 2021 08:57
Last Modified: 03 Mar 2021 08:57
