Please use this identifier to cite or link to this resource: doi:10.22028/D291-44443
Title: Improving in-silico normalization using read weights
Author(s): Durai, Dilip A.
Schulz, Marcel H.
Language: English
Journal title: Scientific reports
Volume: 9
Issue: 1
Publisher/Platform: Springer Nature
Year of publication: 2019
DDC subject group: 004 Computer science
Document type: Journal article
Abstract: Specialized de novo assemblers for diverse datatypes have been developed and are in widespread use for the analyses of single-cell genomics, metagenomics and RNA-seq data. However, assembly of large sequencing datasets produced by modern technologies is challenging and computationally intensive. In-silico read normalization has been suggested as a computational strategy to reduce redundancy in read datasets, which leads to significant speedups and memory savings of assembly pipelines. Previously, we presented a set multi-cover optimization-based approach, ORNA, where reads are reduced without losing important k-mer connectivity information, as used in assembly graphs. Here we propose extensions to ORNA, named ORNA-Q and ORNA-K, which consider a weighted set multi-cover optimization formulation for the in-silico read normalization problem. These novel formulations make use of the base quality scores obtained from sequencers (ORNA-Q) or k-mer abundances of reads (ORNA-K) to improve normalization further. We devise efficient heuristic algorithms for solving both formulations. In applications to human RNA-seq data, ORNA-Q and ORNA-K are shown to assemble more or equally many full-length transcripts compared to other normalization methods at similar or higher read reduction values. The algorithm is implemented under the latest version of ORNA (v2.0, https://github.com/SchulzLab/ORNA).
DOI of original publication: 10.1038/s41598-019-41502-9
URL of original publication: https://www.nature.com/articles/s41598-019-41502-9
Link to this record: urn:nbn:de:bsz:291--ds-444433
hdl:20.500.11880/39712
http://dx.doi.org/10.22028/D291-44443
ISSN: 2045-2322
Date of entry: 24-Feb-2025
Faculty: MI - Faculty of Mathematics and Computer Science
Department: MI - Computer Science
Professorship: MI - No professorship assigned
Collection: SciDok - Der Wissenschaftsserver der Universität des Saarlandes

Files in this record:
File: s41598-019-41502-9.pdf
Size: 1.11 MB
Format: Adobe PDF


This resource was published under the following copyright terms: Creative Commons license