Please use this identifier to cite or link to this item: doi:10.22028/D291-26565
Title: Phylogenetics from paralogs
Author(s): Hellmuth, Marc
Wieseke, Nicolas
Lechner, Markus
Lenhof, Hans-Peter
Middendorf, Martin
Stadler, Peter F.
Language: English
Year of Publication: 2014
SWD key words: Bioinformatik
Free key words: phylogeny
rooted triples
integer linear program
DDC notations: 004 Computer science, internet
Publication type: Journal Article
Abstract: Motivation: Sequence-based phylogenetic approaches heavily rely on initial data sets to be composed of orthologous sequences only. Paralogs are treated as a dangerous nuisance that has to be detected and removed. Recent advances in mathematical phylogenetics, however, have indicated that gene duplications can also convey meaningful phylogenetic information provided orthologs and paralogs can be distinguished with a degree of certainty. Results: We demonstrate that plausible phylogenetic trees can be inferred from paralogy information only. To this end, tree-free estimates of orthology, the complement of paralogy, are first corrected to conform to cographs and then translated into equivalent event-labeled gene phylogenies. A certain subset of the triples displayed by these trees translates into constraints on the species trees. While the resolution is very poor for individual gene families, we observe that genome-wide data sets are sufficient to generate fully resolved phylogenetic trees of several groups of eubacteria. The novel method introduced here relies on solving three intertwined NP-hard optimization problems: the cograph editing problem, the maximum consistent triple set problem, and the least resolved tree problem. Implemented as an integer linear program, paralogy-based phylogenies can be computed exactly for up to some twenty species and their complete protein complements. Availability: The ILP formulation is implemented in the software ParaPhylo using IBM ILOG CPLEX (TM) Optimizer 12.6 and is freely available from
Link to this record: urn:nbn:de:bsz:291-scidok-57969
Date of registration: 27-May-2014
Faculty: MI - Fakultät für Mathematik und Informatik
Department: MI - Informatik
Collections: SciDok - Der Wissenschaftsserver der Universität des Saarlandes
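
Illustrative note on the abstract above: the first step it describes corrects orthology estimates so that they form a cograph, that is, a graph containing no induced path on four vertices (P4). The following is a minimal sketch of what that property means, written as a brute-force check in Python. It is not the ILP formulation used in ParaPhylo, it only tests the property rather than editing the graph, and the function name and the gene labels are hypothetical.

```python
from itertools import combinations, permutations


def is_cograph(vertices, edges):
    """Return True if the undirected graph has no induced P4.

    Brute force over all 4-vertex subsets, so only suitable for
    small illustrative inputs.
    """
    # Build an adjacency map from the undirected edge list.
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    for quad in combinations(vertices, 4):
        for a, b, c, d in permutations(quad):
            # Induced path a-b-c-d: edges ab, bc, cd present,
            # and ac, ad, bd absent.
            is_path = b in adj[a] and c in adj[b] and d in adj[c]
            no_chords = (c not in adj[a] and d not in adj[a]
                         and d not in adj[b])
            if is_path and no_chords:
                return False
    return True


# Hypothetical orthology estimates between four genes.
genes = ["g1", "g2", "g3", "g4"]
path_like = [("g1", "g2"), ("g2", "g3"), ("g3", "g4")]
star_like = [("g1", "g2"), ("g1", "g3"), ("g1", "g4")]

print(is_cograph(genes, path_like))  # an induced P4, so not a cograph
print(is_cograph(genes, star_like))  # no induced P4, so a cograph
```

An estimated orthology relation that fails this check would have to be edited (edges added or removed) before it can be translated into an event-labeled gene tree; finding the smallest such edit is the cograph editing problem named in the abstract.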

Files for this record:
File | Description | Size | Format
phylogenetics_from_paralogs_workprint.pdf | | 597.55 kB | Adobe PDF

Items in SciDok are protected by copyright, with all rights reserved, unless otherwise indicated.