Please use this identifier to cite or link to this resource:
doi:10.22028/D291-44695
Title: | Identifying optimal substrate classes of membrane transporters |
Author(s): | Denger, Andreas; Helms, Volkhard |
Language: | English |
Journal title: | PLOS ONE |
Volume: | 19 |
Issue: | 12 |
Publisher/Platform: | PLOS |
Year of publication: | 2024 |
DDC subject group: | 500 Natural sciences |
Document type: | Journal article |
Abstract: | Membrane transporters are responsible for moving a wide variety of molecules across biological membranes, making them integral to key biological pathways in all organisms. Identifying all membrane transporters within a (meta-)proteome, along with their specific substrates, provides important information for various research fields, including biotechnology, pharmacology, and metabolomics. Protein datasets are frequently annotated with thousands of molecular functions that form complex networks, often with partial or full redundancy and hierarchical relationships. This complexity, along with the low sample count for more specific functions, makes them unsuitable as classes for supervised learning methods, meaning that the creation of an optimal subset of annotations is required. However, selection of this subset requires extensive manual effort, along with knowledge about the biology behind the respective functions. Here, we present an automated pipeline to address this problem. Unlike previous approaches for reducing redundancy in GO datasets, we employ machine learning to identify a subset of functional annotations in a training dataset. Classes in the resulting predictive model meet four essential criteria: sufficient sample size for training predictive models, minimal redundancy, strong class separability, and relevance to substrate transport. Furthermore, we implemented a pipeline for creating training datasets of transmembrane transporters that cover a wide range of organisms, including plants, bacteria, mammals, and single-cell eukaryotes. For a dataset containing 98.1% of transporters from S. cerevisiae, the pipeline automatically reduced the number of functional annotations from 287 to 11 GO terms that could be classified with a median pairwise F1 score of 0.87±0.16. For a meta-organism dataset containing 96% of all transport proteins from S. cerevisiae, A. thaliana, E. coli and human, the number of classes was reduced from 695 to 49, with a median F1 score of 0.92±0.10 between pairs of GO terms. When lowering the percentage of covered proteins down to 67%, the pipeline found a subset of 30 GO terms with a median F1 score of 0.95±0.06. |
DOI of first publication: | 10.1371/journal.pone.0315330 |
URL of first publication: | https://doi.org/10.1371/journal.pone.0315330 |
Link to this record: | urn:nbn:de:bsz:291--ds-446958 hdl:20.500.11880/39811 http://dx.doi.org/10.22028/D291-44695 |
ISSN: | 1932-6203 |
Date of entry: | 18-Mar-2025 |
Description of the related object: | Supporting information |
Related object: | https://doi.org/10.1371/journal.pone.0315330.s001 https://doi.org/10.1371/journal.pone.0315330.s002 https://doi.org/10.1371/journal.pone.0315330.s003 https://doi.org/10.1371/journal.pone.0315330.s004 https://doi.org/10.1371/journal.pone.0315330.s005 https://doi.org/10.1371/journal.pone.0315330.s006 https://doi.org/10.1371/journal.pone.0315330.s007 https://doi.org/10.1371/journal.pone.0315330.s008 https://doi.org/10.1371/journal.pone.0315330.s009 https://doi.org/10.1371/journal.pone.0315330.s010 https://doi.org/10.1371/journal.pone.0315330.s011 https://doi.org/10.1371/journal.pone.0315330.s012 https://doi.org/10.1371/journal.pone.0315330.s013 https://doi.org/10.1371/journal.pone.0315330.s014 |
Faculty: | NT - Naturwissenschaftlich-Technische Fakultät |
Department: | NT - Biowissenschaften |
Professorship: | NT - Prof. Dr. Volkhard Helms |
Collection: | SciDok - Der Wissenschaftsserver der Universität des Saarlandes |
Files in this record:
File | Description | Size | Format | |
---|---|---|---|---|
journal.pone.0315330.pdf | | 2.06 MB | Adobe PDF | View/Open |
This resource was published under the following copyright terms: Creative Commons license