Applications, challenges and new perspectives on the analysis of transcriptional regulation using epigenomic and transcriptomic data

Schmidt, Florian

Please use this identifier to cite or link to this item: doi:10.22028/D291-28777

Title:	Applications, challenges and new perspectives on the analysis of transcriptional regulation using epigenomic and transcriptomic data
Author(s):	Schmidt, Florian
Language:	English
Year of Publication:	2019
DDC notations:	570 Life sciences, biology; 004 Computer science, internet
Publication type:	Dissertation
Abstract:	The integrative analysis of epigenomics and transcriptomics data is an active research field in bioinformatics. New methods are required to process and interpret large omics datasets, such as those generated within consortia like the International Human Epigenome Consortium (IHEC). In this thesis, we present several approaches illustrating how combined epigenomics and transcriptomics datasets, e.g. for differential or time series analysis, can be used to derive new biological insights into transcriptional regulation. We focus on regulatory proteins called transcription factors (TFs), which are essential for orchestrating cellular processes. In our novel approaches, we combine epigenomics data, such as DNaseI-seq, predicted TF binding scores and gene-expression measurements in interpretable machine learning models. In joint work with our collaborators within and outside IHEC, we have shown that our methods lead to biologically meaningful results, which could be validated with wet-lab experiments. Aside from providing the community with new tools to perform integrative analysis of epigenomics and transcriptomics data, we have studied the characteristics of chromatin accessibility data and its relation to gene expression in detail, to better understand the implications of both computational processing and different experimental methods for data interpretation. Overall, we provide easy-to-use tools that enable researchers to benefit from the era of Biological Data Science. In this dissertation, we present several approaches for using the most common "omics" data, such as differential datasets or time series, to gain new insights into gene regulation at the transcriptional level. In doing so, we focused in particular on transcription factors, proteins that are essential for controlling regulatory processes in the cell.
In our new methods, we combine epigenetic data, for example DNaseI-seq or ATAC-seq data, predicted transcription factor binding sites and gene-expression data in interpretable machine learning models. Together with our collaboration partners, we have shown that our methods lead to biologically meaningful results, which could be validated in the laboratory in exemplary cases. Furthermore, we have studied in detail the relationships between chromatin structure and gene expression. This is of immense importance for avoiding the influence of experimental characteristics, as well as of data modeling, on the biological interpretation.
Link to this record:	urn:nbn:de:bsz:291--ds-287773 hdl:20.500.11880/27902 http://dx.doi.org/10.22028/D291-28777
Advisor:	Schulz, Marcel Holger
Date of oral examination:	26-Aug-2019
Date of registration:	26-Sep-2019
Faculty:	MI - Fakultät für Mathematik und Informatik
Department:	MI - Informatik
Collections:	SciDok - Der Wissenschaftsserver der Universität des Saarlandes

Files for this record:

File	Description	Size	Format
Thesis_FS.pdf	PDF file of the dissertation	16.3 MB	Adobe PDF
