Information-Theoretic Causal Discovery

Marx, Alexander

Please use this identifier to cite or link to this item: doi:10.22028/D291-34290

Title:	Information-Theoretic Causal Discovery
Author(s):	Marx, Alexander
Language:	English
Year of Publication:	2021
DDC notations:	500 Science 600 Technology 620 Engineering and machine engineering
Publikation type:	Dissertation
Abstract:	It is well-known that correlation does not equal causation, but how can we infer causal relations from data? Causal discovery tries to answer precisely this question by rigorously analyzing under which assumptions it is feasible to infer causal networks from passively collected, so-called observational data. Particularly, causal discovery aims to infer a directed graph among a set of observed random variables under assumptions which are as realistic as possible. A key assumption in causal discovery is faithfulness. That is, we assume that separations in the true graph imply independencies in the distribution and vice versa. If faithfulness holds and we have access to a perfect independence oracle, traditional causal discovery approaches can infer the Markov equivalence class of the true causal graph---i.e., infer the correct undirected network and even some of the edge directions. In a real-world setting, faithfulness may be violated, however, and neither do we have access to such an independence oracle. Beyond that, we are interested in inferring the complete DAG structure and not just the Markov equivalence class. To circumvent or at least alleviate these limitations, we take an information-theoretic approach. In the first part of this thesis, we consider violations of faithfulness that can be induced by exclusive or relations or cancelling paths, and develop a weaker faithfulness assumption, called 2-adjacency faithfulness, to detect some of these mechanisms. Further, we analyze under which conditions it is possible to infer the correct DAG structure even if such violations occur. In the second part, we focus on independence testing via conditional mutual information (CMI). CMI is an information-theoretic measure of dependence based on Shannon entropy. We first suggest estimating CMI for discrete variables via normalized maximum likelihood instead of the plug-in maximum likelihood estimator that tends to overestimate dependencies. On top of that, we show that CMI can be consistently estimated for discrete-continuous mixture random variables by simply discretizing the continuous parts of each variable. Last, we consider the problem of distinguishing the two Markov equivalent graphs X to Y and Y to X, which is a necessary step towards discovering all edge directions. To solve this problem, it is inevitable to make assumptions about the generating mechanism. We build upon the idea which states that the cause is algorithmically independent of its mechanism. We propose two methods to approximate this postulate via the Minimum Description Length (MDL) principle: one for univariate numeric data and one for multivariate mixed-type data. Finally, we combine insights from our MDL-based approach and regression-based methods with strong guarantees and show we can identify cause and effect via L0-regularized regression.
Link to this record:	urn:nbn:de:bsz:291--ds-342908 hdl:20.500.11880/31480 http://dx.doi.org/10.22028/D291-34290
Advisor:	Vreeken, Jilles
Date of oral examination:	29-Jun-2021
Date of registration:	6-Jul-2021
Faculty:	MI - Fakultät für Mathematik und Informatik
Department:	MI - Informatik
Professorship:	MI - Keiner Professur zugeordnet
Collections:	SciDok - Der Wissenschaftsserver der Universität des Saarlandes

Files for this record:

File	Description	Size	Format
dissertation-marx.pdf		2,33 MB	Adobe PDF	View/Open

Export: BibTex

This item is licensed under a Creative Commons License