Causal Discovery with General Nonlinear Data
Generalized Score Functions for Causal Discovery
Existing score-based methods usually require strong assumptions on the functional forms of causal mechanisms and on the data distributions. In this work, we introduce generalized score functions for causal discovery based on the characterization of general (conditional) independence relationships between random variables. The resulting causal discovery approach produces asymptotically correct results in rather general settings, including nonlinear causal mechanisms, a wide class of data distributions, mixed continuous and discrete data, and multidimensional variables. [paper][code][poster]
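A minimal sketch of the idea behind the generalized score (not the released implementation): score a candidate parent set by regressing the target on it in an RKHS and evaluating the held-out Gaussian log-likelihood of the residuals. Here scikit-learn's KernelRidge stands in for the RKHS regression, and the function name, RBF kernel parameters, and all constants are illustrative assumptions; the fixed kernel parameters are exactly the limitation addressed by the kernel-selection work below.

```python
# Sketch: cross-validated local score for a candidate parent set,
# assuming an RBF kernel, Gaussian residuals, and scikit-learn's
# KernelRidge as the RKHS regression. Illustrative only.
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import cross_val_predict

def local_cv_score(y, parents, alpha=1e-2, gamma=1.0, folds=5):
    """Held-out Gaussian log-likelihood of y given a candidate parent set."""
    if parents.shape[1] == 0:                      # empty parent set
        resid = y - y.mean()
    else:
        model = KernelRidge(alpha=alpha, kernel='rbf', gamma=gamma)
        resid = y - cross_val_predict(model, parents, y, cv=folds)
    sigma2 = resid.var() + 1e-12                   # residual noise variance
    return -0.5 * len(y) * (np.log(2 * np.pi * sigma2) + 1.0)

rng = np.random.default_rng(0)
x = rng.normal(size=(500, 1))
y = np.tanh(2 * x[:, 0]) + 0.3 * rng.normal(size=500)   # nonlinear mechanism x -> y

# The score should prefer the correct parent set {x} over the empty set.
print(local_cv_score(y, x), local_cv_score(y, np.empty((500, 0))))
```

In a score-based search such as GES, a local score of this kind is evaluated for the parent sets proposed at each edge addition or removal, and the graph with the highest total score is kept.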
Optimal Kernel Choice for Score Function-based Causal Discovery
The above work proposed a generalized score function that handles general data distributions and causal relationships by modeling the relations in a reproducing kernel Hilbert space (RKHS). The choice of kernel within this score function is crucial for accurately characterizing causal relationships and ensuring precise causal discovery. However, the current method relies on manual, heuristic selection of kernel parameters, which is tedious and unlikely to be optimal. In this paper, we propose a kernel selection method within the generalized score function that automatically chooses the kernel that best fits the data. Specifically, we model the generative process of the variables involved in each step of the causal graph search procedure as a mixture of independent noise variables. Based on this model, we derive an automatic kernel selection method that maximizes the marginal likelihood of the variables involved in each search step. [pdf]
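As a rough illustration of the principle (not the paper's procedure, which performs the selection inside each step of the score-based graph search): kernel hyperparameters can be chosen by maximizing the marginal likelihood of the regression target, which scikit-learn's Gaussian-process regressor does out of the box for an RBF kernel with a noise term. All names and constants below are illustrative.

```python
# Sketch: selecting the RBF length scale and noise level by maximizing
# the log marginal likelihood, using a GP regressor as a stand-in for
# the RKHS model used in the generalized score.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(x[:, 0]) + 0.2 * rng.normal(size=300)

# Start from an arbitrary length scale; fitting maximizes the log marginal
# likelihood over the RBF length scale and the noise level.
kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(x, y)

print("selected kernel:", gp.kernel_)
print("log marginal likelihood:", gp.log_marginal_likelihood_value_)
```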
Causal Discovery with Mixed Linear and Nonlinear Additive Noise Models: A Scalable Approach
Real-world scenarios often involve causal mechanisms that mix linear and nonlinear characteristics, a setting that has received limited attention in the existing literature. Because such mixed models are not fully identifiable, existing algorithms that rely on full identifiability may produce erroneous results. Traditional methods such as the PC algorithm can be applied to such graphs, but they typically recover only a Markov equivalence class. This paper introduces a causal discovery approach that goes beyond the Markov equivalence class, aiming to orient as many edges as possible when the causal graph is not fully identifiable. Our approach exploits the second derivative of the log-likelihood of the observational data, harnessing scalable machine learning methods to approximate the score function. Overall, our approach achieves accuracy comparable to current state-of-the-art techniques while offering a significant improvement in computational speed. [pdf]
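A rough sketch of the underlying principle, under assumptions that are mine rather than the paper's (additive Gaussian noise, a kernel density estimate of the joint, and finite differences in place of the scalable score approximation): for additive Gaussian-noise models, the second derivative of log p(x) with respect to a leaf variable is constant, so the variable whose estimated second derivative has the smallest variance across samples can be taken as a leaf, and repeating this yields a causal order. The helper leaf_by_hessian_diag and all constants are hypothetical, and finite-difference estimates of a KDE log-density are noisy.

```python
# Sketch: use the diagonal second derivative of the estimated log-density
# to pick a leaf variable. Illustrative only; not the paper's estimator.
import numpy as np
from scipy.stats import gaussian_kde

def leaf_by_hessian_diag(data, eps=0.1):
    """Return the column whose d^2 log p / dx_j^2 has the smallest variance."""
    kde = gaussian_kde(data.T)                      # KDE of the joint density
    logp = lambda pts: np.log(kde(pts.T) + 1e-300)
    base = logp(data)
    d = data.shape[1]
    variances = []
    for j in range(d):
        step = np.zeros(d)
        step[j] = eps
        # central finite difference of log p along coordinate j at each sample
        second = (logp(data + step) - 2.0 * base + logp(data - step)) / eps**2
        variances.append(second.var())
    return int(np.argmin(variances)), variances

rng = np.random.default_rng(0)
x = rng.normal(size=3000)
y = np.sin(x) + 0.5 * x + 0.5 * rng.normal(size=3000)   # mixed linear/nonlinear: x -> y
leaf, variances = leaf_by_hessian_diag(np.column_stack([x, y]))
print(variances)              # the entry for y (index 1) should be the smaller one
print("estimated leaf:", leaf)
```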