Research

Research overview:

Finding causal relationships from observational data is a fundamental problem across science, engineering, and machine learning. I have been working on automated causal discovery (i.e., causal network learning) in complex environments with theoretical guarantees. In particular, my collaborators and I are exploring causal discovery in the presence of distribution shifts, selection bias, hidden confounders, and nonlinear causal mechanisms. We aim to develop a unified framework for causal discovery that is reliable and generally applicable in real-world scenarios.

Besides methodological developments in the field of causal discovery, I am also actively exploring its practical use. We have applied causal discovery algorithms to neuroscience, biology, healthcare, finance, and archeology. For example, we have developed methods to identify atypical brain information flow in autistic individuals, which can then be used to distinguish them from typical controls.

In addition, we have been working to answer the question: how does causal understanding contribute to artificial general intelligence? The causal view provides a clear picture for understanding advanced learning problems and makes it possible to go beyond the data in a principled, interpretable manner. More specifically, I have been studying how causal understanding helps solve fundamental machine learning problems, including classification, clustering, forecasting in nonstationary environments, reinforcement learning, transfer learning, and representation learning, especially when distribution shifts or hidden variables lead to spurious associations.

Research highlights:

Causal Discovery from Nonstationary/Heterogeneous Data

Biwei Huang, Kun Zhang, Jiji Zhang, Joseph Ramsey, Bernhard Schölkopf, Clark Glymour

It is commonplace to encounter heterogeneous or nonstationary data, whose underlying generating process changes across domains or over time. Such distribution shifts present both challenges and opportunities for causal discovery. In this work, we develop a framework for causal discovery from such data that can efficiently locate variables with changing causal mechanisms, reliably recover the causal structure, and extract a low-dimensional representation of changes to visualize how the causal mechanism changes over time. Moreover, by making use of the independent change property between causal modules, with invariance as a special case, we make it explicit and precise how distribution shifts benefit causal discovery. [paper] [code]
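As a toy illustration of the invariance special case mentioned above (a sketch under simplifying assumptions, not the method from the paper), consider a linear-Gaussian pair X -> Y in which only the distribution of X changes across domains. The regression of Y on X then stays stable across domains, while the regression of X on Y does not. All names and parameter values below (`simulate_domain`, the coefficient 2.0, the per-domain scales) are hypothetical choices made for this example.

```python
import random


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs (with an intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def simulate_domain(rng, sigma_x, n=5000):
    """X -> Y. Only the mechanism of X (its scale) differs across domains;
    the mechanism of Y given X is the same everywhere (assumed for this toy)."""
    xs = [rng.gauss(0.0, sigma_x) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


def slope_ranges(seed=0, sigmas=(0.5, 1.0, 2.0, 3.0)):
    """Spread of per-domain slopes in the causal and anti-causal directions."""
    rng = random.Random(seed)
    forward, backward = [], []
    for sigma in sigmas:
        xs, ys = simulate_domain(rng, sigma)
        forward.append(ols_slope(xs, ys))   # Y regressed on X
        backward.append(ols_slope(ys, xs))  # X regressed on Y
    return max(forward) - min(forward), max(backward) - min(backward)


forward_range, backward_range = slope_ranges()
print(f"spread of slope, Y on X: {forward_range:.3f}")
print(f"spread of slope, X on Y: {backward_range:.3f}")
```

With these settings the forward slopes should agree closely across domains while the backward slopes spread out. A single linear-Gaussian domain would not reveal the direction, so the asymmetry here comes entirely from the distribution shift.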

AdaRL: What, Where, and How to Adapt in Transfer Reinforcement Learning

Biwei Huang, Fan Feng, Chaochao Lu, Sara Magliacane, Kun Zhang

This work addresses how to generalize learned policies to new environments efficiently and effectively. Distribution shifts are usually due to changes in only a few mechanisms of the generative process, so it suffices to adapt the part of the distribution that corresponds to the generating process of a small portion of the variables. To this end, we leverage a graphical representation that characterizes structural relationships over variables in the RL system. Such a representation provides a compact way to encode where the changes are and identifies a minimal set of changes that one needs to consider for policy adaptation. As a result, we can adapt the policy to the target domain efficiently, with only a few samples and without further policy optimization. [paper]

Generalized Independent Noise Condition for Estimating Latent Variable Causal Graphs

Feng Xie, Ruichu Cai, Biwei Huang, Clark Glymour, Zhifeng Hao, Kun Zhang

Most existing methods focus on causal relations between observed variables. In many scenarios, however, the observed variables (e.g., image pixels) may not be the underlying causal variables, but are generated by latent causal variables or confounders that are themselves causally related. In this paper, we consider Linear, Non-Gaussian Latent variable Models, in which latent confounders are also causally related, and propose a Generalized Independent Noise condition to locate latent variables and identify their causal structure, even when every pair of observed variables has multiple latent parents. [pdf]
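To give a feel for the condition, here is a minimal numerical sketch. It assumes the condition is stated as follows: for observed vectors Y and Z, form the surrogate w^T Y with a nonzero w satisfying w^T E[Y Z^T] = 0, and say that (Z, Y) satisfies the condition when this surrogate is independent of Z. The one-latent model, the uniform noise, and the squared-value correlation used as a crude stand-in for a proper independence test are all assumptions of this example, not part of the paper.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


def corr(a, b):
    return cov(a, b) / (cov(a, a) * cov(b, b)) ** 0.5


def surrogate(y1, y2, z):
    """w^T Y for Y = (y1, y2), with w chosen so that w^T Cov(Y, z) = 0."""
    w1, w2 = cov(y2, z), -cov(y1, z)
    return [w1 * a + w2 * b for a, b in zip(y1, y2)]


def dependence(u, v):
    """Crude dependence proxy: correlation between squared, centered values.
    It can be nonzero even when u and v are linearly uncorrelated."""
    mu, mv = mean(u), mean(v)
    return corr([(a - mu) ** 2 for a in u], [(b - mv) ** 2 for b in v])


rng = random.Random(0)
n = 20000


def noise():
    return rng.uniform(-1.0, 1.0)  # non-Gaussian noise


# One latent variable with three observed children (hypothetical toy model).
latent = [noise() for _ in range(n)]
x1 = [l + noise() for l in latent]
x2 = [l + noise() for l in latent]
x3 = [l + noise() for l in latent]

# Z = {x3}, Y = {x1, x2}: the latent cancels in the surrogate.
gin_holds = dependence(surrogate(x1, x2, x3), x3)
# Z = {x1}, Y = {x1, x2}: the surrogate still shares noise terms with Z.
gin_fails = dependence(surrogate(x1, x2, x1), x1)

print(f"dependence proxy, Z={{x3}}, Y={{x1,x2}}: {gin_holds:+.3f}")
print(f"dependence proxy, Z={{x1}}, Y={{x1,x2}}: {gin_fails:+.3f}")
```

In the first case the surrogate reduces to a combination of the noise terms of x1 and x2, so it carries no information about x3 and the proxy should be close to zero. In the second case the surrogate is uncorrelated with x1 by construction but still depends on it, which the proxy picks up because the noise is non-Gaussian.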

Action-Sufficient State Representation Learning for Control with Structural Constraints

Biwei Huang*, Chaochao Lu*, Liu Leqi, José Miguel Hernández-Lobato, Clark Glymour, Bernhard Schölkopf, Kun Zhang

Perceived signals in real-world scenarios are usually high-dimensional and noisy. Finding and using a representation of them that contains the essential and sufficient information required by downstream decision-making tasks helps improve computational efficiency and generalization in those tasks. In this work, we focus on extracting a minimal set of state representations that captures the information sufficient for decision-making, so that the policy function relies only on a set of low-dimensional state representations, improving both model and sample efficiency. [pdf]

Causal Discovery and Forecasting in Nonstationary Environments with State-Space Models

Biwei Huang, Kun Zhang, Mingming Gong, Clark Glymour

Given nonstationary processes where the causal relations may change over time, how can we discover the time-varying causal relationships and, at the same time, predict future values of variables of interest in a principled way? In this work, we show that causal discovery and forecasting for nonstationary processes can be put under the same umbrella. In particular, by exploiting a particular type of state-space model that represents the processes, we find that nonstationarity actually helps to identify causal structure and that forecasting naturally benefits from the learned causal representation. Moreover, given the causal model, we can directly treat forecasting as a problem in Bayesian inference that exploits the time-varying property of the data and adapts to new observations in a principled manner. [paper] [code] [poster]
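The general idea of treating a time-varying causal coefficient as a hidden state can be sketched with a textbook scalar Kalman filter. This is a generic stand-in, not the state-space model or estimation procedure from the paper: it assumes a single known cause x_t, an effect y_t = b_t * x_t + noise, and a random-walk prior on b_t. The function name `track_coefficient` and the values of `q`, `r`, and the drift pattern are hypothetical.

```python
import math
import random


def track_coefficient(xs, ys, q=0.01, r=0.01, b0=0.0, p0=1.0):
    """Scalar Kalman filter for y_t = b_t * x_t + v_t, with b_t a random walk.

    q is the assumed variance of the random-walk step, r the assumed
    observation-noise variance. Returns filtered coefficient estimates and
    one-step-ahead forecasts of y_t.
    """
    b, p = b0, p0
    estimates, forecasts = [], []
    for x, y in zip(xs, ys):
        p += q                          # predict: the coefficient may have drifted
        forecasts.append(b * x)         # forecast y_t before observing it
        gain = p * x / (x * x * p + r)  # Kalman gain for this observation
        b += gain * (y - b * x)         # correct using the forecast error
        p *= 1.0 - gain * x             # posterior variance of the coefficient
        estimates.append(b)
    return estimates, forecasts


rng = random.Random(0)
T = 500
# Smoothly drifting causal coefficient (toy ground truth).
true_b = [1.0 + 0.8 * math.sin(2 * math.pi * t / 250) for t in range(T)]
xs = [rng.gauss(0.0, 1.0) for _ in range(T)]
ys = [b * x + rng.gauss(0.0, 0.1) for b, x in zip(true_b, xs)]

estimates, forecasts = track_coefficient(xs, ys)

# Baseline that ignores nonstationarity: one fixed coefficient for all t.
static_b = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

filter_mae = sum(abs(e - b) for e, b in zip(estimates, true_b)) / T
static_mae = sum(abs(static_b - b) for b in true_b) / T
print(f"mean abs. error, filtered coefficient: {filter_mae:.3f}")
print(f"mean abs. error, static coefficient:   {static_mae:.3f}")
```

Each update is a Bayesian posterior update on the coefficient given the new observation, so the same recursion both tracks the changing causal influence and produces the forecast.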

Specific and Shared Causal Relation Modeling and Mechanism-Based Clustering

Biwei Huang, Kun Zhang, Pengtao Xie, Mingming Gong, Eric Xing, Clark Glymour

In this work, we develop a unified framework for causal discovery and mechanism-based group identification. In particular, we propose a specific and shared causal model (SSCM), which takes into account the variability of causal relations across individuals/groups and leverages their commonalities to achieve statistically reliable estimation. The learned SSCM gives the specific causal knowledge for each individual as well as the general trend over the population. Moreover, the estimated model directly provides the group information of each individual. [paper] [code] [poster]

Generalized Score Functions for Causal Discovery

Biwei Huang, Kun Zhang, Yizhu Lin, Bernhard Schölkopf, Clark Glymour

Current score-based methods usually need strong assumptions on functional forms of causal mechanisms, as well as on data distributions. In this work, we introduce generalized score functions for causal discovery based on the characterization of general (conditional) independence relationships between random variables. The resulting causal discovery approach produces asymptotically correct results in rather general cases, which may have nonlinear causal mechanisms, a wide class of data distributions, mixed continuous and discrete data, and multidimensional variables. [paper] [code] [poster]

Multi-Domain Causal Structure Learning in Linear Systems

AmirEmad Ghassami, Negar Kiyavash, Biwei Huang, Kun Zhang

In this work, we study the problem of causal structure learning in linear systems from observational data given in multiple domains, across which the causal coefficients and/or the distribution of the exogenous noises may vary. The main tool used in our approach is the principle that in a causally sufficient system, the causal modules, as well as their included parameters, change independently across domains. [paper]

Identification of Time-Dependent Causal Model: A Gaussian Process Treatment

Biwei Huang, Kun Zhang, Bernhard Schölkopf

In this work, we present a novel approach to modeling time-dependent causal influences. We show that by introducing time information as a common cause for the observed processes, we can model the time-varying causal influences between the observed processes, as well as the influence from a certain type of unobserved confounders. We propose a principled estimation procedure that extends Gaussian Process regression and automatically learns how the causal model changes over time. [paper] [code] [poster]

On the Identifiability and Estimation of Functional Causal Models in the Presence of Outcome-Dependent Selection

Kun Zhang, Jiji Zhang, Biwei Huang, Bernhard Schölkopf, Clark Glymour

In this work, we study the identifiability and estimation of functional causal models under selection bias, with a focus on the situation where the selection depends solely on the effect variable. We address two questions of identifiability: the identifiability of the causal direction between two variables in the presence of selection bias, and, given the causal direction, the identifiability of the model with outcome-dependent selection. We also propose two methods for estimating an additive noise model from data that are generated with outcome-dependent selection. [paper]
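To see why outcome-dependent selection matters, the following toy simulation (an illustration of the problem only, not of the estimation methods proposed in the paper) generates a linear additive noise model X -> Y and then keeps only the samples whose effect value exceeds a threshold. The coefficient 1.0, the Gaussian noise, and the threshold 0 are hypothetical choices for this example.

```python
import random


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs (with an intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


rng = random.Random(0)
n = 20000

# Additive noise model X -> Y with true coefficient 1.0.
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
ys = [x + rng.gauss(0.0, 1.0) for x in xs]

# Outcome-dependent selection: a sample is observed only if Y > 0.
selected = [(x, y) for x, y in zip(xs, ys) if y > 0.0]
sel_x = [x for x, _ in selected]
sel_y = [y for _, y in selected]

full_slope = ols_slope(xs, ys)
selected_slope = ols_slope(sel_x, sel_y)
print(f"slope of Y on X, full population: {full_slope:.3f}")
print(f"slope of Y on X, selected sample: {selected_slope:.3f}")
```

On the full population the regression recovers the true coefficient, whereas on the selected sample it is clearly attenuated, because within the selected data the noise is no longer independent of the cause. Naively fitting an additive noise model to selected data therefore gives a biased mechanism.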