Invariant prediction and causal inference

Jonas Peters, ETH Zürich

Why are we interested in the causal structure of a data-generating process? In a classical regression problem, for example, we include a variable in the model if it improves the prediction; it seems that no causal knowledge is required. In many situations, however, we are interested in the system's behavior under a change of environment. Here, causal models become important because they are usually considered invariant under those changes. A causal prediction (which uses only direct causes of the target variable as predictors) remains valid even if we intervene on predictor variables or change the whole experimental setting. In this talk, we propose to exploit invariant prediction for causal inference: given data from different experimental settings, we use invariant models to estimate the set of causal predictors. We provide valid confidence intervals and examine sufficient assumptions under which the true set of causal predictors becomes identifiable. The empirical properties are studied for various data sets, including gene perturbation experiments. This talk does not require any prior knowledge about causal concepts.
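
To make the idea of "using invariant models to estimate the set of causal predictors" concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the procedure presented in the talk: the function names (ols_residuals, invariance_p_value, invariant_predictors), the simulated data, and the choice of test are all hypothetical. For every subset of predictors it fits one pooled linear regression, checks crudely whether the residual means agree across experimental settings, and then intersects all subsets that pass. A serious implementation would also compare residual variances (or whole distributions) and use exact tests rather than a normal approximation.

```python
import random
from itertools import combinations
from statistics import NormalDist, fmean, variance


def ols_residuals(rows, y, subset):
    """Residuals of a least-squares fit of y on an intercept plus columns in `subset`."""
    design = [[1.0] + [r[j] for j in subset] for r in rows]
    p = len(design[0])
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination.
    a = [[sum(d[i] * d[k] for d in design) for k in range(p)]
         + [sum(d[i] * t for d, t in zip(design, y))] for i in range(p)]
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(a[r][col]))  # partial pivoting
        a[col], a[piv] = a[piv], a[col]
        for r in range(p):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [u - f * v for u, v in zip(a[r], a[col])]
    beta = [a[i][p] / a[i][i] for i in range(p)]
    return [t - sum(b * v for b, v in zip(beta, d)) for d, t in zip(design, y)]


def invariance_p_value(residuals, env):
    """Crude invariance check (an assumption of this sketch): compare the residual
    mean of each environment with the rest using a large-sample z-test, then
    apply a Bonferroni correction over environments."""
    labels = sorted(set(env))
    p_values = []
    for e in labels:
        inside = [r for r, g in zip(residuals, env) if g == e]
        outside = [r for r, g in zip(residuals, env) if g != e]
        se = (variance(inside) / len(inside) + variance(outside) / len(outside)) ** 0.5
        z = (fmean(inside) - fmean(outside)) / se
        p_values.append(2.0 * (1.0 - NormalDist().cdf(abs(z))))
    return min(1.0, len(labels) * min(p_values))


def invariant_predictors(rows, y, env, alpha=0.05):
    """Intersect all predictor subsets whose pooled model is not rejected as invariant."""
    n_features = len(rows[0])
    accepted = []
    for k in range(n_features + 1):
        for subset in combinations(range(n_features), k):
            if invariance_p_value(ols_residuals(rows, y, subset), env) > alpha:
                accepted.append(set(subset))
    # If every subset is rejected, make no causal claim at all.
    return set.intersection(*accepted) if accepted else set()


# Simulated example (hypothetical): x1 causes the target, x2 is an effect of it.
# The second experimental setting shifts x1; the target's mechanism is unchanged.
random.seed(0)
rows, y, env = [], [], []
for e, shift in enumerate([0.0, 3.0]):
    for _ in range(1000):
        x1 = random.gauss(shift, 1.0)
        target = 2.0 * x1 + random.gauss(0.0, 1.0)
        x2 = target + random.gauss(0.0, 1.0)
        rows.append([x1, x2])
        y.append(target)
        env.append(e)

estimate = invariant_predictors(rows, y, env)
print("estimated causal predictors (column indices):", estimate)
```

In this simulation, models that omit x1 should be rejected because their residuals shift between the two settings, while models containing x1 remain invariant, so the intersection is expected to single out column 0. Taking the intersection is what makes the estimate conservative: whenever the true set of causes passes the test, the estimate cannot contain a non-cause.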