250219 When causal inference meets deep learning Nat. Mach. Intell. (2020)
spnz3 2025. 2. 19. 18:18
https://www.nature.com/articles/s42256-020-0218-x
Bayesian networks can capture causal relations, but learning such a network from data is NP-hard. Recent work has made it possible to approximate this problem as a continuous optimization task that can be solved efficiently with well-established numerical techniques.
Limitations of existing work, and the recent studies that overcome them
... One of the most popular classes of causal models, known as Bayesian networks (BNs)1, encodes the conditional independencies between variables using directed acyclic graphs (DAGs). BN models have been useful for addressing important challenges in several areas of machine learning, such as interpretable learning2 and fairness-aware learning3.
Exact inference of the DAG structure of a BN is computationally intractable, due to the combinatorial explosion of the search space. The most widely used algorithms approximate a solution using constraint-4 or score-based5 heuristics. A constraint-based method uses statistical tests of conditional independence, optionally combined with background knowledge, to infer the direction of causality between variables, while a score-based method directs its search through the space of all possible DAGs using a score function as a heuristic.
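To make the score-based strategy concrete, here is a minimal sketch of greedy hill-climbing over DAGs with a BIC score for linear-Gaussian data. It is our own illustration, not code from any of the cited papers: the function names (`bic_node`, `is_dag`, `hill_climb`) are hypothetical, and for brevity it only considers edge additions, whereas real implementations also score deletions and reversals.

```python
# Minimal score-based structure search: greedy hill-climbing with a BIC score
# for linear-Gaussian Bayesian networks. Illustrative sketch only.
import itertools
import numpy as np

def bic_node(X, j, parents):
    """BIC contribution of node j given its parent set (linear-Gaussian)."""
    n = X.shape[0]
    if parents:
        P = X[:, parents]
        beta, *_ = np.linalg.lstsq(P, X[:, j], rcond=None)
        resid = X[:, j] - P @ beta
    else:
        resid = X[:, j] - X[:, j].mean()
    sigma2 = max(resid.var(), 1e-12)
    return -0.5 * n * np.log(sigma2) - 0.5 * (len(parents) + 1) * np.log(n)

def is_dag(adj):
    """Acyclicity check: repeatedly peel off sink nodes; a cycle leaves none."""
    alive = list(range(adj.shape[0]))
    while alive:
        sinks = [v for v in alive if not adj[v, alive].any()]
        if not sinks:
            return False
        alive = [v for v in alive if v not in sinks]
    return True

def hill_climb(X, max_steps=100):
    """Greedily add the single edge with the best BIC gain until none helps."""
    d = X.shape[1]
    adj = np.zeros((d, d), dtype=bool)          # adj[i, j]: edge i -> j
    parents = {j: [] for j in range(d)}
    for _ in range(max_steps):
        best = None
        for i, j in itertools.permutations(range(d), 2):
            if adj[i, j]:
                continue
            adj[i, j] = True                    # tentatively add i -> j
            if is_dag(adj):
                gain = (bic_node(X, j, parents[j] + [i])
                        - bic_node(X, j, parents[j]))
                if gain > 0 and (best is None or gain > best[2]):
                    best = (i, j, gain)
            adj[i, j] = False
        if best is None:                        # no edge improves the score
            break
        i, j, _ = best
        adj[i, j] = True
        parents[j].append(i)
    return adj
```

Even this toy version scans O(d²) candidate edges per step and runs an acyclicity check for each, which hints at why purely combinatorial searches struggle to scale.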
While these methods can avoid exhaustively enumerating the search space, they remain computationally costly because of their combinatorial nature and inefficient search strategies. Building on a recent framework that established an elegant mathematical connection between the classical combinatorial optimization of searching the space of possible DAG solutions and the continuous optimization familiar from machine learning6, Lachapelle et al.7 describe how the statistical problem can be turned into a purely neural-network learning problem, while Zheng et al.8 transform it into a general, non-parametric problem that requires no modelling assumptions.
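The continuous reformulation rests on a smooth algebraic characterization of acyclicity from the framework of ref. 6: a weighted adjacency matrix W has no directed cycles exactly when h(W) = tr(e^{W∘W}) − d = 0, so h can be driven to zero by ordinary gradient methods. Below is a minimal sketch; the fixed-penalty loop is our own simplification (the original work uses an augmented-Lagrangian scheme), and the toy data and constants are invented for illustration.

```python
# Continuous DAG constraint: h(W) = tr(exp(W ∘ W)) - d is zero iff the
# weighted adjacency matrix W contains no directed cycles, so acyclicity
# becomes a smooth penalty inside a standard numerical optimizer.
import numpy as np
from scipy.linalg import expm

def acyclicity(W):
    """h(W) = tr(e^{W∘W}) - d; zero exactly when W encodes a DAG."""
    return np.trace(expm(W * W)) - W.shape[0]

def acyclicity_grad(W):
    """Gradient of h: (e^{W∘W})^T ∘ 2W, usable in gradient-based solvers."""
    return expm(W * W).T * 2.0 * W

# Toy linear SEM with ground truth edge 0 -> 2; fit W by penalized least squares.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
X[:, 2] += 2.0 * X[:, 0]
n, lam, rho, lr = X.shape[0], 0.02, 10.0, 0.01   # illustrative constants
W = np.zeros((3, 3))
for _ in range(3000):
    grad = -X.T @ (X - X @ W) / n          # least-squares fit term
    grad += lam * np.sign(W)               # L1 sparsity (subgradient)
    grad += rho * acyclicity_grad(W)       # push h(W) toward 0
    W -= lr * grad
    np.fill_diagonal(W, 0.0)               # no self-loops
print(np.round(W, 2), acyclicity(W))       # expect a large W[0, 2], h(W) ≈ 0
```

Because h and its gradient are ordinary matrix functions, the whole search runs on standard numerical machinery instead of enumerating graphs, which is the key point of the quoted passage.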
...<the article gives a detailed explanation of these methods; omitted here because I could not follow it>...

Significance of these studies
The hardness of causal inference mainly stems from the intractable combinatorial search space. With the recently developed methods, new types of global solutions are now available that scale to problem sizes combinatorial algorithms cannot handle. For example, it has recently been shown that these new methods can be scaled to infer causal transcriptome networks with more than 10,000 genes11, which combinatorial algorithms cannot do in a reasonable timeframe.

In addition, prior knowledge can be expected to further enhance the utility of this paradigm for causal inference. For instance, it is easy to incorporate domain knowledge such as ‘event X never leads to event Y’ or ‘gene A enhances the expression of gene B in most tissue types’.

From a deep learning perspective, it has frequently been observed that it is hard to extract any explicit structure of the data from deep neural networks to facilitate interpretation. Such models, although accurate, provide no meaningful insight into how their decisions are made. Recent studies12,13 indicated that a certain degree of interpretability can be achieved by properly designing the architecture of the neural network based on domain knowledge. Lachapelle et al.7 and Zheng et al.8 directly infer part of the neural network architecture from data to encode the causality among random variables, which provides model interpretation from another perspective14.

Currently, most neural architecture search (NAS) algorithms optimize the neuron connections with respect to prediction accuracy. The models discussed here provide an unprecedented opportunity to develop a new generation of NAS frameworks that achieve both model accuracy and transparency simultaneously. That is, part of the architecture could be optimized as a regularizer, as in normal NAS frameworks, to improve model accuracy, while another part could be optimized with respect to various types of interpretation, such as causality or feature modularity.
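On the ‘event X never leads to event Y’ point: one simple way such knowledge could be injected into a continuous structure learner is a fixed edge mask applied to the weight matrix after every update. This is our own hypothetical sketch of the idea, not a mechanism described in the paper.

```python
# Hypothetical sketch: hard prior knowledge as an edge mask on the weighted
# adjacency matrix. Masked entries stay exactly zero throughout optimization,
# so the learner can never propose a forbidden edge.
import numpy as np

d = 3
mask = np.ones((d, d))
mask[0, 1] = 0.0             # domain rule: variable 0 never causes variable 1
np.fill_diagonal(mask, 0.0)  # and no self-loops

def project(W, mask=mask):
    """Apply after each gradient step: W <- W ∘ mask."""
    return W * mask

# Inside the training loop from the earlier sketch one would write:
#     W = project(W - lr * grad)
# Softer knowledge ('gene A enhances gene B in most tissues') could instead be
# encoded as an asymmetric L1 weight or a sign constraint on the W[A, B] entry.
```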
