Abstract:
Bayesian networks are often used to represent causal or data generating mechanisms.
There has been significant progress in the development of algorithms for learning the
structure of a Bayesian network from complete data and optional background knowledge.
However, the problem of learning the DAG part of a Bayesian network with latent
(unmeasured) variables is more difficult, since the number of possible models is
potentially infinite.
Latent variable (LV) models also have a number of other features that make search harder: LV models
may be overparametrized, e.g. containing redundant edges or nodes, and as a consequence the parameters
may be underidentified, leading to multimodal or flat likelihood surfaces. Further, as described by Geiger
and Meek (1998), LV models are stratified exponential families rather than curved exponential families
(as DAG models without LVs are), and consequently the results that guarantee the asymptotic consistency of
scores such as BIC do not apply.
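As a minimal sketch of the underidentification point (not from the talk; it assumes a one-factor Gaussian model with loadings λ and error variances ψ, and uses NumPy), two distinct parameter values can imply the same covariance matrix, and hence the same Gaussian likelihood, so the likelihood surface has multiple modes:

```python
import numpy as np

def implied_cov(lam, psi):
    # Implied covariance of a one-factor Gaussian model: Sigma = lam lam^T + diag(psi)
    lam = np.asarray(lam, dtype=float).reshape(-1, 1)
    return lam @ lam.T + np.diag(np.asarray(psi, dtype=float))

lam = np.array([0.8, 0.6])
psi = np.array([0.5, 0.7])

S1 = implied_cov(lam, psi)
S2 = implied_cov(-lam, psi)  # sign-flipped loadings: a distinct parameter value

# Same implied covariance => same Gaussian likelihood at both parameter values
assert np.allclose(S1, S2)
```

The sign flip is the simplest case; richer LV models can have continuous ridges of equivalent parameters, giving the flat likelihood regions mentioned above.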
This presents a dilemma: on the one hand, attempting to search for causal structure without allowing for
the possibility of latent or missing variables is substantively unreasonable in many contexts; on the other,
explicitly including latent variables makes the search space intractable and introduces models whose
features make model selection difficult. To address this, Spirtes and Richardson (1998) introduced a class
of graphical Gaussian models, called MAG models, which do not include latent variables but impose exactly
the independence constraints implied by latent variable models, and only these constraints. In contrast to
latent variable models, Gaussian MAG models are efficiently parametrized, always statistically
identifiable, and have a well-defined dimension (they form curved exponential families).