We develop upper and lower bounds for the probability of Boolean functions by treating multiple occurrences of variables as independent and assigning them new individual probabilities. We call this approach dissociation and give an exact characterization of optimal oblivious bounds, i.e. when the new probabilities are chosen independent of the probabilities of all other variables. Our motivation comes from the weighted model counting problem (or, equivalently, the problem of computing the probability of a Boolean function), which is #P-hard in general. By performing several dissociations, one can transform a Boolean formula whose probability is difficult to compute, into one whose probability is easy to compute, and which is guaranteed to provide an upper or lower bound on the probability of the original formula by choosing appropriate probabilities for the dissociated variables. Our new bounds shed light on the connection between previous relaxation-based and model-based approximations and unify them as concrete choices in a larger design space.

Dissociation


We also show how our theory allows a standard relational database management system (DBMS) to both upper and lower bound hard probabilistic queries in guaranteed polynomial time. We call this approach query dissociation and change from the commonly-used possible-world semantics to a new semantics that we call propagation. Conceptually, a dissociated query is obtained from an unsafe query by adding extraneous variables to some atoms until the query becomes safe. We show that the scores for chain queries before and after dissociation correspond to two well-known scoring functions on graphs, namely network reliability (which is #P-hard), and propagation (which is related to PageRank and in PTIME). We then define a unique propagation score for an answer to a self-join-free conjunctive query and prove that it is always an upper bound to the score given by the possible world semantics. We further show that the propagation score can always be evaluated with an instance-independent query plan, that the propagation score is identical to the possible world semantics for all safe queries, and that the concept of propagation is a natural generalization of plans for safe queries to plans for unsafe queries.

Query Dissociation


Publications

People & Affiliations

(Assistant Professor @ CMU)
(Professor @ UW)
Carnegie Mellon University, Tepper School of Business University of Washington, Database group

Funding

This project has been generously supported by the NSF via grants IIS-0915054 (the BeliefDB project) and IIS-1553547. Any opinions, findings, and conclusions or recommendations expressed in this project are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

National Science Foundation