Database Reference
In-Depth Information
Graphical Models
Probabilistic Databases
Probabilistic
Complex
Simple
model
(correlations given by a graph)
(disjoint-independent tuples)
Query
Simple
Complex
(e.g., P(X 1 X 3 | X 2 X 5 X 7 ) )
(e.g.,
x. y. z.R(x,y) S(x,z) )
Network
Static
Dynamic
(Bayesian or Markov Network)
(database
+
query)
Complexity
measured in
Network
Database
size of
Complexity
Tree-width
Query
parameter
System
Stand-alone
Extension to Relational DBMS
Figure 1.4: Comparison between Graphical Models and Probabilistic Databases.
1.2.7 PROBABILISTIC DATABASES V.S. GRAPHICAL MODELS
A graphical model (GM) is a concise way to represent a joint probability distribution over a large
set of random variables X 1 ,X 2 ,...,X n . The “graph” has one node for each random variable X i and
an edge (X i ,X j ) between all pairs of variables that are correlated in the probability space obtained
by fixing the values of all the other variables 5 . GMs have been extensively studied in knowledge
representation and machine learning since they offer concise ways to represent complex probability
distributions.
Any probabilistic database is a particular type of a GM, where each random variable is associ-
ated to a tuple (or to an attribute value, depending on whether we model tuple-level or attribute-level
uncertainty). Query answers can also be represented as a GM, by creating new random variables
corresponding to the tuples of all intermediate results, including one variable for every answer to
the query. Thus, GMs can be used both to represent probabilistic databases that have non-trivial
correlations between their tuples and to compute the probabilities of all query answers. However,
there are some significant distinctions between the assumptions made in GMs and in probabilistic
databases, which are summarized in Figure 1.4 , and are discussed next.
First, the probabilistic model in probabilistic databases is simple and usually (but not always)
consists of a collection of independent, or disjoint-independent tuples; we discuss in Chapter 2
how this simple model can be used as a building block for more complex probabilistic models.
In contrast, the probabilistic model in GMs is complex: they were designed explicitly to represent
5 This definition is sufficient for our brief discussion but is an oversimplification; we refer the reader to a standard textbook on
graphical models, e.g., [ Koller and Friedman , 2009 ].
 
Search WWH ::




Custom Search