Figure 2.1: Two completed survey forms (Form #351 and Form #352). Each form has the fields Social Security Number, Name, and Marital Status, the last with the options (1) single, (2) married, (3) divorced, (4) widowed.
For another example, consider the following datalog program, which computes all paths of
lengths 2 or 3 in a graph given by the binary relation R(x,y):
S(x,y) :− R(x,z),R(z,y)
Q(x,y) :− S(x,y)
Q(x,y) :− R(x,z),S(z,y)
Written as a relational formula it becomes:
Q = { (x,y) | ∃z.(R(x,z) ∧ R(z,y)) ∨ ∃z₁.∃z₂.(R(x,z₁) ∧ R(z₁,z₂) ∧ R(z₂,y)) }
or, still equivalently, as:
Q = { (x,y) | R(x,z),R(z,y) ∨ R(x,z₁),R(z₁,z₂),R(z₂,y) }
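To make the semantics concrete, here is a minimal Python sketch that evaluates the three rules over sets of pairs. The function name paths_2_or_3 and the example edge set are hypothetical illustrations, not taken from the text.

```python
def paths_2_or_3(R):
    """Return (S, Q) for a binary relation R given as a set of (x, y) pairs."""
    # S(x,y) :- R(x,z), R(z,y)   -- endpoints of paths of length 2
    S = {(x, y) for (x, z) in R for (z2, y) in R if z == z2}
    # Q(x,y) :- S(x,y)           -- paths of length 2
    # Q(x,y) :- R(x,z), S(z,y)   -- paths of length 3
    Q = set(S) | {(x, y) for (x, z) in R for (z2, y) in S if z == z2}
    return S, Q


# Hypothetical example graph: the chain 1 -> 2 -> 3 -> 4 -> 5.
R = {(1, 2), (2, 3), (3, 4), (4, 5)}
S, Q = paths_2_or_3(R)
print(sorted(S))
print(sorted(Q))
```

On this chain, S holds the pairs two steps apart and Q adds the pairs three steps apart, which matches the disjunction in the relational formula.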
2.2 THE PROBABILISTIC DATA MODEL
Consider a census scenario in which a large number of individuals manually fill in forms. The
data in these forms subsequently has to be put into a database, but no matter whether this is done
automatically using OCR or by hand, some uncertainty may remain about the correct values for some
of the answers. Figure 2.1 shows two simple filled-in forms. Each one contains the social security
number, name, and marital status of one person.
The first person, Smith, seems to have checked marital status “single” after first mistakenly
checking “married”, but it could also be the opposite. The second person, Brown, did not answer
the marital status question. The social security numbers also have several possible readings. Smith's
could be 185 or 785 (depending on whether Smith is originally from the US or from Europe), and
Brown's may either be 185 or 186. In total, we have 2 · 2 · 2 · 4 = 32 possible readings of the two
census forms, which can be obtained by choosing one possible reading for each of the fields.
In an SQL database, uncertainty can be managed using null values. Our census data could be
represented as in the following table.
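Before turning to that representation, the count of 32 possible readings can be checked by enumerating one choice per uncertain field. The following is a minimal Python sketch; the field names (smith_ssn and so on) are hypothetical labels introduced here, while the candidate values come from the discussion of Figure 2.1.

```python
from itertools import product

# Candidate readings for each uncertain field (field names are assumptions).
readings = {
    "smith_ssn": [185, 785],
    "smith_marital": ["single", "married"],
    "brown_ssn": [185, 186],
    # Brown left the question blank, so all four options remain possible.
    "brown_marital": ["single", "married", "divorced", "widowed"],
}

# One possible reading of the two forms picks one value for every field.
worlds = [dict(zip(readings, combo)) for combo in product(*readings.values())]
print(len(worlds))  # 2 * 2 * 2 * 4 = 32
```

Each dictionary in worlds is one complete, certain reading of both forms.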
 