Databases Reference
In-Depth Information
It is only necessary to create an index for attributes involved in search and
join predicates. Without loss of generality, one can assume that an index is
created over each attribute of the relation.
Partition Functions : To explain what is stored in attribute A i of R S for each
attribute A i of R the following notations are useful. The domain of values (
D i )
of attribute R.A i are first mapped into partitions
, such that these
partitions taken together cover the whole domain. The function partition is
defined as follows:
{
p 1 ,...,p k }
partition ( R.A i )=
{
p 1 ,p 2 ,...,p k }
As an example, consider the attribute eid of the emp table above. Suppose
the values of domain of this attribute lie in the range [0 , 1000]. Assume that
the whole range is divided into 5 partitions, represented as:
partition ( emp.eid )=
{
[0 , 200] , (200 , 400] , (400 , 600] , (600 , 800] , (800 , 1000]
}
Different attributes may be partitioned using different partition functions,
or they might be partitioned together using a multidimensional model. The
partition of attribute A i corresponds to a splitting of its domain into a set
of buckets. The strategy used to split the domain into a set of buckets has
profound implications on both the e ciency of the resulting query processing
as well as on the disclosure risk of sensitive information to the server. For
now, to explain the query processing strategy, we will make a simplifying
assumption that the bucketization of the domain is based on the equi-width 4
partitioning (though the strategy developed will work for any partitioning of
the domain). We will revisit the eciency and disclosure risks in the following
subsections.
Identification Functions : An identification function called ident assigns a ran-
dom, unique identifier ident R.A i ( p j ) to each partition p j of attribute A i . Fig-
ure 2 shows the identifiers assigned to the 5 partitions of the attribute emp.eid .
For instance, ident emp.eid ([0 , 200]) = 2, and ident emp.eid ((800 , 1000]) = 4.
Fig. 2. Partition and identification functions of emp.eid
Mapping Functions : Given the above partition and identification functions, a
mapping function Map R.A i maps a value v in the domain of attribute A i to
the identifier of the partition to which v belongs: Map R.A i ( v )= ident R.A i ( p j ),
where p j is the partition that contains v . Later we describe a more general
approach where a value might be assigned to multiple buckets [30] (probabilis-
tically). This can be shown to achieve a greater degree of security than the
4 where the domain of each bucket has the same width
 
Search WWH ::




Custom Search