Database Reference
In-Depth Information
n
n
i = 1 |
C x .sup i
s
(
U i |
)
i = 1
n
( C x .sup i
s
∗|
U i |
)
0
i = 1
Therefore, checking for support is equivalent to checking if i = 1 ( C x .sup i
s
|
. This is
accomplished by first computing the sum securely, and applying a secure comparison
at the end.
The first site generates a random number x r for each itemset C x , adds that number
to its ( C x .sup i
U i |
)
0. The challenge is to do this without revealing C x .sup i or
|
U i |
s
∗|
U i |
), and sends it to the next site. (All arithmetic is modm ,
where m
, for security purposes.) The random number masks the actual
excess support, so the second site learns nothing about the first site's actual database
size or support. The second site adds its excess support and sends the value on. The
random value now hides both support counts. The last site in the chain now has
2
∗|
U
|
i = 1 ( C x .sup i
x r (mod m ).
Since the total database size is
s
∗|
U i |
)
+
m/ 2, negative summation will be mapped
to some number that is bigger than (or equal to) m/ 2. (
|
U
|≤
k mod m .) The
last site needs to test if this sum minus x r modm is less than m/ 2. This can be done
securely using Yao's generic method [ 56 ]. Clearly this algorithm is secure as long
as there is no collusion, as no site can distinguish what it receives from a random
number. Alternatively, the first site can simply send x r to the last site. The last site
learns the actual excess support, but does not learn the support values for any single
site. In addition, if we consider the excess support to be a valid part of the global
result, this method is still secure.
The above basic protocol can be extended to provide privacy in the context
of collusions [ 26 ]. In addition, efficiency could be improved by using fast union
protocols [ 51 ].
k
=
m
4.2
Vertically Partitioned Data
The FDM algorithm that was described above can be modified to address the case of
vertically partitioned data. First of all, each site can compute all the locally frequent
itemsets that can be supported based on the items belonging to the local site. Later
on, to check whether an itemset is globally frequent, vertically partitioned data need
to be combined in order to compute the T U j = 1 I C x i j T , where I C x i j T is
the indicator function that represents whether transaction T
U contains itemset
C x i j or not.
In the context of two parties, the above equation becomes a simple dot product. For
example, suppose two parties wish to determine if an itemset C x has the minimum
support in the data set, but neither party has data on the entire set C x . Instead, they
Search WWH ::




Custom Search