Privacy Issues in Association Rule Mining - Frequent Pattern Mining

Database Reference

In-Depth Information

3 Support Count Exchange : LF i ( k ) are broadcast, and each site computes the local

support for the items in

∪ i LF i ( k ) .

4 Broadcast Mining Results : Each site broadcasts the local support for itemsets in

∪ i LF i ( k ) . From this, each site is able to compute F ( k ) .

Privacy-preserving Distributed ARM In the privacy-preserving version of the

FDM algorithm, we desire that information disclosure is limited. Specifically, no

site should be able to learn the contents of transactions belonging to any other site,

what rules are supported by any other site, or the specific value of support/confidence

for any rule at any other site, unless that information is revealed by knowledge of

one's own data and the final result (e.g., if a rule is supported globally but not at

one's own site, we can deduce that at least one other site supports the rule.) In this

basic version of the protocol, no collusion is assumed.

The method described in [ 26 ] follows the FDM algorithm given above, with

special protocols for replacing the broadcasts of LF i ( k ) and the support count of

items in LF ( k ) . In the FDM algorithm, step 3 reveals the frequent itemsets supported

by each site. To accomplish this without revealing what each site supports, we may

instead exchange locally frequent itemsets in a way that obscures the source of each

itemset. The main idea is that each site encrypts the locally supported itemsets, along

with enough “fake” itemsets to hide the actual number supported. This is achieved

by using secure union protocols (e.g., see [ 26 ]). Using different cryptographic tools

such as homomorphic encryption , such protocols can compute the union of sets

belonging to different parties, securely. For example, in our context, secure union

protocols could be used to compute

∪ i LF i ( k ) , without revealing which sites supports

which itemsets and how many sites support a given itemset.

In some cases, some secure union protocols (e.g., the one given in [ 26 ]) may

disclose extra information for efficiency purposes. For example, if we deem leakage

of the number of commonly supported itemsets as acceptable, it can be proven that

the secure union protocol described in [ 26 ] is secure under certain cryptographic

definitions. Such proofs usually show that everything else seen during the protocol

can be simulated based on the leaked information and the final set union. This tech-

nique can be quite powerful for generating reasonably secure and efficient protocols.

A protocol that is proved not to reveal anything other than the required result and

information deemed not privacy-threatening could be sufficient for many practical

purposes. This approach is used to prove that the set union protocol given in [ 26 ]

reveals only the union of locally frequent itemsets and a clearly bounded set of

innocuous information.

Secure set union protocols give the full set of locally frequent itemsets LF ( k ) .We

still, however, need to determine which of these itemsets are supported globally. Step

4 of the FDM algorithm forces each site to reveal its own support count for every

itemset in LF ( k ) . All we need to know for each itemset C x

∈

LF ( k ) ,isif C x .sup

≥

s %

. The following observation allows us to reduce this to a comparison against

a sum of local values (the excess support at each site):

×|

i = 1 |

C x .sup

≥

∗|

∗

(

U i |

)

Frequent Pattern Mining

Search WWH ::

Custom Search

Home