How to Prevent Private Data from being Disclosed to a Malicious Attacker - Data Mining: Foundations and Practice

Databases Reference

In-Depth Information

participant in the computation than can be inferred from that participant's

input and output. The SMC problem literature was introduced by Yao [31].

It has been proved that for any polynomial function, there is a secure multi-

party computation solution [12]. The approach used is as follows: the function

F to be computed is firstly represented as a combinatorial circuit, and then

the parties run a short protocol for every gate in the circuit. Every participant

gets corresponding shares of the input wires and the output wires for every

gate. This approach, though appealing in its generality and simplicity, is highly

impractical for large datasets.

2.2 Privacy-Preserving Data Mining

In early work on privacy-preserving data mining, Lindell and Pinkas [17] pro-

pose a solution to privacy-preserving classification problem using oblivious

transfer protocol, a powerful tool developed by secure multi-party computa-

tion (SMC) research. The techniques based on SMC for e ciently dealing with

large data sets have been addressed in [14], where a solution to the association

rule mining problem for the case of two parties was proposed.

Randomization approaches were firstly proposed by Agrawal and Srikant

in [3] to solve privacy-preserving data mining problem. In addition to pertur-

bation, aggregation of data values [26] provides another alternative to mask

the actual data values. In [1], authors studied the problem of computing the

k th-ranked element. Dwork and Nissim [9] showed how to learn certain types

of boolean functions from statistical databases in terms of a measure of prob-

ability difference with respect to probabilistic implication, where data are

perturbed with noise for the release of statistics. In this paper, we focus on

privacy-preserving among the inter-party computation.

Homomorphic encryption [21], which transforms multiplication of en-

crypted plaintexts into the encryption of the sum of the plaintexts, has re-

cently been used in secure multi-party computation. For instance, Freedmen,

Nissim and Pinkas [10] applied it for set intersection. For computing set inter-

section, unlike [2,10,27] proposed an approach based on commutative encryp-

tion. The work most related to ours is [30], where Wright and Yang applied

homomorphic encryption [21] to the Bayesian networks induction for the case

of two parties. The work that are closely related ours is [33], where Zhan et al.

present secure protocols for learning support vector machine over vertically

partitioned data. In this paper, we develop a secure protocol, based on homo-

morphic encryption and random perturbation techniques, for multiple parties

to build SVMs over horizontally partitioned data without compromising their

data privacy.

3 Learning SVMs on Private Data

Support vector machines were invented by Vapnik [29] in 1982. The idea con-

sists of mapping the space of input examples into a high-dimensional feature

space, so that the optimal separating hyperplane built on this space allow

Search WWH ::

Custom Search

Home