5 Privacy-Preservation of Application Results
In many cases, the output of applications can be used by an adversary in order
to make significant inferences about the behavior of the underlying data. In
this section, we will discuss a number of miscellaneous methods for privacy-
preserving data mining which tend to preserve the privacy of the end results of
applications such as association rule mining and query processing. This prob-
lem is related to that of disclosure control [1] in statistical databases, though
advances in data mining methods provide increasingly sophisticated meth-
ods for adversaries to make inferences about the behavior of the underlying
data. In cases where commercial data needs to be shared, the association
rules may represent sensitive information for target-marketing purposes,
which needs to be protected from inference.
In this section, we will discuss the issue of disclosure control for a num-
ber of applications such as association rule mining, classification, and query
processing. The key goal here is to prevent adversaries from making infer-
ences from the end results of data mining and management applications. A
broad discussion of the security and privacy implications of data mining is
presented in [30]. We will discuss each of the applications below:
5.1 Association Rule Hiding
Recent years have seen tremendous advances in the ability to perform as-
sociation rule mining effectively. Such rules often encode important target
marketing information about a business. Some of the earliest work on the
challenges of association rule mining for database security may be found in
[14]. Two broad approaches are used for association rule hiding:
Distortion: In distortion [89], the entry for a given transaction is modified
to a different value. Since we are typically dealing with binary transactional
data sets, the entry value is flipped.
Blocking: In blocking [96], the entry is not modified, but is left incom-
plete. Thus, unknown entry values are used to prevent discovery of asso-
ciation rules.
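The difference between the two approaches can be illustrated with a minimal sketch. It assumes a binary transaction matrix stored as a list of rows, with None standing in for an unknown (blocked) entry. The function names distort, block, and support_bounds are hypothetical and are not taken from [89] or [96].

```python
from typing import List, Optional, Sequence, Tuple

# Assumed encoding: 1 = item present, 0 = item absent, None = unknown (blocked).
Row = List[Optional[int]]


def distort(data: List[Row], row: int, col: int) -> List[Row]:
    """Return a copy of data with one known binary entry flipped."""
    out = [list(r) for r in data]
    out[row][col] = 1 - out[row][col]  # only valid for entries that are 0 or 1
    return out


def block(data: List[Row], row: int, col: int) -> List[Row]:
    """Return a copy of data with one entry replaced by an unknown value."""
    out = [list(r) for r in data]
    out[row][col] = None
    return out


def support_bounds(data: List[Row], items: Sequence[int]) -> Tuple[int, int]:
    """Return (minimum, maximum) support count of an itemset.

    The minimum treats every unknown as 0, the maximum treats it as 1.
    Without unknown entries, both bounds are equal.
    """
    lo = sum(all(r[i] == 1 for i in items) for r in data)
    hi = sum(all(r[i] in (1, None) for i in items) for r in data)
    return lo, hi


data = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
    [1, 1, 0],
]
itemset = [0, 1]  # columns 0 and 1 form the itemset to be hidden

distorted = distort(data, 0, 0)
blocked = block(data, 0, 0)

print(support_bounds(data, itemset))
print(support_bounds(distorted, itemset))
print(support_bounds(blocked, itemset))
```

In this sketch, distortion lowers the support count of the itemset outright, whereas blocking leaves only a range for it: an adversary can no longer tell whether the blocked entry contributed to the support.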
We note that both the distortion and blocking processes have a number of side
effects on the non-sensitive rules in the data. Some of the non-sensitive rules
may be lost along with sensitive rules, and new ghost rules may be created
because of the distortion or blocking process. Such side effects are undesirable
since they reduce the utility of the data for mining purposes.
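These side effects can be made concrete with a small sketch. It assumes rules with a single item on each side, a support threshold expressed as a count, and a toy data set constructed for illustration. The helper mine_rules and the thresholds are hypothetical, and the sanitized data is written out by hand rather than produced by any published hiding algorithm.

```python
from itertools import permutations


def mine_rules(transactions, min_support, min_conf):
    """Return the set of rules (x, y), read as x -> y, over single items.

    min_support is a transaction count; min_conf is a fraction in [0, 1].
    """
    items = sorted(set().union(*transactions))
    rules = set()
    for x, y in permutations(items, 2):
        supp_x = sum(x in t for t in transactions)
        supp_xy = sum(x in t and y in t for t in transactions)
        if supp_xy >= min_support and supp_xy / supp_x >= min_conf:
            rules.add((x, y))
    return rules


original = [{"a", "b"}, {"a", "b"}, {"a", "b", "c"}, {"a", "c"}, {"b", "c"}]

# Distortion: item "a" is removed (1 flipped to 0) in the first two transactions
# in order to hide the sensitive rule a -> b.
sanitized = [{"b"}, {"b"}, {"a", "b", "c"}, {"a", "c"}, {"b", "c"}]

sensitive = {("a", "b")}
before = mine_rules(original, min_support=2, min_conf=0.7)
after = mine_rules(sanitized, min_support=2, min_conf=0.7)

hidden = sensitive - after             # sensitive rules no longer discoverable
lost = (before - sensitive) - after    # non-sensitive rules lost as a side effect
ghost = after - before                 # rules that exist only in the sanitized data

print("hidden:", hidden)
print("lost:", lost)
print("ghost:", ghost)
```

Here the sensitive rule a -> b is hidden, but the non-sensitive rule b -> a is lost with it, and a -> c appears as a ghost rule because removing "a" from transactions that lack "c" raises the confidence of a -> c.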
A formal proof of the NP-hardness of the distortion method for hiding
association rules may be found in [14]. In [14], techniques are proposed
for changing some of the 1-values to 0-values so that the support of the corre-
sponding sensitive rules is appropriately lowered. The utility of the approach
was defined by the number of non-sensitive rules whose support was also low-
ered by using such an approach. This approach was extended in [31] in which