for rule-based classifier downgrading is discussed in [86], in which it has been
shown how to effectively hide classification rules for a data set.
5.3 Query Auditing and Inference Control
Many sensitive databases are not available for public access, but may have
a public interface through which aggregate querying is allowed. This leads to
the natural danger that a smart adversary may pose a sequence of queries
through which he or she may infer sensitive facts about the data. The nature
of this inference may correspond to full disclosure, in which an adversary may
determine the exact values of the data attributes. A second notion is that
of partial disclosure, in which the adversary may be able to narrow down the
values to a range, but may not be able to guess the exact value. Most work
on query auditing generally concentrates on the full disclosure setting.
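As a concrete illustration of how full disclosure can arise from innocuous-looking aggregates, consider the following toy sketch. The table, the names, and the sum_query interface are hypothetical and serve only to make the inference explicit.

```python
# Hypothetical sensitive table, exposed only through an aggregate interface.
salaries = {"Alice": 82000, "Bob": 64000, "Carol": 71000}

def sum_query(names):
    """Aggregate-only access: return the sum of salaries over a set of names."""
    return sum(salaries[n] for n in names)

# Two individually harmless sum queries ...
q1 = sum_query({"Alice", "Bob", "Carol"})   # 217000
q2 = sum_query({"Bob", "Carol"})            # 135000

# ... whose difference fully discloses Alice's exact salary.
print(q1 - q2)  # 82000
```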
Two broad approaches have been designed to reduce the likelihood of
sensitive data discovery:
Query Auditing: In query auditing, we deny one or more queries from
a sequence of queries. The queries to be denied are chosen such that the
sensitivity of the underlying data is preserved. Some examples of query
auditing methods include [34, 64, 84, 94].
Query Inference Control: In this case, we perturb the underlying data
or the query result itself. The perturbation is engineered in such a way as
to preserve the privacy of the underlying data. Examples of methods
which perturb the underlying data include [3, 23, 81]. Examples
of methods which perturb the query result include [19, 33, 39, 40, 41]. Both
strategies are sketched below.
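The sketch contrasts a toy auditor that denies overlapping queries with a toy perturbation of the query answer, over the same hypothetical salary table as above. The overlap threshold, the noise model, and the names SimpleAuditor and noisy_sum are illustrative assumptions rather than constructions taken from the cited works.

```python
import random

salaries = {"Alice": 82000, "Bob": 64000, "Carol": 71000}

# --- Query auditing: deny queries rather than answer them ---
class SimpleAuditor:
    """Toy online auditor: a sum query is answered only if it shares at most
    max_overlap elements with every previously answered query."""
    def __init__(self, data, max_overlap=1):
        self.data = data
        self.max_overlap = max_overlap
        self.answered = []

    def ask(self, names):
        names = set(names)
        if any(len(names & past) > self.max_overlap for past in self.answered):
            return None  # query denied
        self.answered.append(names)
        return sum(self.data[n] for n in names)

# --- Query inference control: answer every query, but perturb the result ---
def noisy_sum(data, names, scale=1000.0):
    """Toy output perturbation: add zero-mean noise (Gaussian here, purely
    for illustration) to the true sum before releasing it."""
    true_sum = sum(data[n] for n in names)
    return true_sum + random.gauss(0.0, scale)

auditor = SimpleAuditor(salaries, max_overlap=1)
print(auditor.ask({"Alice", "Bob", "Carol"}))  # answered: 217000
print(auditor.ask({"Bob", "Carol"}))           # denied: None (overlap of 2)
print(noisy_sum(salaries, {"Bob", "Carol"}))   # answered, but only approximately
```

Denial preserves exact answers at the cost of availability, whereas perturbation keeps every query answerable at the cost of accuracy.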
An overview of classical methods for query auditing may be found in [1].
The query auditing problem has an online version, in which we do not know
the sequence of queries in advance, and an offline version, in which we do
know this sequence in advance. Clearly, the offline version is open to better
optimization from an auditing point of view.
The problem of query auditing was first studied in [34, 94]. This approach
works for the online version of the query auditing problem. In these works,
the sum query is studied, and privacy is protected by using restrictions on
sizes and pairwise overlaps of the allowable queries. Let us assume that the
query size is restricted to be at most k, and the number of common elements
in pairwise query sets is at most m. Then, if q is the number of elements that
the attacker already knows from background knowledge, it was shown in
[34, 94] that the maximum number of queries allowed is (2·k − (q + 1))/m.
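To make the bound concrete, the small sketch below simply evaluates this expression; the helper name and the numbers are illustrative choices, not values taken from [34, 94].

```python
def max_allowed_queries(k, m, q):
    """Bound from the size/overlap restriction: with query size at most k,
    pairwise overlap at most m, and q values already known to the attacker,
    at most (2*k - (q + 1)) / m sum queries can be answered."""
    return (2 * k - (q + 1)) / m

# Queries of size up to 100, overlapping in at most 5 elements, against an
# attacker who already knows 9 values: (200 - 10) / 5 = 38 queries.
print(max_allowed_queries(k=100, m=5, q=9))   # 38.0

# Permitting larger overlaps shrinks the budget sharply: (200 - 10) / 50 = 3.8.
print(max_allowed_queries(k=100, m=50, q=9))  # 3.8
```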
We note that if N is the total number of data elements, the above expression
is always bounded above by 2·N. If for some constant c, we choose k = N/c
and m = 1, the approach can only support a constant number of queries,
after which all queries would have to be denied by the auditor. Clearly, this
is undesirable from an application point of view. Therefore, a considerable