amount of research has been devoted to increasing the number of queries
which can be answered by the auditor without compromising privacy.
In [63], the problem of sum auditing on sub-cubes of the data cube is
studied, where a query expression is constructed as a string over 0, 1, and
*. The elements to be summed are those that match the
query string pattern. In [67], the problem of auditing a database of boolean
values is studied for the case of sum and max queries. In [18], an approach to
query auditing is discussed which combines the approach
of denying some queries with that of modifying queries in order to achieve privacy.
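A minimal sketch of the 0/1/* sub-cube sum query mechanism of [63] described above (the flat dictionary representation and function names here are illustrative assumptions, not the paper's implementation): each cell of a binary cube is addressed by a bit string, and a pattern over 0, 1, and * selects the cells to be summed, with * matching either bit.

```python
def matches(pattern: str, coord: str) -> bool:
    """True if the cell coordinate string matches the 0/1/* query pattern."""
    return all(p == "*" or p == c for p, c in zip(pattern, coord))

def sum_query(cube: dict[str, float], pattern: str) -> float:
    """Sum the values of all cells whose coordinates match the pattern."""
    return sum(v for coord, v in cube.items() if matches(pattern, coord))

# A tiny 3-dimensional binary cube: coordinate string -> cell value.
cube = {"000": 1.0, "001": 2.0, "010": 3.0, "011": 4.0,
        "100": 5.0, "101": 6.0, "110": 7.0, "111": 8.0}

print(sum_query(cube, "0**"))  # cells 000, 001, 010, 011 -> 10.0
print(sum_query(cube, "*1*"))  # cells 010, 011, 110, 111 -> 22.0
```

The auditor's task in this setting is then to decide, for a sequence of such pattern queries, whether the answered sums together determine any individual cell value.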
In [64], the authors show that query denials which depend upon the answer
to the current query can themselves leak information. The authors introduce the notion
of simulatable auditing for auditing sum and max queries. In [84], the authors
devise methods for auditing max queries and bags of max and min queries
under the partial and full disclosure settings. The authors also examine the
notion of utility in the context of auditing, and obtain results for sum queries
in the full disclosure setting.
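A toy illustration of the leakage problem that motivates simulatable auditing (the denial policy and names below are hypothetical, not the actual scheme of [64]): an auditor that decides denials from the true answer leaks through the denial itself, whereas a simulatable auditor's decision can be recomputed by the attacker from the queries alone and therefore reveals nothing about the data.

```python
def answer_dependent_auditor(db: list[int], indices: list[int]):
    """Hypothetical auditor over a boolean database: deny a sum query
    iff its true answer would pin down every queried entry exactly."""
    total = sum(db[i] for i in indices)
    if total == 0 or total == len(indices):  # all 0s or all 1s revealed
        return "DENIED"
    return total

db = [1, 1, 0]  # secret boolean entries
print(answer_dependent_auditor(db, [0, 1]))  # -> "DENIED"
# The denial itself leaks: the attacker now knows db[0] == db[1],
# even though no numeric answer was returned.

def simulatable_auditor(indices: list[int]):
    """Simulatable variant: decide using only the query, by checking the
    unsafe condition over every database consistent with past answers.
    For a boolean sum query with no history, the all-zeros database is
    always consistent and unsafe, so the query is denied regardless of
    the actual data -- and the denial carries no information."""
    return "DENIED"
```

This degenerate behavior of the deterministic simulatable variant is exactly why randomized denial decisions become attractive in that line of work.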
A number of techniques have also been proposed for the offline version of
the auditing problem. In [26], a number of variations of the offline auditing
problem have been studied. In the offline auditing problem, we are given a
sequence of queries which have been truthfully answered, and we need to deter-
mine if privacy has been breached. In [26], effective algorithms were proposed
for the sum, max, and max and min versions of the problem. On the other
hand, the combined sum and max version of the problem was shown to be NP-hard.
In [4], an offline auditing framework was proposed for determining whether
a database adheres to its disclosure properties. The key idea is to create an
audit expression which specifies sensitive table entries.
A number of techniques have also been proposed for sanitizing or random-
izing the data for query auditing purposes. These are fairly general models
of privacy, since they preserve the privacy of the data even when the en-
tire database is released. The standard methods of perturbation [2, 5] or
k-anonymity [98] can always be used, and it is guaranteed that an
adversary cannot derive anything more from the query answers than from
the released data itself. Thus, since a k-anonymity model guarantees a certain level of
privacy even when the entire database is made available, it will continue to
do so under any sequence of queries. In [23], a number of interesting methods
are discussed for measuring the effectiveness of sanitization schemes in terms
of balancing privacy and utility.
Instead of sanitizing the base data, it is possible to use summary constructs
on the data, and respond to queries using only the information encoded in
the summary constructs. Such an approach preserves privacy, as long as the
summary constructs do not reveal sensitive information about the underly-
ing records. A histogram based approach to data sanitization has been dis-
cussed in [23, 24]. In this technique the data is recursively partitioned into
multi-dimensional cells. The final output is the exact description of the cuts
along with the population of each cell. Clearly, this kind of description can