2.2 Related Work
Although the need for security and privacy in data warehouses and OLAP sys-
tems has long been identified [5, 27, 28], today's commercial OLAP products
usually provide insufficient security measures [27]. In contrast, access control
is mature in relational databases. In relational databases, accesses to sensitive
data are regulated based on various models. The discretionary access control
(DAC) uses owner-specified grants and revokes to achieve an owner-centric
control of objects [16]. The role-based access control (RBAC) simplifies ac-
cess control tasks by introducing an intermediate tier of roles that aggregates
and bridges users and permissions [25]. The flexible access control framework
(FAF) provides a universal solution to handling conflicts in access control poli-
cies through authorization derivation and conflict resolution logic rules [20].
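The role tier described above can be illustrated with a minimal sketch. It is a toy model under stated assumptions, not the formulation of [25]: the class `RBACPolicy`, its method names, and the sample user, role, and object names are all hypothetical.

```python
# Toy illustration of the RBAC idea: users reach permissions only through roles.
# All names here (RBACPolicy, "analyst", "sales_cube", ...) are hypothetical.
from dataclasses import dataclass, field


@dataclass
class RBACPolicy:
    role_permissions: dict = field(default_factory=dict)  # role -> {(operation, object)}
    user_roles: dict = field(default_factory=dict)         # user -> {role}

    def grant(self, role, operation, obj):
        """Attach a permission to a role."""
        self.role_permissions.setdefault(role, set()).add((operation, obj))

    def assign(self, user, role):
        """Make a user a member of a role."""
        self.user_roles.setdefault(user, set()).add(role)

    def is_allowed(self, user, operation, obj):
        """A request is allowed if any of the user's roles holds the permission."""
        return any(
            (operation, obj) in self.role_permissions.get(role, set())
            for role in self.user_roles.get(user, set())
        )


policy = RBACPolicy()
policy.grant("analyst", "read", "sales_cube")
policy.assign("alice", "analyst")
```

Because permissions attach to roles rather than to users, changing what analysts may do requires one `grant` call instead of one update per user.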
Inference control has been studied in statistical databases and census data
for more than thirty years [1, 12, 35]. The proposed methods can roughly be
classified into restriction-based techniques and perturbation-based techniques.
Restriction-based inference control methods prevent malicious inferences by
denying unsafe queries. Those methods determine the safety of queries based
on the minimal number of values aggregated by a query [12], the maximal num-
ber of common values aggregated by different queries [13], and the maximal
rank of a matrix representing answered queries [8]. The perturbation-based
techniques prevent inference by adding random noise to sensitive data [30],
to query answers [4], or to database structures [26].
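As a rough illustration of the restriction-based approach, the sketch below applies the first two criteria above: a lower bound on the number of values a query aggregates and an upper bound on the values it shares with any previously answered query. The matrix-rank criterion is not shown. The class `RestrictionAuditor`, its parameters, and the thresholds are assumptions made for this example and are not taken from [12] or [13].

```python
# Sketch of restriction-based inference control for SUM queries.
# Names and thresholds are hypothetical, chosen only for illustration.
class RestrictionAuditor:
    def __init__(self, min_size, max_overlap):
        self.min_size = min_size        # fewest values a query may aggregate
        self.max_overlap = max_overlap  # most values two answered queries may share
        self.answered = []              # query sets that were already answered

    def is_safe(self, query_set):
        query_set = frozenset(query_set)
        if len(query_set) < self.min_size:
            return False
        return all(len(query_set & old) <= self.max_overlap for old in self.answered)

    def answer(self, values, query_set):
        """Return the SUM over query_set, or None if the query is denied."""
        if not self.is_safe(query_set):
            return None
        self.answered.append(frozenset(query_set))
        return sum(values[i] for i in query_set)


values = [10, 20, 30, 40, 50]
auditor = RestrictionAuditor(min_size=3, max_overlap=1)
first = auditor.answer(values, {0, 1, 2})   # answered
second = auditor.answer(values, {0, 1, 3})  # denied: shares two values with the first
third = auditor.answer(values, {0})         # denied: aggregates too few values
```

Without the overlap check, subtracting the first answer from the second would reveal the difference between two individual values, which is the kind of inference these methods are meant to block.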
Cell suppression and partitioning most closely relate to the methods we
shall introduce. To protect census data released in statistical tables, cells that
contain small COUNT values are suppressed, and possible inferences of the
suppressed cells are then detected and removed using linear programming-
based techniques. The detection method is effective for two-dimensional cases
but becomes intractable for tables with three or more dimensions [10, 11].
Partitioning defines a partition on sensitive data and restricts queries to
aggregate only complete blocks in the partition [7, 37]. Similarly, microaggregation
replaces clusters of sensitive values with their averages [21, 35]. Partitioning
and microaggregation methods usually assume a specific type of aggregation.
Moreover, their partitions are not based on dimension hierarchies inherent to
data and hence may contain many blocks that are meaningless to a user.
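To make microaggregation concrete, the following sketch groups sorted values into clusters of at least k and replaces each value with its cluster average. It is a simple fixed-size heuristic for one-dimensional data, assumed here for illustration; it is not the specific algorithm of [21] or [35], and the function name `microaggregate` and the sample data are hypothetical.

```python
# Sketch of univariate microaggregation with a fixed cluster size.
# The function name and sample data are hypothetical.
def microaggregate(values, k):
    """Replace each value with the mean of its cluster of k or more sorted neighbours.

    If fewer than k values exist in total, they form a single cluster.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start + k
        # Fold a short tail into the last cluster so no cluster falls below k.
        if len(order) - end < k:
            end = len(order)
        cluster = order[start:end]
        mean = sum(values[i] for i in cluster) / len(cluster)
        for i in cluster:
            result[i] = mean
        start = end
    return result


salaries = [30, 90, 32, 88, 31, 95, 60]
masked = microaggregate(salaries, k=3)
```

The overall SUM and AVG are unchanged by this replacement, but the clusters follow the value ordering rather than any dimension hierarchy, which reflects the limitation noted above.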
Perturbation-based methods have been proposed for preserving privacy
in data mining [2]. Random noise is added to sensitive values to preserve
privacy, while the statistical distribution is approximately reconstructed from
the perturbed data to facilitate data mining tasks. Protecting sensitive data in
OLAP is different from that in data mining. Unlike most data mining results,
such as classifications and association rules, the results of OLAP usually can-
not be obtained from distribution models alone. The methods proposed in [3]
can approximately reconstruct COUNTs from perturbed data with statistically
bounded errors, so OLAP tasks like classification can be fulfilled. However,
potential errors in individual values may prevent an OLAP user from gaining