while special care is taken to ensure the validity of the transactions in the extended
part.
A two-phase iterative process that improves the functionality of the inline approach
was proposed by Gkoulalas-Divanis and Verykios in [23]. The two phases are executed
iteratively until either (i) an exact solution of the
given problem instance is found, or (ii) a pre-specified number of phase iterations
(called oscillations) have taken place. In the first phase, the hiding algorithm
uses the inline approach in an effort to conceal the sensitive knowledge without
side-effects. If it succeeds, then the process terminates. Otherwise, the algorithm
proceeds to the second phase, which implements the dual counterpart of the inline
algorithm. In this phase, the hiding algorithm selectively removes inequalities from
the infeasible CSP, until the CSP becomes feasible, and then solves the CSP to attain
the sanitized dataset. This dataset is bound to suffer from side-effects (due to the
removal of constraints) and the purpose of the second phase is to recover the lost
itemsets by increasing their support and making them frequent again.
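The control flow of the oscillation scheme described above can be sketched as follows. This is only a minimal illustration, not the authors' implementation: the callables `inline_phase` and `dual_phase` are hypothetical placeholders for the two phases, and the convention that `inline_phase` returns `None` when its CSP is infeasible is an assumption made for this sketch.

```python
from typing import Callable, Optional, Tuple, TypeVar

D = TypeVar("D")  # whatever type represents a transaction dataset


def two_phase_hide(
    dataset: D,
    inline_phase: Callable[[D], Optional[D]],  # hypothetical: None if no exact solution
    dual_phase: Callable[[D], D],              # hypothetical: relax, solve, recover support
    max_oscillations: int,
) -> Tuple[D, bool, int]:
    """Alternate the two phases until an exact solution is found or the
    oscillation budget is exhausted.

    Returns (dataset, exact_solution_found, oscillations_used).
    """
    current = dataset
    for oscillation in range(1, max_oscillations + 1):
        # Phase 1: try to hide the sensitive knowledge without side-effects.
        exact = inline_phase(current)
        if exact is not None:
            return exact, True, oscillation
        # Phase 2: the dual counterpart produces a dataset that may carry
        # side-effects and is fed into the next oscillation.
        current = dual_phase(current)
    return current, False, max_oscillations
```

Passing the phases in as functions keeps the sketch independent of any particular CSP solver; only the termination logic (exact solution or oscillation limit) is taken from the description above.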
3.4 Metrics and Performance Analysis
In this section, we present two categories of measures related to the performance
of an association rule hiding algorithm. The first category consists of measures that
can either be optimized by a hiding scheme in the course of its execution, or be
adopted to allow for a fair comparison among different hiding schemes under a
unified framework. The measures belonging to this category are called internal and
were proposed by Oliveira et al. [41]. They are classified as either data sharing-based
or pattern sharing-based. The data sharing-based measures quantify the extent of
side-effects regarding sensitive association rules that failed to be hidden, legitimate
rules that were accidentally missed, and artifactual association rules that were created
by the sanitization process. On the other hand, the pattern sharing-based measures
quantify the extent of side-effects regarding non-sensitive association rules that were
lost or sensitive rules that were improperly hidden and can easily be recovered
through the use of inference channels. We then present a second set
of metrics, which measure external parameters such as the behavior of the algorithm
when applied to large datasets and its computational speed. The
measures of this category are called external and were proposed by Bertino et al. [12].
The proposed data sharing-based measures are the following:
(a) Hiding Failure (HF). This measure quantifies the percentage of the sensitive
patterns that remain exposed in the sanitized dataset. It is defined as the number
of restrictive association rules that appear in the sanitized dataset divided
by the number that appeared in the original dataset. Formally,
$$\mathrm{HF} = \frac{|R_P(D')|}{|R_P(D)|}$$
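As a rough illustration, the Hiding Failure ratio can be computed from sets of already-mined rules. The sketch below assumes that rules are represented as hashable values (plain strings here) and that mining of the original and sanitized datasets has been done elsewhere; the function name `hiding_failure` and its parameters are hypothetical and not part of the cited work.

```python
from typing import Hashable, Set


def hiding_failure(
    sensitive_rules: Set[Hashable],
    rules_original: Set[Hashable],
    rules_sanitized: Set[Hashable],
) -> float:
    """Ratio of sensitive rules still minable after sanitization to the
    sensitive rules that were minable from the original dataset."""
    exposed_before = sensitive_rules & rules_original
    exposed_after = sensitive_rules & rules_sanitized
    if not exposed_before:
        # The ratio is undefined when no sensitive rule was minable originally.
        raise ValueError("no sensitive rule appears in the original dataset")
    return len(exposed_after) / len(exposed_before)


# Example: two sensitive rules, one of which survives sanitization.
sensitive = {"a->b", "c->d"}
original = {"a->b", "c->d", "e->f"}
sanitized = {"c->d", "e->f"}
print(hiding_failure(sensitive, original, sanitized))  # 0.5
```

A value of 0 means every sensitive rule was hidden, while a value of 1 means none was.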