a) Privacy Breach: An upward $\rho_1$-to-$\rho_2$ privacy breach exists with respect to property $Q$ if $\exists v \in S_V$ such that $P[Q(U_i)] \le \rho_1$ and $P[Q(U_i) \mid R(U_i) = v] \ge \rho_2$.
Conversely, a downward $\rho_2$-to-$\rho_1$ privacy breach exists with respect to property $Q$ if $\exists v \in S_V$ such that $P[Q(U_i)] \ge \rho_2$ and $P[Q(U_i) \mid R(U_i) = v] \le \rho_1$.
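The two breach conditions can be checked numerically once a prior over original values and the perturbation probabilities are known. The sketch below is only an illustration under stated assumptions: the names `posterior` and `breaches`, the dictionary layout of `prior` and `transition`, and the toy numbers are all hypothetical and do not come from the text. It computes $P[Q(U_i) \mid R(U_i) = v]$ with Bayes' rule and compares prior and posterior against $\rho_1$ and $\rho_2$.

```python
from typing import Callable, Dict, Hashable, Tuple

Prior = Dict[Hashable, float]                        # P[U_i = u]
Transition = Dict[Hashable, Dict[Hashable, float]]   # transition[u][v] = p(u -> v)


def posterior(prior: Prior, transition: Transition,
              prop: Callable[[Hashable], bool], v: Hashable) -> float:
    """Return P[Q(U_i) | R(U_i) = v] using Bayes' rule."""
    joint_q = sum(prior[u] * transition[u].get(v, 0.0) for u in prior if prop(u))
    evidence = sum(prior[u] * transition[u].get(v, 0.0) for u in prior)
    if evidence == 0:
        raise ValueError("v is never produced by the randomization operator")
    return joint_q / evidence


def breaches(prior: Prior, transition: Transition,
             prop: Callable[[Hashable], bool], v: Hashable,
             rho1: float, rho2: float) -> Tuple[bool, bool]:
    """Return (upward_breach, downward_breach) for the revealed value v."""
    p_prior = sum(p for u, p in prior.items() if prop(u))
    p_post = posterior(prior, transition, prop, v)
    upward = p_prior <= rho1 and p_post >= rho2
    downward = p_prior >= rho2 and p_post <= rho1
    return upward, downward


# Hypothetical toy data: two original values, two perturbed values.
prior = {0: 0.9, 1: 0.1}
transition = {0: {"a": 0.9, "b": 0.1}, 1: {"a": 0.1, "b": 0.9}}
is_one = lambda u: u == 1   # the property Q

print(breaches(prior, transition, is_one, "b", rho1=0.2, rho2=0.4))
```

In this toy setting the property is rare a priori but becomes much more likely once the value "b" is revealed, which is exactly the situation the upward definition is meant to flag.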
b) Amplification: Let the perturbed database be $V = \{V_1, \ldots, V_N\}$, with domain $S_V$, and corresponding index set $I_V$. For example, given the sample database $U$ discussed above, and assuming that each attribute is distorted to produce a value within its original domain, the distortion may result in $V = [5, 7, 2, 12]$, which maps to

V =
  Adult    Male    Elementary
  Adult    Female  Elementary
  Child    Male    Graduate
  Senior   Female  Graduate
Let the probability of an original customer record $U_i = u$, $u \in I_U$, being perturbed to a record $V_i = v$, $v \in I_V$, be $p(u \to v)$, and let $A$ denote the matrix of these transition probabilities, with $A_{vu} = p(u \to v)$. With the above notation, a randomization operator $R(u)$ is at most $\gamma$-amplifying for $v \in S_V$ if

$$\forall u_1, u_2 \in S_U : \quad \frac{p[u_1 \to v]}{p[u_2 \to v]} \le \gamma$$

where $\gamma \ge 1$ and $\exists u : p[u \to v] > 0$. Operator $R(u)$ is at most $\gamma$-amplifying if it is at most $\gamma$-amplifying for all qualifying $v \in S_V$.
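For a finite domain, the smallest $\gamma$ that satisfies the amplification definition can be read directly off the transition matrix: for each perturbed value $v$ it is the ratio of the largest to the smallest entry in row $v$ of $A$. The following is a minimal sketch, assuming $A$ is stored as a list of rows with `A[v][u]` equal to $p(u \to v)$; the function names and the example matrix are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]  # A[v][u] = p(u -> v)


def amplification_for(A: Matrix, v: int) -> float:
    """Smallest gamma such that p[u1 -> v] / p[u2 -> v] <= gamma for all u1, u2."""
    row = A[v]
    if max(row) <= 0:
        raise ValueError("v does not qualify: no u has p[u -> v] > 0")
    smallest = min(row)
    # If some u can never produce v while another can, the ratio is unbounded.
    return math.inf if smallest == 0 else max(row) / smallest


def amplification(A: Matrix) -> float:
    """Smallest gamma that holds for every qualifying perturbed value v."""
    return max(amplification_for(A, v)
               for v in range(len(A)) if max(A[v]) > 0)


# Hypothetical 3-value domain: keep the value with probability 0.5,
# otherwise move to one of the two other values with probability 0.25 each.
A = [
    [0.50, 0.25, 0.25],
    [0.25, 0.50, 0.25],
    [0.25, 0.25, 0.50],
]
print(amplification(A))
```

A matrix with any zero entry in a qualifying row yields an unbounded ratio, so under this sketch such an operator is not $\gamma$-amplifying for any finite $\gamma$.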
c) Breach Prevention: Let $R$ be a randomization operator, $v \in S_V$ be a randomized value such that $\exists u : p[u \to v] > 0$, and $\rho_1, \rho_2$ ($0 < \rho_1 < \rho_2 < 1$) be two probabilities as per the above privacy breach definition. Then, if $R$ is at most $\gamma$-amplifying for $v$, revealing "$R(u) = v$" will cause neither upward ($\rho_1$-to-$\rho_2$) nor downward ($\rho_2$-to-$\rho_1$) privacy breaches with respect to any property if the following condition is satisfied:

$$\frac{\rho_2 (1 - \rho_1)}{\rho_1 (1 - \rho_2)} > \gamma$$

If this holds, $R$ is said to support $(\rho_1, \rho_2)$-privacy guarantees.
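The breach-prevention condition is a single inequality, so it is easy to test for a candidate $\gamma$. The sketch below is an illustration only; the function names `gamma_bound` and `supports_privacy` are hypothetical, and the sample thresholds are chosen purely for the example.

```python
def gamma_bound(rho1: float, rho2: float) -> float:
    """Return rho2 * (1 - rho1) / (rho1 * (1 - rho2)), the strict upper limit on gamma."""
    if not 0 < rho1 < rho2 < 1:
        raise ValueError("require 0 < rho1 < rho2 < 1")
    return rho2 * (1 - rho1) / (rho1 * (1 - rho2))


def supports_privacy(gamma: float, rho1: float, rho2: float) -> bool:
    """True if an at most gamma-amplifying operator meets the (rho1, rho2) condition."""
    if gamma < 1:
        raise ValueError("gamma must be at least 1")
    return gamma_bound(rho1, rho2) > gamma


# Example thresholds (assumed for illustration): rho1 = 5%, rho2 = 50%.
print(supports_privacy(10.0, 0.05, 0.5))
print(supports_privacy(25.0, 0.05, 0.5))
```

With $\rho_1 = 0.05$ and $\rho_2 = 0.5$ the left-hand side of the condition evaluates to $0.475 / 0.025 = 19$, so any operator whose amplification stays below 19 satisfies it, while one with $\gamma = 25$ does not.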
Accuracy Metrics
Applying association rule mining on a perturbed database can lead to two kinds of errors. Firstly, there may be support errors, where a correctly identified frequent itemset may be associated with an incorrect support value. Secondly, there may be identity errors, wherein either a genuine frequent itemset