distribution for that tuple's altered value. The detection process then relies
on discovering this induced statistical bias.
The authors validate the solution both theoretically and experimentally
on real data (Wal-Mart sales). They demonstrate resilience to both alteration
and data-loss attacks, for example recovering over 75% of the
watermark from less than 20% of the data.
Solution Details. For illustration purposes, let there be a set of discrete
attributes {A, B} and a primary data key K, not necessarily discrete. Any
attribute X ∈ {A, B} can yield a value out of n_X possibilities (e.g., city
names, airline names). Let the number of tuples in the database be N. Let
b(x) be the number of bits required for the accurate representation of value
x and msb(x, b) its most significant b bits. If b(x) < b, x is left-padded with
(b − b(x)) zeroes to form a b-bit result. Similarly, lsb(x, b) is used to denote
the least significant b bits of x. If wm denotes a watermark to be embedded,
of length |wm|, then wm[i] will be the i-th bit of wm. Let set_bit(d, a, b) be
a function that returns value d with bit position a set to the truth-value
of b. In any following mathematical expression let the symbol "&" signify a
bit-AND operation. Let T_j(X) be the value of attribute X in tuple j. Let
{a_1, ..., a_{n_A}} be the discrete potential values of attribute A. These are
distinct and can be sorted (e.g., by ASCII value). Let f_A(a_j) be the normalized
(to 1.0) occurrence frequency of value a_j in attribute A; f_A(a_j) models the
de-facto occurrence probability of value a_j in attribute A.
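To make this notation concrete, the following is a minimal Python sketch of
the helper functions, under the assumption that values have already been
mapped to non-negative integers (string values, e.g., via their ASCII bytes);
the function names mirror the text, everything else is illustrative.

def b(x: int) -> int:
    # number of bits required for the accurate representation of value x
    return max(1, x.bit_length())

def msb(x: int, nbits: int) -> int:
    # most significant nbits bits of x; when b(x) < nbits, the left
    # zero-padding leaves the integer value unchanged
    return x >> max(0, b(x) - nbits)

def lsb(x: int, nbits: int) -> int:
    # least significant nbits bits of x, via a bit-AND ("&") mask
    return x & ((1 << nbits) - 1)

def set_bit(d: int, a: int, bit: int) -> int:
    # value d with bit position a set to the truth-value of bit
    return d | (1 << a) if bit else d & ~(1 << a)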
The encoding algorithm (see Figure 11) starts by discovering a set of "fit"
tuples, determined directly by the association between A and the primary
relation key K. These tuples are then considered for mark encoding.
wm_embed_alt(K, A, wm, k_1, e, ECC)
    wm_data ← ECC.encode(wm, wm.len)
    idx ← 0
    for (j ← 1; j < N; j ← j + 1)
        if (H(T_j(K), k_1) mod e = 0) then
            t ← set_bit(H(T_j(K), k_1), 0, wm_data[idx])
            T_j(A) ← a_t
            embedding_map[T_j(K)] ← idx
            idx ← idx + 1
    return embedding_map

Fig. 11. Encoding Algorithm (alternative using embedding map shown)
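A compact Python rendering of Figure 11 may help. The keyed hash H is not
fixed by the text, so an HMAC-SHA1 digest read as an integer stands in for it
here; the relation is modeled as a dict from primary-key values to A-values;
and the reduction of t to a valid index into {a_1, ..., a_{n_A}}, left implicit
in the figure, is done modulo n_A. All of these choices are illustrative
assumptions, not part of the original algorithm.

import hmac, hashlib

def H(value: str, k1: bytes) -> int:
    # keyed one-way hash H(., k_1); HMAC-SHA1 as an integer is an assumption
    return int.from_bytes(hmac.new(k1, value.encode(), hashlib.sha1).digest(), "big")

def set_bit(d: int, a: int, bit: int) -> int:
    # value d with bit position a set to the truth-value of bit
    return d | (1 << a) if bit else d & ~(1 << a)

def wm_embed_alt(relation: dict, a_values: list, wm_data: list, k1: bytes, e: int) -> dict:
    # relation maps T_j(K) -> T_j(A); a_values is the sorted list
    # {a_1, ..., a_nA}; wm_data holds the ECC-encoded watermark bits
    embedding_map = {}
    idx = 0
    for key in relation:                 # iterate over tuples T_j
        h = H(key, k1)
        if h % e == 0:                   # the "fit" test of Step One
            # t <- set_bit(H(T_j(K), k_1), 0, wm_data[idx]); wrap-around of
            # idx and the mod n_A reduction are assumptions made for the sketch
            t = set_bit(h, 0, wm_data[idx % len(wm_data)]) % len(a_values)
            relation[key] = a_values[t]  # T_j(A) <- a_t
            embedding_map[key] = idx     # remember where each bit was placed
            idx += 1
    return embedding_map

Note that, because selection depends only on H(T_j(K), k_1), the same set of
fit tuples can be rediscovered at detection time from the secret key alone.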
Step One. A tuple T_i is said to be "fit" for encoding iff H(T_i(K), k_1) mod e =
0, where e is an adjustable encoding parameter determining the percentage of
considered tuples and k_1 is a secret max(b(N), b(A))-bit key. In other words,
a tuple is considered "fit" if its primary key value satisfies a certain secret
criterion.
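Because H behaves like a uniform random function of the primary key, the test
selects roughly a 1/e fraction of the tuples. A quick self-contained check of
this, where the key material and tuple keys below are made up and H is the
same illustrative HMAC-based stand-in used above:

import hmac, hashlib

def H(value: str, k1: bytes) -> int:
    return int.from_bytes(hmac.new(k1, value.encode(), hashlib.sha1).digest(), "big")

k1 = b"example-secret-k1"               # made-up stand-in for the secret key k_1
e = 5                                   # expect roughly a 1/e fraction of fit tuples
keys = [f"K{i}" for i in range(10000)]  # made-up primary key values
fit = [k for k in keys if H(k, k1) % e == 0]
print(len(fit) / len(keys))             # prints a value close to 1/e = 0.2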
 