distribution for that tuple's altered value. The detection process then relies
on discovering this induced statistical bias.
The authors validate the solution both theoretically and experimentally
on real data (Wal-Mart sales). They demonstrate resilience to both alteration
and data-loss attacks, for example recovering over 75% of the
watermark from less than 20% of the data.
Solution Details. For illustration purposes, let there be a set of discrete
attributes {A, B} and a primary data key K, not necessarily discrete. Any
attribute X ∈ {A, B} can yield a value out of n_X possibilities (e.g., city
names, airline names). Let the number of tuples in the database be N. Let
b(x) be the number of bits required for the accurate representation of value
x and msb(x, b) its most significant b bits. If b(x) < b, x is left-padded with
(b − b(x)) zeroes to form a b-bit result. Similarly, lsb(x, b) is used to denote
the least significant b bits of x. If wm denotes a watermark to be embedded,
of length |wm|, then wm[i] will be the i-th bit of wm. Let set_bit(d, a, b) be
a function that returns value d with bit position a set to the truth-value
of b. In any following mathematical expression let the symbol "&" signify a
bit-AND operation. Let T_j(X) be the value of attribute X in tuple j. Let
{a_1, ..., a_{n_A}} be the discrete potential values of attribute A. These are
distinct and can be sorted (e.g., by ASCII value). Let f_A(a_j) be the normalized
(to 1.0) occurrence frequency of value a_j in attribute A; f_A(a_j) models the
de-facto occurrence probability of value a_j in attribute A.
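To make this notation concrete, the following is a minimal Python sketch of
the helper functions, under the assumption that values have already been
mapped to non-negative integers (string values, e.g., via their ASCII bytes);
the function names mirror the text, everything else is illustrative.

def b(x: int) -> int:
    # number of bits required for the accurate representation of value x
    return max(1, x.bit_length())

def msb(x: int, nbits: int) -> int:
    # most significant nbits bits of x; when b(x) < nbits, the left
    # zero-padding leaves the integer value unchanged
    return x >> max(0, b(x) - nbits)

def lsb(x: int, nbits: int) -> int:
    # least significant nbits bits of x, via a bit-AND ("&") mask
    return x & ((1 << nbits) - 1)

def set_bit(d: int, a: int, bit: int) -> int:
    # value d with bit position a set to the truth-value of bit
    return d | (1 << a) if bit else d & ~(1 << a)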
The encoding algorithm (see Figure 11) starts by discovering a set of "fit"
tuples, determined directly by the association between A and the primary
relation key K. These tuples are then considered for mark encoding.
wm_embed_alt(K, A, wm, k_1, e, ECC)
    wm_data ← ECC.encode(wm, wm.len)
    idx ← 0
    for (j ← 1; j < N; j ← j + 1)
        if (H(T_j(K), k_1) mod e = 0) then
            t ← set_bit(H(T_j(K), k_1), 0, wm_data[idx])
            T_j(A) ← a_t
            embedding_map[T_j(K)] ← idx
            idx ← idx + 1
    return embedding_map

Fig. 11. Encoding Algorithm (alternative using embedding map shown)
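A compact Python rendering of Figure 11 may help. The keyed hash H is not
fixed by the text, so an HMAC-SHA1 digest read as an integer stands in for it
here; the relation is modeled as a dict from primary-key values to A-values;
and the reduction of t to a valid index into {a_1, ..., a_{n_A}}, left implicit
in the figure, is done modulo n_A. All of these choices are illustrative
assumptions, not part of the original algorithm.

import hmac, hashlib

def H(value: str, k1: bytes) -> int:
    # keyed one-way hash H(., k_1); HMAC-SHA1 as an integer is an assumption
    return int.from_bytes(hmac.new(k1, value.encode(), hashlib.sha1).digest(), "big")

def set_bit(d: int, a: int, bit: int) -> int:
    # value d with bit position a set to the truth-value of bit
    return d | (1 << a) if bit else d & ~(1 << a)

def wm_embed_alt(relation: dict, a_values: list, wm_data: list, k1: bytes, e: int) -> dict:
    # relation maps T_j(K) -> T_j(A); a_values is the sorted list
    # {a_1, ..., a_nA}; wm_data holds the ECC-encoded watermark bits
    embedding_map = {}
    idx = 0
    for key in relation:                 # iterate over tuples T_j
        h = H(key, k1)
        if h % e == 0:                   # the "fit" test of Step One
            # t <- set_bit(H(T_j(K), k_1), 0, wm_data[idx]); wrap-around of
            # idx and the mod n_A reduction are assumptions made for the sketch
            t = set_bit(h, 0, wm_data[idx % len(wm_data)]) % len(a_values)
            relation[key] = a_values[t]  # T_j(A) <- a_t
            embedding_map[key] = idx     # remember where each bit was placed
            idx += 1
    return embedding_map

Note that, because selection depends only on H(T_j(K), k_1), the same set of
fit tuples can be rediscovered at detection time from the secret key alone.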
Step One. A tuple T_i is said to be "fit" for encoding iff H(T_i(K), k_1) mod e =
0, where e is an adjustable encoding parameter determining the percentage of
considered tuples and k_1 is a secret max(b(N), b(A))-bit key. In other words,
a tuple is considered "fit" if its primary key value satisfies a certain secret
criterion.
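Because H behaves like a uniform random function of the primary key, the test
selects roughly a 1/e fraction of the tuples. A quick self-contained check of
this, where the key material and tuple keys below are made up and H is the
same illustrative HMAC-based stand-in used above:

import hmac, hashlib

def H(value: str, k1: bytes) -> int:
    return int.from_bytes(hmac.new(k1, value.encode(), hashlib.sha1).digest(), "big")

k1 = b"example-secret-k1"               # made-up stand-in for the secret key k_1
e = 5                                   # expect roughly a 1/e fraction of fit tuples
keys = [f"K{i}" for i in range(10000)]  # made-up primary key values
fit = [k for k in keys if H(k, k1) % e == 0]
print(len(fit) / len(keys))             # prints a value close to 1/e = 0.2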
 