Databases Reference
In-Depth Information
tuples selected. Given η tuples in relation R , roughly η/γ values of attribute
A are selected for watermark insertion.
For each selected tuple r , exactly one bit is chosen from watermark infor-
mation wm data and is embedded to r.A , where the watermark information
wm data consists of roughly η/γ bits generated from a shorter watermark wm
using error correcting code (ECC). The bit position that is chosen is deter-
mined by mapping
S 2 uniformly to the range of wm data indexes (this can
be done by using a pseudo-random number generator or using an embedding
map as stated in [21]). To embed the chosen bit b , the current categorical
value r.A is changed to another valid value of A , which is chosen from a list
L A of all valid values of A . In this process, any value a canbechosenfrom L A
(to replace r.A )aslongas a 's index in L A has the least significant bit b . This
flexibility in value selection can be exploited to maintain certain distribution
properties of A in watermark insertion.
For watermark detection, a number of tuples are selected the same way
as in watermark insertion, based on
S 1 . Then, for each selected tuple r , a bit
position in wm data is located the same way as in watermark insertion, based
on
S 2 . The corresponding bit value in wm data is extracted from the least
significant bit of the index of r.A in the list L A . After all of the tuples are
processed, the ECC takes as input wm data and produces the correspond-
ing wm . The ECC can tolerate certain errors in detecting wm data and still
produce the same wm in watermark detection.
This watermarking scheme has been applied to binned medical data in a
hierachical manner so as to protect copyright in the presence of generalization
attack that is specific to the binned data [2].
3 Distortion
While the watermarking errors introduced to numerical values can be made
small, thus tolerable to certain data applications as illustrated in [1], the
errors introduced to categorical data can be significant, at least to individual
values. In [13], Li, Guo, and Jajodia introduced a distortion-free scheme for
watermarking categorical data (it can also be directly applied to watermarking
numerical data with no errors). In this solution, all η tuples in a database
relation R are first securely divided into g groups according to a secret key
.
A different watermark is embedded and verified in each group independently.
As a result, any modifications to the watermarked data can be detected and
localized to the group level with high probabilities.
K
3.1 Distortion-Free Watermarking
Algorithms 1 and 2 describe the watermark insertion process. First, a (keyed)
tuple hash and a (keyed) primary key hash are computed for each tuple r i
using a HMAC function. The tuple hash values are computed based on a fixed
 
Search WWH ::




Custom Search