Database Watermarking: A Systematic View - Database Security: Applications and Trends

Databases Reference

In-Depth Information

tuples selected. Given η tuples in relation R , roughly η/γ values of attribute

A are selected for watermark insertion.

For each selected tuple r , exactly one bit is chosen from watermark infor-

mation wm data and is embedded to r.A , where the watermark information

wm data consists of roughly η/γ bits generated from a shorter watermark wm

using error correcting code (ECC). The bit position that is chosen is deter-

mined by mapping

S 2 uniformly to the range of wm data indexes (this can

be done by using a pseudo-random number generator or using an embedding

map as stated in [21]). To embed the chosen bit b , the current categorical

value r.A is changed to another valid value of A , which is chosen from a list

L A of all valid values of A . In this process, any value a canbechosenfrom L A

(to replace r.A )aslongas a 's index in L A has the least significant bit b . This

flexibility in value selection can be exploited to maintain certain distribution

properties of A in watermark insertion.

For watermark detection, a number of tuples are selected the same way

as in watermark insertion, based on

S 1 . Then, for each selected tuple r , a bit

position in wm data is located the same way as in watermark insertion, based

on

S 2 . The corresponding bit value in wm data is extracted from the least

significant bit of the index of r.A in the list L A . After all of the tuples are

processed, the ECC takes as input wm data and produces the correspond-

ing wm . The ECC can tolerate certain errors in detecting wm data and still

produce the same wm in watermark detection.

This watermarking scheme has been applied to binned medical data in a

hierachical manner so as to protect copyright in the presence of generalization

attack that is specific to the binned data [2].

3 Distortion

While the watermarking errors introduced to numerical values can be made

small, thus tolerable to certain data applications as illustrated in [1], the

errors introduced to categorical data can be significant, at least to individual

values. In [13], Li, Guo, and Jajodia introduced a distortion-free scheme for

watermarking categorical data (it can also be directly applied to watermarking

numerical data with no errors). In this solution, all η tuples in a database

relation R are first securely divided into g groups according to a secret key

.

A different watermark is embedded and verified in each group independently.

As a result, any modifications to the watermarked data can be detected and

localized to the group level with high probabilities.

K

3.1 Distortion-Free Watermarking

Algorithms 1 and 2 describe the watermark insertion process. First, a (keyed)

tuple hash and a (keyed) primary key hash are computed for each tuple r i

using a HMAC function. The tuple hash values are computed based on a fixed

Database Security: Applications and Trends

Search WWH ::

Custom Search

Home