Database Watermarking for Copyright Protection - Database Security: Applications and Trends

Databases Reference

In-Depth Information

Single Bit Direct Domain Encoding

In [1,16] Kiernan, Agrawal et.al. propose a direct domain encoding of a single

bit watermark in a numeric relational database.

Overview. Its main algorithm proceeds as follows. A subset of tuples are

selected based on a secret criteria; for each tuple, a secret attribute and corre-

sponding least significant ( ξ ) bit position are chosen. This bit position is then

altered according to yet another secret criteria that is directly correlated to

the watermark bit to be encoded. The main assumption is, that changes can

be made to any attribute in a tuple at any least significant ξ bit positions.

At watermark detection time, the process will re-discover the watermarked

tuples and, for each detected accurate encoding, become more “confident” of

a true-positive detection.

There are a set of important assumptions underlying this method. Maybe

the most important one is that “the relational table being watermarked is such

that if all or a large number of the ξ least significant bits of any attribute are

dropped or perturbed, then the value of the data is significantly reduced.

However, it is possible to change a small number of the bits and not decrease

the value of the data significantly” [16].

The authors make an argument for this being a reasonable assumption as

such techniques have been used by publishers of topics of mathematical tables

for a long time - e.g., by introducing small errors in published logarithm

tables and astronomical ephemerides to identify pirated copies [15]. Examples

of real-world data sets that satisfy such an assumption are given, including

tables of parametric specifications (mechanical, electrical, electronic, chemical,

etc.), surveys (geological, climatic, etc.), and life sciences data (e.g., gene

expression).

Solution Details. For consistency, the original notation is used: a database

relation R with the following schema is R ( P, A 0 ,...,A ν − 1 ), is assumed, with

P the primary key attribute. All ν attributes A 0 ,...,A ν − 1 are candidates for

marking: the values are assumed such that small changes in the ξ least signifi-

cant bits are imperceptible. γ denotes a control parameter that determines the

average number ω of tuples marked ( ω = γ ), where η is the number of tuples

in the database. r.X is used to denote the value of attribute X in tuple r , α

denotes a “significance level” and τ a “threshold” for the test of “detecting a

watermark”.

K

is a key known only to the database owner, and there exists

G

,

a pseudo-random sequence number generator [23] (next(

G

) denotes the next

generated sequence number).

Note: There are a set of changes between the initial proposed scheme in [16]

and its journal version [1]. Here we discuss the (more robust) journal version.

Watermark insertion is illustrated in Figure 4. The main steps of the al-

gorithm are as follows. Initially (step 2) the random sequence generator is

initialized such that its output is distinct for any given distinct tuple value.

This mechanism is deployed in order to achieve a certain tuple ordering inde-

pendence of the encoding. The output of

G

is then used to determine: (i) if

Database Security: Applications and Trends

Search WWH ::

Custom Search

Home