Databases Reference
In-Depth Information
Single Bit Direct Domain Encoding
In [1,16] Kiernan, Agrawal et.al. propose a direct domain encoding of a single
bit watermark in a numeric relational database.
Overview. Its main algorithm proceeds as follows. A subset of tuples are
selected based on a secret criteria; for each tuple, a secret attribute and corre-
sponding least significant ( ξ ) bit position are chosen. This bit position is then
altered according to yet another secret criteria that is directly correlated to
the watermark bit to be encoded. The main assumption is, that changes can
be made to any attribute in a tuple at any least significant ξ bit positions.
At watermark detection time, the process will re-discover the watermarked
tuples and, for each detected accurate encoding, become more “confident” of
a true-positive detection.
There are a set of important assumptions underlying this method. Maybe
the most important one is that “the relational table being watermarked is such
that if all or a large number of the ξ least significant bits of any attribute are
dropped or perturbed, then the value of the data is significantly reduced.
However, it is possible to change a small number of the bits and not decrease
the value of the data significantly” [16].
The authors make an argument for this being a reasonable assumption as
such techniques have been used by publishers of topics of mathematical tables
for a long time - e.g., by introducing small errors in published logarithm
tables and astronomical ephemerides to identify pirated copies [15]. Examples
of real-world data sets that satisfy such an assumption are given, including
tables of parametric specifications (mechanical, electrical, electronic, chemical,
etc.), surveys (geological, climatic, etc.), and life sciences data (e.g., gene
expression).
Solution Details. For consistency, the original notation is used: a database
relation R with the following schema is R ( P, A 0 ,...,A ν 1 ), is assumed, with
P the primary key attribute. All ν attributes A 0 ,...,A ν 1 are candidates for
marking: the values are assumed such that small changes in the ξ least signifi-
cant bits are imperceptible. γ denotes a control parameter that determines the
average number ω of tuples marked ( ω = γ ), where η is the number of tuples
in the database. r.X is used to denote the value of attribute X in tuple r , α
denotes a “significance level” and τ a “threshold” for the test of “detecting a
watermark”.
K
is a key known only to the database owner, and there exists
G
,
a pseudo-random sequence number generator [23] (next(
G
) denotes the next
generated sequence number).
Note: There are a set of changes between the initial proposed scheme in [16]
and its journal version [1]. Here we discuss the (more robust) journal version.
Watermark insertion is illustrated in Figure 4. The main steps of the al-
gorithm are as follows. Initially (step 2) the random sequence generator is
initialized such that its output is distinct for any given distinct tuple value.
This mechanism is deployed in order to achieve a certain tuple ordering inde-
pendence of the encoding. The output of
G
is then used to determine: (i) if
Search WWH ::




Custom Search