Databases Reference
In-Depth Information
•
Verifiability
: A watermark solution is said to be
private
if the detection of
a watermark can only be performed by someone who owns a secret key
and can only be proven once to the public (e.g., to the court). After this
one-time proof, the secret key is known to the public and the embedded
watermark can be easily destroyed by malicious users. A watermark solu-
tion is said to be
public
if the detection of a watermark can be publicly
proven by anyone, as many times as necessary.
•
Data structure
: Different watermarking schemes are designed to accom-
modate different structural information of the underlying data, including
relational databases (with or without primary keys), data cubes, streaming
data, and XML data.
Based on the above classification model, a systematic review of database
watermarking is presented in the rest of this chapter. The review covers typ-
ical watermarking schemes as well as some new results, not intended to be
complete, but rather to present a coherent picture from the author's point
of view. The reader is referred to another chapter in this topic for a comple-
mentary review on database watermarking schemes. For consistency reasons,
the same notation is used as much as possible in this interpretation of differ-
ent schemes (thus the interpretation may not be exactly the same as those
given in the literature). In particular, this interpretation uses a cryptograph-
ically secure pseudo-random sequence generator
S
seeded with a secret key
K
in concatenation with some other input
X
. Given a sequence of numbers
S
1
,
, it is computationally infeasible to derive the secret
key nor to predict the next number in the sequence. Alternatively, one can use
a keyed hash message authentication code
HMAC
to generate the sequence
with different secret keys (i.e.,
S
2
,...
generated by
S
S
i
=
HMAC
(
K
i
,X
)).
2 Data Type
2.1 Watermarking Numerical Data
The first well-known database watermarking scheme was proposed by
Agrawal and Kiernan [1] for watermarking numerical values in relational
databases. The fundamental assumption is that the watermarked database
can tolerate a small amount of errors: it is acceptable to change a small num-
ber of
ξ
least significant bits in some numeric values; however, the value of
data is significantly reduced if a large number of the bits are changed. The ba-
sic idea is to ensure that those bit positions contain specific values determined
by a secret key
. The bit pattern constitutes a watermark.
For watermark insertion, the scheme scans each tuple
r
in a relation
R
and seeds a cryptographically secure pseudo-random sequence generator
K
S
with the secret key
K
in concatenation with the tuple's primary key
r.P
.
S
1
mod
γ
= 0),
then the current tuple
r
is selected, otherwise the tuple is ignored, where
γ
S
i
be the
i
-th number generated by
S
S
1
satisfies (
Let
.If