Database Watermarking: A Systematic View - Database Security: Applications and Trends

is a watermarking parameter used to control the percentage of tuples being selected. Because S_1 is pseudo-random, roughly η/γ tuples are selected, where η is the total number of tuples in relation R. Then, for each selected tuple, the scheme selects one attribute with index (S_2 mod ν) out of ν watermarkable numerical attributes indexed from 0 to ν − 1. For the selected attribute of a selected tuple, the scheme selects one bit with index (S_3 mod ξ) out of ξ least significant bits indexed from 0 to ξ − 1, where ξ is a watermarking parameter used to control the error that each numerical value can tolerate. The scheme then sets the selected bit of the selected attribute in the selected tuple to the mark value (S_4 mod 2). With a probability of 1/2, the underlying bit value is changed in this process. Due to the use of a cryptographically secure pseudo-random sequence generator, it is computationally infeasible for an attacker, without knowing the secret key, to derive where the watermark bits are embedded, what the mark bits are, and the correlations among the embedded locations and the embedded values.
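
The insertion steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: HMAC-SHA256 over the primary key stands in for the cryptographically secure pseudo-random sequence generator, a tuple is taken as selected when S_1 mod γ = 0, attribute values are non-negative integers, and the names `sequence` and `embed` are hypothetical.

```python
import hashlib
import hmac


def sequence(key, primary_key, count=4):
    """Return S_1..S_count for one tuple.

    Assumption: HMAC-SHA256 keyed with the secret key stands in for the
    scheme's pseudo-random sequence generator.
    """
    seed = primary_key.encode("utf-8")
    return [
        int.from_bytes(hmac.new(key, seed + bytes([i]), hashlib.sha256).digest(), "big")
        for i in range(1, count + 1)
    ]


def embed(rows, key, gamma, nu, xi):
    """Return a marked copy of rows; each row is (primary_key, [nu integers])."""
    marked = []
    for pk, values in rows:
        values = list(values)            # copy so the input rows stay untouched
        s1, s2, s3, s4 = sequence(key, pk)
        if s1 % gamma == 0:              # roughly 1 in gamma tuples is selected
            attr = s2 % nu               # attribute index in 0..nu-1
            bit = s3 % xi                # bit index among the xi least significant bits
            mark = s4 % 2                # mark value written into that bit
            values[attr] = (values[attr] & ~(1 << bit)) | (mark << bit)
        marked.append((pk, values))
    return marked
```

Because the mark overwrites the chosen bit rather than toggling it, the stored value changes only when the existing bit differs from the mark, which is the probability-1/2 change described above.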

For watermark detection, the scheme scans all the tuples in a suspicious database relation R, locates the marked bit positions, and computes the mark values at those bit positions exactly as in watermark insertion. To detect a watermark, the scheme compares the computed mark values to the corresponding bit values stored in R. A watermark is detected if the percentage of matches in such comparison is greater than τ, where τ ≥ 0.5 is a parameter that is related to the assurance of the detection process.
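
A hedged sketch of the detection step, again in Python. It recomputes the marked position and mark value for each tuple and reports the fraction of matches against τ. As assumptions for illustration only, HMAC-SHA256 stands in for the pseudo-random sequence generator, selection uses S_1 mod γ = 0, values are non-negative integers, and `mark_for` and `detect` are hypothetical names.

```python
import hashlib
import hmac


def mark_for(key, primary_key, gamma, nu, xi):
    """Return (attribute index, bit index, mark value), or None if not selected.

    Assumption: HMAC-SHA256 stands in for the pseudo-random sequence generator.
    """
    seed = primary_key.encode("utf-8")
    s = [
        int.from_bytes(hmac.new(key, seed + bytes([i]), hashlib.sha256).digest(), "big")
        for i in range(1, 5)
    ]
    if s[0] % gamma != 0:
        return None
    return s[1] % nu, s[2] % xi, s[3] % 2


def detect(rows, key, gamma, nu, xi, tau):
    """Return (fraction of matching marks, detected?) for rows of (primary_key, [ints])."""
    total = matches = 0
    for pk, values in rows:
        position = mark_for(key, pk, gamma, nu, xi)
        if position is None:
            continue                     # tuple carries no mark under this key
        attr, bit, mark = position
        total += 1
        matches += ((values[attr] >> bit) & 1) == mark
    fraction = matches / total if total else 0.0
    return fraction, fraction > tau
```

On unmarked data, or with the wrong key, each comparison matches with probability about 1/2, which is why a threshold τ well above 0.5 separates marked from unmarked relations.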

This scheme is suitable for watermarking some numerical data since the errors introduced in the watermarking process are under control. Parameter ξ is used to control the errors introduced to individual values; parameter γ is used to control the fraction of the numerical values that are modified in watermark insertion. These two parameters can be adjusted to constrain watermarking errors within measurement tolerance in many numerical data sets such as meteorological data, gene expression data, parameter data on semiconductor parts, and forest cover data [1].
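
To make the role of the two parameters concrete, the following Python sketch computes the distortion they imply. It assumes integer-valued attributes, so a flip of bit j moves a value by 2^j; the function name and the sample values of η, γ, and ξ are hypothetical and chosen only for illustration.

```python
def watermark_error_budget(eta, gamma, xi):
    """Rough distortion estimates implied by gamma and xi (integer attributes assumed)."""
    selected = eta / gamma       # expected number of selected tuples, roughly eta/gamma
    changed = selected / 2       # each mark changes the stored bit with probability 1/2
    max_change = 2 ** (xi - 1)   # highest usable bit index is xi - 1
    return selected, changed, max_change


print(watermark_error_budget(eta=1_000_000, gamma=100, xi=2))
# prints (10000.0, 5000.0, 2)
```

With these sample settings, about 0.5% of the tuples would actually change, and no value would move by more than 2 units.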

2.2 Watermarking Categorical Data

Since any bit change to a categorical value may render the value meaningless, Agrawal and Kiernan's scheme [1] cannot be directly applied to watermarking categorical data. To solve this problem, Sion [21] proposed to watermark a categorical attribute by changing some of its values to other values of the attribute (e.g., “red” is changed to “green”) if such change is tolerable in certain applications.

Sion's scheme is equivalent to Agrawal and Kiernan's scheme in selecting a number of tuples for watermarking a categorical attribute A. The scheme scans each tuple r and seeds a pseudo-random sequence generator S with a secret key K in concatenation with the tuple's primary key r.P. If S_1, the first number generated by S, satisfies (S_1 mod γ = 0), then the current tuple r is selected; otherwise the tuple is ignored, where γ controls the percentage of
