quantization step, the introduced distortion is inaudible. A subjective test on several
music clips was carried out. In each piece, a number of frames were identified using
the HAS criterion, and data were embedded into them with various quantization steps.
Using a procedure based on the ABX method [12], a group of 10 people were independently
asked to listen to the original and the modified versions (A and B) of each
piece in a random order, and then to listen once more to a randomly chosen one (X).
They were asked to tell whether X was A or B. The rates of correct identification were
roughly 50%, that is, at chance level, indicating that the data embedding is imperceptible.
In contrast, adding white Gaussian noise at similar levels is clearly audible to most listeners.
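One way to check that an identification rate near 50% is consistent with guessing is an exact binomial test. The sketch below is illustrative only: the function name abx_p_value and the trial counts are hypothetical, since the text reports only the approximate rate and not the number of trials.

```python
import math

def abx_p_value(correct, trials):
    """Two-sided exact binomial p-value against pure guessing (p = 0.5)."""
    probs = [math.comb(trials, k) * 0.5 ** trials for k in range(trials + 1)]
    observed = probs[correct]
    # Sum the probability of every outcome at least as unlikely as the observed one.
    return min(1.0, sum(p for p in probs if p <= observed * (1 + 1e-9)))

# Hypothetical counts, not taken from the text: 52 correct answers in 100 trials.
print(f"p-value for 52/100 correct: {abx_p_value(52, 100):.3f}")
# Hypothetical counts for a clearly audible change: 10 correct answers in 10 trials.
print(f"p-value for 10/10 correct: {abx_p_value(10, 10):.5f}")
```

A large p-value means the listeners' answers cannot be distinguished from random guessing, which is the outcome expected when the embedding is imperceptible.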
Table 1. Signal-to-noise ratio of the embedded frame

Quantization step   max(|C_n|)   max(|C_n|)/2   max(|C_n|)/4   max(|C_n|)/8   max(|C_n|)/10
SNR (dB)            18.68        24.28          29.99          35.98          38.12
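In Table 1, each halving of the quantization step raises the SNR by about 5.6 to 6.0 dB, which is what one expects when the embedding noise power scales with the square of the step. The sketch below illustrates that trend only; it does not reproduce the absolute values in the table. It assumes a simple scalar dither quantizer in which a bit selects a lattice offset of 0 or half a step, real-valued coefficients, and Gaussian random numbers as a stand-in for one frame's C_n. None of these details are given in the text, and the names qim_embed and embedding_snr_db are hypothetical.

```python
import math
import random

def qim_embed(coeff, bit, step):
    """Quantize coeff onto the lattice selected by bit (offset 0 or step/2)."""
    dither = bit * step / 2.0
    return step * round((coeff - dither) / step) + dither

def embedding_snr_db(coeffs, bits, step):
    """SNR in dB of the original coefficients relative to the embedding error."""
    marked = [qim_embed(c, b, step) for c, b in zip(coeffs, bits)]
    signal = sum(c * c for c in coeffs)
    noise = sum((c - m) ** 2 for c, m in zip(coeffs, marked))
    return 10.0 * math.log10(signal / noise)

rng = random.Random(0)
coeffs = [rng.gauss(0.0, 1.0) for _ in range(4096)]  # assumed stand-in for C_n
bits = [rng.randint(0, 1) for _ in coeffs]           # random payload bits
peak = max(abs(c) for c in coeffs)                   # max(|C_n|)

snr_by_divisor = {d: embedding_snr_db(coeffs, bits, peak / d) for d in (8, 16, 32)}
for d, snr in snr_by_divisor.items():
    print(f"step = max|C_n|/{d}: SNR = {snr:.2f} dB")
```

Only fine steps are used here because, for this synthetic data, the uniform-error approximation behind the 6 dB rule breaks down when the step is comparable to the largest coefficient.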
Table 2. SNR of embedded pieces. Dither steps: ∆1 = max(|C_n|), ∆2 = max(|C_n|)/8

                                                            SNR (dB)
Host audio   f_s (kHz)  N_q (bits)  T (sec)        N_b      ∆1      ∆2
I: Classic   44.10      16          23.77    36    9,216    32.02   39.43
II: Classic  44.10      16          47.74    70    17,920   42.13   47.24
III: Pop     44.10      16          25.52    45    11,520   32.66   41.29
IV: Pop      44.10      16          46.83    87    22,272   31.63   41.32
V: Speech    22.05      8           3.47     8     2,048    31.80   41.20
VI: Speech   22.05      8           2.51     6     1,536    33.21   36.45
3.3 Robustness Test
Tests for robustness against attacks such as AWGN interference and MP3 coding
were performed on audio pieces watermarked with the largest quantization step.
Additive Noise Interference. AWGN was added to the marked audio. Fig. 5 shows
the constellation of the extracted stream with QAM watermark data and a Hamming-windowed
pilot. Ideally, the watermarks should all appear at four points in the complex
plane: 1+j, −1+j, −1−j, and 1−j, as indicated by the thick dots. The scattered
circles represent a noise-contaminated signal at SNR = 30 dB referenced to the average
power of the waveform. Clearly, synchronization and accurate decoding of watermark
symbols can be achieved as long as the symbols remain in the correct quadrants.
Progressively increasing the noise caused errors to occur, until the search or decoding
failed. Fig. 6 gives the relation between SNR and the bit error rate. Three types of
signals were used in the experiment: (1) hi-fi music with f_s = 44.10 kHz, (2) speech
with f_s = 22.05 kHz, and (3) low-quality music or speech with f_s = 8.0 kHz. The embedding
bandwidth was W/4. The results show that noise tolerance depends mainly on
the modulation scheme, essentially the number of bits contained in a symbol, D, and
to a much lesser extent on the sampling frequency and the particular type of signal.
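The quadrant-based decision described above can be sketched with a small simulation. This is a simplified illustration, not the experiment behind Fig. 6: noise is added directly to the four-point symbols rather than to an audio waveform, the SNR is referenced to the mean symbol power, and the bit-to-symbol mapping in map_bits is an assumption, since the text does not specify one. All function names are hypothetical.

```python
import math
import random

def map_bits(b0, b1):
    """Map two bits to one of the four points 1+j, -1+j, -1-j, 1-j (assumed mapping)."""
    return complex(1 - 2 * b0, 1 - 2 * b1)

def decode(symbol):
    """Decide by quadrant: a negative real or imaginary part reads as bit 1."""
    return (int(symbol.real < 0), int(symbol.imag < 0))

def add_awgn(symbols, snr_db, rng):
    """Add complex Gaussian noise at the given SNR relative to mean symbol power."""
    power = sum(abs(s) ** 2 for s in symbols) / len(symbols)
    sigma = math.sqrt(power / 10 ** (snr_db / 10) / 2)  # per real/imag component
    return [s + complex(rng.gauss(0, sigma), rng.gauss(0, sigma)) for s in symbols]

def bit_error_rate(snr_db, n_symbols=20000, seed=1):
    """Simulate transmission of random bit pairs and return the fraction of bit errors."""
    rng = random.Random(seed)
    sent = [(rng.randint(0, 1), rng.randint(0, 1)) for _ in range(n_symbols)]
    noisy = add_awgn([map_bits(*b) for b in sent], snr_db, rng)
    errors = sum((a != c) + (b != d)
                 for (a, b), (c, d) in zip(sent, map(decode, noisy)))
    return errors / (2 * n_symbols)

for snr_db in (0, 5, 10, 30):
    print(f"SNR = {snr_db:2d} dB: BER = {bit_error_rate(snr_db):.4f}")
```

In this simplified model a symbol is decoded correctly whenever the noise leaves it in its original quadrant, so errors appear only once the noise amplitude becomes comparable to the distance from a constellation point to the axes.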