early works was relatively low. For example, the hiding rate of phase coding is
around 8 to 32 bps, while DSSS allows only 4 bps [5]. Such low rates make these
techniques impractical for covert communication. The objective of this work is
to develop a method that can hide a substantial quantity of data in a host audio
signal without causing audible distortion. The proposed scheme uses psychoacoustic
masking in both the time domain and the frequency domain to choose a series of
candidate frames. Data are inserted into the spectrum of selected short frames of
the host waveform using dither modulation. The approach can also be viewed as an
application of orthogonal frequency division multiplexing in which some of the
subcarriers are the original audio spectral components modified by the stego data.
To acquire synchronization in detection, a pilot signal is appended to the stego
data. Extraction requires no knowledge of the host signal or the stego data.
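As an illustration of the dither modulation idea mentioned above, the following is a minimal sketch and not the scheme used in this paper. It assumes a scalar quantizer with a hypothetical step size `step`, two dither offsets of plus and minus one quarter of the step, and a list of made-up spectral magnitudes; the function names `dm_embed` and `dm_extract` are illustrative only. Each bit selects one of two interleaved quantization lattices, and the detector picks the lattice nearest to the received value, so neither the host value nor the payload needs to be known at extraction.

```python
import math


def dm_embed(value, bit, step):
    """Quantize value onto the lattice selected by bit (0 or 1).

    Assumption: the two lattices are offset by -step/4 and +step/4,
    so they are separated by step/2.
    """
    dither = step / 4 if bit else -step / 4
    # floor(x + 0.5) rounds to the nearest integer without banker's rounding.
    return math.floor((value - dither) / step + 0.5) * step + dither


def dm_extract(value, step):
    """Return the bit whose lattice contains the point nearest to value."""
    errors = [abs(value - dm_embed(value, b, step)) for b in (0, 1)]
    return 0 if errors[0] <= errors[1] else 1


# Hypothetical spectral magnitudes and payload bits, for illustration only.
magnitudes = [12.7, 3.1, 8.45, 20.0]
bits = [1, 0, 0, 1]
step = 2.0

marked = [dm_embed(m, b, step) for m, b in zip(magnitudes, bits)]
# A perturbation smaller than step/4 models mild distortion of the stego signal.
recovered = [dm_extract(m + 0.3, step) for m in marked]
```

In this sketch the embedding error per value is at most half the step, and a bit survives any perturbation smaller than a quarter of the step, so the step size trades audibility against robustness.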
The rest of the paper is organized as follows. Section 2 discusses the methodology,
including data embedding, generation of the synchronization pilot, and extraction of
the embedded data. Section 3 describes the experiments and presents the results.
Section 4 concludes the paper. In the following discussion, the terms "data hiding"
and "watermarking" are used interchangeably.
2 Methodology
2.1 Selection of Candidate Audio Frames
There are two types of approach to data insertion, distinguished by how the hidden
information is distributed. In the first, the embedded data are spread relatively
evenly across a long period of time or over the entire image space. The simplest LSB
approach, for example, replaces the least significant bits of all digital samples
with an embedded sequence. Although the data capacity is large, this method is
susceptible to attacks. Another example is quantization index modulation, in which
several quantizers are used to introduce perturbations to a large number of samples
[7]. In a time-domain technique, an audio signal is divided into segments, and all
segments are watermarked with the same chaotic sequence, whose length equals that of
the segments [6].
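The LSB replacement described above can be sketched as follows. This is a minimal illustration, assuming integer PCM samples held in a Python list; the function names `lsb_embed` and `lsb_extract` and the sample values are hypothetical and do not come from the cited works.

```python
def lsb_embed(samples, bits):
    """Overwrite the least significant bit of the first len(bits) samples."""
    marked = [(s & ~1) | b for s, b in zip(samples, bits)]
    # Samples beyond the payload length are passed through unchanged.
    return marked + list(samples[len(bits):])


def lsb_extract(samples, count):
    """Read back the least significant bit of the first count samples."""
    return [s & 1 for s in samples[:count]]


# Hypothetical 16-bit PCM sample values, for illustration only.
host = [1000, -2001, 37, 0, -1, 512]
payload = [1, 0, 1, 1, 0]

stego = lsb_embed(host, payload)
extracted = lsb_extract(stego, len(payload))
```

Each sample changes by at most one quantization level, which is why the capacity is high, but any processing that alters the lowest bit (requantization, filtering, lossy compression) destroys the payload.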
The second type modifies brief signal segments in an audio waveform, or small
areas in an image, that are sparsely scattered over the entire signal. For example, a
patchwork technique [5] statistically modifies randomly chosen small image patches
according to the embedded data bit. In an audio watermarking system designed for
encoding television sound, data were embedded into selected segments distributed
over the signal [8].
The method proposed in this paper belongs to the latter category. Candidate frames
in the host audio signal are first selected and transformed with the discrete Fourier
transform. Watermark embedding is performed in the frequency domain. Studies on the
HAS [1,9] indicate that slight distortion in the neighborhood of a high-volume sound
is inaudible. The masked period after a loud sound is generally longer than that
prior to it. Therefore, the candidate frame is selected in a relatively quiet segment
immediately after a loud sound. The chosen segment must not be too quiet, though, so
that it can accommodate sufficient strength of the embedded signal. Meanwhile,
frequency-domain masking is also utilized. Spectral components adjacent to large
peaks, especially on the