transmit the entire picture while delivering the RoI with higher quality. Among the
class of such systems, some employ JPEG2000 with RoI support and conditional
replenishment for exploiting correlation among successive frames [16]. Parts of the
image that are not replenished can be copied from the previous frame or a back-
ground store.
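
The conditional replenishment idea described above can be illustrated with a minimal sketch. It is not the scheme of [16]: the function name `replenish`, the block size, the mean-absolute-difference test and the threshold are all assumptions made for illustration, and the replenished blocks are copied losslessly rather than being JPEG2000 coded. Frames are plain lists of pixel rows, and `fallback` stands for either the previous frame or the background store.

```python
def replenish(current, fallback, block=8, threshold=4.0):
    """Block-wise conditional replenishment (illustrative sketch only).

    current  -- new frame, as a list of rows of pixel intensities
    fallback -- previous frame or background store, same dimensions
    Returns the reconstructed frame and the (block_row, block_col)
    indices of the blocks that had to be replenished (transmitted).
    """
    height, width = len(current), len(current[0])
    # Start from the fallback: blocks that are not replenished are simply copied.
    recon = [row[:] for row in fallback]
    sent = []
    for y0 in range(0, height, block):
        for x0 in range(0, width, block):
            y1, x1 = min(y0 + block, height), min(x0 + block, width)
            # Hypothetical change test: mean absolute difference per pixel.
            diff = sum(abs(current[y][x] - fallback[y][x])
                       for y in range(y0, y1) for x in range(x0, x1))
            if diff / ((y1 - y0) * (x1 - x0)) > threshold:
                # A real system would send a coded version of this block;
                # here it is copied exactly.
                for y in range(y0, y1):
                    recon[y][x0:x1] = current[y][x0:x1]
                sent.append((y0 // block, x0 // block))
    return recon, sent
```

Only the blocks listed in `sent` would need to be transmitted; the rest of the picture is reconstructed at the receiver from data it already holds.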
In our own work, we have proposed a video transmission system for interactive
pan/tilt/zoom [17]. This system crops the RoI sequence from the high-resolution
video and encodes it using H.264/AVC. The RoI cropping is adapted to yield effi-
cient motion compensation in the video encoder. The RoI adjustment is confined to
ensure that the user does not notice the manipulation and experiences accurate RoI
control. The normal mode of operation for this system is streaming live content but
we also allow the user to rewind and play back older video. Note that in the second
mode of operation, the high-resolution video is decoded prior to cropping the RoI
sequence. Although this approach is efficient in terms of transmitted bit-rate, its drawback is that
RoI video encoding has to be invoked for each user, which limits the system to a few
users. This system targets remote surveillance, in which the number of simultaneous
users is likely to be smaller than in other applications such as interactive TV.
Video coding for spatial random access presents a special challenge. To achieve
good compression efficiency, video compression schemes typically exploit correla-
tion among successive frames. This is accomplished through motion-compensated
interframe prediction [18, 19, 20]. However, this makes it difficult to provide ran-
dom access for spatial browsing within the scene. This is because the decoding of
a block of pixels requires that other reference frame blocks used by the predictor
have previously been decoded. These reference frame blocks might lie outside the
RoI and might not have been transmitted and/or decoded earlier.
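
The dependency problem can be made concrete with a small sketch. It assumes a simplified model that is not taken from the text: a fixed block grid, integer-pixel motion vectors, a single reference frame and no sub-pixel interpolation. The function names and the `motion_vectors` dictionary are hypothetical.

```python
def reference_blocks(block_xy, motion_vector, block=16):
    """Grid coordinates of the reference-frame blocks overlapped by the
    motion-compensated predictor of one block (integer-pixel motion assumed)."""
    bx, by = block_xy
    dx, dy = motion_vector
    # Pixel extent of the predictor region in the reference frame.
    x0, y0 = bx * block + dx, by * block + dy
    x1, y1 = x0 + block - 1, y0 + block - 1
    return {(gx, gy)
            for gx in range(x0 // block, x1 // block + 1)
            for gy in range(y0 // block, y1 // block + 1)}


def missing_references(roi_blocks, motion_vectors, available_blocks, block=16):
    """Reference blocks needed to decode the current RoI that the client
    does not already hold, for example because they lay outside the
    previously transmitted RoI."""
    missing = set()
    for b in roi_blocks:
        missing |= reference_blocks(b, motion_vectors[b], block) - available_blocks
    return missing
```

For instance, a block at grid position (1, 1) with motion vector (-4, 0) is predicted from a region that straddles reference blocks (0, 1) and (1, 1). If the client only holds (1, 1) and (2, 1), block (0, 1) is reported as missing and would have to be sent in addition to the RoI.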
Coding, transmission and rendering of high-resolution panoramic videos using
MPEG-4 is proposed in [21, 22]. A limited part of the entire scene is transmitted to
the client depending on the chosen viewpoint. Only intraframe coding is used to al-
low random access. The scene is coded into independent slices. The authors mention
the possibility of employing interframe coding to gain more compression efficiency.
However, they note that this involves transmitting slices from the past if the current
slice requires those for its decoding. A longer intraframe period entails significant
complexity for slices from the later frames in the group of pictures (GOP), as this
“dependency chain” grows.
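
How quickly such a dependency chain grows can be estimated with a rough sketch. The model is an assumption and is not taken from [21, 22]: slices are indexed along one dimension, frame 0 of the GOP is intra coded, and a slice in frame t may reference slices in frame t-1 that lie within `reach` positions of it. The function name and parameters are hypothetical.

```python
def slices_needed(target_slice, frame_in_gop, num_slices, reach=1):
    """Total number of slices that must be available to decode one slice at
    position frame_in_gop of a GOP, under the simplified model above."""
    needed = {target_slice}   # slices required in the frame being examined
    total = 1                 # the target slice itself
    for _ in range(frame_in_gop):
        # Step one frame back: each required slice pulls in its neighbours.
        needed = {n
                  for s in needed
                  for n in range(max(0, s - reach),
                                 min(num_slices - 1, s + reach) + 1)}
        total += len(needed)
    return total
```

With `reach=0` (prediction from the co-located slice only) the cost grows linearly with the position in the GOP. With `reach=1` it grows faster until the chain spans every slice of the frame, which illustrates why a longer intraframe period penalizes slices late in the GOP.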
Multi-View Images/Videos. Interactive streaming systems that provide virtual fly-
around in the scene employ novel-view generation to render views of the scene
from arbitrary viewpoints. With these systems, the user can navigate the scene
more freely [23, 24, 25]. These systems typically employ
image-based rendering (IBR), a technique that generates the novel view from
multiple views of the scene recorded using multiple cameras [26, 27]. Note that in
these applications, the scene itself might or might not be evolving in time.
Transmitting arbitrary views from the multi-view data set on the fly also entails random
access issues similar to those arising when transmitting arbitrary regions in
interactive pan/tilt/zoom. Interframe coding for compressing successive images in time as