the literature sets the stage for the discussion on video streaming with interactive
pan/tilt/zoom appearing in later sections. The later sections particularly aim at
building a system that scales to large numbers of users.
2.1 Coding for Random Access
Images. Remote image browsing with interactive pan/tilt/zoom is very similar in
spirit to its video counterpart. It is generally used for high-resolution
archaeological images, aerial or satellite images, images of museum exhibits, online
maps, etc. Online maps provide about 20 zoom levels. The image corresponding to each
zoom level is coded into tiles. Generally, the images corresponding to different zoom
levels are coded independently. This so-called Gaussian pyramid fails to exploit
redundancy across zoom levels but provides easy random access. The server accesses the
tiles intersecting the selected view and sends these tiles to the user. Generally,
after a zoom operation, the relevant part of the current zoom level is interpolated to
quickly render the newly desired view. As the tiles from the new zoom level arrive, the
graphics become crisper. Note that this cursory rendering from previously received data
might not be possible for portions of the new view for which no data has been received.
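The tile lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the tile size of 256 pixels, the function name tiles_for_view, and the convention that the view is given in pixel coordinates of the current zoom level are all hypothetical and not taken from any particular map service.

```python
from typing import List, Tuple

TILE_SIZE = 256  # assumed tile edge in pixels; the text does not specify one


def tiles_for_view(x: int, y: int, width: int, height: int,
                   tile_size: int = TILE_SIZE) -> List[Tuple[int, int]]:
    """Return (column, row) indices of all tiles that intersect the view.

    (x, y) is the top-left corner of the view, in non-negative pixel
    coordinates of the image at the current zoom level; width and height
    are the view dimensions in pixels.
    """
    if width <= 0 or height <= 0:
        return []
    first_col = x // tile_size
    first_row = y // tile_size
    # The last pixel covered by the view is at x + width - 1 (inclusive).
    last_col = (x + width - 1) // tile_size
    last_row = (y + height - 1) // tile_size
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]


# A 400x300 view whose top-left corner is at pixel (300, 100).
print(tiles_for_view(300, 100, 400, 300))
# prints [(1, 0), (2, 0), (1, 1), (2, 1)]
```

Because each zoom level is tiled independently, the same lookup can be repeated at the new level after a zoom operation, with the view coordinates rescaled to that level.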
Interactive browsing of images using JPEG2000 is explored in [7, 8]. This approach
leverages the multi-resolution representation of an image using wavelets. Unlike the
Gaussian and Laplacian pyramids, which generate more coefficients than the
high-resolution image has pixels, this representation is not overcomplete. JPEG2000
encodes blocks of wavelet transform coefficients independently. This means that every
coded block influences the reconstruction of only a limited number of pixels of the
image. Moreover, the coding of each block results in an independent, embedded
sub-bitstream. This makes it possible to stream any given block with a desired degree
of fidelity.
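The remark on overcompleteness can be checked with a quick count. The sketch below assumes a dyadic pyramid in which each level halves the width and height (rounding up); the function names are hypothetical. A Laplacian pyramid with the same number of levels stores the same number of samples, since each detail image has the size of the corresponding Gaussian level.

```python
def pyramid_sample_count(width: int, height: int, levels: int) -> int:
    """Samples stored by a dyadic Gaussian (or Laplacian) pyramid."""
    total = 0
    for _ in range(levels):
        total += width * height
        width = (width + 1) // 2    # halve, rounding up
        height = (height + 1) // 2
    return total


def wavelet_coefficient_count(width: int, height: int) -> int:
    """A critically sampled wavelet transform keeps one coefficient per pixel."""
    return width * height


pixels = wavelet_coefficient_count(1024, 1024)
pyramid = pyramid_sample_count(1024, 1024, 4)
print(pyramid, pixels, pyramid / pixels)
# prints 1392640 1048576 1.328125
```

With more levels the ratio approaches 4/3, so the pyramid stores roughly a third more samples than the image has pixels.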
A transmission protocol, called JPEG2000 over Internet Protocol (JPIP), has also
been developed. The protocol governs communication between a client and a server
to support remote interactive browsing of JPEG2000 coded images [9]. The server
can keep track of the RoI trajectory of the client as well as the parts of the bitstream
that have already been streamed to the client. Given a rate of transmission for the
current time interval, the server solves an optimization problem to determine which
parts of the bitstream need to be sent in order to maximize the quality of the current
RoI.
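To make the server-side decision concrete, here is a minimal sketch of one way such a selection could be approximated. It is a simple greedy heuristic, not the optimization used in [9] and not part of the JPIP protocol itself. The CodeBlock model, its fields, and the schedule function are hypothetical; the sketch assumes that each block's embedded sub-bitstream is split into increments that must be sent in order, each with a positive size and a known distortion reduction.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class CodeBlock:
    """Hypothetical model of one independently coded block."""
    # (size in bytes, distortion reduction) per increment, in stream order
    increments: List[Tuple[int, float]]
    # share of the block's influence that falls inside the current RoI (0..1)
    roi_weight: float
    # number of increments already delivered to the client
    sent: int = 0


def schedule(blocks: Dict[str, CodeBlock],
             budget_bytes: int) -> List[Tuple[str, int]]:
    """Greedily pick increments for the current time interval.

    Repeatedly sends the next unsent increment with the largest
    RoI-weighted distortion reduction per byte that still fits the budget.
    Returns (block id, increment index) pairs in sending order and updates
    each block's `sent` counter, mimicking the server's bookkeeping.
    """
    plan: List[Tuple[str, int]] = []
    remaining = budget_bytes
    while True:
        best_id, best_gain = None, 0.0
        for block_id, block in blocks.items():
            if block.sent >= len(block.increments):
                continue  # everything from this block was already sent
            size, reduction = block.increments[block.sent]
            if size > remaining:
                continue  # does not fit in what is left of the budget
            gain = block.roi_weight * reduction / size
            if gain > best_gain:
                best_id, best_gain = block_id, gain
        if best_id is None:
            break  # nothing useful fits any more
        block = blocks[best_id]
        size, _ = block.increments[block.sent]
        plan.append((best_id, block.sent))
        remaining -= size
        block.sent += 1
    return plan


blocks = {
    "a": CodeBlock([(100, 50.0), (100, 10.0)], roi_weight=1.0),
    "b": CodeBlock([(100, 40.0)], roi_weight=0.5),
    "c": CodeBlock([(100, 100.0)], roi_weight=0.0),  # outside the RoI
}
print(schedule(blocks, 250))
# prints [('a', 0), ('b', 0)]
```

Because the `sent` counters persist between calls, a later interval continues where the previous one stopped, and a change in the RoI only requires updating the weights.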
Video. The video compression standard H.264/AVC [10, 11] includes tools like
Flexible Macroblock Ordering (FMO) and Arbitrary Slice Ordering (ASO). These
tools were primarily created for error resilience, but can also be used to define an
RoI prior to encoding [12]. The RoI can either be defined through manual input
or through automatic content analysis. Slices corresponding to the RoI (or multiple
RoIs) can be encoded with higher quality than other regions. Optionally, the
scalable extension of H.264/AVC, called SVC [13, 14], can be used for adding fine-
or coarse-granular fidelity refinements for RoI slices. The user experiences higher
quality for the RoI if the refinement packets are received. The RoI encoding
parameters can be adapted to the network and/or the user [15]. Note that these systems
 