earlier. This is different from prior work [61] employing a background pyramid, in
which the encoder uses only those parts of the background for prediction that exist
in the decoder's multi-resolution background pyramid. In [61], the encoder mimics
the decoder which builds a background pyramid out of all previously received
frames. Note that the camera is likely to be static in such applications since a moving
camera might hamper the interactive browsing experience. Background extraction
is generally easier with a static camera. Background extraction algorithms as well
as detection and update of changed background portions have been previously studied,
for example in [62]. Note that the improved coding scheme entails transmitting
some I slices from the background frame that might be required for decoding the
current high-resolution P slice. Nevertheless, the cost of doing this is amortized
over the streaming session. A bit-rate reduction of 70-80% can be obtained with this
improvement while retaining efficient random access.
Optimal Slice Size. Generally, whenever tiles or slices are employed, choosing the
tile size or slice size poses the following trade-off. On the one hand, a smaller slice size
reduces the overhead of transmitted pixels, that is, pixels that have to be transmitted
because of the coarse slice grid but are not used for rendering the display.
On the other hand, reducing the slice size worsens the coding efficiency
because of the increased number of headers and the inability to exploit correlation across
slices. The optimal slice size depends on the RoI display dimensions, the dimensions
of the high-spatial-resolution video, the content itself, and the distribution of
the user-selected zoom factor. Nevertheless, we have demonstrated in prior work
that stochastic analysis can estimate the expected number of transmitted pixels per
frame [56]. This quantity, denoted by $\psi(s_w, s_h)$, is a function of the slice width, $s_w$,
and the slice height, $s_h$. The average number of bits per pixel required to encode
the high-resolution video frame, denoted by $\eta(s_w, s_h)$, can also be observed or estimated
as a function of the slice size. The optimal slice size is the one that minimizes
the expected number of bits transmitted per frame,

$$(s_w^{\mathrm{op}}, s_h^{\mathrm{op}}) = \arg\min_{(s_w, s_h)} \; \eta(s_w, s_h) \times \psi(s_w, s_h). \tag{1}$$
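The minimization in (1) can be illustrated with a small grid search. The following is only a sketch under stated assumptions, not the stochastic analysis of [56]: the pixel model assumes the RoI offset is uniformly distributed within a slice, independently along each axis, and ignores frame borders and zoom factors; the bits-per-pixel model and its coefficients (`base`, `header_bits`, `edge_penalty`) are invented for illustration. All function names are hypothetical.

```python
from itertools import product


def expected_slices_1d(roi_len: int, slice_len: int) -> float:
    """Average number of slices an RoI touches along one axis.

    Assumption: the RoI's first pixel is equally likely to sit at any
    offset inside a slice; frame borders are ignored.
    """
    total = 0
    for offset in range(slice_len):
        last_slice = (offset + roi_len - 1) // slice_len
        total += last_slice + 1  # slices 0..last_slice are touched
    return total / slice_len


def expected_pixels(roi_w: int, roi_h: int, s_w: int, s_h: int) -> float:
    """Hypothetical stand-in for the expected transmitted pixels per frame.

    Assumes horizontal and vertical offsets are independent.
    """
    return (expected_slices_1d(roi_w, s_w) * s_w) * (
        expected_slices_1d(roi_h, s_h) * s_h
    )


def bits_per_pixel(s_w: int, s_h: int, base: float = 0.10,
                   header_bits: float = 200.0,
                   edge_penalty: float = 0.5) -> float:
    """Made-up bits-per-pixel model that grows as slices shrink.

    In practice this quantity would be observed or estimated from encodings.
    """
    return base + header_bits / (s_w * s_h) + edge_penalty * (1 / s_w + 1 / s_h)


def expected_bits(roi_w: int, roi_h: int, s_w: int, s_h: int) -> float:
    """Product minimized in (1): bits per pixel times expected pixels."""
    return bits_per_pixel(s_w, s_h) * expected_pixels(roi_w, roi_h, s_w, s_h)


def optimal_slice_size(roi_w: int, roi_h: int, widths, heights):
    """Exhaustive search over candidate (s_w, s_h) pairs."""
    return min(
        product(widths, heights),
        key=lambda size: expected_bits(roi_w, roi_h, size[0], size[1]),
    )


if __name__ == "__main__":
    candidates = [16, 32, 64, 128, 256]
    print(optimal_slice_size(480, 240, candidates, candidates))
```

The search is exhaustive because the candidate set is small; the result depends entirely on the assumed models and should not be read as a reported optimum.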
The results in our earlier work show that the optimal slice size can be determined
accurately without capturing user-interaction trajectories [56]. Although the model
predicts the optimal slice size accurately, it can underestimate or overestimate the
transmitted bit-rate. This is because the popular slices that contain the salient objects
in the video might require a higher or lower bit-rate than the average. Also, the
location of the objects can bias the pixel overhead upward or downward, whereas
the model uses the average overhead. Note that the cost function in (1) can be replaced
with a Lagrangian cost function formed as the weighted sum of the
average transmission bit-rate and the incurred storage cost. The storage cost can be
represented by an appropriate constant multiplying $\eta(s_w, s_h)$.
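One possible way to write such a Lagrangian cost is sketched below. The symbols $J$, $\lambda$ (a non-negative weight) and $c$ (the storage constant) are notation assumed here for illustration and do not appear in the source; the first term is the product minimized in (1), and the second is the constant-scaled storage term described above.

```latex
J(s_w, s_h) = \eta(s_w, s_h)\,\psi(s_w, s_h) + \lambda\, c\, \eta(s_w, s_h),
\qquad
(s_w^{\mathrm{op}}, s_h^{\mathrm{op}}) = \arg\min_{(s_w, s_h)} J(s_w, s_h)
```

Setting $\lambda = 0$ recovers the cost function in (1), while larger values of $\lambda$ give more weight to storage.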
 