4 Pre-fetching Based on RoI Prediction
The rationale behind pre-fetching is lowering the latency of interaction. Imagine
that frame number $n$ is being rendered on the screen. At this point, the user's RoI
selection up to frame $n$ has been observed. The goal is to predict the user's RoI at
frame $n + d$ ahead of time and pre-fetch relevant slices.
Extrapolating the Navigation Trajectory. In our own work [63, 64], we have
used an autoregressive moving average (ARMA) model to estimate the velocity of
the RoI center:

$$v_t = \alpha v_{t-1} + (1 - \alpha)(p_t - p_{t-1}), \qquad (2)$$
where the co-ordinates of the RoI center, observed up to frame $n$, are given by
$p_t = (x_t, y_t)$ for $t = 0, 1, \ldots, n$. The predicted RoI center co-ordinates
$p_{n+d} = (x_{n+d}, y_{n+d})$ for frame $n + d$ are given by

$$p_{n+d} = p_n + d\, v_n, \qquad (3)$$
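suitably adjusted if the RoI happens to veer off the extent of the video frame. The
prediction lookahead, $d$ frames, should be chosen by taking into account network
delays and the desired interaction latency. The parameter $\alpha$ above trades off
responsiveness to the user's RoI trajectory against smoothness of the predicted
trajectory: a value close to 0 makes the velocity estimate follow the most recent
displacement, whereas a value close to 1 yields a smoother, slower-reacting estimate.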
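
As an illustration, the recursion in Eq. (2) and the extrapolation in Eq. (3) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in [63, 64]: the function names, the zero initial velocity, and the specific adjustment rule (clamping the predicted center so that the RoI rectangle stays inside the frame) are hypothetical choices made only for this example.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def estimate_velocity(centers: List[Point], alpha: float) -> Point:
    """Apply Eq. (2) to the observed RoI centers p_0, ..., p_n.

    Assumption: the velocity is initialized to (0, 0) before the
    first displacement is observed.
    """
    vx, vy = 0.0, 0.0
    for (x_prev, y_prev), (x_cur, y_cur) in zip(centers, centers[1:]):
        vx = alpha * vx + (1.0 - alpha) * (x_cur - x_prev)
        vy = alpha * vy + (1.0 - alpha) * (y_cur - y_prev)
    return vx, vy


def predict_center(centers: List[Point], alpha: float, d: int,
                   frame_size: Point, roi_size: Point) -> Point:
    """Apply Eq. (3): p_{n+d} = p_n + d * v_n, then keep the RoI in the frame.

    Assumption: "suitably adjusted" is realized here by clamping the
    predicted center so the whole RoI rectangle lies within the frame.
    """
    vx, vy = estimate_velocity(centers, alpha)
    x_n, y_n = centers[-1]
    x = x_n + d * vx
    y = y_n + d * vy

    frame_w, frame_h = frame_size
    roi_w, roi_h = roi_size
    x = min(max(x, roi_w / 2.0), frame_w - roi_w / 2.0)
    y = min(max(y, roi_h / 2.0), frame_h - roi_h / 2.0)
    return x, y


if __name__ == "__main__":
    # RoI center moving 10 pixels per frame to the right.
    observed = [(100.0, 100.0), (110.0, 100.0), (120.0, 100.0)]
    print(predict_center(observed, alpha=0.5, d=4,
                         frame_size=(640.0, 360.0), roi_size=(160.0, 120.0)))
```

With these three observations and $\alpha = 0.5$, the velocity estimate is still ramping up from its assumed zero start (7.5 pixels per frame rather than 10), which shows the lag that a larger $\alpha$ introduces in exchange for a smoother trajectory.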
Video-Content-Aware RoI Prediction. Note that the approach described above is
agnostic of the video content. We have explored video-content-aware RoI prediction
that analyzes the motion of objects in the video to improve the RoI prediction
[63, 64]. The transmission system in this work employs the multi-resolution
video coding scheme presented in Sect. 3. The transmission system ensures that
some future thumbnail video frames are buffered at the client's side. Figure 4
illustrates client-side video-content-aware RoI prediction. The following are some
approaches explored in [63]:
1. Optical flow estimation techniques, for example the Kanade-Lucas-Tomasi (KLT)
feature tracker [65], can find feature points in buffered thumbnail frames and
track the features in successive frames. The feature closest to the RoI center in
frame $n$ can be followed up to frame $n + d$. The location of the tracked feature
point can be made the center of the predicted RoI in frame $n + d$, or the predicted
RoI can be chosen such that the tracked feature point appears in the same
relative location. Alternatively, a smoother trajectory can be obtained by making
a change to the RoI center only if the feature point moves more than a certain
distance away from the RoI center.
2. Depending on the chosen optical flow estimation technique, the above approach
can be computationally intensive. An alternative approach exploits MVs
contained in the buffered thumbnail bit-stream. The MVs are used to find a plausible
propagation of the RoI center pixel in every subsequent frame up to frame $n + d$.
The location of the propagated pixel in frame $n + d$ is deemed to be the center of
the predicted RoI. Although the MVs are rate-distortion optimized and might not
reflect true motion, the results are competitive to those obtained with the KLT