network usage: $P_{\min}$ and $P_{\max}$. Our replication strategy works as follows. A video $v$ that has $N$ replicas in peers with network capacity of $b_i$ requires more replicas if the current bandwidth reservation $U(v) > P_{\max} \sum_{i=1}^{N} b_i$. Similarly, if $U(v) < P_{\min} \sum_{i=1}^{N} b_i$, replicas can be deleted. Otherwise, keep the replication degree.
Although this empirical approach is hard to adopt in a real deployment, our
previous results [ 29 ] suggest that it achieves near-optimal results, preventing
all SLA violations, improving network usage, and dramatically decreasing storage
usage.
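The threshold rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `replication_decision`, the string return values, and the treatment of the two thresholds as fractions of the aggregate capacity are hypothetical and not taken from the original system.

```python
from typing import Sequence


def replication_decision(
    reserved_bw: float,
    capacities: Sequence[float],
    p_min: float,
    p_max: float,
) -> str:
    """Return "add", "delete", or "keep" for one video.

    reserved_bw -- current bandwidth reservation U(v) of the video
    capacities  -- network capacity of each peer holding a replica
    p_min/p_max -- low and high usage thresholds (assumed to be fractions)
    """
    if p_min >= p_max:
        raise ValueError("p_min must be smaller than p_max")

    total_capacity = sum(capacities)  # aggregate capacity of the N replicas

    if reserved_bw > p_max * total_capacity:
        return "add"      # reservation above the high threshold
    if reserved_bw < p_min * total_capacity:
        return "delete"   # reservation below the low threshold
    return "keep"         # otherwise keep the replication degree
```

For example, with two replicas of capacity 5 each and thresholds of 0.2 and 0.8, a reservation of 9 exceeds 0.8 × 10 and asks for a new replica, whereas a reservation of 1 falls below 0.2 × 10 and allows a deletion.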
5.4 Collecting the Datasets for Learning
To perform rank predictions of Internet videos, we need training datasets from
which we can learn the behaviour of video demand in peer-assisted VoD systems.
In this section, we explain the methodology to gather data for these predictions.
The training dataset of our prediction model comes from measurements of the
request arrival process on peer-assisted VoD systems, as described in Subsect. 3.2 .
Each line of our training dataset has 11 values: 10 input measurements about a
video's current state, and a rank position. Although the datasets evaluated in this
work were collected synthetically by running simulations with the Oracle-like
benchmark replication approach (detailed in Subsect. 5.3 ), similar datasets can
be collected from the monitoring systems of running CDNs.
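As an illustration of this record layout, the sketch below parses one line of such a dataset. The on-disk format is not specified here, so the comma separator, the position of the rank as the last field, and its encoding as an integer are assumptions; `TrainingRow` and `parse_row` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Tuple

N_FEATURES = 10  # input measurements per line, as stated in the text


@dataclass(frozen=True)
class TrainingRow:
    features: Tuple[float, ...]  # the 10 measurements of the video state
    rank: int                    # rank position (assumed integer-coded)


def parse_row(line: str, sep: str = ",") -> TrainingRow:
    """Parse one dataset line: 10 measurements followed by the rank."""
    fields = [field.strip() for field in line.split(sep)]
    if len(fields) != N_FEATURES + 1:
        raise ValueError(
            f"expected {N_FEATURES + 1} values, got {len(fields)}"
        )
    *inputs, rank = fields
    return TrainingRow(tuple(float(x) for x in inputs), int(rank))
```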
In this work, the Oracle-like benchmark replication approach (Subsect. 5.3 )
represents the near-optimal way to provide VoD service according to video encodings
and popularity, and it is the behaviour we want to learn. In this empirical
approach, a video requires additional replicas only if there is a certain number
of concurrent accesses, where concurrency is detected when the currently reserved
bandwidth exceeds a high threshold, as detailed in Subsect. 5.3 .
We assume that popular videos are those that gain additional replicas during their
lifetime. Since the popularity of Internet videos follows a Zipf-like distribution
[ 33 ], concurrent accesses are rare events, and so are the videos that this
approach classifies as popular; it therefore provides a reasonably fair way to
identify popular videos.
Raw data from the Oracle-like technique makes it easy to distinguish between
only two ranking positions, non-popular and popular videos: requests to
non-popular videos are all those that do not trigger any replica creation or
that result in deletions. However, the raw data carry no information about the
different ranking positions of popular videos. Hence, depending on the frequency
of replica creation, we annotate requests to popular videos by classifying
them as popular, very popular, or viral. To define these three levels of hotness,
we ran simulations with YouTube traces, collected the distribution of
inter-creation times of new replicas in milliseconds, and split it into three
nearly equal parts at its 33rd and 66th percentiles. Thus, the higher the
frequency of replica creation, the hotter the video and the higher its ranking
position. With this annotation, the collected data match the model's definitions.
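The labelling step can be sketched as follows. This is a minimal illustration, not the procedure used in the simulations: the function names are hypothetical, the cut points are computed with Python's `statistics.quantiles` (whose interpolation method may differ from the one originally used), and treating the boundaries as inclusive is an assumption.

```python
import statistics
from typing import Optional, Sequence, Tuple


def hotness_cutoffs(inter_creation_ms: Sequence[float]) -> Tuple[float, float]:
    """Return the two cut points that split the inter-creation times
    into three nearly equal parts (roughly the 33rd and 66th percentiles)."""
    low, high = statistics.quantiles(inter_creation_ms, n=3)
    return low, high


def classify(inter_creation_ms: Optional[float], low: float, high: float) -> str:
    """Label a request by the time since the previous replica creation.

    None means the request triggered no replica creation (non-popular).
    Shorter inter-creation time means higher creation frequency, hence
    a hotter video.
    """
    if inter_creation_ms is None:
        return "non-popular"
    if inter_creation_ms <= low:
        return "viral"
    if inter_creation_ms <= high:
        return "very popular"
    return "popular"
```

A typical use would compute the cut points once from the simulated trace and then call `classify` for every request, so that each training line receives one of the four rank labels.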