6 Evaluation
The ultimate goal of our performance evaluation is two-fold: (i) to measure the accuracy of our learning model in ranking Internet videos in order of hotness, and (ii) to evaluate the performance of our replication scheme in meeting viewers' expectations in peer-assisted VoD systems. Further details about the evaluation set-up are available in Sect. 5.
6.1 Performance Evaluation Metrics
We aim to evaluate the performance of the two main WiseReplica modules: the machine-learned ranking and the replication strategy. Hence, we group the evaluation metrics as follows:
Machine-Learned Ranking Accuracy. We adopt the normalized Discounted Cumulative Gain (nDCG) criterion as the main evaluation metric for our learning model. nDCG is a standard quality measure in information retrieval, especially for Web search [19, 22]. We implement the DCG measure proposed by Burges
et al. [8]. Therefore, DCG is defined as $\mathrm{DCG}_L = \sum_{i=1}^{|L|} \frac{2^{F(i)}-1}{\log_2(1+i)}$, where $L$ is the global set of ranked videos, and $F(i)$ is the rank position of the $i$-th video. To compute nDCG, we divide the DCG measure by the idealized DCG obtained with the perfect order of the set $L$. Thus, the perfect model scores 1. Unlike typical information retrieval
problems, our ranking of web content does not have the notion of a query. Instead, we rely on the robustness of nDCG to measure the performance of our learning model as a global ranking problem. Since the ranking problem shares properties with both classification and regression problems, we compare nDCG to three other popular machine learning metrics: the mean square error, a standard metric for regression; precision, for classification; and a less robust, well-known variant of nDCG, referred to in this work as nDCG(2), described by Croft et al. [12]. We evaluate three different state-of-the-art ensemble learning methods available in the Scikit-learn library: Random Forest, Extremely Randomized Trees, and Gradient Tree Boosting (sketched after this paragraph). Moreover, we report briefly on the sample size for learning, the number of estimators or learners of the ensemble methods, the importance of measurements or features, and the computational overhead of our model, including memory usage and computation time for prediction.
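For illustration, the following is a minimal sketch of how the nDCG measure defined above could be computed; the function names and the example relevance values are hypothetical and not part of WiseReplica.

```python
import numpy as np

def dcg(relevances):
    """Discounted cumulative gain: sum of (2^F(i) - 1) / log2(1 + i)."""
    relevances = np.asarray(relevances, dtype=float)
    positions = np.arange(1, len(relevances) + 1)
    return np.sum((2.0 ** relevances - 1.0) / np.log2(1.0 + positions))

def ndcg(relevances_in_predicted_order, true_relevances):
    """Normalize the DCG of the predicted order by the DCG of the ideal order."""
    ideal = dcg(sorted(true_relevances, reverse=True))
    return dcg(relevances_in_predicted_order) / ideal if ideal > 0 else 0.0

# Hypothetical example: true hotness scores of five videos, listed in the order
# our model ranked them. A perfect ranking (descending scores) yields nDCG = 1.
scores_in_model_order = [3, 2, 3, 0, 1]
print(ndcg(scores_in_model_order, scores_in_model_order))
```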
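Likewise, the comparison of the three ensemble methods can be outlined as below; this is a sketch under assumed data, where the features, targets, and parameter values are illustrative rather than those used in our experiments.

```python
import numpy as np
from sklearn.ensemble import (RandomForestRegressor, ExtraTreesRegressor,
                              GradientBoostingRegressor)
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Hypothetical training data: per-video measurements (features) and hotness targets.
rng = np.random.default_rng(0)
X = rng.random((1000, 6))
y = X @ rng.random(6) + rng.normal(scale=0.1, size=1000)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "Extremely Randomized Trees": ExtraTreesRegressor(n_estimators=100, random_state=0),
    "Gradient Tree Boosting": GradientBoostingRegressor(n_estimators=100, random_state=0),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    # Mean square error serves as the regression baseline metric; feature
    # importances indicate which measurements drive the predicted hotness.
    print(name, mean_squared_error(y_test, predictions), model.feature_importances_)
```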
Metrics for Replication Strategies in Peer-Assisted VoD Systems. Assuming that content and CDN providers are committed to enforcing bitrate as the main QoS metric through SLA contracts, we consider the SLA violation as the primary performance metric. An SLA violation happens whenever the peer-assisted VoD system does not provide the minimum average bitrate required to prevent rebuffering. This measures WiseReplica's capacity to meet consumers' expectations. We also investigate the impact of our replication scheme using storage domains in peer-assisted VoD systems; to this end, our evaluation metrics are network and storage usage. Finally, we compare WiseReplica results with a non-collaborative caching scheme and the Oracle-like assumption, described in Subsect. 5.3.
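To make the metric concrete, the sketch below shows one way the SLA-violation ratio could be computed from per-session average bitrates; the threshold and session values are hypothetical, not measurements from our experiments.

```python
def sla_violation_ratio(session_bitrates_kbps, min_bitrate_kbps):
    """Fraction of playback sessions whose average bitrate fell below the
    minimum needed to avoid rebuffering (i.e. an SLA violation)."""
    if not session_bitrates_kbps:
        return 0.0
    violations = sum(1 for b in session_bitrates_kbps if b < min_bitrate_kbps)
    return violations / len(session_bitrates_kbps)

# Hypothetical example: average delivered bitrates (kbps) of five sessions,
# with a 1200 kbps minimum enforced through the SLA.
sessions = [1500, 980, 1250, 1100, 1600]
print(sla_violation_ratio(sessions, min_bitrate_kbps=1200))  # 0.4
```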