Suitability of the M/G/∞ Process for Modeling Scalable H.264 Video Traffic (Statistics Inference) (Analytical and Stochastic Modeling)

Abstract

Video represents an ever larger share of Internet traffic. H.264/AVC and its scalability extension have recently become some of the most widely accepted video coding standards. Consequently, adequate models for this traffic are very important for the performance evaluation of network architectures and protocols. In particular, the efficient on-line generation of synthetic sample paths is fundamental for simulation studies. In this work, we check the suitability of the M/G/∞ process for modeling the spatial and quality scalability extensions of the H.264 standard.

Keywords: H.264 video traffic modeling; Scalability; M/G/∞ process; Whittle estimator.

Introduction

Improvements in network infrastructures, storage capacity and computing power, along with advances in video coding technology, are enabling an increasing popularity of multimedia applications.

Modern video transmission systems are typically characterized by a wide range of connection qualities and by receiving devices with different characteristics, such as displays and processing power. Scalability, which allows parts of a video stream to be removed in order to adapt it to user preferences, terminal characteristics and changing network conditions, is a good solution for such systems.

H.264/AVC and the SVC extension [1] have recently become some of the most widely accepted video coding standards, because they have demonstrated significantly improved coding efficiency, substantially enhanced error robustness and increased flexibility and scope of applicability relative to prior video coding standards, such as H.263 and MPEG2.


Traffic modeling plays an important role in the performance evaluation of network architectures and protocols. In the last decade, several traffic studies have convincingly shown the existence of persistent correlations in several kinds of traffic, such as VBR video [2-5], and that the impact of this correlation on performance metrics may be drastic. Several works have modeled VBR video traffic with different stochastic processes [6-12] that display different forms of correlation. We focus on M/G/∞-type processes [13, 7, 14] because of their theoretical simplicity, their flexibility to exhibit both Short-Range Dependence (SRD) and Long-Range Dependence (LRD) in a parsimonious way, and their advantages for simulation studies, such as the possibility of on-line generation and a lower computational cost [15].

In order to apply a model to the synthetic generation of traces with a correlation structure similar to that of real sequences, a fundamental problem is the estimation of the parameters of the model. Among the methods proposed in the literature [16-19], those based on the Whittle estimator are especially interesting because they make it possible to fit the whole spectral density and to obtain confidence intervals for the estimated parameters. Moreover, in [20] we presented a method based on the prediction error of the Whittle estimator to choose, among several models for compressed VBR video traffic based on the M/G/∞ process, the one that gives the best fit to the spectral density, and therefore to the correlation structure, of the traffic to be modeled.

In this work we check the suitability of the M/G/∞ process for modeling the spatial (different spatial resolutions) and quality (different fidelity levels) scalability extensions of the H.264 standard.

The remainder of the paper is organized as follows. We begin by reviewing the main concepts related to the M/G/∞ process in Section 2 and those related to the Whittle estimator in Section 3. In Section 4 we explain the M/G/∞-based models that we consider in this work, and in Section 5 we apply them to the modeling of H.264/SVC video traffic at the GoP level.

The M/G/∞ process [21] is a stationary version of the occupancy process of an M/G/∞ queueing system. In this queueing system, customers arrive according to a Poisson process, occupy a server for a random time with a generic distribution X with finite mean, and leave the system.

Though the system operates in continuous time, it is easier to simulate it in discrete time, so this will be the convention henceforth [14]. The number of busy servers at time $t \in \mathbb{Z}^+$ is $Y_t = \sum_{i=1}^{\infty} A_{t,i}$, where $A_{t,i}$ is the number of arrivals at time $t - i$ which remain active at time $t$, i.e., the number of active customers with age $i$. For any fixed $t$, $\{A_{t,i},\, i = 1, 2, \ldots\}$ is a sequence of independent Poisson variables with parameters $\lambda P(X \ge i)$, where $\lambda$ is the rate of the arrival process. The expectation and variance of the number of servers occupied at time $t$ are

$$E(Y_t) = \mathrm{Var}(Y_t) = \lambda \sum_{i=1}^{\infty} P(X \ge i) = \lambda E(X).$$
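As an illustration of the discrete-time convention just described, the following is a minimal sketch of an occupancy simulation. It is not the generator used in the paper: the function names, the geometric service time, the warm-up from an empty system and the convention that a customer arriving at time s with service time x is busy at times s+1, ..., s+x are all assumptions made for the example.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson(lam) variate with Knuth's method (adequate for small lam)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def geometric(rng, p):
    """Service time on {1, 2, ...} with P(X = i) = p (1 - p)^(i - 1), so E(X) = 1 / p."""
    return int(math.log(1.0 - rng.random()) / math.log(1.0 - p)) + 1


def simulate_occupancy(lam, draw_service, n_steps, warmup, rng):
    """Number of busy servers per slot, after discarding a warm-up period.

    Assumed convention: a customer arriving at time s with service time x
    is counted as busy at times s + 1, ..., s + x.
    """
    horizon = warmup + n_steps
    delta = [0] * (horizon + 1)          # +1 when a customer starts, -1 when it leaves
    for s in range(horizon):
        for _ in range(poisson(rng, lam)):
            x = draw_service(rng)
            delta[s + 1] += 1
            if s + x + 1 <= horizon:
                delta[s + x + 1] -= 1
    path, busy = [], 0
    for t in range(horizon):
        busy += delta[t]
        if t >= warmup:
            path.append(busy)
    return path


rng = random.Random(1)
lam, p = 2.0, 0.25                       # illustrative values, E(X) = 4
path = simulate_occupancy(lam, lambda r: geometric(r, p), 20000, 200, rng)
mean = sum(path) / len(path)
var = sum((y - mean) ** 2 for y in path) / len(path)
print(mean, var)                         # both should be close to lam * E(X) = 8
```

With a light-tailed service time such as this one, a short warm-up is enough for the empty start to be forgotten; a heavy-tailed service time would need the stationary initialization discussed later in this section.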

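The discrete-time process $Y_t$, $t = 0, 1, \ldots$ is time-reversible and wide-sense stationary, with autocovariance function

$$\gamma(h) = \mathrm{Cov}(Y_{t+h}, Y_t) = \lambda \sum_{i=h+1}^{\infty} P(X \ge i), \quad h = 0, 1, \ldots$$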

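The function $\gamma(h)$ completely determines the expected service time

$$E(X) = \frac{\gamma(0)}{\lambda}$$

and the distribution of X, the service time, because

$$P(X = i) = \frac{\gamma(i-1) - 2\gamma(i) + \gamma(i+1)}{\lambda}, \quad i = 1, 2, \ldots \qquad (1)$$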

By (1), the autocovariance is a non-negative convex function. Conversely, any real-valued sequence $\gamma(h)$ can be the autocovariance function of a discrete-time M/G/∞ occupancy process if and only if it is decreasing, non-negative and integer-convex [7]. In such a case, $\lim_{h \to \infty} \gamma(h) = 0$ and the probability mass function of X is given by (1).
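The link between the autocovariance and the service-time distribution can be checked numerically. The sketch below assumes the standard relation $\gamma(h) = \lambda \sum_{i > h} P(X \ge i)$ for the discrete-time occupancy process, so that the probability mass function is the second difference of $\gamma$ divided by $\lambda$. The function name and the geometric example are illustrative assumptions.

```python
def service_pmf(gamma, lam):
    """P(X = i) for i = 1, ..., len(gamma) - 2 from second differences of gamma."""
    return [(gamma[i - 1] - 2.0 * gamma[i] + gamma[i + 1]) / lam
            for i in range(1, len(gamma) - 1)]


# Geometric service time with P(X = i) = p (1 - p)^(i - 1): under the assumed
# relation, gamma(h) = lam * (1 - p)^h / p.
lam, p = 2.0, 0.25
gamma = [lam * (1.0 - p) ** h / p for h in range(12)]

pmf = service_pmf(gamma, lam)
mean_service = gamma[0] / lam            # E(X) = gamma(0) / lam

for i, prob in enumerate(pmf, start=1):
    print(i, round(prob, 6), round(p * (1.0 - p) ** (i - 1), 6))
```

The two printed columns agree, and every recovered probability is non-negative because this $\gamma$ is decreasing and integer-convex.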

If $A_{0,0}$ (i.e., the initial number of customers in the system) follows a Poisson distribution with mean $\lambda E(X)$, and their service times have the same distribution as the residual life $\hat{X}$ of the random variable X,

$$P(\hat{X} = i) = \frac{P(X \ge i)}{E(X)}, \quad i = 1, 2, \ldots$$

then $\{Y_t,\, t = 0, 1, \ldots\}$ is strict-sense stationary, ergodic, and enjoys the following properties:

1. The marginal distribution of $Y_t$ is Poissonian for all $t$, with mean value

$$\mu = E(Y_t) = \lambda E(X).$$

2. The autocovariance function is $\gamma(h) = \gamma(0)\, P(\hat{X} > h)$ for all $h$.

If the autocovariance function is summable the process exhibits SRD. Conversely, if the autocovariance function is not summable, the process exhibits LRD. In particular, the M/G/∞ process exhibits LRD when X has infinite variance, as in the case of some heavy-tailed distributions. The latter are the discrete probability distribution functions satisfying $P(X > k) \sim k^{-\alpha}$ asymptotically as $k \to \infty$.

Whittle Estimator

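Let $f_\theta(\lambda)$ be the spectral density function of a zero-mean stationary Gaussian stochastic process, X, where $\theta = (\theta_1, \ldots, \theta_M)$ is a vector of unknown parameters that is to be estimated from observations. Let

$$I_N(\lambda) = \frac{1}{2\pi N} \left| \sum_{i=0}^{N-1} X_i e^{-j\lambda i} \right|^2$$

be the periodogram of a sample of size N of the process X. The approximate Whittle estimator [16] is the vector $\hat\theta = (\hat\theta_1, \ldots, \hat\theta_M)$ minimizing, for a given sample of size N of X, the statistic

$$Q_N(\theta) = \frac{1}{2\pi} \left\{ \int_{-\pi}^{\pi} \frac{I_N(\lambda)}{f_\theta(\lambda)}\, d\lambda + \int_{-\pi}^{\pi} \log f_\theta(\lambda)\, d\lambda \right\}. \qquad (2)$$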

If $\theta^o$ is the true value of $\theta$, then

$$\lim_{N \to \infty} P\left( |\hat\theta - \theta^o| < \epsilon \right) = 1$$

for any $\epsilon > 0$, namely, $\hat\theta$ converges in probability to $\theta^o$ (it is a weakly consistent estimator). It is also asymptotically normal, since $\sqrt{N}(\hat\theta - \theta^o)$ converges in distribution to $\zeta$ as $N \to \infty$, where $\zeta$ is a zero-mean Gaussian vector with known covariance matrix. Thus, from this asymptotic normality, confidence intervals of the estimated values can be computed.

A simplification of (2) arises by choosing a special scale parameter $\theta_1$, such that

$$f_\theta(\lambda) = \theta_1 f^*_{\theta^*}(\lambda)$$

and

$$\int_{-\pi}^{\pi} \log f^*_{\theta^*}(\lambda)\, d\lambda = 0,$$

where $\theta^* = (\theta_2, \ldots, \theta_M)$ and $\sigma_\epsilon^2 = 2\pi\theta_1$ is the optimal one-step-ahead prediction error, which is equal to the variance of the innovations in the AR(∞) representation of the process [22],

$$X_i = \sum_{j=1}^{\infty} \beta_j X_{i-j} + \epsilon_i.$$

Using this normalization, equation (2) simplifies to

$$Q^*_N(\theta^*) = \int_{-\pi}^{\pi} \frac{I_N(\lambda)}{f^*_{\theta^*}(\lambda)}\, d\lambda,$$

which is usually evaluated numerically via integral quadrature.

Additionally [22]

$$\hat\sigma_\epsilon^2 = Q^*_N(\hat\theta^*).$$

We use $\hat\sigma_\epsilon^2$ as a measure of the suitability of a model, since smaller values of $\hat\sigma_\epsilon^2$ mean a better fit to the actual correlation of the sample.
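To make the normalized Whittle procedure concrete, here is a minimal sketch that estimates a single shape parameter and the prediction error by a Riemann-sum quadrature over the Fourier frequencies. It deliberately uses a plain AR(1) spectral shape as a stand-in, not the spectral density of the models studied in this paper. The periodogram normalization $1/(2\pi N)$, the grid search and all function names are assumptions of the example.

```python
import cmath
import math
import random


def periodogram(x):
    """Periodogram at the Fourier frequencies 2*pi*j/N with 0 < j < N/2."""
    n = len(x)
    mean = sum(x) / n
    centred = [v - mean for v in x]
    roots = [cmath.exp(-2j * math.pi * k / n) for k in range(n)]
    freqs, values = [], []
    for j in range(1, (n - 1) // 2 + 1):
        dft = sum(centred[t] * roots[(j * t) % n] for t in range(n))
        freqs.append(2.0 * math.pi * j / n)
        values.append(abs(dft) ** 2 / (2.0 * math.pi * n))
    return freqs, values


def ar1_shape(freq, phi):
    """f*(freq) = 1 / |1 - phi e^{-i freq}|^2; its log integrates to 0 over (-pi, pi)."""
    return 1.0 / (1.0 - 2.0 * phi * math.cos(freq) + phi * phi)


def q_star(freqs, pgram, phi, n):
    """Riemann-sum approximation of the integral of I_N / f* over (-pi, pi)."""
    half = sum(i_val / ar1_shape(f, phi) for f, i_val in zip(freqs, pgram))
    return 2.0 * (2.0 * math.pi / n) * half   # factor 2: the integrand is even


def whittle_ar1(x, grid=199):
    """Grid search for the minimiser of Q*; the minimum estimates the prediction error."""
    freqs, pgram = periodogram(x)
    candidates = [-0.99 + 1.98 * k / (grid - 1) for k in range(grid)]
    q_min, phi_hat = min((q_star(freqs, pgram, phi, len(x)), phi) for phi in candidates)
    return phi_hat, q_min


# Synthetic AR(1) sample with unit innovation variance (illustrative data only).
rng = random.Random(7)
phi_true, x_prev, sample = 0.6, 0.0, []
for t in range(1024 + 200):
    x_prev = phi_true * x_prev + rng.gauss(0.0, 1.0)
    if t >= 200:
        sample.append(x_prev)

phi_hat, sigma2_hat = whittle_ar1(sample)
print(phi_hat, sigma2_hat)   # expected near 0.6 and near 1.0
```

A grid search is used only to keep the sketch dependency-free; with several parameters a numerical optimizer would be the usual choice.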

M/G/∞-Based Models

In this work we use as service-time distribution the discrete-time distribution S proposed in [14]. Its main characteristic is that it is a heavy-tailed distribution with two parameters, α and m, which makes it possible to model simultaneously the short-term correlation behavior (by means of the one-lag autocorrelation coefficient r(1)) and the long-term correlation behavior (by means of the Hurst parameter H [23]) of the occupancy process. Specifically, the autocorrelation function of the resulting M/S/∞ process is

tmp4A2827_thumb

with

tmp4A2828_thumb

Iftmp4A2829_thumbthentmp4A2830_thumb. Hence, in this case this correlation structure gives rise to an LRD process.

The spectral density, needed to use the Whittle estimator, is given by [19]

tmp4A2833_thumb

where $f_h$ is the spectral density of a FGN [22] process with tmp4A2834_thumb scaled by the variance.

In order to improve the adjustment of the short-term correlation of the previous process, in [24] we have proposed to add an autoregressive filter. Specifically, we focus on the particular case of an AR(1) filter.

If Y is the original M/S/∞ process, the new one is obtained as Yn.tmp4A2836_thumb

The mean values and covariances are related bytmp4A2840_thumb

The spectral density results

tmp4A2841_thumb

We denote the resulting process astmp4A2842_thumb
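The effect of adding an AR(1) filter can be sketched as follows. The recursion $X_n = a_1 X_{n-1} + Y_n$, the symbol `a1` and the function names are assumptions made for illustration, since the exact notation of the filtered process is not legible in this copy. For a stable filter ($|a_1| < 1$), the standard results are that the mean is scaled by $1/(1 - a_1)$ and the spectral density is multiplied by $1/|1 - a_1 e^{-j\lambda}|^2$.

```python
import math


def ar1_filter(y, a1, x0=0.0):
    """Apply the recursion X_n = a1 * X_{n-1} + Y_n to the input sequence y."""
    out, prev = [], x0
    for value in y:
        prev = a1 * prev + value
        out.append(prev)
    return out


def ar1_gain(freq, a1):
    """Power gain 1 / |1 - a1 e^{-i freq}|^2 of the filter at angular frequency freq."""
    return 1.0 / (1.0 - 2.0 * a1 * math.cos(freq) + a1 * a1)


# A constant input of level 2 settles at 2 / (1 - a1), i.e. the mean is scaled
# by 1 / (1 - a1); the power gain at frequency 0 is the square of that factor.
filtered = ar1_filter([2.0] * 60, a1=0.5)
print(filtered[-1], ar1_gain(0.0, 0.5))
```

Because the gain is largest near frequency 0 when `a1` is positive, the filter mainly reshapes the short-lag correlation while leaving the low-frequency power-law behavior of the input in place.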

Modeling H.264/SVC Video Traffic at the GoP Level

We consider, as an example, the following empirical video traces of the Group of Pictures (GoP) sizes available at [25].

— Spatial scalability:

• T-1: "Star Wars IV (layer QCIF)".

• T-2: "Star Wars IV (layer CIF)".

— Quality scalability:

• T-3: "Star Wars IV (base layer)".

• T-4: "Star Wars IV (first enhanced layer)".

• T-5: "Star Wars IV (second enhanced layer)".

In order to adjust simultaneously the marginal distribution and the autocorrelation, as the marginal distribution in all cases is approximately Lognormal, we apply a change of distribution. In each case, A denotes the process we want to generate and C the M/G/∞ process from which we start, which should have a mean value high enough that its Poissonian marginal distribution can be considered approximately Gaussian (we select $\sigma_C^2 = \mu_C = 104$). Moreover, we consider the intermediate process B = log(A), from which we estimate the parameters.

If A has Lognormal marginal distribution, then B = log(A) has Gaussian marginal distribution, with mean, variance and autocorrelation given by [26]

tmp4A2843_thumb
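The relations between the moments of A and those of B = log(A) can be sketched with the standard Lognormal identities, assuming B is a Gaussian process. These are textbook formulas and may be written differently in [26]. The function names and the numerical values are illustrative and are not taken from the traces.

```python
import math


def gaussian_params_from_lognormal(mean_a, var_a):
    """Mean and variance of B = log(A) when A is Lognormal with the given moments."""
    var_b = math.log(1.0 + var_a / mean_a ** 2)
    mean_b = math.log(mean_a) - 0.5 * var_b
    return mean_b, var_b


def gaussian_autocorr_from_lognormal(r_a, var_b):
    """Autocorrelation of B at a lag where A has autocorrelation r_a."""
    return math.log(1.0 + r_a * (math.exp(var_b) - 1.0)) / var_b


# Illustrative moments only.
mean_b, var_b = gaussian_params_from_lognormal(1000.0, 250000.0)
print(mean_b, var_b)
print(gaussian_autocorr_from_lognormal(0.5, var_b))

# Round trip: the Lognormal mean implied by (mean_b, var_b) is the input mean.
print(math.exp(mean_b + 0.5 * var_b))
```

The autocorrelation of B is slightly larger than that of A at the same lag, which is why the correlation parameters are estimated on B rather than on A directly.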

The estimations of the parameters of the tmp4A2844_thumb and the tmp4A2845_thumb processes, computed via the Whittle estimator, are as follows:

tmp4A2848_thumb

Table 1. Estimations of the prediction error with each model

T-1      T-2      T-3      T-4      T-5
1.0007   1.0054   1.0002   1.003    1.0104


Fig. 1. Spatial scalability. Adjustment of the autocorrelation. QCIF layer (left) and CIF layer (right).


Fig. 2. Spatial scalability. Adjustment of the marginal distribution. QCIF layer (left) and CIF layer (right).

In Table 1 we show the relationship between the estimations of the prediction error

tmp4A2851_thumb

The results show that, in all cases, increasing the number of parameters leads to smaller prediction errors.

Once we have generated a sample of the process C, to obtain Lognormal marginal distribution with the mean value and variance of the empirical traces, we apply the inverse transformation

tmp4A2852_thumb

where $\hat\sigma_B^2$ is the estimate of the variance of B computed with the Whittle estimator, that is, taking the autocorrelation structure into account.
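A minimal sketch of this kind of inverse transformation is given below. It follows the usual construction (standardize the approximately Gaussian samples of C, rescale them to the mean and standard deviation of B, then exponentiate) and is not a transcription of the expression used in the paper. The function name, parameter names and numerical values are hypothetical.

```python
import math


def to_lognormal(c_path, mean_c, std_c, mean_b, std_b):
    """Map approximately Gaussian samples of C to Lognormal samples of A.

    Each sample is standardised with the moments of C, rescaled to the
    moments of B = log(A), and exponentiated.
    """
    return [math.exp(mean_b + std_b * (c - mean_c) / std_c) for c in c_path]


# Illustrative numbers only: three samples of C around its mean.
trace = to_lognormal([90.0, 100.0, 110.0],
                     mean_c=100.0, std_c=10.0, mean_b=6.8, std_b=0.47)
print(trace)
```

The map is monotone, so the ordering of the samples is preserved and only the marginal distribution changes; a sample of C equal to its mean is sent to exp(mean_b), the median of A.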


Fig. 3. Quality scalability. Adjustment of the autocorrelation. Base layer (top), first enhanced layer (bottom left), second enhanced layer (bottom right).


Fig. 4. Quality scalability. Adjustment of the marginal distribution. Base layer (top), first enhanced layer (bottom left), second enhanced layer (bottom right).

In Figs. 1, 2, 3, and 4 we represent the autocorrelation function and the marginal distribution of synthetic traces of the tmp4A2857_thumb processes. We can observe a good match with the empirical traces in both cases.

Conclusions and Further Work

In this paper, we have checked the suitability of the M/G/∞ process for modeling the spatial and quality scalability extensions of the H.264 standard.

The proposed generators enjoy several interesting features: highly efficient on-line generation and the possibility of capturing the whole correlation structure in a parsimonious way.

As further work we are going to include these traffic models in simulators of different systems, in order to use the synthetic traces for performance evaluation of scalable video transmission.
