Ensemble Learning of Run-Time Prediction Models for Data-Intensive Scientific Workflows - High Performance Computing


According to this configuration, six different datasets were generated, one for each combination of task type (RandomR, SVM-RFE or GELF) and type of infrastructure (homogeneous or heterogeneous).

5 Results and Analysis

In this section we present the results obtained during the experimental process.

Six different scenarios were analyzed: the three types of GEAE tasks on homogeneous and heterogeneous infrastructures.

To measure the performance of each model we use the Relative Absolute Error (RAE), which is computed as

$$\text{error} = \frac{|p_1 - a_1| + \dots + |p_m - a_m|}{|a_1 - \bar{a}| + \dots + |a_m - \bar{a}|} \cdot 100\%,$$

where $p_i$ and $a_i$ represent the predicted and actual values, respectively, for the $i$-th example, $\bar{a}$ represents the mean of the actual values, and $m$ is the number of testing examples. This metric measures the deviation of the predictions from the actual values, normalized by the deviation of the actual values from their mean. The following sections present the results obtained for the homogeneous and heterogeneous environments, respectively, and an overall analysis of the results.
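As an illustration of the RAE metric, the following is a minimal Python sketch, not the authors' implementation. The function name `relative_absolute_error` and the sample run times are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from statistics import fmean


def relative_absolute_error(predicted, actual):
    """Return the Relative Absolute Error (RAE) as a percentage.

    The numerator sums the absolute prediction errors; the denominator
    sums the absolute deviations of the actual values from their mean.
    """
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and of equal length")
    mean_actual = fmean(actual)
    numerator = sum(abs(p - a) for p, a in zip(predicted, actual))
    denominator = sum(abs(a - mean_actual) for a in actual)
    if denominator == 0:
        raise ValueError("RAE is undefined when all actual values are identical")
    return 100.0 * numerator / denominator


# Hypothetical run times in seconds (illustrative only, not data from the study).
actual = [6.0, 8.0, 7.0, 9.0]
predicted = [7.0, 8.0, 6.0, 10.0]
print(relative_absolute_error(predicted, actual))  # 75.0
```

A model that always predicts the mean of the actual values scores exactly 100% under this definition, so values below 100% indicate an improvement over that trivial baseline.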

Homogeneous Environment. Table 3 presents the errors for the homogeneous environments. Highlighted values represent the minimum errors for each type of task. For the RandomR task, ANN achieves the minimum error (34.1%), but all the methods perform very similarly (except for SVR, whose error rises to 43.1%). It is worth pointing out that, although these errors are high, in practice they do not have very negative effects because the mean duration of the tasks is very small (7.7 s).

SVM-RFE is a much simpler task to model, as evidenced by the lower errors in the table. Once again ANN achieves the best results. The impact of these errors is negligible because SVM-RFE tasks have an average duration of 16.3 s.

For GELF tasks, the Bagging strategy presents the minimum error. This error is about 20.7%, which represents a reduction of the error ranging from 10.5% to 21.2% in comparison with the rest of the competitors. In contrast to the ranker tasks, large errors in the prediction of the duration of GELF tasks have much more undesirable consequences because these tasks last much longer (2183.6 s).

As a general note, the highest errors are obtained for RandomR, for two reasons. First, its performance is not determined by any parameter or characteristic of the data (the task randomly sorts the genes with no input other than the data). Second, its short duration is very likely to be disturbed by other factors (e.g. background load, workflow system overhead, etc.).

Another result worth noting is that the ensemble method produced errors in the same range as the best of the methods (ANN), with only a 1%-2% increase in the error. In addition, for the case of GELF, the performance of ANN drops dramatically, making it the worst-performing method. In contrast, the ensemble method
