Information Technology Reference
Because every task has a different self-loop probability in reality, the
self-loop probability of every task is set to 0.2 for convenience of calculation.
If the discount factor γ is large, the future return has a greater effect on the Q value than the
immediate payoff. γ is set to 0.9 in this paper. The data sets are described
in detail as follows:
(1) A data set from a real log
The resources' average processing time for each task can be obtained from
the real log, and the social relation between two resources is computed
from the real log as well. Each resource's processing time when it is
affected by other resources is then calculated.
(2) A simulated data set
Because some resources' processing times for the same task are much larger
than others', which can be considered noise in the data set, the resources'
average processing times for each task are simulated to be of almost the
same magnitude for the experiment.
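To make the role of the discount factor concrete, the following is a minimal sketch of a tabular Q-learning update for resource allocation. The (task, resource) keying of the Q-table and the use of negative processing time as the reward are assumptions for illustration, not the paper's exact formulation.

```python
# Minimal tabular Q-learning update sketch (assumed keying: (task, resource)).
GAMMA = 0.9  # discount factor, as set in the paper

def q_update(Q, task, resource, proc_time, next_task, resources, alpha=0.1):
    """One step: Q(s,a) <- Q(s,a) + alpha * (r + GAMMA * max_a' Q(s',a') - Q(s,a))."""
    reward = -proc_time  # shorter processing time => larger reward (assumption)
    best_next = max(Q.get((next_task, r), 0.0) for r in resources)
    old = Q.get((task, resource), 0.0)
    Q[(task, resource)] = old + alpha * (reward + GAMMA * best_next - old)

Q = {}
q_update(Q, "register", "alice", proc_time=5.0, next_task="review",
         resources=["alice", "bob"])
```

With γ = 0.9, the `GAMMA * best_next` term dominates the immediate reward as learning progresses, which is exactly the "return has greater effect than the immediate payoff" behavior described above.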
6.2 Results and Analysis
Fig.2 shows the results of the Q-learning algorithm with and without SR from
the flow-time perspective. The X-axis is the number of executions, and the
Y-axis is the average flow time. Because the SR has little effect in early
iterations, the resources are chosen randomly in the first 100 executions.
In the last 100 executions, the resource with the highest Q value is chosen.
The results show that the last 100 executions outperform the first 100
executions, both in the experiment with SR and in the one without it. This
verifies that the Q-learning algorithm can choose an optimal route for task
allocation.
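The two-phase selection described above can be sketched as follows; the function and variable names are illustrative, not from the paper.

```python
import random

def choose_resource(Q, task, resources, execution_idx, explore_until=100):
    """Random choice during the first `explore_until` executions,
    then greedy exploitation of the learned Q values."""
    if execution_idx < explore_until:
        return random.choice(resources)                 # exploration phase
    return max(resources, key=lambda r: Q.get((task, r), 0.0))  # greedy phase

# With processing-time-based (negative) rewards, the less negative Q wins:
Q = {("review", "alice"): -3.0, ("review", "bob"): -1.5}
picked = choose_resource(Q, "review", ["alice", "bob"], execution_idx=150)
# picked == "bob"
```

Switching from random to greedy selection is what produces the drop in average flow time between the first and last 100 executions.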
In the real data set, the resources' processing times for the same task
differ greatly. Random selection may choose the resource with the longest
processing time in one execution and the resource with the shortest
processing time in another, so there are obvious fluctuations in the results
of the first 100 executions in Fig.2(a). In Fig.2(b), the average flow time
with and without SR is similar in the first 100 executions, which shows that
the result of random scheduling is the same whether SR is considered or not.
The average flow time of the algorithm with SR is lower than that without SR
in the last 100 executions, which means that taking social relation into
consideration for task allocation improves work efficiency.
Fig.3(a) shows the average flow time of 100 cases in one execution. On the
real data set, the results with SR show an improvement of almost 53%. On the
simulated data set, the average flow time is reduced to 50.79 minutes with SR,
compared with 84.85 minutes without SR, an improvement of about 40% when SR is
considered in scheduling. The real data set shows a bigger improvement because
its resources' processing times fluctuate more. In other words, taking social
relation into consideration for task allocation is efficient.
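The roughly 40% figure for the simulated data set follows directly from the two reported averages:

```python
# Check of the reported improvement on the simulated data set.
with_sr = 50.79      # average flow time with SR (minutes)
without_sr = 84.85   # average flow time without SR (minutes)
improvement = (without_sr - with_sr) / without_sr
print(round(improvement * 100, 1))  # prints 40.1
```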
Fig.3(b) shows the results of the Q-learning algorithm with and without SR
from the throughput perspective. Here the throughput is the number of
completed cases in one hour. The throughput of the Q-learning algorithm
without SR is