Information Technology Reference
Because every task has a different self-loop probability in reality, the
self-loop probability of every task is set to 0.2 for convenience of calculation.
If the discount factor γ is large, the future return has a greater effect on the Q value than the
immediate payoff. γ is set to 0.9 in this paper. The data sets are described
in detail as follows:
(1) A data set from a real log
The resources' average processing time for each task can be obtained from
the real log, and the social relation between two resources is computed
from the real log as well. Each resource's processing time when it is
affected by other resources is then calculated.
(2) A simulated data set
Because some resources' processing times for the same task are much larger
than others', which can be considered noise in the data set, the resources'
average processing times for each task are simulated to be of almost the
same magnitude for the experiment.
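To make the role of the discount factor concrete, the following is a minimal sketch of a tabular Q-learning update for resource allocation. The (task, resource) keying of the Q-table and the use of negative processing time as the reward are assumptions for illustration, not the paper's exact formulation.

```python
# Minimal tabular Q-learning update sketch (assumed keying: (task, resource)).
GAMMA = 0.9  # discount factor, as set in the paper

def q_update(Q, task, resource, proc_time, next_task, resources, alpha=0.1):
    """One step: Q(s,a) <- Q(s,a) + alpha * (r + GAMMA * max_a' Q(s',a') - Q(s,a))."""
    reward = -proc_time  # shorter processing time => larger reward (assumption)
    best_next = max(Q.get((next_task, r), 0.0) for r in resources)
    old = Q.get((task, resource), 0.0)
    Q[(task, resource)] = old + alpha * (reward + GAMMA * best_next - old)

Q = {}
q_update(Q, "register", "alice", proc_time=5.0, next_task="review",
         resources=["alice", "bob"])
```

With γ = 0.9, the `GAMMA * best_next` term dominates the immediate reward as learning progresses, which is exactly the "return has greater effect than the immediate payoff" behavior described above.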
6.2 Results and Analysis
Fig.2 shows the results of the Q-learning algorithm with and without SR from
the flow-time perspective. The X-axis is the number of executions, and the
Y-axis is the average flow time. Because the SR has little effect in early
iterations, the resources are chosen randomly in the first 100 executions.
In the last 100 executions, the resource with the highest Q value is chosen.
The results show that the last 100 executions outperform the first 100
executions, both in the experiment with SR and in the one without it. This
verifies that the Q-learning algorithm can choose an optimal route for task
allocation.
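The two-phase selection described above can be sketched as follows; the function and variable names are illustrative, not from the paper.

```python
import random

def choose_resource(Q, task, resources, execution_idx, explore_until=100):
    """Random choice during the first `explore_until` executions,
    then greedy exploitation of the learned Q values."""
    if execution_idx < explore_until:
        return random.choice(resources)                 # exploration phase
    return max(resources, key=lambda r: Q.get((task, r), 0.0))  # greedy phase

# With processing-time-based (negative) rewards, the less negative Q wins:
Q = {("review", "alice"): -3.0, ("review", "bob"): -1.5}
picked = choose_resource(Q, "review", ["alice", "bob"], execution_idx=150)
# picked == "bob"
```

Switching from random to greedy selection is what produces the drop in average flow time between the first and last 100 executions.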
In the real data set, the resources' processing times for the same task
differ greatly. Random selection may choose the resource with the longest
processing time in one execution and the resource with the shortest
processing time in another, so there are obvious fluctuations in the results
of the first 100 executions in Fig.2(a). In Fig.2(b), the average flow time
with and without SR is similar in the first 100 executions, which shows that
the result of random scheduling is the same whether SR is considered or not.
The average flow time of the algorithm with SR is lower than that without SR
in the last 100 executions, which means that taking social relation into
consideration for task allocation improves work efficiency.
Fig.3(a) shows the average flow time of 100 cases in one execution. On the
real data set, the results with SR show an improvement of almost 53%. On the
simulated data set, the average flow time is reduced to 50.79 minutes with SR,
compared with 84.85 minutes without SR, an improvement of about 40% when SR is
considered in scheduling. The real data set shows a bigger improvement because
its resources' processing times fluctuate more. In other words, taking social
relation into consideration for task allocation is efficient.
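The roughly 40% figure for the simulated data set follows directly from the two reported averages:

```python
# Check of the reported improvement on the simulated data set.
with_sr = 50.79      # average flow time with SR (minutes)
without_sr = 84.85   # average flow time without SR (minutes)
improvement = (without_sr - with_sr) / without_sr
print(round(improvement * 100, 1))  # prints 40.1
```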
Fig.3(b) shows the results of the Q-learning algorithm with and without SR
from the throughput perspective. Here the throughput is the number of
completed cases in one hour. The throughput of the Q-learning algorithm
without SR is