Q-learning Algorithm for Task Allocation Based on Social Relation - Process-Aware Systems

Information Technology Reference

In-Depth Information

visits n ( s, a ) is the accessed total number of the state-action pair (s,a) within n

loops.

And in order to facilitate the iterative approximation of Q ( s, a ), one state-

action pair (s,a) is mapped to a Q value. In this paper, there is:

( s, a )=(( t, wl ) ,a )=( t, wl, a )

ↆ

Where, T and R are finite sets. Let t

∈

T is the task will be allocated. The

workload of one resource r

∈

candidates ( t ) is defined as wl ( r ):

⊧

⊨

FREE, (

r.wl

=0)

wl ( r )=

LOW, (

r.wl

|≤

avg wl ( t ))

(7)

⊩

HIGH, (

r.wl

>avgwl ( t ))

Here, the function avg wl ( t ) is the average length of all candidate resources'

work list for the task t . For example, if there are no enabled work items that are

allocated to the resource r , wl ( r )= FREE , and if the work list's length of the

resource r is less (higher) than the avg wl ( t ), wl ( r )= LOW ( HIGH ).

Intuitively, the estimation function Q ( s, a ) can be mapped to a big table:

Q = Q Table ( t, r, wl ( r ))

For instance, Q Table ( t 1 ,r 1 ,FREE ) is a value achieved from formula (6) when

task t 1 is allocated to resource r 1 and its workload is FREE .

Table1 shows the main steps of Q-learning algorithm for task allocation with-

out SR as follow:

Tabl e 1. Q-learning Algorithm without SR

1Foreach s and a ,the Q Table value is initialized as 0.

2 Get the current state s

3 Repeat until the system stop:

Choose the resource r pick according to the value of ˆ

Q ( s,a ) ,a← r pick ,and

then allocate the task t to r pick

Allocate processing time for the task

Save ( s,a ) according to the corresponding case

A new enabled item is came or a work item is completed.

Get a new state s

ead ( s,a ) according to the new case ˃

Receive the reward r

Update the Q Table value according to the following formula:

Q n ( s,a ) ← (1 − ʱ n ) ˆ

Q n− 1 ( s,a )+ ʱ n [ r + ʳmax a ˆ

Q n− 1 ( s ,a )]

s ← s , ˃ ← ˃

Process-Aware Systems

Search WWH ::

Custom Search

Home