number of entities processed by server BG, which is implemented as a matrix of frequencies recorded in the LTPM server.
For the Hicog and PM servers, to avoid building an ad hoc model that directly uses the results of the experiment being simulated, nine parameters of the Hicog and PM servers were calculated from previous studies (see Appendix 1).
9.3.1.2 Learning Process in the Simplest Queuing Network with Two Routes
Based on the learning process of individual servers, the condition under which an entity switches between the two routes in the simplest form of a queuing network with two routes (each server capacity equals 1), from route 1 → 2 → 4 to route 1 → 3 → 4 (see Fig. 9.6), was quantified and proved by the following mathematical deduction.
1. Q online learning equation [46]
$$
Q_{t+1}(i,j) = Q_t(i,j) + \varepsilon \left\{ r_t + \gamma \max_k \left[ Q_t(j,k) \right] - Q_t(i,j) \right\}, \qquad (9.8)
$$
where Q_{t+1}(i, j) is the online Q value if an entity routes from server i to server j in the (t+1)th transition; max_k[Q(j, k)] represents the maximum Q value for routing from server j to the next k server(s) (k ≥ 1); r_t = μ_{j,t} is the reward and is the processing speed of server j if the entity enters it at the t-th transition; N_{jt} represents the number of entities going to server j at the t-th transition; ε is the learning rate of Q online learning (0 < ε < 1); γ is the discount parameter of routing to the next server (0 < γ < 1); and p is the probability that an entity routing from server 1 does not follow the Q online learning rule. For example, if p = 0.1, then 10% of entities will go from server 1 to server 2 even though Q(1, 3) > Q(1, 2).
A state is the status of an entity in server i; a transition is defined as an entity routed from server i to j. Equation (9.8) updates the Q value of a backup choice of routes (Q_{t+1}(i, j)) based on the Q value that maximizes over all routes possible in the next state (max_k[Q(j, k)]). In each transition, entities choose the next server according to the updated Q_t(i, j). If Q(1, 3) > Q(1, 2), more entities will go from server 1 to server 3 rather than to server 2.
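To make the bookkeeping behind Eq. (9.8) and the probability p concrete, the following minimal Python sketch updates route Q values and chooses between servers 2 and 3. The constants, the dictionary layout, and the helper names (update_q, choose_route_from_1) are illustrative assumptions, not the simulation code described in this chapter.

```python
import random

# Illustrative constants (assumed values, not taken from the chapter)
EPSILON = 0.3   # learning rate epsilon, 0 < epsilon < 1
GAMMA = 0.8     # discount parameter gamma, 0 < gamma < 1
P_IGNORE = 0.1  # probability p of not following the Q online learning rule

# Q values for the transitions of the two-route network 1->2->4 and 1->3->4
Q = {(1, 2): 0.0, (1, 3): 0.0, (2, 4): 0.0, (3, 4): 0.0}
NEXT = {1: [2, 3], 2: [4], 3: [4], 4: []}   # possible next servers


def update_q(i, j, mu_j):
    """Apply Eq. (9.8) after an entity moves from server i to server j."""
    reward = mu_j  # r_t = mu_{j,t}, the processing speed of server j
    best_next = max((Q[(j, k)] for k in NEXT[j]), default=0.0)
    Q[(i, j)] += EPSILON * (reward + GAMMA * best_next - Q[(i, j)])


def choose_route_from_1():
    """Pick server 2 or 3; with probability p the Q rule is not followed."""
    preferred = 2 if Q[(1, 2)] >= Q[(1, 3)] else 3
    other = 3 if preferred == 2 else 2
    return other if random.random() < P_IGNORE else preferred
```

For instance, if server 2 is consistently faster (μ_{2,t} > μ_{3,t}), repeated calls to update_q(1, 2, mu_2) and update_q(1, 3, mu_3) push Q(1, 2) above Q(1, 3), so choose_route_from_1 sends a fraction 1 − p of entities from server 1 to server 2.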
2. Assumption
ε is a constant which does not change in the current learning process (0 < ε < 1). The processing speed of server 4 (μ_4) is constant.
3. Lemma 9.1. At any transition state t (t ≥ 0), if 1/μ_{2,t} < 1/μ_{3,t}, then Q_{t+1}(1, 2) > Q_{t+1}(1, 3).
Proof of Lemma 9.1 (see Appendix 2).
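As a quick illustration (not a substitute for the proof in Appendix 2), assume both routes start from equal Q values, Q_t(1, 2) = Q_t(1, 3), and equal downstream terms, Q_t(2, 4) = Q_t(3, 4). Subtracting the two instances of Eq. (9.8) then leaves only the rewards:
$$
Q_{t+1}(1,2) - Q_{t+1}(1,3) = \varepsilon\,(\mu_{2,t} - \mu_{3,t}),
$$
which is positive exactly when 1/μ_{2,t} < 1/μ_{3,t}, i.e., when server 2 processes entities faster than server 3.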
Based on Lemma 9.1 and Equation (9.7), we obtained Lemma 9.2:
4. Lemma 9.2. At any transition state t (t ≥ 0), if A_2 + B_2 Exp(α_2 N_{2t}) < A_3 + B_3 Exp(α_3 N_{3t}), then Q_{t+1}(1, 2) > Q_{t+1}(1, 3).
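Assuming Eq. (9.7) gives the processing time of server j as a learning-curve function of the number of entities it has processed, 1/μ_{j,t} = A_j + B_j Exp(α_j N_{jt}) (a form inferred from the lemma statement rather than quoted from the equation itself), Lemma 9.2 is Lemma 9.1 with that substitution:
$$
A_2 + B_2 e^{\alpha_2 N_{2t}} < A_3 + B_3 e^{\alpha_3 N_{3t}}
\;\Longleftrightarrow\;
\frac{1}{\mu_{2,t}} < \frac{1}{\mu_{3,t}}
\;\Longrightarrow\;
Q_{t+1}(1,2) > Q_{t+1}(1,3).
$$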