Information Technology Reference
In-Depth Information
Let us suppose that the buyer agent's requirement quantity is v . Then, at each stage k of time t , the
buyer agent can accept the seller agent's quantity on hand v k , or wait till the next time period ( k+ 1)
for another offer. However, at the end of bidding time ( t = N ), the buyer agent has to accept the open
offer, or lose everything.
Let us assume that the seller agent's quantity, v k , is a random number. The buyer agent has knowledge
only of the probability distribution of the quantity, but no future knowledge of exactly how much the
seller agent will have available in the following stage, or time period.
We define
= − , as a loss function of the buyer agent if the buyer agent accepts the offer
from the seller agent at stage k , where α is a constant. In the case of an exact match, where the seller
agent's quantity
2
L v
( )
(
v v
)
k
k
k
L v = , i.e. there is no loss. Therefore, the
buyer agent has to decide if he/she will accept the loss associated with the transaction, ( )
, it becomes evident that the loss, ( ) 0
k
q v
=
k
k
L v , or wait
k
k
till the next stage.
The traditional objective in operations research is the minimization of loss, with an implicit under-
standing that is loss is more definite than profit, and if loss is minimized profit will be increased. The
algorithm of DP as formulated to minimize loss, is stated as follows:
0
2
J q
( )
=
(
v v
)
(1)
N
N
k
2
J q
( ) min[ (
=
v v E J v
+
) , {
(
)}]
(2)
k
k
k
k
1
k
+
1
The Dynamic Programming algorithm given above defines the decision problem. Thus, the algo-
rithm defines when to accept an offer and when to reject the offer, subject to minimizing the costs of
acceptance and rejection. An optimal solution to the algorithm is an optimal strategy for minimizing
cost in the transaction, and a transaction is completed when the cost is minimal to both the seller and
buyer agents involved in the transaction.
There has to be a stopping point in the transaction, when the cost of waiting exceeds the cost of
completing the transaction. As defined in the DP formulation, the optimal stopping strategy is given
as follows:
2
Buy: if
(
v v
)
<
E J v
+
{
(
)
}
(3)
k
k
1
k
+
1
2
Wait : if
(
v v
)
>
E J v
+
{
(
)
}
(4)
k
k
1
k
+
1
Thus, the buyer agent will wait if there is a net gain in waiting, or accept the seller agent's offer if
there is no further gain, or if this is the end of the waiting period.
The formulation can be repeated in reverse order for the seller agent. Matching quantities is a de-
cision in reverse order for a seller agent, as it is for a buyer agent, because a part quantity left behind
from a transaction is a higher cost transaction from a selling perspective. Again, the transaction has to
be completed at the end of the time period, or the seller agent is left holding a quantity that has to be
disposed of. A DP formulation describes an excellent set of decision rules that exactly describe a problem
that is dynamic in nature, where little is known about future states. The essential characteristic of a DP
formulation is that, while it is exact in describing the decision problem, it is difficult to implement in
practice because so little is known about the future, and even the next state of the decision problem.
Search WWH ::




Custom Search