Through a shading mode parameter, bid shading can be turned off, in which case
Walverine bids its marginal value. Another parameter defines a shade percentage,
specifying a fixed fraction to bid below marginal value. There are two modes corresponding
to the optimal shading algorithm, differing in how they model the other agents' value
distributions. In the first, the distributions are derived from a simplified competitive
analysis. For this mode, another parameter, shade model threshold, turns off shading
when the model appears too unlikely given the price quote. Specifically, we calculate
the probability that the 16th highest bid is greater than or equal to the quote according
to the modeled value distributions, and if this probability is too low we refrain from
using the model for shading. For the second optimal shading mode, instead of the
competitive model we employ empirically derived distributions keyed on the hotel
closing order.
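
As a rough illustration of the threshold check described above, the following sketch computes the probability that at least 16 modeled bids meet or exceed the quote, assuming the modeled bids are independent. The function names, the per-bid probabilities, and the use of a fixed shade fraction as a stand-in for the optimal shading computation are all assumptions made for illustration; they are not taken from the Walverine implementation.

```python
from typing import Sequence


def prob_kth_highest_at_least(exceed_probs: Sequence[float], k: int = 16) -> float:
    """P(at least k of the independent modeled bids are >= the quote).

    exceed_probs[i] is the modeled probability that bid i is >= the quote.
    The k-th highest bid is >= the quote exactly when at least k bids are.
    """
    # dist[j] = probability that exactly j bids seen so far are >= the quote;
    # the last slot (index k) accumulates the event "k or more".
    dist = [1.0] + [0.0] * k
    for p in exceed_probs:
        new = [0.0] * (k + 1)
        for j, mass in enumerate(dist):
            new[j] += mass * (1.0 - p)
            new[min(j + 1, k)] += mass * p
        dist = new
    return dist[k]


def shaded_bid(marginal_value: float,
               shade_fraction: float,
               exceed_probs: Sequence[float],
               model_threshold: float) -> float:
    """Bid marginal value if the model looks too unlikely, else shade.

    shade_fraction is a hypothetical placeholder for whatever shade the
    optimal shading algorithm would choose.
    """
    if prob_kth_highest_at_least(exceed_probs) < model_threshold:
        return marginal_value  # model rejected: shading turned off
    return marginal_value * (1.0 - shade_fraction)
```

The dynamic program avoids enumerating bid combinations: it only tracks how many bids so far clear the quote, capped at 16.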
2.3 Entertainment Trading
We choose among a discrete set of policies for trading entertainment. As a baseline, we
implemented the strategy employed by livingagents in TAC-01 [6]. We also applied
reinforcement learning to derive policies from scratch, expressed as functions of marginal
valuations and various additional state variables. The policy employed by Walverine
in TAC-02 was derived by Q-learning over a discretized state space. For TAC-03 we
learned an alternative policy, this time employing a neural network to represent the
value function. Our analysis of other agents indicated that Whitebear performs
particularly well in entertainment trading. Therefore, we also implemented an entertainment
module based on the Whitebear policy (see footnote 2), adapted for the Walverine architecture.
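
For readers unfamiliar with Q-learning over a discretized state space, a minimal generic sketch follows. The bucket width, the action names, and the reward signal are hypothetical; the state variables and discretization actually used in Walverine are not specified here.

```python
from collections import defaultdict
from typing import Dict, Hashable, Sequence, Tuple

QTable = Dict[Tuple[Hashable, str], float]


def discretize(marginal_value: float,
               bucket_width: float = 20.0,
               max_bucket: int = 10) -> int:
    """Map a continuous marginal valuation to a bucket index (assumed scheme)."""
    return min(int(max(marginal_value, 0.0) // bucket_width), max_bucket)


def q_update(q: QTable,
             state: Hashable,
             action: str,
             reward: float,
             next_state: Hashable,
             actions: Sequence[str],
             alpha: float = 0.1,
             gamma: float = 0.95) -> float:
    """Apply one standard tabular Q-learning update and return the new value."""
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical usage: unseen state-action pairs default to a value of 0.
q_table: QTable = defaultdict(float)
ACTIONS = ["buy", "sell", "hold"]
q_update(q_table, discretize(45.0), "hold", 10.0, discretize(65.0), ACTIONS)
```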
2.4 Other Parameters
Walverine predicts hotel prices based on competitive equilibrium analysis [2]. The
result, however, does not account for uncertainty in the predictions. We developed a
simple method to hedge on our price estimates, by assigning an outlier probability to
the event that a hotel price will be much greater than predicted. We can hedge to a
greater or lesser degree by modifying this outlier parameter.
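
One simple way to picture this hedge is as a two-point distribution: the predicted price with probability one minus the outlier probability, and a much higher price otherwise. The sketch below assumes this form; the outlier price level and the function names are hypothetical and are not given in the text.

```python
from typing import List, Tuple

PriceDist = List[Tuple[float, float]]  # (price, probability) pairs


def hedged_price_distribution(predicted: float,
                              outlier_prob: float,
                              outlier_price: float) -> PriceDist:
    """Two-point distribution: the point prediction plus a high-price outlier."""
    if not 0.0 <= outlier_prob <= 1.0:
        raise ValueError("outlier_prob must lie in [0, 1]")
    return [(predicted, 1.0 - outlier_prob), (outlier_price, outlier_prob)]


def expected_price(dist: PriceDist) -> float:
    """Mean price under a discrete distribution."""
    return sum(price * prob for price, prob in dist)
```

Raising the outlier probability shifts more weight to the high price, so the agent hedges more strongly against an underestimate.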
Given a price distribution, one could optimize bids with respect to the distribution
itself, or with respect to the expected prices induced by the distribution. Although the
former approach is more accurate in principle, necessary compromises in implementation
make it unclear in practice which approach produces superior results [2,3,7]. Thus, we
include a parameter controlling which method to apply in Walverine.
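
The difference between the two approaches can be seen in a toy single-good example. The sketch assumes the agent obtains the good only when the realized price does not exceed its value, so that surplus is max(value - price, 0); this surplus model and the function names are illustrative assumptions, not Walverine's optimization.

```python
from typing import List, Tuple

PriceDist = List[Tuple[float, float]]  # (price, probability) pairs


def surplus_over_distribution(value: float, dist: PriceDist) -> float:
    """Expected surplus, evaluated outcome by outcome over the distribution."""
    return sum(prob * max(value - price, 0.0) for price, prob in dist)


def surplus_at_expected_price(value: float, dist: PriceDist) -> float:
    """Surplus evaluated once, at the mean price induced by the distribution."""
    mean_price = sum(price * prob for price, prob in dist)
    return max(value - mean_price, 0.0)
```

With a value of 120 and a distribution placing probability 0.9 on a price of 100 and 0.1 on a price of 400, the first function gives 18 while the second gives 0, because the mean price of 130 exceeds the value.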
Several agent designers have reported employing priceline predictions, accounting
for the impact of one's own demand quantity on price. We implemented a version of
the completion algorithm [8] that optimizes with respect to pricelines, and included it
as a Walverine option. A further parameter selects how price predictions and
optimizations account for outstanding hotel bids in determining current holdings. In one
setting, current bids for open hotel auctions are ignored; in another, the current
hypothetical winnings are treated as actual holdings.
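
To make the priceline idea concrete, the sketch below treats a priceline as a list giving the predicted price of each successive unit, and picks the quantity that maximizes total value minus total cost. This is a simplified single-good illustration under assumed names, not the completion algorithm of [8]; the holdings helper likewise only mirrors the two settings described above.

```python
from typing import Sequence, Tuple


def best_quantity(marginal_values: Sequence[float],
                  priceline: Sequence[float]) -> Tuple[int, float]:
    """Return (quantity, surplus) maximizing value minus priceline cost.

    marginal_values[i] is the value of the (i+1)-th unit and priceline[i]
    is the predicted price of that unit given the agent's own demand.
    """
    best_q, best_surplus = 0, 0.0
    surplus = 0.0
    for q, (value, price) in enumerate(zip(marginal_values, priceline), start=1):
        surplus += value - price
        if surplus > best_surplus:
            best_q, best_surplus = q, surplus
    return best_q, best_surplus


def effective_holdings(held: int,
                       hypothetically_won: int,
                       count_outstanding_bids: bool) -> int:
    """Holdings under the two settings: ignore or count hypothetical winnings."""
    return held + hypothetically_won if count_outstanding_bids else held
```

A rising priceline such as [50, 70, 90] captures the effect of the agent's own demand on price: with marginal values [100, 80, 60], the third unit costs more than it is worth, so the sketch stops at two units.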
2 Thanks to Ioannis Vetsikas for providing a version of the 2003 source code for Whitebear.