Information Technology Reference
In-Depth Information
In the next section we illustrate the parametrization of strategy space by describ-
ing some of Walverine 's key parameters. Section 3 appeals to the TAC literature to
demonstrate the importance of accounting for strategic interactions in evaluating agent
designs. We describe the explosion of strategy profile space in Section 4, and introduce
our hierarchical reduction operator. Results from our empirical game-theoretic analysis
to date are summarized in Section 5. 1 The remaining sections describe our selection of
a particular strategy for TAC-05, and report results.
2
Walverine Parameters
TAC travel-shopping is an 8-player symmetric game, with a complex strategy space and
pivotal agent interactions. Strategies include all policies for bidding on flights, hotels,
and entertainment over time, as a function of prior observations. To focus our search,
we restrict attention to variations on our basic Walverine strategy [1], as originally
developed for TAC-02 and refined incrementally for 2003 and 2004.
We illustrate some of the possible strategy variations by describing some of the pa-
rameters we have exposed to the calling interface. To invoke an instance of Walverine ,
the user specifies parameter values dictating which version of the agent's modules to
run, and what arguments to provide to these modules.
2.1
Flight Purchase Timing
Flight prices follow a random walk with a bias that is determined by a hidden parameter
that is chosen randomly at the start of the game. Specifically, at the start of each game,
a hidden parameter x is chosen from the integers in [
10 , 30].Define x ( t )=10+
( t/ 9:00)( x
10). Every 10 seconds thereafter, given elapsed time t , flight prices are
perturbed by a value chosen uniformly, with bounds [ lb, ub ] determined by
[ x ( t ) , 10]
if x ( t ) < 0 ,
[ lb, ub ]=
[
10 , 10]
if x ( t )=0 ,
(1)
[
10 ,x ( t )] if x ( t ) > 0 .
Whereas flight price perturbations are designed to increase in expectation given no in-
formation about the hidden parameter, conditional on this parameter prices may be
expected to increase, decrease, or stay constant.
Walverine maintains a distribution Pr( x ) for each flight, initialized to be uniform
on [-10,30], and updated using Bayes's rule given the observed perturbations Δ at each
iteration: Pr( x
x ),where α is a normalization constant.
Given this distribution over the hidden x parameter, the expected perturbation for the
next iteration, E [ Δ |
|
Δ )= α Pr( x )Pr( Δ
|
x ],issimply( lb + ub ) / 2, with bounds given by (1). Averaging over
the distribution for x ,wehave E [ Δ ]= x Pr( x ) E [ Δ |
x ].
Given a set of flights that Walverine has calculated to be in the optimal package,
it decides which to purchase now as a function of the expected perturbations, current
1
Another paper presenting the hierarchical game-reduction idea and appealing to the TAC case
study was presented at AAAI-05 [5]; some of the material in Sections 4 and 5 also appears in
that work.
Search WWH ::




Custom Search