// Set agent parameters:
agentSpecification.setAPValue("maxPolIter", 100);
agentSpecification.setAPValue("maxEvalIter", 200);
agentSpecification.setAPValue("theta", 0.0001);
// Create algorithm object with default values:
DPAgent agent = (DPAgent) agentSpecification.createAgentInstance();
// Put it all together:
agent.setAgentSettings(agentSettings);
agent.verify();
// Create DP environment:
DPEnvironment env = new GridJumpEnvironment();
// Create and init simulation object:
Simulation sim = new Simulation(agent, env);
sim.init(null); // assigns environment to agent
// Build DP model solving Bellman equation:
System.out.println("TRAINING");
agent.buildModel();
System.out.println( agent.getVfunction() ); // optimal state-value function
// Run simulation:
System.out.println("SIMULATION");
int maxStepsPerTrial = 10;
sim.steps(maxStepsPerTrial);
System.out.println("total time [s]: " + sim.getTimeSpentToRunTrials() );
MC Package
The Monte Carlo algorithms are organized in the MC package. These algorithms are simple, and the package contains basic implementations such as OnPolicyMCAgent for the on-policy MC algorithm and OffPolicyMCAgent for the off-policy MC algorithm. Consult [SB98] for these algorithms and their parameters, whose names in XELOPES are consistent with that reference.
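An OnPolicyMCAgent can be set up along the same lines as the DPAgent above. The following is only a minimal sketch by analogy with that example; the parameter names ("epsilon", "gamma"), the number of training steps, and the environment variable env are assumptions chosen for illustration, not taken from the XELOPES documentation.
// Sketch of an on-policy MC setup, by analogy with the DP example above.
// agentSpecification is assumed to have been created for OnPolicyMCAgent;
// the parameter names "epsilon" and "gamma" are assumed, not documented here.
agentSpecification.setAPValue("epsilon", 0.1); // exploration rate (assumed name)
agentSpecification.setAPValue("gamma", 0.9);   // discount factor (assumed name)
// Create algorithm object with default values:
OnPolicyMCAgent mcAgent = (OnPolicyMCAgent) agentSpecification.createAgentInstance();
mcAgent.setAgentSettings(agentSettings);
mcAgent.verify();
// Create and init simulation; env stands for an episodic environment instance:
Simulation mcSim = new Simulation(mcAgent, env);
mcSim.init(null); // assigns environment to agent
// Unlike DP, the MC agent learns from simulated experience:
mcSim.steps(1000); // number of steps chosen arbitrarily for illustration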
TD Package
The temporal-difference learning algorithms are organized in the TD package.
Examples are the classes SarsaAgent for the on-policy Sarsa algorithm, SarsaLambdaAgent for the on-policy Sarsa(λ) algorithm, and WatkinsQAgent for Watkins's off-policy Q-learning algorithm.
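To illustrate what SarsaAgent computes, independently of the XELOPES API, the following self-contained sketch shows the tabular Sarsa update from [SB98]; the class and variable names are chosen here for illustration only and are not part of the library.
// Minimal illustration of the tabular Sarsa update from [SB98];
// all names are illustrative and not part of the XELOPES API.
public class SarsaUpdateDemo {
    public static void main(String[] args) {
        int numStates = 5, numActions = 2;
        double[][] q = new double[numStates][numActions]; // action-value table Q(s,a)
        double alpha = 0.1;  // learning rate
        double gamma = 0.9;  // discount factor
        // One transition (s, a, r, s', a') of an episode:
        int s = 0, a = 1;          // current state and action
        double r = 1.0;            // observed reward
        int sNext = 2, aNext = 0;  // next state and the action actually chosen there
        // On-policy TD update: Q(s,a) <- Q(s,a) + alpha * (r + gamma * Q(s',a') - Q(s,a))
        q[s][a] += alpha * (r + gamma * q[sNext][aNext] - q[s][a]);
        System.out.println("Updated Q(s,a) = " + q[s][a]);
    }
}
Q-learning differs only in that the bootstrap term uses the maximum of Q(s', a) over all actions instead of the action actually taken in s', which is what makes it off-policy.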