3 Our TD Learning Method for 2048
This paper designs our TD learning method for 2048 based on the method in [17]. In our method, we first change the n-tuple network in Subsection 3.1, and then propose MS-TD learning in Subsection 3.2. The experiments for MS-TD learning are described in Subsection 3.4 and some issues are discussed in Subsection 3.5.
3.1 New N-Tuple Network
In this paper, we use the tuples shown in Fig. 4 (b) as well as all of their rotated and mirrored tuples. In brief, we change 4-tuples (1x4 lines) to 6-tuples in a green knife shape as shown in Fig. 4 (a). Apparently, the new 6-tuples cover all the original 4-tuples. The number of features increases only by a factor of two or so.
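As a rough illustration of what "rotated and mirrored tuples" means on a 4x4 board, the following Python sketch enumerates the eight symmetric images of a tuple and computes a lookup-table index for one of them. It is only a sketch under stated assumptions: the cell coordinates in KNIFE are a hypothetical 6-cell shape chosen for illustration and are not taken from Fig. 4, and the names rotate, mirror, symmetries and feature_index, as well as the encoding of each cell as an integer in 0..15, are assumptions rather than details given in the text.

```python
from typing import List, Sequence, Tuple

Cell = Tuple[int, int]        # (row, col) on the 4x4 board
NTuple = Tuple[Cell, ...]     # ordered cells that make up one tuple

SIZE = 4

def rotate(t: NTuple) -> NTuple:
    """Rotate every cell of the tuple 90 degrees clockwise."""
    return tuple((c, SIZE - 1 - r) for r, c in t)

def mirror(t: NTuple) -> NTuple:
    """Reflect every cell of the tuple left to right."""
    return tuple((r, SIZE - 1 - c) for r, c in t)

def symmetries(t: NTuple) -> List[NTuple]:
    """Return 8 images of t: 4 rotations, each with and without a mirror."""
    images = []
    current = t
    for _ in range(4):
        images.append(current)
        images.append(mirror(current))
        current = rotate(current)
    return images

def feature_index(board: Sequence[Sequence[int]], t: NTuple, base: int = 16) -> int:
    """Read the tuple's cells as digits of a base-`base` number.

    Assumption: board[r][c] holds a small integer code for the tile
    (for example its exponent), in the range 0..base-1.
    """
    index = 0
    for r, c in t:
        index = index * base + board[r][c]
    return index

# A 1x4 line, as in the original 4-tuples.
LINE: NTuple = ((0, 0), (0, 1), (0, 2), (0, 3))
# Hypothetical 6-cell shape for illustration only; not the shape in Fig. 4.
KNIFE: NTuple = LINE + ((1, 0), (1, 1))

if __name__ == "__main__":
    distinct_lines = {frozenset(s) for s in symmetries(LINE)}
    distinct_knives = {frozenset(s) for s in symmetries(KNIFE)}
    print(len(distinct_lines), len(distinct_knives))
```

Treating each image as a set of cells, the outer line maps onto itself under some symmetries, while the asymmetric 6-cell shape yields eight distinct placements. Whether the symmetric images share one weight table or each get their own is a design choice that this sketch leaves open.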
Fig. 5. Average scores in TD learning with different n-tuple networks
Fig. 6. Maximum scores in TD learning with different n-tuple networks