Each agent learns to repeat actions that result in positive feedback more often, and to decrease the probability of unsuccessful interactions. Coalitions are formed when agents select the same successful actions. A key feature of our approach is that no explicit notion of coalition is necessary. Rather, these coalitions emerge from the global objective of the system, and agents learn by themselves with whom they have to (de)synchronize (e.g., to maximize throughput in a routing problem). Here, desynchronization refers to the situation in which one agent's actions (e.g., waking up the radio transmitter of a wireless node) are shifted in time relative to another agent's, so that the same actions of the two agents do not occur at the same time.
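To make this concrete, the following is a minimal sketch of one plausible instantiation of such a learning rule, a classical linear reward-penalty update over action probabilities. The function names, the step sizes a and b, and the four-slot example are illustrative assumptions, not the exact update rule used in our system.

    import random

    def choose_action(probs):
        """Sample an action index according to the current probabilities."""
        return random.choices(range(len(probs)), weights=probs)[0]

    def update(probs, chosen, success, a=0.1, b=0.05):
        """Linear reward-penalty update (illustrative, not the paper's rule).
        Successful actions gain probability mass; unsuccessful actions lose
        it, with the freed mass redistributed over the other actions.
        Both branches keep the probabilities summing to 1."""
        n = len(probs)
        if success:
            return [p + a * (1 - p) if i == chosen else (1 - a) * p
                    for i, p in enumerate(probs)]
        return [(1 - b) * p if i == chosen else b / (n - 1) + (1 - b) * p
                for i, p in enumerate(probs)]

    # Example: an agent choosing among 4 candidate wake-up slots in a frame.
    probs = [0.25] * 4
    slot = choose_action(probs)
    probs = update(probs, slot, success=True)   # e.g., the transmission succeeded

Repeated over many frames, such an update concentrates each agent's probability mass on the slots that work well given its neighbors' behavior, which is how (de)synchronized coalitions can emerge without explicit coordination.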
In this article, we extend our previous results by illustrating the benefits of our self-adapting RL approach in three wireless sensor networks with different topologies, namely line, mesh, and grid. We show that nodes form coalitions that reduce packet collisions and end-to-end latency, even for very low duty cycles. This (de)synchronicity is achieved in a decentralized manner, without any explicit communication and without any prior knowledge of the environment. Our simulations are implemented in OMNeT++, a state-of-the-art network simulator [11].
The paper is organized as follows. Section 2 presents the reinforcement learning approach for solving the wake-up scheduling problem in WSNs. Section 3 analyzes and discusses the performance of the RL approach on the three topologies, compares it to the standard S-MAC protocol, and briefly discusses future work. Section 4 concludes the paper.
2 (De)synchronicity with Reinforcement Learning
This section presents our decentralized approach to (de)synchronicity using the reinforcement learning framework. The proposed approach makes very few assumptions about the underlying networking protocols; we discuss these in Section 2.1. The subsequent sections detail the different components of the reinforcement learning mechanism.
2.1 Motivations and Network Model
Communication in WSNs is achieved by means of networking protocols, in particular the Medium Access Control (MAC) and routing protocols [4]. The MAC protocol is the data communication protocol concerned with sharing the wireless transmission medium among the network nodes. The routing protocol determines where sensor nodes transmit their data so that the data eventually reach the sink. A vast literature exists on these two topics [4]; in the following, we sketch the key requirements that the MAC and routing protocols must satisfy so that the reinforcement learning mechanism presented in Section 2.2 can be implemented. We emphasize that these requirements are very loose.
We use a simple MAC protocol, inspired by S-MAC [17], that divides time into small discrete units called frames. Each frame is further divided into time slots. The frame and slot durations are application-dependent; in our case, they are fixed by the user prior to network deployment. The sensor nodes then rely on a standard duty-cycle mechanism, in which a node is awake for a predetermined number of slots during each frame.
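The following sketch illustrates this frame/slot bookkeeping. All constants (frame duration, slot count, number of awake slots) are illustrative assumptions rather than the settings used in our experiments.

    # Illustrative frame/slot duty-cycle bookkeeping; the parameter values
    # below are assumptions for the sake of the example.
    FRAME_DURATION = 1.0          # seconds per frame (fixed by the user)
    SLOTS_PER_FRAME = 10          # slots per frame
    SLOT_DURATION = FRAME_DURATION / SLOTS_PER_FRAME
    AWAKE_SLOTS = 2               # duty cycle = AWAKE_SLOTS / SLOTS_PER_FRAME

    def is_awake(t, wakeup_slot):
        """Return True if a node whose wake-up offset within the frame is
        `wakeup_slot` is awake at absolute time t; the node stays awake
        for AWAKE_SLOTS consecutive slots each frame."""
        slot = int(t / SLOT_DURATION) % SLOTS_PER_FRAME
        return (slot - wakeup_slot) % SLOTS_PER_FRAME < AWAKE_SLOTS

    # Two desynchronized nodes: their awake windows never overlap.
    print(is_awake(0.05, wakeup_slot=0))  # True  (node A is awake in slot 0)
    print(is_awake(0.05, wakeup_slot=5))  # False (node B only wakes at slot 5)

In this view, the quantity each node learns is its wake-up offset within the frame; shifting offsets relative to one's neighbors is exactly the desynchronization discussed above.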
 