Each agent learns to repeat actions that result in positive feedback more often, and to decrease the probability of unsuccessful interactions. Coalitions are formed when agents select the same successful actions. A key feature of our approach is that no explicit notion of coalition is necessary. Rather, these coalitions emerge from the global objective of the system, and agents learn by themselves with whom they have to (de)synchronize (e.g., to maximize throughput in a routing problem). Here, desynchronization refers to the situation in which one agent's actions (e.g., waking up the radio transmitter of a wireless node) are shifted in time relative to another agent's, so that the same actions of the two agents do not occur at the same time.
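To make this concrete, the following is a minimal sketch of one plausible instantiation of such a learning rule, a classical linear reward-penalty update over action probabilities. The function names, the step sizes a and b, and the four-slot example are illustrative assumptions, not the exact update rule used in our system.

    import random

    def choose_action(probs):
        """Sample an action index according to the current probabilities."""
        return random.choices(range(len(probs)), weights=probs)[0]

    def update(probs, chosen, success, a=0.1, b=0.05):
        """Linear reward-penalty update (illustrative, not the paper's rule).
        Successful actions gain probability mass; unsuccessful actions lose
        it, with the freed mass redistributed over the other actions.
        Both branches keep the probabilities summing to 1."""
        n = len(probs)
        if success:
            return [p + a * (1 - p) if i == chosen else (1 - a) * p
                    for i, p in enumerate(probs)]
        return [(1 - b) * p if i == chosen else b / (n - 1) + (1 - b) * p
                for i, p in enumerate(probs)]

    # Example: an agent choosing among 4 candidate wake-up slots in a frame.
    probs = [0.25] * 4
    slot = choose_action(probs)
    probs = update(probs, slot, success=True)   # e.g., the transmission succeeded

Repeated over many frames, such an update concentrates each agent's probability mass on the slots that work well given its neighbors' behavior, which is how (de)synchronized coalitions can emerge without explicit coordination.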
In this article, we extend our previous results by illustrating the benefits of our self-adapting RL approach in three wireless sensor networks with different topologies, namely line, mesh, and grid. We show that nodes form coalitions that reduce packet collisions and end-to-end latency, even for very low duty cycles. This (de)synchronicity is achieved in a decentralized manner, without any explicit communication and without any prior knowledge of the environment. Our simulations are implemented in OMNeT++, a state-of-the-art network simulator [11].
The paper is organized as follows. Section 2 presents the reinforcement learning approach for solving the wake-up scheduling problem in WSNs. Section 3 analyzes and discusses the performance of the RL approach on the three topologies, compares it to the standard S-MAC protocol, and briefly discusses future work. Section 4 concludes the paper.
2 (De)synchronicity with Reinforcement Learning
This section presents our decentralized approach to (de)synchronicity using the reinforcement learning framework. The proposed approach makes very few assumptions about the underlying networking protocols; we discuss these in Section 2.1. The subsequent sections detail the different components of the reinforcement learning mechanism.
2.1 Motivations and Network Model
Communication in WSNs is achieved by means of networking protocols, in particular the Medium Access Control (MAC) and routing protocols [4]. The MAC protocol is the data communication protocol concerned with sharing the wireless transmission medium among the network nodes. The routing protocol determines where sensor nodes transmit their data so that the data eventually reach the sink. A vast literature exists on these two topics [4]; in the following, we sketch the key requirements that the MAC and routing protocols must satisfy so that the reinforcement learning mechanism presented in Section 2.2 can be implemented. We emphasize that these requirements are very loose.
We use a simple MAC protocol, inspired by S-MAC [17], that divides time into small discrete units called frames. Each frame is further divided into time slots. The frame and slot durations are application-dependent; in our case, they are fixed by the user prior to network deployment. The sensor nodes then rely on a standard duty-cycle mechanism, in which a node is awake for a predetermined number of slots during each frame.
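The following sketch illustrates this frame/slot bookkeeping. All constants (frame duration, slot count, number of awake slots) are illustrative assumptions rather than the settings used in our experiments.

    # Illustrative frame/slot duty-cycle bookkeeping; the parameter values
    # below are assumptions for the sake of the example.
    FRAME_DURATION = 1.0          # seconds per frame (fixed by the user)
    SLOTS_PER_FRAME = 10          # slots per frame
    SLOT_DURATION = FRAME_DURATION / SLOTS_PER_FRAME
    AWAKE_SLOTS = 2               # duty cycle = AWAKE_SLOTS / SLOTS_PER_FRAME

    def is_awake(t, wakeup_slot):
        """Return True if a node whose wake-up offset within the frame is
        `wakeup_slot` is awake at absolute time t; the node stays awake
        for AWAKE_SLOTS consecutive slots each frame."""
        slot = int(t / SLOT_DURATION) % SLOTS_PER_FRAME
        return (slot - wakeup_slot) % SLOTS_PER_FRAME < AWAKE_SLOTS

    # Two desynchronized nodes: their awake windows never overlap.
    print(is_awake(0.05, wakeup_slot=0))  # True  (node A is awake in slot 0)
    print(is_awake(0.05, wakeup_slot=5))  # False (node B only wakes at slot 5)

In this view, the quantity each node learns is its wake-up offset within the frame; shifting offsets relative to one's neighbors is exactly the desynchronization discussed above.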
 