Modeling and Discovering Occupancy Patterns in Sensor Networks Using Latent Dirichlet Allocation - Foundations on Natural and Artificial Computation

Federico Castanedo 1, Hamid Aghajan 2, and Richard Kleihorst 3

1 Computer Science Department. University Carlos III of Madrid

federico.castanedo@uc3m.es

2 Department of Electrical Engineering. Stanford University

aghajan@stanford.edu

3 Vito & Ghent University

richard.kleihorst@vito.be

Abstract. This paper presents a novel way to perform probabilistic modeling of occupancy patterns from a sensor network. The approach is based on the Latent Dirichlet Allocation (LDA) model. The application of the LDA model is shown using a real dataset of occupancy logs from the sensor network of a modern office building. LDA is a generative, unsupervised probabilistic model for collections of discrete data. Continuous sequences of binary sensor readings are segmented and grouped to build the discrete data (bag-of-words) that make up the dataset. These bags-of-words are then used to train the model with a fixed number of topics, also known as routines. Preliminary results indicate that the LDA model successfully finds latent topics across all rooms and thereby recovers the dominant occupancy patterns, or routines, of the sensor network.

Keywords: Probabilistic modeling, Sensor networks, Latent topics.
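As a rough illustration of the segmentation step summarized in the abstract, the following Python sketch turns binary occupancy sequences into bag-of-words counts. The word encoding used here (non-overlapping windows of three readings rendered as strings such as "110"), the document keys, and the function names segment_to_words and build_bags are all assumptions made for this example; the encoding actually used in the paper may differ.

```python
from collections import Counter


def segment_to_words(readings, window=3):
    """Split a binary reading sequence into non-overlapping windows and
    render each window as a string token such as "110".

    The window length is a hypothetical choice made for illustration."""
    words = []
    # Step through the sequence one full window at a time; a trailing
    # partial window is dropped.
    for start in range(0, len(readings) - window + 1, window):
        chunk = readings[start:start + window]
        words.append("".join(str(bit) for bit in chunk))
    return words


def build_bags(logs, window=3):
    """Map each document id (here an invented room/day key) to a Counter
    of word frequencies, i.e. its bag-of-words."""
    return {
        doc_id: Counter(segment_to_words(readings, window))
        for doc_id, readings in logs.items()
    }


# Toy occupancy logs (invented data): 1 = occupied, 0 = empty.
logs = {
    "room_A/day_1": [0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0],
    "room_B/day_1": [1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1],
}
bags = build_bags(logs)
for doc_id, bag in bags.items():
    print(doc_id, dict(bag))
```

Each resulting Counter plays the role of one document; a collection of such counts is the kind of input a topic model with a fixed number of topics would be trained on.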

1 Introduction

The main objective of this paper is to show a way to perform probabilistic modeling of occupancy patterns from a sensor network without any a priori or ground-truth information about the behaviors, that is, following an unsupervised technique.

More specifically, this paper presents an approach based on the Latent Dirichlet Allocation (LDA) model [1] for modeling and discovering occupancy patterns in an office environment using a sensor network. LDA is a generative, unsupervised probabilistic machine learning model for collections of discrete data. In a generative machine learning algorithm, the model's parameters are adjusted so that the model could plausibly have produced the observed data; in this sense the model is fitted to the data provided. The LDA model is one of the hierarchical Bayesian text models that have been proposed in the research community. It overcomes some of the limitations reported for probabilistic Latent Semantic Indexing (pLSI) [2], such as over-fitting. Since Blei's original paper [1], LDA has been successfully
