Recognizing Objects in Smart Homes Based on
Human Interaction
Chen Wu and Hamid Aghajan
AIR (Ambient Intelligence Research) Lab
Stanford University, USA
airlab.stanford.edu
Abstract. We propose a system to recognize objects with a camera network
in a smart home. Recognizing objects in a home environment from images is
challenging, due to the variation in the appearance of objects such as chairs,
as well as the clutter in the scene. Therefore, we propose to recognize objects
through user interactions. A hierarchical activity analysis is first performed
in the system to recognize fine-grained activities such as eating, typing, and
cutting. The object-activity relationship is encoded in the knowledge base of
a Markov logic network (MLN). An MLN has the advantage of encoding
relationships in an intuitive way with first-order logic syntax. It can also
handle both soft and hard constraints by associating weights with the formulas
in the knowledge base. Given activity observations, the defined MLN is
grounded and turned into a dynamic Bayesian network (DBN) to infer object
type probabilities. We expedite inference by decomposing the MLN into smaller
separate domains that relate to the currently active activity. Experimental
results are presented from our smart home testbed environment.
1 Introduction
In this paper we propose a system to recognize objects and room layout through
a camera network in a smart home. Recognizing objects such as tables, chairs,
and sofas in a home environment is challenging. First, many objects such as
chairs and desks vary widely in appearance and shape. Second, they are viewed
by the cameras from different viewpoints. Third, cameras installed in rooms
often have a wide field of view, so images are usually cluttered with many
objects, while some objects of interest may occupy only a small image region.
However, many objects are defined by their functions to users and not
necessarily by their appearance. Such objects can be recognized indirectly
from human activities during interaction with the objects.
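
To make this idea concrete, object-activity relationships can be written as
weighted first-order formulas in the MLN knowledge base. The formulas below
are purely illustrative; the predicates, weights, and hard constraint are our
assumptions, not the knowledge base used in this paper. Higher weights mark
stronger soft constraints, and a formula with infinite weight acts as a hard
constraint:

    2.0   Activity(p, Eating, t) ∧ Interacts(p, o, t) ⇒ Type(o, DiningTable)
    1.5   Activity(p, Typing, t) ∧ Interacts(p, o, t) ⇒ Type(o, Desk)
    ∞     Type(o, Sofa) ⇒ ¬Type(o, DiningTable)    (hard: types are exclusive)

During inference, each satisfied ground formula contributes its weight to the
log-probability of a world, so observing a person eating at an object raises
the probability that the object is a dining table without making it certain.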
In our work, objects in the kitchen, dining room, living room, and study room
are recognized based on the activities analyzed from the camera network. The
object types and activity classes in each semantic location are listed in
Table 1. We adopt a hierarchical approach to activity recognition, with
coarse- and fine-level recognition stages that use different image features.
In addition to the simpler pose-related activities such as standing, sitting,
and lying, we also recognize fine-grained activities such as eating, typing,
and cutting.
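
As a minimal sketch of the inference step, the following Python fragment fuses
a stream of observed activities into object-type probabilities with a plain
Bayesian update. It stands in for the grounded-MLN/DBN inference of the actual
system; all activity labels, object types, and likelihood values are assumed
for illustration only.

    # Illustrative sketch, not the authors' implementation: object-type
    # inference reduced to a plain Bayesian update over a stream of
    # observed activities. All labels and numbers are assumed.

    OBJECT_TYPES = ("dining_table", "desk", "sofa")

    # Assumed P(activity | object type): how likely each activity is
    # when a user interacts with an object of the given type.
    LIKELIHOOD = {
        "eating": {"dining_table": 0.70, "desk": 0.20, "sofa": 0.10},
        "typing": {"dining_table": 0.15, "desk": 0.80, "sofa": 0.05},
        "lying":  {"dining_table": 0.02, "desk": 0.03, "sofa": 0.95},
    }

    def update_belief(belief, activity):
        """One Bayesian update of the object-type distribution after an
        activity is observed while the user interacts with the object."""
        scores = {t: belief[t] * LIKELIHOOD[activity][t] for t in OBJECT_TYPES}
        total = sum(scores.values())
        return {t: s / total for t, s in scores.items()}

    belief = {t: 1.0 / len(OBJECT_TYPES) for t in OBJECT_TYPES}  # uniform prior
    for observed in ("typing", "typing", "eating"):  # example activity stream
        belief = update_belief(belief, observed)

    # After this stream, "desk" dominates the distribution.
    print(belief)

Unlike this simplification, the MLN formulation also captures hard constraints
and relations among multiple objects, which is why the paper grounds the
network into a DBN rather than updating each object independently.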
 