Non-negative Mutative-Sparseness Coding
towards Hierarchical Representation
Jiayun Cao and Lanjuan Zhu
Automation Department, Shanghai Jiaotong University, China
Abstract. In this work, we study the problem of data representation.
Many existing works focus on learning a set of bases that represents
the data sparsely and effectively. However, by widespread convention,
people often arrange data in a hierarchical structure so that a clearer
data framework and clearer interactions can be obtained, just as patent
documents are listed under different layers of classes. Thus, unlike
existing works, we aim to discover a hierarchical representation of the
data. Non-negative mutative-sparseness coding (NMSC) is a method for
analyzing the non-negative sparse components of multivariate data and
representing the data as a hierarchical structure. Specifically, in each
subsequent layer, the sparseness of each data point is adjusted according
to the corresponding hidden components in the upper layer. Our
experimental evaluations show that NMSC achieves great efficiency in
clustering and sufficient merit in hierarchically organizing the observed
document data.
Keywords: basis, hidden component, non-negative, mutative sparseness,
hierarchical representation.
1 Introduction
Hidden component detection and tracking (HCDT) is extensively utilized in
data analysis and information processing. It aims to automatically discover
the significant latent components of data and to find an optimized
representation for them. Representative methods include latent semantic
analysis (LSA), probabilistic latent semantic analysis (PLSA), latent
Dirichlet allocation (LDA), non-negative matrix factorization (NMF), and
non-negative sparse coding (NNSC).
Early on, standard LSA [1] was introduced to capture and represent
significant components of the data via Singular Value Decomposition (SVD).
It has been applied in various pattern recognition problems where complex
wholes can be treated as additive functions of components.
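As an illustration of LSA's core mechanism, the following sketch applies a truncated SVD to a toy term-document matrix; all counts and the choice of k = 2 are hypothetical, chosen only for illustration:

```python
import numpy as np

# Toy term-document matrix: rows = terms, columns = documents.
# (Hypothetical counts, for illustration only.)
X = np.array([
    [2, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 2, 0, 1],
    [0, 0, 1, 2],
], dtype=float)

# Truncated SVD: keep only the k largest singular values/vectors.
k = 2
U, s, Vt = np.linalg.svd(X, full_matrices=False)
U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]

# Rank-k reconstruction approximates X in the latent semantic space.
X_k = U_k @ np.diag(s_k) @ Vt_k

# Each document is now described by k latent coordinates instead of
# a full term-count vector.
doc_coords = np.diag(s_k) @ Vt_k
print(doc_coords.shape)  # (2, 4): k coordinates for each of 4 documents
```

Documents (and queries, after projection) can then be compared by cosine similarity in this low-dimensional latent space rather than in the raw term space.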
Later, PLSA [2] and LDA [3] emerged, inheriting the ideas of LSA and
improving on its performance. Compared to LSA, PLSA rests on a sounder
probabilistic model, while LDA is a three-level Bayesian model in which
the unobserved factors are assumed to have a Dirichlet prior.
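LDA's generative story, including the Dirichlet prior on the unobserved topic mixtures, can be sketched as follows; the topic count, vocabulary size, and hyperparameter values below are illustrative assumptions, not values from the cited papers:

```python
import numpy as np

rng = np.random.default_rng(0)

# LDA generative process (sketch): for each document, draw a topic
# mixture theta ~ Dirichlet(alpha); for each word position, draw a
# topic z ~ Categorical(theta), then a word w ~ Categorical(beta[z]).
n_topics, vocab_size, doc_len = 3, 10, 20
alpha = np.full(n_topics, 0.5)                 # Dirichlet prior on topic mixtures
beta = rng.dirichlet(np.ones(vocab_size), size=n_topics)  # topic-word distributions

theta = rng.dirichlet(alpha)                   # per-document topic proportions
z = rng.choice(n_topics, size=doc_len, p=theta)            # topic for each word
words = np.array([rng.choice(vocab_size, p=beta[t]) for t in z])
print(words)  # one sampled document, as word indices
```

Inference in LDA runs this story in reverse: given observed words, it estimates the posterior over theta and beta, which is what makes the unobserved factors recoverable.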
To add non-negativity constraints, in line with the intuitive notion that
parts combine additively, NMF [4] was introduced in 2001. With NMF, each
observed data point takes non-negative values in all latent semantic
directions, and the latent space in NMF does not need to be orthogonal.
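A minimal sketch of NMF using the standard multiplicative updates of Lee and Seung, applied to a toy random matrix (the dimensions, rank, and iteration count are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Non-negative data matrix (toy example).
X = rng.random((6, 8))

# Factor X ~= W @ H with W, H >= 0 via multiplicative updates,
# which preserve non-negativity at every step.
r = 3  # number of latent components (chosen arbitrarily here)
W = rng.random((6, r)) + 0.1
H = rng.random((r, 8)) + 0.1

eps = 1e-9  # guard against division by zero
for _ in range(200):
    H *= (W.T @ X) / (W.T @ W @ H + eps)
    W *= (X @ H.T) / (W @ H @ H.T + eps)

# Factors stay non-negative, so latent components add up rather
# than cancel, matching the parts-based intuition behind NMF.
print(np.linalg.norm(X - W @ H))  # reconstruction error
```

Because the columns of W need not be orthogonal, the learned components can overlap, unlike the singular vectors produced by SVD-based LSA.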
 