Recent Advances of Data Biclustering with Application in Computational Neuroscience - Computational Neuroscience


is reduced to small subgroups, and research on each subgroup will be easier and more direct. Clustering has been widely studied over the past 20 years; a general review of clustering is given by Jain et al. [31], and a survey of clustering algorithms is available from Xu et al. [57]. Future challenges in biological networks are discussed in the work edited by Chaovalitwongse et al. [9].

However, clustering groups only the objects, without considering which features each object may have. In other words, clustering compares two objects by the features they share, without describing the features in which they differ. A method that simultaneously groups objects and features is called biclustering, so that a specific group of objects is associated with a specific group of features. More precisely, biclustering seeks a subset of objects and a subset of features such that these objects are related to these features to some degree. Such subsets are called biclusters. Moreover, biclustering does not require objects in the same bicluster to behave similarly over all possible features, only to exhibit strongly the specific features of that bicluster.
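As a minimal illustration of this definition, the Python sketch below builds a small, made-up object-by-feature matrix and extracts one bicluster as the submatrix indexed by a subset of rows (objects) and a subset of columns (features). The data, the index sets, and the helper name `extract_bicluster` are hypothetical and chosen only for illustration; no particular biclustering algorithm is implied.

```python
import numpy as np

# Hypothetical data: 5 objects (rows) x 4 features (columns).
data = np.array([
    [9.0, 1.0, 8.5, 0.2],
    [0.3, 7.0, 0.1, 6.5],
    [8.8, 5.0, 9.1, 3.0],
    [0.5, 0.4, 0.2, 0.1],
    [9.2, 0.1, 8.9, 7.7],
])

def extract_bicluster(matrix, rows, cols):
    """Return the submatrix given by a subset of objects and a subset of features."""
    return matrix[np.ix_(rows, cols)]

# Assumed bicluster: objects 0, 2, 4 all score highly on features 0 and 2.
rows, cols = [0, 2, 4], [0, 2]
bicluster = extract_bicluster(data, rows, cols)

# Range of values inside the bicluster (the objects behave alike here).
spread_inside = bicluster.max() - bicluster.min()
# Range of values for the same objects over all features (they differ widely).
spread_all = data[rows, :].max() - data[rows, :].min()
```

In this toy matrix the three selected objects agree closely on the two selected features while differing on the remaining ones, which is the situation a bicluster is meant to capture.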

Besides the differences from clustering mentioned above, biclustering can also uncover hidden features and assign them to particular subsets of objects. Biclustering is also related to, yet distinct from, other data mining techniques such as classification, feature selection, and outlier detection. Classification is a kind of supervised clustering, whereas most biclustering algorithms are unsupervised; for some supervised biclustering methods, see [4, 40].

The biclustering problem is to find biclusters in data sets; in the literature it also goes by other names, such as co-clustering and two-mode clustering.

6.1.2 Data Input

Usually, we refer to the objects as samples. Samples have different features, and each sample may or may not have a given feature. The degree to which a sample has a specific feature is called its expression level. In the real world, samples may have quantitative or qualitative features. The expression levels of quantitative features can easily be expressed as numerical data, while qualitative features must be transformed into data using some measurement scale. Some biclustering algorithms allow qualitative features.
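One simple way to place a qualitative feature on a scale is an ordinal mapping from category labels to numbers. The sketch below assumes a hypothetical feature with levels `"low"`, `"medium"`, and `"high"`; the labels, the numeric scale, and the function name `encode_ordinal` are illustrative assumptions, not something prescribed by the text.

```python
# Hypothetical ordinal scale for a qualitative feature.
SCALE = {"low": 0, "medium": 1, "high": 2}

def encode_ordinal(values, scale=SCALE):
    """Map qualitative labels to numeric expression levels using a fixed scale."""
    return [scale[v] for v in values]

# Qualitative observations of one feature across four samples.
observed = ["high", "low", "medium", "high"]
levels = encode_ordinal(observed)
```

Whether an ordinal scale is appropriate depends on the feature; categories with no natural order may call for a different encoding.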

Biclustering algorithms mainly start from matrices. Two kinds are commonly used, and the first is the more likely to be used in biclustering.

• Expression Matrix. This data matrix has rows corresponding to samples and columns corresponding to features, with each entry measuring the expression level of a feature in a sample. Each row is called the feature vector of the sample. We can also call this matrix the sample-by-feature matrix.

Sometimes the matrix is formed from the feature vectors of all samples, where the level of each feature in a sample is observed directly. Generally, if all vectors have the same length, meaning they share the same set of features, we simply scale them and stack them to form a matrix. However, the feature
