We experiment with three University of California Irvine (UCI) datasets and one
real-world dataset (Plurk) and show that our algorithm significantly outperforms
the state-of-the-art transfer learning and multi-label classification methods.
2 Related Work
The concept of transfer learning lies in leveraging common knowledge from different tasks or different domains. In general, it can be divided into inductive and transductive transfer learning, based on the task and data [2].
TrAdaBoost [3] is an inductive instance-transfer approach extended from AdaBoost. TrAdaBoost applies different weight-updating functions to instances in the target domain and in the source domain. Since the distribution of the target domain is more similar to that of the testing data, incorrectly predicted instances in the target domain are generally assigned higher weights than those in the source domain.
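To make this mechanism concrete, here is a minimal sketch of the TrAdaBoost weight update, assuming binary labels in {0, 1} and a decision stump as the weak learner; the function name, hyperparameters, and data layout are our own illustrative choices, not the notation of [3]:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def tradaboost(X_src, y_src, X_tgt, y_tgt, n_rounds=10):
    """Sketch of TrAdaBoost's weight updates; labels assumed to be in {0, 1}."""
    n_src = len(X_src)
    X = np.vstack([X_src, X_tgt])
    y = np.concatenate([y_src, y_tgt])
    w = np.ones(len(X)) / len(X)                    # uniform initial weights
    # Fixed down-weighting factor for misclassified source instances.
    beta_src = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n_src) / n_rounds))
    learners, betas = [], []
    for _ in range(n_rounds):
        clf = DecisionTreeClassifier(max_depth=1)   # weak learner
        clf.fit(X, y, sample_weight=w / w.sum())
        err = np.abs(clf.predict(X) - y)            # 1 where misclassified
        # Weighted error is measured on the target domain only.
        eps = np.sum(w[n_src:] * err[n_src:]) / np.sum(w[n_src:])
        eps = np.clip(eps, 1e-10, 0.499)
        beta_t = eps / (1.0 - eps)
        # Misclassified target instances gain weight ...
        w[n_src:] *= beta_t ** (-err[n_src:])
        # ... while misclassified source instances lose weight.
        w[:n_src] *= beta_src ** err[:n_src]
        learners.append(clf)
        betas.append(beta_t)
    return learners, betas
```

The final hypothesis in [3] votes only over the learners from the second half of the rounds; that voting step is omitted here for brevity.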
Structural Correspondence Learning (SCL) [4] is a transductive transfer learning approach based on feature-representation transfer. It defines features that behave similarly in both domains as pivot features and the rest as non-pivot features, and then identifies correlation mapping functions between these features.
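The following sketch illustrates the SCL idea under simplifying assumptions: pivot features are given as column indices, each pivot's presence is predicted from the non-pivot features with a linear classifier, and the top singular vectors of the stacked predictor weights form the shared projection. All names and parameters here are hypothetical:

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

def scl_projection(X_src, X_tgt, pivot_idx, k=50):
    """Sketch of SCL: learn a low-dimensional correspondence projection."""
    X = np.vstack([X_src, X_tgt])                   # unlabeled union of domains
    nonpivot_idx = np.setdiff1d(np.arange(X.shape[1]), pivot_idx)
    W = []
    for p in pivot_idx:
        # Predict each pivot feature's presence from the non-pivot features;
        # pivots are assumed frequent enough that both classes occur.
        target = (X[:, p] > 0).astype(int)
        clf = SGDClassifier(loss="modified_huber", alpha=1e-4)
        clf.fit(X[:, nonpivot_idx], target)
        W.append(clf.coef_.ravel())
    # Top-k left singular vectors of the stacked pivot predictors give the
    # correspondence projection theta.
    U, _, _ = np.linalg.svd(np.array(W).T, full_matrices=False)
    theta = U[:, :k]
    return nonpivot_idx, theta
```

A classifier is then trained on features augmented with the projected representation, e.g. `np.hstack([X, X[:, nonpivot_idx] @ theta])`, so that correspondences learned from both domains carry over to the target domain.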
Our proposed algorithm belongs to transductive transfer learning and applies both instance transfer and feature-representation transfer. The most important difference, however, is that we deal with items that carry diverse labels in different domains; these items serve as a bridge connecting the domains.
3 Methodology
3.1 Problem Definition
We start by formulating the problem. Suppose a high-quality domain dataset $D^H$ and $N$ different low-quality domain datasets $D^{L_j}$, where $1 \le j \le N$, are given. We define the high-quality domain data as $D^H = \{(x_1^H, y_1^H), \ldots, (x_{n_H}^H, y_{n_H}^H)\}$, where $n_H$ is the number of instances in $D^H$, $x_i^H \in X^H$ represents the features of an instance, and $y_i^H \in Y^H$ is the corresponding label. Here we assume the low-quality domain data can come from multiple sources, defined as $D^L = \{D^{L_1}, \ldots, D^{L_N}\}$ with $|D^L| = n_L$. The low-quality domain data from each source can be presented as $D^{L_j} = \{(x_1^{L_j}, y_1^{L_j}), \ldots, (x_{n_{L_j}}^{L_j}, y_{n_{L_j}}^{L_j})\}$, where $n_{L_j}$ is the number of instances in $D^{L_j}$, $x_i^{L_j} \in X^{L_j}$, and $y_i^{L_j} \in Y^{L_j}$. Moreover, we assume that instances in $D^H$ contain high-quality labels and fine-grained features, while those in $D^L$ have coarse-grained features and noisy labels. Note that in general we assume $n_H \ll n_L$, as obtaining high-quality data is more expensive and time-consuming.
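As a concrete reading of this notation, the sketch below sets up synthetic stand-ins for $D^H$ and $D^{L_1}, \ldots, D^{L_N}$; all shapes, counts, and the Domain container are hypothetical and only illustrate the assumed asymmetry ($n_H \ll n_L$, fine- vs. coarse-grained features):

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Domain:
    X: np.ndarray   # feature matrix, one row per instance
    y: np.ndarray   # labels aligned with the rows of X

rng = np.random.default_rng(0)
n_H, n_L, N = 100, 6000, 3        # n_H << n_L: high-quality data is scarce
d_H, d_L = 50, 10                 # fine-grained vs. coarse-grained features

# High-quality domain: few instances, rich features, clean labels.
D_H = Domain(X=rng.normal(size=(n_H, d_H)),
             y=rng.integers(0, 2, size=n_H))

# Low-quality domains D^{L_1} ... D^{L_N}: many instances, coarse features,
# and labels that would be noisy in practice.
D_L = [Domain(X=rng.normal(size=(n_L // N, d_L)),
              y=rng.integers(0, 2, size=n_L // N))
       for _ in range(N)]

assert n_H < sum(len(d.y) for d in D_L)   # the n_H << n_L assumption
```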