A Transfer-Learning Approach to Exploit Noisy Information for Classification and Its Application on Sentiment Detection - Technologies and Applications of Artificial Intelligence


A Transfer-Learning Approach to Exploit Noisy Information for Classification and Its Application on Sentiment Detection

Wei-Shih Lin 1 , Tsung-Ting Kuo 1 , Yu-Yang Huang 1 , Wan-Chen Lu 2 , Shou-De Lin 1

1 Department of Computer Science & Information Engineering, National Taiwan University

{r00922013,d97944007,r02922050,sdlin}@csie.ntu.edu.tw

2 Telecommunication Laboratories, Chunghwa Telecom Co., Ltd

janelu@cht.com.tw

Abstract. This research proposes a novel transfer learning algorithm, Noise-Label Transfer Learning (NLTL), which exploits training data that is noisy in both labels and features to improve learning quality. We exploit the information in both accurate and noisy data by transferring the features into a common domain and adjusting the weights of instances for learning. We experiment on three University of California Irvine (UCI) datasets and one real-world dataset (Plurk) to evaluate the effectiveness of the model.

Keywords: Transfer Learning, Sentiment Diffusion Prediction, Novel Topics.

1 Introduction

This paper addresses the situation where there is insufficient expert-labelled, high-quality training data by exploiting low-quality data with imprecise features and noisy labels. We generalize the task as a classification-with-noisy-data problem, which assumes that both the features and the labels of some training data are noisy, similar to [1]. More specifically, we have two different domains of labeled training data. The first, which we call the high-quality data domain, contains data with high-quality labels and fine-grained features. We assume such data is costly to obtain, so only a small amount of it is available. The other, the low-quality data domain, contains data with noisy labels and coarse-grained features. Unlike high-quality data, this data can be available in large volume.

The example we use throughout this paper to describe our idea is the compulsive buyer prediction problem, given transaction data from different online stores (e.g., Amazon, eBay). Assume that users' transaction records from different online websites are obtained as training data for a compulsive buyer classification model. As shown in Fig. 1, some features are common to users across these stores, such as gender and month of birth. However, other features are shared across stores but differ in granularity because of different registration processes. For instance, age can be exact (e.g., 25 years old) or given as a range (e.g., 20~30), and the same applies to locale and job categories.
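To make the granularity mismatch concrete, the following is a minimal sketch of how an exact age from one store and an age range from another could be brought to a shared coarse representation. It is not the NLTL procedure itself. The record layout, the field names (`age`, `age_range`), the helper names, and the 10-year half-open buckets are all assumptions made for illustration.

```python
def age_to_range(age, width=10):
    """Map an exact age to a coarse half-open bucket [low, low + width).

    Assumption: buckets are aligned to multiples of `width`, so 25 -> (20, 30).
    """
    low = (age // width) * width
    return (low, low + width)


def to_common(record):
    """Project a user record onto features that both stores can express.

    Hypothetical layout: a fine-grained record carries an exact `age`,
    a coarse-grained record carries an `age_range` tuple instead.
    """
    common = {"gender": record["gender"]}
    if "age" in record:
        # Fine-grained store: coarsen the exact age.
        common["age_range"] = age_to_range(record["age"])
    else:
        # Coarse-grained store: the range is already in the shared form.
        common["age_range"] = tuple(record["age_range"])
    return common


# Hypothetical records from two stores with different registration forms.
fine_records = [{"gender": "F", "age": 25}, {"gender": "M", "age": 41}]
coarse_records = [{"gender": "F", "age_range": (20, 30)}]

common = [to_common(r) for r in fine_records + coarse_records]
```

Coarsening the fine-grained feature discards information, which is why this is only one possible way to obtain a shared representation; the same idea would apply to locale and job categories.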
