3.2   Noise-Label Transfer Learning (NLTL)
We propose NLTL, a transfer learning model for the above-mentioned problem. The overall architecture is shown in Fig. 3. The idea is to transfer information from low-quality domain data to improve prediction in the high-quality domain, which has insufficient training instances. Note that for each object, we may integrate corresponding instances from multiple low-quality data sources. NLTL first uses instances that exist in both the high-quality and low-quality domains as a bridge to identify the correlation between coarse-grained and fine-grained features. It then learns the weight of instances from each domain to train a binary classifier that predicts testing data in the high-quality domain. Note that we perform feature transfer on both training and testing data; however, only training data are used to learn the instance weights, since testing data are not labeled. We define NLTL in Algorithm 1. Feature transfer is performed using Structural Correspondence Learning (SCL) [4] (Steps 1 to 4, see 3.3), and TrAdaBoost [3] is used to tune the weight of instances (Steps 5 to 12, see 3.4).
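To make the reweighting phase (Steps 5 to 12) concrete, the sketch below shows a TrAdaBoost-style loop in the spirit of [3]. It is a minimal illustration, assuming binary 0/1 labels and decision stumps as weak learners; the function and variable names (tradaboost, X_src, X_tgt, n_rounds) are our own illustrative choices, not the authors' implementation.

```python
# Minimal TrAdaBoost-style instance reweighting sketch (cf. Steps 5-12).
# Assumptions: binary 0/1 labels, decision stumps as weak learners.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def tradaboost(X_src, y_src, X_tgt, y_tgt, n_rounds=10):
    """X_src/y_src: transferred low-quality (source) instances.
    X_tgt/y_tgt: labeled high-quality (target) instances."""
    n_src = len(X_src)
    X = np.vstack([X_src, X_tgt])
    y = np.concatenate([y_src, y_tgt])
    w = np.ones(len(y)) / len(y)                      # uniform initial weights
    beta_src = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n_src) / n_rounds))
    learners, betas = [], []
    for _ in range(n_rounds):
        h = DecisionTreeClassifier(max_depth=1)
        h.fit(X, y, sample_weight=w / w.sum())
        miss = np.abs(h.predict(X) - y)               # 1 where misclassified
        # weighted error measured on the target (high-quality) portion only
        eps = np.dot(w[n_src:], miss[n_src:]) / w[n_src:].sum()
        eps = np.clip(eps, 1e-10, 0.499)              # keep beta_t in (0, 1)
        beta_t = eps / (1.0 - eps)
        w[:n_src] *= beta_src ** miss[:n_src]         # shrink bad source weights
        w[n_src:] *= beta_t ** -miss[n_src:]          # boost hard target weights
        learners.append(h)
        betas.append(beta_t)
    # the final hypothesis votes with the learners from the later rounds
    return learners, betas
```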
3.3   Feature Transferring
We want to handle the problem that features in the low-quality domain are coarser than those in the high-quality domain. The goal is to identify a mapping function that projects features in the low-quality domain onto the high-quality domain by changing their distributions.
We propose a method based on Structural Correspondence Learning (SCL) [4]. The intuition is to identify correspondences among features from different domains by modeling their correlation with features that have similar distributions in both domains. To transfer the low-quality data into the high-quality domain, each feature in the low-quality domain must be mapped to the more fine-grained high-quality domain. We propose to perform this mapping with a prediction model. That is, for each feature in the high-quality domain, we create a classification or regression model (for categorical and numerical features, respectively) to predict its value given each corresponding instance in the low-quality domain. Assume a user u appears in both the high-quality domain (with feature vector, denoted $x_u^H$, of {“Male”, “22”, “May”, “Taipei”, “Software Engineer”}) and the low-quality domain (with feature vector, denoted $x_u^L$, of {“Male”, “20 to 30”, “May”, “Taiwan”, “Engineer”}). $x_u^H$ will of course be used as a training example to learn a compulsive-user model, but we want to use $x_u^L$ as well to enlarge the training set. Therefore, for each feature in the high-quality domain, we create a classifier that maps $x_u^L$ to a corresponding value. In our example, we build four classifiers and one regressor (for the 'age' feature), each of which takes an instance in the low-quality domain as input and outputs a possible assignment for the fine-grained feature.
We denote these models collectively as the mapping function $\phi$, which models the correlation between features from different domains. In the experiment we use linear regression to learn $\phi$.
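As a rough illustration, the sketch below assembles $\phi$ by fitting one model per high-quality feature on the users shared by both domains: linear regression for the numeric 'age' feature, and, as one plausible choice for the categorical features, logistic regression. The function names, encoding, and column layout are hypothetical, not the authors' implementation.

```python
# Illustrative construction of the mapping function phi: one model per
# fine-grained (high-quality) feature, trained on the overlapping users.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.preprocessing import OneHotEncoder

def fit_phi(X_low, X_high, numeric_cols):
    """X_low, X_high: row-aligned feature matrices (object dtype) for users
    in both domains; numeric_cols: indices of numeric high-quality features."""
    enc = OneHotEncoder(handle_unknown="ignore")
    Z = enc.fit_transform(X_low)                 # encode coarse-grained features
    models = []
    for j in range(X_high.shape[1]):
        if j in numeric_cols:                    # e.g. the 'age' feature
            m = LinearRegression().fit(Z, X_high[:, j].astype(float))
        else:                                    # categorical features
            m = LogisticRegression(max_iter=1000).fit(Z, X_high[:, j])
        models.append(m)
    return enc, models

def apply_phi(enc, models, X_low):
    """Project low-quality instances into the fine-grained feature space."""
    Z = enc.transform(X_low)
    return np.column_stack([m.predict(Z) for m in models])
```

Once fitted, apply_phi can be run over every low-quality instance to produce fine-grained feature vectors, which then join the high-quality training set for the reweighting phase of Algorithm 1.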
· · ·