Table 1. Experiment Results in AUC

Method            CTG       Magic     Wine
High-Quality      88.35%    75.49%    69.95%
Low-Quality_1     85.58%    76.91%    73.89%
Low-Quality_2     85.14%    58.81%    66.28%
All Instance      89.20%    78.22%    75.37%
TrAdaBoost        89.56%    81.28%    76.52%
SCL               86.58%    76.04%    69.42%
Label-Powerset    85.77%    78.82%    71.29%
NLTL              91.83%    81.71%    76.63%
4 Experiments
4.1 Dataset and Settings
We test our model on three datasets (CTG, Magic, and Wine) collected from the UCI
Machine Learning Repository [5]. We preprocess the labels into binary classes in our
experiment. The three datasets contain 2126, 19020, and 6497 instances with 21, 10,
and 11 features, respectively. For each dataset, we use the original features and labels as
the high-quality domain data. To generate noisy low-quality domain data, we randomly
pick c% of the instances, flip their labels, and make their features coarser. For
example, for a numerical feature, we quantize its values into K groups and assign each
group's median as the new feature value. In our experiment, we generate two
low-quality domain datasets with (c, K) = (20, 5) and (c, K) = (50, 10). To
reflect the fact that correctly labeled data are rare, we randomly choose 10% of the
high-quality domain data for training and keep the rest for testing. We use 4-fold
cross-validation for evaluation.
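A minimal sketch of the low-quality data generation described above; the function name make_low_quality, the equal-frequency binning, and the choice of NumPy are illustrative assumptions, not the original implementation:

import numpy as np

def make_low_quality(X, y, c, K, seed=None):
    """Return a noisy low-quality copy of (X, y): flip the binary labels of
    c% of the instances and quantize every numerical feature into K groups,
    replacing each value with the median of its group."""
    rng = np.random.default_rng(seed)
    X_lq, y_lq = X.astype(float).copy(), y.copy()

    # Flip the labels of a randomly chosen c% of the instances (labels are 0/1).
    n_flip = int(len(y_lq) * c / 100.0)
    flip_idx = rng.choice(len(y_lq), size=n_flip, replace=False)
    y_lq[flip_idx] = 1 - y_lq[flip_idx]

    # Coarsen each feature: split it into K equal-frequency groups (an
    # assumption; the text only says "K groups") and substitute the group
    # median for every original value.
    for j in range(X_lq.shape[1]):
        col = X_lq[:, j]
        edges = np.quantile(col, np.linspace(0.0, 1.0, K + 1))
        groups = np.clip(np.searchsorted(edges, col, side="right") - 1, 0, K - 1)
        for g in range(K):
            mask = groups == g
            if mask.any():
                col[mask] = np.median(col[mask])
        X_lq[:, j] = col

    return X_lq, y_lq

# The two low-quality domains used in the experiment:
# X_lq1, y_lq1 = make_low_quality(X, y, c=20, K=5)
# X_lq2, y_lq2 = make_low_quality(X, y, c=50, K=10)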
We choose the area under the ROC curve (AUC) as the evaluation metric because of the
class imbalance in the data. We rank the testing instances based on the predicted positive
probability and then compare the ranking to the ground-truth labels to compute the AUC.
For weight tuning, we manually set the largest weight to 10 and α to 0.7. That is, the
second largest weight is 7, the third is 4.9, and so on. We compare our model with three
types of algorithms: traditional non-transfer learning (High-Quality, Low-Quality_1,
Low-Quality_2, and All Instance), transfer learning (TrAdaBoost and SCL), and
multi-label (Label-Powerset) algorithms.
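A brief sketch of the geometric weighting and the AUC evaluation described above; the function names geometric_weights and evaluate_auc, and the use of scikit-learn's roc_auc_score, are illustrative assumptions rather than the original implementation:

import numpy as np
from sklearn.metrics import roc_auc_score

# Geometrically decaying weights: the largest weight is 10 and each following
# weight is multiplied by alpha = 0.7, giving 10, 7, 4.9, ...
def geometric_weights(n, top=10.0, alpha=0.7):
    return top * alpha ** np.arange(n)

# AUC evaluation: rank the test instances by predicted positive probability
# and compare the ranking to the ground-truth labels.  `model` stands for any
# classifier exposing predict_proba (a placeholder, not a specific model).
def evaluate_auc(model, X_test, y_test):
    pos_prob = model.predict_proba(X_test)[:, 1]
    return roc_auc_score(y_test, pos_prob)

print(geometric_weights(4))  # roughly [10.  7.  4.9  3.43]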
4.2 Results
We show the results comparing NLTL with the other baselines in Table 1. NLTL achieves
the best AUC on all three datasets.