The interactive brushing technique lets the user select (brush) an area of the
displayed data to highlight groups of data points. Linking propagates that
selection to the other views, so multiple linked views provide more information
than any single view. We use interactive brushing and linking together with
different visualization methods to try to explain SVM results.
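
Brushing and linking can be pictured as a selection of row indices shared
between views. The following is a minimal sketch under that assumption; the
function names `brush` and `linked_view` and the toy data are hypothetical and
are not taken from the system described here.

```python
def brush(values, low, high):
    """Return the indices of the rows whose value lies in the brushed interval [low, high]."""
    return {i for i, v in enumerate(values) if low <= v <= high}


def linked_view(rows, selected):
    """Tag every row of another view as highlighted (True) or not (False)."""
    return [(row, i in selected) for i, row in enumerate(rows)]


# Hypothetical toy data: view 1 shows a 1D quantity, view 2 a 2D scatter plot.
distances = [0.1, 2.3, -0.2, 1.7, -3.0]
points_2d = [(1.0, 2.0), (4.0, 0.5), (1.2, 1.9), (3.5, 0.7), (-2.0, 5.0)]

# Brush the interval around zero in view 1 ...
selected = brush(distances, -0.5, 0.5)
# ... and collect the same rows as highlighted points in view 2.
highlighted = [p for p, on in linked_view(points_2d, selected) if on]
```

Because both views refer to the same row indices, any view can act as the
source of the selection and every other view can display it.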
5.5.1 Support Vector Classification Results
For classification tasks with SVM algorithms, understanding the margin (the
distance separating the +1 class from the -1 class, which the SVM maximizes) is
one of the keys to interpreting support vector classification. For this purpose,
we need to display the points near the separating boundary between the two
classes. To achieve this, we propose to use the data distribution according to
the distance from the separating surface, which we compute while the
classification task is processed (based on the support vectors). For each class,
the positive distribution is the set of correctly classified data points and the
negative distribution is the set of misclassified data points. The data points
near the frontier correspond to the bars near the origin. When the user selects
these bars, the corresponding data points are also selected in the other views
(visualization methods) through brushing and linking. We use 2D scatter-plot
matrices for visualizing interval data. The user can see approximately where the
boundary between classes lies and how wide the margin is, which helps to
evaluate the robustness of the model obtained by support vector classification.
The user can also identify the interesting dimensions (those whose projections
provide a clear boundary between the two classes) in the obtained model.
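
The distribution according to the distance from the separating surface can be
sketched as follows. This is an illustration only: it assumes a linear
separating surface w.x + b = 0 with labels +1 and -1 (for a kernel SVM the
decision value would take the place of the linear term), and the function names,
the bin width and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def signed_distance(w, b, x, y):
    """Distance of x from the hyperplane w.x + b = 0.

    The sign is positive when the point with label y (+1 or -1) is correctly
    classified and negative when it is misclassified.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def distance_histogram(w, b, X, Y, bin_width):
    """Group point indices by (class label, bin index) of their signed distance."""
    bins = defaultdict(list)
    for i, (x, y) in enumerate(zip(X, Y)):
        d = signed_distance(w, b, x, y)
        bins[(y, math.floor(d / bin_width))].append(i)
    return bins


def brush_near_frontier(bins, max_bin):
    """Return the indices held by the bars within max_bin bars of the origin."""
    return sorted(i for (_, k), idx in bins.items()
                  if -max_bin <= k < max_bin for i in idx)


# Hypothetical toy model and data (assumption: the surface is x1 = 0).
w, b = (1.0, 0.0), 0.0
X = [(0.2, 1.0), (3.0, -1.0), (-0.4, 2.0), (-2.5, 0.0), (0.3, 0.5)]
Y = [1, 1, -1, -1, -1]

bins = distance_histogram(w, b, X, Y, bin_width=0.5)
near = brush_near_frontier(bins, max_bin=1)  # indices to highlight in other views
```

The indices returned by `brush_near_frontier` are what a linked scatter-plot
matrix would highlight; bars with a negative bin index hold the misclassified
points of each class.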
Figure 5.6 is an example of visualizing support vector classification results
with the Segment interval dataset (class 7 against all). In the data
distribution according to the distance from the separating surface, the four
bars near the origin are brushed, and the corresponding points are then linked
and displayed in 2D scatter-plot matrices. From the upper part of figure 5.6, we
can conclude that there is a clear boundary between the two classes (there is no
misclassified data point). From the lower part, we can see that dimensions 2 and
16, which show a clear boundary between the two classes, are the interesting
ones in the obtained model.
5.5.2 Support Vector Regression Results
We have extended this idea to visualizing support vector regression results.
We again compute the data distribution according to the distance from the
regression function, and then combine the histogram with 2D scatter-plot
matrices for visualization. By selecting the data points far from the
regression function, the user can see how well the function fits the data. If
the function predicts the data points in high-density regions well, then the
obtained model is interesting.
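
The same idea for regression can be sketched with residuals. This is a minimal
illustration that assumes the distance from the regression function is the
residual y - f(x); the fitted function `f`, the threshold and the toy data are
hypothetical stand-ins, not results from this section.

```python
import math
from collections import Counter


def residuals(f, X, Y):
    """Signed distance of each target value from the regression function."""
    return [y - f(x) for x, y in zip(X, Y)]


def residual_histogram(res, bin_width):
    """Count the points that fall in each residual bin."""
    return Counter(math.floor(r / bin_width) for r in res)


def brush_far_points(res, threshold):
    """Return the indices of the points farther than threshold from the function."""
    return [i for i, r in enumerate(res) if abs(r) > threshold]


# Hypothetical fitted function and data.
f = lambda x: 2.0 * x + 1.0
X = [0.0, 1.0, 2.0, 3.0, 4.0]
Y = [1.0, 3.5, 5.0, 4.0, 9.0]

res = residuals(f, X, Y)
hist = residual_histogram(res, bin_width=1.0)
far = brush_far_points(res, threshold=1.0)  # indices to highlight in other views
```

A histogram concentrated in the bins next to zero, with only a few brushed
points far away, corresponds to a function that fits the bulk of the data well.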