This overall impression indicates that bootstrapping indeed produces very different
models, and we see a confirmation of the tree model instability for this dataset.
With large numbers of trees, an alternative representation based on parallel
coordinate plots can be used. Each coordinate corresponds to a tree and each case to
a variable. The value of the cumulative gain for each combination of tree and variable
is then plotted on the axes. The ordering of axes is important to obtain a coherent
picture. Some possible heuristics include ordering by the value of the most influential
variable and distance measures based on the global weight of each variable.
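
The first ordering heuristic can be sketched in a few lines. This is a minimal illustration, not the implementation behind the plots discussed here: the gain table is made up, the variable name oleic is added only for the example, and "most influential" is assumed to mean the largest total gain over all trees. The functions most_influential, order_axes and polylines are hypothetical helpers.

```python
# Hypothetical cumulative-gain table: one row per tree, one column per variable.
# The numbers are made up for illustration only.
variables = ["linoleic", "palmitoleic", "oleic"]
gain = [
    [0.50, 0.30, 0.20],  # tree 0
    [0.10, 0.60, 0.30],  # tree 1
    [0.70, 0.20, 0.10],  # tree 2
    [0.40, 0.40, 0.20],  # tree 3
]


def most_influential(gain):
    """Index of the variable with the largest total gain over all trees."""
    n_vars = len(gain[0])
    totals = [sum(row[j] for row in gain) for j in range(n_vars)]
    return max(range(n_vars), key=totals.__getitem__)


def order_axes(gain):
    """Order tree axes by decreasing gain of the most influential variable."""
    top = most_influential(gain)
    return sorted(range(len(gain)), key=lambda t: gain[t][top], reverse=True)


def polylines(gain, axis_order):
    """One polyline per variable: its gain read off each tree axis in order."""
    n_vars = len(gain[0])
    return [[gain[t][j] for t in axis_order] for j in range(n_vars)]


axis_order = order_axes(gain)
lines = polylines(gain, axis_order)
for name, line in zip(variables, lines):
    print(name, line)
```

Each printed list is one case of the parallel coordinate plot: the gain of a single variable traced across the reordered tree axes. With this ordering the most influential variable forms a monotone line, which makes deviations in the other variables easier to spot.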
10.3.2 Data View
The importance and use of variables in splits is just one aspect of the tree models to
consider. In Sect. . . , we discussed another way of visualizing trees that allowed
an assessment of cut points in the data context: sectioned scatterplots. Fortunately,
sectioned scatterplots can also be used for the visualization of forests, preferably using
semitransparent partition boundaries.
Such a sectioned scatterplot of a forest is shown in Fig. . . To make the
classification more difficult, we have increased the granularity of the response variable of
the olive oil data to nine regions. The sectioned scatterplot displays variables linoleic
vs. palmitoleic and partition boundaries of bootstrapped trees. The use of
semitransparent boundaries allows us to distinguish between occasionally used cut points
that are shown as very faint lines and frequently used cut points shown in dark blue.
Figure . . [This figure also appears in the color insert.] Sectioned scatterplot of a forest of trees
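
The effect of semitransparent boundaries can be reproduced numerically. The sketch below is a simplification under stated assumptions: it uses made-up one-dimensional data rather than the olive oil data, fits a single Gini-based split (a stump) per bootstrap sample instead of a full tree, and assumes the usual alpha compositing rule, under which k overlaid lines of opacity alpha have a combined opacity of 1 - (1 - alpha)^k. All function names and the value alpha = 0.05 are hypothetical.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_cut(xs, ys):
    """Cut point on one variable minimizing weighted Gini impurity (a stump).

    Returns None if the sample has fewer than two distinct x values.
    """
    values = sorted(set(xs))
    best, best_score = None, float("inf")
    for lo, hi in zip(values, values[1:]):
        cut = (lo + hi) / 2.0
        left = [y for x, y in zip(xs, ys) if x < cut]
        right = [y for x, y in zip(xs, ys) if x >= cut]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(xs)
        if score < best_score:
            best, best_score = cut, score
    return best


def bootstrap_cuts(xs, ys, n_trees, seed=0):
    """Fit one stump per bootstrap sample and collect the cut points."""
    rng = random.Random(seed)
    n = len(xs)
    cuts = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        cut = best_cut([xs[i] for i in idx], [ys[i] for i in idx])
        if cut is not None:
            cuts.append(cut)
    return cuts


def stacked_opacity(count, alpha):
    """Opacity of `count` overlaid lines, each drawn with opacity `alpha`."""
    return 1.0 - (1.0 - alpha) ** count


# Made-up one-dimensional data: two classes that overlap around x = 5.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
ys = ["a", "a", "a", "a", "b", "a", "b", "b", "b", "b"]

cuts = bootstrap_cuts(xs, ys, n_trees=100)
for cut, count in sorted(Counter(cuts).items()):
    print(f"cut at {cut:4.1f}: used {count:3d} times, "
          f"opacity {stacked_opacity(count, alpha=0.05):.2f}")
```

Cut points chosen by only a few bootstrap samples stay close to the base opacity and appear faint, while cut points chosen repeatedly accumulate toward full opacity. In a real sectioned scatterplot the same idea applies to both plotted variables and to every split of every tree.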