precision when individual performance is considered, yet it was not even part of the SMB decision
making! The most important matcher for SMB was (Term, Intersection), ranked 11th according to
individual performance in terms of F-Measure and 10th in terms of precision. (Precedence, SM),
ranked second for SMB, has mediocre individual performance. Figure 4.2 (bottom) highlights the
performance (on a precision vs. recall scale) of the four top matchers of SMB.
Our first observation is that the decision making of SMB is not linear in the individual
performance of matchers, and therefore the SMB training process is valuable. Second, we observe that SMB
seeks diversity in its decision making. It uses Term, Value (combined with Term because of its poor
individual performance), Composition, and Precedence. Given these four matchers, SMB has no need for
the Combined matcher, which provides a weighted average of the four. This explains the absence of
(Combined, Dominants).
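The two observations above can be illustrated with a minimal sketch, assuming an AdaBoost-style training loop over a fixed pool of matchers. The matcher names (`strong`, `redundant`, `complementary`), the toy labels, and the function `adaboost_select` are hypothetical and are not taken from the experiments discussed here. The sketch shows how reweighting the training examples after each round can favor a matcher with weak individual performance over one that repeats the mistakes of an already selected matcher.

```python
import math

def adaboost_select(predictions, labels, rounds):
    """Pick matchers AdaBoost-style from a fixed pool (illustrative only).

    predictions: dict mapping a matcher name to a list of +1/-1 decisions.
    labels: list of +1/-1 ground-truth decisions.
    Returns a list of (matcher name, weight) pairs, one per round.
    """
    n = len(labels)
    weights = [1.0 / n] * n  # start from a uniform distribution over examples
    chosen = []
    for _ in range(rounds):
        # Weighted error of every matcher under the current distribution.
        errors = {
            name: sum(w for w, p, y in zip(weights, preds, labels) if p != y)
            for name, preds in predictions.items()
        }
        best = min(errors, key=errors.get)
        eps = max(errors[best], 1e-10)  # guard against a perfect matcher
        if eps >= 0.5:
            break  # no matcher is better than chance on this distribution
        alpha = 0.5 * math.log((1 - eps) / eps)
        chosen.append((best, alpha))
        # Raise the weight of examples the chosen matcher got wrong.
        preds = predictions[best]
        weights = [w * math.exp(-alpha * y * p)
                   for w, p, y in zip(weights, preds, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return chosen

# Toy data (assumed): eight candidate correspondences, four correct, four not.
LABELS = [1, 1, 1, 1, -1, -1, -1, -1]
PREDICTIONS = {
    "strong":        [-1, 1, 1, 1, -1, -1, -1, -1],   # 1 error, best alone
    "redundant":     [-1, -1, 1, 1, -1, -1, -1, -1],  # 2 errors, overlaps "strong"
    "complementary": [1, 1, 1, 1, -1, 1, 1, 1],       # 3 errors, worst alone
}

for name, alpha in adaboost_select(PREDICTIONS, LABELS, rounds=2):
    print(f"{name}: weight {alpha:.3f}")
```

In this toy run the second pick is the matcher with the worst individual error, because it is correct exactly where the first pick fails, while the second-best individual matcher is never selected. This mirrors the non-linearity and the preference for diversity noted above, under the stated assumptions only.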
As a final remark, Duchateau et al. [2008] discuss a different aspect of ensemble construction.
In their work, a set of matchers is built into a decision tree. Then, at run time and based on
intermediate results, the ensemble adapts itself to the needs of the specific matching instance. This
setting can be considered a run-time dynamic ensemble construction, as opposed to the design-time
construction of SMB.
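One way to picture such a run-time ensemble is sketched below. It is a simplified illustration, not the construction of Duchateau et al.: the `Node` class, the two toy matchers (`same_name` and `token_overlap`), and the thresholds are all assumptions made for the example. Each internal node runs one matcher, and the score it returns decides which matcher, if any, is run next.

```python
import re
from dataclasses import dataclass
from typing import Callable, Union

@dataclass
class Node:
    matcher: Callable[[str, str], float]  # similarity score in [0, 1]
    threshold: float
    low: Union["Node", bool]    # followed when score < threshold
    high: Union["Node", bool]   # followed when score >= threshold

def same_name(a: str, b: str) -> float:
    """Hypothetical cheap matcher: case-insensitive name equality."""
    return 1.0 if a.lower() == b.lower() else 0.0

def token_overlap(a: str, b: str) -> float:
    """Hypothetical matcher: Jaccard overlap of underscore-separated tokens."""
    ta = set(t for t in re.split(r"_+", a.lower()) if t)
    tb = set(t for t in re.split(r"_+", b.lower()) if t)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def decide(node: Union[Node, bool], a: str, b: str) -> bool:
    """Walk the tree; only the matchers on the visited path are executed."""
    while isinstance(node, Node):
        score = node.matcher(a, b)
        node = node.high if score >= node.threshold else node.low
    return node

# The cheap matcher runs first; the token matcher runs only if it fails.
TREE = Node(same_name, 1.0,
            low=Node(token_overlap, 0.5, low=False, high=True),
            high=True)

print(decide(TREE, "Name", "name"))
print(decide(TREE, "first_name", "name_first"))
print(decide(TREE, "price", "zip"))
```

The point of the design is that the set of matchers actually applied differs from one attribute pair to the next, whereas a design-time ensemble fixes the participating matchers and their weights before any matching instance is seen.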