4.4.2.2 Evaluation of Context-Embedded Visual Recognition
Image contextual information and its effectiveness in the recognition-by-search
technique are investigated using the soft weighting scheme. For the bivariate
function of (x, y), the amplitude A was fixed to 1, and the two parameters α and β
were tuned to modulate the standard deviation. Two sets of experiments were
conducted, without and with GPS context, as shown in Figs. 4.12 and 4.13,
respectively. In general, the soft weighting scheme improves search performance
compared with the binary weighting method. Specifically, in Fig. 4.12, α = 50 and
β = 10 provide the best performance for both the MAP and NDCG measures. With this
parameter choice, the soft weighting scheme outperforms the binary weighting
method by 12 % in MAP and 15 % in NDCG.
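The difference between the two weighting schemes can be illustrated with a minimal sketch. It assumes that the bivariate function is a Gaussian centered on the region of interest (ROI), and that α and β scale the horizontal and vertical standard deviations relative to the ROI half-width and half-height. Both the functional form and this parameterization are assumptions made for illustration, as are the names binary_weight, soft_weight, and weighted_term_frequency; the α and β values used below are illustrative and are not those reported in the figures.

```python
import math
from collections import defaultdict


def binary_weight(x, y, roi):
    """Binary scheme: weight 1 inside the ROI bounding box, 0 outside."""
    x0, y0, x1, y1 = roi
    return 1.0 if (x0 <= x <= x1 and y0 <= y <= y1) else 0.0


def soft_weight(x, y, roi, alpha, beta, amplitude=1.0):
    """Soft scheme (ASSUMED form): bivariate Gaussian centered on the ROI.

    sigma_x and sigma_y are assumed to be the ROI half-width and half-height
    scaled by alpha and beta; the chapter's own parameterization may differ.
    """
    x0, y0, x1, y1 = roi
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    sigma_x = alpha * (x1 - x0) / 2.0
    sigma_y = beta * (y1 - y0) / 2.0
    exponent = ((x - cx) ** 2) / (2.0 * sigma_x ** 2) \
             + ((y - cy) ** 2) / (2.0 * sigma_y ** 2)
    return amplitude * math.exp(-exponent)


def weighted_term_frequency(features, weight_fn):
    """Accumulate a weighted visual-word histogram.

    features: iterable of (word_id, x, y) tuples (hypothetical layout).
    weight_fn: callable mapping (x, y) to a weight.
    """
    tf = defaultdict(float)
    for word_id, x, y in features:
        tf[word_id] += weight_fn(x, y)
    return dict(tf)


if __name__ == "__main__":
    roi = (40, 40, 60, 60)  # x0, y0, x1, y1 of a roughly outlined region
    # Two visual words: one at the ROI center, one just outside its edge.
    features = [(7, 50, 50), (9, 62, 50)]
    print(weighted_term_frequency(features, lambda x, y: binary_weight(x, y, roi)))
    print(weighted_term_frequency(
        features, lambda x, y: soft_weight(x, y, roi, alpha=2.0, beta=1.0)))
```

Under the binary scheme the visual word just outside the outline contributes nothing, whereas under the soft scheme it still contributes a weight between 0 and 1, so a slightly misdrawn outline degrades the query gradually instead of discarding nearby features.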
Similarly, after incorporating the GPS context, the soft weighting method again
outperformed the binary one, but in a much higher precision range. This is not
surprising, since geolocation is an important feature for differentiating objects,
recognizing them, and eventually inferring the associated visual intent. In
contrast to the non-GPS scenario, Fig. 4.13 shows that α = 5 and β = 1 outperform
the other parameter choices, as well as the baseline binary weighting scheme. The
margin between the soft weighting and the binary case dropped to 2 % for MAP and
less than 1 % for NDCG. This result demonstrates the importance of the GPS context.
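For readers who want to reproduce this kind of comparison, the two measures can be sketched as follows. This is a minimal illustration of the standard definitions of average precision (averaged over queries to give MAP) and NDCG, not the evaluation code used for the figures. The function names are hypothetical, and average precision is normalized here by the number of relevant items that appear in the ranked list, which is a simplifying assumption.

```python
import math


def average_precision(relevance):
    """AP for one ranked list of binary relevance labels (1 = relevant).

    Simplification: normalizes by the number of relevant items present in
    the list, not by the total number of relevant items in the collection.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(runs):
    """MAP: mean of the per-query average precision values."""
    return sum(average_precision(r) for r in runs) / len(runs)


def ndcg(gains, k=None):
    """NDCG at cutoff k for one ranked list of graded relevance gains."""
    gains = list(gains)
    k = len(gains) if k is None else k

    def dcg(values):
        # Gain at rank i is discounted by log2(i + 1).
        return sum(g / math.log2(i + 1)
                   for i, g in enumerate(values[:k], start=1))

    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Two hypothetical queries, each with a ranked list of relevance labels.
    print(mean_average_precision([[1, 0, 1, 0], [0, 1, 1, 0]]))
    print(ndcg([3, 1, 2, 0]))
```

Both measures reward relevant results that are ranked early, and NDCG additionally accounts for graded relevance.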
It can be observed that parameter α is higher than parameter β for the best
performance in both Figs. 4.12 and 4.13. The reason is that most images are taken
horizontally. Therefore, information is valued more, and weighted higher, by α
horizontally than by its counterpart β vertically. Similar patterns can also be
observed in the following evaluations.
The significance of image contextual information with the soft weighting scheme
is that it makes the system robust to variations in user behavior and integrates
seamlessly with the “O” gesture, which is spontaneous and natural. The
shortcoming of the “O” gesture is that it inevitably lacks accuracy, owing to
device limitations in outlining the boundary, compared with other gestures such as
segmentation or a line-based rectangular shape. However, soft weighting alleviates
this inaccuracy in object selection and provides a robust way to accommodate
behavioral errors when drawing the outline of the ROI.
4.4.2.3 Evaluation and Comparison with Contextual Image Retrieval Model (CIRM)
The state-of-the-art contextual image retrieval model (CIRM) [139] was implemented,
and its performance was compared with that of our context-embedded visual
recognition. The CIRM has demonstrated promising results in desktop-based CBIR by
applying a rectangular bounding box to highlight the emphasized region, which can
be drawn with a mouse on a desktop platform. The weighting scheme in the CIRM uses
two logistic functions joined at the directional (either X or Y) center of the
bounding box. Then, the term frequency tf_q is formulated as: