or the description of the dish will be analyzed. For example, optical character
recognition (OCR) can help you automatically recognize the indicated text, while
a visual search can help you identify the dish (which may not be recognized
without indication) and recommend nearby restaurants serving a similar dish.
Figure 4.7 shows three corresponding scenarios. The visual intent model consists
of two parts: visual recognition by search and social task recommendation. The first
problem is to recognize what is captured (e.g., a food image), while the second is
to recommend related entities (such as nearby restaurants serving the same food)
based on the search-based recognition results. This activity recommendation is
a difficult task in general, since visual recognition in the first step still remains
challenging. However, the capabilities of mobile devices, such as natural multi-touch
interaction and rich contextual information, offer
opportunities to accomplish this task. For example, although one image usually
contains multiple objects, a user can indicate an object or some text of interest
through a natural gesture, so that visual recognition is reduced to searching for a
single similar object. Moreover, contextual information, such as geo-location,
can be used for location-based recommendations.
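The two parts of the model described above can be illustrated with a minimal sketch. It is not the system's actual implementation: the feature vectors, the cosine-similarity matching, the `Restaurant` record, the 5 km radius, and all function names are assumptions made for illustration only. Stage one labels the user-selected region by searching a small labeled database for the most similar entry; stage two uses the device's geo-location to rank nearby restaurants that serve the recognized dish.

```python
import math
from dataclasses import dataclass


@dataclass
class Restaurant:
    # Hypothetical record type; the source does not define a data schema.
    name: str
    dish: str
    lat: float
    lon: float


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recognize_by_search(query_feature, labeled_features):
    """Stage 1: label the selected region with its most similar database entry.

    labeled_features is a list of (label, feature_vector) pairs.
    """
    best_label, _ = max(
        labeled_features,
        key=lambda item: cosine_similarity(query_feature, item[1]),
    )
    return best_label


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def recommend_nearby(dish, user_lat, user_lon, restaurants, max_km=5.0):
    """Stage 2: restaurants serving the dish within max_km, nearest first."""
    candidates = [
        (haversine_km(user_lat, user_lon, r.lat, r.lon), r)
        for r in restaurants
        if r.dish == dish
    ]
    candidates.sort(key=lambda pair: pair[0])
    return [r for distance, r in candidates if distance <= max_km]


if __name__ == "__main__":
    # Toy data, invented for this example.
    database = [("ramen", [0.9, 0.1, 0.0]), ("pizza", [0.1, 0.9, 0.2])]
    places = [
        Restaurant("Ramen House", "ramen", 47.620, -122.350),
        Restaurant("Noodle Bar", "ramen", 47.610, -122.330),
        Restaurant("Slice Shop", "pizza", 47.611, -122.331),
    ]
    dish = recognize_by_search([0.8, 0.2, 0.1], database)
    for place in recommend_nearby(dish, 47.609, -122.331, places):
        print(dish, place.name)
```

In this sketch the gesture-based selection is assumed to have already produced a single feature vector, which is what lets recognition reduce to a nearest-neighbor search over single objects.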
Since visual intent is a new term, this chapter reviews the evolution
of intent in general and walks the reader through the formation of intent from
text, voice, and visual inputs, covering both desktop and mobile
search and recognition.
For desktop user intent mining, an early taxonomy of web search was
introduced by Broder [110]. In this work, most searches fall into an
“informational” category, in which the user seeks information to answer
a question in mind. Later work by Rose and Levinson further
divided the informational class into five sub-categories, among which locating a
product or service accounts for a large percentage [133]. On the other hand, compared
to general web searches, intents derived from mobile information have strong on-the-go
characteristics. Church and Smyth conducted a diary study of user behavior
in mobile text search and arrived at a categorization quite different from its
general web search counterpart [113]. Besides the informational category at 58.3%,
a new geographical category, which is highly location dependent, takes a share of
31.1% of total search traffic. From a topic perspective, local services and travel &
commuting are the most popular ones out of 17 total topics, with 24.2% and 20.2% of
entries respectively. It can be concluded that the on-the-go characteristics play an
important role in intent discovery and understanding on mobile devices [143].
4.3.1 System Architecture
Figure 4.8 shows the architecture of TapTell. It extends Fig. 4.2 by including user
intent. From an implementation perspective, this illustration helps readers
understand the importance of linking individual intents to final recommendations.