Fig. 4.7 Snapshots of TapTell in three different scenarios. A user can take a photo, specify the
object or text of interest via different gestures (e.g., tap, circle, or line), and then get
search and recommendation results through TapTell.
user intent has two major limitations. First, it relies on a good recognition engine
and works well only in a relatively quiet environment. Second, there are many cases
where user intent can be naturally and conveniently expressed through the visual
form rather than text or speech (such as an unknown object or text, an artwork, a
shape or texture, and so on) [135]. As an alternative, we believe that images are a
powerful complementary carrier for expressing user intent on the phone.
Since intent is generally defined as “a concept considered as the product of
attention directed to an object or knowledge” [108], mobile visual intent is defined
as follows:
Definition 4.1 (Mobile Visual Intent). Mobile visual intent is defined as the intent
that can be naturally expressed through any visual information captured by a mobile
device and any user interaction with it. This intent represents the user's curiosity about
a certain object and a willingness to discover either what it is or what associated tasks
could be practiced in a visual form.
The following scenarios illustrate mobile visual intent and how an expressed intent
may be predicted and connected to social tasks for recommendation. The goal is not
only to list related visual results but also to provide rich context in which to present
useful multimedia information for social task recommendation.
• You pass by an unknown landmark that draws your attention. You can take
  a picture of it. Through visual intent analysis, information related to this
  landmark is presented to you.
• You see an interesting restaurant across the street. Before you step into the
  restaurant, you take a picture of it and indicate your interest with a gesture.
  Through visual intent analysis, information about this restaurant, or about
  neighboring points of interest that match your preferences, is recommended.
• You are checking a menu inside a restaurant, but you do not speak the language or
  know the cuisine. You can take a photo of the menu using your phone and indicate
  your intended dish or text in the photo. Your visual intent on either the photo