Figure 6. Automata characterizing stage 1
in state q2 changes its state to q4 if the user utters “this” (v2), while it changes its
state to state q3 if the user touches the object (g1). The object in state q3 changes
its state to state q4 if the user utters “this” (v2). The automaton M11 represents
this state transition.
On the other hand, if the user utters “move” (v10) to the object in state q2 or
q3, then its state changes to state q5. This state represents that the object is moving
in a straight line toward the destination until a stop instruction is issued. This
state transition is depicted by automaton M12.
If a user “grasps” (g3) an object in state q1 or q3, then it changes its state to
state q6. This is the ready-to-move state, which is awaiting a subsequent instruction
g7, where g7 represents the user's action to move the grasped object by moving
his/her hand. Since some users might abruptly grasp an object in its initial state in order to
move it, we also allow a state transition from q0 to q6 by g3, which corresponds to the
direct arrow from node 1 to node 3 in Figure 5. Automaton M13 represents this situation.
Automata characterizing stage 1 are shown in Figure 6.
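The stage 1 transitions described above can be sketched as lookup tables. This is only an illustrative sketch: it encodes just the transitions stated in the text (Figure 6 may contain others), and the dictionary layout and the helper names step and run are assumptions, not part of the VWDB.

```python
# Transitions stated in the text for the three stage-1 automata.
# Keys are (current state, input symbol); values are the next state.
# Only transitions mentioned in the prose are listed here.
M11 = {("q2", "v2"): "q4", ("q2", "g1"): "q3", ("q3", "v2"): "q4"}
M12 = {("q2", "v10"): "q5", ("q3", "v10"): "q5"}
M13 = {("q0", "g3"): "q6", ("q1", "g3"): "q6", ("q3", "g3"): "q6"}


def step(automaton, state, symbol):
    """Return the next state, or None if no such transition is defined."""
    return automaton.get((state, symbol))


def run(automaton, state, symbols):
    """Feed a sequence of input symbols; return the final state or None."""
    for symbol in symbols:
        state = step(automaton, state, symbol)
        if state is None:
            return None
    return state
```

For example, run(M11, "q2", ["g1", "v2"]) follows the touch-then-"this" route from q2 through q3 to q4.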
By a similar argument, stage 2 and stage 3 are also characterized using
automata theory. To characterize an entire object move, these nine automata are
connected using a method of cascading automata. The resulting diagram
is shown in Figure 7. In this figure, a sample multi-modal user interaction is indicated
by a thick line going from q0 to q16 via q2, q4, q11, q8, and q5. First, a user
pointed his/her finger at an object (g2). Then, the user uttered “this” to specify
that this object was the correct object to move (v2). Next, the user pointed his/her
finger in the desired direction to move (g2), followed by the user uttering “there”
to identify the direction (v5), “move” (v10) to initiate motion, and finally “stop”
to finish the move (v11).
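The sample interaction can be traced with a small acceptance check. The sketch below assumes that the six inputs pair with the listed states in order (q0 to q2 by g2, q2 to q4 by v2, and so on); it contains only this one path, not the full diagram of Figure 7, and the names SAMPLE_PATH and accepts are hypothetical.

```python
# The thick-line path from the text, assuming inputs and states pair up in order.
# Keys are (current state, input symbol); values are the next state.
SAMPLE_PATH = {
    ("q0", "g2"): "q2",    # point at the object
    ("q2", "v2"): "q4",    # utter "this"
    ("q4", "g2"): "q11",   # point in the desired direction
    ("q11", "v5"): "q8",   # utter "there"
    ("q8", "v10"): "q5",   # utter "move"
    ("q5", "v11"): "q16",  # utter "stop"
}


def accepts(transitions, inputs, start="q0", final="q16"):
    """Return True if the inputs lead from the start state to the final state."""
    state = start
    for symbol in inputs:
        key = (state, symbol)
        if key not in transitions:
            return False  # no transition defined: the sequence is rejected
        state = transitions[key]
    return state == final
```

Under these assumptions, accepts(SAMPLE_PATH, ["g2", "v2", "g2", "v5", "v10", "v11"]) holds, while a truncated sequence such as ["g2", "v2"] is rejected because it does not reach q16.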
4.3 Multi-Modal Input Translator
As was shown in the previous section, a sequence of multi-modal inputs that causes
a state transition from the initial state (q0) to the final state (q16) is accepted as a
multi-modal interaction for an object move in the VWDB. Since the VWDB