possibly leaving ambiguities on which skill is referred to by the verb. Secondly, the
objects mentioned in the utterance need to be mapped to entities in the world that the
robot knows about. Lastly, a skill typically has parameters, and the verb extracted from
the utterance has (multiple) objects associated with it. Hence, we need to decide which
object should be assigned to which parameter. To make things worse, it might very well
be the case that we have either too many or too few objects in the utterance for a certain
skill.
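The assignment problem described above can be illustrated with a minimal sketch. It is not the authors' method: the function name `candidate_assignments` and the example parameter and object names are hypothetical. The sketch only enumerates the possible bindings of objects to parameters, including the cases with too few objects (some parameters stay unbound) and too many (some objects are left over).

```python
from itertools import permutations


def candidate_assignments(parameters, objects):
    """Enumerate every way to bind distinct objects to the given parameters.

    Hypothetical helper for illustration only.
    - Too few objects: the missing slots are padded with None (unbound).
    - Too many objects: each assignment leaves some objects unused.
    """
    padding = [None] * max(0, len(parameters) - len(objects))
    slots = list(objects) + padding
    seen = set()
    for perm in permutations(slots, len(parameters)):
        if perm not in seen:  # padded None entries can create duplicates
            seen.add(perm)
            yield dict(zip(parameters, perm))


# Hypothetical skill with two parameters and an utterance with two objects.
for binding in candidate_assignments(["item", "destination"], ["cup", "kitchen"]):
    print(binding)
```

Every binding produced this way is only a candidate; choosing among them is the decision problem the following paragraphs address.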
We cast understanding the command as a process where the single steps are
interpretation actions, that is, interpreting the single elements of the utterance. At this point
READYLOG and its ability to perform decision-theoretic planning come into play. The
overall interpretation can be modelled as a planning problem. The system can choose
different actions (or actions with different parameters) at each stage. Since we want to
achieve an optimal interpretation, we make use of decision-theoretic planning. That is
to say, given an optimisation theory, we try to find a plan, i.e. a sequence of actions,
which maximises the expected reward.
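To make the idea of maximising the expected reward concrete, the following is a minimal finite-horizon sketch in Python. It is not READYLOG code and not the authors' optimisation theory: the function `plan`, the state and action names, the probabilities, and the reward values are all assumptions chosen for illustration.

```python
def plan(state, horizon, actions, outcomes, reward):
    """Return (expected_value, best_action) for a bounded look-ahead.

    actions(state)           -> iterable of applicable actions
    outcomes(state, action)  -> list of (probability, next_state) pairs
    reward(state)            -> reward obtained when stopping in `state`
    Stopping is always allowed, so best_action may be None.
    """
    best_value, best_action = reward(state), None
    if horizon == 0:
        return best_value, best_action
    for action in actions(state):
        value = sum(
            prob * plan(nxt, horizon - 1, actions, outcomes, reward)[0]
            for prob, nxt in outcomes(state, action)
        )
        if value > best_value:
            best_value, best_action = value, action
    return best_value, best_action


# Hypothetical toy model: a verb may be interpreted as one of two skills.
OUTCOMES = {
    ("start", "as_navigate"): [(0.9, "navigate_ok"), (0.1, "failed")],
    ("start", "as_fetch"): [(0.4, "fetch_ok"), (0.6, "failed")],
}
REWARD = {"start": 0.0, "navigate_ok": 10.0, "fetch_ok": 15.0, "failed": 0.0}


def toy_actions(state):
    return [a for (s, a) in OUTCOMES if s == state]


def toy_outcomes(state, action):
    return OUTCOMES[(state, action)]


def toy_reward(state):
    return REWARD[state]


value, action = plan("start", 1, toy_actions, toy_outcomes, toy_reward)
print(action, value)
```

In this toy model the interpretation with the higher single reward is not chosen, because its probability of success is lower; the comparison is made on expected values.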
Domain Specification. During the interpretation process we need to access the robot's
background knowledge. We organise this knowledge to capture generic properties and
to make individual parts available to (only) those components which need them. Three
types of information are distinguished: linguistic, interpretation, and system. The
linguistic information contains everything that has to do with natural language while
interpretation information is used during the interpretation process and system information
features things like the specific system calls for a certain skill. The combination of these
three types is then what makes the connection from natural language to robot abilities.
We use ideas from [12] to structure our knowledge within our situation calculus-based
representation.
In an ontology, for every Skill we store a Name as an internal identifier that is being
assigned to a particular skill during the interpretation. A skill further has a Command
which is the denotation of the corresponding system call of that skill. Synonyms is a list
of possible verbs in natural language that may refer to that skill. Parameters is a list of
objects that refer to the arguments of the skill, where Name again is a reference used in
the interpretation process, Attributes is a list of properties such as whether the parameter
is numerical or string data. Significance indicates whether the parameter is optional or
required, and Preposition is a (possibly empty) list of prepositions that go with the
parameter. For the information on entities in the world (e.g. locations and objects) we
use a structure Object which again has a Name as an internal identifier used during the
interpretation. Attributes is a list of properties such as whether the object “is a location”
or if it “is portable”. Synonyms is a list of possible nouns that may refer to the object
and ID is a system-related identifier that uniquely refers to a particular object.
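The structures described above could be sketched as plain Python data classes. The field names follow the text (Name, Command, Synonyms, Parameters, Attributes, Significance, Preposition, ID); the class layout, the helper `skills_for_verb`, and all example entries are assumptions made for illustration and do not reproduce the authors' situation calculus-based representation.

```python
from dataclasses import dataclass, field


@dataclass
class Parameter:
    name: str                                   # reference used during interpretation
    attributes: list[str] = field(default_factory=list)    # e.g. "numerical", "string"
    significance: str = "required"              # "required" or "optional"
    prepositions: list[str] = field(default_factory=list)  # may be empty


@dataclass
class Skill:
    name: str                                   # internal identifier
    command: str                                # denotation of the system call
    synonyms: list[str] = field(default_factory=list)      # verbs referring to the skill
    parameters: list[Parameter] = field(default_factory=list)


@dataclass
class WorldObject:
    name: str                                   # internal identifier
    attributes: list[str] = field(default_factory=list)    # e.g. "is a location"
    synonyms: list[str] = field(default_factory=list)      # nouns referring to the object
    id: str = ""                                # system-related unique identifier


# Hypothetical example entries (not taken from the original system).
SKILLS = [
    Skill("goto", "navigate_to", ["go", "move", "drive"],
          [Parameter("destination", ["string"], "required", ["to"])]),
    Skill("transport", "carry_object", ["bring", "move", "carry"],
          [Parameter("item", ["string"], "required", []),
           Parameter("destination", ["string"], "optional", ["to", "into"])]),
]
OBJECTS = [
    WorldObject("kitchen", ["is a location"], ["kitchen"], "loc_01"),
    WorldObject("cup", ["is portable"], ["cup", "mug"], "obj_07"),
]


def skills_for_verb(verb):
    """Return the names of all skills that list `verb` as a synonym."""
    return [s.name for s in SKILLS if verb.lower() in s.synonyms]


print(skills_for_verb("move"))  # an ambiguous verb matches more than one skill
```

A verb that appears in the Synonyms list of several skills yields several candidates, which is the ambiguity the interpretation process has to resolve.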
Basic Action Theory. Now that we have put down the domain knowledge on skills
and objects, we still need to formalise the basic action theory for our interpretation
system. We therefore define three actions, namely interpret action, interpret object,
and assign argument. For all three we need to state precondition axioms and
successor state axioms. We further need several fluents that describe the properties of the
 