"Narrative" Information and the NKRL Solution (Artificial Intelligence)

INTRODUCTION

In a companion article of this Encyclopaedia, ‘Narrative’ Information, the Problem, we introduced the problem of finding a complete and computationally efficient system for representing and managing ‘nonfictional narrative information’. We stressed there the considerable economic value of this multimedia type of information, which concerns, e.g., corporate memory documents, news stories, normative and legal texts, medical records, intelligence messages, surveillance videos or visitor logs, actuality photos, eLearning and Cultural Heritage material, etc. We also emphasised that the usual Computer Science tools, including those pertaining to the now very popular ‘Semantic Web’ domain, see (Bechhofer et al., 2004, Beckett, 2004), are not really suitable for dealing with this type of information.

BACKGROUND

In this article, we present an Artificial Intelligence tool, NKRL (Narrative Knowledge Representation Language), that has been especially developed for dealing in an ‘intelligent’ way with nonfictional narrative information. NKRL is, at the same time:

• a knowledge representation system for describing in the best possible detail the essential content (the ‘meaning’) of complex nonfictional ‘narratives’;

• a system of reasoning (inference) procedures that, thanks to the richness of the representation system, is able to automatically establish ‘interesting’ relationships among the represented data;

• an implemented software environment that allows the user to encode the original narratives in terms of the representation language to create ‘NKRL knowledge bases’ in a specific application domain and to exploit ‘intelligently’ these bases.

The main innovation introduced by NKRL with respect to the usual ontological paradigms is the addition, to the traditional ontology of concepts (called HClass, ‘hierarchy of classes’, in NKRL’s jargon), of an ontology of events, i.e., a new sort of hierarchical organization where the nodes correspond to n-ary structures called ‘templates’ (HTemp, ‘hierarchy of templates’). A partial image of the ‘upper level’ of HClass, which follows the standard Protégé approach, see (Noy et al., 2000), is given in Figure 1; for HTemp, see Table 1 and Figure 2 below.

A SHORT DESCRIPTION OF NKRL

Instead of using the traditional (binary) attribute/value organization, the templates are generated from the n-ary combination of quadruples connecting together the symbolic name of the template, a predicate, and the arguments of the predicate introduced by named relations, the roles. The quadruples have in common the name and predicate components. Denoting with Li the generic symbolic label identifying a given template, with Pj the predicate used in the template, with Rk the generic role and with ak the corresponding argument, the core data structure for templates has the following general format (see also the companion article, ‘Narrative’ Information, the Problem):

(Li (Pj (R1 a1) (R2 a2) … (Rn an)))     (1)

Figure 1. A partial representation of the ‘upper level’ of HClass, the NKRL ‘traditional’ ontology of concepts.

Predicates pertain to the set {BEHAVE, EXIST, EXPERIENCE, MOVE, OWN, PRODUCE, RECEIVE}, and roles to the set {SUBJ(ect), OBJ(ect), SOURCE, BEN(e)F(iciary), MODAL(ity), TOPIC, CONTEXT}. An argument of the predicate can consist of a simple ‘concept’ or of a structured association (‘expansion’) of several concepts. Templates can be conceived as the formal representation of generic classes of elementary events like “move a physical object”, “be present in a place”, “produce a service”, “send/receive a message”, etc. When a particular event pertaining to one of these general classes must be represented, the corresponding template is instantiated to produce a predicative occurrence.

To represent a simple narrative like: “On November 20, 1999, in an unspecified village, an armed group of people has kidnapped Robustiniano Hablo”, we must first select in the HTemp hierarchy the template corresponding to “execution of violent actions”, see Figure 2 and Table 1 below. This example refers to a recent application of NKRL in a ‘terrorism’ context, in the framework of a European project; see, e.g., (Zarri, 2005).

As appears from Table 1a, the arguments of the predicate (the ak terms in (1)) are represented by variables with associated constraints expressed as HClass concepts or combinations of concepts. When deriving a predicative occurrence (an instance of a template) like mod3.c5 in Table 1b, the role fillers in this occurrence must conform to the constraints of the father-template. For example, ROBUSTINIANO_HABLO (the ‘BEN(e)F(iciary)’ of the action of kidnapping) and INDIVIDUAL_PERSON_20 (the unknown ‘SUBJ(ect)’, actor, initiator etc. of this action) are both ‘individuals’, instances of the HClass concept individual_person. The constituents included in square brackets, such as SOURCE in Table 1a, are optional. A ‘conceptual label’ like mod3.c5 is the symbolic name used to identify the NKRL code corresponding to a specific predicative occurrence.
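The relationship between a template, its role constraints and a derived predicative occurrence can be sketched in a few lines of Python. This is only a minimal illustration, not the NKRL implementation: the HClass fragment, the constraint chosen for each role, the OBJ filler kidnapping_ and the dictionary layout of an occurrence are all assumptions made for the example; only the predicate and role sets, the optional SOURCE role and the two individuals come from the text.

```python
from dataclasses import dataclass

# Predicate and role sets as listed in the text.
PREDICATES = {"BEHAVE", "EXIST", "EXPERIENCE", "MOVE", "OWN", "PRODUCE", "RECEIVE"}
ROLES = {"SUBJ", "OBJ", "SOURCE", "BENF", "MODAL", "TOPIC", "CONTEXT"}

# Hypothetical toy fragment of HClass: concept -> parent concept.
HCLASS_PARENT = {
    "individual_person": "human_being",
    "kidnapping_": "violence_",
}
# Individuals and the concept they are instances of.
INSTANCE_OF = {
    "ROBUSTINIANO_HABLO": "individual_person",
    "INDIVIDUAL_PERSON_20": "individual_person",
}


def is_a(term, concept):
    """True if `term` (a concept or an individual) is subsumed by `concept`."""
    current = INSTANCE_OF.get(term, term)
    while current is not None:
        if current == concept:
            return True
        current = HCLASS_PARENT.get(current)
    return False


@dataclass
class Template:
    name: str
    predicate: str
    constraints: dict                 # role -> HClass concept the filler must satisfy
    optional: frozenset = frozenset()  # roles shown in square brackets

    def __post_init__(self):
        if self.predicate not in PREDICATES:
            raise ValueError(f"unknown predicate {self.predicate}")
        unknown = set(self.constraints) - ROLES
        if unknown:
            raise ValueError(f"unknown roles {sorted(unknown)}")

    def instantiate(self, label, fillers):
        """Derive a predicative occurrence, checking the template constraints."""
        extra = set(fillers) - set(self.constraints)
        if extra:
            raise ValueError(f"roles not allowed by the template: {sorted(extra)}")
        for role, concept in self.constraints.items():
            if role not in fillers:
                if role in self.optional:
                    continue
                raise ValueError(f"missing mandatory role {role}")
            if not is_a(fillers[role], concept):
                raise ValueError(f"{fillers[role]} violates constraint {concept} on {role}")
        return {"label": label, "template": self.name,
                "predicate": self.predicate, "roles": dict(fillers)}


# Simplified, hypothetical version of the Produce:Violence template.
PRODUCE_VIOLENCE = Template(
    name="Produce:Violence",
    predicate="PRODUCE",
    constraints={"SUBJ": "human_being", "OBJ": "violence_",
                 "SOURCE": "human_being", "BENF": "human_being"},
    optional=frozenset({"SOURCE"}),
)

mod3_c5 = PRODUCE_VIOLENCE.instantiate(
    "mod3.c5",
    {"SUBJ": "INDIVIDUAL_PERSON_20", "OBJ": "kidnapping_", "BENF": "ROBUSTINIANO_HABLO"},
)
```

In this sketch a filler is accepted when it is the constraint concept itself, one of its specialisations, or an instance of either; an occurrence whose SUBJ is not a human being is rejected with a ValueError.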

The ‘attributive operator’, SPECIF(ication), is one of the four operators used in NKRL for the construction of ‘structured arguments’ (‘complex fillers’ or ‘expansions’); see, e.g., (Zarri, 2003). The SPECIF lists, with syntax (SPECIF ei p1 … pn), are used to represent the properties or attributes that can be asserted about the first element ei, concept or individual, of the list. For example, in the SUBJ filler of mod3.c5, Table 1b, the attributes weapon_wearing and (SPECIF cardinality_ several_) are both associated with INDIVIDUAL_PERSON_20.
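As a rough illustration of how such nested SPECIF lists could be held in memory and printed back in the list notation used above, the following Python sketch represents a structured argument as a nested tuple. The tuple encoding and the helper names specif, head_of and render are assumptions made for this example only.

```python
def specif(head, *properties):
    """Build a SPECIF list: the first element, then the properties asserted about it."""
    return ("SPECIF", head, *properties)


def head_of(filler):
    """Return the element a (possibly structured) filler is about."""
    return filler[1] if isinstance(filler, tuple) else filler


def render(filler):
    """Print a filler back in the parenthesised list notation."""
    if isinstance(filler, tuple):
        return "(" + " ".join(render(item) for item in filler) + ")"
    return filler


# SUBJ filler of mod3.c5 as described in the text: an armed group of several people.
subj_filler = specif(
    "INDIVIDUAL_PERSON_20",
    "weapon_wearing",
    specif("cardinality_", "several_"),
)
```

Calling render(subj_filler) gives back the nested list, and head_of(subj_filler) recovers the individual the attributes are attached to.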

The ‘location attributes’, represented in the predicative occurrences as lists, are linked with the arguments of the predicate by using the colon operator, ‘:’, see the individual VILLAGE_1 in Table 1b. In the occurrences, the two operators date-1, date-2 materialize the temporal interval normally associated with narrative events, see (Zarri, 1998) and, more in general, (Allen, 1981, Ferro et al., 2005).

150 templates are permanently inserted into HTemp; Figure 2 reproduces the ‘external’ organization of the PRODUCE branch of HTemp. This branch includes the Produce:Violence template used in Table 1. HTemp thus corresponds to a sort of ‘catalogue’ of narrative formal structures that are very easy to ‘customize’ in order to derive the new templates that could be needed for a particular application.

What has been expounded so far illustrates the NKRL solutions to the problem of representing ‘elementary’ (simple) events. To deal with those ‘connectivity phenomena’ that arise when several elementary events are connected through causality, goal, indirect speech etc. links, see also (Mani and Pustejovsky, 2004), the basic NKRL knowledge representation tools have been complemented by more complex mechanisms that make use of second order structures, see (Zarri, 2003). For example, the binding occurrences consist of lists of symbolic labels (ci) of predicative occurrences; the lists are differentiated using specific binding operators like GOAL, CONDITION and CAUSE.

Table 1. Building up and querying predicative occurrences

Figure 2A. Partial representation of the PRODUCE branch of HTemp, the ‘ontology of events’

Let us suppose that, in Table 1, we now state that: “…an armed group of people has kidnapped Robustiniano Hablo in order to ask his family for a ransom”, where the new elementary event: “the unknown individuals will ask for a ransom” corresponds to a new predicative occurrence, e.g., mod3.c7. To represent this situation, we must add to the occurrences that represent the two elementary events a new binding occurrence, e.g., mod3.c8, to link together the conceptual labels mod3.c5 (corresponding to the kidnapping occurrence, see also Table 1b) and mod3.c7 (corresponding to the new occurrence describing the intended result). mod3.c8 will then have the form: “mod3.c8) (GOAL mod3.c5 mod3.c7)”. The meaning of mod3.c8 can be paraphrased as: “the activity described in mod3.c5 is focalised towards (GOAL) the realization of mod3.c7”.
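A binding occurrence is thus a second order structure: its arguments are labels of other occurrences rather than concepts or individuals. A minimal Python sketch of this idea follows; the class name, the dictionary used as a stand-in knowledge base and the add_binding helper are hypothetical, and the operator set contains only the three operators named in the text.

```python
from dataclasses import dataclass

# Only the binding operators explicitly named in the text.
BINDING_OPERATORS = {"GOAL", "CONDITION", "CAUSE"}


@dataclass(frozen=True)
class BindingOccurrence:
    label: str
    operator: str
    arguments: tuple  # labels of the occurrences being linked

    def __post_init__(self):
        if self.operator not in BINDING_OPERATORS:
            raise ValueError(f"unknown binding operator {self.operator}")

    def render(self):
        """Print the occurrence in the 'label) (OPERATOR ci cj ...)' form."""
        return f"{self.label}) ({self.operator} {' '.join(self.arguments)})"


def add_binding(kb, binding):
    """Store a binding occurrence, refusing labels that are not in the base."""
    missing = [arg for arg in binding.arguments if arg not in kb]
    if missing:
        raise KeyError(f"unknown occurrence labels: {missing}")
    kb[binding.label] = binding
    return binding


# Stand-in knowledge base: label -> occurrence (placeholders for the real code).
kb = {
    "mod3.c5": "kidnapping occurrence",
    "mod3.c7": "ransom request occurrence",
}
mod3_c8 = add_binding(kb, BindingOccurrence("mod3.c8", "GOAL", ("mod3.c5", "mod3.c7")))
```

Because the binding occurrence is itself stored under a label, it could in turn appear as an argument of another binding occurrence.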

Reasoning in NKRL ranges from the direct questioning of an NKRL knowledge base, making use of search patterns (formal queries over the contents of the knowledge base) that try to unify the predicative occurrences of the base, to high-level inference procedures. A simple example of search pattern is supplied in Table 1c; it produces as an answer, among other things, the predicative occurrence mod3.c5 of Table 1b. See (Ellis, 1995, Corbett, 2003, etc.) for the techniques used to unify complex conceptual structures.

With respect to the high-level procedures (a detailed paper on this topic is (Zarri, 2005)), the transformation rules try to ‘adapt’, from a semantic point of view, the original query/queries (search patterns) that failed to the real contents of the existing knowledge bases. The principle employed consists in using rules to automatically ‘transform’ the original query (i.e., the original search pattern) into one or more different queries (search patterns) that are not strictly ‘equivalent’ but only ‘semantically close’ to the original one. Let us suppose that, e.g., during the search for all the possible information linked with Robustiniano Hablo’s kidnapping, we ask the system whether Robustiniano Hablo is wealthy. In the absence of a direct answer, the system will automatically ‘transform’ the original query using a rule like: “In a context of ransom kidnapping, the certification that a given character is wealthy or has a professional role can be substituted by the certification that: i) this character has a tight kinship link with another person, and ii) this second person is a wealthy person or a professional”. The final result can then be paraphrased in this way: we do not know whether Robustiniano Hablo is wealthy, but we can say that his father is a wealthy businessperson; see (Zarri, 2005) for the details.
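The interplay between a failed search pattern and a transformation rule can be illustrated with a small Python sketch. It is a toy under stated assumptions, not the NKRL unification algorithm: occurrences are flattened to dictionaries, the way wealth and kinship are coded (the concepts wealthy_, property_, father_, kinship_link, the individual PERSON_A and the labels kb.c1, kb.c2) is invented for the example, and variables are simply strings starting with ‘?’.

```python
# Hypothetical toy fragment of HClass: concept -> parent concept.
HCLASS_PARENT = {"father_": "kinship_link"}

# Hypothetical, flattened predicative occurrences.
KB = [
    {"label": "kb.c1", "predicate": "BEHAVE", "SUBJ": "PERSON_A",
     "MODAL": "father_", "BENF": "ROBUSTINIANO_HABLO"},
    {"label": "kb.c2", "predicate": "OWN", "SUBJ": "PERSON_A",
     "OBJ": "property_", "TOPIC": "wealthy_"},
]


def subsumes(general, specific):
    """True if `specific` equals `general` or is one of its specialisations."""
    while specific is not None:
        if specific == general:
            return True
        specific = HCLASS_PARENT.get(specific)
    return False


def unify(pattern, occurrence, bindings):
    """Return the extended bindings if the pattern matches, else None."""
    new = dict(bindings)
    for slot, wanted in pattern.items():
        actual = occurrence.get(slot)
        if actual is None:
            return None
        if wanted.startswith("?"):            # variable: bind or check consistency
            if new.setdefault(wanted, actual) != actual:
                return None
        elif not subsumes(wanted, actual):    # constant: must subsume the filler
            return None
    return new


def search(patterns, kb, bindings=None):
    """Yield every set of bindings that satisfies all patterns, in order."""
    bindings = {} if bindings is None else bindings
    if not patterns:
        yield bindings
        return
    for occurrence in kb:
        extended = unify(patterns[0], occurrence, bindings)
        if extended is not None:
            yield from search(patterns[1:], kb, extended)


def wealthy_via_kinship(query):
    """Toy transformation: 'X is wealthy' -> 'a close relative of X is wealthy'."""
    if query.get("TOPIC") != "wealthy_":
        return None
    return [
        {"predicate": "BEHAVE", "SUBJ": "?relative",
         "MODAL": "kinship_link", "BENF": query["SUBJ"]},
        {"predicate": "OWN", "SUBJ": "?relative",
         "OBJ": "property_", "TOPIC": "wealthy_"},
    ]


def answer(query, kb, rules):
    """Try the query as it is; on failure, try its transformed forms."""
    direct = list(search([query], kb))
    if direct:
        return "direct", direct
    for rule in rules:
        transformed = rule(query)
        if transformed:
            found = list(search(transformed, kb))
            if found:
                return "transformed", found
    return "none", []


query = {"predicate": "OWN", "SUBJ": "ROBUSTINIANO_HABLO",
         "OBJ": "property_", "TOPIC": "wealthy_"}
result = answer(query, KB, [wealthy_via_kinship])
```

Here the direct query finds nothing, while the transformed pair of patterns succeeds by binding ?relative to PERSON_A, which mirrors the ‘his father is wealthy’ answer described above.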

Hypothesis rules allow building up ‘reasonable’ logic/semantic connections among the data stored in an NKRL knowledge base using a number of pre-defined reasoning schemata, e.g., ‘causal’ schemata. For example, to mention a ‘classic’ NKRL example, after having directly retrieved through the use of a search pattern information like: “Pharmacopeia, a US biotechnology company, has received 64,000,000 dollars from the German company Schering in connection with an R&D activity”, we could automatically construct a sort of ‘causal explanation’ of this event by retrieving information like: i) “Pharmacopeia and Schering have signed an agreement concerning the production by Pharmacopeia of a new compound” and ii) “in the framework of the agreement previously mentioned, Pharmacopeia has actually produced the new compound”.

In Table 2, we give the informal description of the reasoning steps (called ‘condition schemata’ in a hypothesis context) that must be validated to prove that a generic ‘kidnapping’ corresponds, in reality, to a more precise ‘kidnapping for ransom’ environment. When several reasoning steps must be simultaneously validated, as in Table 2, a failure is always possible. To overcome this problem and, at the same time, discover all the possible implicit information associated with the original data, the two inference modes, transformations and hypotheses, can be used in an integrated way, see (Zarri, 2005). In practice, we make use of ‘transformations’ within a ‘hypothesis’ context. This means that, whenever a ‘search pattern’ is derived from a ‘condition schema’ of a hypothesis to implement one of the steps of the reasoning process, we can use it ‘as it is’, i.e., as originally coded when the inference rule was built, but also in a ‘transformed’ form if the appropriate transformation rules exist within the system.

Making use of the transformation rules already existing within the system, the hypothesis represented in an informal way in Table 2 becomes, in practice, potentially equivalent to the hypothesis of Table 3. For example, the proof that the kidnappers are part of a terrorist group or separatist organization (reasoning step Cond1 of Table 2) can now be obtained indirectly (transformation T3) by checking whether they are members of a specific subset of this group or organization.
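The integrated use of the two inference modes can be sketched as a chronological backtracking search in which every condition schema is tried first ‘as it is’ and then in each of its transformed forms. The Python fragment below is a hedged illustration only: facts are reduced to triples, the individuals GROUP_SUBSET_1, SEPARATIST_ORG_1 and HABLO_FAMILY and the relations member_of, part_of and asks_ransom are invented, and the second condition (Cond2) is a hypothetical stand-in, since only Cond1 and T3 are described in the text.

```python
# Hypothetical facts, reduced to (relation, arg1, arg2) triples.
FACTS = {
    ("member_of", "INDIVIDUAL_PERSON_20", "GROUP_SUBSET_1"),
    ("part_of", "GROUP_SUBSET_1", "SEPARATIST_ORG_1"),
    ("asks_ransom", "INDIVIDUAL_PERSON_20", "HABLO_FAMILY"),
}

# Condition schemata of the toy hypothesis (Cond2 is an invented placeholder).
CONDITIONS = [
    ("Cond1", ("member_of", "?kidnapper", "SEPARATIST_ORG_1")),
    ("Cond2", ("asks_ransom", "?kidnapper", "?family")),
]

# Transformed forms available for a condition: name -> [(rule id, patterns)].
TRANSFORMATIONS = {
    "Cond1": [("T3", [("member_of", "?kidnapper", "?subset"),
                      ("part_of", "?subset", "SEPARATIST_ORG_1")])],
}


def match(pattern, fact, bindings):
    """Return extended bindings if the pattern matches the fact, else None."""
    new = dict(bindings)
    for wanted, actual in zip(pattern, fact):
        if wanted.startswith("?"):
            if new.setdefault(wanted, actual) != actual:
                return None
        elif wanted != actual:
            return None
    return new


def solve(patterns, bindings):
    """Chronological backtracking over the facts for a list of patterns."""
    if not patterns:
        yield bindings
        return
    for fact in sorted(FACTS):
        extended = match(patterns[0], fact, bindings)
        if extended is not None:
            yield from solve(patterns[1:], extended)


def validate(conditions, transformations, bindings=None, trace=()):
    """Yield (bindings, trace) for every way of validating all conditions."""
    bindings = {} if bindings is None else bindings
    if not conditions:
        yield bindings, trace
        return
    (name, pattern), rest = conditions[0], conditions[1:]
    alternatives = [("as-is", [pattern])] + transformations.get(name, [])
    for how, patterns in alternatives:
        for extended in solve(patterns, bindings):
            yield from validate(rest, transformations, extended,
                                trace + ((name, how),))


first_bindings, first_trace = next(validate(CONDITIONS, TRANSFORMATIONS))
```

With the transformation table empty the hypothesis fails at Cond1; with T3 available, Cond1 is validated indirectly and the search proceeds to the next condition, the trace recording which form of each step was used.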

FUTURE TRENDS

NKRL is a fully implemented language/environment. The software exists in two versions, an ORACLE-supported and a file-oriented one. Future improvements will concern mainly:

• The addition of features that will allow us to query the system in Natural Language. Very encouraging experimental results have already been obtained in this context thanks to the combined use of shallow parsing techniques, see, e.g., (Koster, 2004), and of the standard NKRL inference capabilities.

• On a more ambitious basis, the introduction of some features for the semi-automatic construction of the knowledge base of annotations/occurrences making use of full NL techniques. Some preliminary work in this context has been realised making use of the syntactic/semantic Cafetiere tools, see (Black et al., 2003, 2004).

• The introduction of optimisation techniques for the (basic) chronological backtracking of the NKRL Inference Engine, in the style of the well-known techniques developed in a Logic Programming context; see, e.g., (Clark and Tarnlund, 1982).

Table 2. Inference steps for the ‘kidnapping for ransom’ hypothesis

Even in its present form, NKRL has been able to deal successfully, in an ‘intelligent information retrieval’ mode, with widely different ‘narrative’ domains: from the history of France to terrorism, from the Falklands War to the corporate domain, from the legal field to the beauty care domain or the analysis of customers’ motivations, etc.

CONCLUSION

In this article, we have supplied some details about NKRL (Narrative Knowledge Representation Language), a fully implemented, up-to-date knowledge representation and inference system especially created for an ‘intelligent’ exploitation of narrative knowledge. The main innovation of NKRL consists in associating with the traditional ontologies of concepts an ‘ontology of events’, i.e., a hierarchical arrangement where the nodes correspond to n-ary structures called ‘templates’.

Table 3. ‘Kidnapping’ hypothesis in the presence of transformations concerning intermediary inference steps

KEY TERMS

Attributive Operator: The ‘attributive operator’, SPECIF(ication), is one of the four operators used in NKRL for the construction of ‘structured arguments’ (‘complex fillers’ or ‘expansions’) of the conceptual predicates. The SPECIF lists, with syntax (SPECIF ei p1 … pn), are used to represent the properties or attributes that can be asserted about the first element ei, concept or individual, of the list.

Binding Occurrences: Second order structures used to deal with those ‘connectivity phenomena’ that arise when several elementary events are connected through causality, goal, indirect speech etc. links. They consist of lists of symbolic labels (ci) of predicative occurrences; the lists are differentiated using specific binding operators like GOAL, CONDITION and CAUSE.

Format of NKRL Templates: Templates take the form of n-ary combinations of quadruples connecting together the ‘symbolic name’ of the template, a ‘conceptual predicate’ and the ‘arguments’ of the predicate introduced by named relations, the ‘roles’ (like SUBJ(ect), OBJ(ect), SOURCE, BEN(e)F(iciary), etc.). The quadruples have in common the ‘name’ and ‘predicate’ components. Denoting with Li the symbolic label identifying the template, with Pj the predicate, with Rk the generic role and with ak the generic argument, the core data structure for templates has the format:

(Li (Pj (R1 a1) (R2 a2) … (Rn an)))

Templates are included in an inheritance hierarchy, HTemp(lates), which implements NKRL’s ‘ontology of events’.

NKRL Inference Engine: A software module that carries out the different ‘reasoning steps’ included in hypotheses or transformations. It allows us to use these two classes of inference rules also in an ‘integrated’ mode, augmenting then the possibility of finding interesting (implicit) information.

NKRL Inference Rules, Hypotheses: They are used to build up automatically ‘reasonable’ connections among the information stored in an NKRL knowledge base according to a number of pre-defined reasoning schemata, e.g., ‘causal’ schemata.

NKRL Inference Rules, Transformations: These rules try to ‘adapt’, from a semantic point of view, a query that failed to the contents of the existing knowledge bases. The principle employed consists in using rules to automatically ‘transform’ the original query into one or more different queries that are not strictly ‘equivalent’ but only ‘semantically close’ to the original one.

Ontology of Concepts vs. Ontology of Events: The ontologies of concepts concern the ‘standard’ hierarchical organizations of concepts to be used to model (in a ‘static’ way) a given domain. NKRL adds an ‘ontology of events’, i.e., a new sort of hierarchical organization where the nodes, represented by n-ary structures called ‘templates’, represent general classes of ‘dynamical’ events like “move a physical object”, “produce a service”, “send/receive a message”, etc.