Fig. 1 Network Inference Process
the correct identification of network actors. This is a non-trivial process when the
concept of an “actor” is not at the forefront of the design of a system that collects
non-relational data. This step therefore involves identifying every case in which an
actor is not properly represented, typically because the same actor appears under
different identifiers in different records. The task is critical when no unique
identifier distinguishes each actor unambiguously.
Network actor identification is a reformulation of the “entity resolution” problem
that is frequently encountered in different areas of computer science (see section 3).
Entity resolution approaches fall into two categories: attribute-based approaches
and relational approaches. Attribute-based approaches, discussed in section 4.1,
consider each data element independently and do not exploit relationships between
data elements, whether or not such relationships are present. Relational approaches,
discussed in section 4.2, use an identified network structure as additional information
to improve the quality of the entity resolution.
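
The difference between the two categories can be sketched in a few lines of Python.
This is only an illustration under stated assumptions, not the method described in
sections 4.1 and 4.2: the function names, the use of difflib string similarity, the
Jaccard overlap of neighbour sets and the 0.5 weight are all hypothetical choices.

```python
from difflib import SequenceMatcher


def attribute_similarity(a, b):
    """Attribute-based score: mean string similarity over shared attributes.

    Each record is a dict of attribute name -> value. Only the two records
    themselves are inspected; no relationship information is used.
    """
    keys = [k for k in a if k in b]
    if not keys:
        return 0.0
    ratios = [
        SequenceMatcher(None, str(a[k]).lower(), str(b[k]).lower()).ratio()
        for k in keys
    ]
    return sum(ratios) / len(ratios)


def neighbour_overlap(n1, n2):
    """Jaccard overlap between two sets of neighbouring actors."""
    union = n1 | n2
    return len(n1 & n2) / len(union) if union else 0.0


def relational_similarity(a, b, n1, n2, weight=0.5):
    """Relational score: attribute score blended with neighbour overlap.

    The weight is an assumed tuning parameter, not a value from the text.
    """
    return (1 - weight) * attribute_similarity(a, b) + weight * neighbour_overlap(n1, n2)
```

With no network available, only attribute_similarity can be computed, which is why
the relational score needs an already identified set of neighbours for each record.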
When inferring a network, relationships are not always straightforward to identify.
Ambiguous definitions of relationships, different types of relationships, different
measures of relationship strength [43, 46], and a lack of concrete supporting evidence
in the data can all make the process of relationship identification complex.
Furthermore, if relationship data is not available, then relational entity resolution
techniques cannot be employed, because there is no network structure for them to use.
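
As one concrete example of a relationship strength measure, the sketch below counts
how often two actors appear together in the same record. Co-occurrence counting is
an assumption chosen for illustration; it is not necessarily one of the measures
referred to in [43, 46], and the function name and record layout are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_strength(records):
    """Count, for every pair of actors, the records in which both appear.

    `records` is an iterable of collections of actor identifiers. Pairs are
    stored as sorted tuples so that (a, b) and (b, a) share one counter.
    """
    strength = Counter()
    for actors in records:
        for pair in combinations(sorted(set(actors)), 2):
            strength[pair] += 1
    return strength
```

A pair that never co-occurs simply has no entry, which reflects the point above:
without supporting evidence in the data there is no relationship to measure.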
In the network inference framework illustrated in Figure 1, we propose a cyclic
process whereby actors are first resolved using attribute-based entity resolution,
and this resolution is then improved following the initial relationship identification
stage. The cycle between identifying actors and identifying relationships can be
refined progressively, in both directions: relationship information can be used to
improve the quality of matching identical entities, while observations from actor
identification can prompt rules to identify new types of relationships. In the second
part of this chapter, in section 5, we use a real-world case study to illustrate the
steps within the first stage, actor identification using attribute information. Future
work will describe how relationships are identified from the data set and how this
information can be fed back to improve the quality of the actor identification process.
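
The cyclic process can be sketched as a small loop. This is a minimal illustration
under stated assumptions, not the framework of Figure 1 itself: actors are reduced
to name strings, relationships to co-occurrence in a record, and the threshold of
0.85, the bonus of 0.2 and all function names are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def name_similarity(a, b):
    """Case-insensitive string similarity between two actor names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def resolve(names, links, threshold=0.85, bonus=0.2):
    """Map every name to a canonical name.

    Two names are merged when their similarity, plus a bonus if they share
    at least one neighbour in `links`, reaches the threshold. With an empty
    `links` this is purely attribute-based resolution.
    """
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            n = parent[n]
        return n

    for a, b in combinations(sorted(names), 2):
        shared = links.get(a, set()) & links.get(b, set())
        score = name_similarity(a, b) + (bonus if shared else 0.0)
        if score >= threshold:
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[max(ra, rb)] = min(ra, rb)
    return {n: find(n) for n in names}


def infer_links(records, mapping):
    """Link resolved actors that appear together in the same record."""
    links = {}
    for rec in records:
        actors = {mapping[n] for n in rec}
        for a in actors:
            links.setdefault(a, set()).update(actors - {a})
    return links


def infer_network(records, rounds=3):
    """Alternate actor resolution and relationship identification."""
    names = {n for rec in records for n in rec}
    mapping = resolve(names, {})  # first pass: attributes only
    for _ in range(rounds):
        links = infer_links(records, mapping)
        # Give each raw name the neighbours of its current canonical actor.
        name_links = {n: links.get(mapping[n], set()) for n in names}
        new_mapping = resolve(names, name_links)
        if new_mapping == mapping:
            break  # no further refinement
        mapping = new_mapping
    return mapping, infer_links(records, mapping)
```

For example, given the records [["J. Smith", "Ann Lee"], ["John Smith", "Ann Lee"]],
the attribute-only pass keeps "J. Smith" and "John Smith" apart, while the pass that
uses the inferred links merges them because both are linked to "Ann Lee".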
 