The two supervised learning models outperformed the two unsupervised models
by around 20% on average in terms of precision. Figure 8 shows these results.
This result was expected and agrees with what other researchers have found:
supervised learning techniques provide better results than unsupervised approaches.
Fig. 8 Classification Results
The SVM RBF model gave the best result in the evaluation. This model was
then applied to the remaining set of data that did not contain any frequent flyer
information. This process produced a list of matching record pairs.
To determine all the records referring to the same passenger, the connected
components of the graph formed by the matching pairs were extracted, and all the
records referring to the same entity were assigned the same unique identifier.
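The component extraction step can be illustrated with a minimal sketch. It assumes the classifier output is available as a list of record-identifier pairs; the function name `assign_entity_ids`, the record labels, and the choice of a union-find structure are illustrative assumptions, not the implementation used in the chapter.

```python
def assign_entity_ids(record_ids, matched_pairs):
    """Give every record in the same connected component the same identifier.

    Hypothetical helper: assumes every record named in matched_pairs
    also appears in record_ids.
    """
    # Each record starts out as the root of its own component.
    parent = {r: r for r in record_ids}

    def find(x):
        # Follow parent links to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Each matching pair is an edge: merge the two components it connects.
    for a, b in matched_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Number the components in order of first appearance.
    entity_of_root = {}
    result = {}
    for r in record_ids:
        root = find(r)
        if root not in entity_of_root:
            entity_of_root[root] = len(entity_of_root) + 1
        result[r] = entity_of_root[root]
    return result


# Made-up example data: r1, r2, r3 are linked transitively; r4 and r5 are linked.
records = ["r1", "r2", "r3", "r4", "r5"]
pairs = [("r1", "r2"), ("r2", "r3"), ("r4", "r5")]
ids = assign_entity_ids(records, pairs)
```

Note that r1 and r3 receive the same identifier even though the classifier never matched them directly, because both were matched to r2. This transitivity is what treating the pairs as graph edges provides.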
6 Conclusion
Inferring relational information from attribute-based data sets is currently one of
the few ways that large-scale network data can be collected. In this chapter we
explored how actors in a network can be identified before extracting the relationships
between them. The well-studied area of entity resolution was surveyed in detail,
as it provides an acceptable approach and developments in the area are continually
progressing.
Actor identification is, however, only the first aspect of network inference. Once
the actors are identified, the relationships between them have to be extracted.
These relationships could in turn improve the accuracy of actor identification within
a cyclic feedback loop. In future work we aim to integrate actor identification and
relationship extraction in a common framework to infer social networks.