5.68 < Lower status percent < 6.56 ⇒ GROUP 2
Accuracy: 96% Coverage: 75%
Bounds river = false ⇒ GROUP 2
Accuracy: 73% Coverage: 100%
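The accuracy and coverage figures reported with each rule can be illustrated with a small sketch. The text does not define the two measures, so the code below assumes the common reading: accuracy is the fraction of records matching the rule's condition that belong to the target group, and coverage is the fraction of the target group that matches the condition. The function name, the field names and the toy records are hypothetical and are not taken from the dataset.

```python
from typing import Callable, Iterable, Mapping, Tuple

Record = Mapping[str, object]


def rule_accuracy_coverage(
    records: Iterable[Record],
    condition: Callable[[Record], bool],
    in_group: Callable[[Record], bool],
) -> Tuple[float, float]:
    """Return (accuracy, coverage) for a rule of the form condition => group.

    Assumed definitions (not stated in the text):
      accuracy = |condition and group| / |condition|
      coverage = |condition and group| / |group|
    """
    records = list(records)
    matched = [r for r in records if condition(r)]
    group_size = sum(1 for r in records if in_group(r))
    both = sum(1 for r in matched if in_group(r))
    accuracy = both / len(matched) if matched else 0.0
    coverage = both / group_size if group_size else 0.0
    return accuracy, coverage


# Toy records, invented purely for illustration.
toy_records = [
    {"bounds_river": False, "group": 2},
    {"bounds_river": False, "group": 2},
    {"bounds_river": False, "group": 2},
    {"bounds_river": False, "group": 1},
    {"bounds_river": True, "group": 1},
]

accuracy, coverage = rule_accuracy_coverage(
    toy_records,
    condition=lambda r: r["bounds_river"] is False,
    in_group=lambda r: r["group"] == 2,
)
print(f"Accuracy: {accuracy:.0%} Coverage: {coverage:.0%}")
```

Under these assumed definitions, a rule with 100% coverage and lower accuracy describes every member of the group but also matches some records outside it.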
This case study illustrates the following characteristics:
- The interactive visual discovery approach has revealed new structure in the
data by visual clustering.
- We have used human visual perception to determine features of interest, and
application of the data mining algorithm has generated concrete information
about these “soft” discoveries.
- Together, interactive data mining has delivered increased knowledge about
a well-known dataset.
3.2 Case Study 2: Applying HAIKU to Telecoms Data
Justification. Monitoring telecommunications switching generates massive
amounts of data. Even a small company may make many thousands of phone calls
in a year, and telecommunications companies hold a mountain of data
originally collected for billing purposes. Telecoms data reflects business
behavior, so it is likely to contain complex patterns. For this reason, HAIKU
was applied to mine this data mountain.
The data considered detailed the calling number, recipient number and
duration of phone calls to and from businesses in a medium-sized town. Other
information available included business sector and sales channels. All identity
data was anonymized.
Call Patterns of High Usage Companies: Visualization. A number of
companies with particularly high numbers of calls were identified. These were
visualized separately to identify patterns within the calls of an individual
company. Figure 4 shows a clustering of calls from a single company. The most
immediately obvious feature is the “blue wave” to the right of the image,
labeled A. Also visible are various other structures, including the two
clusters labeled B and C.
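The selection step described above, picking out companies with particularly high call counts and isolating one company's calls for separate visualization, could be sketched as follows. This is only an illustration: the text does not say how "high usage" was decided, so the `min_calls` threshold, counting by calling number alone, the `Call` record layout and the sample values are all assumptions.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Call:
    """One call record; field names and units are hypothetical."""
    caller: str      # anonymized calling number
    recipient: str   # anonymized recipient number
    duration_s: int  # call duration, assumed to be in seconds


def high_usage_callers(calls: Iterable[Call], min_calls: int) -> Dict[str, int]:
    """Return callers with at least `min_calls` outgoing calls and their counts."""
    counts = Counter(call.caller for call in calls)
    return {caller: n for caller, n in counts.items() if n >= min_calls}


def calls_from(calls: Iterable[Call], caller: str) -> List[Call]:
    """Return the calls made by a single caller, ready to be visualized separately."""
    return [call for call in calls if call.caller == caller]


# Invented sample records, for illustration only.
sample_calls = [
    Call("company-1", "number-x", 60),
    Call("company-1", "number-y", 30),
    Call("company-1", "number-x", 120),
    Call("company-2", "number-x", 45),
]

busy = high_usage_callers(sample_calls, min_calls=3)
company_calls = calls_from(sample_calls, "company-1")
```

Each selected caller's records would then be passed to the visualization on their own, so that structure within a single company's calls is not swamped by the rest of the data.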
Discoveries. After identifying these features, we then asked the system to
explain their characteristics. The following rules were discovered by the system,
and translated into sentence form for clarity.
- All calls in group A are to directory enquiries.
- Further investigation, selecting parts of the “blue wave”, showed that the
wave structure was arranged by hour of day in one dimension and day of
week in the other.
- Within group B, about 70% of calls are to two numbers. 90% of all calls to
these numbers fall into group B. Almost all of the remaining 30% of calls in