sensible and can be justified. To perform the data mining tasks thoroughly, the
group decided to limit data mining to classification and clustering only, which is
sensible. The project shows repeated attempts at both mining tasks in order to
obtain the best results. Such controlled trial-and-error activities are appropriate.
However, the group did not attempt alternative classification techniques and did not
conduct a comparative study of the alternative approaches. It is also questionable
whether a 2/3-1/3 training/test split is the most effective use of the limited data. The group
took evaluation seriously and used the evaluation results to determine a better
model. However, the group did not give the presence of heart disease higher priority,
and so did not look for models with better true positive performance. Consequently, 15
marks were given to Data Understanding, 18 to Data Preparation and Pre-processing,
15 to Data Modelling/Mining, and 14 to Post-processing. Because of the good organisation
and documentation of work, 7 out of 10 marks were given to project management.
With a total mark of 69%, the project stands at the border between
good and excellent. The project did not get a clear first because of the limitations in
modelling/mining and evaluation.
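The two evaluation concerns raised above, the use of a single 2/3-1/3 split on limited data and the lack of emphasis on true positive performance, can be illustrated with a minimal sketch. It is not the group's method: it shows k-fold cross-validation as one common alternative to a single hold-out split, scored by the true positive rate rather than overall accuracy. All names (`k_fold_indices`, `true_positive_rate`, `cross_validated_tpr`, `train_fn`) are hypothetical, and the positive label `1` is assumed to stand for the presence of heart disease.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and partition them into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    # Taking every k-th index gives fold sizes that differ by at most one.
    return [idx[i::k] for i in range(k)]


def true_positive_rate(actual, predicted, positive=1):
    """Fraction of actual positive cases that the model labelled positive."""
    tp = sum(1 for a, p in zip(actual, predicted)
             if a == positive and p == positive)
    fn = sum(1 for a, p in zip(actual, predicted)
             if a == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


def cross_validated_tpr(records, labels, train_fn, k=10, seed=0):
    """Pool predictions over k folds and return the true positive rate.

    train_fn is a hypothetical callable: it takes training records and
    labels and returns a function mapping one record to a predicted label.
    Every record is used for testing exactly once, so no data is set
    aside permanently as in a single hold-out split.
    """
    actual, predicted = [], []
    for test_idx in k_fold_indices(len(records), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(records)) if i not in held_out]
        model = train_fn([records[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        for i in test_idx:
            actual.append(labels[i])
            predicted.append(model(records[i]))
    return true_positive_rate(actual, predicted)
```

Candidate models could then be compared on `cross_validated_tpr` alongside accuracy, so that a model that misses fewer actual disease cases is preferred when its overall accuracy is otherwise similar.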
Fig. 3. Selected Results from Project Two
3.3 Project Three: The Ugly
The Data Set
The data set used for this project is the same insurance data set used for project one.