Additionally, a large portion of the SDLC process is based on the background knowledge
of the personnel involved. A DM technique should learn to incorporate this prior
knowledge in its process.
Another aspect of DM that can be a problem is the presentation and visualization
of complex results. The output of a mining process is usually a large number of
meaningful rules. However, presenting these rules so that they assist a project
manager in making strategic decisions requires significant post-processing.
Performance Issues: These include the efficiency, scalability, and user effectiveness of
data mining algorithms and tools. The performance metrics for assessing the
appropriateness of DM methods to the SDLC include robustness, scalability, automatic
pre-processing capability, reliability, noise tolerance and sensitivity analysis [6]. A DM
tool should cover all (or most) of these to satisfy its users.
3 A Case Study: Analysis of Problem Report Data
This section describes the application of DM techniques to the software Problem
Report (PR) management data of a large global telecommunication company. When a
problem is reported, the responsible team can only roughly estimate the effort
(time) needed to fix it, based on their previous experience. If the current project
falls outside the topics they are familiar with, the estimate becomes less accurate.
The goal of this mining process is to estimate the effort needed to fix a problem when
it is raised. The results will reveal hidden relationships in the data, such as:
How long does it take to fix a problem when a particular type of PR is raised?
Which types of project documents require significant effort to fix the associated bugs?
This will bring significant cost savings and other benefits to the organisation through
improved control over PR fixing and more accurate project planning, estimation and progress
control. The results will be especially useful to developers in problem reasoning.
When a programmer is struggling with a bug, a resolution can be suggested from the
knowledge inferred from previous similar problems stored in PRs.
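As a rough illustration of the first question above (how long a fix takes for a given type of PR), the sketch below groups fix times by PR type and reports the median per type. It is only a descriptive summary, not the DM technique applied in the case study. The type labels, the `days_to_fix` values and the function name `median_fix_time` are hypothetical and are not taken from the company's data.

```python
from collections import defaultdict
from statistics import median

# Hypothetical PR records as (pr_type, days_to_fix) pairs.
# The labels and numbers are invented purely for illustration.
records = [
    ("sw-bug", 3),
    ("sw-bug", 5),
    ("sw-bug", 10),
    ("doc-bug", 1),
    ("doc-bug", 2),
    ("change-request", 8),
]


def median_fix_time(rows):
    """Group fix times by PR type and return the median per type."""
    by_type = defaultdict(list)
    for pr_type, days in rows:
        by_type[pr_type].append(days)
    return {pr_type: median(days) for pr_type, days in by_type.items()}


summary = median_fix_time(records)
for pr_type, days in sorted(summary.items()):
    print(f"{pr_type}: median {days} days to fix")
```

The median is used here rather than the mean because a few very long-running PRs would otherwise dominate the summary. That is a design choice of this sketch, not something the source prescribes.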
3.1 Data Pre-processing
The first task in the process is to prepare the data set in the form the DM techniques require.
Field Selection: The PR data consists of textual, categorical and
numerical fields. Several fields, such as confidential, submitter-ID, environment, fix,
release note, audit trail, the associated project name and the PR number, are ignored
during mining. They are used in the pre-processing and post-processing stages to assist
in selecting data and in better understanding the rules found.
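The field selection described above can be sketched as a simple split of each PR record into the fields passed to mining and the fields kept aside for pre- and post-processing. This is a minimal sketch: the snake_case key names, the example record and the `split_fields` helper are assumptions made for illustration, and the actual PR schema may differ.

```python
# Fields the text lists as ignored during mining. The snake_case
# spellings are an assumption; the real column names may differ.
IGNORED_FIELDS = {
    "confidential",
    "submitter_id",
    "environment",
    "fix",
    "release_note",
    "audit_trail",
    "project_name",
    "pr_number",
}


def split_fields(pr):
    """Return (mining_fields, side_fields) for one PR record.

    mining_fields are passed to the mining step; side_fields are kept
    for data selection and for interpreting the resulting rules.
    """
    mining = {k: v for k, v in pr.items() if k not in IGNORED_FIELDS}
    side = {k: v for k, v in pr.items() if k in IGNORED_FIELDS}
    return mining, side


# Hypothetical record; every value is invented for illustration.
example_pr = {
    "pr_number": 101,
    "project_name": "demo-project",
    "severity": "serious",
    "priority": "high",
    "pr_class": "sw-bug",
    "synopsis": "crash on start-up",
}

mining, side = split_fields(example_pr)
print("mining fields:", sorted(mining))
print("side fields:", sorted(side))
```

Keeping the ignored fields in a separate dictionary, rather than discarding them, mirrors the statement that they are still used outside the mining step.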
Whenever a PR is raised, a project leader has to answer the
following questions before taking any action:
How severe is the problem (customer impact)?
What is the impact of the problem on the project schedule (cost and team priority)?
What type of problem is it (a software bug or a design flaw)?