A statistical analysis of the first four years, which in most cases yields
statistically significant results under the common hypothesis tests, produces a
stable model. The stability is evident from the change-detection metric, which
reaches only 87% significance over the four years of accumulated data, too low to
indicate a change. This agrees with the basic inductive learning hypothesis, which
states that any hypothesis found to approximate the target function well over a
sufficiently large set of training examples will also approximate the target
function well over unobserved examples.
However, analysis of the fifth-year data does not support this hypothesis. The
change-detection method reveals that the model observed over the first four years
is not stationary. All three metrics that are combined in the change-detection
procedure detect a highly significant process change (beyond the 1% significance
level, that is, with more than 99% confidence).
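The idea of testing whether a model has stopped being stationary can be sketched with a simple comparison of error rates. The example below is only an illustration under stated assumptions: it uses a one-sided two-proportion z-test on the error rate, which is a stand-in and not necessarily one of the three metrics used by the change-detection procedure. The function names, the sample counts, and the 0.01 threshold are hypothetical.

```python
import math


def error_rate_change_pvalue(err_old, n_old, err_new, n_new):
    """One-sided p-value for 'the new-period error rate exceeds the old one'.

    Assumption: a pooled two-proportion z-test with a normal approximation.
    This is an illustrative stand-in, not the procedure from the text.
    """
    p_old = err_old / n_old
    p_new = err_new / n_new
    pooled = (err_old + err_new) / (n_old + n_new)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_old + 1 / n_new))
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a change
    z = (p_new - p_old) / se
    # Upper tail of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2))


def change_detected(p_value, alpha=0.01):
    """Declare a change only when the p-value falls below alpha."""
    return p_value < alpha


# Hypothetical counts: same error rate in both periods, then a doubled rate.
stable = error_rate_change_pvalue(100, 1000, 100, 1000)
shifted = error_rate_change_pvalue(100, 1000, 200, 1000)
print(change_detected(stable), change_detected(shifted))
```

With a threshold of 0.01, a result at 87% significance (a p-value of 0.13) would not count as a change, whereas a result significant beyond the 1% level would.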
To illustrate how our method detects significant real-world changes, we quote
the manager of the science and technology administration in the Ministry of Education,
Culture and Sports in Israel, as delivered at the end of year 2000: “The
administration has finished the new planning of organizing studies for the
technological road, many courses has been altered . . . We are preparing for a
pioneer experiment, which involves about 40 educational institutes, in which the
programs, their implementation and integration will be evaluated.”
To further demonstrate how the model induced from the years 1996 to 1999 differs
from the model induced from all five years of data, Fig. 4.2 shows the decision
tree derived by the IFN algorithm from the data set that includes 1996 to 2000
(according to the largest connection weight of the extracted set of fuzzy rules;
see M. Last, et al. [25]). The bold nodes of the tree represent states whose
forecasts differ depending on whether the year 2000 is included in or excluded
from the database.
Using the set of rules based on 1996 to 1999 for the year 2000 and beyond is
expected to produce an error rate of at least 22% on average, as shown in
Table 4.5.
Table 4.5. Comparison of decision trees induced from “Dropout” data set including and
excluding year 2000.
Layer no.   No. of rules   Mismatches   Mismatch percentage
0           1              0            0%
1           2              0            0%
2           12             3            25%
3           27             5            19%
4           19             7            37%
5           6              0            0%
6           2              0            0%
Sum         69             15           22%
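The percentages in Table 4.5 can be checked with a few lines of arithmetic. The sketch below recomputes the per-layer and overall mismatch rates from the rule and mismatch counts. The zero mismatch counts for layers 0, 1, 5 and 6 are an assumption inferred from their 0% entries and from the total of 15; the variable and function names are illustrative.

```python
# (layer, number of rules, mismatches), taken from Table 4.5.
# Assumption: layers reported at 0% have zero mismatches.
rows = [
    (0, 1, 0),
    (1, 2, 0),
    (2, 12, 3),
    (3, 27, 5),
    (4, 19, 7),
    (5, 6, 0),
    (6, 2, 0),
]


def mismatch_pct(rules, mismatches):
    """Share of rules whose forecast changes, as a percentage."""
    return 100.0 * mismatches / rules


total_rules = sum(rules for _, rules, _ in rows)
total_mismatches = sum(mism for _, _, mism in rows)
overall = mismatch_pct(total_rules, total_mismatches)

for layer, rules, mism in rows:
    print(f"layer {layer}: {mism}/{rules} = {mismatch_pct(rules, mism):.0f}%")
print(f"overall: {total_mismatches}/{total_rules} = {overall:.0f}%")
```

The overall figure is 15/69, or roughly 21.7%, which rounds to the 22% quoted in the table.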
 