80%, the false alarm rate is just about 12%. The
predictive power shown by the ROC curve on the testing
data set is lower than on the training set, but it is
still strong. For example, using the same cutoff point,
when the detection rate is 80%, the false alarm rate is
20%. In other words, the machine can correctly identify
80% of the high-quality pages if we allow 20% of the
ordinary pages to be incorrectly classified as
high-quality pages.
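To make the two rates concrete, here is a minimal sketch of how a detection rate and a false alarm rate can be computed for a single cutoff. The function name `rates_at_cutoff` and the toy scores and labels are hypothetical illustrations, not the study's data or code; the toy values were simply chosen so that the result matches the 80% / 20% operating point described above.

```python
def rates_at_cutoff(scores, labels, cutoff):
    """Return (detection_rate, false_alarm_rate) for one score cutoff.

    scores: predicted scores, higher means "more likely high quality"
    labels: 1 for a high-quality page, 0 for an ordinary page
    A page is flagged as high quality when its score is >= cutoff.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    true_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    false_pos = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return true_pos / positives, false_pos / negatives


# Hypothetical toy data (assumption): 5 high-quality and 5 ordinary pages.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2, 0.15, 0.1]
labels = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]

detection, false_alarm = rates_at_cutoff(scores, labels, cutoff=0.5)
print(detection, false_alarm)  # prints: 0.8 0.2
```

Sweeping the cutoff from high to low and plotting the resulting pairs, with false alarm rate on the horizontal axis and detection rate on the vertical axis, traces out an ROC curve.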
We were far from exhausting the possible uses of the
predictive variables in this pilot study. For example,
we did not use any normalization techniques, and we did
not apply techniques such as principal component
analysis to form composite variables. More importantly,
we have only an intuitive understanding, not a solid
theory, to explain the predictive power of the
variables. With a good theory, we could not only
identify hidden predictive variables more easily but
also create new composite variables to improve
predictive power. This will be the next step of this
study.
Conclusion
The importance of the above ROC curves lies in
two facts:
1. It permits system architects to select thresholds
   for classification that are appropriate to their own
   assessments of the relative cost of misclassifying
   good quality pages and ordinary pages.
2. The relative strength of different prediction
   algorithms (e.g., the odds estimated by the stepwise
   logistic regression reported above) can be read
   easily from the ROC, since a prediction algorithm
   whose ROC curve always lies above that of another
   algorithm will be superior to it no matter what the
   user's specific estimates of costs and values may be.
   In other words, we can use this study as a baseline
   for investigating other predictive algorithms and
   comparing their performance using the ROC.
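Both points can be illustrated with a small sketch. It is an assumption-laden illustration, not the procedure used in the study: the function names, the cost values, and the two sets of toy scores are hypothetical. `cheapest_cutoff` picks the cutoff that minimizes a user-supplied total misclassification cost, and `dominates` checks whether one empirical ROC curve is never below another (a weak, step-function version of "always lies above", with no interpolation between points).

```python
def cheapest_cutoff(scores, labels, cost_miss, cost_false_alarm):
    """Pick the cutoff with the lowest total misclassification cost.

    Candidates are the observed scores plus one value above the maximum
    (which flags nothing). Ties go to the lowest cutoff.
    """
    candidates = sorted(set(scores)) + [max(scores) + 1.0]

    def total_cost(cutoff):
        misses = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
        alarms = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
        return cost_miss * misses + cost_false_alarm * alarms

    return min(candidates, key=total_cost)


def roc_points(scores, labels):
    """Return sorted (false_alarm_rate, detection_rate) pairs, one per cutoff."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = {(0.0, 0.0)}  # cutoff above every score: nothing is flagged
    for cutoff in set(scores):
        hits = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        alarms = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
        points.add((alarms / negatives, hits / positives))
    return sorted(points)


def detection_at(points, false_alarm_rate):
    """Best detection rate reachable without exceeding the false alarm rate."""
    return max(d for f, d in points if f <= false_alarm_rate)


def dominates(points_a, points_b):
    """True if curve A is never below curve B at any observed false alarm rate."""
    grid = {f for f, _ in points_a} | {f for f, _ in points_b}
    return all(detection_at(points_a, f) >= detection_at(points_b, f) for f in grid)


# Hypothetical toy data (assumption): two algorithms scoring the same 10 pages.
labels = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]
scores_a = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2, 0.15, 0.1]
scores_b = [0.9, 0.3, 0.7, 0.6, 0.2, 0.8, 0.5, 0.4, 0.15, 0.1]

# Costly false alarms push the cutoff up; costly misses push it down.
print(cheapest_cutoff(scores_a, labels, cost_miss=1, cost_false_alarm=5))  # 0.8
print(cheapest_cutoff(scores_a, labels, cost_miss=5, cost_false_alarm=1))  # 0.3

# Algorithm A's curve is never below B's, so A is preferable at any cost ratio.
print(dominates(roc_points(scores_a, labels), roc_points(scores_b, labels)))  # True
```

The same scores yield different cutoffs under different cost assumptions, which is the sense in which the ROC leaves the choice of threshold to the system architect.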
In this study, we showed that log data and textual
features can be used to automatically estimate the
quality of content built through collaborative
knowledge building. Such a mechanism is important, and
indeed necessary, for a collaborative knowledge
building system once it becomes so large and grows so
fast that no one can keep up with all the additions and
modifications while keeping the overall quality of the
content satisfactory.