The Impact of JDM on IT Infrastructure
I think there is a world market for maybe five computers.
—IBM Chairman Thomas Watson, 1943
Data mining can have a great impact on the infrastructure of
information technology (IT). Mining in the small, that is, using
small local datasets and maintaining a few models for short-term
use, is unlikely to tax the IT environment. However, mining in the
large places new demands on backup and recovery, data access, data
staging, and the maintenance of proper service levels. Mining in
the large involves multi-gigabyte and sometimes multi-terabyte
datasets, hundreds or thousands of frequently changing datasets,
data obtained from across the enterprise or beyond, and hundreds
or thousands of models managed for deployment throughout the
enterprise. Moreover, data
mining is often part of a larger business process requiring plugging
into existing scheduling or workflow systems. There are also
differences in how IT approaches database and non-database data
mining engines (DMEs).
This chapter explores the impact that data mining can have on
the IT infrastructure and what data and process administrators need
to consider when mining in the large.