Chapter 1
Introduction
The significant advances in data collection and data storage technologies have provided
the means for the inexpensive storage of enormous amounts of transactional
data in data warehouses that reside in companies and public sector organizations.
Apart from the benefit of using this data per se (e.g., for keeping up-to-date profiles
of the customers and their purchases, maintaining a list of the available products,
their quantities and prices, etc.), the mining of these datasets with the existing data
mining tools can reveal invaluable knowledge that was unknown to the data holder
beforehand. The extracted knowledge patterns can provide insight to the data holders
and can be invaluable in important tasks, such as decision making and strategic
planning. Moreover, companies are often willing to collaborate with other entities
who conduct similar business, towards the mutual benefit of their businesses. Significant
knowledge patterns can be derived and shared among the partners through
the collaborative mining of their datasets. Furthermore, public sector organizations
and civilian federal agencies usually have to share a portion of their collected data or
knowledge with other organizations having a similar purpose, or even make this data
and/or knowledge public in order to comply with certain regulations. For example,
in the United States, the National Institutes of Health (NIH) [2] endorses research
that leads to significant findings which improve human health and provides a set
of guidelines which sanction the timely dissemination of NIH-supported research
findings for use by other researchers. At the same time, the NIH acknowledges the
need to maintain privacy standards and, thus, requires NIH-sponsored investigators
to disclose data collected or studied in a manner that is “free of identifiers that could
lead to deductive disclosure of the identity of individual subjects” [2] and deposit it
to the Database of Genotype and Phenotype (dbGaP) [45] for broad dissemination.
Another example of the benefits of data sharing comes from the business world.
Wal-Mart, a major retailer in the United States, and Procter & Gamble (P&G), an
international manufacturer, decided in 1988 to share information and knowledge
across their mutual supply chains in order to better coordinate their common activities.
As Grean & Shaw discuss in [31], the partnership of the two companies led
to the improvement of their business relationship, reduced the need for inventories,
thus driving down the associated costs, and led to a large increase in the joint sales of