computations to map phase stability in binary and ternary alloys. 62-72 This
led to the development of computationally derived phase diagrams, a classic
example of integrating information from databases into data models. The
evolution of both databases has occurred independently, although in terms
of their scientific value they are closely intertwined. Phase diagrams
map out regimes of crystal structure in temperature-composition space or
temperature-pressure space, yet crystal structure databases have been
developed entirely independently. At present, the community has to work with
each database separately, which makes information searches cumbersome and the
interpretation of analyses involving both databases very difficult.
Researchers integrate such information only on their own, for one specific
system at a time, based on their individual interests. Hence, there is at
present no unified way to explore patterns of behavior across both databases,
even though they are so closely related scientifically.
One of the more systematic efforts to address this challenge has been that
of Ashby, 73-77 who showed how, by merging phenomenological relationships
in materials properties with discrete data on specific materials
characteristics, one can begin to develop patterns of classification of
materials behavior. Multivariate data were visualized using normalization
schemes that permitted the development of “maps” capturing the clustering of
materials properties. The approach also provided a methodology for
establishing common structure-property relationships across seemingly
different classes of materials. While very valuable, it is limited in its
predictive value and ultimately relies on prior models to build and seek
relationships.
In the informatics strategy of studying materials behavior, we approach the
problem from a broader perspective. By exploring all types of data that may
have varying degrees of influence on a given property (or properties), with
no prior assumptions, one uses data-mining techniques to establish both
classification and predictive assessments of materials behavior. This is
not done, however, from a purely statistical perspective. Instead, we carefully
integrate a physics-driven approach to data collection with data mining, and
then validate or analyze the results with theory-based computation and/or
experiments. The data can originate from either experiment or computation;
experimental data, when organized in terms of combinatorial experiments, can
provide an opportunity to screen large amounts of data in a high-throughput
fashion. 78-82 In the following discussion, we provide an example of both the
classification and predictive uses of data mining in materials science.
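As a rough illustration of the classification side of this strategy, the sketch below projects a table of compounds (rows) by descriptors (columns) onto its leading principal components. It is a minimal sketch under stated assumptions, not the procedure used for the results discussed in this chapter: the descriptor table is entirely synthetic, the function names (synthetic_table, pca, and so on) are hypothetical, and the eigenvectors are found by plain power iteration with deflation so that only the Python standard library is needed.

```python
import math
import random


def standardize(rows):
    """Scale each descriptor column to zero mean and unit variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) or 1.0
            for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]


def covariance(rows):
    """Sample covariance matrix of rows that are already centered."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]


def leading_eigenpair(mat, rng, iters=1000):
    """Power iteration: dominant eigenvalue and unit eigenvector of a symmetric matrix."""
    d = len(mat)
    vec = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    for _ in range(iters):
        nxt = [sum(mat[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:  # matrix has no variance left to explain
            return 0.0, vec
        vec = [x / norm for x in nxt]
    # Rayleigh quotient gives the eigenvalue estimate for the converged vector.
    value = sum(vec[i] * sum(mat[i][j] * vec[j] for j in range(d)) for i in range(d))
    return value, vec


def pca(rows, n_components=3, seed=0):
    """Return (scores, components, explained-variance ratios)."""
    rng = random.Random(seed)
    scaled = standardize(rows)
    cov = covariance(scaled)
    d = len(cov)
    total = sum(cov[i][i] for i in range(d))
    components, ratios = [], []
    for _ in range(n_components):
        value, vec = leading_eigenpair(cov, rng)
        components.append(vec)
        ratios.append(value / total)
        # Deflation: remove the direction just found before searching again.
        cov = [[cov[i][j] - value * vec[i] * vec[j] for j in range(d)]
               for i in range(d)]
    scores = [[sum(r[j] * c[j] for j in range(d)) for c in components]
              for r in scaled]
    return scores, components, ratios


def synthetic_table(n_rows=60, seed=1):
    """Hypothetical descriptor table: five columns driven by two hidden factors."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        a, b = rng.gauss(0, 1), rng.gauss(0, 1)
        rows.append([
            a + 0.1 * rng.gauss(0, 1),
            2.0 * a + 0.1 * rng.gauss(0, 1),
            b + 0.1 * rng.gauss(0, 1),
            -b + 0.1 * rng.gauss(0, 1),
            a + b + 0.1 * rng.gauss(0, 1),
        ])
    return rows


table = synthetic_table()
scores, components, ratios = pca(table, n_components=3)
print("explained variance ratios:", [round(r, 3) for r in ratios])
print("first row in PC space:", [round(s, 3) for s in scores[0]])
```

Each row of scores gives the coordinates of one compound in the reduced space, which is what a three-dimensional PCA plot displays. Standardizing the columns first is a design choice made here so that descriptors with different units contribute comparably; whether that is appropriate depends on the dataset.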
8.5.1 Data-Dimensionality Reduction: Classification and Clustering Applications
Figure 8.5 shows a three-dimensional principal component analysis (PCA) plot
of a multivariate database of high-temperature (Tc) superconductors. The
initial dataset consisted of 600 compounds (that is, the rows of our database)