If a model is to be used to predict protein structure from sequence data, then an underlying assumption is that the amino acid sequence, bond length, bond angle, and related atomic-level data are not only available but also accurate to some verifiable level.
Given the underlying data and a conceptual model, the next phase of the modeling and simulation
process is translating the conceptual model into data structures and high-level descriptions of
computational procedures. Designing the computer model involves extracting from the conceptual
model only those characteristics of the original system that are deemed essential, as determined by
the model's ultimate purpose. For example, the purpose of predicting protein structure from sequence data may be to allow the end-user to visualize the overall structure, in which case a high degree of quantitative accuracy is not essential. In this example, the purpose of the model is to simplify and idealize, and the characteristics selected from the conceptual model should reflect this purpose.
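To make this concrete, consider a minimal sketch of such a design decision, assuming Python and a hypothetical visualization-oriented model: only the alpha-carbon trace is carried forward from the conceptual model, and atomic detail deemed nonessential for the purpose is deliberately dropped.

    from dataclasses import dataclass
    from typing import List, Tuple

    @dataclass
    class CAlphaTrace:
        """Hypothetical data structure for a visualization-oriented model:
        only the alpha-carbon backbone trace is retained; side-chain atoms,
        bond angles, and other atomic detail are deliberately omitted."""
        sequence: str                                 # one-letter residue codes
        ca_coords: List[Tuple[float, float, float]]  # one (x, y, z) per residue

        def __post_init__(self):
            # Internal consistency: one coordinate per residue in the sequence.
            assert len(self.sequence) == len(self.ca_coords)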
Designing the computer model, like defining the problem space and conceptual modeling, is largely
an art. Designing a simple model that adequately mimics the behavior of the system or process
under study is a creative process that incorporates certain assumptions. The art of making good assumptions may well be the most challenging component of modeling, because success depends as much on the domain experience of the modeler as it does on the nature of the system to be modeled. Biological systems are seldom described in quantitative terms, often requiring that the model designer derive or invent the needed mathematical formalisms or heuristics.
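As an illustration of such an invented heuristic, a qualitative observation ("strongly hydrophobic residues tend to be buried") might be encoded as a simple scoring rule. The sketch below uses a subset of the Kyte-Doolittle hydrophobicity scale; the threshold is an arbitrary illustrative choice, not an established cutoff.

    # Subset of the Kyte-Doolittle hydrophobicity scale (positive = hydrophobic).
    HYDROPHOBICITY = {"I": 4.5, "V": 4.2, "L": 3.8, "A": 1.8,
                      "G": -0.4, "D": -3.5, "K": -3.9, "R": -4.5}

    def likely_buried(residue: str, threshold: float = 1.5) -> bool:
        """Heuristic: treat strongly hydrophobic residues as likely buried.
        The 1.5 threshold is an illustrative assumption."""
        return HYDROPHOBICITY.get(residue, 0.0) > threshold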
Coding of the computer model involves transferring the symbolic representations of the system into
executable computer code. Model coding marks the transition of the modeling process from an
artistic endeavor to a predominantly scientific one, defined by software engineering principles. Model
coding may involve working with a low-level computer language, such as C++, or a high-level shell
designed specifically for modeling and simulation.
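As a sketch of this transition, a symbolic statement from the conceptual model, such as "consecutive alpha-carbons sit roughly 3.8 angstroms apart," might be translated into an executable penalty function. The quadratic form and the weight below are illustrative assumptions, not a prescribed formulation.

    import math

    CA_CA_DISTANCE = 3.8  # approximate spacing of consecutive alpha-carbons, in angstroms

    def bond_penalty(coords, weight=1.0):
        """Quadratic penalty for deviations from the ideal CA-CA spacing;
        coords is a list of (x, y, z) alpha-carbon positions."""
        penalty = 0.0
        for a, b in zip(coords, coords[1:]):
            penalty += weight * (math.dist(a, b) - CA_CA_DISTANCE) ** 2
        return penalty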
Once a model is in the form of executable code, it should be subject to verification and validation.
Verification is the process of determining that the model coded in software accurately reflects the conceptual model, for example by testing the internal logic of the model to confirm that it functions as intended. The simulation system and its underlying model are validated by assessing whether the
operation of the software model is consistent with the real world, usually through comparison with
data from the system being simulated. For example, in a system designed to predict protein
structure, the validation process would include comparing model data with protein structure data
from NMR and X-ray crystallography. Validation against X-ray crystallography might involve comparing the model's output with the structure derived from bombarding the crystal lattice of a purified protein with X-rays. In contrast, validation against NMR might involve comparing the model's output with data produced by scanning a pure protein in solution.
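The distinction can be made concrete with a minimal sketch. Verification exercises the internal logic on synthetic cases with known answers; validation compares model output against experimentally determined coordinates, here via a root-mean-square deviation. The 2.0-angstrom cutoff is an illustrative threshold, not an accepted standard.

    import math

    def rmsd(predicted, experimental):
        """Root-mean-square deviation between paired (x, y, z) coordinate lists."""
        assert len(predicted) == len(experimental)
        sq = sum(math.dist(p, e) ** 2 for p, e in zip(predicted, experimental))
        return math.sqrt(sq / len(predicted))

    # Verification: confirm the internal logic behaves as intended on known cases.
    assert rmsd([(0, 0, 0)], [(0, 0, 0)]) == 0.0
    assert math.isclose(rmsd([(0, 0, 0)], [(3, 4, 0)]), 5.0)

    # Validation (sketch): compare predictions with coordinates determined
    # experimentally by NMR or X-ray crystallography.
    def is_valid(predicted, experimental, cutoff=2.0):
        return rmsd(predicted, experimental) <= cutoff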
Validation also involves certifying that the output of the system as a whole is adequate for the
intended purpose and is consistent with the presumptions of expert opinion. As such, validation is at
least in part a subjective call. The validity of a model is a function of the objectives of the model
designer and the context of its application. For example, the usefulness of a model of protein
structure for a decision-making application is a function of the accuracy of its predictions. There are no absolute notions of "best" or "correct" in model validity assessment, because the degree to which a model needs to reflect or mimic a real-world system varies from case to case. In addition,
because verification is a check for internal consistency, it's possible for a model to be verifiable and
yet fail validation because of errors in the conceptual model.
Executing the simulation ideally generates the output data that can illustrate or answer the problem
initially identified in the problem space. Depending on the methods used, the amount of processing and time required to generate the needed data may be extensive. For example, predicting protein structure using ab initio methods can involve thousands of iterations and take days of supercomputer time to arrive at statistically reliable results.
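A sketch of this step, assuming a hypothetical stochastic predictor standing in for an expensive ab initio run: many independent runs are aggregated so that the reported result is statistically stable rather than an artifact of a single trial.

    import random
    import statistics

    def run_simulation(seed):
        """Stand-in for one expensive ab initio run; returns a hypothetical
        energy score for the predicted structure."""
        rng = random.Random(seed)
        return -100.0 + rng.gauss(0.0, 5.0)

    # Aggregate many independent runs into a statistically reliable summary.
    scores = [run_simulation(seed) for seed in range(1000)]
    print(f"mean energy: {statistics.mean(scores):.2f}  "
          f"stdev: {statistics.stdev(scores):.2f}")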
Visualizing the output data opens the simulator output to human inspection, especially if the output is in the form of 3D graphics that can be assessed qualitatively rather than as tables of textual data. For example, even though the structure of a protein may be described completely in a text file that follows the PDB format, the data take on more meaning when they can be visualized as a 3D rendering.
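A minimal sketch of that last step, assuming matplotlib is available and a local PDB-format file (the filename is hypothetical): alpha-carbon coordinates are pulled from the fixed-column ATOM records and rendered as a 3D backbone trace.

    import matplotlib.pyplot as plt

    # Extract alpha-carbon coordinates from the fixed-column ATOM records.
    xs, ys, zs = [], [], []
    with open("protein.pdb") as f:  # hypothetical local file
        for line in f:
            if line.startswith("ATOM") and line[12:16].strip() == "CA":
                xs.append(float(line[30:38]))
                ys.append(float(line[38:46]))
                zs.append(float(line[46:54]))

    # The same columns of text become a trace that can be inspected qualitatively.
    ax = plt.figure().add_subplot(projection="3d")
    ax.plot(xs, ys, zs, marker=".")
    plt.show()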