Toward Understanding the Intelligent Properties of Biological Macromolecules—Implications for Their Design Into Biosensors - Smart Biosensor Technology


sequence variants of DNA-protein complexes may eventually be predictable through simple calculations. This would be a valuable capability, since the laborious experimental generation of altered DNA sequences and testing of their DNA-protein stabilities and properties would not need to be performed prior to their use in biosensors.

1.4 The Importance of Informatics and Data Mining Approaches in Understanding Biological Macromolecules and in Biosensor Design and Operation

The discipline of computer science provides ever more sophisticated and important software applications to all areas of science. Increasingly important are informatics and data mining approaches applied to the understanding of large, complex data sets related to the structures, functions, and intelligent properties of biological macromolecules. This understanding is important both for the potential integration of macromolecules into biosensors and for the analysis of a complex smart biosensor's input data before a simple output report is provided to the user. While the analysis of complex biosensor input is important in some instances, it is a situation we have not encountered in our biosensor development to date. Therefore, we do not discuss it in great detail. Rather, we illustrate the importance of these data mining techniques by discussing two representative examples in which complex, high-dimensionality biosensor data can only be properly addressed using an informatics and data mining approach involving machine learning techniques.

1.4.1 Machine Learning Approaches

A number of the machine learning techniques that are sometimes used in the analysis of biosensor output are also important in the analysis of large multidimensional, nonlinear biological, biochemical, and chemical data sets. Such analyses can be used to understand the intelligent properties of specific biological macromolecules as well as to incorporate them into the design of biosensors. The application of informatics and data mining to data sets in these domain areas has increasingly been termed bioinformatics and cheminformatics, respectively. The latter term overlaps with, and has somewhat superseded, an older term in the chemical literature, chemometrics.

Machine learning is the derivation of general knowledge from specific data sets using statistical techniques and analytical algorithms. The data sets are searched using potential hypotheses with the goal of building descriptive and predictive models general enough that they may allow one to draw similar conclusions from other related data sets. Machine learning can involve either classification techniques, in the case of supervised learning methods, or clustering techniques, in the case of unsupervised learning methods. Supervised machine learning methods can be utilized in the analysis of complex biosensor data where the classes of analytes are already known and are used to train the entire system of biosensor inputs. Thus, machine learning methods could be used to recognize the biosensor input signal attributes that most accurately identify the correct analyte from among many incorrect possibilities. The resulting learned combination of input signal attributes that accurately determines the analyte identity and concentration would then be applied to unknown samples to provide the routine biosensor output to the end user. Classification techniques vary from simple testing of sample features for statistical significance to sophisticated probabilistic modeling techniques. Algorithmic methods widely used in machine learning include Naive Bayes, neural networks, support vector machines, instance-based learning (K-nearest neighbor), and logistic regression, to name a few (173).
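To make the supervised workflow described above concrete, the following is a minimal sketch of one of the listed methods, K-nearest neighbor, applied to analyte identification. It is an illustration only and not a method taken from the chapter: the two-attribute signal vectors, the labels `analyte_A` and `analyte_B`, and the function name `knn_classify` are all hypothetical, and the numeric values are invented. The sketch covers analyte identity only, not concentration.

```python
import math
from collections import Counter

# Hypothetical training set: each entry pairs a vector of biosensor
# signal attributes with the known analyte class (all values invented).
TRAINING = [
    ((0.10, 0.90), "analyte_A"),
    ((0.20, 0.80), "analyte_A"),
    ((0.15, 0.85), "analyte_A"),
    ((0.90, 0.20), "analyte_B"),
    ((0.80, 0.10), "analyte_B"),
    ((0.85, 0.15), "analyte_B"),
]


def knn_classify(sample, training, k=3):
    """Return the majority label among the k training samples nearest to `sample`."""
    # Rank the labeled samples by Euclidean distance to the unknown sample.
    by_distance = sorted(training, key=lambda item: math.dist(sample, item[0]))
    # Let the k closest labeled samples vote on the class.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Apply the trained (here, simply stored) examples to unknown samples.
print(knn_classify((0.12, 0.88), TRAINING))  # prints "analyte_A"
print(knn_classify((0.88, 0.12), TRAINING))  # prints "analyte_B"
```

In this sketch the "training" step is just storing labeled examples, which is what makes K-nearest neighbor an instance-based method; the other techniques named above would instead fit model parameters to the same kind of labeled data.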
