Biomedical Engineering Reference
the peaks are aligned using the RANSAC [28] Peak aligner, which is a
method to join all the separate peak lists into one master list, accounting
for both linear and non-linear deviations in retention time. An alternative
is the simple Join aligner, which uses mz and RT windows. Because many
small, possibly spurious, peaks may be detected in single runs, the
combined table can be constrained to entries where there are a minimum
of, say, 20 occurrences of that peak. Figure 4.10(c) shows how this is
configured in mzMine. Finally (Figure 4.10(d)), in order to identify the
peaks, a custom database of accurate mass/retention times measured on
standard compounds is used. This library is simply a comma-separated
value file (.CSV) listing the mz, RT, molecular formula and name of each
metabolite. The retention times are determined beforehand by injecting
standard samples onto the system. There are also options to search
online databases such as ChemSpider, KEGG, METLIN, etc., but the hits
are often rather promiscuous, returning many research chemicals, drugs
and mammalian metabolites. These may be irrelevant and misleading
when the experiment concerns a limited, defined space, such as plant
metabolites.
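As a rough illustration of the two ideas above (keeping only rows detected in a minimum number of runs, and identifying peaks against a custom CSV library), the following Python sketch may help. It is not mzMine's implementation. The column names, the ppm and RT tolerances, the function names and the two library entries (with invented retention times) are all assumptions made for the example.

```python
import csv
import io

# Hypothetical library file contents. The m/z values are [M-H]- ions;
# the retention times are invented for illustration, not measured data.
LIBRARY_CSV = """mz,rt,formula,name
179.0561,1.20,C6H12O6,Glucose
191.0197,2.40,C6H8O7,Citric acid
"""


def load_library(text):
    """Parse the CSV library into a list of dicts with numeric mz and rt."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {
            "mz": float(row["mz"]),
            "rt": float(row["rt"]),
            "formula": row["formula"],
            "name": row["name"],
        }
        for row in reader
    ]


def identify(peak_mz, peak_rt, library, ppm_tol=5.0, rt_tol=0.2):
    """Return names of library entries within both tolerance windows.

    ppm_tol and rt_tol (minutes) are assumed defaults, not mzMine settings.
    """
    hits = []
    for entry in library:
        ppm_error = abs(peak_mz - entry["mz"]) / entry["mz"] * 1e6
        if ppm_error <= ppm_tol and abs(peak_rt - entry["rt"]) <= rt_tol:
            hits.append(entry["name"])
    return hits


def filter_min_occurrences(table, min_count):
    """Keep rows detected in at least min_count runs.

    table maps a row id to a list of peak heights, one per run,
    with None marking a run in which the peak was not detected.
    """
    return {
        row_id: heights
        for row_id, heights in table.items()
        if sum(h is not None for h in heights) >= min_count
    }


library = load_library(LIBRARY_CSV)
```

Calling `identify(179.0562, 1.25, library)` matches the first entry because both the mass error (under 1 ppm) and the retention time difference (0.05 min) fall inside the assumed windows. A peak with the same mass eluting at 3.0 min matches nothing. Requiring agreement in both dimensions is what keeps a small, curated library from producing the promiscuous hits described for the online databases.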
Once all the stages are configured satisfactorily it is possible to run the
operations in batch mode. This can take some time, and a multicore
processor is useful as mzMine is multithreaded. For the small example
data set illustrated, this operation took approximately 5 minutes (PC:
HP Z600 8-core 2.4 GHz Xeon workstation with 8 GB RAM running
64-bit Windows 7). It is not uncommon in our laboratory to run analyses
that take many hours of overnight operation for a typical metabolomic
study. The final step in the workflow is an Export to CSV option that
allows the export of the final spreadsheet for downstream analysis.
The end result of the data processing workflow is shown in Figure 4.11
'RANSAC Aligned min 20 peaks'. Peaks that are missing are shown as red
spots in the table (shown boxed in the figure). As missing data are undesirable,
mzMine can be configured to fill missing peaks using the regions defined in
the peak table. This ensures a reading of real data, which is preferable for
later statistical analysis. mzMine has two main gap-filling options: 'Peak
finder' and 'm/z and RT range gap filler'. The former looks for undetected
peaks in the same region as other scans, whereas the mz and RT gap filler
simply finds the highest data point within the defined range.
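The simpler of the two gap-filling strategies can be sketched in a few lines of Python. This is only an illustration of the idea as described here, not mzMine's code. The function name, the `(rt, mz, intensity)` tuple layout and the choice to return 0.0 when the window is empty are all assumptions.

```python
def fill_gap_by_range(data_points, mz_range, rt_range):
    """Return the highest intensity inside an m/z and RT window.

    data_points: iterable of (rt, mz, intensity) tuples from one raw run.
    mz_range, rt_range: (low, high) bounds taken from the aligned peak row.
    Returns 0.0 if no data point falls inside the window (an assumption).
    """
    mz_lo, mz_hi = mz_range
    rt_lo, rt_hi = rt_range
    in_window = [
        intensity
        for rt, mz, intensity in data_points
        if mz_lo <= mz <= mz_hi and rt_lo <= rt <= rt_hi
    ]
    return max(in_window, default=0.0)


# Invented raw data points for one run: (rt in minutes, m/z, intensity).
points = [
    (1.0, 179.056, 120.0),
    (1.1, 179.057, 450.0),
    (1.2, 179.055, 300.0),
    (1.1, 250.000, 9000.0),  # right RT, wrong m/z
    (5.0, 179.056, 8000.0),  # right m/z, wrong RT
]
filled = fill_gap_by_range(points, (179.05, 179.06), (0.9, 1.3))
```

Here `filled` is 450.0: the two larger intensities are ignored because each lies outside the window in one dimension. No peak shape is checked, so the value is a real reading from the raw data but may be only baseline noise.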
At this stage it is most likely that the data will be processed further in
a commercial data analysis package, but there are a few basic data
visualisation tools included in mzMine. Analysis options include:
coefficient of variation (CV) analysis, log ratio analysis, principal
component analysis, curvilinear distance analysis, Sammon's projection