Biomedical Engineering Reference
In-Depth Information
format any time soon, many have the ability to export data into common
open formats. There are also a number of converters available [7, 8], but
care must be taken to preserve the integrity of the data.
mzXML [9], mzData and mzML [10, 11] are open, XML (eXtensible
Markup Language)-based formats originally designed for proteomics mass
spectrometric data. mzData was developed by the HUPO Proteomics
Standards Initiative (PSI), whereas mzXML was developed at the Seattle
Proteome Center. In an attempt to unify these two rival formats, a new
format called mzML has been developed [11]. A further open format is
JCAMP-DX [12], an ASCII-based representation originally developed for
infra-red spectroscopy. It is seldom used for MS due to the size of the data
sets. Finally, ANDI-MS is also open, and based on netCDF, a data interchange
format used in a wide variety of scientific areas, particularly geospatial
modelling. It is specified under the ASTM E1947 [13] standard. These are
complemented by the wide variety of vendor-led formats in routine use in
the field, the most common of which are .raw (Thermo Xcalibur, Waters/
Micromass MassLynx or Perkin Elmer); .D (Agilent); .BAF, .YEP and
.FID (Bruker); .WIFF (ABI/Sciex) and .PKL (Waters/Micromass MassLynx).
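To give a feel for how the XML-based formats described above carry spectral data, the following Python sketch builds and then reads back a heavily simplified, mzML-like fragment. It is an illustration only: the fragment is hypothetical and not schema-valid (real mzML files use XML namespaces, controlled-vocabulary accession numbers, and optional compression and 32-bit precision), and the helper names encode_array, decode_array and read_spectrum are invented for this example. The sketch assumes uncompressed, little-endian 64-bit floats encoded as base64 text.

```python
import base64
import struct
import xml.etree.ElementTree as ET


def encode_array(values):
    """Pack floats as little-endian 64-bit doubles, then base64-encode them."""
    packed = struct.pack(f"<{len(values)}d", *values)
    return base64.b64encode(packed).decode("ascii")


def decode_array(text):
    """Reverse encode_array: base64 text back to a list of floats."""
    raw = base64.b64decode(text)
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))


# Hypothetical example data: three peaks (m/z, intensity).
mz_values = [100.0, 150.5, 200.25]
intensities = [1200.0, 56.0, 870.5]

# Simplified, mzML-like fragment (assumption: not a complete or valid mzML file).
fragment = f"""
<spectrum id="scan=1">
  <binaryDataArray>
    <cvParam name="m/z array"/>
    <binary>{encode_array(mz_values)}</binary>
  </binaryDataArray>
  <binaryDataArray>
    <cvParam name="intensity array"/>
    <binary>{encode_array(intensities)}</binary>
  </binaryDataArray>
</spectrum>
""".strip()


def read_spectrum(xml_text):
    """Return a dict mapping each array name to its decoded list of floats."""
    root = ET.fromstring(xml_text)
    arrays = {}
    for data_array in root.findall("binaryDataArray"):
        name = data_array.find("cvParam").get("name")
        arrays[name] = decode_array(data_array.find("binary").text)
    return arrays


spectrum = read_spectrum(fragment)
peaks = list(zip(spectrum["m/z array"], spectrum["intensity array"]))
print(peaks)
```

Because the numeric arrays are stored as encoded binary rather than plain text, a converter that misreads the precision or byte order will silently corrupt the values, which is one reason care is needed when converting between formats.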
The remainder of this chapter will demonstrate the application of open source
tools to typical mass spectrometry situations.
4.4.2 Analysing mass spectroscopy data using R
A number of tools for mass spectrometry have been written in the R
language for statistical computing [14]. Versions are available for Linux,
Mac or Windows, making it compatible with a broad range of computing
environments. One of the most convenient ways to run R is to use RStudio
(http://rstudio.org/), which serves as an integrated development environment.
This software allows the editing of scripts, running commands, viewing
graphical output and accessing help in an integrated system. Another very
useful feature of R is the availability of 'packages', which are pre-written
functions that cover almost every area of mathematics and statistics.
4.4.3 Obtaining a formula from a given mass
Our first example uses a Bioconductor package for the R language written
by Sebastian Böcker at the University of Jena [15, 16]. The package uses
various chemical intelligence rules to infer possible formulae from a given