Core methodologies (Proteomics)

There is no science without fancy, and no art without facts - Vladimir Nabokov

These are exciting times for proteomics. While technological progress at the frontiers of the science continues unabated, the basic ability to identify proteins in a mass spectrometer justifies much of the fuss. This section describes techniques that form the “bread and butter” of proteomics – techniques that are within the reach of many laboratories and are increasingly provided as “core facilities” within research institutes. Indeed, one of the major challenges is not merely technical but to communicate to the general research community exactly what proteomics can and cannot do. In this respect, Section 1, “Core Methodologies”, goes some way to expounding the state of the art. Most of the methods described can be considered routine in a proteomics lab, yet they greatly extend the power of virtually any type of experiment involving proteins. The significance of protein and peptide ionization methods coupled to mass spectrometry (MS) was recognized by the award of the Nobel Prize for Chemistry in 2002 to John Fenn and Koichi Tanaka. But before the late 1990s, proteins were almost exclusively identified using antibody-based methods such as ELISA and western blot, or using Edman degradation for protein microsequencing.

Nowadays, researchers who can accurately and rapidly identify proteins have three very powerful advantages in pursuing their research goals. First, because protein identification can be achieved in a semiautomatic manner, the type of project that can be undertaken is greatly extended. Examples in this and subsequent sections range from the analysis of chemical modification of single proteins, to the characterization of protein complexes isolated by affinity precipitation, to examining the proteome dynamics of whole cells under conditions of health and disease. These studies are possible partly because the feedback cycle between experiment and result has been dramatically shortened with proteomics. It is therefore feasible to develop a new model in a reasonable amount of time, or to analyze more samples under more conditions in order to give greater statistical power. Second, researchers are not limited to identifying only those proteins for which tests (or antibodies) are available. Even where tests are available, they may not be uniformly sensitive or robust, making large-scale or parallel studies problematic or impossible. Meanwhile, although the goal of comprehensive genome-wide qualitative and quantitative analysis (already realized for mRNA) is not yet a reality for proteins, recent studies in organisms such as yeast and the malaria parasite can clearly be described as “systemwide”, with thousands of proteins being identified. A third advantage of proteomics is that all or most of the proteins in a sample can be identified, not just those specifically chosen for testing according to some criteria. This opens the door to chance discovery. Proteomics studies, in common with other so-called omics approaches, have sometimes been criticized for lacking central hypotheses, and these criticisms deserve to be taken on board. 
In the most satisfying work, however, experimental design that directly addresses a hypothesis has been combined with the possibility of making new discoveries. In other words, hypothesis testing and discovery science need not be incompatible. If “finding out new things” is what makes for successful science, then we can expect many more successes from proteomics.


These successes are being driven by machines, mass spectrometers, whose origins, like much of molecular biology itself, can be traced to pioneering physicists like Joseph John Thomson and Francis William Aston at the beginning of the twentieth century. It would be a disservice to the many outstanding intellectual achievements to attempt to summarize the history of MS here, but the key methods contributing to the analysis of peptides and proteins followed the development of soft desorption and ionization techniques, particularly matrix-assisted laser desorption ionization (MALDI) and electrospray ionization (ESI). These methods allow large biomolecules such as peptides to enter a mass spectrometer. Getting biomolecules into the instrument as intact charged species was the major barrier to the development of the field. Both methods are discussed in these pages (see Article 9, Quadrupole ion traps and a new era of evolution, Volume 5), and it is interesting that both continue to be widely used, with no single ionization method dominating to date. One may generalize and say that MALDI is more amenable to automation and high-throughput experimentation, while ESI has been adapted very successfully to separation methods such as liquid chromatography. Almost by definition, a scientist engaged in a proteomics experiment will be dealing with mixtures of proteins of some complexity, and in general, the complexity needs to be reduced before or during the ionization process. This is because of limits that result from the design of the various instruments, for instance, duty cycle, dynamic range, and signal saturation.

The complexity of proteomics samples is addressed through a mixture of sample preparation and component separation. Here, an overview of sample preparation is provided (see Article 2, Sample preparation for proteomics, Volume 5), including proteolytic digestion with trypsin, a very important method in most proteomics laboratories (see Article 16, Improvement of sequence coverage in peptide mass fingerprinting, Volume 5). Strategies to separate proteins may be applied to intact proteins or to their proteolytic digests. Often, separation by one feature of the chemistry of proteins or peptides is sufficient, but the complexity of many proteomic samples is such that at least one additional feature must be used. Two-dimensional gels (see Article 29, Two-dimensional gel electrophoresis (2-DE), Volume 5), for instance, separate proteins according to their isoelectric points and mass, while the MudPIT technique (Multidimensional Protein Identification Technology) separates peptides according to their charge and hydrophobicity (see Article 13, Multidimensional liquid chromatography tandem mass spectrometry for biological discovery, Volume 5). Many variations are possible, constrained only by the need to keep the peptides solvated and ultimately to ionize them. Capture technologies that trap and separate proteins or peptides on the basis of the presence of moieties such as carbohydrate or phosphate groups have been described, and in principle, almost any type of chemistry that can separate proteins or peptides on the basis of their distinct chemistries can be exploited. However, unlike, say, oligonucleotides, proteins are chemically heterogeneous, and certain classes of proteins, while very important biologically, are difficult to adapt to normal sample preparation protocols. Here, strategies that may be used to prepare membrane proteins for MS are described (see Article 15, Handling membrane proteins, Volume 5).

One trend that has been consistent across all types of proteomics technologies is miniaturization. This has been driven by factors such as the need to analyze thousands of proteins in parallel, limited amounts of sample, economics, and especially in the case of mass spectrometry, by the need for increased sensitivity. The development of micro- and nanoscale liquid chromatography was especially important, and both Nano-MALDI and Nano-ESI MS are described here (see Article 11, Nano-MALDI and Nano-ESI MS, Volume 5). A very useful description of making nanocolumns and tips is also included (see Article 19, Making nanocolumns and tips, Volume 5). This method can lead to very significant gains in the sensitivity of electrospray experiments with complex peptide mixtures. An important new technique for analysis at the cell and tissues level is laser-based microdissection (see Article 6, Laser-based microdissection approaches and applications, Volume 5), and this approach is likely to have important clinical research applications.

The range of different mass spectrometers can be overwhelming to the newcomer (and latecomer!). Both traditional and emerging aspects of the workhorse instruments are treated here: time-of-flight MS (see Article 7, Time-of-flight mass spectrometry, Volume 5), quadrupole MS (see Article 8, Quadrupole mass analyzers: theoretical and practical considerations, Volume 5), and quadrupole ion trap MS (see Article 9, Quadrupole ion traps and a new era of evolution, Volume 5). The most useful features of different architectures have been intelligently combined to create hybrid instruments (see Article 10, Hybrid MS, Volume 5). Similarly, the use of the extremely powerful class of Fourier transform ion cyclotron resonance (FT-ICR) instruments is becoming more widespread (see Article 5, FT-ICR, Volume 5). An important aspect of these multistage MS techniques is the use of instrument software that automates the process. This is particularly the case for tandem mass spectrometry experiments, where the ions to be fragmented following a primary “scan” can be selected using predetermined rules, eliminating the need for constant supervision of the experiment. As discussed (see Article 18, Techniques for ion dissociation (fragmentation) and scanning in MS/MS, Volume 5), tandem mass spectrometry is a major means of acquiring structural information about peptide molecules, particularly the amino acid sequence. Although tandem mass spectrometry is normally associated with quadrupole or ion trap instruments, the somewhat related phenomenon of post-source decay means that this form of structural analysis can now be carried out by time-of-flight instruments, considerably extending its application.
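The idea of selecting ions for fragmentation by predetermined rules can be illustrated with a small sketch. The rules used here (an intensity threshold, a "top N" cut-off, and exclusion of recently fragmented m/z values) are assumptions chosen for illustration, as are the function name and parameter values; real instrument software applies its own, vendor-specific criteria.

```python
def select_precursors(survey_scan, recently_fragmented, top_n=3,
                      min_intensity=1000.0, exclusion_tol=0.5):
    """Pick precursor m/z values from a survey scan using fixed rules.

    survey_scan: list of (mz, intensity) peaks from the primary scan.
    recently_fragmented: m/z values already fragmented and to be skipped.
    All thresholds are hypothetical values for illustration only.
    """
    candidates = [
        (mz, intensity)
        for mz, intensity in survey_scan
        # Rule 1: ignore weak peaks.
        if intensity >= min_intensity
        # Rule 2: skip anything close to a recently fragmented ion.
        and all(abs(mz - seen) > exclusion_tol for seen in recently_fragmented)
    ]
    # Rule 3: keep only the most intense remaining peaks.
    candidates.sort(key=lambda peak: peak[1], reverse=True)
    return [mz for mz, _ in candidates[:top_n]]


# Toy survey scan (invented values).
scan = [(400.2, 5000.0), (512.3, 800.0), (623.8, 12000.0),
        (701.4, 3000.0), (850.9, 2500.0)]
chosen = select_precursors(scan, recently_fragmented=[623.9], top_n=2)
```

In this toy scan the most intense peak is skipped because it lies within the exclusion tolerance of an ion that was already fragmented, so the next two strongest peaks above the threshold are chosen instead.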

However, it is often forgotten that the power of proteomics is not fully realized solely by a powerful instrument, or even by an optimal combination of separation method, ionization method, and instrument. Without automated methods for interpreting mass spectra, these protein “signatures” would accumulate many times faster than the ability of skilled workers to determine what they represent. The development of algorithms and programs that assist the process of identifying proteins from their cognate mass spectra was therefore another key technological breakthrough, allowing the science of proteomics to evolve. Several workers in the early 1990s independently devised methods for using publicly available DNA sequence databases to reduce the task of interpreting protein and peptide mass spectra to a tractable problem. Although similar in principle, the approaches of peptide mass fingerprinting (see Article 3, Tandem mass spectrometry database searching, Volume 5) and tandem mass spectrum interpretation (see Article 4, Interpreting tandem mass spectra of peptides, Volume 5) require different programs. The tutorials included here, which describe both approaches, will be very helpful (see Article 17, Tutorial on tandem mass spectrometry database searching, Volume 5).
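To make the principle of peptide mass fingerprinting concrete, the following is a minimal sketch under stated assumptions, not a description of any particular search program. It digests each database protein in silico with a simple trypsin rule (cleave after K or R, but not before P, with no missed cleavages), computes singly protonated monoisotopic peptide masses, and ranks proteins by how many observed masses they explain. The two-entry protein "database", the function names, and the plain match-count score are all hypothetical; real programs search sequences translated from large databases and use statistical scoring.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O added to the residue sum
PROTON = 1.00728   # mass of the charge-carrying proton


def tryptic_peptides(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mh(peptide):
    """Monoisotopic mass of the singly protonated peptide, [M+H]+."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def count_matches(observed, sequence, tol_ppm=50.0):
    """Number of observed masses explained by the protein's tryptic peptides."""
    theoretical = [peptide_mh(p) for p in tryptic_peptides(sequence)]
    return sum(
        any(abs(obs - theo) / theo * 1e6 <= tol_ppm for theo in theoretical)
        for obs in observed
    )


def rank_proteins(observed, database, tol_ppm=50.0):
    """Rank database entries by match count, best first."""
    scores = {name: count_matches(observed, seq, tol_ppm)
              for name, seq in database.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy database and "observed" masses derived from protA.
database = {"protA": "AAKGGRCC", "protB": "WWKFFRYY"}
observed = [peptide_mh("AAK"), peptide_mh("GGR")]
ranking = rank_proteins(observed, database)
```

Here both observed masses are explained by the hypothetical entry protA and none by protB, so protA ranks first. Interpreting tandem mass spectra requires a different kind of program, as noted above, because fragment ions rather than intact peptide masses are compared.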

The observation at the top of this text was made by Nabokov during an interview in 1966. Many of us are fortunate to arrive now onto a scene where approaches that required highly original thought and effort to develop are sufficiently embedded that we may cast our proteomic nets across the water in expectation of a plentiful catch. The work of these pioneering scientists should be applauded, as should the authors of the chapters following, many of whom are one and the same. To them we owe the ability of proteomics to combine fancy and facts.
