This reaction for determining the sequences of peptides and proteins from their N-terminus was reported by Pehr Edman in the early 1950s (1-3). It is now one of the oldest molecular methods, if not the oldest, still in wide use (see Protein Sequencing). It uses a three-stage reaction cycle for stepwise removal of amino acid residues from the N-terminus of a polypeptide chain. The three stages of the Edman cycle (Fig. 1) are (i) coupling, (ii) cyclization, and (iii) conversion.
Figure 1. Principles of the Edman reaction (left). Also shown is the DABITC (right) double-coupling method. Key: PITC, phenylisothiocyanate; ATZ, anilinothiazolinone; PTH, phenylthiohydantoin; Dansyl (or DNS), 1-dimethylaminonaphthalene-5-sulphonyl; DABITC, 4-N,N-dimethylaminoazobenzene-4′-isothiocyanate.
(i) Coupling involves a nucleophilic attack of the N-terminal α-amino group on the isothiocyanate carbon of phenylisothiocyanate (PITC), to form the phenylthiocarbamyl-peptide derivative (PTC-peptide).
(ii) Cyclization constitutes the formation of the anilinothiazolinone derivative (ATZ-amino acid) in anhydrous acid, thereby liberating the N-terminal amino acid residue in a cyclic form, and leaving the remaining peptide truncated by one residue (Fig. 1).
(iii) Conversion involves the rearrangement of the liberated ATZ derivative to the corresponding phenylthiohydantoin derivative (PTH-amino acid) by opening of the CS bond and re-closure with the CO bond.
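The bookkeeping of the three stages can be sketched in code. The following is only an illustrative model, not a chemical simulation: the peptide is a string of one-letter residue codes, and the names `CycleResult`, `edman_cycle`, and `sequence` are hypothetical helpers introduced here. Every step is assumed to proceed with complete yield, which real degradations do not achieve.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CycleResult:
    pth_residue: str  # label of the released residue, e.g. "PTH-G"
    remaining: str    # peptide shortened by one residue


def edman_cycle(peptide: str) -> CycleResult:
    """Model one Edman cycle on a peptide in one-letter code (idealized)."""
    if not peptide:
        raise ValueError("empty peptide: nothing left to degrade")
    # (i) Coupling: PITC reacts with the N-terminal amino group (PTC-peptide).
    ptc_peptide = "PTC-" + peptide
    # (ii) Cyclization: anhydrous acid releases the first residue as its
    # ATZ derivative and leaves the peptide one residue shorter.
    first = ptc_peptide[len("PTC-")]
    atz = "ATZ-" + first
    remaining = ptc_peptide[len("PTC-") + 1:]
    # (iii) Conversion: the ATZ derivative rearranges to the PTH derivative.
    pth = "PTH-" + atz[len("ATZ-"):]
    return CycleResult(pth_residue=pth, remaining=remaining)


def sequence(peptide: str) -> List[str]:
    """Repeat the cycle until the peptide is used up; return the PTH labels."""
    released = []
    while peptide:
        result = edman_cycle(peptide)
        released.append(result.pth_residue)
        peptide = result.remaining
    return released
```

Calling `sequence("GLY")` releases the residues in order from the N-terminus, one per cycle, which is the property that makes the reaction usable for sequence determination.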
The reaction of protein amino groups with isocyanates was known before Edman’s report, but the degradative yield using isocyanates was low. Edman increased the yields to useful levels by introducing isothiocyanates in place of isocyanates, and by determining the exact reaction conditions and pointing out the need for pure chemicals.
This reaction constitutes the start of protein chemical sequencing on a routine scale; hence it really marks the start of molecular biology as a whole. The reaction is still used in most protein laboratories worldwide and is the basis of chemical protein sequencers; it is a classical method that was still "modern" in molecular biology at the end of the 1990s. Essentially, the only changes made to this reaction over the past 50 years are:
• use of other solvents, giving alternative and more rapid protocols,
• continued automation, miniaturization, and increase in reaction speed, now making the reaction useful for routine applications at the picomole level in 30-min cycles per residue.
It was a remarkable accomplishment of Edman to design this reaction and to directly set out all the conditions, so that it has stayed unsurpassed for 50 years of worldwide protein chemistry.
1. Protein Sequencers
After development of the Edman reaction, Edman automated the process, developing a machine to make all additions, extractions, and lyophilisations. This constituted the birth of the protein sequenator (or sequencer); the basic concept for a complete machine was published by Edman and Begg in 1967 in a now classic paper (4) in the then newly-started European Journal of Biochemistry.
In the first-generation sequencers, a "spinning cup" constituted the reaction center, and the sample was kept in place by centrifugal force, making extractions and reactions possible. Extraction losses were a problem in each cycle, however, soon leading Laursen and others to the idea of solid-phase attachment of the peptide to be degraded (see Solid-phase synthesis). Consequently, only a few years later, the solid-phase sequencer was reported (5). Although both types of sequencers were based on the same Edman chemistry (3), their different properties, including the different attachments of the samples, made them suitable for different analyses. The liquid-phase sequencer was excellent for proteins, and the solid-phase instrument was better for short peptides, because of their different sensitivities to extractive losses and to build-up of background signals.
2. Manual Edman Degradations
Once the Edman reaction and the protein sequencer existed, the limiting factors were peptide purification on the one hand, and phenylthiohydantoin (PTH)-amino acid identification on the other. For a long time, suitable chromatographic methods did not exist, hampering both purification and identification. Peptide purification, especially the separation of large peptides, was a problem, as was identification of the PTHs liberated in each Edman cycle. The PTH derivatives were initially identified by paper chromatography, which was not fully reliable (not all derivatives separated in one analytical step) and also required the use of several solvent systems, which wasted time and material. Subsequent developments introduced gas chromatography and thin-layer chromatography, both increasing the speed of identification, but still not giving a reliable one-step identification of all amino acids. These limitations contributed to the development of alternative methods, such as the dansyl and DABITC types of analysis, which did not require a major investment in equipment. One major manual method used dansyl (1-dimethylaminonaphthalene-5-sulphonyl) chloride to detect the new N-terminus after each Edman cycle (6) (see Dansyl Chloride). The main advantage was that it was easier to identify the residues released sequentially. Its primary importance is that it made protein sequence determination accessible to many laboratories.
A similar and later development in manual sequence analysis was the DABITC (4-N,N-dimethylaminoazobenzene-4′-isothiocyanate) method (7). In this case, protein degradation is carried out by coupling with the strongly colored DABITC, in place of PITC (Fig. 1). The DABTHs produced, which correspond to PTHs, are easily detectable by thin-layer chromatography (7). However, DABITC has a low coupling yield in the Edman reaction, necessitating a second coupling stage with ordinary PITC (Fig. 1, right) before the subsequent cyclization step.
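The reason for the second coupling can be illustrated with simple arithmetic. The sketch below is an assumption-laden illustration, not a description of measured yields: the function name `double_coupling` and all numerical values are hypothetical. It assumes that PITC can react only with chains that DABITC left uncoupled, and that chains still uncoupled after both stages are not cleaved in that cycle.

```python
def double_coupling(dabitc_yield: float, pitc_yield: float):
    """Return (DABITC-coupled, PITC-coupled, uncoupled) chain fractions.

    Both yields are hypothetical fractions between 0 and 1.
    """
    for y in (dabitc_yield, pitc_yield):
        if not 0.0 <= y <= 1.0:
            raise ValueError("yields must be between 0 and 1")
    # First coupling: the colored reagent labels only part of the chains.
    dab_fraction = dabitc_yield
    # Second coupling: PITC reacts with the chains DABITC left free.
    pitc_fraction = (1.0 - dabitc_yield) * pitc_yield
    # Chains missed by both reagents would stay intact in this cycle.
    uncoupled = 1.0 - dab_fraction - pitc_fraction
    return dab_fraction, pitc_fraction, uncoupled


# Illustrative values only, not taken from the literature.
dab, pitc, left = double_coupling(0.5, 0.98)
print(f"DABITC-coupled: {dab:.2f}, PITC-coupled: {pitc:.2f}, uncoupled: {left:.2f}")
```

Under these assumptions, only the DABITC-coupled fraction yields a colored derivative for detection, while the PITC stage serves to keep the remaining chains in step with the degradation.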
At the time, both the dansyl and the DABITC methods were important, but since the 1980s these methods have gradually decreased in importance because of reliable sequencer on-line HPLC identification of PTH-derivatives (see below).
The PTH-amino acid identification problem was finally solved with the introduction of HPLC in 1976 (8); subsequently, PTH amino acid identification became rapid, routine, and reliable. Although some identifications may still be difficult, PTH identification no longer constitutes the limiting factor in time or reliability.
At the same time, HPLC separations and the subsequent development of a whole battery of different chromatographic media soon also solved the problem of purification of peptide fragments from proteolytic digests.
4. Second-Generation Chemical Sequencers: Automation of All Steps
With the development of protein sequence analyzers, it became possible to determine amino acid sequences routinely, and there was an exponential increase in known protein sequences. However, analysis at this stage was still time-consuming and nonautomatic, requiring knowledge and real research. Gradually, however, a set of further inventions increased the automation and brought the actual sequence analysis more or less to the present-day automatic stage. This development essentially relied on three further inventions/improvements.
One concerned the attachment of the protein/peptide to the sequencer for analysis. This important step had several sub-steps. One was the realization that an organic cationic polymer, Polybrene (9, 10), was a suitable material capable of binding proteins and peptides to glass surfaces or other membranes. The resulting minimization of extraction losses in the washing steps of each cycle made degradations possible through to the very C-terminus of most peptides. More importantly, apart from the improvement in the lengths of degradation possible, use of Polybrene also meant that the peptide for degradation could be attached to surfaces; this opened the way to abandoning the use of centrifugal force to hold the peptide in place. Instead, peptides could now be attached to membranes, essentially moving the degradation from the traditional "liquid phase" system to the advantageous "solid phase" (which had been started earlier by covalent attachments, cf. above) and "gas-phase" systems (11). The latter are solid phase for the attachment of the peptide to a support, and gas phase for the introduction of some reagents. These approaches are still in use and now allow rapid and sensitive analysis. Many other sub-steps were involved in this transformation, notably the introduction of chemical attachments of peptides to membranes (12) and the successive development of alternative, miniature column attachments for sample introductions and preparations (13).
The second improvement at this stage was the introduction of dead volume-free valve blocks, allowing use of valves with an absolute absence of cross-contamination between the reagents used in the reaction (14). In this manner, further increases in speed (because of less washing) and sensitivity (because of higher yield from lack of cross-contamination) made sequencers still more useful and rapid. Recently, this has been carried still further, and soon "chip-based blocks" may be encountered in sequencers, with all solvent delivery and removal stages in extremely small volumes of "chip-blocks" (15).
The third major advance at this stage was the introduction of automatic methods for the conversion of a thiazolinone from each cycle into the corresponding thiohydantoin. This became possible because of the introduction of a second reaction vessel (16), with separate reagents and reactions.
Once the conversion had become automated, it became possible also to link the subsequent PTH identification step to the cycle of automatic events in the sequencer, thus opening the road to on-line identification of the liberated amino acid derivative in each cycle. These on-line modes were started very early, with the introduction of HPLC, and were soon commercialized and perfected in a new set of complete sequencers, starting with the "gas-phase sequencer" that was available in the 1980s (11). Soon, the on-line approach was coupled with post-PTH-identification data treatments, allowing extensive computer interpretation at each step. All chromatograms can now be stored and compared with on-line computers and further interpreted and related to sequences in databanks, analyzed in modeling programs, and submitted to further computerized adjustments.