The Scripps Research Institute

Novel Amino Acids Come of Age

Some time in the few hundred million years between the formation of Earth's crust and the emergence of Earth's oldest known cells, cyanobacteria, protein synthesis evolved. Whether it evolved in cyanobacteria 3.5 billion years ago or in some earlier precursor long before is unknown. But we do know that the proteins were synthesized using only 20 amino acids because all higher organisms that have come after have used the same 20.

"Why do all forms of life we know use the same common 20 amino acids and only 20 (selenocysteine excluded)?" asks Peter Schultz, professor of chemistry and Scripps Family Chair of The Skaggs Institute of Chemical Biology. "Can we change this by adding new amino acids to the genetic code and can we use these new amino acids to change the structures and functions of proteins in interesting ways?"

The answer to his last two questions is now, apparently, yes.

About 15 years ago, when Schultz was at the University of California at Berkeley, he began inserting novel amino acids into proteins in vitro, and about five years ago, he began working on a project to do it in vivo. "We are interested in proteins because, as chemists, the most fascinating class of molecules are proteins, which have functions ranging from photosynthesis to signal transduction to gene regulation," says Schultz.

"And," he adds, "when chemists look at molecules, they say, 'How can we better understand how they work, and how can we rationally manipulate their structure to create interesting new functions?'"

Schultz reckoned that it would be possible to enhance the chemical, physical, and biological properties of proteins by adding novel amino acids—ones that are not among the 20 that all living organisms use. He wanted a way to do this easily and in vivo, because direct chemical or biochemical synthesis of a protein containing unusual amino acids, while possible, is limited, laborious, and low-yielding.

"To make proteins in a robust way, one has to do it inside cells—it's difficult to synthesize proteins in a test tube," says Schultz.

Success Story

Schultz, his student Lei Wang, and their colleagues announced in a Science paper earlier this year that they had succeeded in adding the new amino acid O-methyl-tyrosine to the genetic code of E. coli. O-methyl-tyrosine is the first of several new amino acids they are working on site-specifically inserting in vivo—a proof of principle.

Also called "unnatural" because they are not among nature's original 20, these novel amino acids have the same carboxy–amino backbone as the 20 standard amino acids but different side chains. Some have just slightly altered chemical structures and others have new functional groups added. In proteins, these differences may alter everything from structure and folding to activities. Certain "designer" side chains may even impart novel functionality.

With his new methodology, Schultz believes it will not be long before this technology is here. "It should be possible to put in a fluorescent probe or a photoaffinity label at any site on any protein in the cell," says Schultz.

The idiom "You can't teach an old dog new tricks" doesn't even come close to describing how difficult it should have been to make organisms incorporate novel amino acids into their protein chains. After all, every organism in nature has been using the same 20 amino acids since the primordial soup. Organisms have to maintain fidelity in replication, and so life has evolved myriad mechanical ways of making sure only those 20 amino acids get incorporated into proteins.

Schultz and his colleagues did it anyway.

Working the Code

When a protein is expressed, an enzyme reads the DNA bases of a gene (A, G, C, and T), and transcribes them into RNA (A, G, C, and U). This so-called "messenger RNA" is translated by another protein–RNA complex, called the ribosome, into a protein. The ribosome requires the help of transfer RNA molecules (tRNA) that have been "loaded" with an amino acid. Protein specificity comes from the fact that the tRNA recognizes only one codon and gets loaded with only the one amino acid that is specific for that codon.

For every codon on the mRNA (every three bases), the ribosome attaches another amino acid to the chain. But even though there are 4 x 4 x 4 = 64 different codons (UAG, ACG, UUC, etc.), there are only 20 amino acids that all organisms use to produce proteins. Some of the 64 codons are redundant, with several coding for the same amino acid, and three of them are nonsense codons: they don't code for any amino acid.
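The codon arithmetic above can be checked with a short Python sketch. It is only an illustration: the variable names are invented here, and the three stop codons listed are those of the standard genetic code.

```python
from itertools import product

BASES = "ACGU"  # the four RNA bases

# Every possible three-base codon: 4 * 4 * 4 combinations.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

# The three nonsense (stop) codons of the standard genetic code.
# UAG is the amber codon.
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Codons that do code for an amino acid.
sense_codons = [c for c in codons if c not in STOP_CODONS]

print(len(codons))        # 64
print(len(sense_codons))  # 61
```

Sixty-one sense codons shared among 20 amino acids is why several codons must code for the same amino acid.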

These nonsense codons are useful because normally when a ribosome that is synthesizing a protein reaches a codon that no tRNA recognizes, the ribosome dissociates and synthesis stops. Hence nonsense codons are also referred to as "stop" codons. One of these, called the amber stop codon, UAG, played an important role in Schultz's research.

If the cell is provided with an amber suppressor, a tRNA that recognizes UAG, then the ribosome will extend the chain with the amino acid that this tRNA carries, and synthesis does not stop. Wherever a codon in an mRNA is switched to UAG, the protein will carry the new amino acid at that position, and the ribosome will go on to read the full-length message.
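As a rough illustration of amber suppression, the Python sketch below translates a made-up message with and without a suppressor. Everything in it is an assumption for illustration: the four-entry codon table, the message, the `translate` function, and the label "OMeTyr" for the residue the suppressor tRNA carries. It is a toy model of the logic, not a description of the laboratory method.

```python
# A deliberately tiny codon table; the real table has 61 sense codons.
CODON_TABLE = {
    "AUG": "Met",
    "UUC": "Phe",
    "GGC": "Gly",
    "UAC": "Tyr",
}
STOP_CODONS = {"UAA", "UAG", "UGA"}


def translate(mrna, suppressor=None):
    """Translate mrna three bases at a time.

    suppressor is a hypothetical mapping from a stop codon to the residue
    carried by a suppressor tRNA, for example {"UAG": "OMeTyr"}.
    """
    suppressor = suppressor or {}
    chain = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in suppressor:
            # A suppressor tRNA reads the stop codon, so the chain grows.
            chain.append(suppressor[codon])
        elif codon in STOP_CODONS:
            # No tRNA reads this codon, so synthesis stops here.
            break
        else:
            chain.append(CODON_TABLE[codon])
    return chain


# Hypothetical message with an amber codon (UAG) at the third position.
message = "AUGUUCUAGGGCUACUAA"

print(translate(message))                     # ['Met', 'Phe']
print(translate(message, {"UAG": "OMeTyr"}))  # ['Met', 'Phe', 'OMeTyr', 'Gly', 'Tyr']
```

Without the suppressor the chain is cut short at UAG. With it, the novel residue appears at exactly that position and translation runs on to the next stop codon.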

Site-directed mutagenesis is an application of amber suppression, and by simple extension, a tRNA that recognizes UAG and carries a novel amino acid could be used to site-specifically incorporate novel amino acids. If he could create an amber suppressor tRNA that carried a novel amino acid and an enzyme to load the amino acid on that tRNA, Schultz knew he would have the ability to site-specifically add the novel amino acid to a growing protein chain wherever he inserted the codon UAG.

All he needed was a tRNA unique to that codon and a synthetase "loading" enzyme to load the unnatural amino acid on that tRNA. Then he would have a simple and robust way to put a completely original residue anywhere he wanted in a protein.

99.99 Percent Fidelity

This was not easy. "It's hard to find synthetases that recognize novel amino acids because nature has designed them to recognize only the natural ones," says Research Associate Steve Santoro, a TSRI graduate. And beyond overcoming that obstacle, there is also the requirement that the new tRNA/synthetase pair be functionally orthogonal, meaning the new pair must not interact with any of the existing tRNAs or synthetases.

"The new tRNA should not be aminoacylated by any existing synthetase, and likewise, the new synthetase should not recognize any existing tRNA," says Wang, who has been working on the project from its early beginnings. "There should be no crosstalk."

The orthogonal synthetase had to be engineered so that it charged the orthogonal tRNA with an unnatural amino acid but not with any natural amino acid, and did so with a fidelity as high as normal (99.9 percent or higher), something that proved difficult.

"That was the goal," says Schultz, "though we thought we wouldn't get there in ten years."

It took Schultz and his team approximately five years and many dead ends to solve the problem of generating an orthogonal tRNA/synthetase pair. They then tackled the job of altering the synthetase/amino acid specificity using a number of different in vitro evolution strategies. Recently they hit on a general method that, after a few rounds of molecular evolution, allowed Wang to generate a synthetase that loaded an orthogonal tRNA with 99.99 percent fidelity. "It worked surprisingly well," says Schultz.

"You just need to add the novel amino acid to the culture and grow the cells," says Wang.

Using this method, they incorporated O-methyl-L-tyrosine into proteins with fidelity greater than 99 percent, which is close to the translation fidelity of natural amino acids. "We really wanted high fidelity, and we thought if we could get it with O-methyl-tyrosine or tyrosine or phenylalanine, then one could probably get high fidelity with [almost] any amino acid," says Schultz.

Some wondered whether the feat could be repeated. One of the reviewers of the paper told Schultz that the result was amazing but a dead end. "He said O-methyl worked because it is very similar to tyrosine and we would never be able to do it again," says Santoro.

"The next day," says Schultz, "Lei came in and he had just added [a] naphthyl alanine to the code." Since that day, Schultz's group has added a host of novel amino acids to the E. coli genetic code.

Novel X and Four-Base Codons

There are novel amino acids that contain fluorescent moieties that are smaller than green fluorescent protein. These can be used in place of GFP to label proteins and observe them in vivo. Other novel amino acids useful as molecular probes have side chains that can be phosphorylated or that contain spin labels.

There are novel hydrophobic amino acids, which should be useful for probing structure, and novel nucleophiles, heavy metal-binding amino acids, and photoisomerizable side chains, all of which should confer new activity to the proteins. There are novel glycosylated amino acids that could be used to make therapeutic proteins, and there are novel amino acids with keto groups that can be used to selectively label proteins with practically any molecular group of interest.

There are also groups that contain photoaffinity labels that could be used for covalently cross-linking proteins to one another in a photoinduction proteomics experiment.

"The idea," says Postdoctoral Fellow Jason Chin, "is that you will put photo-crosslinking groups into a specific site in a protein. You could then see what the protein interacts with in living cells. And you will be able to look at weak interactions that are difficult to detect by current methods."

Putting in many different novel amino acids is the next step, and Schultz is working on generalizing the method used for the O-methyl-tyrosine so that he can routinely do this.

Schultz believes the key to easily inserting a new amino acid is changing the specificity of the aminoacyl-tRNA synthetase. Schultz describes the process as "gutting" the active site—putting in a hole that can be filled with a combination of a new amino acid and a new protein side chain. However, not all pegs fit in the same hole, and some synthetases will not be able to take certain novel amino acids. But by making several of these tRNA/synthetase pairs, it should be possible to put in almost any amino acid.

One pair that is underway is the leucine synthetase pair, which Schultz and his graduate student Christopher Anderson are working on now. This pair is interesting because it may be used to expand the technology beyond the amber codon, which, though successful and robust, is limited. "That only lets you use one or potentially two [different] amino acids [per protein]," says Schultz.

"We're developing the leucine synthetase system to recognize a four-base codon," says Anderson.

The difficulty, though, is that most anticodon loops are key recognition elements of the synthetase, and this recognition becomes intolerably perturbed due to the structure of the four-base tRNAs. "You have to be able to change the anticodon loop of the tRNA and still be able to have the synthetase recognize the tRNA," says Anderson. In the leucine synthetase system this is not a problem because the synthetase recognition occurs at another site.

The strategy involves using molecular evolution experiments to select for tRNAs with anticodon loops that recognize four or five bases. The advantage of using the longer codon is diversity—there are 256 four-base codons possible, for instance, and many of these can be re-assigned to a new unnatural amino acid. They are now building new tRNA/synthetase pairs that decode four bases at a time.
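The gain in coding space from longer codons, and the idea of reading four bases at a reassigned site, can be sketched in Python. The `split_codons` helper, the example message, and the choice of AGGA as the reassigned four-base codon are assumptions made only for illustration. The sketch also ignores the competition between three-base and four-base decoding that a real cell would face.

```python
from itertools import product

BASES = "ACGU"

# All possible four-base codons: 4 ** 4 of them.
quadruplets = ["".join(q) for q in product(BASES, repeat=4)]
print(len(quadruplets))  # 256


def split_codons(mrna, four_base):
    """Split mrna into codons, reading four bases wherever a codon from
    four_base starts at the current position and three bases otherwise."""
    codons = []
    i = 0
    while i + 3 <= len(mrna):
        if mrna[i:i + 4] in four_base:
            codons.append(mrna[i:i + 4])
            i += 4
        else:
            codons.append(mrna[i:i + 3])
            i += 3
    return codons


# Hypothetical message in which AGGA has been assigned to a four-base tRNA.
print(split_codons("AUGAGGAUUC", {"AGGA"}))  # ['AUG', 'AGGA', 'UUC']
```

Reading AGGA as one codon keeps the downstream codon UUC in frame. Reading it three bases at a time would shift every codon after it.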

Life with Many Amino Acids

Another interesting question the group is working out is how to transfer the technology to eukaryotic cells. "Right now all this work has been done in E. coli," says Santoro. "It would be much more interesting to be able to express proteins containing unnatural amino acids in mammalian cells."

"The ability to do unnatural cell biology by introducing unnatural amino acids such as fluorophores and photocrosslinkers into proteins in eukaryotic cells will provide a powerful arsenal of tools to dissect and understand how these cells and even whole organisms work," says Chin, who is working on expanding the eukaryotic code. "We will be able to probe protein interactions involved in human disease with unprecedented precision in living cells."

Another follow-up project applies the technology in a random way, adding novel amino acids to the genetic code of cells and allowing the cells to use them. The question is how these changes affect the organism's ability to, say, evolve in response to stress.

To answer this, Schultz and his colleagues plan to do a random unnatural amino acid mutagenesis of the entire E. coli genome and subject the new cells to some sort of selective pressures—like heat or antibiotics—and see what happens. Could a 21-amino-acid bug be more robust than a 20-amino-acid bug?

"If you put that cell under some stress, will the cell be able to evolve faster to deal with that stress?" asks Schultz.

A further application of this line of research that Schultz is contemplating is to randomly mutate proteins with several unnatural amino acids in many places simultaneously. And yet another possibility would be to add the metabolic pathway for the synthesis of the novel amino acid to the cells so that it would not have to be added to the growth medium—something that Wang, Anderson, and Zhang are addressing at the moment.

Polyester Proteins

Schultz and his team are also asking whether we need amino acids in the first place. Can you use some other polymer building block, like hydroxy acids, as a protein constituent and will the resulting protein fold?

Whereas novel amino acids differ in side chain chemistry but have the same amide backbone as natural amino acids, hydroxy acids would have a completely different backbone structure. Instead of a polyamide, they would form polyesters.

One of the first things they will try is simply to get a single hydroxy acid incorporated site-specifically, using the same technology they worked out for the O-methyl-tyrosine system. Then, with this proof of principle out of the way, they will try to use two or more tRNA/synthetase pairs that incorporate hydroxy acids based on hydrophobic and hydrophilic amino acids, in the hope of making a folded polyester protein.


"Historically, chemists have been interested in synthesizing molecules," says Professor Peter Schultz. "We are interested in synthesizing function." Photo by Biomedical Graphics.

The chemical structures of the amino acid tyrosine (top) and the novel amino acid O-methyl-tyrosine (bottom).

"You just need to add the novel amino acid to the culture and grow the cells," says graduate student Lei Wang. Photo by Jason S. Bardi.

"We're developing the leucine synthetase system to recognize a four-base codon," says graduate student Christopher Anderson. Photo by Jason S. Bardi.

"It's hard to find synthetases that recognize novel amino acids because nature has designed them to recognize only the natural ones," says Research Associate Steve Santoro. Photo by Jason S. Bardi.

"We will be able to probe protein interactions involved in human disease with unprecedented precision in living cells," says Research Associate Jason Chin. Photo by Jason S. Bardi.