Science Talk:
After the Genome, Part 2

The Paradigm Remains the Same

Ernest Beutler, Chair, Department of Molecular and Experimental Medicine

Having the sequence of the whole genome is a useful tool [rather than] an entirely new concept. (In other words, I don't agree with the idea that "It's a new ballgame now that we know the genome.") Knowing the sequence allows us to move towards our objectives more rapidly, whether that objective is finding a new drug target or solving a problem with a genetic disease.

That's all it really is—a tool. And it's a tool that didn't suddenly become available when people announced that they had completed the sequence. We've been able to perform genome searches for the last 14 years. It's just that now, the database is more complete, although still not entirely so.

There are a lot of different ways to use this tool to find better drug targets. One is the forward genetic approach: random mutagenesis. If you produce a model with hypertension or diabetes or cancer as an inherited genotype, you perform linkage analysis. There are lots of markers that help you see which pieces of which chromosomes have been inherited from which strain. Then you narrow the field to a single gene by what is known as positional cloning.

Could you positionally clone a gene before the genome was available? Sure, but it was a lot more work. It was a lot harder. Now areas that you would have had to sequence through are already sequenced. Markers that you would have had to find are already known. The availability of the genome allows you to find a mutant gene much faster, and, once you find the gene that causes the phenotype, then you may gain some understanding of why this strain has this particular disease. Once you have that understanding, then maybe you'll be able to treat the disease and its human counterpart.

The paradigm hasn't changed. Once you find the pathogenesis of a disease, then you try to correct that through drugs, diet, and a variety of different ways. Is [the genome] a revolution? I don't think so. But it's a very powerful tool for understanding disease processes, and it will move things along a lot faster.

Making a Mark

Steve Kay, Professor, Department of Cell Biology

Genomics aims to provide a definition of the gene content of an organism. Functional genomics takes the genome sequences and turns them into highly confident statements about how many genes there are and what they do. It's going to provide rich biological descriptions of function on a scale that was unimaginable a few years ago.

What is really exciting is that much of the effort at TSRI's Institute of Childhood and Neglected Diseases (ICND) is about neurobiology: trying to understand the molecular and cellular bases of psychiatric and neurological disorders. It is extremely difficult to define gene-by-gene how each gene may contribute to complex behavior and disease. Examples include the sleep/wake cycle and how it relates to depression; learning and memory; autism and neurodegeneration; cerebellum disorders like epilepsy and migraines; and deafness.

One of the best ways of defining how genes contribute to these complex biological phenomena is to use an in vivo standard—a working physiological system to see how the genes work. At the ICND, we are fortunate to have a partnership with the Genomics Institute of the Novartis Research Foundation (GNF), which will make these models available for researchers. This is critically important for finding new targets for therapeutic intervention.

We can assay complex [phenotypes] and begin to determine which genes are important for biological function and which are important for dysfunction.

Genes want to express themselves and they do it differentially. Every single cell has the same DNA content, but they look different because of differential expression.

Differential expression mapping is a powerful way of assigning function. A lot of people in the ICND are interested in using gene expression analysis as a way of assigning function to genes. What we did recently was to identify a potassium channel that plays a key role in regulating the sleep/wake cycle. This provides a target for therapy.

The investment that the institute has made in facilities, recruitment, and opportunities for interacting with GNF in the past three years has provided the infrastructure we needed to carry out functional genomics. Now we are seeing it pay off in publications and in making a mark.

Proteomics, Mass Spectrometry, and Post-Translational Modification

John Yates, Professor, Department of Cell Biology

The genome sequence tells you a lot about an organism, but it doesn't tell you everything. The next level up is the proteome. What makes proteomics more challenging than genomics is the fact that [the proteome is dynamic]—depending on the given cellular state, environment, and so forth, you get a different proteome expressed. Although our genome is the same in every single cell in the body, the kinds of proteins that are expressed in those cells will be very different, depending on what the cell is supposed to be doing.

Lots of proteins come together to form complexes in order to carry out some physiological function. We need to know what proteins are coming together, when they come together, what's driving them together, what is the stoichiometry of those complexes—how many of each of these components are in there, things like that. We also need information about post-translational modifications, [which] turn out to be extremely interesting. There are lots and lots of modifications, and the most common modification that people correlate with regulation is phosphorylation.

Phosphorylation is very widespread and it's important to know when it's on and when it's off and how that correlates with particular physiological processes. You'll also find that acetylation and methylation may have regulatory roles. There's a lot of interest now in how those things might involve regulatory processes... Even though you may have X amount of a particular protein, how much of it is actually active? That may be a function of the modification state of that particular protein. Correlating those pieces of information will actually be quite interesting.

[In proteomics], you need to be able to measure quickly, robustly, and in a high-throughput fashion that kind of information, and mass spectrometry [enables one to do just that].

We use tandem mass spectrometry, [which] allows you to select peptides from the mixture and get fragmentation information that tells you the sequence. What my lab invented a long time ago is a way to take that mass spectrometry fragmentation pattern and correlate it with sequences in the database. So, as genomes get sequenced, that information goes into the database and it becomes very easy to associate that mass spectrometry data to those sequences in the database, and you can do this in a very high-throughput manner.
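
The idea of correlating a fragmentation pattern with database sequences can be illustrated with a toy sketch. This is not the lab's actual algorithm: it simply counts how many observed peaks fall near the theoretical singly charged b- and y-ion masses of each candidate peptide, and picks the candidate with the most matches. The function names, the tolerance, and the candidate list are all assumptions made for illustration.

```python
# Toy sketch of matching a tandem mass spectrum against candidate peptides.
# Assumptions: singly charged b/y ions only, simple peak counting as the score.

# Monoisotopic residue masses in daltons for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def fragment_mz(peptide):
    """Return theoretical singly charged b- and y-ion m/z values."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    ions = []
    for i in range(1, len(peptide)):
        ions.append(sum(masses[:i]) + PROTON)          # b ion: N-terminal piece
        ions.append(sum(masses[i:]) + WATER + PROTON)  # y ion: C-terminal piece
    return ions


def count_matches(observed, peptide, tol=0.02):
    """Count observed peaks lying within tol of any theoretical fragment."""
    theoretical = fragment_mz(peptide)
    return sum(any(abs(o - t) <= tol for t in theoretical) for o in observed)


def best_match(observed, candidates, tol=0.02):
    """Return the candidate sequence that explains the most observed peaks."""
    return max(candidates, key=lambda p: count_matches(observed, p, tol))


if __name__ == "__main__":
    # Simulate a clean spectrum from a known peptide, then search for it.
    spectrum = fragment_mz("PEPTIDE")
    print(best_match(spectrum, ["TIDEPEP", "SAMPLER", "PEPTIDE"]))
```

Note that "TIDEPEP" has the same amino acid composition, and therefore the same total mass, as "PEPTIDE"; only the fragment masses tell them apart, which is why fragmentation data carries sequence information.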

It also enables new approaches for looking at protein complexes, looking at the components of the cells, looking at where proteins might localize, [and studying post-translational modification]. If the computer program knows to check for something like phosphorylation, it'll go and search the sequence database. Every time it sees a residue that could potentially be phosphorylated, it's going to ask the question, "Is this phosphorylated or not?" And you can do that with almost any kind of modification in this world [with the possible exception of] carbohydrates...
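
As a rough illustration of asking "is this phosphorylated or not?" at each candidate residue, the sketch below lists every serine, threonine, and tyrosine in a peptide and enumerates the modified forms a search program might consider. It is a simplified assumption of how such a search could be organized, not a description of any particular program; the function names and the `max_mods` limit are hypothetical. The mass shift used is the standard monoisotopic value for adding a phosphate group (HPO3).

```python
# Sketch: enumerate possible phosphorylation states of a peptide.
# Assumptions: only S, T, and Y are considered, and at most max_mods
# phosphate groups are placed on any one peptide.
from itertools import combinations

PHOSPHO_SHIFT = 79.96633  # monoisotopic mass added per phosphate, in daltons
PHOSPHO_RESIDUES = "STY"


def candidate_sites(peptide):
    """Return 0-based positions of residues that could be phosphorylated."""
    return [i for i, aa in enumerate(peptide) if aa in PHOSPHO_RESIDUES]


def phospho_variants(peptide, max_mods=2):
    """Return (modified_positions, total_mass_shift) for each variant.

    The unmodified peptide is included as the variant with no positions.
    """
    sites = candidate_sites(peptide)
    variants = []
    for k in range(0, min(max_mods, len(sites)) + 1):
        for combo in combinations(sites, k):
            variants.append((combo, k * PHOSPHO_SHIFT))
    return variants


if __name__ == "__main__":
    for positions, shift in phospho_variants("ASTYK"):
        print(positions, round(shift, 5))
```

Each variant's mass shift would then be added to the relevant theoretical masses before comparing them with the measured spectrum. The number of variants grows quickly with the number of candidate residues, which is one reason to cap the modifications considered per peptide.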

The Study of the Glycome

James Paulson, Professor, Department of Molecular Biology

The Consortium for Functional Glycomics is focused on information carried by carbohydrates that mediate communication between cells. As part of that, we want to understand the regulation of the genes responsible for the synthesis of the carbohydrate structures required for mediating cell communication.

To that end, we have established a core resource that operates within the existing TSRI Gene Microarray Facility. The purpose is to create a custom array of genes that are related to the scope of the consortium. We've just finished assembling a list of about 1,500 human and murine genes that include glycosyltransferases, which are the enzymes that synthesize carbohydrates, and carbohydrate-binding proteins, the proteins that bind the structures that appear on the surfaces of cells. And then there are classes of related genes of interest.

The array will be produced by Affymetrix [a firm that develops technology for acquiring, analyzing, and managing genetic information], which will take the list we developed and create a custom gene chip [an array of nucleotides on a solid surface]. We're doing that for two reasons. We'll have a well-focused set of genes that are well annotated, and it's also cost-effective. Within two months, we should have [the chips] in-house.

Once they are available, investigators will apply to use the chips for experiments within the scope of the consortium. We will either send them the chips (if they have their own array facility), or they can send us the RNA needed to analyze gene expression using the chips, and we can do the work here and send them the data. In either case, the (raw and analyzed) data will be posted on a public web site.

Now that the genome has been completed, there is a general interest in the other layers of complexity that define an organism as complex as a human. In the post-genomic era, interest has shifted towards these other layers, including glycomics, the study of the glycome [all the carbohydrates in the human body]. I don't think we're ready yet to do the same kind of comprehensive analysis with the glycome that has been done with the genome, because to do that, you would need to categorize every single carbohydrate that is produced on every single cell type in the organism, and even every glycoprotein produced in every cell.

We know, based on a lot of work through the years, that most of that information will not be particularly useful. It will be re-documenting over and over again very similar carbohydrate structures. That's why our focus has been functional glycomics, where we're trying to sort out the sugar structures that do make a difference, that do participate in biological interactions. Use of the glyco-gene microarray will be particularly valuable for understanding how glycosylation of a cell changes during differentiation and/or activation.