
Scientific Report 2004


Cell Biology




New Tools and Applications for Cellular Proteomics


J.R. Yates III, G.T. Cantin, D. Cociorva, C. Delahunty, M.Q. Dong, L. Florens, J. Hewel, J.R. Johnson, M.J. MacCoss, I. MacLeod, W.H. McDonald, N. Muster, S. Niessen, C.I. Ruse, R. Sadygov, D.L. Tabb, J. Venable, J. Wohlschlegel, C. Wu, W.H. Zhu

Several advances in high-throughput technologies have created the foundation for the global proteomic analysis of complex mixtures and cellular lysates. The success and popularity of proteomics applications in studies of complex protein interactions in biology have created a demand for more powerful tools to analyze increasingly complex systems. Mass spectrometry is a key technology for realizing these goals, and we are leaders in (1) developing mass spectrometry–based methods that allow comprehensive analysis of very complex mixtures and (2) applying these emerging technologies to studies of existing biological problems.

Multidimensional protein identification technology (MudPIT) is a tool that we use extensively to analyze a wide range of samples. MudPIT directly couples a multidimensional chromatographic separation system to a tandem mass spectrometer, which produces tandem mass (MS/MS) spectra. An automated MudPIT system can collect hundreds of thousands of MS/MS spectra in a 24-hour period, and sophisticated software is required to match the spectra with peptide sequences in a database.
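As a rough illustration of the database-matching step, the sketch below filters candidate peptides by precursor mass, which is commonly the first stage before a spectrum is scored against sequences. It is not the Sequest algorithm; the function names, the 0.5 Da tolerance, and the toy peptide list are assumptions made for illustration only.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide for the free termini
PROTON = 1.00728   # mass of the charge carrier


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def precursor_to_neutral(mz, charge):
    """Convert an observed precursor m/z and charge state to a neutral mass."""
    return (mz - PROTON) * charge


def candidates_by_mass(observed_mass, peptides, tolerance_da=0.5):
    """Return the peptides whose mass lies within the tolerance window.

    `peptides` is a hypothetical, pre-digested list of sequences; the
    default tolerance is an arbitrary illustrative value.
    """
    return [
        p for p in peptides
        if abs(peptide_mass(p) - observed_mass) <= tolerance_da
    ]


if __name__ == "__main__":
    toy_database = ["PEPTIDE", "GG", "AAA"]  # toy data, not real results
    neutral = precursor_to_neutral(400.68725, 2)
    print(candidates_by_mass(neutral, toy_database))
```

Only the peptides that survive this mass filter would go on to the more expensive step of comparing predicted and observed fragment ions.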

Recently, we developed a new algorithm to complement the existing Sequest algorithm for protein identification from the MS/MS data. The new algorithm increases confidence that the peptide assigned to a spectrum is correct by providing an independent measure of the quality of spectrum-peptide matches. In addition, de novo sequencing of the spectra can be used to extract partial peptide sequences so that less stringent searches against a sequence database are required, which aids in identifying posttranslationally modified peptides. As the proteomics field shifts toward methods that can provide a quantitative measure of protein expression, we have also developed software tools for analyzing quantitative MS/MS data to streamline and simplify data analysis.
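The idea of using partial sequences to narrow a database search can be sketched as follows. This is a simplified illustration, not the implementation of any published tool: the function names and the toy peptides are hypothetical, and a real sequence-tagging search would also use the masses flanking the tag so that modified peptides can still be matched.

```python
def normalize(sequence):
    """Leucine and isoleucine have the same mass, so treat them as one letter."""
    return sequence.replace("I", "L")


def filter_by_tag(tag, peptides):
    """Keep peptides that contain the tag read in either direction.

    A tag inferred de novo may come from either fragment ion series, so
    its orientation relative to the peptide is not known in advance.
    """
    forward = normalize(tag)
    backward = forward[::-1]
    return [
        p for p in peptides
        if forward in normalize(p) or backward in normalize(p)
    ]


if __name__ == "__main__":
    toy_database = ["PEPTIDE", "GGAAK", "EDLTPEP"]  # toy data for illustration
    print(filter_by_tag("TID", toy_database))
```

Because the tag already restricts the candidate list, the subsequent mass comparison can be relaxed, which is what allows peptides carrying unexpected modifications to remain in consideration.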

We use MudPIT in various applications. The analysis of membrane proteins is considered by many to be a major limitation of proteomics technologies; we developed methods to optimize this analysis and applied them to studies of the proteome of the Golgi apparatus. We identified 41 previously uncharacterized proteins and characterized posttranslational modifications that had not been detected before.

Continuing with the large-scale proteomic analysis of Plasmodium falciparum, the parasite that causes malaria, we selected candidate antigens for vaccine development from proteomic data sets to observe the efficacy of the antigens in generating an immune response. We found that a much wider diversity of antigens than predicted can elicit a response. These results suggest that our understanding of antigenic immunodominance in the host response to complex pathogens is incomplete.

Finally, we used a subtractive proteomics strategy to identify integral membrane proteins of the nuclear envelope. All known components of the nuclear envelope were identified, and 67 uncharacterized open reading frames were detected. A total of 23 of the proteins identified mapped to chromosome regions that are linked to a variety of dystrophies. Approximately 300 dystrophies remain to be linked to a responsible gene, and the proportion of genes localized to disease loci was greater than what would be expected to occur randomly, suggesting that many of the 67 identified proteins are good candidates for disease links.
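The comparison with chance can be framed as an enrichment test. The sketch below assumes a hypergeometric model, in which the identified genes are treated as a random draw from the genome; this is one common way to pose the question and is not necessarily the test used in the original analysis. The function name, parameter names, and the numbers in the example are hypothetical.

```python
from math import comb


def hypergeom_tail(total_genes, genes_in_loci, sample_size, hits):
    """Probability of at least `hits` genes falling in disease loci by chance.

    Models drawing `sample_size` genes without replacement from a genome of
    `total_genes`, of which `genes_in_loci` lie in disease-linked regions.
    """
    denominator = comb(total_genes, sample_size)
    upper = min(genes_in_loci, sample_size)
    numerator = sum(
        comb(genes_in_loci, k) * comb(total_genes - genes_in_loci, sample_size - k)
        for k in range(hits, upper + 1)
    )
    return numerator / denominator


if __name__ == "__main__":
    # Toy numbers only: 10 genes, 5 of them in loci, 2 sampled, both are hits.
    print(hypergeom_tail(10, 5, 2, 2))
```

A small tail probability would indicate that the observed overlap is unlikely under random placement. Applying the function to real data requires the genome-wide gene counts, which are not given here.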

Publications

Doolan, D.L., Southwood, S., Freilich, D.A., Sidney, J., Graber, N.L., Shatney, L., Bebris, L., Florens, L., Dobano, C., Witney, A.A., Appella, E., Hoffman, S.L., Yates, J.R. III, Carucci, D.J., Sette, A. Identification of Plasmodium falciparum antigens by antigenic analysis of genomic and proteomic data. Proc. Natl. Acad. Sci. U. S. A. 100:9952, 2003.

MacCoss, M.J., Wu, C.C., Liu, H., Sadygov, R., Yates, J.R. III. A correlation algorithm for the automated quantitative analysis of shotgun proteomics data. Anal. Chem. 75:6912, 2003.

Sadygov, R.G., Liu, H., Yates, J.R. III. Statistical models for protein validation using tandem mass spectral data and protein amino acid sequence databases. Anal. Chem. 76:1664, 2004.

Schirmer, E.C., Florens, L., Guan, T., Yates, J.R. III, Gerace, L. Nuclear membrane proteins with potential disease links found by subtractive proteomics. Science 301:1380, 2003.

Tabb, D.L., Saraf, A., Yates, J.R. III. GutenTag: high-throughput sequence tagging via an empirically derived fragmentation model. Anal. Chem. 75:6415, 2003.

Wu, C.C., MacCoss, M.J., Howell, K.E., Yates, J.R. III. A method for the comprehensive proteomic analysis of membrane proteins. Nat. Biotechnol. 21:532, 2003.

Wu, C.C., MacCoss, M.J., Mardones, G., Finnigan, C., Mogelsvang, S., Yates, J.R. III, Howell, K.E. Organellar proteomics reveals Golgi arginine dimethylation. Mol. Biol. Cell 15:2907, 2004.

 


John Yates III, Ph.D.
Professor
