Scientific Report 2008
Nuclear Magnetic Resonance of 3-Dimensional
Structure and Dynamics of Proteins in Solution
P.E. Wright, H.J. Dyson, M. Arai, R.
Burge, P. Deka, J. Ferreon, T.-H. Huang, B.B. Koehntop, M. Kostic, B. Lee, C.W.
Lee, M. Landes, M. Martinez-Yamout, T. Nishikawa, K. Sugase, J. Wojciak, M.
Zeeb, E. Manlapaz, L.L. Tennant, D.A. Case, J. Gottesfeld
We use multidimensional
nuclear magnetic resonance (NMR) spectroscopy to investigate the structures, dynamics,
and interactions of proteins in solution. Such studies are essential for understanding
the mechanisms of action of these proteins and for elucidating structure-function
relationships. The focus of our current research is protein-protein and protein–nucleic
acid interactions involved in the regulation of gene expression.
Transcription Factor–Nucleic Acid Complexes
NMR methods are being used to determine
the 3-dimensional structures and intramolecular dynamics of zinc finger motifs
from several eukaryotic transcriptional regulatory proteins, both free and complexed
with target nucleic acid. Zinc fingers are among the most abundant domains in eukaryotic
genomes. They play a central role in the regulation of gene expression at both the
transcriptional and the posttranscriptional level, mediated through their interactions
with DNA, RNA, or protein components of the transcriptional machinery. The C2H2
zinc finger, first identified in transcription factor IIIA (TFIIIA), is used by
numerous transcription factors to achieve sequence-specific recognition of DNA.
Growing evidence, however, indicates that some C2H2 zinc finger
proteins control gene expression both through their interactions with DNA regulatory
elements and, at the posttranscriptional level, through binding to RNA.
The best-characterized example of a C2H2
zinc finger protein that binds specifically to both DNA and RNA is TFIIIA, which
contains 9 zinc fingers. We showed previously that different subsets of zinc fingers
are responsible for high-affinity binding of TFIIIA to DNA (fingers 1–3) and
to 5S RNA (fingers 4–6). To obtain insights into the mechanism by which the
TFIIIA zinc fingers recognize both DNA and RNA, we have used NMR methods to determine
the structures of the complex formed by zf1-3 (a protein consisting of fingers 1–3)
with DNA and by zf4-6 (a protein consisting of fingers 4–6) with a fragment
of 5S RNA.
Three-dimensional structures were determined
previously for the complex of zf1-3 with the cognate 15-bp oligonucleotide duplex.
The structures contain several novel features and reveal that prevailing models
of DNA recognition, which assume that zinc fingers are independent modules that
contact bases through a limited set of amino acids, are outmoded.
In addition to its role in binding to
and regulating the 5S RNA gene, TFIIIA also forms a complex with the 5S RNA transcript.
NMR structures of the complex formed by zinc fingers 4–6 with a truncated form
of 5S RNA have been completed and give important insights into the structural basis
for 5S RNA recognition. Finger 4 of the protein recognizes both the structure of
the RNA backbone and the specific bases in the loop E motif of the RNA, in a classic
lock-and-key interaction. Fingers 5 and 6, with a single residue between them, undergo
mutual induced-fit folding with the loop A region of the RNA, which is highly flexible
in the absence of the protein.
NMR studies of 2 alternate splice variants
of the Wilms tumor zinc finger protein (WT1) are in progress. These proteins differ
only through insertion of 3 additional amino acids (the tripeptide lysine-threonine-serine)
in the linker between fingers 3 and 4, yet have marked differences in their DNA-binding
properties and subcellular localization. 15N relaxation measurements
indicate that the insertion increases the flexibility of the linker between fingers
3 and 4 and abrogates binding of the fourth zinc finger to its cognate site in the
DNA major groove, thereby modulating DNA-binding activity. X-ray and NMR structures
of the complexes of the WT1 zinc fingers with 14- and 17-bp DNA oligonucleotides
have been determined. Zinc fingers 2–4 are inserted deeply into the DNA major
groove, making sequence-specific contacts with bases. The structure provides insights
into the mechanism by which disease-causing mutations in the zinc finger domain
interfere with DNA binding. In contrast to fingers 2–4, zinc finger 1 has mostly
nonspecific interactions with the DNA. High-affinity DNA binding is mediated by
fingers 2–4; incorporation of additional amino acids in the linker by alternate
splicing disrupts the finger 4 interactions and abrogates DNA binding.
NMR structural studies of a complex of
the 4 WT1 zinc fingers with an RNA aptamer are nearing completion. In contrast to
DNA binding, the RNA interaction is dominated by zinc fingers 1–3, which bind
in the widened major groove formed in the vicinity of a bulged base. The interactions
of zinc finger 4 with the RNA loop make only a secondary contribution to binding
affinity. We have also determined the structure of a novel double-stranded RNA-binding
zinc finger protein and have commenced experiments to define the mechanism of binding
to adenovirus VA1 RNA.
We recently determined the structure
of a novel zinc finger protein named Churchill that is involved in regulation of
neural induction during embryogenesis. At the time of its discovery, it was suggested
that the protein contained 2 zinc fingers of the C4 type and functioned
as a DNA-binding transcription factor. Our NMR structure shows that far from containing
canonical C4 zinc fingers, Churchill contains 3 bound zinc ions in novel
coordination sites, including an unusual binuclear zinc cluster, which jointly stabilize
a single-layer β-sheet
(Fig. 1). We showed further that Churchill does not bind DNA and suggest that it
may function in embryogenesis by mediating protein-protein interactions.
Fig. 1. Structure of Churchill.
Protein-Protein Interactions in Transcriptional Regulation
Transcriptional regulation in eukaryotes
relies on protein-protein interactions between DNA-bound factors and coactivators
that, in turn, interact with the basal transcription machinery. The transcriptional
coactivator CREB-binding protein (CBP) and its homolog p300 play an essential role
in cell growth, differentiation, and development. Understanding the molecular mechanisms
by which CBP and p300 recognize their various target proteins is of fundamental
biomedical importance. CBP and p300 have been implicated in diseases such as leukemia,
cancer, and mental retardation and are novel targets for therapeutic intervention.
We previously determined the structure
of the kinase-inducible activation domain of the transcription factor CREB bound
to its target domain (the KIX domain) in CBP. Ongoing work is directed toward mapping
the interactions between KIX and the transcriptional activation domains of the proto-oncogene
c-Myb and of the mixed-lineage leukemia protein. The solution structure of the ternary
complex between KIX, c-Myb, and the mixed-lineage leukemia protein has been completed
and provides insights into the structural basis for the ability of the KIX domain
to interact simultaneously and allosterically with 2 different effectors. Our work
has also provided new understanding of the thermodynamics of the coupled folding
and binding processes involved in interaction of KIX with transcriptional activation
domains. We used R2 relaxation dispersion experiments to elucidate the
mechanism by which folding of the kinase-inducible activation domain of CREB is
coupled to binding to its KIX target domain. These experiments revealed formation
of an ensemble of transient and largely unfolded encounter complexes at multiple
sites on the surface of KIX. The encounter complexes are stabilized primarily by
nonspecific hydrophobic contacts and evolve via an intermediate to the fully bound
state without dissociation from KIX. The C-terminal helix of the kinase-inducible
domain is only partially folded in the intermediate and becomes stabilized by intermolecular
interactions formed in the final bound state. Future applications of our method
will provide new understanding of the molecular mechanism by which intrinsically
disordered proteins perform their diverse biological functions.
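As a rough illustration of what a relaxation dispersion profile looks like, the following sketch evaluates the widely used fast-exchange two-site expression (often called the Luz-Meiboom equation) for the effective transverse relaxation rate as a function of CPMG pulsing frequency. It is a deliberately simplified stand-in: the two-site, fast-exchange form and every parameter value (r2_0, p_b, delta_omega, k_ex) are assumptions chosen for illustration, not the multistate model or the fitted values from the KIX study described above.

```python
import math


def phi_ex(p_b, delta_omega):
    """Exchange amplitude p_a * p_b * delta_omega**2 in (rad/s)^2.

    p_b         : fractional population of the minor state (0 to 1)
    delta_omega : chemical shift difference between the states, rad/s
    """
    return (1.0 - p_b) * p_b * delta_omega ** 2


def r2_eff_fast_exchange(nu_cpmg, r2_0, phi, k_ex):
    """Effective R2 (s^-1) for two-site exchange in the fast-exchange limit.

    nu_cpmg : CPMG frequency in Hz, 1 / (2 * spacing between 180-degree pulses)
    r2_0    : transverse relaxation rate in the absence of exchange, s^-1
    phi     : exchange amplitude from phi_ex(), (rad/s)^2
    k_ex    : sum of forward and reverse exchange rate constants, s^-1
    """
    x = k_ex / (4.0 * nu_cpmg)
    return r2_0 + (phi / k_ex) * (1.0 - math.tanh(x) / x)


if __name__ == "__main__":
    # Hypothetical parameters, for illustration only.
    amplitude = phi_ex(p_b=0.05, delta_omega=600.0)
    for nu in (50.0, 100.0, 250.0, 500.0, 1000.0):
        rate = r2_eff_fast_exchange(nu, r2_0=10.0, phi=amplitude, k_ex=2000.0)
        print(f"nu_cpmg = {nu:7.1f} Hz   R2_eff = {rate:6.2f} s^-1")
```

In this simplified picture the exchange contribution is suppressed as the pulsing frequency increases, so the curve falls from r2_0 + phi / k_ex toward r2_0. Fitting such curves at several sites and field strengths is what, in general, allows populations and rate constants of transient states to be estimated.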
Recently, we determined the structure of the
complex between the hypoxia-inducible factor Hif-1α
and the TAZ1 domain of CBP. The interaction between Hif-1α
and CBP/p300 is of major therapeutic interest because of the central role Hif-1α
plays in tumor progression and metastasis; disruption of this interaction leads
to attenuation of tumor growth. A protein named CITED2 functions as a negative feedback
regulator of the hypoxic response by competing with Hif-1α
for binding to the TAZ1 domain of CBP. By determining the structure of the complex,
we showed that the intrinsically unstructured Hif-1α
and CITED2 domains use partly overlapping surfaces of the TAZ1 motif to achieve
high-affinity binding and compete effectively with each other for CBP/p300.
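The competition between two ligands for one binding domain can be sketched with the standard equilibrium expression for mutually exclusive binding to a single site. The code below is a minimal illustration under stated assumptions: strictly exclusive binding, known free ligand concentrations, and hypothetical dissociation constants. None of the numbers are measured values for Hif-1α, CITED2, or TAZ1.

```python
def competitive_occupancy(free_a, kd_a, free_b, kd_b):
    """Fractions of a single binding site occupied by ligands A and B.

    Assumes A and B bind in a mutually exclusive manner and that the
    system is at equilibrium. Concentrations and dissociation constants
    must share the same units (for example, nM).

    Returns a tuple (fraction_bound_to_a, fraction_bound_to_b).
    """
    if kd_a <= 0 or kd_b <= 0:
        raise ValueError("dissociation constants must be positive")
    if free_a < 0 or free_b < 0:
        raise ValueError("concentrations must be non-negative")
    a = free_a / kd_a
    b = free_b / kd_b
    denominator = 1.0 + a + b
    return a / denominator, b / denominator


if __name__ == "__main__":
    # Hypothetical values: ligand A at its Kd, ligand B titrated in.
    for conc_b in (0.0, 10.0, 100.0):
        frac_a, frac_b = competitive_occupancy(10.0, 10.0, conc_b, 10.0)
        print(f"[B] = {conc_b:6.1f}   bound to A = {frac_a:.3f}   bound to B = {frac_b:.3f}")
```

In this simple model, raising the concentration of one ligand, or lowering its dissociation constant, displaces the other from the shared site, which is the general behavior expected of a competitive negative feedback regulator.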
To further elucidate the molecular and
structural basis for CBP-dependent coordinated gene expression, we have determined
the solution structures of the complexes formed by the transactivation domains of
the transcription factors STAT2 and STAT1 with CBP TAZ1 and TAZ2 domains, respectively.
Despite the overall topological similarity of the CBP TAZ domains, the structures
reveal 2 very different modes of complex formation. Our findings suggest that TAZ1
may bind activation domains capable of contacting multiple surface grooves simultaneously
in preference to smaller activation motifs that are restricted to a single, contiguous
binding surface. The latter mode of binding is sufficient for stable complex formation
with TAZ2. Binding of both STAT activation domains involves coupled folding and binding.
We are continuing to map the multiplicity
of interactions between CBP/p300 domains and their numerous biological targets.
Our goal is to understand the complex interplay of interactions that mediate key
biological processes in health and disease.
Publications
Ebert, M.-O., Bae, S.-H., Dyson, H.J., Wright, P.E. NMR relaxation study of the complex
formed between CBP and the activation domain of the nuclear hormone receptor coactivator
ACTR. Biochemistry 47:1299, 2008.
Lee, B.M., Buck-Koehntop, B.A., Martinez-Yamout, M.A., Dyson, H.J., Wright, P.E. Embryonic
neural inducing factor Churchill is not a DNA-binding zinc finger protein: solution
structure reveals a solvent-exposed β-sheet and zinc binuclear cluster. J. Mol. Biol.
371:1274, 2007.
Stoll, R., Lee, B.M., Debler, E.W., Laity, J.H., Wilson, I.A., Dyson, H.J., Wright, P.E.
Structure of the Wilms tumor suppressor protein zinc finger domain bound to DNA.
J. Mol. Biol. 372:1227, 2007.
Sugase, K., Landes, M.A., Wright, P.E., Martinez-Yamout, M.A. Overexpression of
post-translationally modified peptides in Escherichia coli by co-expression with
modifying enzymes. Protein Expr. Purif. 57:108, 2008.
Sugase, K., Lansing, J.C., Dyson, H.J., Wright, P.E. Tailoring relaxation dispersion
experiments for fast-associating protein complexes. J. Am. Chem. Soc. 129:13406, 2007.
Folding of Proteins and Protein
P.E. Wright, H.J. Dyson, D. Meinhold,
C. Nishimura, D. Felitsky, M. Kostic, S.J. Park, J. Chung, L.L. Tennant, V.
Bychkova,* T. Yamagaki
* Institute of Protein Research, Puschino, Russia
The mechanism by which proteins fold into their 3-dimensional structures remains one
of the most important unsolved problems in structural biology. Nuclear magnetic
resonance (NMR) spectroscopy is uniquely suited to provide information on the structure
of transient intermediates formed during protein folding. Previously, we used NMR
methods to show that many peptide fragments of proteins have a tendency to adopt
folded conformations in water solution. The presence of transiently populated folded
structures, including reverse turns, helices, nascent helices, and hydrophobic clusters,
in water solutions of short peptides has important implications for initiation of
protein folding. Formation of elements of secondary structure probably plays an
important role in the initiation of protein folding by reducing the number of conformations
that must be explored by the polypeptide chain and by directing subsequent folding.
Apomyoglobin Folding Pathway
A major program in our laboratory is
directed toward a structural and mechanistic description of the apomyoglobin folding
pathway. Previously, we used quenched-flow pulse-labeling methods in conjunction
with 2-dimensional NMR spectroscopy to map the kinetic folding pathway of the wild-type
protein. With these methods, we showed that an intermediate in which the A, G, and
H helices and part of the B helix adopt hydrogen-bonded secondary structure is formed
within 6 milliseconds of the initiation of refolding. Folding then proceeds by stabilization
of additional structure in the B helix and in the C and E helices. We are using
carefully selected myoglobin mutants and both optical stopped-flow spectroscopy
and NMR methods to further probe the kinetic folding pathway. For some of the mutants
studied, the changes in amino acid sequence resulted in changes in the folding pathway
of the protein.
These experiments are providing novel
insights into both the local and long-range interactions that stabilize the kinetic
folding intermediate. Of particular importance, long-range interactions have been
observed that indicate nativelike packing of some of the helices in the kinetic molten
globule intermediate.
However, folding is impeded by local nonnative helix packing; the H helix is translocated
relative to the G helix by a single helical turn, and folding cannot proceed until
this defect is repaired.
Apomyoglobin provides a unique opportunity
for detailed characterization of the structure and dynamics of a protein-folding
intermediate. Conditions were previously identified under which the apomyoglobin
molten globule intermediate is sufficiently stable for acquisition of multidimensional
heteronuclear NMR spectra. Analysis of 13C and other chemical shifts
and measurements of polypeptide dynamics provided unprecedented insights into the
structure of this state.
The A, G, and H helices and part of the
B helix are folded and form the core of the molten globule. This core is stabilized
by relatively nonspecific hydrophobic interactions that restrict the motions of
the polypeptide chain. Fluctuating helical structure is formed in regions outside
the core, although the amount of helix is low and the chain retains considerable
flexibility. The F helix acts as a gate for heme binding and only adopts stable
structure in the fully folded holoprotein.
The acid-denatured (unfolded) state of
apomyoglobin is an excellent model for the fluctuating local interactions that lead
to the transient formation of unstable elements of secondary structure and local
hydrophobic clusters during the earliest stages of folding. NMR data indicated substantial
formation of helical secondary structure in the acid-denatured state in regions
that form the A and H helices in the folded protein and also revealed nonnative
structure in the D and E helix regions.
Because the A and H regions adopt stabilized
helical structure in the earliest detectable folding intermediate, these results
lend strong support to folding models in which spontaneous formation of local elements
of secondary structure plays a role in initiating formation of the A-[B]-G-H molten
globule folding intermediate. In addition to formation of transient helical structure,
formation of local hydrophobic clusters has been detected by using 15N
relaxation measurements. Significantly, these clusters are formed in regions where
the average surface area buried upon folding is large. In contrast to acid-denatured
unfolded apomyoglobin, the urea-denatured state is largely devoid of structure,
although residual hydrophobic interactions have been detected by using relaxation measurements.
We have measured residual dipolar couplings
for unfolded states of apomyoglobin by using partial alignment in strained polyacrylamide
gels. These data provide novel insights into the structure and dynamics of the unfolded
polypeptide chain. We have shown that the residual dipolar couplings arise from
the well-known statistical properties of flexible polypeptide chains. Residual dipolar
couplings provide valuable insights into the dynamic and conformational propensities
of unfolded and partly folded states of proteins and hold great promise for charting
the upper reaches of protein-folding landscapes.
To probe long-range interactions in unfolded
and partially folded states of apomyoglobin, we introduced spin-label probes at
several sites throughout the polypeptide chain. These experiments led to the surprising
discovery that transient structures with nativelike long-range contacts between
hydrophobic clusters exist within the ensemble of conformations formed by the acid-denatured
state of apomyoglobin. They also indicated that the packing of helices in the molten
globule state is similar to that in the native folded protein. The relative amounts
of the transiently collapsed states formed in the apomyoglobin polypeptide chain
are determined by the entropic cost of loop closure. The specificity of the long-range
contacts in the most structured of these states suggests that the contacts play
a key role in directing chain collapse and initiating folding.
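The idea that loop closure carries an entropic cost that grows with the length of the intervening chain can be illustrated with the textbook ideal-chain result, in which the probability that two residues separated by n residues come into contact scales as n to the power -3/2. The sketch below assumes this ideal (Gaussian) chain scaling; the exponent, the reference length, and the example separations are illustrative assumptions, not the model or parameters used in the apomyoglobin analysis, and real unfolded chains deviate from ideal behavior because of excluded volume and residual structure.

```python
import math

GAS_CONSTANT = 8.314462618  # J mol^-1 K^-1


def loop_closure_entropy(n_residues, n_reference=1):
    """Entropy change (J mol^-1 K^-1) for closing a loop of n_residues,
    relative to a loop of n_reference residues, for an ideal chain.

    Uses delta_S = -(3/2) * R * ln(n_residues / n_reference).
    """
    if n_residues <= 0 or n_reference <= 0:
        raise ValueError("loop lengths must be positive")
    return -1.5 * GAS_CONSTANT * math.log(n_residues / n_reference)


def relative_contact_probability(n_short, n_long):
    """Ratio of contact probabilities p(n_short) / p(n_long) for an ideal
    chain, where contact probability scales as n ** -1.5.
    """
    if n_short <= 0 or n_long <= 0:
        raise ValueError("loop lengths must be positive")
    return (n_long / n_short) ** 1.5


if __name__ == "__main__":
    # Hypothetical sequence separations between two hydrophobic clusters.
    for separation in (10, 40, 100):
        ds = loop_closure_entropy(separation, n_reference=10)
        print(f"separation = {separation:3d}   delta_S vs 10 residues = {ds:7.2f} J/mol/K")
    print("p(10) / p(40) =", relative_contact_probability(10, 40))
```

Under this assumption, contacts between clusters that are close in sequence are favored over contacts between distant clusters purely on entropic grounds, so any specific long-range contact must be stabilized by favorable interactions to be populated.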
The view of protein folding that results
from our work on apomyoglobin is one in which collapse of the polypeptide chain
to form increasingly compact states leads to progressive accumulation of secondary
structure and increasing restriction of fluctuations in the polypeptide backbone.
Chain flexibility is greatest at the earliest stages of folding, when transient
elements of secondary structure and local hydrophobic clusters are formed. As the
folding protein becomes increasingly compact, backbone motions become more restricted,
the hydrophobic core is formed and extended, and nascent elements of secondary structure
are progressively stabilized. The ordered tertiary structure characteristic of the
native protein, with well-packed side chains and relatively low-amplitude local
dynamics, appears to form rather late in folding.
We recently introduced a variation on
the classic quench-flow technique, which makes use of the capabilities of modern
NMR spectrometers and heteronuclear NMR experiments, to study the proteins labeled
along the folding pathway in an unfolded state in an aprotic organic solvent. This
method allows detection of many more amide proton probes than in the classic
method, which requires formation of the fully
folded protein and the measurement of the protein's NMR spectrum in water solution.
This method is particularly useful in documenting changes in the folding pathway
that result in the destabilization of parts of the protein in the molten globule
intermediate. We recently designed self-compensating mutations that change
the amino acid sequence such that the average area buried upon folding is significantly
changed while the 3-dimensional structure of the final folded state remains the
same. These studies showed that the average area buried upon folding is an accurate
predictor of those parts of the apomyoglobin molecule that will fold first and participate
in the molten globule intermediate. Quench-flow hydrogen exchange experiments performed
on a series of hydrophobic core mutants indicated that the overall helix-packing
topology of the kinetic folding intermediate is like that of the native protein,
despite local nonnative interactions in packing of the G and H helices (Fig. 1).
Finally, using a rapid mixing device, we have reduced the dead time of the kinetic
refolding experiments and have shown that a compact helical intermediate is formed
within 400 microseconds after initiation of apomyoglobin refolding. The new measurements
reveal that folding occurs by a hierarchical process: the A, G, and H helices fold
rapidly to form a compact core, and the other helices fold more slowly by docking
onto the preformed core.
Fig. 1. Schematic representation of the amide proton occupancies in the kinetic intermediate
state formed in the burst phase of apomyoglobin folding (solid ribbons), mapped
onto the structure of fully folded myoglobin. Areas of the protein that are not
folded until later stages are shown as dotted lines.
Folding-Unfolding Transitions In Cellular Metabolism
Many species of bacteria sense and respond
to their own population density by an intricate autoregulatory mechanism known as
quorum sensing; bacteria release extracellular signal molecules, called autoinducers,
for cell-cell communication within and between bacterial species. A number of bacteria
appear to use quorum sensing for regulation of gene expression in response to fluctuations
in cell population density. Processes regulated in this way include symbiosis, virulence,
competence, conjugation, production of antibiotics, motility, sporulation, and formation of biofilms.
We determined the 3-dimensional solution
structure of a complex composed of the N-terminal 171 residues of the quorum-sensing
protein SdiA of Escherichia coli and an autoinducer molecule, N-octanoyl-L-homoserine
lactone (HSL). The SdiA-HSL system shows the "folding switch" behavior
associated with quorum-sensing factors produced by other bacterial species. In the
presence of HSL, SdiA is stable and folded and can be produced in good yields from
an E. coli expression system. In the absence of the autoinducer, SdiA
is expressed into inclusion bodies. Samples of the SdiA-HSL complex can be denatured
but cannot be refolded in aqueous buffers. The solution structure of the complex
provides a likely explanation for this behavior. The autoinducer molecule is tightly
bound in a deep pocket in the hydrophobic core and is held by specific hydrogen
bonds to the side chains of conserved residues. The autoinducer thus forms an integral
part of the hydrophobic core of the folded SdiA.
Chaperone–Cochaperone–Client Protein Interactions
Understanding the role of unfolded states
in cellular processes will require an understanding of the structural basis of their
interactions, but unfolded proteins are impossible to characterize structurally
by x-ray crystallography, and spectroscopic methods of all kinds are limited. Unfolded
proteins must be explored under conditions that approximate the proteins' physiologic
milieu: in solution, at physiologic pHs and salt concentrations, and in the presence
of specific cofactors. Structural insights will be obtained not only from the delineation
of 3-dimensional structures but also from the description of conformational ensembles
and of the motions of polypeptide chains under various conditions.
To gain new insights into the structural
basis for the ability of unfolded and partly folded proteins to function in living
systems, we study the interactions of "client" proteins and cochaperones
with a well-known eukaryotic chaperone, Hsp90. Some of the protein components are
much larger than have traditionally been studied
by using solution NMR. However, we have designed a set of experiments that will
allow us to draw valid conclusions about the extent and role of disorder in Hsp90
interactions. In particular, we will apply techniques recently developed in our
laboratory for analyzing hydrogen-deuterium exchange from unstable partially folded
proteins by trapping the 2H-labeled species in the aprotic solvent dimethyl
sulfoxide. This powerful new technique will be used to probe the structure, stability,
and interactions of client proteins and cochaperones with Hsp90.
Publications
Felitsky, D.J., Lietzow, M.A., Dyson, H.J., Wright, P.E. Modeling transient collapsed
states of an unfolded protein to provide insights into early folding events. Proc.
Natl. Acad. Sci. U. S. A. 105:6278, 2008.
Nishimura, C., Dyson, H.J., Wright, P.E. The kinetic and equilibrium molten globule
intermediates of apoleghemoglobin differ in structure. J. Mol. Biol.
Schwarzinger, S., Mohana-Borges, R., Kroon, G.J.A., Dyson, H.J., Wright, P.E. Structural
characterization of partially folded intermediates of apomyoglobin H64F. Protein
Sci. 17:313, 2008.