, 159–166, January 2006
© 2006 Elsevier Ltd. All rights reserved. DOI 10.1016/j.str.2005.10.005
Structure of the Forkhead Domain
of FOXP2 Bound to DNA
Department of Chemistry and Biochemistry
University of Colorado at Boulder
Boulder, Colorado 80309
Max Planck Institute for Evolutionary Anthropology
D-04103 Leipzig
Australian Synchrotron Research Program/
Consortium for Advanced Radiation Sources
Argonne National Laboratory
Building 434-B
9700 South Cass Avenue
Argonne, Illinois 60439
), and thyroid agenesis with cleft palate and choa-
nal atresia (FOXE1) (
Clifton-Bligh et al., 1998
). Often,
these mutations are within the well-conserved forkhead
domain (
Carlsson and Mahlapuu, 2002
), demonstrating
the importance of DNA recognition and binding to the
function of FOX proteins.
FOXP (FOXP1–4) is a newly defined subfamily of the
FOX transcription factors that contains several recogniz-
able sequence motifs, including a glutamine-rich region,
a zinc finger, a leucine zipper, and a highly divergent
forkhead domain (
Lai et al., 2001; Li and Tucker, 1993;
Shu et al., 2001
). As seen in several other FOX proteins
linked to human developmental disorders (
et al., 2003
), the majority of disease-causing mutations
in FOXP2 and FOXP3 occur in the forkhead domain. For
instance, an arginine-to-histidine missense mutation
(R553H) in the FOXP2 forkhead domain has been linked
to a severe speech and language disorder (
Lai et al.,
). A deletion of the forkhead domain arising from a
frame-shift mutation in the FOXP3 protein in mouse is
linked to the autoimmune disorder scurfy (
et al., 2001; Schubert et al., 2001
). A similar congenital
disease in humans is known as IPEX (immune dysregulation,
polyendocrinopathy, enteropathy, X-linked syndrome) (
Bennett et al., 2001; Wildin et al., 2001
). Afflicted
individuals display a variety of symptoms that include
anemia, insulin-dependent diabetes, chronic diarrhea,
and dermatitis (
Levy-Lahad and Wildin, 2001
). The sim-
ilarities between human IPEX and mouse scurfy pheno-
types are reflected by the fact that several mutations in
the forkhead domain of the human FOXP3 gene have
been linked to IPEX (
Bennett et al., 2001; Wildin et al.,
To address the mechanisms by which these disease-
related mutations disturb FOXP function, we have deter-
mined the structure of the human FOXP2 forkhead
domain bound to DNA containing a FOXP binding site
(Schubert et al., 2001; Wang et al., 2003) (Table 1). Our
results show that disease-causing mutations in the
FOXP family map to the DNA binding interface and to
a dimer interface formed by domain swapping. Domain
swapping can be disrupted by replacing an alanine con-
served in the FOXP family with a proline that is highly
conserved in other FOX families. These results suggest
that domain swapping is a unique structural feature of
the FOXP forkhead domain and thus may be functionally
relevant. Additionally, the high resolution of the data
allows us to reinterpret earlier, lower resolution studies
and propose a general model of DNA recognition by
forkhead-containing proteins.
FOXP (FOXP1–4) is a newly defined subfamily of the
forkhead box (FOX) transcription factors. A mutation
in the FOXP2 forkhead domain cosegregates with a se-
vere speech disorder, whereas several mutations in
the FOXP3 forkhead domain are linked to the IPEX syn-
drome in human and a similar autoimmune phenotype
in mice. Here we report a 1.9 Å crystal structure of the
forkhead domain of human FOXP2 bound to DNA. This
structure allows us to revise the previously proposed
DNA recognition mechanism and provide a unifying
model of DNA binding for the FOX family of proteins.
Our studies also reveal that the FOXP2 forkhead do-
main can form a domain-swapped dimer, made possi-
ble by a strategic substitution of a highly conserved
proline in conventional FOX proteins with alanine in
the P subfamily. Disease-causing mutations in FOXP2
and FOXP3 map either to the DNA binding surface
or the domain-swapping dimer interface, functionally
corroborating the crystal structure.
Forkhead box (FOX)-containing transcription factors are
unified by sequence similarity within an approximately
90 amino acid winged-helix DNA binding domain from
which the FOX family derives its name (
Mazet et al.,
). The diverse roles of FOX (human protein names
are used throughout) family members in development
are underscored by the fact that mutations in several
members of the family are linked to congenital defects,
including familial glaucoma and Axenfeld-Rieger anom-
alies (FOXC1) (
Lehmann et al., 2000; Mears et al., 1998;
Mirzayans et al., 2000; Nishimura et al., 1998, 2001
), lym-
phedema (FOXC2) (
Fang et al., 2000; Finegold et al.,
), T cell immunodeficiency (FOXN1) (
Frank et al.,
Results and Discussion
Overall Structure
The asymmetric unit (ASU) contains six copies of the
FOXP2 forkhead domain and two double-stranded seg-
ments of DNA (
Figure 1
A). Although all six copies of
FOXP2 are identical in sequence, two copies exist in a
monomeric form (
Figure 1
A, labeled 1 and 2) and the four
other copies exhibit domain swapping (described in
These authors contributed equally to this work.
James C. Stroud,
Yongqing Wu,
Darren L. Bates,
Aidong Han,
Katja Nowick,
Svante Pääbo,
Harry Tong,
and Lin Chen
Table 1. Statistics of Crystallographic Analysis
extensively with both helix H3 and the sugar-phosphate
backbone, thereby wedging helix H3 deep into the major
groove and stabilizing the protein-DNA complex (Figure 2A).
In the periphery of the FOXP2/DNA interface,
residues from the N and C termini (Arg504, Thr508,
Arg583, and Arg584), including the main chain amide
of Tyr509 at the N-terminal end of helix H1 and residues
from S2 (e.g., Arg564), make hydrogen bonds, van der
Waals contacts, and electrostatic interactions with the
DNA backbone, providing further stability to the FOXP2/
DNA complex.
Based on the major groove contacts, the DNA binding
site of FOXP2 can be defined as 5
(the core
binding sequence is in bold) (
Figure 2
B), which is similar
to that derived from in vitro selection (5
Wang et al., 2003
). DNA binding by FOXP2 shares
a similar global structure with the FOXA3/DNA complex
(Clark et al., 1993). Surprisingly, the binding site determined
for FOXA3 in the previous crystallographic study
was 5
(underlined), significantly dif-
ferent from that seen here for FOXP2. The DNA binding
mechanism of FOXA3 derived from the early study was
also significantly different from that described for
FOXP2 above. However, upon careful examination, we
found that conserved residues on helix H3 of FOXA3
(corresponding to Arg553, Asn550, His554, and Ser557
in FOXP2) bind a DNA region (bold) in 5
similarly to their counterparts in FOXP2. Nota-
bly, these residues are highly conserved among all
known members of the FOX family (
Figure 1
C). We pro-
pose that this region is a cryptic FOX binding site in the
FOXA3/DNA complex (
Clark et al., 1993
). The revised
interpretation of the DNA binding mechanism is not a re-
sult of the different DNA sequences used in the FOXA3/
DNA complex and the FOXP2/DNA complex, as con-
served DNA binding residues of FOXA3 and FOXP2 en-
gage in similar DNA binding interactions in the two com-
plexes. Thus, based on the common features of protein/
DNA interactions in the FOXA3/DNA complex and the
present structure, we are able to redefine the FOX bind-
ing sequence (5
) at the structural level,
which is consistent with the footprinting of FOXA3 and
biochemical data on the binding site of a number of FOX
proteins, including FOXK1 (5
), FOXC2 (5
), and FOXD1 (5
et al., 1989; Jin et al., 1999; Liu et al., 2002; Nirula
et al., 1997; van Dongen et al., 2000
). A major difference
in DNA binding between FOXP2 and FOXA3 is at the pe-
ripheral protein/DNA interface, where FOXA3 uses two
loops (W1 and W2) to bind the DNA backbone and minor
groove extensively. The corresponding loops in FOXP2
are much shorter and make limited DNA contacts (
et al., 1993
). Consistent with these structural observa-
tions, the forkhead domain of FOXP2 binds DNA with
a lower affinity than that of FOXA3 (Clark et al., 1993;
Li et al., 2004).
Compared with most sequence-specific transcription
factors, an unusual feature of DNA binding by FOXP2 is
its extensive utilization of van der Waals contacts and a
relatively small number of hydrogen bonds to bases in
the major groove. This shape recognition may allow
FOXP2 to bind a broad range of sequences in different
promoter contexts as long as the DNA maintains the
few hydrogen bond determinants in the core region of
Data Collection
  Resolution (Å)
  Completeness (%)          99.7 (99.1)
                            28.46 (4.8)
  Resolution (Å)
  R factor                  0.217 (0.243)
  R_free                    0.235 (0.278)
Rms deviations
  Bond lengths (Å)          0.007
  Bond angles (°)           1.1
  Average B factor (Å²)     37.6
$R_{\mathrm{sym}} = \sum |I - \langle I \rangle| / \sum I$, where $I$ is the observed intensity and $\langle I \rangle$ is the
statistically weighted average intensity of multiple observations of
symmetry-related reflections.
Numbers in parentheses are for the outer shell.
$R\ \mathrm{factor} = \sum \big| |F_o| - |F_c| \big| / \sum |F_o|$, where $|F_o|$ and $|F_c|$ are observed and
calculated structure factor amplitudes, respectively. $R_{\mathrm{free}}$ is calculated
for a randomly chosen 9.1% of reflections.
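As a rough illustration of the statistics defined in the table footnotes, the following Python sketch computes a conventional R factor over paired structure factor amplitudes, a merging R value over groups of symmetry-related intensities, and a random split of reflections into working and free sets. The function names and toy numbers are hypothetical, and the merging statistic uses a plain unweighted mean where the footnote specifies a statistically weighted average; it is not the software used for the refinement reported here.

```python
import random


def r_factor(f_obs, f_calc):
    """R = sum(| |Fo| - |Fc| |) / sum(|Fo|) over paired amplitudes."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def r_merge(groups):
    """Merging R over groups of symmetry-related intensities.

    Assumption: an unweighted mean stands in for the statistically
    weighted average named in the table footnote.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in groups:
        mean = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


def split_free_set(n_reflections, fraction, seed=0):
    """Randomly reserve a fraction of reflection indices as the free set."""
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_free = round(n_reflections * fraction)
    free = sorted(indices[:n_free])
    work = sorted(indices[n_free:])
    return work, free


if __name__ == "__main__":
    # Toy amplitudes, not data from this study.
    print(r_factor([10.0, 20.0, 30.0], [9.0, 22.0, 30.0]))
    work, free = split_free_set(1000, 0.091)
    print(len(work), len(free))
```

The free R value is simply `r_factor` evaluated on the reserved indices only, which is why those reflections must be excluded from refinement.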
detail below;
Figure 1
A, labeled 3–6). The two FOXP2
monomers bind intimately to equivalent sites on the two
segments of DNA (described below), whereas the two
swapped dimers loosely associate with DNA. The DNA-
bound monomeric form folds into the canonical winged-
helix motif characteristic of the FOX family (
Clark et al.,
Figure 1
B). Its core is composed of three stacking
α helices (H1, H2, and H3) capped at one end by a three-stranded
antiparallel β sheet (S1, S2, and S3). The turn
between H2 and H3 contains a 3₁₀ helix (H4) as seen in
other FOX proteins (Clark et al., 1993; Jin et al., 1999;
Liu et al., 2002; Weigelt et al., 2001).
Between strands S2 and S3, conventional FOX pro-
teins contain a 5–7 amino acid insert, called wing 1.
However, in FOXP2 this insert is truncated, resulting in
a simple type I turn that joins strands S2 and S3 (Figure 1C).
The C-terminal region also distinguishes the
FOXP subfamily from most other FOX proteins. In FOXA3
this region forms an extended loop, called wing 2 (W2),
that contacts DNA extensively (
Clark et al., 1993
). The
corresponding region in FOXP2 forms a helix (H5) that
runs atop H1 and terminates at the DNA phosphate
backbone (
Figure 1
B). A similar helix H5 is also observed
in the NMR structures of FOXD1 and FOXK1a (ILF-1), but
the sequences and trajectories of these helices are nota-
bly different from that of FOXP2 (
Jin et al., 1999; Liu
et al., 2002
). The heightened variability of the W1 and
W2 regions relative to the rest of the forkhead domain
across all FOX subfamilies suggests that the wings may
have specialized functions within each subfamily.
DNA Recognition
DNA recognition by FOXP2 is mediated predominantly
by helix H3 (
Figure 2
). Asn550 forms bidentate hydrogen
bonds with Ade10, whereas His554 and Arg553 form di-
rect or water-mediated hydrogen bonds with Thy10
, respectively. The main chain and side chain
atoms of Arg553, His554, Ser557, and Leu558 also make
extensive van der Waals contacts to Cyt8, Gua8′, Thy9′,
Thy11′, Ade12′, and Ade13′. A number of aromatic
or hydrophobic residues from helix H1 (Tyr509),
H2 (Leu527 and Tyr531), and strand S3 (Trp573) interact
Figure 1. Overall Structure
(A) The asymmetric unit. FOXP2 molecules
are shown as ribbon drawings. Molecules
within the swapped dimers are both orange
(3 and 5) and cyan (4 and 6). Monomers (1
and 2) are orange. The DNA phosphate back-
bone is shown as a coil in magenta.
(B) Ribbon drawing of FOXP2 in the mono-
meric form bound to DNA. The sequence of
the region of DNA pictured is below the com-
plex. DNA is shown as wire frame.
(C) Sequence alignment. FOXP proteins are
separated from other FOX proteins by a
dashed line. Differences in secondary struc-
ture in the first half of the protein (residues
503–544) between the monomer (Mono, or-
ange) and the dimer (Swap, cyan) are shown
below the sequence. The second half (resi-
dues 545–584) has the same secondary
structure in the monomer (shown in orange)
and dimer (not shown). Residues that require
significant backbone changes between the
monomeric and swapped forms are indicated
as a cyan dash superimposed on the mono-
meric secondary structure representation.
Residues involved in DNA binding (shaded
in magenta) and intermolecular interactions
in the swapped dimer (cyan circles above
the sequence) are highlighted. The arginine
(R553) linked to speech disorder is indicated
by a green filled box. Residues homologous
to those of FOXP3 linked to autoimmune dis-
eases are indicated by yellow filled boxes.
The alanine (A539) found to be critical for
swapping is shown with shaded background.
the binding site and has shape complementary to the
DNA binding surface of FOXP2. Because DNA-contacting
residues on H3 are almost absolutely conserved (Figure 1C),
this DNA binding behavior is likely common to all
FOX proteins, which do not recognize a single consensus
sequence but rather a degenerate pattern: 5
Carlsson and Mahlapuu,
). Although there is evidence that a leucine zipper
motif preceding the forkhead domain may be required
for high-affinity DNA binding by FOXP proteins (
et al., 2004
), our studies here suggest that the isolated
FOXP forkhead domain is capable of specific DNA bind-
ing based on a number of observations. First, the FOXP2
forkhead domain binds its cognate site in two indepen-
dent complexes of the crystal asymmetric unit. Second,
the detailed binding interactions observed at the FOXP2/
DNA interface are conserved in the FOXA3/DNA com-
plex. Finally, the DNA binding mechanism derived from
the FOXP2/DNA complex is consistent with biochemical
data (see above). However, given the short recognition
sequence and relatively weak DNA binding affinity of
the forkhead domain of FOXP (
Li et al., 2004
), it is likely
that specific DNA binding by FOXP proteins in vivo will
be facilitated by protein/protein interactions in higher
order transcription factor complexes (Bettelli et al.,
2005; Li et al., 2004).
Figure 3. Structure of Domain Swapping
(A) Top view of domain swapping in FOXP2. The two FOXP2 fork-
head domains are represented as orange (labeled) and cyan. Bottom
panel: two monomers have been placed in positions to illustrate the
rearrangements required for swapping.
(B) Side view of domain swapping. This view is rotated 90° around
the horizontal axis relative to (A).
(C) Stereodiagram of electron density around the core of the swap-
ped dimer interface.
(D) A number of hydrophobic residues exposed on the surface of the
monomeric FOXP2 bound to DNA. These residues include Pro506,
Phe507, Phe538, and Trp533. These residues become buried in the
domain-swapped dimer.
Figure 2. DNA Recognition
(A) Detailed interactions between the forkhead domain of human
FOXP2 (orange) and its cognate DNA site (magenta). The DNA and
protein residues are drawn as a stick model.
(B) Schematic of interactions between FOXP2 and DNA. DNA is rep-
resented as a ladder with bases as ovals and labeled according to
the text (the core sequence is highlighted by thick lines). The back-
bone phosphates are represented as circles with the letter P inside.
Hydrogen bonding interactions are solid arrows while van der Waals
interactions are dashed arrows. Secondary structure elements of
FOXP2 are boxed and labeled. Highly conserved residues that con-
tribute to DNA specificity in the FOXP2/DNA and FOXA3/DNA com-
plexes are highlighted in red. A water molecule is represented as
a circle with a W inside.
shows part of this interaction network around the 2-fold
axis of the swap. Here, Phe541 stacks face to face with
its pseudosymmetry mate, Phe538 packs face to edge
against Phe541 of its dimer partner, and Tyr540 stacks
edge to face against Phe541 of the same FOXP2 copy.
The swapped dimer buries several hydrophobic resi-
dues that are exposed in the DNA-bound monomeric
species, including Pro506, Phe507, Trp533, and Phe538.
However, these residues are highly conserved in monomeric
FOX proteins (Clark et al., 1993; Figure 3D). Thus,
burial of exposed hydrophobic residues in FOXP2 must
be supplemented by other factors that contribute to its
propensity for domain swapping.
Domain swapping in FOXP2 is a result of the exten-
sion of helix H2 through the turn connecting H2 to H3
Figures 3
A and
A), which creates a single straight 15
amino acid α helix in place of the shorter helices of H2
and H4. This region, corresponding to residues 538–
541 (FAYF) in FOXP2, is highly conserved in all FOX pro-
teins, except for residue Ala539. In classical FOX pro-
teins, this position is occupied by a proline (
Figure 1
which most likely prevents the merging of helices H2
and H4 and therefore precludes domain swapping. This
proline is strategically replaced by an alanine residue in
all FOXP members, suggesting that domain swapping is
a common feature in the FOXP family. Thus, contrary to
many cases of 3D domain swapping observed under
nonphysiological conditions or with artificially mutated
proteins, it seems that domain swapping is an adaptive
structural feature of the P branch of FOX proteins. Con-
sistent with this hypothesis, we have shown that the
forkhead domain of FOXP2 exists as both a monomer
and dimer in solution with a slow exchange rate (Figure 4B)
(see Experimental Procedures for further details).
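The argument about the hinge position can be restated as a toy check on the four-residue motif spanning residues 538–541 (FAYF in FOXP2). The Python sketch below reads the residue at position 539 and reports whether the motif is compatible with helix merging and domain swapping under the reasoning given in the text. The function names are hypothetical, and the rule is only a restatement of that reasoning (alanine permits swapping, proline blocks it), not a validated predictor; any other residue returns `None` because the text makes no claim about it.

```python
MOTIF_START = 538   # residue number of the first motif position in human FOXP2
HINGE_POSITION = 539  # position occupied by Ala in FOXP, Pro in classical FOX


def hinge_residue(motif):
    """Return the one-letter residue at position 539 of a 538-541 motif."""
    if len(motif) != 4:
        raise ValueError("expected a four-residue motif covering 538-541")
    return motif[HINGE_POSITION - MOTIF_START].upper()


def swap_compatible(motif):
    """Toy rule: Ala at 539 permits swapping, Pro at 539 blocks it.

    Returns True, False, or None when the residue is neither Ala nor Pro.
    """
    residue = hinge_residue(motif)
    if residue == "A":
        return True
    if residue == "P":
        return False
    return None


if __name__ == "__main__":
    print(swap_compatible("FAYF"))  # FOXP2 motif from the text
    print(swap_compatible("FPYF"))  # same motif carrying the Ala539Pro substitution
```

A real assessment would also need the surrounding structural context, since the text notes that burial of exposed hydrophobic residues alone does not explain the propensity to swap.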
Domain Swapping
A striking structural feature of FOXP2 is its propensity to
form a domain-swapped dimer wherein two monomers
of FOXP2 exchange helix H3 and strands S2 and S3
(Figure 3A). The swapping buries an additional 804 Å²
of solvent-accessible surface and creates a semicircular
arch with a pseudo 2-fold symmetry (
Figure 3
B). Helices
H2, H4, and H3 form the convex surface of the arch,
while helices H1 and H5 form the concave surface. An
elaborate interaction network of aromatic residues
spans the core of the swapped dimer. These residues in-
clude Phe507, Tyr509, Tyr531, Trp533, Phe534, Phe538,
Tyr540, Phe541, and Trp548 from both dimers.
Figure 3
Figure 4. Biochemical Analysis and Functional Implication of Do-
main Swapping
(A) Superposition of the domain-swapped FOXP2 dimer (cyan) and
the monomer (yellow) showing the region (blue) that undergoes sig-
nificant backbone conformational changes. This region corre-
sponds to residues 536–548 in human FOXP2, indicated by a dashed
line in
Figure 1
(B) Multiangle light scattering (MALS) analysis of the wild-type hu-
man FOXP2 (residues 503–584). The profile shows two discrete
peaks corresponding to monomer (13.8 kDa) and dimer (26.2 kDa).
Blue lines: refractive index signal profile: MALS measurement of
mass at that point of the elution. The center of the peak gives the
greatest signal-to-noise for the measurement of mass.
(C) MALS analysis of the Ala539Pro mutant of human FOXP2 (residues
503–584) showing a single monodisperse peak corresponding
to monomer (14.1 kDa).
(D) Electrostatic surface potential of the domain-swapped FOXP2 di-
mer (left). The DNA binding helix (H3) of one monomer in the domain-
swapped dimer inserts into the DNA major groove in a similar manner
to that seen in the monomer/DNA complex (shown on the right for
comparison), although its interaction with DNA is loose due to the
noncognate DNA sequence (not shown). The other monomer can
presumably bind a separate DNA substrate (
Figure 5
A) and the rela-
tively positive surface potential (blue) on top of the arch may facilitate
the binding of two strands of DNA (
Figure 5
Figure 5. A Specialized Function of FOXP Proteins May Be to Pro-
mote the Assembly of Higher Order Protein/DNA Complexes
(A) A model of the domain-swapped FOXP dimer (cyan and orange)
bound to two separated DNA sites (top view).
(B) A side view of the model. The leucine zipper (LZ) preceding the
forkhead domain, which may facilitate the formation of the domain-
swapped dimer, is also shown as a cylinder. In this view, the back-
bone of the two strands of DNA are closer on the top of the arch-
shaped dimer, where the protein has a relatively positive surface
potential (see
Figure 4
(C) Proposed roles of the FOXP dimer in DNA looping (left) and inter-
chromosomal interaction (right).
By contrast, the Ala539Pro mutant of FOXP2 exists
exclusively as a monomer in solution (
Figure 4
C). The
mechanism by which this single amino acid change pre-
vents swapping in classical FOX proteins appears to
arise from proline’s extraordinary propensity to disrupt
α helices (
Pace and Scholtz, 1998
). Fortuitously, we
have found that the optimal molar ratio of protein to
DNA is 3:1 for crystallizing the FOXP2/DNA complex, allowing
us to observe both the monomer and the domain-swapped
dimer in the crystal.
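The light scattering masses reported in Figure 4 (13.8 and 26.2 for the two wild-type peaks, 14.1 for the Ala539Pro mutant) can be turned into oligomeric states by dividing each by a monomer mass and rounding. The Python sketch below does this arithmetic. The function name and the 15% tolerance are hypothetical choices, and the measured wild-type monomer peak is used as the reference mass because no sequence-derived mass is given in the text.

```python
def oligomeric_state(measured_kda, monomer_kda, tolerance=0.15):
    """Return the nearest integer number of subunits, or None if ambiguous.

    A measurement is accepted when it lies within `tolerance` (fractional)
    of n times the monomer mass for the nearest integer n >= 1.
    """
    if measured_kda <= 0 or monomer_kda <= 0:
        raise ValueError("masses must be positive")
    ratio = measured_kda / monomer_kda
    n = round(ratio)
    if n < 1 or abs(ratio - n) > tolerance * n:
        return None
    return n


if __name__ == "__main__":
    monomer = 13.8  # wild-type monomer peak, used here as the reference
    for label, mass in [("wild-type peak 1", 13.8),
                        ("wild-type peak 2", 26.2),
                        ("Ala539Pro", 14.1)]:
        print(label, oligomeric_state(mass, monomer))
```

With these inputs the second wild-type peak is about 1.9 times the monomer mass, which the function assigns to a dimer, while the mutant peak falls within about 2% of the monomer mass.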
The N termini of the FOXP2 forkhead domain in the
swapped dimer are close to each other (
Figure 3
B). In-
terestingly, FOXP proteins contain a highly conserved
zinc finger/leucine zipper motif about 50 residues N-
terminal to the forkhead domain. This motif has been
shown to mediate dimerization of FOXP proteins (
et al., 2004; Wang et al., 2003
). In the full-length protein,
this zinc finger/leucine zipper may cooperate with the
forkhead domain to facilitate the formation of domain-
swapped dimers by FOXP proteins under physiological
concentrations. However, we cannot rule out the possi-
bility that the FOXP2 forkhead domain may also act as
a monomer to bind DNA in vivo. The thermodynamics
and functional implication of this monomer/dimer equi-
librium by the FOXP2 forkhead domain remain to be in-
vestigated. The two H3 helices in the swapped FOXP2
dimer are separated sufficiently to allow both copies of
FOXP2 to bind DNA simultaneously. However, because
the two DNA binding surfaces are connected by a rigid
protein domain characterized by extensive aromatic
interactions (see above), the DNA binding sites of the
domain-swapped FOXP2 dimer would need to be well separated
from each other or come from separate DNA strands
(Figure 5). Based on this structural feature, we propose
that a unique function of the FOXP family of proteins
is to loop DNA and/or mediate interchromosomal asso-
ciations. Consistent with this proposed role, the con-
vex surface is enriched in basic residues and has an
overall positive electrostatic surface potential, which