Protein disorder and autoinhibition: The role of multivalency and effective concentration

Autoinhibition is a common method of self-regulation in proteins, where two or more domains interact to modulate the equilibrium between active and inactive states, and which can be facilitated by binding to an external partner, post-translational modifications (PTMs), or changes in environmental conditions [1,2]. The effective coordination of domains that contribute to an autoinhibitory mechanism is essential for fine-tuning protein function. Mutations that disrupt autoinhibition can lead to a constitutively active state that is associated with disease [3], with an early, well-documented example in the tyrosine-kinase Src (c-Src), in which disrupted autoinhibition is oncogenic [4]. Autoinhibition is notoriously difficult to predict and is often discovered incidentally during truncation studies, insertional mutagenesis, or genetic screens [5,6].

Intrinsically disordered regions (IDRs) commonly participate in autoinhibition, including the regulation of oligomerization, catalysis, binding affinity, and binding specificity, as well as the induction of phase separation [7, 8, 9]. The role of IDRs in autoinhibition often depends on binding to an ordered domain that may result in direct competition with another binding partner or induce allosteric changes distal to the binding site [4,7,10]. The binding of IDRs to ordered domains can involve a combination of features like coupled folding and binding (CFB), fuzzy binding, multivalency, allovalency, avidity, and effective concentration that result in a complex matrix of effects on affinity, specificity, and function (see Glossary in the supplement for definitions of uncommon terms) [11,12]. Common IDR properties such as low sequence complexity, relatively fast evolution [13], a spectrum of possible secondary structures in the bound state [14], and the inherent difficulties in structurally modeling them make predicting and characterizing their role in autoinhibition challenging. Since IDRs make up 30–50% of the eukaryotic proteome, we suspect that IDR-mediated regulation of protein function through autoinhibition is very common, especially in nucleic acid binding proteins where disorder is enriched [15] and can dramatically influence binding specificity [16].

In this review, we discuss some recent examples of autoinhibition regulated by IDRs and describe how multivalency and effective concentration are key features. We also discuss methods used to identify and characterize autoinhibitory IDRs and how new tools like AlphaFold2 will facilitate discovery of IDR-mediated protein autoinhibition.

The evolutionary pressures on IDRs depend on their function and binding mode. Structural constraints are weaker than for ordered domains; therefore, IDRs may be highly conserved in length, backbone flexibility, and residue type but still have low sequence conservation, with exceptions at sites that facilitate inter- or intramolecular interactions, including short linear motifs (SLiMs) that directly regulate IDR-mediated protein activity, SLiM-flanking regions that modulate IDR specificity, and PTM sites [11,17,18]. Based on sequence modeling, SLiMs are abundant, with over a million predicted peptide motifs; however, the Eukaryotic Linear Motif database contains only a few thousand curated entries, highlighting the challenges of SLiM identification and validation [18,19].

Box 1 shows a collection of proteins (see Footnote 1 for abbreviations) in which autoinhibition depends on the interaction between an IDR and an ordered domain, illustrating the variability in sequence conservation, binding modes, and functions of autoinhibitory IDRs. The multiple sequence alignments (MSA) in Supplemental Figure 1 show the levels of sequence conservation for these proteins, where some autoinhibitory IDRs are well conserved among vertebrates (c-Src, ETS1) [4,20], or recognizable even in non-metazoans (U2AF2, Mcm4, HMGB1) [21, 22, 23]; alternatively, the autoinhibitory function may be conserved despite low sequence similarity (TFB2M) [24] or conserved only in some species (immune receptor RIG-I) [25]. When there are enough sequences and conservation is high, direct coupling analysis can be used to identify covariance between an IDR and an ordered domain [26, 27, 28].

Autoinhibition by disordered motifs provides a flexible platform for natural selection where binding can be mediated by variable combinations of hydrophobic, polar, and/or ionic interactions. An IDR can bind an ordered domain along the entire spectrum of disorder to order. This includes relatively static structures that fold upon binding (Rb) [29] (Box 1). There are also examples where there is partial folding upon binding (Mcm4, UHRF1, MDMX) [22,30,31] and cases where the bound state forms an ensemble of dynamic, multivalent structures that remain fuzzy in the bound state (p53, HMGB1) [32,33] (Box 1). In many cases, motifs that fold into static structures will form amphipathic helices with extensive burial of hydrophobic surface area (Rb) [29], while fuzzy interactions are often mediated by the side chains of charged residues (HMGB1) [33] (Box 1). The autoinhibitory motifs for the proteins listed in Box 1 have variable sequence lengths between 6 and 12 residues, but flanking regions are known to play a role for several examples and sequence length is not well correlated with the binding mechanism (i.e., CFB vs fuzzy) [11].

Autoinhibition may occur through direct competition, allosteric interactions, or a combination of both (ETS1, Rb) [20,29]. In all cases, a specific function is impaired, leading to outcomes that are desirable for regulation, such as a reduction in nucleic acid binding that ultimately increases target specificity (ETS1, U2AF2, TFB2M, RIG-I, p53, FOXO4) [20,21,24,25,32,34,35] or increases target DNA search speed (HMGB1) [33]. Autoinhibitory IDRs are also implicated in the regulation of helicase activity (Mcm4) [22], reduction in protein-protein binding (MDMX, Rb, RBBP1, UHRF1) [29, 30, 31,36] that can be alleviated by PTMs, inhibition of phase separation (Tau, HP1α) [8,9], and inhibition of enzymatic functions in kinases and phospholipases (c-Src, PLCβ3) [4,37].

The tethering of autoinhibitory SLiMs to ordered domains by a flexible linker modulates the strength of the intramolecular interaction [16], and the effective concentration is determined by properties like sequence length, sequence patterning, and charge distribution [38]. Domain organization also plays an important role in the regulation of autoinhibition. Figure 1 shows a sampling of domain maps and MSA observed for proteins with autoinhibitory IDRs. The autoinhibitory IDRs described here all have relatively weak binding affinities for their ordered partners that decrease binding to an external partner by 10- to >400-fold.
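How a weak intramolecular affinity can still produce a large fold reduction in external binding can be illustrated with a minimal two-state sketch. It assumes, for illustration only, that the IDR competes directly with the external partner, that only the open state binds the partner, and that the closed/open equilibrium constant equals the effective concentration divided by the intramolecular KD. The function names and the numerical values are hypothetical and are not taken from the studies cited above.

```python
def fraction_closed(c_eff, kd_intra):
    """Fraction of molecules in the closed (autoinhibited) state.

    Assumes a two-state intramolecular equilibrium in which
    K_closed = c_eff / kd_intra (both in the same units, e.g. molar).
    """
    k_closed = c_eff / kd_intra
    return k_closed / (1.0 + k_closed)


def fold_inhibition(c_eff, kd_intra):
    """Apparent weakening of external-partner binding.

    For purely competitive autoinhibition in this model,
    KD_apparent / KD_open = 1 + c_eff / kd_intra.
    """
    return 1.0 + c_eff / kd_intra


def ratio_for_fold(fold):
    """c_eff / kd_intra needed to reach a given fold inhibition."""
    return fold - 1.0


if __name__ == "__main__":
    # Hypothetical values: a millimolar intramolecular KD and a
    # ten-fold higher effective concentration.
    kd_intra = 1e-3   # M (assumed)
    c_eff = 10e-3     # M (assumed)
    print("fraction closed:", fraction_closed(c_eff, kd_intra))
    print("fold inhibition:", fold_inhibition(c_eff, kd_intra))
    # Ratios of c_eff to kd_intra that would give the 10-fold and
    # 400-fold ends of the range quoted in the text.
    print("ratio for 10-fold:", ratio_for_fold(10.0))
    print("ratio for 400-fold:", ratio_for_fold(400.0))
```

In this simplified picture the fold inhibition depends only on the ratio of effective concentration to intramolecular KD, so a millimolar interaction can be strongly inhibitory when the tether keeps the motif at a comparable or higher local concentration. Allosteric or multivalent mechanisms would need a more detailed model.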

Shown in Figure 1a, DNA binding of the Forkhead domain (FH) of Forkhead Box O4 protein (FOXO4) is inhibited by an interaction with its disordered CR3 domain, which forms a helix upon binding that contains several charged and hydrophobic contacts [34] and whose sequence is well-conserved among vertebrates and FOXO paralogs [39]. The CR3-FH KD is in the micromolar range and effective concentration of CR3 is mediated by a long, disordered C-terminal tail separating it from FH [34]. Autoinhibition is relieved upon interaction of both domains with β-catenin or target DNA.

Shown in Figure 1b, two disordered subdomains of the N-terminal region of the p53 tumor suppressor participate in autoinhibition of the DNA binding domain (DBD). The second transactivation domain (TAD2) competes with DNA for binding to a pocket on DBD via a poorly conserved combination of charge-based and specific interactions and is tuned by phosphorylation [32,40,41]. The TAD2-DBD KD is in the millimolar range, yet the short distance separating the domains results in a high effective concentration. The relatively rigid proline-rich region (PRR) interacts with DBD, restricting the spatial orientation of TAD2 and reducing the local concentration of TAD2 at the DBD binding site. This behavior is analogous to the frustrated energy landscapes proposed for protein folding [32,41,42].

In many cases, the disordered motif has a weak binding affinity for the ordered domain that is enhanced by the presence of multiple motifs (allovalency) or by the length of a flexible linker (avidity) [11,38]. Both effects contribute to autoinhibition of p53 binding for the Double Minute 4 protein (MDMX) [31]. Shown in Figure 1c, two SLiMs in MDMX, referred to as WW and WF, exhibit allovalency to bind and inhibit the p53 binding domain (p53BD) [31]. The length and sequence properties of the MDMX linker between p53BD and the SLiMs optimize autoinhibition [31,38]. Autoinhibition is relieved by Ck1α phosphorylation of residue S289, which makes the WW and WF motifs available to inhibit p53 DNA binding [43]. MSA of the WW motif shows good conservation of the SLiMs, while sequences flanking the motifs exhibit lower conservation.

Figure 1d shows the multidomain autoinhibition of the Retinoblastoma protein (Rb). The disordered Rb pocket domain loop (RbPL) interacts with a binding site created by Pocket A and Pocket B domains, directly competing with E2F, while the linker from the independently folded domain of Rb (RbIDL) allosterically regulates E2F binding by collapsing the N-terminal domain B (RbNB) onto Pocket A and disrupting the binding site [29]. Both sites are regulated by Cdk phosphorylation. The KD values of these interactions are not known; however, the structure of a shortened RbPL in complex with Pocket domains A and B was determined by X-ray crystallography. MSA of the RbPL domain shows good conservation at the interaction site.

IDR-mediated autoinhibition may involve different combinations of ordered and disordered domains, as shown in Figure 2. The interaction between the disordered motif and the ordered domain can lead to a smaller than expected hydrodynamic radius [31,40]. This sort of intramolecular collapse seems common but is difficult to model. Figure 2a shows two IDRs that interact and result in intramolecular collapse of the elongated protein into a more compact state. In the K18 region of Tau, four adjacent disordered motifs have fuzzy interactions which lead to a more collapsed state than expected for an IDR of the same length without intramolecular interactions, and this state inhibits the transition to an elongated form that initiates phase separation (Box 1) [8]. In the measles virus nucleoprotein, CFB to an ordered partner is disfavored by the disordered C-terminus that destabilizes the bound motif, favoring the elongated autoinhibited state [44].

A single IDR may inhibit a single ordered domain via an N- or C-terminal tail, shown in Figure 2b (FOXO4, Mcm4) [22,34]. Multiple ordered domains can form a pocket that is inhibited directly (HMGB1, TFB2M, Rb RbPL, UHRF1) or allosterically (RbIDL) by an IDR [24,29,30,33], shown in Figure 2c. An IDR that joins two ordered domains can facilitate intramolecular collapse of the protein. In these cases, a disordered linker that separates the ordered domains inherently affects their spatial orientation [45] and thus regulates their association and, potentially, frustration [46]. In some cases, inhibition occurs by occluding a binding pocket or by charge-based competition with external binding partners (RIG-I, U2AF2, PLCβ3) [21,25,37].

Figure 2d shows autoinhibition mediated by multiple adjacent SLiMs or IDRs that bind an ordered domain at individual sites (avidity) or at the same site (allovalency). Multivalency of adjacent SLiMs may enhance autoinhibition as the binding of one motif to the ordered domain functions as a tether that increases effective concentration of the other motif [11]. Avidity can result in increased autoinhibition of a protein where two SLiMs separated by a short linker bind to adjacent sites on an ordered domain (ETS1) [20]. SLiM flanking regions can also stabilize autoinhibition, such as the N-terminal regions of the WW motif in MDMX [31]. MDMX also exhibits allovalency, where the WW and WF motifs compete for the same site on p53BD [31]. Adjacent SLiMs can also regulate autoinhibition, which is observed for p53 where the PRR reduces autoinhibition of DNA binding by decreasing the ability of TAD2 to bind DBD, presumably projecting TAD2 away from DBD [41].

Flanking IDRs on either side of an ordered domain, Figure 2e, can have similar effects as adjacent IDRs. As with adjacent IDRs, multiple, nonadjacent autoinhibitory elements may cooperate, compete, or frustrate one another (ETS1, MeCP2) [20,47]. Multi-domain interactions, Figure 2f, are common among proteins that are autoinhibited by IDRs and may occur with different domain combinations based on PTMs or environmental conditions (Rb, RBBP1, Vav1, HP1α) [20,29,36,48]. Interactions between ordered domains separated by a flexible linker can be disrupted by changing the pattern of charged residues in the linker and increasing or reducing the length of the linker, reducing the effective concentration [30,37,49].
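The observation that both lengthening and shortening a linker can lower the effective concentration can be illustrated with a simple polymer estimate. The sketch below treats the linker as a Gaussian chain whose mean-square end-to-end distance is 2 × persistence length × contour length, and evaluates the end-to-end probability density at the distance separating the tether point from the binding site. The persistence length (4 Å), the per-residue contour length (3.8 Å), the 30 Å separation, and the function name are illustrative assumptions, not values from the cited studies. The model also ignores sequence patterning, charge, and excluded volume from the ordered domains.

```python
import math

AVOGADRO = 6.02214076e23      # molecules per mole
CUBIC_ANGSTROM_PER_LITRE = 1e27


def effective_concentration(n_residues, distance_A,
                            residue_length_A=3.8,
                            persistence_length_A=4.0):
    """Gaussian-chain estimate of effective concentration in molar.

    n_residues: linker length in residues.
    distance_A: separation (angstroms) the linker ends must span.
    The default chain parameters are assumed, typical-order values.
    """
    contour = n_residues * residue_length_A
    mean_sq = 2.0 * persistence_length_A * contour  # <r^2> in A^2
    # Probability density (per cubic angstrom) of finding the chain
    # ends separated by distance_A.
    density = ((3.0 / (2.0 * math.pi * mean_sq)) ** 1.5
               * math.exp(-3.0 * distance_A ** 2 / (2.0 * mean_sq)))
    # Convert molecules per cubic angstrom to moles per litre.
    return density * CUBIC_ANGSTROM_PER_LITRE / AVOGADRO


if __name__ == "__main__":
    # Scan hypothetical linker lengths for a 30 A separation.
    for n in (5, 30, 200):
        c = effective_concentration(n, 30.0)
        print(f"{n:4d} residues: {c * 1e3:.3g} mM")
```

Under these assumptions, a very short linker rarely spans the required distance and a very long linker dilutes the motif over a larger volume, so the estimate peaks at an intermediate length. Real linkers deviate from ideal-chain behavior, so this is best read as a qualitative guide.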

Autoinhibitory IDRs are challenging to characterize, especially when they exhibit conformational heterogeneity in the bound state. Therefore, multiple biophysical techniques are used to assess affinity, size, structure, and dynamics of autoinhibitory IDRs. IDR-mediated autoinhibition can result in a smaller than expected Stokes radius (RH) due to intramolecular collapse of the elongated IDR [50, 51, 52]. In these cases, autoinhibition can be revealed by determining the RH of wild-type and mutant/truncated proteins with dynamic light scattering (MDMX and MeCP2) [31,47], or size-exclusion chromatography (p53 and PAKs) [40,53]. However, predicting RH for proteins with a mixture of ordered and disordered domains is challenging, which makes comparative studies of multiple mutants essential [52]. Small-angle X-ray scattering (SAXS) is another important technique that provides size information for proteins with a mixture of ordered and disordered regions. Combining SAXS with computational tools like the ensemble optimization method (EOM) provides a structural model of conformational heterogeneity [54] and can be used to identify how IDR flexibility regulates protein function (SMAD4 and SMAD2) [55,56]. Intramolecular interactions involved in collapse and autoinhibition can be confirmed by time-resolved Förster resonance energy transfer (FRET), as was used for FOXO4 [34].
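One way to frame such comparative RH measurements is to place each measured value between reference values expected for a fully folded and a fully unfolded chain of the same length. The sketch below uses commonly quoted empirical power laws for folded and chemically denatured proteins. The coefficients are assumptions introduced here, not values from this review, and they give only rough bounds for proteins that mix ordered and disordered regions. The function names and the example measurements are hypothetical.

```python
def rh_folded(n_residues):
    """Reference RH (angstroms) for a compact folded chain.

    Empirical power law; coefficients are assumed, commonly quoted values.
    """
    return 4.75 * n_residues ** 0.29


def rh_unfolded(n_residues):
    """Reference RH (angstroms) for a chemically denatured chain.

    Empirical power law; coefficients are assumed, commonly quoted values.
    """
    return 2.21 * n_residues ** 0.57


def compaction_index(rh_measured_A, n_residues):
    """Position of a measured RH between the two references.

    Returns about 1 for folded-like compaction and about 0 for an
    unfolded-like chain. Only meaningful when the unfolded reference
    exceeds the folded one (roughly more than 16 residues here).
    """
    folded = rh_folded(n_residues)
    unfolded = rh_unfolded(n_residues)
    return (unfolded - rh_measured_A) / (unfolded - folded)


if __name__ == "__main__":
    # Hypothetical measurements: wild type versus a mutant in which
    # the autoinhibitory motif has been disrupted.
    n = 300
    for label, rh in (("wild type", 32.0), ("motif mutant", 38.0)):
        print(label, "compaction index:", round(compaction_index(rh, n), 2))
```

A drop in the compaction index on mutating or deleting a candidate motif, at constant or near-constant chain length, would be consistent with loss of intramolecular collapse, though it does not by itself identify the contact.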

To develop a comprehensive mechanism of single-site or multivalent autoinhibition, nuclear magnetic resonance (NMR) spectroscopy can be combined with isothermal titration calorimetry (ITC). NMR spectroscopy can be used to identify the residues directly involved in intramolecular interactions and their associated exchange regimes [57,58]. Using NMR spectroscopy, the disordered transactivation domain, CR3, of FOXO4 was shown to contact residues in the FH DNA binding pocket and change the FH exchange regime for non-target DNA without impacting the slow-exchange regime for target DNA (Figure 1a) [59,60]. NMR can also identify individual SLiMs involved in complex multivalent interactions with a single ordered domain and distinguish their individual binding modes, as with the three MDMX AD SLiMs, WW, and WF, binding to p53BD [31,48]. ITC can measure binding thermodynamics to complement NMR data, as with FOXO4 and MDMX. The fast-exchange regime of FOXO4 is suggested by ITC to be important for the autoinhibition of CR3 transactivation activity until FOXO4 finds its promoter [34,59]. In MDMX, the WW and WF motifs use allovalency to bind the p53BD primary pocket while a disordered segment, flanking the N-terminus of WW, uses avidity to bind the p53BD secondary pocket [31]. Fluorescence anisotropy is used to assess binding where conditions make other binding assays impractical, such as in conditions of very low or high salt and temperature and at the extremes of low and high binding affinity, as with p53 TAD2 autoinhibition of DNA binding [35,40].

X-ray crystallography or cryo-EM may be used to visualize CFB when the bound structure is relatively static (Rb, Mcm4) [22,29]. However, autoinhibitory IDRs often form dynamic, multivalent structures that require techniques like NMR spectroscopy or single-molecule FRET to analyze the ensemble of structures (HMGB1, UHRF1, p53) [30,32,33]. NMR spectroscopy of protein backbone residues in the apo state or partner-bound state can be used to analyze residue-specific transient secondary structure [61]. Circular dichroism (CD) can measure the CFB structure of SLiMs when they are not observable by NMR or X-ray crystallography [52,62], as in the case of autoinhibited MDMX and MeCP2 [31,47]. High-speed atomic force microscopy can directly visualize dynamic IDRs at low resolution [63,64].

Molecular dynamics simulations can provide support for experimental findings and can generate ensembles that are compared to experimental results [65]. Recent advances in machine learning-based predictions of protein structure offer promising insights for studies of autoinhibition mediated by IDRs. As shown in Figure 3a and b, AlphaFold2 accurately predicts transient secondary structure at the known CFB SLiMs of CR3 in FOXO4 and TAD2 in p53 with low confidence scores thought to be associated with disordered regions [27,28]; however, it does not predict the intramolecular docking of these regions [34,59].

For MDMX, AlphaFold2 predicts the multivalency of the WW motif and the N-terminal flanking region for p53BD, which mediates a known CFB event responsible for autoinhibition (Figure 3c) [27,28,31]. For Rb, AlphaFold2 predicts the CFB of RbPL to Pocket A and Pocket B; however, it does not predict the RbIDL interaction with Pocket A and RbNB (Figure 3d) [27,28]. Limitations of AlphaFold2 predictions for IDR-mediated autoinhibition may be dependent on SLiM binding properties and IDR-mediated interactions inferred from the AlphaFold2 training datasets.

AlphaFold2 was also used to show shared structural dynamics between CR3 sequences of FOXO1 and FOXO6 with FOXO4, suggesting autoinhibitory elements in unexplored paralogs [39] despite not predicting the intramolecular interaction of FH and CR3 (Figure 3a). In predicting multivalency and CFB in the MDMX WW motif for p53BD, AlphaFold2 agrees with our previously published CD data showing an increase in helicity of the bound WW motif and NMR data suggesting that a second SLiM, N-terminally flanking WW, binds a discrete secondary binding pocket on p53BD, as shown in Supplemental Figure 2a [27,28,31].

Investigations into the conservation of the flanking N-terminal WW region using MSA and AlphaFold2 suggest that amino acid sequence, transient helicity, and CFB propensity remain conserved throughout multiple species lineages (Figure 1c and Supplemental Figure 2) [27,28]. The MDMX AD autoinhibitory elements have been observed in human and mouse MDMX, while various other species have only been shown to exhibit similar in vivo roles in regulating p53 by MDMX [66], suggesting that the conserved intramolecular binding properties of the WW motif to p53BD predicted by AlphaFold2 may be indicative of conserved autoinhibitory functions.

AlphaFold2 is currently able to create confident predictions for >50% of the human proteome but a low percentage of these predictions represent IDRs. Increasing the propensity and confidence of AlphaFold2 predictions of IDR-mediated autoinhibition will depend on novel training data, including structures of the intramolecular complexes from either experiment or simulation.
