University of Central Florida Undergraduate Research Journal - Computational Analysis of Broad Complex Zinc-Finger Transcription Factors

Computational Analysis of Broad Complex
Zinc-Finger Transcription Factors

By: Barbara Mascareno-Shaw
Mentor: Dr. Thomas Selby


Sequence comparison of Z1/Z3 and Z2/Z4 proteins with their respective templates.

Analysis of the energetic differences between the mutant proteins and the homology model first required construction of the model from the crystal structure with the highest sequence homology.  A similar approach was taken to analyze the different mutants (if more than one was known), but in this case only a single amino acid needed to be changed.  This simplified the calculations and provided a measure of the energy change due to a single amino acid substitution.

The homology alignment differed between structures in the binding region for zinc atoms, which became the focus of this investigation.  The Z1 model sequence was aligned with the 2DRP template (chosen after searching the Protein Data Bank for crystal structures with the highest homology) as shown in Figure 3.  Statistics for the Z1 alignment to 2DRP are:  Length: 58, Score: 32.7278 bits (73), E-value: 0.035563, Identities: 14/58 (24%), Positives: 26/58 (45%), and Gaps: 0/58 (0%).  It is evident that the cysteine and histidine residues (shown in pink) are conserved in all three proteins (two mutants and the 2DRP template).  Arginine (shown in green in Figure 3A) is the naturally occurring amino acid at position 57 in the model; it is replaced by cysteine (shown in red in Figure 3A) at the same position in the mutated protein.

For the Z3 structure, two mutants were constructed based on sequences provided by the von Kalm lab.  Z3 statistics are as follows: Length: 52, Score: 40.4318 bits (93), E-value: 1.70555E-4, Identities: 14/52 (27%), Positives: 28/52 (54%), and Gaps: 0/52 (0%).  The first mutation is the substitution of serine 55 (shown in green in Figure 3B) with leucine (shown in red in Figure 3B).  The second mutation is a change from threonine 58 (shown in green in Figure 3B) to methionine (shown in red in Figure 3B).

Figure 4 shows the alignment of the 1A1H sequence with the Z2 and Z4 proteins.  There are no mutants reported for the Z2 and Z4 proteins, which have identical sequences in the zinc-finger DNA binding region.  Z2/Z4 statistics were as follows:  Length: 54, Score: 30.0314 bits (66), E-value: 0.231784, Identities: 14/54 (26%), Positives: 25/54 (46%), and Gaps: 3/54 (6%).  The Z2 and Z4 proteins are not homologous outside the DNA binding region, which allows them to function differently.
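The percentages quoted with the alignment statistics follow directly from the reported counts. As a minimal sketch (the function names and the dictionary layout are illustrative and are not part of the original analysis), they can be reproduced in Python as follows:

```python
def percent(count: int, length: int) -> int:
    """Return count/length as a percentage rounded to the nearest integer."""
    return round(100 * count / length)


# (identities, positives, gaps, alignment length) as reported in the text.
alignments = {
    "Z1 vs 2DRP": (14, 26, 0, 58),
    "Z3 vs 2DRP": (14, 28, 0, 52),
    "Z2/Z4 vs 1A1H": (14, 25, 3, 54),
}


def summarize(stats):
    """Convert raw alignment counts into rounded percentages."""
    identities, positives, gaps, length = stats
    return {
        "identity_pct": percent(identities, length),
        "positive_pct": percent(positives, length),
        "gap_pct": percent(gaps, length),
    }


for name, stats in alignments.items():
    print(name, summarize(stats))
```

All three alignments share the same number of identical residues (14); they differ mainly in the number of "positives" (similar residues) and in the presence of gaps for Z2/Z4.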

However, we did want to investigate the interaction energy differences between the template and the Z2/Z4 proteins that arise from a tyrosine residue occupying a position where a conserved histidine normally resides.  Although this substitution is not a mutation (and is known to occur in these types of proteins across a number of species), the energetic differences are of interest because they change the typical zinc binding residues from a 2H/2C system to a 2H/1C/1Y system.  Additionally, constructing two independent models from a single sequence provided a statistical measure of the error in our methods.  For this reason, the Z2 and Z4 sequences, although identical in amino acid content, will be discussed as two different proteins.

Figure 5

Interaction energy analysis of the 2DRP template, Z1, and Z3 mutant proteins

The energy minimization calculations are based on the difference in energy between the homology model and the mutant with the appropriate amino acid substitution.  Before the mutants were analyzed, however, interaction energy calculations were performed on the template and on each model.
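As a rough illustration of how such an interaction energy can be computed, the sketch below assumes the common convention that the interaction energy is the energy of the complex minus the energies of the DNA and the protein by themselves, applied term by term. The class, the function, and all numbers are hypothetical placeholders chosen for the example; they are not values from this study.

```python
from dataclasses import dataclass


@dataclass
class EnergyTerms:
    """Energy components in kJ/mol for one structure (hypothetical container)."""
    electrostatic: float
    vdw: float
    hbond: float


def interaction_energy(complex_: EnergyTerms, dna: EnergyTerms,
                       protein: EnergyTerms) -> EnergyTerms:
    """Assumed convention: E_interaction = E_complex - E_DNA - E_protein."""
    return EnergyTerms(
        electrostatic=complex_.electrostatic - dna.electrostatic - protein.electrostatic,
        vdw=complex_.vdw - dna.vdw - protein.vdw,
        hbond=complex_.hbond - dna.hbond - protein.hbond,
    )


# Placeholder numbers for illustration only (not taken from Table 1).
complex_terms = EnergyTerms(electrostatic=-9000.0, vdw=-1500.0, hbond=-500.0)
dna_terms = EnergyTerms(electrostatic=-4000.0, vdw=-600.0, hbond=-100.0)
protein_terms = EnergyTerms(electrostatic=-3000.0, vdw=-700.0, hbond=-300.0)

result = interaction_energy(complex_terms, dna_terms, protein_terms)
print(result)
```

Under this convention, a more negative value means that the bound complex is more stable than the separated DNA and protein.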

As shown in Table 1, the interaction energy for the 2DRP structure with the DNA is -2963 kJ/mol.  The electrostatic energy is very high due to the negative charges on the DNA backbone, as well as the positive charges on the amino acids in the protein.  The hydrogen bonding energy is relatively low, at -71 kJ/mol, which reflects the lower number of hydrogen bond interactions made with the DNA, relative to charge-charge interactions. 

Following construction and energy analysis of the Z1 model, the interaction energy was found to be -2761 kJ/mol.  This value is in good agreement with the crystal structure, but shows a loss of ~200 kJ/mol of energy relative to the template.  The difference is most likely a consequence of the model building method that was used, but an energy difference of only 7% between the model and template indicates that the model is very reliable.  It is interesting to note that the hydrogen bonding for the Z1 model shows an increase in stabilization (-286 kJ/mol) relative to the template structure, while the van der Waals and electrostatic energies are roughly the same.

The Z3 model was constructed and analyzed in an identical manner to the Z1 model.  Interaction energy analysis shows that the binding is similar to the template, with an energy difference of roughly 30%.  This value of -2036 kJ/mol indicates a somewhat less stable complex than the Z1 model, and it does not show an increase in hydrogen bonding stability.  The electrostatic, van der Waals, and hydrogen bonding energies are similar to those of the 2DRP template structure, although the slight differences in each appear to contribute to the overall drop in stabilization energy relative to the Z1 and 2DRP structures.

Analysis of the mutations showed that the most significant energy difference is found between the Z1 model and the Z1 mutant in which arginine 57 is replaced by a cysteine, as shown in Table 1.  The interaction energy in this case is found to be -5363 kJ/mol.  This value represents an increase of over 90% in stabilization energy relative to the Z1 model, and shows the critical importance of this amino acid position within the zinc-finger protein.  The hydrogen bonding and van der Waals interaction energies appear to contribute the most to the change relative to the model, while the electrostatic interaction values remain roughly the same.  The point of contact with the DNA is shown in Figure 5; the model is shown in panel A, and the mutant is shown in panel B.  The arginine (shown in blue) is an extended residue that can interact well with the DNA through both hydrogen bonding and electrostatic forces (due to the positive charge on the side chain), whereas cysteine (shown in red) is a shorter residue that can only interact with DNA through polar and hydrogen bonding interactions.  It is also important to point out that the change from a bulky arginine side chain to a smaller one, such as cysteine, allows other amino acids to reposition themselves as well.  Serine 161 moves into a position closer to the DNA (which helps account for the increase in hydrogen bonding energy) once the arginine side chain, which was blocking the interaction in the wild type structure, is removed.  This demonstrates that a single change at one amino acid position can produce effects outside of that region and result in unanticipated increases in interaction energy.
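The percentage comparisons quoted above can be checked directly from the interaction energies given in the text. The following is a minimal Python sketch; the helper name `relative_change` and the choice to compare magnitudes are illustrative conventions, not something specified in the original analysis.

```python
def relative_change(value: float, reference: float) -> float:
    """Percent change in the magnitude of `value` relative to `reference`."""
    return 100 * (abs(value) - abs(reference)) / abs(reference)


# Interaction energies (kJ/mol) as quoted in the text.
TEMPLATE_2DRP = -2963.0
Z1_MODEL = -2761.0
Z3_MODEL = -2036.0
Z1_R57C = -5363.0

z1_vs_template = round(relative_change(Z1_MODEL, TEMPLATE_2DRP), 1)
z3_vs_template = round(relative_change(Z3_MODEL, TEMPLATE_2DRP), 1)
r57c_vs_z1 = round(relative_change(Z1_R57C, Z1_MODEL), 1)

print(z1_vs_template)  # -6.8  (about 7% less stabilization than the template)
print(z3_vs_template)  # -31.3 (roughly 30% less)
print(r57c_vs_z1)      # 94.2  (over 90% more stabilization than the Z1 model)
```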


[Table of interaction energies; numeric values not preserved. Surviving labels: Interaction Energy; 2DRP Crystal Structure; Z1 Model (based on 2DRP); Z3 Model (based on 2DRP); Hydrogen Bonding (six rows).]


Table 1.   Interaction energy values for the template (2DRP), the Z1 and Z3 models, and their respective mutants, with the electrostatic, van der Waals (VDW), and hydrogen bonding contributions compared for each. For each structure, energies (kJ/mol) are listed for the complex, for the DNA by itself (the substrate), and for the protein by itself (the binding entity), together with the interaction energy obtained as the difference between them.

Figure 6

The first Z3 mutation involves the replacement of serine 55 with a leucine.  This change results in very little energy difference relative to the Z3 structure.  The overall interaction energy is -2047 kJ/mol, which differs by less than 1% from the Z3 model interaction energy of -2038 kJ/mol (Table 1).  When the wild type and the mutant are analyzed side by side, as seen in Figure 6, it is clear that the leucine residue does not contribute significantly to the binding of the DNA strand.  This position is outside of the DNA interacting region, and the amino acid at this position does not appear to be critical for DNA binding and/or recognition.  Additionally, the electrostatic, van der Waals, and hydrogen bonding energies are almost identical, suggesting that the effect of this mutation in vivo is not due to DNA binding, but may be associated with other factors, as discussed further in the Conclusions section.

The second mutation of Z3, the substitution of threonine 58 with methionine, had an overall stabilizing effect, although the increase in interaction energy was not as large as that observed for the Z1 mutation.  This change increased the stabilization of the DNA interaction by 553 kJ/mol, as shown in Table 1.  The electrostatic energy changed slightly as well, but the van der Waals energy was roughly the same.  This change of 27% relative to the Z3 model cannot readily be explained by the type of side chain substitution that was made.  The threonine and methionine side chains are roughly the same size, and both can take part in hydrogen bonding (although the methionine thioether sulfur is a much weaker hydrogen bonding partner than the more polar threonine hydroxyl group), which reduces the likelihood that the changes are due to neighboring side chains being repositioned (as observed for the Z1:R57C mutation).  The structures are shown in Figure 7.  There is, however, a clear increase in the hydrogen bonding stabilization relative to the model.  The difference in this case may be due to the relative hydrogen bonding energy of the proteins alone (without DNA bound).  Note that for the free protein (E-Protein, Table 1), the hydrogen bonding energy of the Z3 model is -327.7 kJ/mol compared to the Z3:T58M value of -249.7 kJ/mol.  This indicates that the Z3 wild type has better hydrogen bonding with itself in the unbound state, and these hydrogen bonds are lost upon DNA binding.  The T58M mutant has a less favorable value (-249.7 kJ/mol), indicating that it does not use hydrogen bonding as effectively in the unbound state.  These types of differences demonstrate the balance between the free and bound forms, and indicate that some values may be slightly inflated due to more stability in the free form as compared to the bound form.  This approach also demonstrates how both structural and computational information should be used to determine the significance of the energy values.
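The balance between the free and bound forms can be made concrete with a small Python sketch. It assumes that the hydrogen bonding contribution to the interaction energy is the complex term minus the DNA-only and protein-only terms. Only the two free-protein values come from the text; the complex and DNA values are hypothetical and are deliberately held equal for both proteins so that the effect of the free-protein term is isolated.

```python
def hbond_interaction(complex_hb: float, dna_hb: float, protein_hb: float) -> float:
    """Assumed convention: hydrogen bonding term of the complex minus the free parts."""
    return complex_hb - dna_hb - protein_hb


# Free-protein hydrogen bonding energies (kJ/mol) quoted in the text.
Z3_FREE_HB = -327.7
T58M_FREE_HB = -249.7

# Hypothetical values, identical for both proteins (not taken from Table 1).
COMPLEX_HB = -900.0
DNA_HB = -400.0

wild_type = hbond_interaction(COMPLEX_HB, DNA_HB, Z3_FREE_HB)
mutant = hbond_interaction(COMPLEX_HB, DNA_HB, T58M_FREE_HB)
shift = round(mutant - wild_type, 1)

print(round(wild_type, 1), round(mutant, 1), shift)
```

With everything else held equal, the mutant appears 78.0 kJ/mol more stabilized by hydrogen bonding simply because its free form is less well hydrogen bonded, which is the sense in which an interaction energy can be inflated by the state of the unbound protein.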

Z2/Z4 analysis

As shown in Table 2, the interaction energy for the experimentally determined crystal structure (1A1H) was found to be -941 kJ/mol.  This value is considerably smaller in magnitude than that of the 2DRP structure (-2963 kJ/mol), but it is important to point out that zinc-finger proteins with little homology will often have different interaction energies.  This structure was determined by X-ray crystallography, so these energy values are much more reliable than those obtained through homology modeling.

As mentioned previously, the Z2 and Z4 protein sequences are identical, but to validate our method we constructed two independent models.  These models utilized the same analytical method, but used different starting positions for all the atoms that were minimized.  The Z2 and Z4 energies were in excellent agreement with one another.  Although there were slight variations in the individual energy terms (electrostatic, van der Waals, and hydrogen bonding), the overall interaction energies were almost identical.

1A1H is only 46% homologous to the Z2 and Z4 proteins (described as “positives” in the previous section).  This value is similar to those for Z1 and Z3, which had 45% and 54% homology with 2DRP, respectively.  What came as a surprise was the difference in interaction energy between the Z2 and Z4 models and the 1A1H crystal structure.  Recall that the interaction energies of Z1 and Z3 differed from that of the 2DRP structure by 7% and roughly 30%, respectively, indicating good agreement between each model and its template.  In the case of Z2/Z4, an energy difference of nearly 50% indicates that the template may not be suitable for modeling this sequence.  Taken together, these comparisons (45% homology with a 7% energy difference for Z1, 54% with roughly 30% for Z3, and 46% with nearly 50% for Z2/Z4) suggest that there is no simple correlation between homology and the difference in energy between model and template.  Based on these values, it is difficult to state clearly the effect of having a tyrosine occupy a position where a histidine is typically located.  It is clear from the structural analysis (Figure 8) that the tyrosine side chain will have a different type of interaction compared to the imidazole group of histidine; however, without a more reliable template to model the sequence against, any further analysis would likely contain significant errors.
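The lack of correlation can be illustrated by computing a Pearson coefficient over the three homology and energy-difference pairs quoted above. This is only a sketch: the "nearly 50%" figure is entered as 50 and "roughly 30%" as 30, and three data points are far too few to support any statistical conclusion. The helper `pearson` is written out here for illustration and was not part of the original analysis.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Order: Z1, Z3, Z2/Z4. Approximate percentages as stated in the text.
homology_pct = [45, 54, 46]
energy_diff_pct = [7, 30, 50]

r = pearson(homology_pct, energy_diff_pct)
print(round(r, 2))  # 0.14
```

A coefficient this close to zero is consistent with the observation that a higher percentage of positives did not translate into closer agreement between model and template energies.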

This study demonstrates that the Z2/Z4 protein would be an excellent target for experimental structure determination using X-ray crystallography or NMR, but that the current homology model should not be used for energy analysis.  At a minimum, a more suitable template would be needed to accurately determine the effects of any amino acid substitution on the protein.


[Table of interaction energies; numeric values not preserved. Surviving labels: Interaction Energy; 1A1H Crystal Structure; Z2 Model (based on 1A1H); Z4 Model (based on 1A1H); Hydrogen Bonding (three rows).]


Table 2.  Interaction energy values for the template (1A1H), Z2 model, and Z4 model, with the electrostatic, van der Waals (VDW), and hydrogen bonding contributions compared for each. For each structure, energies (kJ/mol) are listed for the complex, for the DNA by itself (the substrate), and for the protein by itself (the binding entity), together with the interaction energy obtained as the difference between them.
