Contact capacity potentials

Because of these computational problems, a different set of potential functions has been proposed that allows for more efficient optimization algorithms.

a) Elastase b) Trypsin

This figure shows that two proteins with very similar structures can have widely differing chemical makeup. a) shows a full-atom model of porcine elastase (1est.pdb, 240 amino acids). The atoms are colored according to the chemical composition of the respective residue (red = positively charged, blue = negatively charged, yellow = polar, white = nonpolar). b) shows the same picture of bovine trypsin (1tpp.pdb, 223 amino acids). Elastase and trypsin are both serine proteases and have a sequence identity of about 36% and 39%, respectively. c) shows a superposition of the backbones of elastase and trypsin, revealing that their structures are practically identical (222 superposed residues with an rms of 2.6.9). Still, the chemical composition revealed by the coloration of a) and b) is quite different.

c) Superposition of the backbones of elastase (red) and trypsin (blue)

Such potentials are sometimes called profiles, as the structure is abstracted into a linear representation (the profile) of positional score values for the 20 amino acids at each position. A profile does not explicitly score the energy of pair interactions in the resulting protein structure. Rather, the idea of the profile is based on the hypothesis that the protein target sequence inherits not only the backbone conformation from the template structure but also the chemical composition. Then, for example, we can assume hydrophobic patches in the target protein where they occur in the template protein, and H-bond donor regions are preserved, as are polar regions, etc. (see Figure 6.2b). If we assume this hypothesis, then we have to score not pairs of amino acids but single amino acids, each against a given and constant chemical environment. Aligning, say, a residue a_i in the target sequence against a residue b_j in the template sequence would be scored according to how often a residue of type a_i is found within the chemical environment that the template protein structure presents for residue b_j (see Figure 6.2, page 252). The alignment optimization problem now becomes much simpler. Essentially, the sequence alignment methods described in Chapter 2 can be used. We only have to replace the amino-acid substitution matrix used for scoring in sequence alignment by the threading profile. Thus alignment optimization can be done with dynamic programming in polynomial time (see Section 6.3.2). In practice, several optimal alignments can be computed per second on a workstation in this case.
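The replacement of the substitution matrix by a profile can be illustrated with a minimal sketch. It assumes a global alignment with a linear gap cost; the function name thread_score, the gap value, and the toy profile scores are hypothetical and are not taken from any particular threading program. Each template position carries its own table of scores for the amino acids, so the score of a match depends on the template position rather than on the template residue type.

```python
from typing import Dict, List

# Hypothetical profile: one dict per template position, mapping an
# amino-acid letter to the score for placing it in that environment.
Profile = List[Dict[str, float]]


def thread_score(target: str, profile: Profile, gap: float = -1.0) -> float:
    """Best global alignment score of a target sequence against a profile.

    Assumption: linear gap cost `gap` per skipped residue or position;
    amino acids missing from a position's table score 0.0.
    """
    n, m = len(target), len(profile)
    # dp[i][j] = best score for aligning target[:i] with profile[:j]
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # The profile lookup replaces the substitution-matrix lookup.
            match = dp[i - 1][j - 1] + profile[j - 1].get(target[i - 1], 0.0)
            dp[i][j] = max(match, dp[i - 1][j] + gap, dp[i][j - 1] + gap)
    return dp[n][m]


# Toy profile with three template positions (scores are made up).
toy_profile: Profile = [
    {"L": 2.0, "K": -1.0},  # e.g. a buried, hydrophobic environment
    {"K": 2.0, "L": -1.0},  # e.g. an exposed, polar environment
    {"L": 1.0},
]
score = thread_score("LK", toy_profile)
```

The table has (n+1) x (m+1) entries and each entry is filled in constant time, which is the polynomial running time referred to above.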

Of course, the hypothesis that the chemical composition is identical in the template and the target protein is often false (see Figure 6.3). Thus one has to apply both kinds of potential functions in concert to make suitable structure predictions (see Section 6.3).

Bowie et al. first introduced profile scoring functions for protein threading [136]. They consider the type of residue and geometric aspects of its environment, such as whether the residue lies inside or outside the protein, or the secondary structure element in which it occurs. Later, different variants of this scheme were developed. They essentially differ in how they describe the chemical environment of the residue in the template structure. Contact capacity potentials (CCPs) are a variant used in the threader 123D (Alexandrov et al. [137]). CCPs essentially count contacts to neighboring residues, irrespective of the type of residue. One version of CCPs also considers the secondary (2D) structure environment, and contacts are distinguished according to whether they involve residues that are close (short range) or far apart (long range) in the protein sequence. Effectively, contact capacity potentials aim at approximating generalized hydrophobicity measures and secondary structure preferences at the same time.
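The contact counting that underlies CCPs can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used in 123D: it takes one representative point per residue, treats two residues as being in contact if their points lie within a distance cutoff, and calls a contact long range if the residues are at least a given number of positions apart in the sequence. The function name contact_counts, the cutoff of 7.0, and the separation threshold of 5 are hypothetical choices.

```python
import math
from typing import List, Tuple

Coord = Tuple[float, float, float]


def contact_counts(
    coords: List[Coord],
    cutoff: float = 7.0,    # assumed distance threshold for a contact
    min_sep_long: int = 5,  # assumed sequence separation for "long range"
) -> List[Tuple[int, int]]:
    """Return (short_range, long_range) contact counts for each residue.

    Contacts are counted irrespective of residue type; only the geometry
    and the separation along the sequence are used.
    """
    n = len(coords)
    counts = [[0, 0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                kind = 1 if j - i >= min_sep_long else 0
                counts[i][kind] += 1
                counts[j][kind] += 1
    return [(short, long_) for short, long_ in counts]


# Toy hairpin on a grid (made-up coordinates, not a real structure).
toy_coords: List[Coord] = [
    (0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0),
    (8.0, 4.0, 0.0), (4.0, 4.0, 0.0), (0.0, 4.0, 0.0),
]
toy_counts = contact_counts(toy_coords)
```

In a profile built on such counts, each template position would be described by its pair of contact numbers, and a residue type would be scored by how often it is observed with those numbers in known structures.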
