Align-GVGD Method

The program Align-GVGD (A-GVGD) was used to classify all possible missense substitutions in p53 as functional or non-functional.

A-GVGD scores missense substitutions against the range of variation present at their position in a multiple sequence alignment (MSA). It has previously been applied to the tumor suppressor protein BRCA1 (Tavtigian et al., 2005), where it allowed eight previously unclassified substitutions to be classified as neutral. The program combines the MSA with the Grantham matrix to determine the conservation of amino acid residues in a protein. The Grantham matrix provides a measure of the biochemical distance between amino acids, based on their composition, polarity and volume (Grantham, 1974).

In the A-GVGD program, two conservation scores are calculated: (1) Grantham Variation (GV); (2) Grantham Deviation (GD). Conceptually, all amino acids observed at a given position are plotted on a three-dimensional graph, with their composition (C), polarity (P) and volume (V) as coordinates and with different weights applied to the axes. The cloud of points formed by these amino acids can be enclosed within a box (the GV box), whose diagonal runs between the minimum and maximum values of C, P and V for the observed amino acids. GV is computed as the Euclidean length of the main diagonal of the box. GV is thus a measure of the amount of observed biochemical variation at a particular position in the alignment. Next, GD is calculated by plotting a given mutation on the same graph and measuring the Euclidean distance between that mutation and the closest point on the GV box. If the substitution lies within the box, then GD = 0. Otherwise, GD is greater than 0. GD is thus a measure of the biochemical difference between the mutant and the observed variation at that position according to the MSA.
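The box construction described above can be sketched in a few lines of Python. This is an illustrative sketch, not the A-GVGD source code. The function names are hypothetical, and the property values, axis weights and scaling constant are assumptions quoted as they are commonly tabulated from Grantham (1974); they should be checked against the original paper before any real use. Only five amino acids are included to keep the example short.

```python
import math

# ASSUMPTION: (composition, polarity, volume) values as commonly tabulated
# from Grantham (1974); verify against the original table before real use.
GRANTHAM_PROPS = {
    "L": (0.00, 4.9, 111.0),
    "I": (0.00, 5.2, 111.0),
    "V": (0.00, 5.9, 84.0),
    "M": (0.00, 5.7, 105.0),
    "D": (1.38, 13.0, 54.0),
}

# ASSUMPTION: axis weights and scaling constant from the Grantham distance
# formula; the text above only says that the axes are weighted.
WEIGHTS = (1.833, 0.1018, 0.000399)
SCALE = 50.723


def gv_box(observed, props=GRANTHAM_PROPS):
    """Return (mins, maxs) of C, P, V over the residues observed at a position."""
    coords = [props[aa] for aa in observed]
    mins = tuple(min(c[i] for c in coords) for i in range(3))
    maxs = tuple(max(c[i] for c in coords) for i in range(3))
    return mins, maxs


def _weighted_length(deltas, weights=WEIGHTS, scale=SCALE):
    """Weighted Euclidean length of a (dC, dP, dV) vector."""
    return scale * math.sqrt(sum(w * d * d for w, d in zip(weights, deltas)))


def grantham_variation(observed, props=GRANTHAM_PROPS):
    """GV: length of the main diagonal of the box enclosing the observed residues."""
    mins, maxs = gv_box(observed, props)
    return _weighted_length([hi - lo for lo, hi in zip(mins, maxs)])


def grantham_deviation(observed, mutant, props=GRANTHAM_PROPS):
    """GD: distance from the mutant residue to the closest point of the GV box."""
    mins, maxs = gv_box(observed, props)
    point = props[mutant]
    # Per axis, the gap is zero when the mutant value lies inside the range.
    deltas = [max(lo - x, 0.0, x - hi) for x, lo, hi in zip(point, mins, maxs)]
    return _weighted_length(deltas)


# Position where L, I and V are observed in the alignment.
observed = "LIV"
print("GV:", grantham_variation(observed))
print("GD for M:", grantham_deviation(observed, "M"))  # M lies inside the box
print("GD for D:", grantham_deviation(observed, "D"))  # D lies outside the box
```

With these assumed values, methionine falls inside the box spanned by leucine, isoleucine and valine, so its GD is zero, whereas aspartate lies outside the box on all three axes and receives a positive GD.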

To classify p53 missense mutations, the following GV/GD cutoff values were applied (please note that the AGVGDClass has been corrected as of 8 December 2005):

If GD = 0 : the composition, polarity and volume of the mutant amino acid fall within the observed range of variation according to the alignment at that position, so the mutation is predicted as neutral;

Else :

  • if GV <= 61.3 : there is only a small variation in amino acids at a given position with residues that are biochemically similar, so any mutation at that position is predicted as deleterious;
  • if (GV > 61.3) and (0 < GD <= 61.3) : the position tolerates more than "conservative" substitution and the composition, polarity and volume of the mutant amino acid fall close to the observed range of variation according to the alignment at that position, so the mutation is predicted as neutral;
  • if the mutant does not fall in the previous categories, it is unclassified.
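The decision rules above translate directly into a small function. This is a minimal sketch: the function name `classify_agvgd` and the returned labels are hypothetical, and only the 61.3 cutoff and the order of the tests come from the text.

```python
def classify_agvgd(gv, gd, cutoff=61.3):
    """Apply the GV/GD rules listed above to a single substitution.

    gv: Grantham Variation at the alignment position.
    gd: Grantham Deviation of the mutant residue (never negative).
    """
    if gd == 0:
        # Mutant falls within the observed range of variation.
        return "neutral"
    if gv <= cutoff:
        # Conserved position and the mutant lies outside the observed range.
        return "deleterious"
    if gd <= cutoff:
        # Variable position (gv > cutoff) and the mutant is close to the range.
        return "neutral"
    # Variable position with a distant mutant: no prediction is made.
    return "unclassified"


for gv, gd in [(0.0, 0.0), (30.0, 100.0), (100.0, 50.0), (100.0, 80.0)]:
    print(gv, gd, classify_agvgd(gv, gd))
```

The exact comparison `gd == 0` assumes that GD is computed so that residues inside the box give exactly zero; if GD comes from a calculation with rounding error, a small tolerance may be more appropriate.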

It is important to note that the accuracy of the predictions is highly dependent on the input MSA used to calculate GV and GD. The classifications presented in the IARC TP53 database are based on an MSA constructed with 3D-Coffee, using the following nine sequences: Homo sapiens (sp|P04637), Macaca mulatta (monkey, sp|P56424), Bos taurus (bovine, sp|P67939), Canis familiaris (dog, sp|Q29537), Mus musculus (mouse, sp|P02340), Rattus norvegicus (rat, sp|P10361), Gallus gallus (chicken, sp|P10360), Xenopus laevis (frog, tr|P53_XENOPUS), Brachydanio rerio (zebrafish, sp|P79734). Get the FASTA sequences here.

The X-ray structure of the DNA-binding domain of human p53 (PDB [Berman et al., 2000] ID 1tsr, chain B) was also used to construct the MSA.


- Mathe E, Olivier M, Kato S, Ishioka C, Hainaut P, Tavtigian SV. 2006. Computational approaches for predicting the biological effect of p53 missense mutations: a comparison of three sequence analysis based methods. Nucleic Acids Res 34(5):1317-25.
- Tavtigian SV, Deffenbaugh AM, Yin L, Judkins T, Scholl T, Samollow PB, de Silva D, Zharkikh A, Thomas A. 2005. Comprehensive statistical study of 452 BRCA1 missense substitutions with classification of eight recurrent substitutions as neutral. J Med Genet 42:138-146.
- Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE. 2000. The Protein Data Bank. Nucleic Acids Res 28(1):235-42.
- Grantham R. 1974. Amino acid difference formula to help explain protein evolution. Science 185(4154):862-4.