SQUARE Background

What the method can do

This server produces a measure of per-residue reliability for alignments between query sequences and sequences of known structure (i.e. those from the PDB). In addition, where possible, it adds extra information to aid in the evaluation of alignments, such as the positions of known binding sites, known template secondary structure and regions with low evolutionary information.

The server can use alignments generated by fold recognition methods, by sequence profile-based methods, by pairwise methods, by homology modelling and even by multiple alignment methods (as long as only two sequences are entered in the window).


What the method cannot do

The server predicts the reliability of alignments based on the conservation or non-conservation of residue positions in sequence alignments. Though our results show that this reliability is transferable to structural alignments in most cases, the fact that residues are shown to be reliable at the sequence level does not mean that they are always reliable at the structural level.

If aligned residues are shown to be unreliable, the method cannot distinguish whether this is because the alignment in this region is poor or because the region simply has low conservation. This is something that you will have to test yourself, perhaps using the other information in the window and by shifting the alignment by hand.

Occasionally parts (or all) of the alignment scores may be tagged as "unreliable". This is because the chain that is being aligned to your target sequence has only a few very close relatives in the sequence databases at these residue positions. With little evolutionary variation to go on, only those sequences that are closely related to the structural template will score well.


Extra Features

Since the reason for creating the server was to produce a web-based alignment evaluation method, we decided to add further features that might be of interest or use in evaluating the reliability (or otherwise) of an alignment. These features are optimal alignments, secondary structure, conserved residues and sites of functional importance. Scores for these features are calculated automatically as long as enough information exists for their calculation. The server also produces 3D structure templates colour-coded by the SQUARE reliability score and CA traces of the predicted model structure. Details for all these features are included in the score interpretation page.


Methods

The method is based on profiles generated by PSI-BLAST (Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W. and Lipman, D.J. (1997) Nucleic Acids Res., 25, 3389-3402) and IMPALA (Schäffer, A.A., Wolf, Y.I., Ponting, C.P., Koonin, E.V., Aravind, L. and Altschul, S.F. (1999) Bioinformatics, 15, 1000-1011) for chains from the PDB structural database (Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N. and Bourne, P.E. (2000) Nucleic Acids Res., 28, 235-242). Each position in each alignment is evaluated over a five-residue window.
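To illustrate what evaluating a position over a five-residue window can look like, here is a minimal Python sketch. It is not the server's actual scoring code: the function name windowed_scores, the use of a plain average, the centring of the window on each position and the truncation of the window at the ends of the alignment are all assumptions made for the example.

```python
from typing import List


def windowed_scores(scores: List[float], window: int = 5) -> List[float]:
    """Average each per-position score over a centred window.

    Assumptions (not taken from the SQUARE method itself): the window is
    centred on the position, scores are combined by a simple mean, and the
    window is truncated at either end of the alignment.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - half)                 # first position in the window
        hi = min(len(scores), i + half + 1)   # one past the last position
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    return smoothed
```

With this kind of smoothing, a single high-scoring position surrounded by low-scoring neighbours contributes only a fraction of its score, so isolated matches count for less than runs of consistently well-scoring positions.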

Once updated, the server will contain all the PDB chains over 20 residues in length. However, the server database will only be updated every three months or so, i.e. less often than the PDB itself.

Most chains in the PDB have sequences that are identical to sequences from other PDB chains so, since this is an entirely sequence profile-based method, the scores, sites and secondary structure may be based on a different (but sequence identical) chain.
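The following Python sketch shows one way sequence-identical chains could be grouped so that a single representative stands in for the rest. It is only an illustration: the function name group_identical_chains, the chain identifiers in the usage example and the rule of picking the alphabetically first identifier as the representative are hypothetical, and the server may select chains differently.

```python
from collections import defaultdict
from typing import Dict, List


def group_identical_chains(chains: Dict[str, str]) -> Dict[str, List[str]]:
    """Group chain identifiers that share exactly the same sequence.

    `chains` maps a chain identifier to its amino acid sequence. The result
    maps a representative identifier (here, the alphabetically first one,
    an arbitrary choice for this sketch) to all identifiers in its group.
    """
    by_sequence = defaultdict(list)
    for chain_id, sequence in chains.items():
        # Compare sequences case-insensitively.
        by_sequence[sequence.upper()].append(chain_id)
    groups = {}
    for ids in by_sequence.values():
        ordered = sorted(ids)
        groups[ordered[0]] = ordered
    return groups


# Usage with made-up identifiers and sequences:
example = {"1abcA": "MKVLAT", "2xyzB": "MKVLAT", "3defA": "MKVLAS"}
print(group_identical_chains(example))
```

Because the method depends only on the sequence profile, any member of a group would give the same reliability scores, which is why the sites and secondary structure reported may come from a different chain than the one submitted.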



Reference:

[1] Predicting Reliable Regions in Protein Alignments from Sequence Profiles Tress, M., Jones, D. and Valencia, A. (2003).
J Mol Biol. 2003 Jul 18;330(4):705-18
DOI:10.1016/s0022-2836(03)00622-3

[2] SQUARE-determining reliable regions in sequence alignments Tress, M., Graña, O. and Valencia, A. (2004).
Bioinformatics. 2004 Apr 12;20(6):974-5
DOI:10.1093/bioinformatics/bth032


Technical support: dcerdan@cnio.es
Scientific support: mtress@cnio.es