firestar [1,2] EXTENDED Results Output

Target sequences are searched with standard HHsearch [3] and PSI-BLAST [4]. Profiles are generated against an nrdb70 database from the EBI. The final PSI-BLAST search is made against the FireDB [5] consensus sequence database, which is derived from PDB sequences. The final HHsearch search is made against an ad hoc profile database generated from that consensus sequence database.

Consensus sequences are identified by IDs of the form t_1tcoC, where chain "C" of the PDB file "1tco" is the representative of the cluster. Consensus sequences are built from clusters of sequences grouped at 97% sequence identity.
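
The identifier convention described above can be unpacked with a few lines of Python. This is only a minimal sketch: the regular expression assumes every identifier is "t_" followed by a four-character PDB code and a one-character chain identifier, which is inferred from the single example t_1tcoC, and the names ConsensusId and parse_consensus_id are hypothetical.

```python
import re
from typing import NamedTuple


class ConsensusId(NamedTuple):
    pdb_code: str  # four-character PDB entry code, e.g. "1tco"
    chain: str     # chain identifier of the cluster representative, e.g. "C"


# Assumed layout (inferred from "t_1tcoC"): "t_" + PDB code + chain letter.
_ID_PATTERN = re.compile(r"^t_([0-9][A-Za-z0-9]{3})([A-Za-z0-9])$")


def parse_consensus_id(identifier: str) -> ConsensusId:
    """Split a consensus sequence identifier into PDB code and chain."""
    match = _ID_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"not a recognised consensus identifier: {identifier!r}")
    return ConsensusId(pdb_code=match.group(1).lower(), chain=match.group(2))


print(parse_consensus_id("t_1tcoC"))
```

Identifiers that do not fit the assumed layout raise a ValueError rather than being guessed at.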

The E-value cutoff can be modified by the user; the default is 10. We have found [6] that in some cases local similarities are reliable enough to transfer binding site residues even between distant homologues.

Only those PSI-BLAST and HHsearch hits with any functional information annotated in FireDB are displayed. Each hit is represented in its own box. All boxes are displayed on the same page.

Each alignment is evaluated using a specially adapted version of SQUARE [7]. This method generates scores for each position in the alignment based on local conservation scores.

SQUARE produces reliability scores for all aligned residues, but it is important to note that these reliability scores refer solely to the viability of the transference of functionally important residues from the consensus sequence. The score is meaningless for those residues in the FireDB consensus sequence that do not have associated functionally important residues.

The functionally important residues from the FireDB consensus sequences are supported by evidence. For catalytic residues the evidence comes from the Catalytic Site Atlas [8], and each residue is annotated as "literature" or "inferred with PSI-BLAST". For binding residues a percentage of occurrence is generated. The following table explains evidence and reliability scores in more detail:

SQUARE: table of scores

        Gapped or non-conserved position
        45% reliability
        60% reliability
        75% reliability
    4   85% reliability
    5   90% reliability
    C   99% reliability

Binding site occurrence and catalytic residue evidence

    literature            Catalytic residue annotated from literature
    PSI-BLAST             Catalytic residue inferred by similarity
    0-20% occurrence
    20-50% occurrence
    50-100% occurrence
    XX% occurrence        Not applicable

Note that sites are usually formed by groups of residues. The user should assess overall conservation based on all the residues important for one site. It is common to find that binding sites are only partially conserved.

Occurrence is calculated when a given consensus sequence has several representatives in the PDB. It indicates the percentage of PDB representatives that bind ligand analogs at the same site. Where there are fewer than 5 representatives the values are marked "not applicable".
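
The occurrence value described above can be illustrated with a short Python sketch. It assumes the input is one flag per PDB representative of a cluster, saying whether that structure binds a ligand analog at the site; the function name and the input representation are hypothetical and do not describe how FireDB stores its data.

```python
from typing import Optional, Sequence

# Below this number of representatives the value is reported as "not applicable".
MIN_REPRESENTATIVES = 5


def occurrence_percent(binds_at_site: Sequence[bool]) -> Optional[float]:
    """Return the percentage of representatives that bind at the site.

    Returns None (standing in for "not applicable") when the cluster has
    fewer than MIN_REPRESENTATIVES representatives.
    """
    if len(binds_at_site) < MIN_REPRESENTATIVES:
        return None
    return 100.0 * sum(binds_at_site) / len(binds_at_site)


# Eight representatives, four of which bind at the site.
print(occurrence_percent([True, True, True, True, False, False, False, False]))
# Only four representatives: too few to report an occurrence.
print(occurrence_percent([True, True, True, False]))
```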


PSI-BLAST and HHsearch output guide scheme


Structural alignment output

Structural alignments can be generated by clicking on "run LGA" in the yellow box next to each PSI-BLAST and HHsearch output.

Structural superpositions between the query and the representative of the selected cluster are generated with LGA [9] and open in a separate page that has a similar look to the pairwise results.

The superpositions can be visualised within the page with a Jmol applet. This requires a Java environment to be installed for the browser. If Java is not installed, the applet may cause the browser to crash on some platforms. Unfortunately this technical issue is beyond our control.


Structural alignment output guide scheme



Multiple sequence alignment output

Multiple alignments between the consensus sequences and the query sequence can be generated by clicking on the "Display Multiple Sequence" button in the pink box in the PSI-BLAST or HHsearch output page. MUSCLE [10] will align the query sequence to all the consensus sequences found by the two programs.
The MSA output page allows a subset of sequences to be selected for realignment by ticking their checkboxes.

Residue colour scheme

Multiple sequence alignment with catalytic residues highlighted:
firestar makes use of the supporting evidence collected by the Catalytic Site Atlas.

The Catalytic Site Atlas has two types of supporting evidence:
1) evidence from curated literature references and
2) evidence inferred from PSI-BLAST.

Colour key:
Literature evidence
PSI-BLAST evidence
Other functional residues

Multiple sequence alignment with binding residues highlighted:
Residues are calculated from PDB structures. Given the redundancy of the PDB, it is possible to assess whether a given set of residues participates in the same protein-ligand binding in several highly similar (more than 97% identity) structures. In firestar this is represented by "occurrence" and is only applicable when a cluster has 5 or more representatives. More information at FireDB.

Colour key:
100% occurrence
66-99% occurrence
33-66% occurrence
< 33% occurrence
Not applicable
Other functional residues
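
As a rough illustration of how an occurrence value maps onto the binding-residue categories listed above, the following Python sketch returns the category label for a given percentage. The listed ranges share the value 66, so the handling of the boundaries here (66 falls in "66-99%", 33 falls in "33-66%") is an assumption, and the function name is hypothetical.

```python
from typing import Optional


def occurrence_category(occurrence: Optional[float]) -> str:
    """Map an occurrence percentage (or None) to its colour-key category."""
    if occurrence is None:
        # Clusters with fewer than 5 representatives have no occurrence value.
        return "Not applicable"
    if not 0.0 <= occurrence <= 100.0:
        raise ValueError(f"occurrence must be between 0 and 100, got {occurrence}")
    if occurrence == 100.0:
        return "100% occurrence"
    if occurrence >= 66.0:  # assumed: the shared boundary goes to the upper bin
        return "66-99% occurrence"
    if occurrence >= 33.0:  # assumed: the shared boundary goes to the upper bin
        return "33-66% occurrence"
    return "< 33% occurrence"


for value in (100.0, 80.0, 50.0, 10.0, None):
    print(value, "->", occurrence_category(value))
```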

Multiple sequence alignment output guide scheme



firecat help page

Alignment reliability is most sensitive to alignment quality. This tool allows users to submit their own pairwise query-template alignments.

The second sequence in the pairwise alignment (the template) must be a PDB sequence. In this way the server can identify the FireDB cluster to which the second sequence belongs and associate the functional information stored in FireDB.
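
Before submitting an alignment it can be useful to check that it really is a two-sequence alignment with rows of equal length. The Python sketch below does this for FASTA-like text of the kind shown in the example on this page. It is an illustrative helper only: the function name is hypothetical and it does not reproduce any validation performed by the server.

```python
from typing import List, Tuple


def read_pairwise_alignment(text: str) -> List[Tuple[str, str]]:
    """Parse FASTA-like text into [(query_name, row), (template_name, row)]."""
    records: List[Tuple[str, str]] = []
    name = None
    chunks: List[str] = []
    for raw_line in text.splitlines():
        line = raw_line.strip()  # also drops trailing spaces on wrapped lines
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name = line[1:].strip()
            chunks = []
        elif name is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    if len(records) != 2:
        raise ValueError(f"expected 2 sequences, found {len(records)}")
    if len(records[0][1]) != len(records[1][1]):
        raise ValueError("the two aligned rows differ in length")
    return records


toy = ">query\nMK-L\nAA\n>1abc_A\nMKTL\n-A\n"
print(read_pairwise_alignment(toy))
```

The toy input uses a made-up template name; the second record is the one that would need to be a PDB sequence.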

This tool can be used when alignments can be improved, but also when firestar PSI-BLAST and HHsearch are unable to find homologues. In this case query-template alignments from fold recognition servers can be used.

Example

The following alignment has been generated by the FFAS03 [11] threading server and was the best scoring alignment for target T0369 of the CASP7 experiment in the 3D-Jury [12] metaserver. The structural template identified by FFAS was 1rxq_A and we can assess whether the functional residues transfer between the target and the template.

>T0369
MTDWQQALDRHVGVGVRTTRDLIRLIQPEDWDKRPISGKRSVYEVAVHLAVLLEADLRIATG-------- 
-----------ATADEMAQFYAVPVLPEQLVDRLDQSWQYYQDRLMADFSTETTYWGVTDSTTGWLLEAA 
VHLYHHRSQLLDYLNLLGYDIKLDLFE
>1rxq_A
SKEQKDKWIQVLEEVPAKLKQAVEVMTDSQLDTPYRDGGWTVRQVVHHLADSHMNSYIRFKLSLTEETPA
IRPYDEKAWSELKDSKTADPSGSLALLQELHGRWTALLRTLTDQQFKRGFYHPD-TKEIITLENALGLYV 
WHSHHHIAHITELSRRMGWS-------

We paste the alignment into the text box and enter the template PDB code "1rxq" and the chain identifier "A". The firecat output is:

Alignment columns 1-70
Query:T0369      MTDWQQALDRHVGVGVRTTRDLIRLIQPEDWDKRPISGKRSVYEVAVHLAVLLEADLRIATG--------
Template: 1rxqB  SKEQKDKWIQVLEEVPAKLKQAVEVMTDSQLDTPYRDGGWTVRQVVHHLADSHMNSYIRFKLSLTEETPA
Score:           -------------------------------211---1--1322322533--------------------
100%NI           -----------------------------------------------H----------------------

Alignment columns 71-140
Query:T0369      -----------ATADEMAQFYAVPVLPEQLVDRLDQSWQYYQDRLMADFSTETTYWGVTDSTTGWLLEAA
Template: 1rxqB  IRPYDEKAWSELKDSKTADPSGSLALLQELHGRWTALLRTLTDQQFKRGFYHPD-TKEIITLENALGLYV
Score:           -----------------1---------112--21-----------------------------122---2
100%NI           -----E----------------------------------------------------------------

Alignment columns 141-167
Query:T0369      VHLYHHRSQLLDYLNLLGYDIKLDLFE
Template: 1rxqB  WHSHHHIAHITELSRRMGWS-------
Score:           2312352211-------21--------
100%NI           -H---H---------------------

Based on this output, the nickel binding site can be reliably transferred from 1rxq_A to the target T0369. Note also that 1rxq_A could not be obtained by standard PSI-BLAST searches.
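
Reading such an output amounts to walking along the alignment columns and collecting, for every annotated template residue, the aligned query residue and its score. The Python sketch below shows that idea on a short toy alignment. It assumes four rows of equal length (query, template, score and annotation) with "-" marking gaps or unannotated positions; the function and field names are hypothetical and this is not the code used by firecat.

```python
from typing import List, NamedTuple


class TransferredSite(NamedTuple):
    column: int            # 1-based alignment column
    query_position: int    # 1-based residue number in the ungapped query
    query_residue: str
    template_residue: str
    score: str             # score symbol at this column


def transfer_sites(query: str, template: str, scores: str,
                   annotation: str) -> List[TransferredSite]:
    """List annotated template residues that are aligned to a query residue."""
    if not (len(query) == len(template) == len(scores) == len(annotation)):
        raise ValueError("all four rows must have the same length")
    sites: List[TransferredSite] = []
    query_position = 0
    rows = zip(query, template, scores, annotation)
    for column, (q, t, s, a) in enumerate(rows, start=1):
        if q != "-":
            query_position += 1
        # Skip unannotated columns and annotated residues facing a query gap.
        if a == "-" or q == "-":
            continue
        sites.append(TransferredSite(column, query_position, q, t, s))
    return sites


# Toy rows (not taken from the output above).
for site in transfer_sites("MH-AH", "KHEAH", "15--3", "-HE-H"):
    print(site)
```

In the toy rows the annotated E in the template faces a gap in the query, so it is not transferred; the two histidines are reported together with their score symbols.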



Evolutionary relationships between sites

A binding site is deemed to be of biological relevance if it can be shown to have been conserved. The sites shown on this page are pre-calculated from data in FireDB, are highly conserved and are therefore highly likely to be evolutionarily related.




Evolutionary relationships guide scheme



References:

[1] firestar -- advances in the prediction of functionally important residues Lopez G, Maietta P, Rodriguez JM, Valencia A, Tress ML.
Nucleic Acids Research, Volume 39, Issue suppl_2, 1 July 2011, Pages W235-W241;
DOI:10.1093/nar/gkr437

[2] firestar -- prediction of functionally important residues using structural templates and alignment reliability Lopez G, Valencia A, Tress ML.
Nucleic Acids Research, Volume 35, Issue suppl_2, 1 July 2007, Pages W573-W577;
DOI:10.1093/nar/gkm297

[3] Protein homology detection by HMM-HMM comparison Söding J.
Bioinformatics, Volume 21, Issue 7, 1 April 2005, Pages 951–960;
DOI:10.1093/bioinformatics/bti125

[4] Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W and Lipman DJ.
Nucleic Acids Research, Volume 25, Issue 17, 1 September 1997, Pages 3389–3402;
DOI:10.1093/nar/25.17.3389

[5] FireDB--a database of functionally important residues from proteins of known structure. Lopez G, Valencia A and Tress ML.
Nucleic Acids Research, Volume 35, Issue suppl_1, 1 January 2007, Pages D219-D223;
DOI:10.1093/nar/gkl897

[6] Assessment of predictions submitted for the CASP7 function prediction category. Lopez G, Tress ML, Rojas AM, Valencia A.
Proteins, Volume 69, Issue S8, 2007, Pages 165–174;
DOI:10.1002/prot.21651

[7] SQUARE--determining reliable regions in sequence alignments. Tress ML, Graña O, Valencia A.
Bioinformatics, Volume 20, Issue 6, 12 April 2004, Pages 974-975;
DOI:10.1093/bioinformatics/bth032

[8] The Catalytic Site Atlas: a resource of catalytic sites and residues identified in enzymes using structural data. Porter CT, Bartlett GJ, Thornton JM.
Nucleic Acids Research, Volume 32, Issue suppl_1, 1 January 2004, Pages D129-D133;
DOI:10.1093/nar/gkh028

[9] LGA: a method for finding 3D similarities in protein structures. Zemla A.
Nucleic Acids Research, Volume 31, Issue 13, 1 July 2003, Pages 3370-3374;
DOI:10.1093/nar/gkg571

[10] MUSCLE: a multiple sequence alignment method with reduced time and space complexity. Edgar RC.
BMC Bioinformatics, Volume 5, 19 August 2004, Article 113;
DOI:10.1186/1471-2105-5-113

[11] FFAS03: a server for profile–profile sequence alignments. Jaroszewski L, Rychlewski L, Li Z, Li W & Godzik A.
Nucleic Acids Research, Volume 33, Web Server issue, 1 July 2005, Pages W284-W288;
DOI:10.1093/nar/gki418

[12] 3D-Jury: a simple approach to improve protein structure predictions. Ginalski K, Elofsson A, Fischer D, Rychlewski L.
Bioinformatics, Volume 19, Issue 8, 22 May 2003, Pages 1015–1018;
DOI:10.1093/bioinformatics/btg124


Technical support: dcerdan@cnio.es
Scientific support: mtress@cnio.es