firestar help page

firestar [1] predicts functionally important residues using the large inventory of functionally important residues in the FireDB database [2]. The reliability of transferring functional information from the functionally important residues in FireDB to the query sequence is evaluated via the local residue conservation between the two sequences.


User Input

The user must unambiguously identify a protein sequence. Three options are allowed:
  • PDB code: each entry in the PDB is identified by a four-character code (e.g. "1tco"). PDB files usually contain more than one chain, so the user is prompted to give the chain identifier.
  • PDB-formatted coordinate file upload: here again the user is prompted to choose the chain if more than one appears in the file.
  • Sequence input: at present only FASTA and raw sequence formats are accepted. In this case some options are disabled because no structure is available for structural alignments.
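The three input options above can be told apart with a few simple checks. The following is a minimal sketch only, not the server's actual code: the function name `classify_query`, the rule that a PDB code starts with a digit, and the set of accepted residue letters are all assumptions made for illustration. An uploaded coordinate file would be handled separately.

```python
import re

# Assumption: a PDB code is a digit followed by three letters or digits.
PDB_CODE = re.compile(r"^[0-9][A-Za-z0-9]{3}$")
# Assumption: standard one-letter amino acid codes plus common ambiguity codes.
RESIDUES = set("ACDEFGHIKLMNPQRSTVWYBXZUO")


def classify_query(text):
    """Return (kind, value), where kind is 'pdb_code', 'fasta' or 'raw'.

    Raises ValueError when the text is not a recognisable protein sequence.
    """
    text = text.strip()
    if PDB_CODE.match(text):
        return "pdb_code", text.lower()

    lines = text.splitlines()
    if lines and lines[0].startswith(">"):
        kind, body = "fasta", lines[1:]   # drop the FASTA header line
    else:
        kind, body = "raw", lines

    # Remove all whitespace and join the sequence lines.
    seq = "".join("".join(line.split()) for line in body).upper()
    if not seq or not set(seq) <= RESIDUES:
        raise ValueError("not a recognisable protein sequence")
    return kind, seq
```

For example, `classify_query("1tco")` is treated as a PDB code, after which the chain identifier would still have to be requested, while a FASTA or raw sequence is returned as a single upper-case string.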

Output

Target sequences are subjected to standard PSI-BLAST searches. Profiles are generated with an nrdb90 database from the EBI and the final search is made against the FireDB consensus sequence database derived from PDB sequences.

Consensus sequences are identified by IDs of the form t_1tcoC, where chain "C" of the PDB file "1tco" is the representative of the cluster. Consensus sequences are built from clusters of sequences grouped at 97% sequence identity.
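A consensus identifier can be split back into its PDB code and chain. This is a small sketch that assumes the identifier is always "t_" followed by a four-character PDB code and a single-character chain identifier, as in the t_1tcoC example; the function name `parse_consensus_id` is hypothetical.

```python
import re

# Assumption: "t_" + four-character PDB code + one-character chain identifier.
CONSENSUS_ID = re.compile(r"^t_([0-9A-Za-z]{4})([0-9A-Za-z])$")


def parse_consensus_id(identifier):
    """Split an identifier such as 't_1tcoC' into (pdb_code, chain)."""
    match = CONSENSUS_ID.match(identifier)
    if match is None:
        raise ValueError(f"unexpected consensus identifier: {identifier!r}")
    return match.group(1), match.group(2)


pdb_code, chain = parse_consensus_id("t_1tcoC")
print(pdb_code, chain)
```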

The E-value cutoff can be modified by the user; the default is 10. We have found [3] that in some cases local similarities are reliable enough to transfer binding site residues even between distant homologues.

Only those PSI-BLAST hits with any functional information annotated in FireDB are displayed. Each hit is represented in its own box. All boxes are displayed on the same page.
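The display rule can be pictured as a filter over the PSI-BLAST hits: a hit is shown only if it passes the E-value cutoff and has functional information in FireDB. The sketch below is illustrative; the `Hit` record and its field names are hypothetical and do not describe the server's internal data model.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    # Hypothetical record for one PSI-BLAST hit against the consensus database.
    consensus_id: str
    e_value: float
    annotated_residues: int  # number of functional residues annotated in FireDB


def displayed_hits(hits, e_value_cutoff=10.0):
    """Keep hits within the E-value cutoff that carry functional annotation."""
    return [
        hit for hit in hits
        if hit.e_value <= e_value_cutoff and hit.annotated_residues > 0
    ]


hits = [
    Hit("t_1tcoC", 1e-30, 12),
    Hit("t_1abcA", 0.5, 0),    # no annotation in FireDB: not displayed
    Hit("t_2xyzB", 25.0, 4),   # above the default cutoff of 10: not displayed
]
print([hit.consensus_id for hit in displayed_hits(hits)])
```

The identifiers other than t_1tcoC are made up for the example.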

Each alignment is evaluated using a specially adapted version of SQUARE [4]. This method generates scores for each position in the alignment based on local conservation scores.

SQUARE produces reliability scores for all aligned residues, but it is important to note that these reliability scores refer solely to the viability of the transference of functionally important residues from the consensus sequence. The score is meaningless for those residues in the FireDB consensus sequence that do not have associated functionally important residues.

The functionally important residues in the FireDB consensus sequences are supported by evidence. For catalytic residues the evidence comes from the Catalytic Site Atlas [5], and the residues are annotated as "literature" or "inferred with PSI-BLAST". For binding residues a percentage of occurrence is calculated. The following tables explain evidence and reliability scores in more detail:

SQUARE: table of scores

  Score   Meaning
          Gapped or non-conserved position
          45% reliability
          60% reliability
          75% reliability
  4       85% reliability
  5       90% reliability
  C       99% reliability

Binding site occurrence and catalytic residue evidence

  Label                 Meaning
  literature            Catalytic residue annotated from literature
  PSI-blast             Catalytic residue inferred by similarity
  0-20% occurrence
  20-50% occurrence
  50-100% occurrence
  XX% occurrence        Not applicable

Note that sites are usually formed by groups of residues. The user should assess overall conservation based on all the residues important for one site. It is common to find that binding sites are only partially conserved.

Occurrence is calculated when a given consensus sequence has several representatives in the PDB. It indicates the percentage of PDB representatives that bind ligand analogs at the same site. Where there are fewer than 5 representatives the values are marked "not applicable".
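The occurrence value described above is a simple percentage with a minimum-sample rule. A minimal sketch, assuming the counts of representatives are already known (the function and parameter names are hypothetical):

```python
def occurrence(n_binding, n_representatives, min_representatives=5):
    """Percentage of PDB representatives that bind a ligand analog at the site.

    Returns None ("not applicable") when the cluster has fewer than
    `min_representatives` representatives.
    """
    if n_binding < 0 or n_binding > n_representatives:
        raise ValueError("n_binding must lie between 0 and n_representatives")
    if n_representatives < min_representatives:
        return None
    return 100.0 * n_binding / n_representatives


print(occurrence(6, 8))   # 6 of 8 representatives bind at this site
print(occurrence(3, 4))   # fewer than 5 representatives
```

With 6 binding representatives out of 8 the occurrence is 75%; with only 4 representatives the function returns `None`, which corresponds to "not applicable".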


PSI-BLAST output guide scheme


Structural alignment output

Structural alignments can be generated by clicking on "run LGA" in the yellow box next to each PSI-BLAST output.

Structural superpositions between the query and the representative of the selected cluster are generated with LGA [6] and open in a separate page that has a similar look to the pairwise PSI-BLAST results.

The superpositions can be visualised within the page with a Jmol applet. Note that this requires a Java environment to be installed for the browser. If Java is not installed, the applet may cause the browser to crash on some platforms. Unfortunately this technical issue is beyond our control.


Structural alignment output guide scheme



Multiple sequence alignment output

Multiple alignments between the consensus sequences and the query sequence can be generated by clicking on the "Display Multiple Sequence" button in the pink box on the PSI-BLAST output page. MUSCLE [7] aligns the query sequence to all the consensus sequences found by PSI-BLAST.
The MSA output page allows a subset of sequences to be selected for realignment by ticking their checkboxes.

Residue colour scheme

Multiple sequence alignment with catalytic residues highlighted:
firestar makes use of the supporting evidence collected by the Catalytic Site Atlas.

The Catalytic Site Atlas has two types of supporting evidence:
1) evidence from curated literature references and
2) evidence inferred from PSI-BLAST.

Colour key for catalytic residues:
  • Literature evidence
  • PSI-BLAST evidence
  • Other functional residues

Multiple sequence alignment with binding residues highlighted:
Binding residues are calculated from PDB structures; given the redundancy of the PDB it is possible to assess whether a given set of residues participates in the same protein-ligand binding in several highly similar (more than 97% identity) structures. In firestar this is represented by "occurrence" and is only applicable when a cluster has 5 or more representatives. More information at FireDB.

Colour key for binding residues:
  • 100% occurrence
  • 66-99% occurrence
  • 33-66% occurrence
  • < 33% occurrence
  • Not applicable
  • Other functional residues
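The binding-residue classes used in the multiple alignment can be expressed as a small lookup. This is a sketch under stated assumptions: the source lists the classes but not how boundary values are assigned, so placing exactly 66% and 33% in the higher class is a choice made here for illustration, and the function name is hypothetical.

```python
def msa_occurrence_class(occurrence):
    """Map an occurrence percentage (or None) to its class in the colour scheme.

    Assumption: boundary values (66 and 33) are assigned to the higher class.
    """
    if occurrence is None:
        return "Not applicable"
    if not 0 <= occurrence <= 100:
        raise ValueError("occurrence must be a percentage between 0 and 100")
    if occurrence >= 100:
        return "100% occurrence"
    if occurrence >= 66:
        return "66-99% occurrence"
    if occurrence >= 33:
        return "33-66% occurrence"
    return "< 33% occurrence"


for value in (100.0, 75.0, 40.0, 10.0, None):
    print(value, "->", msa_occurrence_class(value))
```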

Multiple sequence alignment output guide scheme



firecat help page

Alignment reliability is most sensitive to alignment quality. This tool allows users to submit their own pairwise query-template alignments, with the requirement that the template must be a PDB sequence.

It requires the second sequence in the pairwise alignment to be a PDB sequence. In this way the server can identify the FireDB cluster to which the second sequence belongs and associate the functional information stored in FireDB.

This tool can be used when alignments can be improved, but also when firestar PSI-BLAST is unable to find homologues. In this case query-template alignments from fold recognition servers can be used.

Example

The following alignment was generated by the FFAS03 [8] threading server and was the best-scoring alignment for target T0369 of the CASP7 experiment in the 3D-Jury [9] metaserver. The structural template identified by FFAS03 was 1rxq_A, and we can assess whether the functional residues can be transferred from the template to the target.

>T0369
MTDWQQALDRHVGVGVRTTRDLIRLIQPEDWDKRPISGKRSVYEVAVHLAVLLEADLRIATG-------- 
-----------ATADEMAQFYAVPVLPEQLVDRLDQSWQYYQDRLMADFSTETTYWGVTDSTTGWLLEAA 
VHLYHHRSQLLDYLNLLGYDIKLDLFE
>1rxq_A
SKEQKDKWIQVLEEVPAKLKQAVEVMTDSQLDTPYRDGGWTVRQVVHHLADSHMNSYIRFKLSLTEETPA
IRPYDEKAWSELKDSKTADPSGSLALLQELHGRWTALLRTLTDQQFKRGFYHPD-TKEIITLENALGLYV 
WHSHHHIAHITELSRRMGWS-------

We paste the alignment into the text box and enter the template PDB code ("1rxq") and the chain identifier ("A"). The firecat output is:

Query: T0369    Template: 1rxqB

Query:    MTDWQQALDRHVGVGVRTTRDLIRLIQPEDWDKRPISGKRSVYEVAVHLAVLLEADLRIATG--------
Template: SKEQKDKWIQVLEEVPAKLKQAVEVMTDSQLDTPYRDGGWTVRQVVHHLADSHMNSYIRFKLSLTEETPA
Score:    -------------------------------211---1--1322322533--------------------
100% NI:  -----------------------------------------------H----------------------

Query:    -----------ATADEMAQFYAVPVLPEQLVDRLDQSWQYYQDRLMADFSTETTYWGVTDSTTGWLLEAA
Template: IRPYDEKAWSELKDSKTADPSGSLALLQELHGRWTALLRTLTDQQFKRGFYHPD-TKEIITLENALGLYV
Score:    -----------------1---------112--21-----------------------------122---2
100% NI:  -----E----------------------------------------------------------------

Query:    VHLYHHRSQLLDYLNLLGYDIKLDLFE
Template: WHSHHHIAHITELSRRMGWS-------
Score:    2312352211-------21--------
100% NI:  -H---H---------------------

Based on this output, the nickel binding site can be reliably transferred from 1rxq_A to the target T0369. Note also that 1rxq_A could not be found by standard PSI-BLAST searches.
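Reading the output amounts to looking up, for each flagged template residue, which query residue sits in the same alignment column. The sketch below does this for the example alignment. It is an illustration, not the firecat implementation: the function name is hypothetical, and the flagged columns (48, 76, 142 and 146) are read off the "100%NI" row of the output above by counting alignment columns.

```python
# The T0369 / 1rxq_A alignment from the example, joined into single strings.
QUERY = (
    "MTDWQQALDRHVGVGVRTTRDLIRLIQPEDWDKRPISGKRSVYEVAVHLAVLLEADLRIATG--------"
    "-----------ATADEMAQFYAVPVLPEQLVDRLDQSWQYYQDRLMADFSTETTYWGVTDSTTGWLLEAA"
    "VHLYHHRSQLLDYLNLLGYDIKLDLFE"
)
TEMPLATE = (
    "SKEQKDKWIQVLEEVPAKLKQAVEVMTDSQLDTPYRDGGWTVRQVVHHLADSHMNSYIRFKLSLTEETPA"
    "IRPYDEKAWSELKDSKTADPSGSLALLQELHGRWTALLRTLTDQQFKRGFYHPD-TKEIITLENALGLYV"
    "WHSHHHIAHITELSRRMGWS-------"
)
# Assumption: 1-based alignment columns flagged in the "100%NI" row.
SITE_COLUMNS = [48, 76, 142, 146]


def transfer_site(query, template, columns):
    """Pair each flagged template residue with the aligned query residue.

    Returns tuples (column, template_residue, query_residue, query_number),
    where query_number is the 1-based position in the ungapped query, or
    None when the query has a gap in that column.
    """
    if len(query) != len(template):
        raise ValueError("aligned sequences must have the same length")
    pairs = []
    for column in columns:
        t_res = template[column - 1]
        q_res = query[column - 1]
        if q_res == "-":
            q_num = None
        else:
            # Count the non-gap query characters up to and including this column.
            q_num = len(query[:column].replace("-", ""))
        pairs.append((column, t_res, q_res, q_num))
    return pairs


for column, t_res, q_res, q_num in transfer_site(QUERY, TEMPLATE, SITE_COLUMNS):
    print(column, t_res, q_res, q_num)
```

In this alignment three of the four flagged template residues (the histidines) are aligned to histidines in the query, while the flagged glutamate falls opposite a gap in the query, which illustrates the earlier remark that binding sites are often only partially conserved.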



Evolutionary relationships between sites

A binding site is deemed to be of biological relevance if it can be shown to have been conserved. The sites shown on this page are pre-calculated from data in FireDB, are highly conserved and are therefore highly likely to be evolutionarily related.

Evolutionary relationships guide scheme


References:
  • 1. Lopez,G, Valencia,A, Tress,ML. (2007) firestar--prediction of functionally important residues using structural templates and alignment reliability. Nucleic Acids Res., doi:10.1093/nar/gkm297
  • 2. Lopez,G, Valencia,A, Tress,ML. (2007) FireDB--a database of functionally important residues from proteins of known structure. Nucleic Acids Res., doi:10.1093/nar/gkl897
  • 3. Lopez,G, Tress,ML, Rojas,AM, Valencia,A. (2007) CASP7 function evaluation. Proteins, CASP Special Issue, in press.
  • 4. Tress,ML, Graña,O, Valencia,A. (2004) SQUARE--determining reliable regions in sequence alignments. Bioinformatics, 20, 974-975.
  • 5. Porter,CT, Bartlett,GJ, Thornton,JM. (2004) The Catalytic Site Atlas: a resource of catalytic sites and residues identified in enzymes using structural data. Nucleic Acids Res., 32, D129-D133.
  • 6. Zemla,A. (2003) LGA: a method for finding 3D similarities in protein structures. Nucleic Acids Res., 31, 3370-3374.
  • 7. Edgar,RC. (2004) MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics, 5, 113.
  • 8. Jaroszewski,L, Rychlewski,L, Li,Z, Li,W, Godzik,A. (2005) FFAS03: a server for profile-profile sequence alignments. Nucleic Acids Res., 33, W284-W288.
  • 9. Ginalski,K, Elofsson,A, Fischer,D, Rychlewski,L. (2003) 3D-Jury: a simple approach to improve protein structure predictions. Bioinformatics, 19, 1015-1018.