FireDB: a database of annotated functionally important residues

FireDB is a database of small ligands and of the small-ligand-binding residues that participate in functional sites.
The database can be accessed by PDB codes or UniProt accession numbers.


The sources of functional residues are protein-ligand atom contacts and the Catalytic Site Atlas. PDB chains are clustered at 97% sequence identity and every chain is mapped onto a consensus sequence from the cluster. Binding sites are brought together within each cluster and the individual binding sites are collapsed into "master sequence binding sites" (MBS). Important positions from the MBS are mapped to the consensus sequences. Comparing the binding sites within a cluster of sequences indicates which residues are most important for binding, how flexible the ligands are within the binding sites, and how capable each site is of binding different ligand analogs.
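The collapsing step described above can be illustrated with a minimal sketch. It is not the FireDB implementation: the merge rule (a greedy merge of sites that share at least half of the smaller site's positions), the `min_overlap` parameter, the function names and the toy data are all assumptions made for illustration. It only shows how per-chain sites, once mapped to consensus positions, might be pooled and how often each position recurs.

```python
from collections import Counter


def collapse_sites(sites, min_overlap=0.5):
    """Greedily merge binding sites that share consensus positions.

    sites: list of (ligand_id, consensus_positions) tuples.
    min_overlap: hypothetical threshold, the fraction of the smaller
                 site that must be shared for two sites to be merged.
    """
    masters = []  # one dict per collapsed ("master") site
    for ligand, positions in sites:
        positions = set(positions)
        if not positions:
            continue
        for master in masters:
            shared = positions & set(master["counts"])
            smaller = min(len(positions), len(master["counts"]))
            if len(shared) / smaller >= min_overlap:
                master["ligands"].add(ligand)
                master["counts"].update(positions)
                master["n_sites"] += 1
                break
        else:
            # No existing master site overlaps enough: start a new one.
            masters.append(
                {"ligands": {ligand}, "counts": Counter(positions), "n_sites": 1}
            )
    return masters


def position_frequencies(master):
    """Fraction of the merged sites in which each position is in contact."""
    return {
        pos: n / master["n_sites"] for pos, n in sorted(master["counts"].items())
    }


# Invented toy data: ligand labels and positions are placeholders.
sites = [
    ("LIG_A", {26, 42, 46, 54}),
    ("LIG_B", {26, 42, 46, 59}),
    ("ION_C", {90, 92, 118}),
]
masters = collapse_sites(sites)
frequencies = position_frequencies(masters[0])
```

In this toy case the two overlapping sites form one master site and the third stays separate; positions seen in every merged site get frequency 1.0, which is one simple way to rank residues by how consistently they contact ligands.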



The database can be accessed in three ways:

  • The first is via the web: you can query using a PDB code, a UniProt primary accession number or a keyword. A keyword search generates a detailed list of possible entries. Figures 1, 2 and 3 show example output for PDB entry 1TCO, chain C.

    Figure 1. Web page output example for query 1tcoC: general protein and collapsed binding site (MBS) information is shown.



    Figure 2. Expanded site information generated by clicking on the "~RAP" link in the example in Figure 1.

    Figure 3. Evolutionarily related sites for MBS "~RAP", generated by clicking on the "E=16" link in the example in Figure 1.




    Figure 4. Output example for keyword search "cyclohexyl"
    Ligand search available here






  • The entire MySQL database is freely available. The latest release can be downloaded here in mysqldump format.
    The initial database schema has been changed. As shown in Figure 5, FireDB now consists of 15 tables. Two tables store information about protein sequences (INFOACC and CONSENSUS) and one table holds general ligand information (COMPOUND). Contact information is calculated (SITE35) and then collapsed onto the master sequences (CSITE35) to generate the MBS. The MBS are then evaluated and compared (CCTEVAL_35 and COMPARE35). MATCHING and PHARMA_ANNO store information retrieved automatically from external databases, while MANUAL stores the manual annotations. References are stored in the REFER table. For a detailed description of the information stored in each table, please check this page. Older releases are also available here.

    Figure 5. Current MySQL database schema of FireDB
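    After downloading a release, one quick sanity check is to list the tables defined in the dump and compare them with the schema. The sketch below relies only on the fact that mysqldump output contains `CREATE TABLE` statements; the function name is hypothetical, and the sample text (including its `id` columns) is an invented placeholder, not the real FireDB column layout.

```python
import re

# mysqldump writes one "CREATE TABLE `NAME` (" statement per table.
CREATE_TABLE = re.compile(
    r"^CREATE TABLE (?:IF NOT EXISTS )?`?(\w+)`?", re.MULTILINE
)


def tables_in_dump(dump_text):
    """Return the table names defined in a mysqldump text, in file order."""
    return CREATE_TABLE.findall(dump_text)


# Invented two-table excerpt; the columns are placeholders only.
sample = """
DROP TABLE IF EXISTS `COMPOUND`;
CREATE TABLE `COMPOUND` (
  `id` int NOT NULL
);
DROP TABLE IF EXISTS `SITE35`;
CREATE TABLE `SITE35` (
  `id` int NOT NULL
);
"""

found = tables_in_dump(sample)
```

    For a real release, the dump file would be read into a string first (its path and encoding depend on the download) and passed to `tables_in_dump`; the result can then be compared with the tables named in the schema description.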





  • Finally, the database is accessible through RESTful web services. A model script with commented examples is provided here to facilitate their use.

    LIGAND

    Information that can be retrieved:
    • whole web page information
    • keyword search in the name/synonym of the compound
    • global compounds list
    • cognate compounds list
    • ambiguous compounds list
    • non-cognate compounds list
    • all manual annotations
    • all pharma annotations
    • all cross-references
    • all metal compounds
    • all obsolete entries (and their substitutes)


    PROTEIN

    Information that can be retrieved:
    • CSA and binding sites information
    • cluster information
    • collapsed binding sites (MBS) information
    • evolutionarily related sites information
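
    As a rough illustration of how such services are typically called from a script, the following sketch builds request URLs and fetches the response using only the Python standard library. It is not the model script mentioned above: the base URL, the path layout (`resource/identifier`), the `keyword` parameter and the function names are all placeholders assumed for illustration, so the real endpoints must be taken from the FireDB documentation.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Placeholder only: this is NOT the real FireDB service address.
BASE_URL = "https://example.org/firedb/rest"


def build_url(resource, identifier, **params):
    """Build a request URL under an assumed <base>/<resource>/<id> layout."""
    url = f"{BASE_URL}/{quote(resource)}/{quote(identifier)}"
    if params:
        # Sort parameters so the resulting URL is deterministic.
        url += "?" + urlencode(sorted(params.items()))
    return url


def fetch(url, timeout=30):
    """Download a response body and decode it as text."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)


# Hypothetical requests mirroring the PROTEIN and LIGAND groups above.
protein_url = build_url("protein", "1tcoC")
ligand_url = build_url("ligand", "search", keyword="cyclohexyl")
```

    Only `build_url` is exercised here; calling `fetch` requires a real service address in place of the placeholder.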



References:
1. Maietta P, Lopez G, Carro A, Pingilley BJ, Leon LG, Valencia A, Tress ML. (2014) "FireDB: a compendium of biological and pharmacologically relevant ligands."
Nucleic Acids Research, doi: 10.1093/nar/gkt1127

2. Lopez G, Valencia A, Tress ML. (2007) "FireDB--a database of functionally important residues from proteins of known structure"
Nucleic Acids Research, doi: 10.1093/nar/gkl897
3. Lopez G, Maietta P, Rodriguez JM, Valencia A and Tress ML. (2011) "firestar--advances in the prediction of functionally important residues"
Nucleic Acids Research. doi: 10.1093/nar/gkr437
4. Lopez G, Valencia A, Tress ML. (2007) "firestar--prediction of functionally important residues using structural templates and alignment reliability"
Nucleic Acids Research, doi:10.1093/nar/gkm297
5. Tress ML, Graña O, Valencia A. (2004) "SQUARE-determining reliable regions in sequence alignments"
Bioinformatics, doi: 10.1093/bioinformatics/bth032
6. Porter CT, Bartlett GJ, Thornton JM. (2004) "The Catalytic Site Atlas: a resource of catalytic sites and residues identified in enzymes using structural data"
Nucleic Acids Research, doi: 10.1093/nar/gkh028