To the editor:
Blood coagulation proteins (BCPs) play a major role in hemostasis.1 Except for a few that have their own dedicated databases, information on most BCPs is scattered across disparate data sources in multiple formats. This information has been compiled, manually curated, and assembled into a knowledgebase called ClotBase with the aim of accelerating clinical diagnosis and research in the area of coagulation disorders (Table 1). It presents up-to-date information on all aspects of BCPs, including sequence and structure information, source organisms, function, subcellular location, tissue specificity, and related literature. Links to external databases such as PubMed, European Molecular Biology Laboratory, Protein Information Resource, Protein Data Bank, and Online Mendelian Inheritance in Man are also provided for retrieval of additional information on BCPs. The interactive search features permit easy retrieval of the information available in ClotBase.
Protein | No. of sequences | No. of patterns | No. of mutations |
---|---|---|---|
Factor I | |||
Alpha | 13 | 3 | 63 |
Beta | 14 | 7 | 35 |
Gamma | 12 | 6 | 74 |
Factor II | 22 | 3 | 64 |
Factor III | 14 | 3 | 0 |
Factor V | 14 | 8 | 143 |
Factor VII | 22 | 2 | 197 |
Factor VIII | 14 | 4 | 497 |
Factor IX | 19 | 3 | 384 |
Factor X | 18 | 3 | 115 |
Factor XI | 12 | 4 | 184 |
Factor XII | 10 | 5 | 22 |
Factor XIII | |||
Alpha | 6 | 5 | 92 |
Beta | 4 | 0 | 5 |
Protein C | 16 | 7 | 205 |
Protein S | 10 | 14 | 164 |
Protein Z | 7 | 8 | 6 |
Antithrombin III | 21 | 3 | 165 |
von Willebrand factor | 8 | 9 | 310 |
Kallikrein | 12 | 10 | 6 |
Kininogen | 8 | 5 | 1 |
Heparin cofactor II | 13 | 7 | 4 |
Plasminogen | 13 | 5 | 12 |
Tissue plasminogen activator | 11 | 4 | 0 |
Combined factor V & VIII deficiency | — | — | 48 |
Total | 313 | 128 | 2796 |
ClotBase has information on 313 sequences, 128 patterns, and 2796 mutations for 21 blood coagulation proteins.
— indicates data not available.
Deficiency of BCPs leads to various diseases such as hemophilia and thrombosis, and to an increased risk of myocardial infarction.2-4 Identifying the disease-causing mutations in patients can support genetic testing to confirm or rule out a suspected syndrome, or help determine a person's chance of developing or passing on a genetic disorder. Presently, more than 2796 mutations have been identified in BCPs. These data have been compiled from various data sources and are presented in ClotBase as information on protein sequence, position of mutation, wild-type and mutant residues, domain involved, codon and exon/intron position, associated diseases, and relevant literature links.
Evolutionarily conserved residues are known to be crucial for maintaining the structural stability and function of a protein.5 The vast sequence information available on BCPs makes them well suited to data mining tools that identify conserved residues. A consensus sequence represents the result of a multiple sequence alignment of homologs, in which each position denotes the residue that is most abundant at that position of the alignment. A consensus sequence is thus a single-sequence representation of a protein family. Pattern information gives the extent of conservation at each position and the residues that can be accommodated there without perturbing the structure and function of the protein. Patterns, also known as motifs, signatures, or fingerprints of a protein family, reflect the tight constraints imposed on these sequences during evolution. Consensus sequences and patterns present in BCPs were identified using various in silico tools. Users can access this information and also search for homologs in ClotBase. A query sequence can be searched for similarity against all BCPs or against specific ones.
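As a rough illustration of the two ideas above, the following Python sketch derives a consensus sequence and a PROSITE-style pattern from a small alignment. It is not the procedure or the tools used to build ClotBase, which are not specified here. The function name `consensus_and_pattern`, the toy alignment, and the conventions used (ties broken alphabetically, gapped columns written as `x`) are all assumptions made for this example.

```python
from collections import Counter


def consensus_and_pattern(aligned, gap="-"):
    """Return (consensus, pattern) for equal-length aligned sequences.

    Hypothetical helper for illustration only, not ClotBase's method.
    """
    if not aligned or len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be non-empty and of equal length")

    consensus = []
    pattern = []
    for column in zip(*aligned):
        counts = Counter(column)
        # Consensus: the most abundant symbol in the column
        # (ties broken alphabetically so the result is deterministic).
        consensus.append(min(counts, key=lambda r: (-counts[r], r)))

        # Pattern: every residue observed at this position.
        residues = sorted(set(column) - {gap})
        if gap in counts or not residues:
            pattern.append("x")            # gapped column: any residue
        elif len(residues) == 1:
            pattern.append(residues[0])    # fully conserved position
        else:
            pattern.append("[" + "".join(residues) + "]")
    return "".join(consensus), "-".join(pattern)


# Toy alignment invented for this example; not data from ClotBase.
toy_alignment = ["CGDSGGP", "CGDSGGP", "CKDSGGA", "CQDSGGP"]
consensus, pattern = consensus_and_pattern(toy_alignment)
print(consensus)  # CGDSGGP
print(pattern)    # C-[GKQ]-D-S-G-G-[AP]
```

In this toy case, positions written as a single letter are fully conserved, while bracketed positions list the substitutions seen among the homologs.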
An important feature of ClotBase is the sequence-based Screen Mutation tool by which researchers/clinicians can detect lethal mutations in protein sequences based on reported literature or the evolutionarily conserved residues of BCPs that have been identified by our study.
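The general idea of screening a sequence against reported mutations and conserved positions can be sketched as follows. The algorithm behind the Screen Mutation tool is not described here, so this is only a minimal sketch under stated assumptions: the query is already aligned to the reference and has the same length, and the function `screen_mutations`, the sequences, the conserved positions, and the reported mutation are all hypothetical.

```python
def screen_mutations(reference, query, conserved_positions, reported):
    """List substitutions in `query` relative to `reference`.

    Positions are 1-based. Hypothetical helper for illustration only;
    it assumes pre-aligned, equal-length sequences.
    """
    if len(reference) != len(query):
        raise ValueError("this sketch assumes equal-length sequences")

    findings = []
    for pos, (wild, mutant) in enumerate(zip(reference, query), start=1):
        if wild == mutant:
            continue
        findings.append({
            "position": pos,
            "wild_type": wild,
            "mutant": mutant,
            # True if this exact substitution is in the reported set.
            "reported": (pos, wild, mutant) in reported,
            # True if the substitution falls on a conserved position.
            "conserved": pos in conserved_positions,
        })
    return findings


# Invented inputs for this example; not data from ClotBase.
reference = "CGDSGGP"
query = "CGDSAGL"
conserved = {1, 3, 4, 5, 6}
reported = {(5, "G", "A")}

hits = screen_mutations(reference, query, conserved, reported)
for hit in hits:
    print(hit)
```

A substitution flagged as both reported and conserved would be the strongest candidate for follow-up, whereas one flagged as neither carries no evidence either way in this sketch.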
ClotBase is currently the only open-access, manually curated database that stores information on all the known BCPs. It is designed in a user-friendly manner to allow easy and interactive navigation across its various interfaces. ClotBase aims to be a one-stop information portal for accessing manually curated data as well as submitting relevant data on BCPs. It will be updated every 4 months and can be freely accessed at http://www.clotbase.bicnirrh.res.in.
Authorship
This work was supported by grants from the Indian Council of Medical Research (63/128/2001-BMS).
Contribution: A.S. was involved in data collection; P.N. developed the database; R.S.B. created the Web interface; and S.I.-T. guided the work.
Conflict-of-interest disclosure: The authors declare no competing financial interests.
Correspondence: Susan Idicula-Thomas, Biomedical Informatics Center, National Institute for Research in Reproductive Health, J.M. St, Parel, Mumbai 400012, India; e-mail: thomass@nirrh.res.in.