Assessing the protease and protease inhibitor content of the human genome

1 September 2000

journal article
research article
Published by Wiley in Journal of Peptide Science

Vol. 6 (9), 453-458
https://doi.org/10.1002/1099-1387(200009)6:9<453::aid-psc284>3.0.co;2-z

Abstract

Identifying the entire complement of protease and protease inhibitor sequences through the Human Genome Project will be of great importance to both academic and pharmaceutical research. Although the finishing phase is not yet complete, a selection of secondary annotation sources and comparisons with completed model organism genomes already allow useful estimates to be made. Conservative extrapolation suggests that proteases account for approximately 1.8% of human genes. This is close to the figures for yeast (1.7%) and worm (1.8%) but lower than that for the fly (3.4%), which has a large trypsin-like protease content. Applying this fraction to estimates of between 40,000 and 60,000 genes for the human proteome gives 700-1,100 proteases, compared with approximately 360 currently represented as GenBank mRNAs. Preliminary comparisons between domain annotations for predicted human gene products and completed proteins suggest that the genomic protease family and mechanistic class distributions will broadly reflect those in the current transcript data. The protease:inhibitor ratio at the mRNA level is currently approximately 9:1, but genome annotation data indicate that inhibitory domains are more widespread than this ratio would suggest.
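
The extrapolation quoted in the abstract can be reproduced with simple arithmetic. The sketch below is illustrative only: the 1.8% fraction, the 40,000-60,000 gene range, and the figure of roughly 360 GenBank mRNAs are taken from the abstract, while the function and variable names are hypothetical and not part of the original work.

```python
# Illustrative arithmetic only. The numeric inputs come from the abstract;
# the function name and variable names are assumptions made for this sketch.

def extrapolate_proteases(gene_count: int, protease_fraction: float = 0.018) -> int:
    """Estimate the number of protease genes for a given total gene count."""
    return round(gene_count * protease_fraction)

low = extrapolate_proteases(40_000)   # lower bound of the gene-count estimate
high = extrapolate_proteases(60_000)  # upper bound of the gene-count estimate
known_mrnas = 360                     # proteases currently represented as GenBank mRNAs

print(low, high)
print(f"{known_mrnas / high:.0%} to {known_mrnas / low:.0%} of the estimate is already represented")
```

The unrounded values of 720 and 1,080 correspond to the abstract's rounded range of 700-1,100, so under these assumptions roughly a third to a half of the predicted proteases were already represented at the mRNA level.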
