The Protein Information Resource: an integrated public resource of functional annotation of proteins
- 1 January 2002
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 30 (1), 35-37
- https://doi.org/10.1093/nar/30.1.35
Abstract
The Protein Information Resource (PIR) serves as an integrated public resource of functional annotation of protein data to support genomic/proteomic research and scientific discovery. The PIR, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Information Database (JIPID), produces the PIR-International Protein Sequence Database (PSD), the major annotated protein sequence database in the public domain, containing about 250 000 proteins. To improve protein annotation and the coverage of experimentally validated data, a bibliography submission system is developed for scientists to submit, categorize and retrieve literature information. Comprehensive protein information is available from iProClass, which includes family classification at the superfamily, domain and motif levels, structural and functional features of proteins, as well as cross-references to over 40 biological databases. To provide timely and comprehensive protein data with source attribution, we have introduced a non-redundant reference protein database, PIR-NREF. The database consists of about 800 000 proteins collected from PIR-PSD, SWISS-PROT, TrEMBL, GenPept, RefSeq and PDB, with composite protein names and literature data. To promote database interoperability, we provide XML data distribution and open database schema, and adopt common ontologies. The PIR web site (http://pir.georgetown.edu/) features data mining and sequence analysis tools for information retrieval and functional identification of proteins based on both sequence and annotation information. The PIR databases and other files are also available by FTP (ftp://nbrfa.georgetown.edu/pir_databases).