Genomically linked cellular protein databases derived from two‐dimensional polyacrylamide gel electrophoresis

Abstract
In its most useful form, a cellular protein database should be genomically based, because it is the genome that determines both the total number of proteins a cell can make and the particular ones that will be made under any given condition. Such a database should trace each protein back to its structural gene, and should account for every structural gene of a cell. Recent advances in molecular biology greatly facilitate the construction of such gene‐protein databases. The genes of unidentified proteins resolved from total cell extracts on two‐dimensional gels can now be mapped by largely biochemical methods, without the need to isolate mutants or perform genetic crosses. Other techniques permit one to search gels for the product of any newly discovered gene (or open reading frame) suspected of encoding a protein. Consequently, gene‐protein indices can be built independently and simultaneously from either direction: deducing the genetic map from the protein pattern, or finding the protein pattern from information encoded in the genome. A database of this sort is being constructed for the bacterium Escherichia coli. Given the current pace of DNA nucleotide sequencing, the development of total gene‐protein indices for a variety of cells can be anticipated in the near future.
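As an illustration only, the two-way gene‐protein index described above can be sketched as a small data structure. This is a minimal sketch, not the schema of the E. coli database: the class name `GeneProteinIndex`, its methods, and the identifiers `S1`, `S2`, `geneA`, and `geneB` are all hypothetical. A link is recorded the same way whether it was found by mapping a gel spot to its gene or by locating the product of a known gene on the gel, and the two bookkeeping queries report what remains before every spot and every structural gene is accounted for.

```python
from dataclasses import dataclass, field


@dataclass
class GeneProteinIndex:
    """Toy two-way index between structural genes and 2-D gel spots (hypothetical)."""

    genes: set = field(default_factory=set)  # known structural genes or open reading frames
    spots: set = field(default_factory=set)  # protein spots resolved on the gel
    spot_to_gene: dict = field(default_factory=dict)  # established spot -> gene links

    def link(self, spot, gene):
        """Record that a gel spot is the product of a gene, whichever way it was found."""
        self.spots.add(spot)
        self.genes.add(gene)
        self.spot_to_gene[spot] = gene

    def gene_to_spots(self, gene):
        """Look up the gel spots assigned to a gene (genome -> protein pattern)."""
        return sorted(s for s, g in self.spot_to_gene.items() if g == gene)

    def unmapped_spots(self):
        """Spots whose structural gene is still unknown."""
        return sorted(self.spots - set(self.spot_to_gene))

    def genes_without_spot(self):
        """Genes whose protein product has not yet been located on the gel."""
        return sorted(self.genes - set(self.spot_to_gene.values()))


# Hypothetical usage: two spots and two genes, one link established so far.
index = GeneProteinIndex(genes={"geneA", "geneB"}, spots={"S1", "S2"})
index.link("S1", "geneA")
```

In this sketch the index is complete when both `unmapped_spots()` and `genes_without_spot()` return empty lists. A real database would also need to handle genes that yield several spots through modification, and genes not expressed under the conditions examined.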