Abstract
The determination of complete genome sequences is producing an avalanche of protein sequences awaiting structural and functional interpretation. Only a small fraction of the proteins encoded in these genomes has been studied experimentally, but putative functions can be assigned to roughly 70% of the open reading frames (ORFs) via homology with characterized proteins in the databases. Similarly, although structures have been determined for only a very small number of these proteins, putative three-dimensional (3D) structures can currently be assigned to roughly 30% of the ORFs using computational fold-assignment methods. Here I address the following questions. How fast is our structural knowledge growing? What is the distribution of assigned folds across the different functional categories? How might structure determination efforts be prioritized for maximum information and impact?