Analysis of expressed sequence tags indicates 35,000 human genes

Abstract
The number of protein-coding genes in an organism provides a useful first measure of its molecular complexity. Single-celled prokaryotes and eukaryotes typically have a few thousand genes; for example, Escherichia coli [1] has 4,300 and Saccharomyces cerevisiae [2] has 6,000. Evolution of multicellularity appears to have been accompanied by a several-fold increase in gene number, the invertebrates Caenorhabditis elegans [3] and Drosophila melanogaster [4] having 19,000 and 13,600 genes, respectively. Here we estimate the number of human genes by comparing a set of human expressed sequence tag (EST) contigs with human chromosome 22 and with a non-redundant set of mRNA sequences. The two comparisons give mutually consistent estimates of approximately 35,000 genes, substantially lower than most previous estimates. Evolution of the increased physiological complexity of vertebrates may therefore have depended more on the combinatorial diversification of regulatory networks or alternative splicing than on a substantial increase in gene number.
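
The extrapolation behind such an estimate can be illustrated with a minimal sketch. It assumes that EST contigs sample genes roughly uniformly, so that the share of contigs matching a well-annotated reference (such as a single chromosome) approximates the share of all genes lying in that reference. The function name, its parameters, and the numbers in the usage lines are hypothetical and chosen only for illustration; they are not the counts used in this study, and the estimator shown is a simplification rather than the exact procedure applied here.

```python
def extrapolate_gene_count(genes_in_reference, contigs_matching_reference, total_contigs):
    """Scale a known gene count up to the whole genome.

    Assumption (hypothetical simplification): contigs sample genes uniformly,
    so contigs_matching_reference / total_contigs approximates the fraction
    of all genes that lie in the reference.
    """
    if contigs_matching_reference <= 0 or total_contigs <= 0:
        raise ValueError("contig counts must be positive")
    if contigs_matching_reference > total_contigs:
        raise ValueError("matching contigs cannot exceed total contigs")
    # total genes = genes in reference / (fraction of contigs hitting the reference)
    return genes_in_reference * total_contigs / contigs_matching_reference


# Illustrative numbers only (not taken from the paper):
# 500 annotated genes in the reference, 1,000 of 50,000 contigs match it.
estimate = extrapolate_gene_count(500, 1000, 50000)
print(f"Estimated total genes: {estimate:,.0f}")
```

The same ratio logic applies when the reference is a non-redundant mRNA set rather than a chromosome; agreement between two such independent extrapolations is what lends support to the final figure.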