Characterization of new proteins found by analysis of short open reading frames from the full yeast genome

Abstract
We have analysed the short open reading frames (150 to 300 base pairs long) of the yeast Saccharomyces cerevisiae genome using a two‐step strategy. The first step selects a candidate set of open reading frames from the DNA sequence based on statistical evaluation of DNA and protein sequence properties. The second step filters the candidate set by retaining only open reading frames with high similarity to known sequences from any organism. As a result, we report ten new predicted proteins not present in the current sequence databases. These include a new alcohol dehydrogenase, a protein probably related to the cell cycle, and a homolog of the prokaryotic ribosomal protein L36 that is likely to be a mitochondrial ribosomal protein encoded in the nuclear genome. We conclude that the analysis of short open reading frames leads to biologically interesting discoveries, even though the quantitative yield of new proteins is relatively low. © 1997 John Wiley & Sons, Ltd.
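
As an illustration of the size window used above, the following minimal Python sketch enumerates open reading frames of 150 to 300 base pairs in all six reading frames. It is not the procedure used in this work: the function names (`find_short_orfs`, `reverse_complement`), the ORF definition (first in-frame ATG after the previous stop, through the next in-frame stop codon, with both codons counted in the length), and the use of the standard stop codons are assumptions made for the example. The statistical evaluation of step one and the similarity filter of step two are not modelled.

```python
# Illustrative sketch only; names and the ORF definition are assumptions.
STOP_CODONS = {"TAA", "TAG", "TGA"}          # standard nuclear stop codons
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_short_orfs(seq, min_len=150, max_len=300):
    """List ORFs whose length in base pairs lies in [min_len, max_len].

    An ORF here runs from the first in-frame ATG after the previous stop
    codon through the next in-frame stop codon (both included in the length).
    Each hit is (strand, frame, start, end), with 0-based, end-exclusive
    coordinates on the strand that was scanned.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None                      # position of the open ATG, if any
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    length = i + 3 - start
                    if min_len <= length <= max_len:
                        hits.append((strand, frame, start, i + 3))
                    start = None              # close the ORF, look for the next ATG
    return hits


if __name__ == "__main__":
    # Toy sequence: ATG + 50 alanine codons + stop = 156 bp.
    toy = "ATG" + "GCC" * 50 + "TAA"
    print(find_short_orfs(toy))
```

In a pipeline of the kind described, the candidates returned by such a scan would still have to pass the statistical selection and then a database similarity search before being reported as predicted proteins.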