Sequence and analysis of chromosome 3 of the plant Arabidopsis thaliana

Abstract
Arabidopsis thaliana is an important model system for plant biologists [1]. In 1996 an international collaboration (the Arabidopsis Genome Initiative) was formed to sequence the whole genome of Arabidopsis [2] and in 1999 the sequence of the first two chromosomes was reported [3,4]. The sequence of the last three chromosomes and an analysis of the whole genome are reported in this issue [5,6,7]. Here we present the sequence of chromosome 3, organized into four sequence segments (contigs). The two largest (13.5 and 9.2 Mb) correspond to the top (long) and the bottom (short) arms of chromosome 3, and the two small contigs are located in the genetically defined centromere [8]. This chromosome encodes 5,220 of the roughly 25,500 predicted protein-coding genes in the genome. About 20% of the predicted proteins have significant homology to proteins in eukaryotic genomes for which the complete sequence is available, pointing to important conserved cellular functions among eukaryotes.