Computer-Based Methods for the Mouse Full-Length cDNA Encyclopedia: Real-Time Sequence Clustering for Construction of a Nonredundant cDNA Library
Open Access
- 18 January 2001
- journal article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 11 (2) , 281-289
- https://doi.org/10.1101/gr.gr-1457r
Abstract
We developed computer-based methods for constructing a nonredundant mouse full-length cDNA library. Our cDNA library construction process comprises assessment of library quality, sequencing the 3′ ends of inserts and clustering, and completing a re-array to generate a nonredundant library from a redundant one. After the cDNA libraries are generated, we sequence the 5′ ends of the inserts to check the quality of the library; then we determine the sequencing priority of each library. Selected libraries undergo large-scale sequencing of the 3′ ends of the inserts and clustering of the tag sequences. After clustering, the nonredundant library is constructed from the original libraries, which have redundant clones. All libraries, plates, clones, sequences, and clusters are uniquely identified, and all information is saved in the database according to this identifier. At press time, our system has been in place for the past two years; we have clustered 939,725 3′ end sequences into 127,385 groups from 227 cDNA libraries/sublibraries (see http://genome.gse.riken.go.jp/). [The sequence data described in this paper have been submitted to the DDBJ data library under accession nos. AV00011–AV175734, AV204013–AV382295, and BB561685–BB609425.]
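The cluster-then-re-array step described above can be illustrated with a minimal sketch. The abstract does not specify the clustering algorithm, so this example substitutes a simple stand-in: single-linkage grouping of 3′ end tags that share several k-mers, followed by picking one clone per group for the re-arrayed plate. The function names (`cluster_tags`, `pick_representatives`), the parameters `k` and `min_shared`, and the "first clone ID" selection rule are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cluster_tags(tags, k=8, min_shared=3):
    """Group clones whose tags share at least min_shared k-mers.

    tags: dict mapping clone ID -> 3' end tag sequence.
    Returns a sorted list of clusters, each a sorted list of clone IDs.
    NOTE: shared-k-mer single linkage is an illustrative stand-in, not
    the clustering method used in the paper.
    """
    parent = {cid: cid for cid in tags}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Inverted index: k-mer -> clone IDs whose tag contains it.
    index = defaultdict(list)
    for cid, seq in tags.items():
        for km in kmers(seq.upper(), k):
            index[km].append(cid)

    # Count shared k-mers for every pair of clones that co-occur.
    shared = defaultdict(int)
    for ids in index.values():
        for i in range(len(ids)):
            for j in range(i + 1, len(ids)):
                shared[tuple(sorted((ids[i], ids[j])))] += 1

    # Merge pairs that pass the (hypothetical) similarity threshold.
    for (a, b), count in shared.items():
        if count >= min_shared:
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for cid in tags:
        groups[find(cid)].append(cid)
    return sorted(sorted(members) for members in groups.values())


def pick_representatives(clusters):
    """Choose one clone per cluster for the nonredundant re-array.

    Assumption: the first clone ID in sorted order is taken; a real
    pipeline would more plausibly rank clones by quality criteria.
    """
    return [members[0] for members in clusters]
```

Passing a dictionary of clone IDs and tag sequences to `cluster_tags` and then feeding the result to `pick_representatives` yields one clone ID per group, which is the list a re-arraying step would consume. The all-pairs counting inside each k-mer bucket is quadratic in bucket size, so this sketch is suited to small inputs only, not to a data set on the scale reported above.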