MagicMatch--cross-referencing sequence identifiers across databases
Open Access
- 16 June 2005
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 21 (16), 3429-3430
- https://doi.org/10.1093/bioinformatics/bti548
Abstract
Motivation: At present, mapping of sequence identifiers across databases is a daunting, time-consuming and computationally expensive process, usually achieved by sequence similarity searches with strict threshold values.
Summary: We present a rapid and efficient method to map sequence identifiers across databases. The method uses the MD5 checksum algorithm for message integrity to generate sequence fingerprints and uses these fingerprints as hash strings to map sequences across databases. The program, called MagicMatch, is able to cross-link any of the major sequence databases within a few seconds on a modest desktop computer.
Availability: MagicMatch is available at the following URL (http://cgg.ebi.ac.uk/services/magicmatch/), including an interactive service for major databases and binary downloads for widely used platforms.
Contact: ouzounis@ebi.ac.uk
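The fingerprinting idea described in the abstract can be illustrated with a minimal Python sketch. This is not the MagicMatch implementation: the function names (`fingerprint`, `index_by_fingerprint`, `cross_reference`), the normalization step (removing whitespace and upper-casing residues) and the in-memory dictionary layout are all assumptions made for illustration. Only the use of MD5 digests as exact-match keys comes from the abstract.

```python
import hashlib
from collections import defaultdict


def fingerprint(sequence: str) -> str:
    """Return the MD5 hex digest of a sequence.

    Assumption: whitespace is removed and residues are upper-cased first,
    so that formatting differences between databases do not change the key.
    """
    normalized = "".join(sequence.split()).upper()
    return hashlib.md5(normalized.encode("ascii")).hexdigest()


def index_by_fingerprint(records: dict) -> dict:
    """Map each fingerprint to the identifiers that share that sequence."""
    index = defaultdict(list)
    for identifier, sequence in records.items():
        index[fingerprint(sequence)].append(identifier)
    return index


def cross_reference(db_a: dict, db_b: dict) -> dict:
    """For every identifier in db_a, list identifiers in db_b whose
    sequence is identical after normalization."""
    index_b = index_by_fingerprint(db_b)
    return {
        id_a: index_b.get(fingerprint(seq), [])
        for id_a, seq in db_a.items()
    }


if __name__ == "__main__":
    # Hypothetical identifiers and toy sequences, for illustration only.
    db_a = {"A1": "MKTAYIAKQR", "A2": "MSLLTEVETY"}
    db_b = {"B7": "mktayiakqr", "B8": "MKTAYIAKQQ"}
    print(cross_reference(db_a, db_b))
```

Because a digest only matches when the normalized sequences are identical, this kind of lookup finds exact duplicates across databases and would not link near-identical entries such as `B8` above. Each lookup is a single dictionary access, which is consistent with the speed the abstract reports, although the actual program may organize its data differently.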