Ferret
- 18 April 2006
- journal article
- Published by Association for Computing Machinery (ACM) in ACM SIGOPS Operating Systems Review
- Vol. 40 (4), 317-330
- https://doi.org/10.1145/1218063.1217966
Abstract
Building content-based search tools for feature-rich data has been a challenging problem because feature-rich data such as audio recordings, digital images, and sensor data are inherently noisy and high dimensional. Comparing noisy data requires comparisons based on similarity instead of exact matches, and thus searching for noisy data requires similarity search instead of exact search.

The Ferret toolkit is designed to help system builders quickly construct content-based similarity search systems for feature-rich data types. The key component of the toolkit is a content-based similarity search engine for generic, multi-feature object representations. To solve the similarity search problem in high-dimensional spaces, we have developed approximation methods inspired by recent theoretical results on dimension reduction. The search engine constructs sketches from feature vectors as highly compact data structures for matching, filtering and ranking data objects. The toolkit also includes several other components to help system builders address search system infrastructure issues. We have implemented the toolkit and used it to successfully construct content-based similarity search systems for four data types: audio recordings, digital photos, 3D shape models and genomic microarray data.
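The abstract does not spell out how sketches are built, so the following is only a minimal illustration of the general idea of sketch-based filtering followed by ranking, not the toolkit's actual method. It assumes sign-of-random-projection bit sketches compared by Hamming distance, with a final re-ranking by exact Euclidean distance on the original feature vectors. All function names, parameters, and the choice of distance are hypothetical.

```python
import random

def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian directions (assumed sketch parameters)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def sketch(vector, hyperplanes):
    """Compress a feature vector into an integer bit string.

    Bit i is 1 when the vector lies on the non-negative side of hyperplane i.
    """
    bits = 0
    for i, plane in enumerate(hyperplanes):
        dot = sum(p * v for p, v in zip(plane, vector))
        if dot >= 0.0:
            bits |= 1 << i
    return bits

def hamming(a, b):
    """Number of differing bits between two sketches."""
    return bin(a ^ b).count("1")

def euclidean(u, v):
    """Exact distance on the original feature vectors (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(u, v)) ** 0.5

def search(query, dataset, sketches, hyperplanes, n_candidates, k):
    """Filter by sketch distance, then rank the survivors exactly."""
    q = sketch(query, hyperplanes)
    # Filtering step: keep the n_candidates objects with the closest sketches.
    by_sketch = sorted(range(len(dataset)), key=lambda i: hamming(q, sketches[i]))
    candidates = by_sketch[:n_candidates]
    # Ranking step: order the candidates by exact distance to the query.
    ranked = sorted(candidates, key=lambda i: euclidean(query, dataset[i]))
    return ranked[:k]

# Toy data: four 4-dimensional feature vectors.
dataset = [
    [1.0, 0.0, 0.0, 0.0],
    [0.9, 0.1, 0.0, 0.0],
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
]
hyperplanes = make_hyperplanes(dim=4, n_bits=32)
sketches = [sketch(v, hyperplanes) for v in dataset]
result = search([1.0, 0.0, 0.0, 0.0], dataset, sketches, hyperplanes,
                n_candidates=2, k=1)
```

In this toy setup each 4-dimensional vector is reduced to a 32-bit integer, and only the candidates that survive the cheap Hamming-distance filter are compared against the query with the more expensive exact distance. A real system would tune the sketch size and candidate count per data type.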