Autonomous visual model building based on image crawling through internet search engines
- 15 October 2004
- proceedings article
- Published by Association for Computing Machinery (ACM)
- p. 315-322
- https://doi.org/10.1145/1026711.1026762
Abstract
In this paper, we propose an autonomous learning scheme that automatically builds visual semantic concept models from the output of Internet search engines, without any manual labeling. First, images are gathered by crawling the Internet with a search engine such as Google. Then, we model the search results as "Quasi-Positive Bags" in the Multiple-Instance Learning (MIL) framework. We call this generalized MIL (GMIL). We propose an algorithm called "Bag K-Means" to find the maximum Diverse Density (DD) when no negative bags exist. Its cost function takes the form of K-Means with a special "Bag Distance". We also propose a solution called "Uncertain Labeling Density" (ULD), which describes the target density distribution of instances in the case of quasi-positive bags. A "Bag Fuzzy K-Means" algorithm is presented to find the maximum of ULD. Using this generalized MIL with ULD, the model for a particular concept is learned from the images crawled through Internet search engines. Experiments show that our algorithm obtains correct models for the concepts we are interested in. Compared to the original Google Image Search, our algorithm shows improved accuracy.
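The abstract names "Bag K-Means" and a "Bag Distance" but does not define them, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes that the bag distance is the squared Euclidean distance from a candidate concept point to the nearest instance in a bag, and that a single concept point is sought by alternating between picking each bag's nearest instance and averaging those instances. The names `bag_distance`, `bag_k_means`, and the toy data are hypothetical.

```python
import numpy as np


def bag_distance(point, bag):
    """Assumed bag distance: squared distance to the nearest instance in the bag."""
    d2 = np.sum((bag - point) ** 2, axis=1)
    return float(d2.min())


def bag_k_means(bags, init, max_iter=100, tol=1e-9):
    """Alternate between nearest-instance selection and mean update (one cluster)."""
    target = np.asarray(init, dtype=float)
    for _ in range(max_iter):
        # Assignment step: each bag contributes its instance closest to the target.
        nearest = np.stack(
            [bag[np.argmin(np.sum((bag - target) ** 2, axis=1))] for bag in bags]
        )
        # Update step: move the target to the mean of the selected instances.
        new_target = nearest.mean(axis=0)
        converged = np.linalg.norm(new_target - target) < tol
        target = new_target
        if converged:
            break
    # Cost: sum of bag distances, the K-Means-style objective assumed here.
    cost = sum(bag_distance(target, bag) for bag in bags)
    return target, cost


# Toy quasi-positive bags: each holds one instance near (1, 1) plus an outlier.
bags = [
    np.array([[1.0, 1.0], [5.0, -3.0]]),
    np.array([[1.2, 0.8], [-4.0, 6.0]]),
    np.array([[0.8, 1.2], [9.0, 9.0]]),
]
target, cost = bag_k_means(bags, init=bags[0][0])
print(target, cost)
```

Like ordinary K-Means, this alternation only reaches a local minimum, so the result depends on the starting point; in practice one would likely restart from several instances and keep the lowest-cost target.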