Discriminative model fusion for semantic concept detection and annotation in video
- 2 November 2003
- Proceedings article
- Published by Association for Computing Machinery (ACM)
- pp. 255–258
- https://doi.org/10.1145/957013.957065
Abstract
In this paper we describe a general information fusion algorithm that can be used to incorporate multimodal cues in building user-defined semantic concept models. We compare this technique with a Bayesian Network-based approach on a semantic concept detection task, and results indicate that the fusion technique outperforms the Bayesian Network approach. We demonstrate this approach further by building classifiers of arbitrary concepts in a score space defined by a pre-deployed set of multimodal concepts. Results show that annotation for user-defined concepts both in and outside the pre-deployed set is competitive with our best video-only models on the TREC Video 2002 corpus.
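The score-space construction described in the abstract can be illustrated compactly: each video shot is represented not by low-level features but by the confidence scores of a fixed bank of pre-deployed concept detectors, and a discriminative classifier is trained on those scores to model a new, user-defined concept. The sketch below is a minimal illustration only, not the paper's implementation: it assumes scikit-learn, uses entirely synthetic data, and picks an SVM as one plausible discriminative classifier; the score matrix, label construction, and parameters are all hypothetical.

```python
# Minimal sketch of score-space fusion: shots are represented by the
# confidence scores of a fixed bank of base concept detectors, and a
# discriminative classifier (an SVM here) is trained on those scores
# to detect a new, user-defined concept. All data below is synthetic.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import average_precision_score

rng = np.random.default_rng(0)

n_shots, n_base_concepts = 500, 12           # e.g. "outdoors", "face", "speech", ...
scores = rng.random((n_shots, n_base_concepts))   # detector confidences in [0, 1]

# Synthetic ground truth: the new concept correlates with two base concepts.
labels = (0.6 * scores[:, 0] + 0.4 * scores[:, 3]
          + rng.normal(0, 0.1, n_shots) > 0.6).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    scores, labels, test_size=0.3, random_state=0)

# Train the discriminative model in the score space; probability=True
# yields ranked confidences suitable for retrieval-style evaluation.
model = SVC(kernel="rbf", probability=True, random_state=0)
model.fit(X_train, y_train)

ranked = model.predict_proba(X_test)[:, 1]
print(f"average precision: {average_precision_score(y_test, ranked):.3f}")
```

Because the classifier sees only a short vector of detector scores, new concepts can be trained cheaply on top of the deployed detector bank without re-extracting low-level multimodal features.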