Using audio time scale modification for video browsing
- 25 August 2005
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
In the IBM CueVideo project we study various aspects of fully automated video indexing, browsing and retrieval. The technical aspects include audio processing, speech recognition, image processing and information retrieval. Equally important, however, is exploring user expectations and conducting user studies. We focus on the field of video for Training and Education, including Distributed Learning, Remote Education, and Just-in-Time Learning. This paper describes the use of audio processing technology, namely audio Time Scale Modification (TSM), for the novel application of fast video browsing and efficient video-based learning. The paper provides a brief overview of the CueVideo system, technical background of TSM technology, and the way it is being used in our system. The results of our usability study on the effect of TSM on speech comprehension indicate that TSM is very useful for fast video browsing.