Semiautomatic news analysis, indexing, and classification system based on topic preselection

Abstract
In this paper, we present the concept of an efficient semiautomatic system for the analysis, classification, and indexing of TV news program material, and show that its practical realization is feasible. The only input to the system, other than the news program itself, is a set of spoken words that serve as keys for topic prespecification. The chosen topics express the user's current professional or private interests and are used to filter the news material accordingly. After the basic analysis steps on the news program stream, including shot change detection and key frame extraction, the system automatically represents the news program as a series of longer, higher-level segments. Each segment contains one or more video shots and belongs to one of several coarse categories, such as anchorperson (news reader) shots, news shot series, and the starting and ending program sequences. The segmentation procedure is performed on the video component of the news program stream, and the results are used to define the corresponding segments in the news audio stream. In the next step, the system uses the prespecified audio keys to index the segments and group them into reports, which are the actual retrieval units. This step is performed on the segmented news audio stream by applying a wordspotting procedure to each segment. As a result, all reports on the prespecified topics are readily available for efficient retrieval.
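The final indexing and grouping step can be illustrated with a minimal Python sketch. It is not the system described in the paper: the abstract does not state the grouping rule, so the sketch assumes that a report begins at an anchorperson segment and runs until the next one, and that the wordspotter has already attached the detected topic keys to each segment. The names `Segment`, `Report`, `group_into_reports`, and `index_by_topic`, as well as the category labels, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    category: str   # assumed labels: "intro", "anchor", "news", "outro"
    start: float    # segment start time in seconds
    end: float      # segment end time in seconds
    # Topic keys a wordspotter is assumed to have found in this segment's audio.
    spotted: set = field(default_factory=set)


@dataclass
class Report:
    segments: list
    topics: set


def group_into_reports(segments):
    """Group segments into reports (assumed rule: each anchor shot opens a report)."""
    groups, current = [], []
    for seg in segments:
        if seg.category in ("intro", "outro"):
            continue  # program start/end sequences carry no report content
        if seg.category == "anchor" and current:
            groups.append(current)
            current = []
        current.append(seg)
    if current:
        groups.append(current)
    # A report is indexed by every topic key spotted in any of its segments.
    return [Report(g, set().union(*(s.spotted for s in g))) for g in groups]


def index_by_topic(reports):
    """Map each prespecified topic key to the positions of the reports that mention it."""
    index = {}
    for i, report in enumerate(reports):
        for topic in sorted(report.topics):
            index.setdefault(topic, []).append(i)
    return index
```

Under these assumptions, retrieving all reports on a topic reduces to a dictionary lookup in the index returned by `index_by_topic`.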
