Two-Microphone Voice Activity Detection Based on the Homogeneity of the Direction of Arrival Estimates

Abstract
Voice activity detection (VAD) systems have been the subject of continuous research over the last three decades. Single-microphone systems cannot take advantage of certain spatial properties of speech signals, while beamforming-based microphone arrays with many elements can be difficult to implement in practice because of their cost and complexity. The aim of the work described in this paper was to achieve both practical feasibility and spatial discrimination ability. A new approach to two-microphone VAD is developed that exploits the concentration of speech energy in time, frequency and space. The algorithm is implemented and compared with several standard VAD algorithms, such as AFE, AMR and G.729B, and with other recently proposed systems, showing promising results under real-world noise conditions. The main advantage of the proposed approach is its capacity to outperform the above methods without the need for any spatial or spectral constraints, which makes it both versatile and open to further improvement.
