Human-robot interaction through real-time auditory and visual multiple-talker tracking
- 13 November 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Vol. 3, pp. 1402-1409
- https://doi.org/10.1109/iros.2001.977177
Abstract
Nakadai et al. (2001) have developed a real-time auditory and visual multiple-talker tracking technique. In this paper, the technique is applied to human-robot interaction in the form of a receptionist robot and a companion robot at a party. The system integrates face identification, speech recognition, focus-of-attention control, and sensorimotor tasks for tracking multiple talkers. It is implemented on an upper-torso humanoid, and talker tracking is achieved by distributed processing on three nodes connected by a 100Base-TX network, with a tracking delay of 200 ms. Focus-of-attention is controlled by associating auditory and visual streams, using the sound source direction and the talker position as cues. Once an association is established, the humanoid keeps its face turned toward the associated talker.
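The association step described above lends itself to a brief illustration. The Python sketch below is not the authors' implementation; the threshold value, class names, and helper functions are hypothetical. It shows the core idea the abstract names: an auditory stream is bound to the visual stream whose direction estimate agrees with the sound source direction, and the established association then drives where the robot turns its face.

```python
# A minimal sketch, assuming direction estimates in degrees of azimuth.
# The 10-degree association tolerance is an illustrative assumption,
# not a value taken from the paper.
from dataclasses import dataclass

ASSOCIATION_THRESHOLD_DEG = 10.0  # hypothetical cue-agreement tolerance


@dataclass
class AuditoryStream:
    azimuth_deg: float  # estimated sound source direction


@dataclass
class VisualStream:
    azimuth_deg: float  # talker position from visual tracking
    talker_id: str      # identity from face identification


def angular_distance(a: float, b: float) -> float:
    """Smallest absolute difference between two azimuths, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def associate(audio: AuditoryStream, faces: list[VisualStream]):
    """Bind the sound to the closest face by direction, or to nothing
    if no face lies within the association threshold."""
    best = min(
        faces,
        key=lambda f: angular_distance(audio.azimuth_deg, f.azimuth_deg),
        default=None,
    )
    if best and angular_distance(audio.azimuth_deg, best.azimuth_deg) <= ASSOCIATION_THRESHOLD_DEG:
        return best
    return None


if __name__ == "__main__":
    # Two tracked talkers; a sound arrives from roughly talker_A's side.
    audio = AuditoryStream(azimuth_deg=28.0)
    faces = [VisualStream(25.0, "talker_A"), VisualStream(-40.0, "talker_B")]
    target = associate(audio, faces)
    if target:
        # Once associated, the head would be servoed toward this azimuth.
        print(f"Turn head to {target.azimuth_deg} deg ({target.talker_id})")
```

Keeping the association as a simple direction-agreement test is one plausible reading of the abstract; the paper's actual system also folds in face identification and stream history, which this sketch omits.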