Human-robot interaction through real-time auditory and visual multiple-talker tracking

Abstract
Nakadai et al. (2001) have developed a real-time auditory and visual multiple-talker tracking technique. In this paper, this technique is applied to human-robot interaction, including a receptionist robot and a companion robot at a party. The system includes face identification, speech recognition, focus-of-attention control, and sensorimotor tasks in tracking multiple talkers. It is implemented on an upper-torso humanoid, and talker tracking is achieved by distributed processing on three nodes connected by a 100Base-TX network. The tracking delay is 200 msec. Focus-of-attention is controlled by associating auditory and visual streams, using the sound source direction and talker position as clues. Once an association is established, the humanoid keeps its face turned toward the associated talker.
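The direction-based association described above can be sketched in code. The following is a minimal illustration, not the paper's actual implementation: it assumes each auditory and visual stream reports an azimuth in degrees, and pairs streams whose directions agree within a threshold. All names and the 10-degree threshold are illustrative assumptions.

```python
# Hypothetical sketch of auditory-visual stream association by direction.
# The threshold value and all identifiers are assumptions for illustration,
# not taken from Nakadai et al. (2001).
ASSOCIATION_THRESHOLD_DEG = 10.0

def angular_difference(a, b):
    """Smallest absolute difference between two azimuths, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def associate_streams(auditory_streams, visual_streams):
    """Pair each auditory stream with the closest visual stream by azimuth.

    auditory_streams, visual_streams: lists of (stream_id, azimuth_deg).
    Returns a list of (auditory_id, visual_id) associations; auditory
    streams with no visual stream within the threshold stay unassociated.
    """
    associations = []
    for a_id, a_az in auditory_streams:
        best = None
        for v_id, v_az in visual_streams:
            diff = angular_difference(a_az, v_az)
            if diff <= ASSOCIATION_THRESHOLD_DEG and (best is None or diff < best[1]):
                best = (v_id, diff)
        if best is not None:
            associations.append((a_id, best[0]))
    return associations
```

In such a scheme, once an auditory stream is associated with a visual (face) stream, the attention controller would command the humanoid's head toward that talker's azimuth, as the abstract describes.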