Toru Takahashi, Taiki Kanbayashi, Masato Nakayama
Electronics, 14(4) 711-711, Feb 12, 2025. Peer-reviewed; Last author; Corresponding author
Understanding dialogue activity makes it possible to identify the role each person plays in a discussion and provides basic material for formulating facilitation strategies. Such understanding is expected to be useful in business negotiations, group work, active learning, and similar settings. To develop a system that can monitor speech activity over a wide area, we propose a method for detecting multiple acoustic events and localizing sound sources using an asynchronous distributed microphone array arranged in a repeating regular hexagonal structure. In contrast to conventional methods, which triangulate sound source directions estimated by microphone arrays, the proposed method estimates the spatial energy distribution inside the observation space and then detects acoustic events and determines sound source positions from its local maxima. We evaluated the conventional and proposed methods in an experimental environment that simulated a dialogue between two people under 22,104 conditions, using source signals convolved with measured impulse responses. We found that performance depends on which microphone arrays are selected for estimation, and that it is best to choose five microphone arrays close to the evaluation position.
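The local-maximum step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the rectangular grid, the synthetic Gaussian energy map standing in for two talkers, the threshold value, the eight-neighbour peak test, and the function name local_maxima are all assumptions made for illustration, and how the energy distribution is actually estimated from the microphone arrays is not shown here.

```python
import numpy as np


def local_maxima(energy, threshold):
    """Return (row, col) indices of grid cells that exceed `threshold`
    and are strictly greater than all eight neighbouring cells."""
    energy = np.asarray(energy, dtype=float)
    # Pad with -inf so that border cells can still be compared with "neighbours".
    padded = np.pad(energy, 1, mode="constant", constant_values=-np.inf)
    rows, cols = energy.shape
    center = padded[1:-1, 1:-1]
    is_peak = center > threshold
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            neighbour = padded[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols]
            is_peak &= center > neighbour
    return [(int(r), int(c)) for r, c in np.argwhere(is_peak)]


# Hypothetical 4 m x 3 m observation space sampled every 0.1 m.
xs = np.linspace(0.0, 4.0, 41)
ys = np.linspace(0.0, 3.0, 31)
X, Y = np.meshgrid(xs, ys)


def blob(cx, cy, amplitude, sigma=0.3):
    """Synthetic energy bump standing in for one active talker (assumption)."""
    return amplitude * np.exp(-((X - cx) ** 2 + (Y - cy) ** 2) / (2 * sigma ** 2))


# Two simulated talkers at (1.0, 1.0) m and (3.0, 2.0) m.
energy = blob(1.0, 1.0, 1.0) + blob(3.0, 2.0, 0.8)

peaks = local_maxima(energy, threshold=0.5)
positions = [(float(xs[c]), float(ys[r])) for r, c in peaks]
print(positions)
```

Each returned position is a candidate sound source, and the threshold acts as a simple event detector: a region whose energy stays below it produces no detection. In practice the threshold and grid resolution would have to be tuned to the estimated energy map.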