Curriculum Vitae

IWAI KENTA

  (岩居 健太)

Profile Information

Affiliation
Associate Professor, Faculty of Information Design Technology, Osaka Sangyo University
Guest Researcher, Organization for Research and Development of Innovative Science and Technology, Kansai University
Degree
Ph.D. (Eng.) (Kansai University)

J-GLOBAL ID
201801004052332030
researchmap Member ID
7000024135

Papers

 113
  • 岩居 健太
    電子情報通信学会技術研究報告, 124(390) 275-280, Mar, 2025  Lead author, Last author, Corresponding author
  • 山口 翔, 豊岡 祥太, 岩居 健太, 喜多 俊輔, 梶川 嘉延
    電子情報通信学会技術研究報告, 124(389) 481-485, Mar, 2025  
  • 塚原 大樹, 豊岡 祥太, 岩居 健太, 梶川 嘉延
    電子情報通信学会技術研究報告, 124(390) 269-274, Mar, 2025  
  • 松浦 亮, 豊岡 祥太, 岩居 健太, 梶川 嘉延
    電子情報通信学会技術研究報告, 124(389) 180-185, Mar, 2025  
  • 水谷 真絃, 岩居 健太, 西浦 敬信, 添田 喜治
    電子情報通信学会技術研究報告, 124(389) 146-151, Mar, 2025  
  • Binh Thien Nguyen, Yukoh Wakabayashi, Yuting Geng, Kenta Iwai, Takanobu Nishiura
    Acoustical Science and Technology, 46(2) 186-190, Mar 1, 2025  Peer-reviewed
  • Shota Toyooka, Kenta Iwai, Yoshinobu Kajikawa
    Acoustical Science and Technology, 2025  Peer-reviewed
  • 塚原 大樹, 豊岡 祥太, 岩居 健太, 梶川 嘉延
    電子情報通信学会技術研究報告, 124(318) 50-56, Dec, 2024  
  • Kenta Iwai, Takanobu Nishiura
    2024 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 1-5, Dec, 2024  Peer-reviewed, Lead author, Corresponding author
  • Hayata Nakano, Yuting Geng, Kenta Iwai, Takanobu Nishiura
    2024 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 1-6, Dec, 2024  Peer-reviewed
  • Yuki Nakano, Yuting Geng, Kenta Iwai, Takanobu Nishiura
    2024 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 1-5, Dec, 2024  Peer-reviewed
  • Maoto Mizutani, Kenta Iwai, Takanobu Nishiura, Yoshiharu Soeta
    2024 IEEE 13th Global Conference on Consumer Electronics (GCCE), 415-418, Oct 29, 2024  Peer-reviewed
  • Peng Chen, Binh Thien Nguyen, Kenta Iwai, Takanobu Nishiura
    Information, 15(10) 608-608, Oct 4, 2024  Peer-reviewed
    An effective approach to the speech separation problem is to use a time–frequency (T-F) mask. The ideal binary mask (IBM) and ideal ratio mask (IRM) have long been widely used to separate speech signals. However, the IBM is better at improving speech intelligibility, while the IRM is better at improving speech quality. To leverage their respective strengths and overcome their weaknesses, we propose an ideal threshold-based mask (ITM) that combines the two masks. By adjusting two thresholds, the two masks are combined to act jointly on speech separation. We report the impact of different threshold combinations on speech separation performance under ideal conditions and discuss a reasonable range for fine-tuning the thresholds. To evaluate the effectiveness of the proposed method, we conducted supervised speech separation experiments that used the masks as training targets for a deep neural network (DNN) and a long short-term memory (LSTM) network. The results were measured by three objective indicators: the signal-to-distortion ratio (SDR), signal-to-interference ratio (SIR), and signal-to-artifact ratio improvement (SAR). Experimental results show that the proposed mask combines the strengths of the IBM and IRM and imply that the accuracy of speech separation can potentially be further improved by effectively leveraging the advantages of different masks.
  • Subaru Kato, Kenta Iwai, Takanobu Nishiura, Yoshiharu Soeta
    2024 IEEE 13th Global Conference on Consumer Electronics (GCCE), 411-414, Oct, 2024  Peer-reviewed
  • Yuki NAKANO, Kazumichi MIYAZATO, Yuting GENG, Kenta IWAI, Takanobu NISHIURA
    INTER-NOISE and NOISE-CON Congress and Conference Proceedings, 270(7) 4484-4493, Oct, 2024  Peer-reviewed
    Optical laser microphones have attracted attention for acoustic systems capable of recording target speech from a distance. An optical laser microphone measures speech-induced vibration by focusing a laser beam on the surface of the vibrating object. A recording method called rough-focus recording, which uses an unfocused laser beam, enables wide-range recording that is robust against changes in focal length. However, with rough-focus recording, the broad laser beam coverage leads to insufficient intensity of the reflected laser for accurate acoustical signal measurement, resulting in speech-quality degradation, such as the inclusion of noise and attenuation of high-frequency components in the acquired speech. To solve this problem, deep-learning-based speech-enhancement methods for optical laser microphones have been proposed. Such methods require separate models for different focus settings and therefore lack adaptability to changing focus settings. We propose a speech-enhancement method that trains a single model for various focus settings. This model is trained with speech signals recorded across different focus settings to enhance speech recorded in various focus settings. Experimental results indicate that the model trained with the proposed method performs as well as or better than models trained with the conventional method.
  • Maoto MIZUTANI, Kenta IWAI, Takanobu NISHIURA, Yoshiharu SOETA
    INTER-NOISE and NOISE-CON Congress and Conference Proceedings, 270(7) 4516-4525, Oct, 2024  Peer-reviewed
    In active noise control (ANC), the optimal filter is known to become noncausal because of the nonminimum phase components of the secondary path. We have previously investigated how to make the optimal filter causal by replacing the nonminimum phase components with the frequency components of the primary path. Through computer simulations, we confirmed that this filter maintains some noise reduction performance and can be used as a quasi-optimal filter. In this paper, as another approach, we attempt to identify the noncausal components using a pre-trained adaptive noise control filter. The nonminimum phase components of the secondary path exist in the frequency band below the lowest resonance frequency of the secondary loudspeaker. Based on this fact, we first design a noise control filter in advance using band-limited white noise that contains only the low-frequency components below the lowest resonance frequency. Then, we design another adaptive noise control filter using band-limited white noise with the high-frequency components above the lowest resonance frequency. Finally, we combine these filters in the frequency domain. Through computer simulations, we show the difficulty of handling the nonminimum phase components in the ANC system.
  • Peng Chen, Binh Thien Nguyen, Yuting Geng, Kenta Iwai, Takanobu Nishiura
    IEEE Access, 12 152036-152044, Oct, 2024  Peer-reviewed
  • 松浦 亮, 豊岡 祥太, 岩居 健太, 梶川 嘉延
    電子情報通信学会技術研究報告, 124(94) 23-28, Jun, 2024  
  • 岩居 健太, 西浦 敬信
    電子情報通信学会技術研究報告, 123(402) 294-299, Mar, 2024  Lead author, Corresponding author
  • Peng Chen, Yuting Geng, Kenta Iwai, Takanobu Nishiura
    Proceedings of International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP 2024), 43-46, Mar, 2024  Peer-reviewed
  • Yanqiao Yan, Yuting Geng, Kenta Iwai, Takanobu Nishiura
    Proceedings of International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP 2024), 47-50, Mar, 2024  Peer-reviewed
  • Kenta Iwai, Takanobu Nishiura
    APSIPA Transactions on Signal and Information Processing, 12(1), Dec, 2023  Peer-reviewed, Lead author, Corresponding author
  • Kenta Iwai, Takanobu Nishiura
    2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 1150-1154, Oct, 2023  Peer-reviewed, Lead author, Corresponding author
  • Shota Naiki, Shumpei Miura, Kenta Iwai, Takanobu Nishiura, Yoshiharu Soeta
    2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 1221-1225, Oct, 2023  Peer-reviewed
  • Hayata Nakano, Tsubasa Yoshizawa, Yuting Geng, Kenta Iwai, Takanobu Nishiura
    2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2320-2325, Oct, 2023  Peer-reviewed
  • Nguyen Binh Thien, Yukoh Wakabayashi, Yuting Geng, Kenta Iwai, Takanobu Nishiura
    INTERSPEECH 2023, Aug 20, 2023  
  • 岩居 健太
    電子情報通信学会技術研究報告, 123(152) 71-76, Aug, 2023  Invited, Lead author, Last author, Corresponding author
  • Tianyu Xie, Shota Toyooka, Kenta Iwai, Yoshinobu Kajikawa
    Proceedings of International Workshop on Smart Info-Media Systems in Asia, 44-49, Aug, 2023  Peer-reviewed
  • Kohei Izawa, Yuting Geng, Kenta Iwai, Takanobu Nishiura
    INTER-NOISE and NOISE-CON Congress and Conference Proceedings, 268(6) 2456-2464, Aug, 2023  Peer-reviewed
    Cymbals are musical instruments consisting of round, slightly concave brass plates that are struck against one another or with a stick to make sounds. Their vibration characteristics depend on their shape; therefore, some cymbals, called effect cymbals, are specially shaped to obtain desired characteristics. Usually, a craftsman needs numerous trials when prototyping effect cymbals to realize the desired characteristics, which makes prototyping costly. The cost can be significantly reduced if the optimal shape for the desired sound can be estimated in advance. However, to estimate the shape, it is necessary to clarify the relationship between the shape of a cymbal and its vibration characteristics. In this paper, we focus on effect cymbals with holes. Aiming to estimate the optimal hole shape for a desired sound, we investigate the relationship between the hole shape of hole effect cymbals and their vibration characteristics based on frequency response analysis by the finite element method. We analyzed multiple models with different hole sizes for each cymbal, and the results show that the vibration tends to be simpler as the hole size becomes larger.
  • Shota Naiki, Kenta Iwai, Takanobu Nishiura, Yoshiharu Soeta
    INTER-NOISE and NOISE-CON Congress and Conference Proceedings, 268(6) 2465-2476, Aug, 2023  Peer-reviewed
    In this paper, we propose a feedforward active noise control (ANC) system with an optical laser microphone utilizing a proportional-integral-differential (PID) filter. A feedforward ANC system with an optical laser microphone has previously been proposed to relax the causality constraint. This system adopts a first-order differentiator to modify the velocity picked up by the optical laser microphone. This modification improves the coherence between the velocity, used as the reference signal, and the unwanted noise, observed as sound pressure. However, the frequency response of the first-order differentiator is similar to that of a high-pass filter and attenuates the reference signal at low frequencies. This degrades the noise reduction performance of the conventional ANC system. To solve this problem, the proposed ANC system adopts the PID filter instead of the first-order differentiator. By adjusting the coefficients of the PID filter, the proposed system avoids the attenuation of the reference signal at low frequencies, so the noise reduction performance can be improved. Experimental results show that the proposed ANC system reduces the unwanted noise compared to the conventional ANC system and relaxes the causality constraint compared to the basic ANC system in real-time operation.
  • Hayata Nakano, Yuting Geng, Kenta Iwai, Takanobu Nishiura
    INTER-NOISE and NOISE-CON Congress and Conference Proceedings, 268(6) 2444-2455, Aug, 2023  Peer-reviewed
    Recent studies have proposed methods to extract speech signals from captured videos of objects vibrating in response to sound waves. Among them, a method for extracting speech signals from videos captured by a widely used rolling-shutter camera has been attracting attention. A rolling-shutter camera records image data one row of pixels at a time, thereby capturing the vibration of objects caused by sound waves. However, there are time intervals between frames of the videos, resulting in missing segments in the extracted speech signals. The conventional method uses an autoregressive model to interpolate these missing segments but ignores the noise in the extracted speech signals, so the noise remains. In this paper, we propose a method that uses dual rolling-shutter cameras and interpolates the missing segments based on singular spectrum analysis, which takes the noise into account to further improve speech quality. With singular spectrum analysis, the missing segments can be determined using only the speech components of the signals, which are associated with large singular values, thereby reducing the noise. Experimental results show that the proposed method outperforms the conventional methods in terms of quality and intelligibility of the extracted speech signals.
  • Chengkai CAI, Kenta IWAI, Takanobu NISHIURA
    IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, E106.A(4) 647-656, Apr 1, 2023  Peer-reviewed
  • 中野 隼汰, 耿 毓庭, 岩居 健太, 西浦 敬信
    電子情報通信学会技術研究報告, 122(387), Mar, 2023  
  • 内木 正太, 岩居 健太, 西浦 敬信, 添田 喜治
    電子情報通信学会技術研究報告, 122(387), Mar, 2023  
  • 井澤 幸平, 耿 毓庭, 岩居 健太, 西浦 敬信
    電子情報通信学会技術研究報告, 122(387), Mar, 2023  
  • 岩居 健太, 西浦 敬信
    電子情報通信学会技術研究報告, 122(388) 67-72, Mar, 2023  Lead author, Corresponding author
  • Peng Chen, Yuting Geng, Kenta Iwai, Takanobu Nishiura
    NCSP'23, 102-105, Mar, 2023  Peer-reviewed
  • Chengkai Cai, Kenta Iwai, Takanobu Nishiura
    Applied Sciences, 13(3) 1958-1958, Feb 2, 2023  Peer-reviewed
    The development of distant-talk measurement systems has been attracting attention since they can be applied to many situations such as security and disaster relief. One such system that has been proposed uses a device called a laser Doppler vibrometer (LDV) to acquire sound by measuring an object’s vibration caused by the sound source. Unlike traditional microphones, an LDV can pick up the target sound from a distance even in a noisy environment. However, the acquired sounds are greatly distorted due to the object’s shape and frequency response. Because of the particular way in which the observed speech is degraded, conventional methods cannot be effectively applied to LDVs. We propose two speech enhancement methods for LDVs that are based on two-stage processing with deep neural networks. With the first proposed method, the amplitude spectrum of the observed speech is first restored. The phase difference between the observed and clean speech is then estimated using the restored amplitude spectrum. With the other proposed method, the low-frequency components of the observed speech are first restored. The high-frequency components are then estimated from the restored low-frequency components. The evaluation results indicate that both methods improved the observed speech in sound quality, deterioration degree, and intelligibility.
  • Nguyen Binh Thien, Yukoh Wakabayashi, Kenta Iwai, Takanobu Nishiura
    IEEE/ACM Transactions on Audio, Speech, and Language Processing, 31 1667-1680, 2023  
  • Takumi Miyake, Kenta Iwai, Yoshinobu Kajikawa
    IEEE Access, 11 6935-6943, 2023  Peer-reviewed, Corresponding author
  • Nguyen Binh Thien, Yukoh Wakabayashi, Geng Yuting, Kenta Iwai, Takanobu Nishiura
    2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 958-962, Nov 7, 2022  Peer-reviewed
  • Kenta Iwai, Takanobu Nishiura
    2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 281-285, Nov 7, 2022  Peer-reviewed, Lead author, Corresponding author
  • Yanqiao Yan, Binh Thien Nguyen, Yuting Geng, Kenta Iwai, Takanobu Nishiura
    2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2022, 1673-1677, Nov, 2022  Peer-reviewed
    Audio super-resolution (ASR) is a complicated task for generating a high-resolution audio signal from a low-resolution signal. To solve this problem, we propose an ASR system for music signals that involves using deep neural networks in the time-frequency domain. The system has two components: a Wasserstein generative adversarial network-based high frequency magnitude generation model and a fully connected network-based corresponding high frequency band phase estimation model. The conventional high frequency band phase estimation methods require large computational complexity, have slow convergence, and reconstruct low quality high-resolution signals. We compare our proposed high frequency band phase estimation model in the ASR system with conventional phase estimation methods. The results show that our proposed phase estimation model outperforms conventional methods in objective evaluations.
  • Yuya Nakahira, Kenta Iwai, Yoshinobu Kajikawa
    Applied Sciences, 12(21) 10710-10710, Oct 22, 2022  Peer-reviewed
    Nonlinear distortion in loudspeaker systems degrades sound quality and must be properly compensated for by linearization techniques. One technique to reduce nonlinear distortion is to use a Volterra filter, which approximates the nonlinearity of the target loudspeaker using the Volterra series expansion. In general, the Volterra filter is computationally very expensive, and the amount of computation needs to be reduced for real-time processing. In this paper, we propose an efficient implementation of the third-order Volterra filter based on singular value decomposition. The proposed method determines the necessary coefficients based on the symmetry of the third-order Volterra filter and applies singular value decomposition to them. In the filter structure consisting of singular values and their corresponding singular vectors, the computational complexity of the third-order Volterra filter can be reduced by eliminating the parts of the filter with small singular values. By focusing on the magnitude of the singular values, the proposed method can improve the computational efficiency of the third-order Volterra filter without decreasing its approximation accuracy. Simulation results show that the proposed method can improve the computational efficiency by 60% while maintaining a nonlinear distortion compensation performance of about 8 dB for a smartphone micro-speaker.
  • Koki Nakamura, Kenta Iwai, Takanobu Nishiura
    International Congress on Acoustics 2022, Oct, 2022  Peer-reviewed
  • Tsubasa Yoshizawa, Yuting Geng, Kenta Iwai, Takanobu Nishiura
    Acoustics 2022, Oct, 2022  Peer-reviewed
  • Yuna Harada, Yuting Geng, Kenta Iwai, Masato Nakayama, Takanobu Nishiura
    INTER-NOISE and NOISE-CON Congress and Conference Proceedings, 265(4) 3464-3471, Aug, 2022  Peer-reviewed
    3-D sound field reproduction systems can provide a high sense of presence. These systems commonly use electro-dynamic loudspeakers, which tend to construct diffuse sound images due to their wide directivity and high reverberation. In contrast, parametric array loudspeakers can construct sharp sound images due to their sharp directivity and low reverberation. However, it is difficult to provide reverberation presence with parametric array loudspeakers because of their sharp directivity. We have previously proposed a sharp sound image construction method based on reverberation control with a surround sound system using parametric and electro-dynamic loudspeakers. In this method, the sharp sound image is rendered using parametric array loudspeakers, and the reverberation presence is provided by electro-dynamic loudspeakers emitting reverberation signals synthesized with reverberation control filters. Through objective experiments, we have confirmed that this method can construct a sharp sound image with reverberation presence. In this paper, we conduct subjective experiments to confirm whether listeners can perceive the reverberation presence provided by the proposed method. In particular, we evaluated the sharpness and direction of the sound image and the reverberation presence with objective and subjective evaluation experiments. From the subjective evaluation, we confirmed that the reverberation presence can be perceived.
  • Kenta Iwai, Hiromu Suzuki, Takanobu Nishiura
    Applied Sciences, 12(4) 1994-1994, Feb 14, 2022  Peer-reviewed, Lead author, Corresponding author
    In this paper, we propose a three-dimensional (3-D) sound image reproduction method based on spherical harmonic (SH) expansion for 22.2 multichannel audio. 22.2 multichannel audio is a 3-D sound field reproduction system that has been developed for ultra-high definition television (UHDTV). This system can reproduce 3-D sound images by simultaneously driving 22 loudspeakers and two sub-woofers. To control the 3-D sound image, vector base amplitude panning (VBAP) is conventionally used. VBAP can control the direction of 3-D sound image by weighting the input signal and emitting it from three loudspeakers. However, VBAP cannot control the distance of the 3-D sound image because it calculates the weight by only considering the image’s direction. To solve this problem, we propose a novel 3-D sound image reconstruction method based on SH expansion. The proposed method can control both the direction and distance of the 3-D sound image by controlling the sound directivity on the basis of spherical harmonics (SHs) and mode matching. The directivity of the 3-D sound image is obtained in the SH domain. In addition, the distance of the 3-D sound image is represented by the mode strength. The signal obtained by the proposed method is then emitted from loudspeakers and the 3-D sound image can be reproduced accurately with consideration of not only the direction but also the distance. A number of experimental results show that the proposed method can control both the direction and distance of 3-D sound images.
  • Peng Chen, Haonan Wang, Kenta Iwai, Takanobu Nishiura
    NCSP'22, 257-260, Feb, 2022  Peer-reviewed
  • Yuna Harada, Naoto Shimada, Haonan Wang, Kenta Iwai, Masato Nakayama, Takanobu Nishiura
    2021 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 1000-1007, Dec, 2021  Peer-reviewed

Misc.

 1

Presentations

 87

Teaching Experience

 4

Research Projects

 3

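Research Themes

 4
  • Research theme
    Expanding the noise reduction zone of active noise control systems
    Research period (start)
    2025
  • Research theme
    Reducing the latency of active noise control systems
    Research period (start)
    2019
  • Research theme
    Stabilizing adaptive algorithms for acoustic echo and noise cancellers
    Research period (start)
    2021
  • Research theme
    Improving the sound quality of audio reproduction devices using nonlinear digital filters
    Research period (start)
    2010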