Researcher Achievements

岩居 健太

イワイ ケンタ  (IWAI KENTA)

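Basic Information

Affiliation
Associate Professor, Faculty of Information Design, Osaka Sangyo University
Visiting Researcher, Organization for Research and Development of Innovative Science and Technology, Kansai University
Degree
Doctor of Engineering (Kansai University)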

J-GLOBAL ID
201801004052332030
researchmap Member ID
7000024135

External Links

Papers

 113
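  • Kenta Iwai
    IEICE Technical Report 124(390) 275-280 March 2025  Lead author, Last author, Corresponding author
  • 山口 翔, Shota Toyooka, Kenta Iwai, 喜多 俊輔, Yoshinobu Kajikawa
    IEICE Technical Report 124(389) 481-485 March 2025
  • 塚原 大樹, Shota Toyooka, Kenta Iwai, Yoshinobu Kajikawa
    IEICE Technical Report 124(390) 269-274 March 2025
  • 松浦 亮, Shota Toyooka, Kenta Iwai, Yoshinobu Kajikawa
    IEICE Technical Report 124(389) 180-185 March 2025
  • Maoto Mizutani, Kenta Iwai, Takanobu Nishiura, Yoshiharu Soeta
    IEICE Technical Report 124(389) 146-151 March 2025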
  • Binh Thien Nguyen, Yukoh Wakabayashi, Yuting Geng, Kenta Iwai, Takanobu Nishiura
    Acoustical Science and Technology 46(2) 186-190 March 1, 2025  Peer-reviewed
  • Shota Toyooka, Kenta Iwai, Yoshinobu Kajikawa
    Acoustical Science and Technology 2025  Peer-reviewed
  • 塚原 大樹, Shota Toyooka, Kenta Iwai, Yoshinobu Kajikawa
    IEICE Technical Report 124(318) 50-56 December 2024
  • Kenta Iwai, Takanobu Nishiura
    2024 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) 1-5 December 2024  Peer-reviewed, Lead author, Corresponding author
  • Hayata Nakano, Yuting Geng, Kenta Iwai, Takanobu Nishiura
    2024 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) 1-6 December 2024  Peer-reviewed
  • Yuki Nakano, Yuting Geng, Kenta Iwai, Takanobu Nishiura
    2024 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) 1-5 December 2024  Peer-reviewed
  • Maoto Mizutani, Kenta Iwai, Takanobu Nishiura, Yoshiharu Soeta
    2024 IEEE 13th Global Conference on Consumer Electronics (GCCE) 415-418 October 29, 2024  Peer-reviewed
  • Peng Chen, Binh Thien Nguyen, Kenta Iwai, Takanobu Nishiura
    Information 15(10) 608-608 October 4, 2024  Peer-reviewed
    An effective approach to addressing the speech separation problem is utilizing a time–frequency (T-F) mask. The ideal binary mask (IBM) and ideal ratio mask (IRM) have long been widely used to separate speech signals. However, the IBM is better at improving speech intelligibility, while the IRM is better at improving speech quality. To leverage their respective strengths and overcome weaknesses, we propose an ideal threshold-based mask (ITM) to combine these two masks. By adjusting two thresholds, these two masks are combined to jointly act on speech separation. We list the impact of using different threshold combinations on speech separation performance under ideal conditions and discuss a reasonable range for fine tuning the thresholds. To evaluate the effectiveness of the proposed method, we conducted supervised speech separation experiments that used the masks as training targets for a deep neural network (DNN) and a long short-term memory (LSTM) network; the results were measured by three objective indicators: the signal-to-distortion ratio (SDR), signal-to-interference ratio (SIR), and signal-to-artifact ratio improvement (SAR). Experimental results show that the proposed mask combines the strengths of the IBM and IRM and imply that the accuracy of speech separation can potentially be further improved by effectively leveraging the advantages of different masks.
  • Subaru Kato, Kenta Iwai, Takanobu Nishiura, Yoshiharu Soeta
    2024 IEEE 13th Global Conference on Consumer Electronics (GCCE) 411-414 October 2024  Peer-reviewed
  • Yuki NAKANO, Kazumichi MIYAZATO, Yuting GENG, Kenta IWAI, Takanobu NISHIURA
    INTER-NOISE and NOISE-CON Congress and Conference Proceedings 270(7) 4484-4493 October 2024  Peer-reviewed
    Optical laser microphones have attracted attention for acoustic systems capable of recording the target speech from a distance. An optical laser microphone measures the speech-induced vibration by focusing the laser beam on the surface of the vibrating object. A recording method called rough-focus recording, which uses an unfocused laser beam, enables wide-range recording that is robust against changes in focal length. However, with rough-focus recording, the broad laser beam coverage leads to insufficient intensity of the reflected laser for accurate acoustical signal measurement, resulting in speech-quality degradation, such as the inclusion of noise and attenuation of high-frequency components in the acquired speech. To solve this problem, deep-learning-based speech-enhancement methods for optical laser microphones have been proposed. Such methods require separate models for different focus settings and thus lack adaptability to changing focus settings. We propose a speech-enhancement method for training a single model for various focus settings. This model is trained with speech signals recorded across different focus settings to enhance speech recorded in various focus settings. Experimental results indicate that the model trained with the proposed method performs equivalently to or better than the conventional models.
  • Maoto MIZUTANI, Kenta IWAI, Takanobu NISHIURA, Yoshiharu SOETA
    INTER-NOISE and NOISE-CON Congress and Conference Proceedings 270(7) 4516-4525 October 2024  Peer-reviewed
    In active noise control (ANC), it is known that the optimal filter becomes a noncausal filter due to the effect of the nonminimum phase components of the secondary path. We have investigated how to make the optimal filter causal by replacing the nonminimum phase components with the frequency components of the primary path. Through computer simulations, we have confirmed that this filter maintains some noise reduction performance and can be used as a quasi-optimal filter. In this paper, as another approach, we attempt to identify noncausal components using a pre-trained adaptive noise control filter. The nonminimum phase components of the secondary path exist in the frequency band below the lowest resonance frequency of the secondary loudspeaker. Based on this fact, we design a noise control filter in advance by using band-limited white noise containing only the low-frequency components below the lowest resonance frequency. Then, we design another adaptive noise control filter by using band-limited white noise with high-frequency components above the lowest resonance frequency. Finally, we combine these filters in the frequency domain. Through computer simulations, we show the difficulty of handling the nonminimum phase components in the ANC system.
  • Peng Chen, Binh Thien Nguyen, Yuting Geng, Kenta Iwai, Takanobu Nishiura
    IEEE Access 12 152036-152044 October 2024  Peer-reviewed
  • 松浦 亮, Shota Toyooka, Kenta Iwai, Yoshinobu Kajikawa
    IEICE Technical Report 124(94) 23-28 June 2024
  • Kenta Iwai, Takanobu Nishiura
    IEICE Technical Report 123(402) 294-299 March 2024  Lead author, Corresponding author
  • Peng Chen, Yuting Geng, Kenta Iwai, Takanobu Nishiura
    Proceedings of International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP 2024) 43-46 March 2024  Peer-reviewed
  • Yanqiao Yan, Yuting Geng, Kenta Iwai, Takanobu Nishiura
    Proceedings of International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP 2024) 47-50 March 2024  Peer-reviewed
  • Kenta Iwai, Takanobu Nishiura
    APSIPA Transactions on Signal and Information Processing 12(1) December 2023  Peer-reviewed, Lead author, Corresponding author
  • Kenta Iwai, Takanobu Nishiura
    2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) 1150-1154 October 2023  Peer-reviewed, Lead author, Corresponding author
  • Shota Naiki, Shumpei Miura, Kenta Iwai, Takanobu Nishiura, Yoshiharu Soeta
    2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) 1221-1225 October 2023  Peer-reviewed
  • Hayata Nakano, Tsubasa Yoshizawa, Yuting Geng, Kenta Iwai, Takanobu Nishiura
    2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) 2320-2325 October 2023  Peer-reviewed
  • Nguyen Binh Thien, Yukoh Wakabayashi, Yuting Geng, Kenta Iwai, Takanobu Nishiura
    INTERSPEECH 2023 August 20, 2023
  • Kenta Iwai
    IEICE Technical Report 123(152) 71-76 August 2023  Invited, Lead author, Last author, Corresponding author
  • Tianyu Xie, Shota Toyooka, Kenta Iwai, Yoshinobu Kajikawa
    Proceedings of International Workshop on Smart Info-Media Systems in Asia 44-49 August 2023  Peer-reviewed
  • Kohei Izawa, Yuting Geng, Kenta Iwai, Takanobu Nishiura
    INTER-NOISE and NOISE-CON Congress and Conference Proceedings 268(6) 2456-2464 August 2023  Peer-reviewed
    Cymbals are musical instruments consisting of round brass plates with a slightly concave shape that are struck against one another or struck with a stick to make sounds. Their vibration characteristics depend on the shape of the cymbals; therefore, some cymbals, called effect cymbals, are specially shaped to obtain desired characteristics. Usually, a craftsman needs numerous prototyping trials to realize the desired characteristics of effect cymbals. This leads to a large prototyping cost. The cost can be significantly reduced if the optimal shape for the desired sound can be estimated in advance. However, to estimate the shape, it is necessary to clarify the relationship between the shape of a given cymbal and its vibration characteristics. In this paper, we focus on effect cymbals with holes. Aiming to estimate the optimal hole shape for a desired sound, we investigate the relationship between the hole shape of hole effect cymbals and their vibration characteristics based on frequency response analysis by the finite element method. We analyzed multiple models with different hole sizes for each cymbal. The analysis results show that the vibration tends to be simpler as the hole size becomes larger.
  • Shota Naiki, Kenta Iwai, Takanobu Nishiura, Yoshiharu Soeta
    INTER-NOISE and NOISE-CON Congress and Conference Proceedings 268(6) 2465-2476 August 2023  Peer-reviewed
    In this paper, we propose a feedforward active noise control (ANC) system with an optical laser microphone utilizing a proportional-integral-differential (PID) filter. The feedforward ANC system with the optical laser microphone has been proposed to relax the causality constraint. This system adopts a first-order differentiator to modify the velocity picked up by the optical laser microphone. This modification achieves the improvement of the coherence between the velocity as the reference signal and the unwanted noise as the sound pressure. However, the frequency response of the first-order differentiator is similar to the high-pass filter and attenuates the reference signal at low frequencies. This causes the degradation of the noise reduction performance of the conventional ANC system. To solve this problem, the proposed ANC system adopts the PID filter instead of the first-order differentiator. By adjusting the coefficients of the PID filter, the proposed system avoids the attenuation of the reference signal at low frequencies. Then, the noise reduction performance can be improved. Experimental results show that the proposed ANC system reduces the unwanted noise compared to the conventional ANC system and relaxes the causality constraint compared to the basic ANC system in the real-time operation.
  • Hayata Nakano, Yuting Geng, Kenta Iwai, Takanobu Nishiura
    INTER-NOISE and NOISE-CON Congress and Conference Proceedings 268(6) 2444-2455 August 2023  Peer-reviewed
    Recent studies have proposed methods to extract speech signals from captured videos of objects vibrated by sound waves. Among them, a method for extracting speech signals from videos captured by a widely used rolling-shutter camera has been attracting attention. A rolling-shutter camera records image data one row of pixels at a time, thereby capturing the vibration of objects caused by sound waves. However, there are time intervals between frames of the videos, resulting in missing segments in the extracted speech signals. The conventional method uses an autoregressive model to interpolate these missing segments. However, the conventional method ignores the noise in the extracted speech signals, and therefore the noise remains. In this paper, we propose a method that uses dual rolling-shutter cameras and interpolates missing segments based on singular spectrum analysis, which takes the noise into account to further improve speech quality. By using singular spectrum analysis, the missing segments can be determined using only the speech components in the signals, which are related to large singular values, thereby reducing the noise. Experimental results show that the proposed method outperforms the conventional methods in terms of quality and intelligibility of the extracted speech signals.
  • Chengkai CAI, Kenta IWAI, Takanobu NISHIURA
    IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E106.A(4) 647-656 April 1, 2023  Peer-reviewed
  • Hayata Nakano, Yuting Geng, Kenta Iwai, Takanobu Nishiura
    IEICE Technical Report 122(387) March 2023
  • Shota Naiki, Kenta Iwai, Takanobu Nishiura, Yoshiharu Soeta
    IEICE Technical Report 122(387) March 2023
  • Kohei Izawa, Yuting Geng, Kenta Iwai, Takanobu Nishiura
    IEICE Technical Report 122(387) March 2023
  • Kenta Iwai, Takanobu Nishiura
    IEICE Technical Report 122(388) 67-72 March 2023  Lead author, Corresponding author
  • Peng Chen, Yuting Geng, Kenta Iwai, Takanobu Nishiura
    NCSP'23 102-105 March 2023  Peer-reviewed
  • Chengkai Cai, Kenta Iwai, Takanobu Nishiura
    Applied Sciences 13(3) 1958-1958 February 2, 2023  Peer-reviewed
    The development of distant-talk measurement systems has been attracting attention since they can be applied to many situations such as security and disaster relief. One such system that uses a device called a laser Doppler vibrometer (LDV) to acquire sound by measuring an object’s vibration caused by the sound source has been proposed. Different from traditional microphones, an LDV can pick up the target sound from a distance even in a noisy environment. However, the acquired sounds are greatly distorted due to the object’s shape and frequency response. Due to the particularity of the degradation of observed speech, conventional methods cannot be effectively applied to LDVs. We propose two speech enhancement methods that are based on two-stage processing with deep neural networks for LDVs. With the first proposed method, the amplitude spectrum of the observed speech is first restored. The phase difference between the observed and clean speech is then estimated using the restored amplitude spectrum. With the other proposed method, the low-frequency components of the observed speech are first restored. The high-frequency components are then estimated by the restored low-frequency components. The evaluation results indicate that they improved the observed speech in sound quality, deterioration degree, and intelligibility.
  • Nguyen Binh Thien, Yukoh Wakabayashi, Kenta Iwai, Takanobu Nishiura
    IEEE/ACM Transactions on Audio, Speech, and Language Processing 31 1667-1680 2023
  • Takumi Miyake, Kenta Iwai, Yoshinobu Kajikawa
    IEEE Access 11 6935-6943 2023  Peer-reviewed, Corresponding author
  • Nguyen Binh Thien, Yukoh Wakabayashi, Geng Yuting, Kenta Iwai, Takanobu Nishiura
    2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) 958-962 November 7, 2022  Peer-reviewed
  • Kenta Iwai, Takanobu Nishiura
    2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) 281-285 November 7, 2022  Peer-reviewed, Lead author, Corresponding author
  • Yanqiao Yan, Binh Thien Nguyen, Yuting Geng, Kenta Iwai, Takanobu Nishiura
    2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2022 1673-1677 November 2022  Peer-reviewed
    Audio super-resolution (ASR) is a complicated task for generating a high-resolution audio signal from a low-resolution signal. To solve this problem, we propose an ASR system for music signals that involves using deep neural networks in the time-frequency domain. The system has two components: a Wasserstein generative adversarial network-based high frequency magnitude generation model and a fully connected network-based corresponding high frequency band phase estimation model. The conventional high frequency band phase estimation methods require large computational complexity, have slow convergence, and reconstruct low quality high-resolution signals. We compare our proposed high frequency band phase estimation model in the ASR system with conventional phase estimation methods. The results show that our proposed phase estimation model outperforms conventional methods in objective evaluations.
  • Yuya Nakahira, Kenta Iwai, Yoshinobu Kajikawa
    Applied Sciences 12(21) 10710-10710 October 22, 2022  Peer-reviewed
    Nonlinear distortion in loudspeaker systems degrades sound quality and must be properly compensated for by linearization techniques. One technique to reduce nonlinear distortion is to use a Volterra filter, which approximates the nonlinearity of the target loudspeaker using the Volterra series expansion. In general, the Volterra filter is computationally very expensive, and the amount of computation needs to be reduced for real-time processing. In this paper, we propose an efficient implementation of the third-order Volterra filter based on singular value decomposition. The proposed method determines the necessary coefficients based on the symmetry of the third-order Volterra filter and applies singular value decomposition to them. In the filter structure consisting of singular values and their corresponding singular vectors, the computational complexity of the third-order Volterra filter can be reduced by eliminating the part of the filter with small singular values. By focusing on the magnitude of the singular values, the proposed method can improve the computational efficiency of the third-order Volterra filter without decreasing its approximation accuracy. Simulation results show that the proposed method can improve the computational efficiency by 60% while maintaining the nonlinear distortion compensation performance of the micro-speaker for smartphones by about 8 dB.
  • Koki Nakamura, Kenta Iwai, Takanobu Nishiura
    International Congress on Acoustics 2022 October 2022  Peer-reviewed
  • Tsubasa Yoshizawa, Yuting Geng, Kenta Iwai, Takanobu Nishiura
    Acoustics 2022 October 2022  Peer-reviewed
  • Yuna Harada, Yuting Geng, Kenta Iwai, Masato Nakayama, Takanobu Nishiura
    INTER-NOISE and NOISE-CON Congress and Conference Proceedings 265(4) 3464-3471 August 2022  Peer-reviewed
    3-D sound field reproduction systems can provide a high sense of presence. These systems commonly use electro-dynamic loudspeakers. Electro-dynamic loudspeakers tend to construct diffuse sound images due to their wide directivity and high reverberation. In contrast, parametric array loudspeakers can construct sharp sound images due to their sharp directivity and low reverberation. However, it is difficult to provide reverberation presence with parametric array loudspeakers because of the sharp directivity. We have previously proposed a sharp sound image construction based on reverberation control with a surround sound system using parametric and electro-dynamic loudspeakers. In this method, the sharp sound image is rendered using parametric array loudspeakers, and the reverberation presence is provided by electro-dynamic loudspeakers, emitting reverberation signals synthesized with reverberation control filters. Through objective experiments, we have confirmed that this method can construct the sharp sound image with reverberation presence. In this paper, we conduct subjective experiments to confirm whether listeners can perceive the reverberation presence provided by the proposed method. In particular, we evaluated the sharpness and direction of the sound image, and the reverberation presence with objective and subjective evaluation experiments. From the subjective evaluation, we confirmed that the reverberation presence can be perceived.
  • Kenta Iwai, Hiromu Suzuki, Takanobu Nishiura
    Applied Sciences 12(4) 1994-1994 February 14, 2022  Peer-reviewed, Lead author, Corresponding author
    In this paper, we propose a three-dimensional (3-D) sound image reproduction method based on spherical harmonic (SH) expansion for 22.2 multichannel audio. 22.2 multichannel audio is a 3-D sound field reproduction system that has been developed for ultra-high definition television (UHDTV). This system can reproduce 3-D sound images by simultaneously driving 22 loudspeakers and two sub-woofers. To control the 3-D sound image, vector base amplitude panning (VBAP) is conventionally used. VBAP can control the direction of 3-D sound image by weighting the input signal and emitting it from three loudspeakers. However, VBAP cannot control the distance of the 3-D sound image because it calculates the weight by only considering the image’s direction. To solve this problem, we propose a novel 3-D sound image reconstruction method based on SH expansion. The proposed method can control both the direction and distance of the 3-D sound image by controlling the sound directivity on the basis of spherical harmonics (SHs) and mode matching. The directivity of the 3-D sound image is obtained in the SH domain. In addition, the distance of the 3-D sound image is represented by the mode strength. The signal obtained by the proposed method is then emitted from loudspeakers and the 3-D sound image can be reproduced accurately with consideration of not only the direction but also the distance. A number of experimental results show that the proposed method can control both the direction and distance of 3-D sound images.
  • Peng Chen, Haonan Wang, Kenta Iwai, Takanobu Nishiura
    NCSP'22 257-260 February 2022  Peer-reviewed
  • Yuna Harada, Naoto Shimada, Haonan Wang, Kenta Iwai, Masato Nakayama, Takanobu Nishiura
    2021 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) 1000-1007 December 2021  Peer-reviewed

MISC

 1

Lectures and Oral Presentations

 87

Courses Taught

 4

Joint Research and Competitive Funding Projects

 3

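Research Themes

 4
  • Research theme
    Expanding the noise reduction zone of active noise control systems
    Research period (start)
    2025
  • Research theme
    Reducing the latency of active noise control systems
    Research period (start)
    2019
  • Research theme
    Stabilizing adaptive algorithms for acoustic echo and noise cancellers
    Research period (start)
    2021
  • Research theme
    Improving the sound quality of audio reproduction devices using nonlinear digital filters
    Research period (start)
    2010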