Research Achievements

高橋 徹

タカハシ トオル  (Takahashi Toru)

Basic Information

Affiliation
Professor, Department of Information Systems Engineering, Faculty of Design Technology, Osaka Sangyo University
Degree
Doctor of Engineering (Nagoya Institute of Technology)

Researcher Number
30419494
J-GLOBAL ID
201201026236304402
researchmap Member ID
7000000887

External Links

Papers

 132

MISC

 109
  • Yasuharu Hirasawa, Toru Takahashi, Tetsuya Ogata, Hiroshi G. Okuno
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 6703(1) 348-358 2011  Refereed
    In real-world situations, people often hear more than two simultaneous sounds. For robots, when the number of sound sources exceeds that of sensors, the situation is called under-determined, and robots with two ears need to deal with this situation. Some studies on under-determined sound source separation use L1-norm minimization methods, but the performance of automatic speech recognition with separated speech signals is poor due to its spectral distortion. In this paper, a two-stage separation method to improve separation quality with low computational cost is presented. The first stage uses a L1-norm minimization method in order to extract the harmonic structures. The second stage exploits reliable harmonic structures to maintain acoustic features. Experiments that simulate three utterances recorded by two microphones in an anechoic chamber show that our method improves speech recognition correctness by about three points and is fast enough for real-time separation. © 2011 Springer-Verlag.
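The first stage described above solves an L1-norm minimization per time-frequency bin. As a rough illustration only (not the paper's implementation, which operates on complex STFT coefficients), the toy sketch below solves the same problem for a real-valued 2-microphone, 3-source mixture via linear programming; the mixing matrix and source values are hypothetical.

```python
# Toy sketch of per-bin L1-norm source estimation for an under-determined
# mixture (3 sources, 2 mics): solve  min ||s||_1  s.t.  A s = x.
import numpy as np
from scipy.optimize import linprog

def l1_min_sources(A, x):
    """A: (n_mics, n_srcs) mixing matrix, x: (n_mics,) observation."""
    n = A.shape[1]
    # Split s = s_plus - s_minus with both parts >= 0; minimize their sum.
    c = np.ones(2 * n)
    A_eq = np.hstack([A, -A])
    res = linprog(c, A_eq=A_eq, b_eq=x, bounds=[(0, None)] * (2 * n))
    return res.x[:n] - res.x[n:]

A = np.array([[1.0, 0.7, 0.3],
              [0.2, 0.8, 1.0]])      # 2 mics, 3 sources
s_true = np.array([0.0, 1.2, -0.5])  # a sparse-ish source vector
x = A @ s_true
print(l1_min_sources(A, x))          # close to s_true when sources are sparse
```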
  • Yang Zhang, Shun Nishide, Toru Takahashi, Hiroshi G. Okuno, Tetsuya Ogata
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2011, PT I 6791 167-175 2011  Refereed
    Our goal is to develop a system that is able to learn and classify environmental sounds for robots working in the real world. In the real world, two main restrictions apply to learning. First, the system has to learn from only a small amount of data in a limited time because of hardware restrictions. Second, it has to adapt to unknown data, since it is virtually impossible to collect samples of all environmental sounds. We used a neuro-dynamical model to build a prediction and classification system that can self-organize sound classes into its parameters by learning samples. The proposed system searches the parameter space for classification. In the experiment, we evaluated the classification accuracy for known and unknown sound classes.
  • Yasuharu Hirasawa, Naoki Yasuraoka, Toru Takahashi, Tetsuya Ogata, Hiroshi G. Okuno
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5 1756-1759 2011  Refereed
    This paper presents an efficient algorithm to solve Lp-norm minimization problem for under-determined speech separation; that is, for the case that there are more sound sources than microphones. We employ an auxiliary function method in order to derive update rules under the assumption that the amplitude of each sound source follows generalized Gaussian distribution. Experiments reveal that our method solves the L1-norm minimization problem ten times faster than a general solver, and also solves Lp-norm minimization problem efficiently, especially when the parameter p is small; when p is not more than 0.7, it runs in real-time without loss of separation quality.
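The auxiliary-function update in this paper is derived for complex spectra under a generalized Gaussian source prior; the sketch below shows the same idea in its simplest real-valued form, an iteratively reweighted least-squares update for min ||s||_p^p subject to As = x. It is an illustration under those simplifying assumptions, not the authors' algorithm.

```python
# Minimal auxiliary-function / iteratively-reweighted-least-squares sketch
# for Lp-norm (p < 2) minimization under a linear mixing constraint.
import numpy as np

def lp_min_sources(A, x, p=0.7, n_iter=50, eps=1e-8):
    """Approximately solve  min ||s||_p^p  subject to  A s = x."""
    s = np.linalg.pinv(A) @ x            # least-norm initial guess
    for _ in range(n_iter):
        # Quadratic upper bound of |s_i|^p gives weights w_i = |s_i|^(p-2).
        w = (np.abs(s) + eps) ** (p - 2)
        Winv = np.diag(1.0 / w)
        # Closed-form weighted least-squares solution on the constraint set.
        s = Winv @ A.T @ np.linalg.solve(A @ Winv @ A.T, x)
    return s

A = np.array([[1.0, 0.7, 0.3],
              [0.2, 0.8, 1.0]])
x = A @ np.array([0.0, 1.2, -0.5])
print(lp_min_sources(A, x, p=0.7))   # converges toward a sparse solution
```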
  • Hiromitsu Awano, Shun Nishide, Hiroaki Arie, Jun Tani, Toru Takahashi, Hiroshi G. Okuno, Tetsuya Ogata
    NEURAL INFORMATION PROCESSING, PT III 7064 323-+ 2011  Refereed
    The objective of our study is to find out how a sparse structure affects the performance of a recurrent neural network (RNN). Only a few existing studies have dealt with sparse structure in RNNs trained with algorithms such as Back Propagation Through Time (BPTT). In this paper, we propose an RNN with sparse connections trained by BPTT, the Multiple Timescale RNN (MTRNN), and investigate how sparse connections affect generalization performance and noise robustness. In experiments using data composed of alphabetic sequences, the MTRNN showed the best generalization performance when the connection rate was 40%. We also measured the sparseness of neural activity and found that it corresponds to generalization performance. These results mean that sparse connections improve learning performance and that the sparseness of neural activity could be used as a metric of generalization performance.
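The sparse-connection idea above can be illustrated with a fixed binary mask over the recurrent weight matrix, applied both to the weights and to their gradients so that pruned connections stay pruned. This is a sketch of the concept only, not the authors' MTRNN code; the sizes and initialization are arbitrary.

```python
# Sketch: keep only a given fraction of recurrent connections (the paper
# reports the best generalization at a 40% connection rate).
import numpy as np

rng = np.random.default_rng(0)

def sparse_mask(n_units, connection_rate):
    """Binary mask keeping roughly `connection_rate` of all connections."""
    return (rng.random((n_units, n_units)) < connection_rate).astype(float)

n = 50
W = rng.normal(0.0, 1.0 / np.sqrt(n), (n, n))
mask = sparse_mask(n, 0.4)
W_sparse = W * mask
# During BPTT the same mask would be applied to the weight gradient,
# e.g.  W_sparse -= lr * (dL_dW * mask),  so masked weights remain zero.
```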
  • Yang Zhang, Tetsuya Ogata, Shun Nishide, Toru Takahashi, Hiroshi G. Okuno
    Proc. of Joint 5th Int. Conf. on Soft Computing and Intelligent Systems and 11th International Symposium on Advanced Intelligent Systems (SCIS & ISIS 2010) 378-383 Dec 2010  Refereed
  • 水本 武志, 辻野 広司, 高橋 徹, 駒谷 和範, 尾形 哲也, 奥乃 博
    Journal of Information Processing Society of Japan (IPSJ Journal) 51(10) 2007-2019 Oct 15, 2010  
    We present a model of the theremin's pitch and volume characteristics and a motion generation method for a theremin-playing robot, toward an ensemble between humans and robots. A theremin is an electronic instrument played by moving the player's hands: pitch and volume are controlled continuously without any physical contact with the instrument, so the approach is highly portable and applicable to robots with different hardware configurations. The main problems in theremin playing are that (1) there is no physical reference point for motion generation, so learning to play requires many training samples, and (2) the playing characteristics change with the electrostatic environment, so adaptive motion generation is required. To solve them, we built pitch and volume characteristic models that express the environmental influence as parameters, and developed a model-based feedforward arm control method that can play an arbitrary note within the instrument's range from a small number of measurements. Experiments confirmed that pitch can be controlled arbitrarily from about 12 measured points and that the desired pitch and volume can be played even when the environment changes; this was confirmed under four environments and on three different robots.
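As a loose illustration of the calibrate-then-invert idea above: fit a parametric pitch-versus-hand-position curve from a handful of measured points, then invert it to find the arm position for a target pitch. The model form below is an assumption for illustration, not the paper's characteristic model, and all numbers are hypothetical.

```python
# Illustrative sketch: fit a pitch model from ~12 calibration points,
# then invert it to command an arm position for a desired pitch.
import numpy as np
from scipy.optimize import curve_fit

def pitch_model(x, a, b, c):
    """Assumed model: pitch (Hz) rises as the hand nears the antenna."""
    return a / (x + b) + c

x_meas = np.linspace(0.05, 0.60, 12)              # hand positions (m)
f_meas = pitch_model(x_meas, 60.0, 0.08, 150.0)   # hypothetical measurements

params, _ = curve_fit(pitch_model, x_meas, f_meas, p0=(50.0, 0.1, 100.0))
a, b, c = params

def position_for_pitch(f_target):
    """Invert the fitted model: where to place the hand for a target pitch."""
    return a / (f_target - c) - b

print(position_for_pitch(440.0))   # arm position in metres for A4
```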
  • Takuma Otsuka, Kazuhiro Nakadai, Toru Takahashi, Tetsuya Ogata, Hiroshi G. Okuno
    Proceedings of IEEE/RSJ-2010 Workshop on Robots and Musical Expression, CD-ROM, Oct 2010  Refereed
  • Takeshi Mizumoto, Angelica Lim, Takuma Otsuka, Kazuhiro Nakadai, Toru Takahashi, Tetsuya Ogata, Hiroshi G. Okuno
    Proceedings of IEEE/RSJ-2010 Workshop on Robots and Musical Expression, CD-ROM, 159-171, Oct 2010  Refereed
  • Angelica Lim, Takeshi Mizumoto, Toru Takahashi, Tetsuya Ogata, Hiroshi G. Okuno
    Proceedings of IEEE/RSJ-2010 Workshop on Robots and Musical Expression, CD-ROM, Oct 2010  Refereed
  • Shinpei Aso, Takuya Saitou, Masataka Goto, Katsutoshi Itoyama, Toru Takahashi, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
    Proceedings of the 13th International Conference on Digital Audio Effects (DAFx-10) Sep 2010  Refereed
  • 奥乃 博, 中臺 一博, 高橋 徹
    Proceedings of the IEICE Society Conference 2010 SS-72-SS-73 Aug 31, 2010  
  • Akira Maezawa, Katsutoshi Itoyama, Toru Takahashi, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
    Proceedings of the 11th International Conference on Music Information Retrieval (ISMIR 2010) Aug 2010  Refereed
  • 安良岡 直希, 糸山克寿, 吉岡 拓也, 高橋 徹, 駒谷 和範, 尾形 哲也, 奥乃 博
    IPSJ SIG Technical Reports, Music and Computer (MUS) 2010(20) 1-8 Jul 21, 2010  
    This paper presents a music manipulation system that enables a user to replace an instrument performance phrase in a polyphonic audio mixture. Two technical problems must be solved to realize this system: (1) separating the target part from the accompaniment, and (2) synthesizing a new instrument performance that has the timbre and expression of the original one. Our method first performs the separation using a statistical model that integrates a harmonic-plus-inharmonic Gaussian mixture model (GMM) with nonnegative matrix factorization. It then synthesizes a new performance by transferring the acoustic characteristics given by the GMM parameters, such as fundamental frequency and harmonic intensities, onto a MIDI-synthesizer-generated sound for the new score. Two objective evaluations, (1) whether the original performance is correctly removed and (2) whether the new performance retains the characteristics of the original, confirm the effectiveness of the proposed method.
  • 前澤 陽, 後藤 真孝, 高橋 徹, 駒谷 和範, 尾形 哲也, 奥乃 博
    Proceedings of the 72nd IPSJ National Convention 143-144 Mar 8, 2010  
  • 安良岡 直希, 糸山 克寿, 高橋 徹, 駒谷 和範, 尾形 哲也, 奥乃 博
    Proceedings of the 72nd IPSJ National Convention 183-184 Mar 8, 2010  
  • 水本 武志, 大塚 琢馬, 高橋 徹, 駒谷 和範, 尾形 哲也, 奥乃 博
    Proceedings of the 72nd IPSJ National Convention 201-202 Mar 8, 2010  
  • 水本 武志, 高橋 徹, 駒谷 和範, 尾形 哲也, 奥乃 博
    Proceedings of the 72nd IPSJ National Convention 203-204 Mar 8, 2010  
  • 平澤 恭治, 高橋 徹, 駒谷 和範, 尾形 哲也, 奥乃 博
    Proceedings of the 72nd IPSJ National Convention 253-254 Mar 8, 2010  
  • 山川 暢英, 北原 鉄朗, 高橋 徹, 駒谷 和範, 尾形 哲也, 奥乃 博
    Proceedings of the 72nd IPSJ National Convention 257-258 Mar 8, 2010  
  • 穐山 空道, 駒谷 和範, 高橋 徹, 尾形 哲也, 奥乃 博
    Proceedings of the 72nd IPSJ National Convention 291-292 Mar 8, 2010  
  • 阿曽 慎平, 齋藤 毅, 後藤 真孝, 糸山 克寿, 高橋 徹, 駒谷 和範, 尾形 哲也, 奥乃 博
    Proceedings of the 72nd IPSJ National Convention 295-296 Mar 8, 2010  
  • 粟野 皓光, 尾形 哲也, 高橋 徹, 駒谷 和範, 奥乃 博
    Proceedings of the 72nd IPSJ National Convention 395-396 Mar 8, 2010  
  • 日下 航, 有江 浩明, 谷 淳, 尾形 哲也, 高橋 徹, 駒谷 和範, 奥乃 博
    Proceedings of the 72nd IPSJ National Convention 525-526 Mar 8, 2010  
  • 武田 龍, 中臺 一博, 高橋 徹, 駒谷 和範, 尾形 哲也, 奥乃 博
    Proceedings of the 72nd IPSJ National Convention 27-28 Mar 8, 2010  
  • 高橋 徹, 中臺 一博, 駒谷 和範, 尾形 哲也, 奥乃 博
    Proceedings of the 72nd IPSJ National Convention 29-30 Mar 8, 2010  
  • 松山 匡子, 駒谷 和範, 高橋 徹, 尾形 哲也, 奥乃 博
    Proceedings of the 72nd IPSJ National Convention 129-130 Mar 8, 2010  
  • 山川暢英, 高橋徹, 北原鉄朗, 尾形哲也, 奥乃博
    Proceedings of the 28th Annual Conference of the Robotics Society of Japan (CD-ROM), paper no. 1H2-4, 2010  
  • Toru Takahashi, Kazuhiro Nakadai, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
    2010 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA) 470-475 2010  Refereed
    This paper describes an improvement in sound source separation for a simultaneous automatic speech recognition (ASR) system on a humanoid robot. Recognition errors in the system are caused by separation errors and interference from other sources. To improve separability, we extend the original geometric source separation (GSS): our GSS uses the robot's measured head-related transfer function (HRTF) to estimate the separation matrix. The original GSS uses a simulated HRTF calculated from the distance between microphone and sound source, and the large mismatch between simulated and measured transfer functions severely degrades recognition performance. Faster convergence of the separation matrix reduces separation error; our approach provides an initial separation matrix, based on the measured transfer function, that is closer to the optimal separation matrix than one based on a simulated function, so we expect faster convergence. Our GSS also handles an adaptive step-size parameter. These new features are included in the open-source robot audition software "HARK", newly updated as version 1.0.0. HARK has been installed on an HRP-2 humanoid with an 8-element microphone array. The listening capability of HRP-2 is evaluated by recognizing a target speech signal separated from simultaneous speech by three talkers. The word correct rate (WCR) of ASR improves by 5 points under normal acoustic environments and by 10 points under noisy environments. Experimental results show that HARK 1.0.0 improves robustness against noise.
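The initialization idea above (start adaptation near the optimum by using measured transfer functions) can be sketched per frequency bin as a pseudo-inverse of the measured steering matrix. This is one plausible reading of the abstract, not HARK's actual GSS implementation, and the matrices here are random stand-ins.

```python
# Sketch: initialize an adaptive separation matrix W0 from a *measured*
# transfer-function (steering) matrix so that W0 @ H is already near I,
# rather than starting from a simulated free-field matrix.
import numpy as np

n_mics, n_srcs = 8, 3
rng = np.random.default_rng(1)

# Measured transfer functions for one frequency bin: (n_mics, n_srcs).
H_measured = (rng.normal(size=(n_mics, n_srcs))
              + 1j * rng.normal(size=(n_mics, n_srcs)))

W0 = np.linalg.pinv(H_measured)          # initial separation matrix
print(np.round(np.abs(W0 @ H_measured), 2))   # ~ identity: good start point
```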
  • Ryu Takeda, Kazuhiro Nakadai, Toru Takahashi, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
    2010 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA) 4366-4371 2010  Refereed
    This paper presents the upper-limit evaluation of robot audition based on ICA-BSS in multi-source, barge-in, and highly reverberant conditions. The goal is that the robot can automatically distinguish a target speech from its own speech and other sound sources in a reverberant environment. We focus on multi-channel semi-blind ICA (MCSB-ICA), one of the sound source separation methods with a microphone array, to achieve such an audition system, because it can separate sound source signals including reverberations with few assumptions on environments. The evaluation of MCSB-ICA has so far been limited to the robot's own speech separation and reverberation separation. In this paper, we evaluate MCSB-ICA extensively by applying it to multi-source separation problems under common reverberant environments. Experimental results prove that MCSB-ICA outperforms conventional ICA by 30 points in automatic speech recognition performance.
  • Takuma Otsuka, Takeshi Mizumoto, Kazuhiro Nakadai, Toru Takahashi, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
    TRENDS IN APPLIED INTELLIGENT SYSTEMS, PT I, PROCEEDINGS 6096 102-+ 2010  Refereed
    Our goal is to achieve a musical ensemble among a robot and human musicians where the robot listens to the music with its own microphones. The main issues are (1) robust beat-tracking, since the robot hears its own generated sounds in addition to the accompanied music, and (2) robustly synchronizing its performance with the accompanied music even if the humans' musical performance fluctuates. This paper presents a music-ensemble Theremin-playing robot implemented on the humanoid HRP-2 with the following three functions: (1) self-generated Theremin sound suppression by semi-blind Independent Component Analysis, (2) beat tracking robust against tempo fluctuation in humans' performance, and (3) feedforward control of Theremin pitch. Experimental results with a human drummer show the capability of this robot for adapting to the temporal fluctuation in his performance.
  • Kyoko Matsuyama, Kazunori Komatani, Toru Takahashi, Tetsuya Ogata, Hiroshi G. Okuno
    TRENDS IN APPLIED INTELLIGENT SYSTEMS, PT II, PROCEEDINGS 6097 585-594 2010  Refereed
    We describe a novel dialogue strategy enabling robust interaction under noisy environments where automatic speech recognition (ASR) results are not necessarily reliable. We have developed a method that exploits utterance timing together with ASR results to interpret user intention, that is, to identify the one item that a user wants to indicate from the system. The timing of utterances containing referential expressions is approximated by a Gamma distribution, which is integrated with ASR results by expressing both of them as probabilities. In this paper, we improve the identification accuracy by extending the method. First, we enable interpretation of utterances including ordinal numbers, which appear several times in our data collected from users. Then we use proper acoustic models and parameters, improving the identification accuracy by 4.0% in total. We also show that Latent Semantic Mapping enables more expressions to be handled in our framework.
  • Akira Maezawa, Katsutoshi Itoyama, Toru Takahashi, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
    TRENDS IN APPLIED INTELLIGENT SYSTEMS, PT III, PROCEEDINGS 6098 249-259 2010  Refereed
    This work presents an automated violin fingering estimation method that helps a student violinist acquire the "sound" of his/her favorite recording artist created by the artist's unique fingering. Our method realizes this by analyzing an audio recording played by the artist and recovering the most playable fingering that recreates the aural characteristics of the recording. Recovering the aural characteristics requires estimating the bowed string from the audio recording, and using the estimated result for optimal fingering decision. The former requires high accuracy and robustness against the use of different violins or brands of strings; the latter needs to create a natural fingering for the violinist. We solve the first problem by detecting estimation errors using rule-based algorithms, and by adapting the estimator to the recording based on mean normalization. We solve the second problem by incorporating, in addition to the generic stringed-instrument model used in existing studies, a fingering model based on pedagogical practices of violin playing, defined on a sequence of two or three notes. The accuracy of the bowed string estimator improved by 21 points in a realistic situation (38% to 59%) by incorporating error correction and mean normalization. Subjective evaluation of the optimal fingering decision algorithm by seven violinists on 22 musical excerpts showed that our proposed model was preferred over the model used in existing studies (p = 0.01), but no significant preference for the proposed method defined on sequences of two notes versus three notes was observed (p = 0.05).
  • Takuma Otsuka, Kazuhiro Nakadai, Toru Takahashi, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
    PROCEEDINGS OF THE TWENTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-10) 1238-1244 2010  Refereed
    Our goal is to develop an interactive music robot, i.e., a robot that presents a musical expression together with humans. A music interaction requires two important functions: synchronization with the music and musical expression, such as singing and dancing. Many instrument-performing robots are only capable of the latter function; thus, they may have difficulty in playing live with human performers. The synchronization function is critical for the interaction. We classify synchronization and musical expression into two levels: (1) the rhythm level and (2) the melody level. Two issues in achieving two-layer synchronization and musical expression are: (1) simultaneous estimation of the rhythm structure and the current part of the music and (2) derivation of the estimation confidence to switch behavior between the rhythm level and the melody level. This paper presents a score following algorithm, incremental audio to score alignment, that conforms to the two-level synchronization design using a particle filter. Our method estimates the score position for the melody level and the tempo for the rhythm level. The reliability of the score position estimation is extracted from the probability distribution of the score position. Experiments are carried out using polyphonic jazz songs. The results confirm that our method switches levels in accordance with the difficulty of the score estimation. When the tempo of the music is less than 120 beats per minute (bpm), the estimated score positions are accurate and reported; when the tempo is over 120 bpm, the system tends to report only the tempo to suppress errors in the reported score position predictions.
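The two-level design above can be illustrated with a toy particle filter that tracks score position and tempo jointly and switches levels based on the spread of the position posterior. The observation model and the confidence rule below are stand-ins for illustration, not the paper's audio-matching model.

```python
# Toy particle filter: track score position (beats) and tempo (bpm);
# report position (melody level) only when the posterior is concentrated,
# otherwise report tempo only (rhythm level).
import numpy as np

rng = np.random.default_rng(0)
N = 500
pos = rng.uniform(0.0, 1.0, N)           # score position in beats
tempo = rng.uniform(100.0, 140.0, N)     # beats per minute
w = np.ones(N) / N

def observe_likelihood(pos, observed_pos, sigma=0.5):
    # Stand-in for matching audio features against the score.
    return np.exp(-0.5 * ((pos - observed_pos) / sigma) ** 2)

dt = 0.5                                  # seconds between observations
for observed_pos in [1.0, 2.1, 2.9, 4.2]:
    pos += tempo / 60.0 * dt + rng.normal(0, 0.05, N)   # predict
    tempo += rng.normal(0, 1.0, N)
    w *= observe_likelihood(pos, observed_pos)           # update
    w /= w.sum()
    idx = rng.choice(N, N, p=w)                          # resample
    pos, tempo, w = pos[idx], tempo[idx], np.ones(N) / N
    confidence = 1.0 / (1.0 + np.std(pos))               # posterior spread
    if confidence > 0.7:
        print(f"melody level: position ~ {pos.mean():.2f} beats")
    else:
        print(f"rhythm level: tempo ~ {tempo.mean():.0f} bpm")
```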
  • Hideki Kawahara, Masanori Morise, Toru Takahashi, Hideki Banno, Ryuichi Nisimura, Toshio Irino
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2 38-+ 2010  Refereed
    A systematic framework for non-periodic excitation source representation is proposed for high-quality speech manipulation systems such as TANDEM-STRAIGHT, which is basically a channel VOCODER. The proposed method consists of two subsystems for non-periodic components: a colored noise source and an event analyzer/generator. The colored noise source is represented by using a sigmoid model with non-linear level conversion. Two model parameters, boundary frequency and slope, are estimated based on pitch-range linear prediction combined with F0-adaptive temporal axis warping and on the original temporal axis. The event subsystem detects events based on the kurtosis of filtered speech signals. The proposed framework provides significant quality improvement for high-quality recorded speech materials.
  • Kyoko Matsuyama, Kazunori Komatani, Ryu Takeda, Toru Takahashi, Tetsuya Ogata, Hiroshi G. Okuno
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4 3050-3053 2010  Refereed
    In our barge-in-able spoken dialogue system, the user's behaviors, such as barge-in timing and utterance expressions, vary according to his/her characteristics and the situation. The system adapts to these behaviors by modeling them. We analyzed 1584 utterances collected by our systems for quiz and news-listing tasks and showed that the ratio of referential expressions used depends on the individual user and on the average length of the listed items. This tendency was incorporated as a prior probability into our method and improved the identification accuracy of the user's intended items.
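This line of work (see also the TRENDS IN APPLIED INTELLIGENT SYSTEMS paper above) fuses a Gamma-distributed barge-in timing likelihood with ASR-derived probabilities. The sketch below shows that fusion in its simplest Bayesian form; all numbers and the Gamma parameters are hypothetical, whereas the papers estimate them from real dialogue data.

```python
# Minimal sketch: combine a Gamma timing likelihood with ASR probabilities
# to score which listed item a barge-in utterance refers to.
import numpy as np
from scipy.stats import gamma

# ASR-derived probability that the utterance refers to each of 4 items.
p_asr = np.array([0.4, 0.3, 0.2, 0.1])

# Item i starts being read out at starts[i]; the user's barge-in lag after
# hearing the intended item is modeled as Gamma(k, theta).
starts = np.array([0.0, 2.0, 4.0, 6.0])
t_bargein = 4.8
lags = t_bargein - starts
p_time = np.where(lags > 0, gamma.pdf(lags, a=2.0, scale=0.5), 0.0)

p_item = p_asr * p_time
p_item /= p_item.sum()
print(np.round(p_item, 3))    # item 2 (0-indexed) becomes most probable
```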
  • Nobuhide Yamakawa, Tetsuro Kitahara, Toru Takahashi, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4 2342-+ 2010  Refereed
    Research on environmental sound recognition has not developed as much as research on speech and musical signals. One reason is that the category of environmental sounds covers a broad range of acoustic natures. We classified them in order to explore suitable recognition techniques for each characteristic. We focus on impulsive sounds and their non-stationary features within and between analysis frames. We used matching pursuit as a framework for applying wavelet analysis to extract the temporal variation of audio features inside a frame. We also investigated the validity of modeling the decaying patterns of sounds using hidden Markov models (HMMs). Experimental results indicate that sounds with multiple impulsive components are recognized better using time-frequency analysis bases than by frequency-domain analysis. Classification of sound classes with a long and clear decaying pattern improves when HMMs with multiple hidden states are applied.
  • Hiromitsu Awano, Tetsuya Ogata, Shun Nishide, Toru Takahashi, Kazunori Komatani, Hiroshi G. Okuno
    IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC 2010) 2010  Refereed
    The objective of our study was to develop dynamic collaboration between a human and a robot. Most conventional studies have created pre-designed rule-based collaboration systems to determine the timing and behavior of robots to participate in tasks. Our aim is to introduce the confidence of the task as a criterion for robots to determine their timing and behavior. In this paper, we report the effectiveness of applying reproduction accuracy as a measure for quantitatively evaluating confidence in an object arrangement task. Our method comprises three phases. First, we obtain human-robot interaction data through the Wizard of Oz method. Second, the obtained data are trained using a neuro-dynamical system, namely, the Multiple Time-scales Recurrent Neural Network (MTRNN). Finally, the prediction error in MTRNN is applied as a confidence measure to determine the robot's behavior. The robot participated in the task when its confidence was high, while it just observed when its confidence was low. Training data were acquired using an actual robot platform, Hiro. The method was evaluated using a robot simulator. The results revealed that motion trajectories could be precisely reproduced with a high degree of confidence, demonstrating the effectiveness of the method.
  • Ryu Takeda, Kazuhiro Nakadai, Toru Takahashi, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
    IEEE/RSJ 2010 INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS 2010) 1949-1956 2010  Refereed
    This paper describes a speedup and performance improvement of multi-channel semi-blind ICA (MCSB-ICA) with parallel and resampling-based block-wise processing. MCSB-ICA is an integrated sound source separation method that accomplishes blind source separation, blind dereverberation, and echo cancellation. It enables robots to separate the user's speech from observed signals that include the robot's own speech, other speech, and their reverberations, without a priori information. The main problem in applying MCSB-ICA to robot audition is its high computational cost. We tackle this with multi-threaded programming, where the two main issues are (1) the design of the parallel processing and (2) incremental implementation. These are solved by (a) a multiple-stack-based parallel implementation and (b) resampling-based overlaps and block-wise separation. Experimental results proved that our method reduces the real-time factor to less than 0.5 with an eight-core CPU, and it improves automatic speech recognition performance by 2-10 points compared with a single-stack-based parallel implementation without the resampling technique.
  • Takeshi Mizumoto, Takuma Otsuka, Kazuhiro Nakadai, Toru Takahashi, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
    IEEE/RSJ 2010 INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS 2010) 1957-1963 2010  Refereed
    This paper presents a novel synchronization method for a human-robot ensemble using coupled oscillators. We define an ensemble as a synchronized performance produced through interactions between independent players. To attain a better synchronized performance, the robot should predict the human's behavior to reduce the difference between the human's and the robot's onset timings. Existing studies in such synchronization only adapt to onset intervals and thus need considerable time to synchronize. We use a coupled oscillator model to predict the human's behavior. Experimental results show that our method reduces the average onset-time error: with a metronome, a tempo-varying metronome, and a human drummer, errors are reduced by 38%, 10%, and 14% on average, respectively. These results show that predicting the human's behavior is effective for synchronized performance.
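The coupling idea above can be sketched as a phase oscillator that is nudged toward the human's detected onsets, Kuramoto-style, so the robot can predict the next onset rather than react to it. The coupling constants and the correction form are illustrative assumptions, not the paper's model.

```python
# Sketch: the robot's beat as a phase oscillator coupled to human onsets.
import math

phase = 0.0            # robot's oscillator phase (radians)
freq = 2.0 * math.pi   # rad/s, i.e. an initial guess of 60 bpm
K_phase, K_freq = 0.8, 0.3
dt = 0.01

def step(phase, freq, human_onset):
    """Advance the oscillator; couple to the human when an onset arrives."""
    phase += freq * dt
    if human_onset:
        # Human onsets should coincide with phase = 0 (mod 2*pi).
        err = math.sin(-phase)       # pulls phase toward 0 mod 2*pi
        phase += K_phase * err       # correct the phase now...
        freq += K_freq * err         # ...and the tempo for the future
    return phase, freq

# Prediction of the robot's next beat time from the current state:
time_to_next = (2.0 * math.pi - (phase % (2.0 * math.pi))) / freq
```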
  • Angelica Lim, Takeshi Mizumoto, Louis-Kenzo Cahier, Takuma Otsuka, Toru Takahashi, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
    IEEE/RSJ 2010 INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS 2010) 1964-1969 2010  Refereed
    Musicians often have the following problem: they have a music score that requires two or more players, but they have no one with whom to practice. Score-playing music robots exist, but they lack the adaptive ability to synchronize with fellow players' tempo variations; in other words, if the human speeds up their play, the robot should also increase its speed. Computer accompaniment systems, on the other hand, provide exactly this kind of adaptive ability. We present a first step towards giving these accompaniment abilities to a music robot. We introduce a new paradigm of beat tracking using two types of sensory input, visual and audio, using our own visual cue recognition system and state-of-the-art acoustic onset detection techniques. Preliminary experiments suggest that by coupling these two modalities, a robot accompanist can start and stop a performance in synchrony with a flutist, and detect tempo changes within half a second.
  • Toru Takahashi, Kazuhiro Nakadai, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
    IEEE/RSJ 2010 INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS 2010) 964-969 2010  Refereed
    We describe the integration of preprocessing and automatic speech recognition based on Missing Feature Theory (MFT) to recognize highly interfered speech, such as speech from a desired speaker at a narrow angle to an interfering speaker. As a speech signal separated from a mixture of speech signals includes leakage from the other signals, recognition performance on the separated speech degrades. An important problem is estimating the leakage in each time-frequency component. Once the leakage is estimated, missing feature masks (MFMs) can be generated automatically by our method. A new weighted sigmoid function is introduced for our MFM generation method. An experiment shows that the word correct rate improves from 66% to 74% using our MFM generation method, tuned by a search-based approach in the parameter space.
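Soft missing-feature masks of the kind described above map a per-bin reliability estimate through a sigmoid into [0, 1]. The sketch below illustrates that mapping; the exact weighting of the paper's function is not reproduced, and the parameters are illustrative only.

```python
# Sketch: map an estimated per-bin signal-to-leakage ratio through a
# weighted sigmoid to obtain a soft missing-feature mask in [0, 1].
import numpy as np

def weighted_sigmoid_mask(slr_db, w=1.0, slope=0.5, threshold=0.0):
    """slr_db: estimated signal-to-leakage ratio per time-frequency bin."""
    return w / (1.0 + np.exp(-slope * (slr_db - threshold)))

slr = np.array([-10.0, -3.0, 0.0, 5.0, 15.0])
print(np.round(weighted_sigmoid_mask(slr), 2))  # low SLR -> unreliable bin
```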
  • Shun Nishide, Tetsuya Ogata, Jun Tani, Toru Takahashi, Kazunori Komatani, Hiroshi G. Okuno
    IEEE/RSJ 2010 INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS 2010) 2010  Refereed
    Predictability is an important factor in determining robot motions. This paper presents a model that generates robot motions based on reliable predictability, evaluated through a dynamics learning model which self-organizes object features. The model is composed of a dynamics learning module, namely a Recurrent Neural Network with Parametric Bias (RNNPB), and a hierarchical neural network as a feature extraction module. The model takes raw object images and robot motions as input. Through bi-directional training of the two models, object features that describe the object motion are self-organized at the output of the hierarchical neural network, which is linked to the input of the RNNPB. After training, the model searches for the robot motion with highly reliable predictability of the object motion. Experiments were performed with the robot's pushing motion on a variety of objects to generate sliding, falling over, bouncing, and rolling motions. For objects with a single motion possibility, the robot tended to generate motions that induce the object motion. For objects with two motion possibilities, the robot evenly generated motions that induce the two object motions.
  • Yasuharu Hirasawa, Toru Takahashi, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
    IEEE/RSJ 2010 INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS 2010) 450-457 2010  Refereed
    In real-world situations, a robot may often encounter an "under-determined" situation, where there are more sound sources than microphones. This paper presents a speech separation method using a new constraint on the harmonic structure for a simultaneous speech-recognition system under under-determined conditions. The requirements for a speech separation method in such a system are (1) the ability to handle a large number of talkers, and (2) reduction of distortion in acoustic features. Conventional methods use maximum likelihood estimation in sound source separation, which fulfills requirement (1); since it is a general approach, its performance in separating speech is limited. This paper presents a two-stage method to improve the separation. The first stage uses maximum likelihood estimation and extracts the harmonic structure, and the second stage exploits the harmonic structure as a new constraint to achieve requirement (2). We carried out an experiment that simulated three simultaneous utterances using impulse responses recorded by two microphones in an anechoic chamber. The experimental results revealed that our method improves speech recognition correctness by about four points.
  • 前澤 陽, 糸山 克寿, 高橋 徹, 尾形 哲也, 奥乃 博
    IPSJ SIG Technical Reports, Music and Computer (MUS) 2009(5) 1-6 Jul 22, 2009  
    We present a violin bowed-string sequence identification method that combines context-based rules with an audio-based bowed-string estimator. Applying the audio-based estimator and then correcting the parts of the sequence that violate the rules increases identification accuracy. In experiments on six musical phrases, accuracy increased on average by 5% (max. 8%) when using the same set of strings used for training, and by 7% on average (max. 15%) when using a different brand of strings.
  • 安良岡 直希, 糸山 克寿, 高橋 徹, 尾形 哲也, 奥乃 博
    IPSJ SIG Technical Reports, Music and Computer (MUS) 2009(10) 1-6 Jul 22, 2009  
    This paper presents a musical performance analysis-and-synthesis method that uses a residual model to suppress accompaniment and reverberation in the input. The residual model represents spectral components that the target part's score does not explain, which leads to efficient extraction of the target performance from accompanied and/or reverberant audio. The extraction is performed simultaneously with estimation of musical tone models representing both the harmonic and inharmonic sound of the performance, and the estimated tone models are used to synthesize a performance for a new, unseen score. Experiments showed that the spectral distance of one instrument part extracted from a polyphonic audio source improved by 35.0 points on average by incorporating the residual model, and that the method avoids degradation of analysis-synthesis quality for reverberant sources.
  • 水本 武志, 合原 一究, 高橋 徹, 尾形 哲也, 奥乃 博
    Proceedings of the 71st IPSJ National Convention 169-170 Mar 10, 2009  
  • 高橋 徹, 中臺 一博, 駒谷 和範, 尾形 哲也, 奥乃 博
    Proceedings of the 71st IPSJ National Convention 35-36 Mar 10, 2009  
  • 中川 達裕, 尾形 哲也, 谷 淳, 高橋 徹, 奥乃 博
    Proceedings of the 71st IPSJ National Convention 53-54 Mar 10, 2009  
  • 勝丸 真樹, 中野 幹生, 駒谷 和範, 成松 宏美, 船越 孝太郎, 辻野 広司, 高橋 徹, 尾形 哲也, 奥乃 博
    Proceedings of the 71st IPSJ National Convention 117-118 Mar 10, 2009  
  • 池田 智志, 駒谷 和範, 高橋 徹, 尾形 哲也, 奥乃 博
    Proceedings of the 71st IPSJ National Convention 121-122 Mar 10, 2009  

Books and Other Publications

 8

Presentations

 80

Teaching Experience (Courses)

 18

Professional Memberships

 6

Works

 1

Research Projects (Joint Research and Competitive Funding)

 15

Industrial Property Rights

 5

Research Themes

 1
  • Research Theme
    Human-robot interaction, speech communication, speech recognition, auditory scene understanding
    Keywords
    Microphone arrays, acoustic features, speech recognition, sound source localization, sound source separation
    Overview
    Working on the challenges of realizing natural dialogue between robots and humans in real-world environments