Code

Some of the code I use for uncertainty propagation is available here, along with instructions for reproducing the experiments in some of the published papers.

Publications

@L2F, INESC-ID (2010-*)

[Astudillo2014c] M. Matassoni, R. F. Astudillo, A. Katsamanis, M. Ravanelli, "The DIRHA-GRID corpus: baseline and tools for multi-room distant speech recognition using distributed microphones", to appear in Proc. INTERSPEECH, 2014.

[Astudillo2014b] R. F. Astudillo, S. Braun, E. A. P. Habets, "A Multichannel Feature Compensation Approach for Robust ASR in Noisy and Reverberant Environments", REverberant Voice Enhancement and Recognition Benchmark (REVERB) workshop, 2014.

[Astudillo2014] R. F. Astudillo, "Accounting for the Residual Uncertainty of Multi-Layer Perceptron Based Features", Proc. ICASSP, 2014.

[Astudillo2013e] R. F. Astudillo, "A Propagation Approach to Modelling the Joint Distributions of Clean and Corrupted Speech in the Mel-Cepstral Domain", Proc. ASRU 2013, pp. 180-185, 2013.

[Abad2013] A. Abad, R. F. Astudillo, "The L2F Spoken Web Search system for Mediaeval 2013", Mediaeval, 2013.

[Astudillo2013d] R. F. Astudillo, "An extension of STFT Uncertainty Propagation for GMM based Super-Gaussian a priori Models", IEEE Signal Processing Letters, Vol. 20 (12), pp. 1163 - 1166, 2013.

[Kolossa2013] D. Kolossa, S. Zeiler, R. Saeidi, R. F. Astudillo, "Noise-adaptive LDA: A new approach for speech recognition under observation uncertainty", IEEE Signal Processing Letters, Vol. 20 (11), pp. 1018 - 1021, 2013.

[Nesta2013] F. Nesta, M. Matassoni, R. F. Astudillo, "A flexible spatial blind source extraction framework for robust speech recognition in noisy environments", Proc. of 2nd International Workshop on Machine Listening in Multisource Environments CHiME, pp 33-38, 2013.

[Astudillo2013c] R. F. Astudillo, T. Gerkmann, "On the relation between speech corruption models in the spectral and the cepstral domain", Proc. of ICASSP, pp 7044-7048, 2013.

[Astudillo2013b] R. M. Nickel, R. F. Astudillo, D. Kolossa, R. Martin, "Corpus-Based Speech Enhancement with Uncertainty Modeling and Cepstral Smoothing", IEEE Transactions on Audio, Speech and Language Processing, Vol. 21 (5), pp 983-997, 2013.

[Astudillo2013] R. F. Astudillo, R. Orglmeister, "Computing MMSE Estimates and Residual Uncertainty directly in the Feature Domain of ASR using STFT Domain Speech Distortion Models", IEEE Transactions on Audio, Speech and Language Processing, Vol. 21 (5), pp 1023-1034, 2013.

[Astudillo2012e] R. F. Astudillo, D. Kolossa, A. Abad, S. Zeiler, R. Saeidi, P. Mowlaee, J. P. Neto, R. Martin, "Integration of beamforming and uncertainty-of-observation techniques for robust ASR in multi-source environments", Computer Speech & Language, Vol. 27 (3), pp 837-850, 2013.

[Abad2012] A. Abad, R. F. Astudillo, "The L2F Spoken Web Search system for Mediaeval 2012", Proc. of Mediaeval, 2012.

[Astudillo2012b] R. F. Astudillo, A. Abad, J. P. Neto, "Uncertainty driven Compensation of Multi-Stream MLP Acoustic Models for Robust ASR", Proc. of Interspeech, 2012.

[Pellegrini2012] T. Pellegrini, H. Moniz, F. Batista, I. Trancoso, R. F. Astudillo, "Extension of the LECTRA corpus: classroom LECture TRAnscriptions in European Portuguese", Proc. of SPEECH AND CORPORA, 2012.

[Astudillo2012] R. F. Astudillo, A. Abad, J. P. Neto, "Integration of Beamforming and Automatic Speech Recognition Through Propagation of the Wiener Posterior", Proc. of ICASSP, pp 4909-4912, 2012.

[Nickel2012] R. M. Nickel, R. F. Astudillo, D. Kolossa, S. Zeiler, R. Martin, "Inventory-Style Speech Enhancement with Uncertainty-of-Observation Techniques", Proc. of ICASSP, pp 4645-4648, 2012.

[Astudillo2012f] R. F. Astudillo, L. Deng, E. Vincent, "Uncertainty Handling for Environment-Robust Speech Recognition", Tutorial at Interspeech, 2012.

[Astudillo2011c] R. F. Astudillo, J. P. Neto, "Propagation of Uncertainty through Multilayer Perceptrons for Robust Automatic Speech Recognition", Proc. of Interspeech, pp 461-464, 2011.

[Kolossa2011] D. Kolossa, R. F. Astudillo, A. Abad, S. Zeiler, R. Saeidi, P. Mowlaee, J. P. Neto, R. Martin, "CHIME Challenge: Approaches to Robustness using Beamforming and Uncertainty-of-Observation Techniques", Proc. of Int. Workshop on Machine Listening in Multisource Environments, pp 6-11, 2011.

@EMSP, Technische Universität Berlin (2006-2010)

[Astudillo2011b] A. Vorwerk, S. Zeiler, D. Kolossa, R. F. Astudillo, D. Lerch, "Use of Missing and Unreliable Data for Audiovisual Speech Recognition", Proc. of Robust Speech Recognition of Uncertain or Missing Data - Theory and Applications, pp 345-375, 2011.

[Astudillo2011] R. F. Astudillo, D. Kolossa, "Uncertainty Propagation", Proc. of Robust Speech Recognition of Uncertain or Missing Data - Theory and Applications, pp 35-64, 2011.

[Astudillo2010d] R. F. Astudillo, E. Hoffmann, P. Mandelartz, R. Orglmeister, "Speech Enhancement for Automatic Speech Recognition using Complex Gaussian Mixture Priors for Noise and Speech", Proc. of Advances in Nonlinear Speech Processing, pp 60-67, 2010.

[Astudillo2010] Ramon Fernandez Astudillo, "Integration of Short-Time Fourier Domain Speech Enhancement and Observation Uncertainty Techniques for Robust Automatic Speech Recognition", Ph.D. thesis, Technical University Berlin, 2010.

[Astudillo2010b] R. F. Astudillo, D. Kolossa, P. Mandelartz, R. Orglmeister, "An Uncertainty Propagation Approach to Robust ASR using the ETSI Advanced Front-End", IEEE JSTSP Special Issue on Speech Processing for Natural Interaction with Intelligent Environments, Vol. 4 (5), pp 824 - 833, 2010.

[Kolossa2010] D. Kolossa, R. F. Astudillo, E. Hoffmann, R. Orglmeister, "Independent Component Analysis and Time-Frequency Masking for Speech Recognition in Multi-Talker Conditions", Eurasip Journal on Audio, Speech, and Music Processing, 2010.

[Astudillo2010c] R. F. Astudillo, R. Orglmeister, "A MMSE estimator in mel-cepstral domain for robust large vocabulary automatic speech recognition using uncertainty propagation", Proc. of Interspeech, pp 713-716, 2010.

[Kolossa2010b] D. Kolossa, R. F. Astudillo, S. Zeiler, A. Vorwerk, D. Lerch, J. Chong, R. Orglmeister, "Missing Feature Audiovisual Speech Recognition under Real-Time Constraints", Proc. of ITG Fachtagung Sprachkommunikation, pp 6-8, 2010.

[Jeub2009] M. Jeub, D. Kolossa, R. F. Astudillo, R. Orglmeister, "Performance Analysis of Wavelet-based Voice Activity Detection", Proc. of NAG-DAGA, pp 407-408, 2009.

[Astudillo2009b] R. F. Astudillo, D. Kolossa, R. Orglmeister, "Accounting for the Uncertainty of Speech Estimates in the Complex Domain for Minimum Mean Square Error Speech Enhancement", Proc. of Interspeech, pp 2491-2494, 2009.

[Astudillo2009] R. F. Astudillo, E. Hoffmann, P. Mandelartz, R. Orglmeister, "Speech Enhancement for Automatic Speech Recognition using Supergaussian Noise Priors conditioned by Noise Power Estimation", Proc. of NOLISP, 2009.

[Astudillo2008] R. F. Astudillo, D. Kolossa, R. Orglmeister, "Uncertainty Propagation for Speech Recognition using RASTA Features in Highly Nonstationary Noisy Environments", Proc. of ITG Workshop for Speech Communication, 2008.

[Astudillo2007] R. F. Astudillo, D. Kolossa, R. Orglmeister, "Propagation of Statistical Information through non-linear Feature Extractions for Robust Speech Recognition", Proc. of 27th International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering, AIP, pp 245-252, 2007.

[Kolossa2007] D. Kolossa, R. F. Astudillo, R. Orglmeister, "Spracherkennung im Automobil durch Verwendung von Missing Feature Techniken", Proc. of DAGA, pp 301-302, 2007.

[Kolossa2006] D. Kolossa, H. Sawada, R. F. Astudillo, R. Orglmeister, S. Makino, "Recognition of Convolutive Speech Mixtures by Missing Feature Techniques for ICA", Proc. of Asilomar Conference on Signals, Systems, and Computers, pp 1397-1401, 2006.