Welcome to the ‘Digital Phonetics’ research group at IMS, Stuttgart University. The group was founded in 2015 and is headed by Prof. Dr. Ngoc Thang Vu.
Our research interests cover various areas in computational linguistics, machine learning and human-machine interaction. These can be divided into the following four categories: Perception, Interaction, Learning and Reasoning.
Among the multiple senses involved in Perception (e.g. hearing, vision and touch), our research focuses on processing and extracting information from speech. We aim to improve the robustness of speech processing systems, such as speech recognition, spoken language understanding, prosodic event detection, and speech emotion recognition, under real-world conditions, for example in spontaneous conversations, noisy environments, and multilingual settings (e.g. code-switching).
Interaction research focuses on methods, e.g. reinforcement learning, for teaching systems which strategy to apply in order to react intelligently to user inputs (both lexical inputs and social signals), which linguistic realization to use (even across languages), and how to utter it in order to increase human acceptance. One of the main objectives of this area is to make systems adaptive in helping users fulfill their goals efficiently, while remaining friendly and likeable through the right choice of words and tone.
Learning research provides systems with the ability to learn continually while interacting with users. It explores techniques for better learning algorithms, including better parameter initialization and distance metrics derived from multiple learning tasks. This allows faster fine-tuning and provides a framework for few-shot learning that is closer to human learning than the traditional supervised learning framework.
Reasoning research investigates methods that enable systems to provide explanations along with their decisions. We focus on identifying supporting facts rather than on logical reasoning. We are also interested in methods, e.g. Gaussian processes, that allow systems to estimate and communicate reliable uncertainty values for their decisions.
Our research group follows ethical guidelines provided by different research organizations such as ISCA, ACL and IEEE, especially in the context of artificial intelligence (AI) systems.
- Data-integrated Simulation of Human Perception and Cognition, Exzellenzcluster SimTech PN6 - Machine Learning for Simulation (2019 - 2022)
- Digital Phonetics (main focus: speech processing and dialog systems) funded by Carl-Zeiss-Stiftung (2018 - 2023)
- Spoken Language Understanding funded by Sony Europe, Stuttgart Technology Center (2018 - 2021)
- Methods for Explainability in Natural Language Understanding, in collaboration with Bosch AI (2020 - 2023)
- Language-Knowledge Interaction (responsibilities: AI Automation in Question Answering) funded by IBM (2020 - 2023)
1. Investigating the Interaction between Speech and Language Processing for Spoken Language Understanding: A Case Study for Sentiment Analysis (SFB 732 A8, 2016-2018)