General research interests

Dialogue Modelling for Statistical Machine Translation

In my postdoctoral research, I am investigating how to improve machine translation technology in conversational domains. Machine translation, known to the general public through applications such as Google Translate, is the automatic translation of text from one language to another by a computer algorithm - for instance, translating from Japanese to Norwegian or vice versa. Although great progress has been made over the last decade (mainly due to the development of robust statistical techniques), machine translation technology often remains poor at adapting its translations to the relevant context. In order to translate a dialogue (say, film subtitles from English to Norwegian), current translation systems typically operate one utterance at a time and ignore the global coherence and structure of the conversation.

My postdoctoral project aims to make machine translation systems more “context-aware”. The project will develop new translation methods that can dynamically adapt their outputs according to the surrounding dialogue context. More specifically, the project will demonstrate how to automatically extract contextual factors from dialogues and integrate these factors into a state-of-the-art statistical machine translation system. The main goal of the project is to show that this context-rich approach is able to produce translations of a higher quality than standard methods. In particular, the project will examine how these new translation methods can be practically employed to produce high-quality translations of film subtitles. Although the project will only conduct experiments with a limited set of languages (such as English-Norwegian), the translation techniques developed through the project are meant to be language-independent and could in principle be applied to any language pair. In the longer term, speech-to-speech interpretation (the task of automatically translating speech from one language to another, in real time) is another possible application of the project.
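
To give a flavour of the idea, here is a minimal Python sketch of how a dialogue-context factor could be integrated into the log-linear scoring used by statistical MT systems: translation hypotheses are re-ranked by combining the baseline translation score with a crude contextual feature (lexical overlap with the preceding dialogue turn). The feature, weights and example sentences are invented for illustration; the actual project relies on much richer contextual factors.

```python
def score(hypothesis, context_words, weights):
    """Log-linear score: baseline MT score plus a dialogue-context feature."""
    # Feature 1: the baseline translation score (assumed given by the MT system).
    base = hypothesis["base_score"]
    # Feature 2: lexical overlap with the preceding dialogue context,
    # a crude stand-in for richer contextual factors.
    overlap = len(set(hypothesis["text"].split()) & context_words)
    return weights["base"] * base + weights["context"] * overlap

def rerank(hypotheses, context, weights):
    """Return the hypothesis with the highest context-sensitive score."""
    context_words = set(context.split())
    return max(hypotheses, key=lambda h: score(h, context_words, weights))

# Toy example: the preceding dialogue turn disambiguates the French "souris"
# (computer mouse vs. rodent) in a French-to-English translation.
hypotheses = [
    {"text": "the mouse does not work", "base_score": -1.0},
    {"text": "the rodent does not walk", "base_score": -0.9},
]
best = rerank(hypotheses, "click with the mouse on the screen",
              {"base": 1.0, "context": 0.5})
# -> selects "the mouse does not work"
```

Without the context feature, the second hypothesis would win on its baseline score alone; the dialogue context tips the balance towards the intended sense.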

Spoken Dialogue Systems

I am also interested in the development of efficient algorithms for spoken dialogue systems. The long-term aim is to build artificial agents which are able to interact with humans via natural language (e.g. spontaneous speech) to perform various tasks. Spoken dialogue systems are expected to play an ever-increasing role in our daily interactions with technology. Wouldn’t it be great if we could simply talk to our technological devices instead of having to tediously configure or program them?

Given the inherent complexity of natural language (and the uncertainty associated with speech recognition errors), building such systems is a non-trivial engineering task. The diagram below illustrates a simplified architecture schema. My PhD research concentrated on decision-making algorithms for dialogue management in rich, open-ended domains. The goal was to provide a new hybrid approach to dialogue management that combines rich linguistic knowledge (pragmatic rules, models of dialogue structure) with probabilistic models for planning and learning under uncertainty in a single, unified framework.
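
To illustrate the spirit of this hybrid approach, here is a minimal Python sketch of dialogue-management rules that map conditions on the dialogue state to probability distributions over system actions. The state keys, actions and probabilities are invented for illustration, and the data structures are not the framework's actual formalism; they merely show how linguistic conditions and probabilistic effects can be combined in one rule.

```python
# A probabilistic rule maps a condition on the dialogue state to a
# distribution over effects (here: system actions with probabilities).
RULES = [
    # If the user's intent is recognised with high confidence, act on it.
    {"condition": lambda s: s.get("intent") == "request_music"
                            and s["confidence"] > 0.8,
     "effects": {"play_music": 0.9, "confirm_request": 0.1}},
    # Under uncertainty, prefer asking a clarification question.
    {"condition": lambda s: s["confidence"] <= 0.8,
     "effects": {"ask_clarification": 0.8, "play_music": 0.2}},
]

def select_action(state, rules=RULES):
    """Apply the first matching rule and pick its most probable effect."""
    for rule in rules:
        if rule["condition"](state):
            return max(rule["effects"], key=rule["effects"].get)
    return "do_nothing"

select_action({"intent": "request_music", "confidence": 0.95})
# -> "play_music"
select_action({"intent": "request_music", "confidence": 0.5})
# -> "ask_clarification"
```

The appeal of such rules is that the system designer encodes the linguistic structure (the conditions), while the probabilities attached to the effects can be estimated from data.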

Generic architecture of a spoken dialogue system

On the practical side, I am interested in applying these ideas to intelligent user interfaces and human-robot interaction: for instance, tutoring systems for learning foreign languages, “social” robots capable of taking care of routine tasks in homes, offices, schools or hospitals, or cognitive assistants for disabled, mentally impaired or elderly persons.

In many of these applications, the dialogue system must operate in a rich, dynamic environment which needs to be properly captured. The agent must therefore relate - or ground - the interaction to an active understanding of its environment and of what needs to be done in it, through goals and plans of action. And since the real world is highly dynamic (things are constantly changing), the agent must also be able to quickly adapt its behaviour to the surrounding context and to the intentional, attentional and affective state of its conversational partners. In practice, it is almost impossible for the system developer to encode such complex interaction domains entirely by hand, so we use machine learning techniques to automatically learn and improve the system's internal models based on prior experience.
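
As a minimal sketch of what learning from experience can mean here, the following Python snippet maintains a Beta posterior over the success probability of each candidate system action and updates it after every interaction. The binary success signal and the specific action names are simplifying assumptions for illustration; the actual models learned in this line of work are considerably richer.

```python
class ActionModel:
    """Track how well each action works, starting from a Beta(1, 1) prior."""

    def __init__(self):
        self.successes = {}
        self.failures = {}

    def update(self, action, success):
        """Record the outcome of one interaction (success is a boolean)."""
        counts = self.successes if success else self.failures
        counts[action] = counts.get(action, 0) + 1

    def expected_success(self, action):
        """Posterior mean of Beta(successes + 1, failures + 1)."""
        s = self.successes.get(action, 0)
        f = self.failures.get(action, 0)
        return (s + 1) / (s + f + 2)

model = ActionModel()
model.expected_success("greet")   # -> 0.5 (no experience yet)
for outcome in (True, True, True, False):
    model.update("ask_clarification", outcome)
model.expected_success("ask_clarification")  # -> 4/6, roughly 0.67
```

Even this toy model captures the key property: the system's estimates start uninformed and gradually sharpen as interactions accumulate, without the developer hand-coding the numbers.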

My PhD research culminated in the release of the OpenDial toolkit, which allows system developers to construct and evaluate dialogue systems with a new modelling framework based on probabilistic rules. The approach has been validated in several experiments conducted in the human-robot interaction domain, using a Nao robot as the experimental platform. Feel free to have a look at my PhD thesis for more information about the approach, its implementation and its empirical validation.