Multilingual Conversational System Research

9807-11

Progress Report: July 1, 1998–December 31, 1998

James Glass and Stephanie Seneff

 

Project Overview:

The long-term goals of this research project are to foster collaboration between MIT and NTT speech and language researchers and to develop language-independent approaches to speech understanding and generation. We will initiate this effort by developing the human language technologies needed to port our conversational interfaces from English to Japanese. The Jupiter weather information system will serve as the basis for this porting process. This work will involve close collaboration with NTT researchers both in Japan and at MIT.

Progress to Date:

Research in this area requires the expertise of a speech and language researcher fluent in both the source and the target language. Thus, significant progress could not be made on this project until a visiting scientist from NTT was established in our group. This past fall we made contact with Drs. Aikawa and Minami from NTT and made plans for Dr. Minami to join our group in January for a period of at least one year. In December, Drs. Tohkura, Aikawa, Minami, Zue, Seneff, and Glass met at an international conference in Sydney to discuss the project in more detail.

Dr. Minami arrived in our group in late January. For the past couple of weeks he has been getting settled in Boston and in our group. He is currently set up with a Sun Ultra running Solaris and has been configuring his computing environment, getting the NTT Japanese synthesizer running, and looking on the web for detailed Japanese weather forecasts, which will be the focus of the Japanese Jupiter domain.

Research Plan:

Our current research plan with Dr. Minami is to port the Jupiter system to Japanese in the following steps.

1. Develop Japanese language generation capability for the Jupiter system. When connected with the NTT Japanese synthesizer, this will allow a user to speak in English and have the system reply in Japanese. The current plan is to use our GENESIS language generation component for this work. GENESIS has been used previously for many different languages, including Japanese, Korean, and Mandarin Chinese, as well as several European languages.

2. Develop at least a rudimentary natural language capability using our TINA NL system. As a first step, we can translate many English Jupiter queries into Japanese to help train and evaluate the NL system. This process will benefit from the Japanese language generation capability of step 1, as the developer will be able to parse a Japanese query, turn it into a meaning representation, and then paraphrase the resulting query in either Japanese or English.

3. Begin data collection. Once a rudimentary Japanese NL capability is available, it will be possible to begin wizard-based data collection for weather queries from native Japanese speakers. The human wizard would listen to each user query in real time, and either speak or type to the system an abbreviated query that retains the core meaning of the original query. These data will be used initially to expand the NL coverage for Japanese Jupiter, and later to train the recognizer.

4. Develop rudimentary speech recognition capability for the Jupiter domain. Both the existing NTT and MIT recognizers are being considered at this time. Using available telephone-quality speech from Japanese speakers, or perhaps bootstrapping from English phonetic models, we can put together a Japanese recognizer for the Jupiter domain. One of the requirements will be data to train a language model, for which we propose to use the translated Jupiter queries, at least initially.

5. Once we have rudimentary speech recognition, we can begin to collect data from subjects talking to a complete system. These data are crucial for successive iterations to improve all aspects of the Japanese Jupiter system.
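As a rough illustration of the bootstrapping idea in step 4, the sketch below trains a small bigram language model on a handful of translated queries and uses it to score new word sequences. It is only a minimal example under stated assumptions and does not describe the language model of either the MIT or the NTT recognizer. The romanized queries are invented placeholders rather than Jupiter data, the text is assumed to be already segmented into words by whitespace, add-one smoothing is chosen purely for simplicity, and the names train_bigram and log_prob are hypothetical.

```python
import math
from collections import Counter

BOS, EOS = "<s>", "</s>"  # sentence boundary markers


def train_bigram(sentences):
    """Count histories and bigrams over whitespace-segmented sentences."""
    histories, bigrams = Counter(), Counter()
    vocab = {EOS}
    for sentence in sentences:
        words = sentence.split()
        vocab.update(words)
        tokens = [BOS] + words + [EOS]
        histories.update(tokens[:-1])                  # every token that has a successor
        bigrams.update(zip(tokens[:-1], tokens[1:]))   # adjacent word pairs
    return histories, bigrams, vocab


def log_prob(sentence, model):
    """Log probability of a sentence under the add-one smoothed bigram model."""
    histories, bigrams, vocab = model
    size = len(vocab) + 1  # one extra slot reserved for unseen words
    tokens = [BOS] + sentence.split() + [EOS]
    total = 0.0
    for prev, word in zip(tokens[:-1], tokens[1:]):
        total += math.log((bigrams[(prev, word)] + 1) / (histories[prev] + size))
    return total


# Invented, romanized stand-ins for translated weather queries (assumption).
translated_queries = [
    "ashita no tokyo no tenki wa",
    "tokyo no tenki wa",
    "osaka no kion wa",
]
model = train_bigram(translated_queries)

# A plausible in-domain query should score higher than a scrambled one.
print(log_prob("osaka no tenki wa", model))
print(log_prob("wa tenki no osaka", model))
```

Even with very little text, a model of this kind prefers word orders that resemble the translated queries, which is why translated data can serve as an initial training source until queries collected from real users (steps 3 and 5) become available.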

We believe we will be able to accomplish the first few steps by June, and to be working on speech recognition by this summer. In the longer term, once the system-based data collection effort is under way, we can begin to address other issues such as language-independent acoustic-phonetic processing, dialogue control, and speech understanding. We can use the Japanese Jupiter corpus and data collection environment to evaluate research in these areas.

In parallel with this series of five steps, we will also explore Japanese language content processing, in order to improve the quality and scope of the information the system can deliver concerning weather in Japan. For example, we will explore the feasibility of parsing weather reports available in Japanese from Web sites maintained in Japan, and incorporating the results into our weather database. The result will be improved weather information for both our English-based and Japanese-based systems.