A Generic Multimodal Architecture for Integrating Voice and Ink XML Formats
Zouheir Trabelsi
College of Telecommunication, The University of Tunisia, Tunisia
Abstract: The acceptance of a standard VoiceXML format has facilitated the development of voice applications, and we anticipate that acceptance of a standard InkXML format will similarly facilitate the development of pen applications. In this paper, we present a multimodal interface architecture that combines standardized voice and ink formats to facilitate the creation of robust and efficient multimodal systems, particularly for noisy mobile environments. The platform provides an interactive Web-based system for generic multimodal application development. By providing mutual disambiguation of input signals and superior error handling, this architecture should broaden the user base to the general population, including permanently and temporarily disabled users. Integrating VoiceXML and InkXML provides a standard data format that facilitates Web-based development and content delivery. Diverse applications, ranging from complex data entry and text editing to Web transactions, can be implemented on this system. We present a prototype platform and sample dialogues.
Keywords: Multimodal voice/ink applications, speech recognition, online handwriting recognition, mutual disambiguation, VoiceXML, InkXML.
Received March 13, 2003; accepted August 17, 2003