Multimodal Discourse: Gesture, Speech and Gaze

Dr. Francis Quek

Associate Professor
Vision Interfaces and Systems Laboratory
Computer Science & Engineering Department
Wright State University

Monday, November 12th, 2:00 PM, ACES 3.408

quek@cs.wright.edu


Abstract

Human discourse is a dynamic process of converting thoughts into speech, gesture, and gaze activity. Grounded in the psycholinguistic foundations of the production of such multimodal "conversational acts", we address the interpretation of gesture, speech, and gaze in the context of discourse management. We investigate the cues afforded by each mode of interaction and the algorithms necessary to detect and extract them; study the spatial and temporal relationships among these cues and associate them with topical units in discourse; study the interactions of gesture, speech, and gaze in discourse segmentation; and develop a multimedia database system that integrates these elements into a coherent whole. Our approach involves experiments designed to discover and quantify cues in the various modalities and their relation to discourse management; the development of computational algorithms to detect and recognize such cues; and the integration of these cues into a cogent discourse management system.

We present psycholinguistic phenomena that are detected by our analysis. Understanding how such phenomena can be detected from video and audio signals, and determining the kinds of computable cues that support such analysis, are the first steps toward bridging the signal-sense gap in multimodal interaction. Cues for semantic discourse segmentation and organization include holds and handedness, hand symmetries, positional anchoring and deictic origos, cross-modal temporal integration, and hold tension releases.

We have assembled a strong interdisciplinary team comprising psycholinguistics, machine vision, and signal processing researchers to address the holistic nature of discourse and language itself. This permits us to base our research squarely on the realities of human communication in spontaneous discourse across a wide range of pragmatic conditions. Technology is being developed that has significant impact on natural language discourse analysis, human-computer interaction systems, neuropathological studies (Parkinson's Disease and Left/Right Hemisphere Damage), and discourse and video databases.

Biography

Francis Quek is currently an Associate Professor in the Department of Computer Science and Engineering at Wright State University. He was formerly affiliated with the University of Illinois at Chicago, the University of Michigan Artificial Intelligence Laboratory, the Environmental Research Institute of Michigan (ERIM), and the Hewlett-Packard Human Input Division. Francis received both his B.S.E. summa cum laude (1984) and M.S.E. (1984) in electrical engineering from the University of Michigan in two years. He completed his Ph.D. in C.S.E. at the same university in 1990. He also has a Technician's Diploma in Electronics and Communications Engineering from the Singapore Polytechnic (1978), and briefly attended Oregon State University in 1982. Francis is a member of the IEEE and ACM.

He is the director of the Vision Interfaces and Systems Laboratory (VISLab), which he established for computer vision, medical imaging, vision-based interaction, and human-computer interaction research. He performs research in multimodal verbal/non-verbal interaction, vision-based interaction, facial modeling, multimedia databases, medical imaging, collaboration technology, computer vision, human-computer interaction, and computer graphics.

Francis is the Principal Investigator of several prestigious National Science Foundation grants in gesture, speech, and gaze research and of a Whitaker Foundation grant in neurovascular extraction in medical brain images. He leads a team of researchers in a multimillion-dollar NSF-KDI project spanning multiple disciplines, institutions, and countries to understand the communicative realities of multimodal interaction. He is also P.I. on an NSF/STIMULATE grant.


A list of Telecommunications and Signal Processing Seminars is available from the ECE department Web pages under "Seminars". The Web address for the Telecommunications and Signal Processing Seminars is http://signal.ece.utexas.edu/seminars