1 Area Background

Collaborative work is a hallmark of human achievement. As societies become global, knowledge work conducted by geographically separated collaborators can be supported by the technologies of networked computing and communications. The machine system then becomes a "mediator" in human cooperation, with the opportunity to enhance human intellect and expand the capacity for collaborative work.

For information exchange and joint decision-making, humans typically depend upon the dimensions of sight, sound and touch, used simultaneously and in combination. Emulation of this natural "multimodal" communication promises comfort and ease of use in collaborative systems. While component technologies for human-machine communication are, as yet, imperfect, they are sufficiently advanced that, with engineering prudence and with intelligent software agents, they can be employed to human benefit.

But the design of a multimodal collaborative system also depends on other factors, encompassing the human user, the task domain, the environment and the intellectual context. To produce an optimally effective system, all of these elements must be considered together in an integrated design. The varied nature of these parameters calls for an interdisciplinary effort, combining cognitive and social sciences with computer science and engineering.

The purpose of this research, therefore, is to establish a new science of design and its methodology for engineering human-centered multimodal collaborative systems.

2 Project Overview

2.1 Motivation

Advances in networking and computing open new opportunities for collaborative work by geographically separated participants. The challenge is to employ technology to extend human intellectual capabilities. Success depends upon natural, easy human/machine communication, and upon strategies that permit the machine to serve as a "value-added mediator." Technologies for human/machine communication, though imperfect as yet, have individually advanced far enough to support simultaneous, multimodal communication in computer interfaces. The aim is to emulate the features and advantages of multisensory human communication.

2.2 Objective

This research establishes, quantifies, and evaluates design methodologies for the synergistic combination of human/machine communication modalities in collaborative multiuser environments.

2.3 Method

This research creates a multiuser, collaborative environment with multimodal human/machine communication in the dimensions of sight, sound and touch. The network vehicle (called DISCIPLE, for Distributed System for Collaborative Information Processing and Learning) is an object-oriented groupware system (presently evolving under DARPA sponsorship) that runs over Internet TCP/IP as well as over an intracampus Asynchronous Transfer Mode (ATM) network.

At three user stations, CAIP-developed technologies for sight (eye-tracking, foveating sensing, image and face recognition), sound (automatic speech and speaker recognition, speech synthesis, distant-talking autodirective microphone arrays) and touch (gesture and position sensing, force-feedback gloves, and multitasking tactile software) are integrated into DISCIPLE for simultaneous multimodal use. The system so constituted provides a test bed for measuring benefits and synergies. With participation from cognitive science and human-factors engineering, a realistic application scenario is designed to evaluate combinations of modalities and to quantify performance.

Application scenarios that might be served by the system embrace activities as disparate as collaborative design, cooperative data analysis and manipulation, battlefield management, corporate decision making, and telemedicine. An initial experimental scenario is chosen to encompass ingredients of these collaborative tasks. The experimental scenario is based on the design, layout and equipment acquisition for a digital signal processor laboratory. Subjects are sets of three collaborators who are to share and work in the facility. Two measurements quantify the utility of multimodal communication: the time taken to achieve a satisfactory solution, and the quality of that solution as judged by a technical panel.
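As an illustration only, the two measurements described above (time to a satisfactory solution and panel-judged quality) could be tabulated per combination of modalities along the following lines. This is a minimal sketch, not the project's actual analysis: the `Trial` record, the `summarize` function, the condition labels, and the numeric rating scale are all hypothetical names and assumptions introduced here.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Trial:
    """One three-person session (hypothetical record layout)."""
    condition: str               # modality combination used, e.g. "sight+sound"
    minutes_to_solution: float   # time to reach a satisfactory solution
    panel_scores: list           # one quality rating per panel member (assumed numeric)


def summarize(trials):
    """Average time and panel quality for each modality condition."""
    by_condition = {}
    for trial in trials:
        by_condition.setdefault(trial.condition, []).append(trial)

    summary = {}
    for condition, group in by_condition.items():
        summary[condition] = {
            "mean_minutes": mean(t.minutes_to_solution for t in group),
            # Average the panel's ratings within a trial, then across trials.
            "mean_quality": mean(mean(t.panel_scores) for t in group),
        }
    return summary
```

Comparing the per-condition averages would indicate whether adding a modality shortens the task or improves the judged result; a real study would also need an appropriate statistical test, which this sketch omits.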

2.4 Significance

This research formulates and establishes methods for designing networked computer systems for multiuser collaborative tasks in which multimodal human/machine communication produces a demonstrable benefit. An additional impact of the research is the graduate training of four Ph.D. candidates in this newly emerging field.


This page last updated: Tue Jun 17 16:27:54 EDT 1997