An online user study investigating interactions between auditory and visual user interface feedback.

T. Lashina

Media Interaction, Philips Research, Eindhoven, The Netherlands

This paper presents an online user survey technique that was used to examine the interaction between the auditory and visual modalities in providing navigational cues to the user of a multimodal interface. The online application used in the study went beyond a standard online questionnaire: it included video scenarios that demonstrated interaction with particular functionality of the user interface, in which the sound under evaluation appeared as auditory interface feedback. The study investigated the following question: how does the visual interface representation influence the semantic interpretation of interface feedback conveyed via the auditory modality? Providing the user with navigational cues is especially important when the interface is intended to be used in hands-and-eyes free situations, under speech control; in such cases, it is crucial to give the user sufficient feedback for navigating the interface.

Before tackling the area of hands-and-eyes free usage, however, we set up an experiment addressing audio-visual interaction in a hands-and-eyes busy situation. The Internet Audio Jukebox, a multimodal application, was used as the platform for the experiment. This application gives access to large collections of musical content via a number of interaction techniques, such as voice commands, touch input and query-by-humming. The Jukebox provides the user with visual and auditory non-speech feedback and can be operated hands-and-eyes free. The feedback sounds used in the Jukebox are melodic 'earcons', each conveying a certain message to the user.

In total, six Jukebox sounds were evaluated in the experiment, and a video scenario was made for each sound. The respondents evaluated the sounds in three experimental conditions: together with the 'correct' scenario, together with the 'wrong' scenario, and on their own, without any scenario. In each condition, the subjects were asked two open questions: one addressing the meaning of the sound, and another addressing the appropriateness of the sound for conveying that meaning. To prevent order effects in the two visual conditions, the 'correct' and 'wrong' contexts were counterbalanced between subjects: half of the respondents received the scenarios with sounds in the 'correct' visual condition first, while the other half received the 'wrong' visual condition first.
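The counterbalanced assignment can be illustrated with a minimal sketch. The following Python snippet is a hypothetical illustration only, not part of the original study materials; it assumes the sound-only condition always came last, and that participants were alternated between the two visual-condition orders by participant number.

    # Minimal sketch of the counterbalancing scheme described above.
    # Assumption (not stated in the paper): the sound-only condition
    # is always presented last; only the order of the 'correct' and
    # 'wrong' visual conditions varies between participants.

    def assign_condition_order(participant_id: int) -> list[str]:
        """Alternate the order of the two visual conditions across participants."""
        if participant_id % 2 == 0:
            visual_order = ["correct_scenario", "wrong_scenario"]
        else:
            visual_order = ["wrong_scenario", "correct_scenario"]
        return visual_order + ["sound_only"]

    # Example: the first four participants receive alternating orders.
    for pid in range(4):
        print(pid, assign_condition_order(pid))

Alternating by participant number in this way yields an even split between the two presentation orders, which is what counterbalancing requires.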

The results of the study indicate that the visual interface representation had a strong impact on the interpretation of the message conveyed by the interface via the auditory modality. In spite of this influence, the subjects judged the experimental sounds to be significantly more appropriate for the visual interface representation for which they were intended than for the 'wrong' visual representation. The experiment also demonstrated a higher recognition rate for the sounds presented in combination with the visual interface than for the sounds presented on their own.


Paper presented at Measuring Behavior 2002, 4th International Conference on Methods and Techniques in Behavioral Research, 27-30 August 2002, Amsterdam, The Netherlands

© 2002 Noldus Information Technology bv