How can we measure goodness of conversation with robots?

E.T. Harada1, S. Suto2 and M. Nambu3

1HOSEI University and 2CHUO University, Tokyo, Japan
3Future University-Hakodate, Hokkaido, Japan

Robots are now moving from factories to homes and public spaces, and their functions are changing from exact-and-fast manufacturing to communicating with people. As robots become communication agents, how to evaluate them is becoming an important issue. In this study, we compared a human agent (a confederate) with three interface designs for an agent robot in social roles: an information desk attendant, a clerk at a window, and a game partner. In a psychological experiment, participants were instructed to evaluate their conversation partner after a few minutes of interaction in a given situation. Each participant completed six trials, three with the confederate and three with a robot. During the conversation, the distance between the participant and the agent was measured at every stopping position of the participant. The three interface designs were:

  1. Typical robot-like planned utterances with a robot-like synthesized voice.
  2. Planned utterances with a human voice (recorded).
  3. Human-like emergent utterances with a human voice via speakers in a robot.

The planned utterances were controlled by an experimenter as a Wizard-of-Oz-style fake system, and in the emergent conversation condition, utterances were restricted by a scenario, so that the content of the conversation was kept equal across all four conditions, i.e. the three robot conditions and the confederate condition.

All indices showed a general preference for the human agent over the robots. Among the three interface designs, subjective evaluations and behaviors during conversation, including the number of nods and inter-agent distances, showed higher evaluations for emergent utterances than for planned utterances. However, acceptance of gifts (candies) from the agents showed an advantage of human resemblance: not only the utterances but also the voice affected subsequent behavior. Surprisingly, performance on memory tests indicated the superiority of planned utterances with a synthesized voice, implying the importance of design integrity as a robot. In the discussion, three factors of agent-robot evaluation, i.e., emergent conversation, human resemblance, and design integrity, and their interrelations will be examined.


Paper presented at Measuring Behavior 2005, 5th International Conference on Methods and Techniques in Behavioral Research, 30 August - 2 September 2005, Wageningen, The Netherlands.

© 2005 Noldus Information Technology bv